Computational RAM

Computational RAM or C-RAM is random-access memory with processing elements integrated on the same chip. This enables C-RAM to be used as a SIMD computer. It can also be used to make more efficient use of memory bandwidth within a memory chip.
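
As a rough illustration of the SIMD idea, the sketch below models one processing element per memory row, with every element applying the same operation to its own row. The names `simd_step` and `memory` are hypothetical, and the Python loop only imitates what the hardware would do in lock-step.

```python
# Toy model (an assumption for illustration, not a real C-RAM interface):
# one processing element (PE) sits next to each memory row, and all PEs
# execute the same instruction on their own local data.
def simd_step(rows, op):
    """Apply a single instruction, `op`, to every memory row."""
    return [op(row) for row in rows]


memory = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # three rows of a hypothetical array
row_sums = simd_step(memory, sum)           # each PE reduces its own row
print(row_sums)                             # [6, 15, 24]
```

In this model only the three results would need to leave the chip, not the nine stored values.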

Perhaps the most influential implementations of computational RAM came from the Berkeley IRAM Project. Vector IRAM (V-IRAM) combines DRAM with a vector processor integrated on the same chip. [1]

Reconfigurable Architecture DRAM (RADram) is DRAM with reconfigurable-computing FPGA logic elements integrated on the same chip. [2] SimpleScalar simulations show that RADram (in a system with a conventional processor) can give orders of magnitude better performance on some problems than traditional DRAM (in a system with the same processor).

Some embarrassingly parallel computational problems are already limited by the von Neumann bottleneck between the CPU and the DRAM. Some researchers expect that, for the same total cost, a machine built from computational RAM will run faster than a traditional general-purpose computer on these kinds of problems. [3]

As of 2011, the “DRAM process” (few layers, optimized for high capacitance) and the “CPU process” (optimized for high frequency, typically with twice as many BEOL layers as DRAM) are distinct enough that there are three approaches to computational RAM. Because each additional layer reduces yield and increases manufacturing cost, CPU-process chips are relatively expensive per square millimeter compared to DRAM.

  • Starting with a CPU-optimized process and a device that uses a lot of embedded SRAM, add an additional process step (making it even more expensive per square millimeter) that allows the embedded SRAM to be replaced with embedded DRAM (eDRAM), giving roughly 3x area savings on the SRAM areas (and so lowering net cost per chip).
  • Starting with a system that has a separate CPU chip and DRAM chip(s), add small amounts of “coprocessor” computational ability to the DRAM, working within the limits of the DRAM process and adding only small amounts of area to the DRAM, to do things that would otherwise be slowed down by the narrow bottleneck between the CPU and DRAM: zero-fill selected areas of memory, copy large blocks of data from one location to another, find where a given byte occurs in some block of data, etc. The resulting system (the unchanged CPU chip plus “smart DRAM” chip(s)) is at least as fast as the original system, and potentially slightly lower in cost. The cost of the small amount of extra area is expected to be more than paid back in savings on expensive test time, since the “smart DRAM” has enough computational capability to do most testing internally and in parallel, rather than relying entirely on expensive external automatic test equipment. [1]
  • Starting with a DRAM-optimized process, tweak the process to make it slightly more like the “CPU process”, and build a relatively low-frequency, but low-power and very high-bandwidth, general-purpose CPU within the limits of that process.
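
The second approach can be sketched as a toy model that counts how many bytes cross the CPU-DRAM bus. Everything here is an assumption made for illustration: the `SmartDram` class, its method names, and the 16-byte command size do not describe any real device.

```python
class SmartDram:
    """Toy DRAM with a few built-in bulk operations (hypothetical interface).

    bus_bytes counts the bytes that cross the CPU-DRAM bus.
    """

    COMMAND_BYTES = 16  # assumed size of one command packet

    def __init__(self, size):
        self.cells = bytearray(size)
        self.bus_bytes = 0

    def _check(self, addr, length):
        if addr < 0 or length < 0 or addr + length > len(self.cells):
            raise IndexError("range outside the memory array")

    # Conventional path: every data byte crosses the bus.
    def read(self, addr, length):
        self._check(addr, length)
        self.bus_bytes += length
        return bytes(self.cells[addr:addr + length])

    def write(self, addr, data):
        self._check(addr, len(data))
        self.bus_bytes += len(data)
        self.cells[addr:addr + len(data)] = data

    # "Smart" path: only a short command crosses the bus.
    def zero_fill(self, addr, length):
        self._check(addr, length)
        self.bus_bytes += self.COMMAND_BYTES
        self.cells[addr:addr + length] = bytes(length)

    def copy(self, src, dst, length):
        self._check(src, length)
        self._check(dst, length)
        self.bus_bytes += self.COMMAND_BYTES
        self.cells[dst:dst + length] = self.cells[src:src + length]

    def find_byte(self, addr, length, value):
        """Return the address of the first match, or -1 if there is none."""
        self._check(addr, length)
        self.bus_bytes += self.COMMAND_BYTES
        return self.cells.find(bytes([value]), addr, addr + length)


def cpu_copy(dram, src, dst, length):
    """Conventional copy: the data travels to the CPU and back again."""
    dram.write(dst, dram.read(src, length))


conventional = SmartDram(4096)
cpu_copy(conventional, 0, 2048, 1024)

smart = SmartDram(4096)
smart.copy(0, 2048, 1024)

print(conventional.bus_bytes, smart.bus_bytes)  # 2048 16
```

In this model the bus traffic of the smart path is independent of the block size, which is the effect the “coprocessor” approach is after. Real savings would depend on the command protocol and on how the DRAM schedules the internal operation.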

Some CPUs designed to be built on a DRAM process technology include the Berkeley IRAM Project, TOMI Technology [4] and the AT&T DSP1.

Because a memory bus to off-chip memory has many times the capacitance of an on-chip memory bus, a system with separate DRAM and CPU chips can have several times the energy consumption of an IRAM system with the same computer performance. [1]
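
The link between capacitance and energy can be made explicit with the standard switching-energy estimate for a capacitive line. The symbols below are assumptions introduced here, not taken from the cited paper, and the estimate covers only the bus component of total system energy.

```latex
% Assumed symbols: C is the capacitance of one bus line, V the signalling
% voltage, N the number of line transitions needed for a given workload.
% C_off and C_on denote an off-chip and an on-chip bus line respectively.
E_{\text{bus}} \approx \tfrac{1}{2}\, N\, C\, V^{2}
\qquad\Longrightarrow\qquad
\frac{E_{\text{off}}}{E_{\text{on}}} \approx \frac{C_{\text{off}}}{C_{\text{on}}}
\quad \text{(for the same } N \text{ and } V\text{)}
```

Under these assumptions, bus energy scales linearly with line capacitance, so moving the same traffic onto a lower-capacitance on-chip bus reduces that part of the energy budget by the capacitance ratio.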

Because computational DRAM is expected to run hotter than traditional DRAM, and higher chip temperatures cause faster charge leakage from the DRAM storage cells, computational DRAM is expected to require more frequent DRAM refresh. [2]


Processor-in-/near-memory

A processor-in-/near-memory (PINM) is a computer processor (CPU) tightly coupled to memory, generally on the same silicon chip.

The chief goal of merging the processing and memory components is to reduce memory latency and increase bandwidth. Reducing the distance that data needs to be moved also reduces the power requirements of a system. Much of the complexity (and hence power consumption) in current processors stems from strategies for avoiding memory stalls.


In the 1980s, a tiny CPU that executed FORTH was fabricated into a DRAM chip to improve PUSH and POP. FORTH is a stack-oriented programming language, and this improved its efficiency.

The transputer also had a large on-chip memory for a device made in the early 1980s, making it essentially a processor-in-memory.

Notable PIM projects include the Berkeley IRAM project (IRAM) at the University of California, Berkeley [6] and the University of Notre Dame PIM [7] effort.

See also

  • Computing with Memory


  1. Christoforos E. Kozyrakis, Stylianos Perissakis, David Patterson, Thomas Anderson, et al. “Scalable Processors in the Billion-Transistor Era: IRAM”. IEEE Computer (magazine). 1997. Says “Vector IRAM … can operate as a parallel built-in self-test engine for the memory array, dramatically reducing DRAM testing time and cost.”
  2. Mark Oskin, Frederic T. Chong, and Timothy Sherwood. “Active Pages: A Computation Model for Intelligent Memory”. 1998.
  3. Daniel J. Bernstein. “Historical notes on mesh routing in NFS”. 2002. “programming a computational RAM”
  4. “TOMI the milliwatt microprocessor” [permanent dead link]
  5. Kim Yong-Bin and Tom W. Chen. “Assessing Merged DRAM/Logic Technology”. 1998. Archived from the original (PDF) on 2011-07-25. Retrieved 2011-11-27.
  6. IRAM
  7. PIM
