Manycore processor

Manycore processors are specialist multi-core processors designed for a high degree of parallel processing, containing a large number of simple, independent processor cores (e.g. tens, hundreds, or thousands). Manycore processors are used extensively in embedded computers and high-performance computing. As of July 2016, the world's fastest supercomputer (as ranked by the TOP500 list), the Chinese Sunway TaihuLight, obtains its performance from 40,960 SW26010 manycore processors, each containing 260 cores.

Contrast with multicore architecture

Manycore processors are distinct from multi-core processors in that they are optimized for a higher degree of explicit parallelism, and for higher throughput (or lower power consumption), at the expense of latency and single-thread performance.

The broader category of multi-core processors, by contrast, is usually designed to run both parallel and serial code efficiently, and therefore places more emphasis on high single-thread performance (e.g. devoting more silicon to out-of-order execution, deeper pipelines, more superscalar execution units, and larger, more general caches) and on shared memory. These techniques devote runtime resources to discovering implicit parallelism in a single thread. Such processors are used in systems that have evolved continuously (with backward compatibility) from single-core processors. They usually have a few cores (e.g. 2, 4, 8), and can be complemented by a manycore accelerator (such as a GPU) in a heterogeneous system.
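The trade-off described above can be illustrated with a simple Amdahl-style timing model. This is only a sketch: the two chips, their core counts, and their per-core speeds are hypothetical numbers chosen for illustration, and the model ignores communication and memory costs entirely.

```python
def run_time(work, parallel_fraction, cores, per_core_speed):
    """Time to finish `work` units under a simple Amdahl-style model.

    The serial part runs on a single core; the parallel part is split
    evenly across all cores. Communication and memory costs are ignored.
    """
    serial = work * (1.0 - parallel_fraction) / per_core_speed
    parallel = work * parallel_fraction / (cores * per_core_speed)
    return serial + parallel


# Hypothetical chips (illustrative numbers, not measurements).
MULTICORE = {"cores": 4, "per_core_speed": 4.0}   # a few fast cores
MANYCORE = {"cores": 64, "per_core_speed": 1.0}   # many simple cores


def compare(parallel_fraction, work=1000.0):
    """Return (multicore_time, manycore_time) for the same workload."""
    return (run_time(work, parallel_fraction, **MULTICORE),
            run_time(work, parallel_fraction, **MANYCORE))


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99, 1.0):
        multi, many = compare(p)
        print(f"parallel={p:.2f}  multicore={multi:8.2f}  manycore={many:8.2f}")
```

Under these assumed numbers the few-fast-cores chip finishes sooner when half the work is serial, while the many-simple-cores chip finishes sooner once nearly all of the work is parallel.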


Cache coherency is an issue limiting the scaling of multicore processors. Manycore processors can bypass this issue with techniques such as message passing, [1] scratchpad memory, DMA, [2] partitioned global address space, [3] or read-only / non-coherent caches. A manycore processor using a network on a chip has the opportunity to explicitly optimize the spatial layout of tasks (e.g. as seen in tooling developed for TrueNorth). [4]
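As a rough illustration of what optimizing the spatial layout of tasks can mean, the sketch below places a small task graph on a 2D mesh and scores each placement by message volume multiplied by hop count. The task names, traffic volumes, mesh size, and cost model are all assumptions made for this example; it does not describe the TrueNorth tooling or any real placement tool.

```python
from itertools import permutations


def hops(a, b):
    """Manhattan distance between two (x, y) tiles on a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def traffic_cost(placement, traffic):
    """Sum of message volume times hop count over all communicating pairs."""
    return sum(volume * hops(placement[src], placement[dst])
               for (src, dst), volume in traffic.items())


def best_placement(tasks, tiles, traffic):
    """Exhaustively search task-to-tile mappings (feasible only for tiny cases)."""
    best = None
    for perm in permutations(tiles, len(tasks)):
        placement = dict(zip(tasks, perm))
        cost = traffic_cost(placement, traffic)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best


# Hypothetical four-stage pipeline (a -> b -> c -> d) on a 2x2 mesh.
TASKS = ["a", "b", "c", "d"]
TILES = [(0, 0), (1, 0), (0, 1), (1, 1)]
TRAFFIC = {("a", "b"): 10, ("b", "c"): 10, ("c", "d"): 10}

# Naive row-major placement: b and c end up on diagonal tiles (2 hops apart).
NAIVE = dict(zip(TASKS, TILES))

if __name__ == "__main__":
    print("naive cost:", traffic_cost(NAIVE, TRAFFIC))
    cost, placement = best_placement(TASKS, TILES, TRAFFIC)
    print("best cost:", cost, placement)
```

Exhaustive search is used only because the example is tiny; a realistic mapper would need a heuristic, since the number of placements grows factorially with the number of tasks.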

Manycore processors may have more in common (conceptually) with technologies originating in high-performance computing, such as clusters and vector processors. [5]

GPUs can be considered a form of manycore processor with multiple shader processing units, suitable only for highly parallel code (high throughput, but extremely poor single-thread performance).

Suitable programming models

  • Message Passing Interface
  • OpenCL [6] or other APIs supporting compute kernels
  • Partitioned global address space
  • Actor model
  • OpenMP [7]
  • Dataflow
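Several of the models listed above share one idea: workers own their data privately and cooperate only by exchanging messages, rather than by reading and writing shared coherent memory. The following is a minimal sketch of that pattern using Python threads and queues as stand-ins for cores and on-chip links. The function names are hypothetical, it is not an MPI or actor-framework API, and in CPython the threads will not actually execute in parallel, so it shows the communication structure only.

```python
import queue
import threading


def worker(inbox, outbox):
    """Sum each chunk received; stop when the None sentinel arrives."""
    while True:
        chunk = inbox.get()
        if chunk is None:
            break
        # The chunk is private to this worker; only the result is sent back.
        outbox.put(sum(chunk))


def parallel_sum(data, n_workers=4):
    """Scatter `data` to workers by message, then gather partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, results))
               for inbox in inboxes]
    for t in threads:
        t.start()

    # Scatter: each worker gets one strided slice, then a stop message.
    for i, inbox in enumerate(inboxes):
        inbox.put(data[i::n_workers])
        inbox.put(None)

    # Gather: exactly one partial sum comes back per worker.
    total = sum(results.get() for _ in range(n_workers))

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(parallel_sum(list(range(100))))
```

Because no worker ever touches another worker's data, nothing in this structure depends on cache coherency between workers, which is the property that makes the message-passing style attractive at high core counts.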


Specific manycore architectures

  • ZettaScaler, Japanese PEZY Computing systems built from 2,048-core chips, currently the most energy-efficient (on the Green500), and the fourth fastest supercomputer
  • Sunway TaihuLight, a Chinese supercomputer, the fastest supercomputer in the world, using a homegrown manycore architecture
  • GPUs, which can be described as manycore vector processors
  • Xeon Phi coprocessor, referred to as MIC (Many Integrated Core)
  • Tilera
  • Adapteva Epiphany architecture, a manycore chip using PGAS scratchpad memory
  • Coherent Logix hx3100 processor, a 100-core DSP/GPP processor based on the HyperX architecture
  • Movidius Myriad 2, a manycore vision processing unit
  • Kalray, a manycore PCI accelerator for data-intensive tasks
  • Teraflops Research Chip, a manycore processor using message passing
  • TrueNorth, a neuromorphic processor with a manycore network-on-a-chip architecture
  • Massively parallel processor array
  • Asynchronous array of simple processors
  • GreenArrays, a manycore processor using message passing, aimed at low-power applications
  • Eyeriss, a manycore processor for convolutional neural networks, aimed at embedded vision applications [8]
  • SpiNNaker
  • XMOS Software Defined Silicon quad-core XS1-G4

See also

  • Multicore
  • Vector processor
  • High-performance computing
  • Computer cluster
  • MPSoC
  • Vision processing unit
  • Memory access pattern


References

  1. Mattson, Tim (January 2010). “The Future of Many Core Computing: A tale of two processors” (PDF).
  2. Hendry, Gilbert; Kretschmann, Mark. “IBM Cell Processor” (PDF).
  3. Olofsson, Andreas; Nordström, Tomas; Ul-Abdin, Zain (2014). “Kickstarting High-performance Energy-efficient Manycore Architectures with Epiphany”. arXiv: 1412.5538 [cs.AR].
  4. Amir, Arnon (June 11, 2015). “IBM SyNAPSE Deep Dive Part 3”. IBM Research.
  5. “cell architecture”. “The Cell architecture is a multiprocessor vector”
  6. Merritt, Rick (June 20, 2011). “OEMs show systems with Intel MIC chips”. EE Times.
  7. Barker, J; Bowden, J (2013). “Manycore Parallelism through OpenMP”. OpenMP in the Era of Low Power Devices and Accelerators. IWOMP. Lecture Notes in Computer Science, Vol 8122. Springer.
  8. Chen, Yu-Hsin; Krishna, Tushar; Emer, Joel; Sze, Vivienne (2016). “Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks”. IEEE International Solid-State Circuits Conference, ISSCC 2016, Digest of Technical Papers. pp. 262-263.
