Manycore processors are specialist multi-core processors designed for a high degree of parallel processing, containing a large number of simple, independent processor cores (e.g. tens, hundreds, or thousands). Manycore processors are used extensively in embedded computers and high-performance computing. As of July 2016, the world's fastest supercomputer (as ranked by the TOP500 list), the Chinese Sunway TaihuLight, obtains its performance from 40,960 SW26010 manycore processors, each containing 260 cores.
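As a quick sense of scale, the total core count implied by the Sunway TaihuLight figures can be worked out directly. This is plain arithmetic on the two numbers quoted above, not a separately sourced figure:

```latex
40{,}960 \ \text{processors} \times 260 \ \text{cores per processor} = 10{,}649{,}600 \ \text{cores}
```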
Contrast with multicore architecture
Manycore processors are distinct from multi-core processors in that they are optimized for a higher degree of explicit parallelism, and for higher throughput (or lower power consumption), at the expense of latency and single-thread performance.
The broader category of multi-core processors, by contrast, is usually designed to efficiently run both parallel and serial code, and therefore places more emphasis on high single-thread performance (e.g. devoting more silicon to out-of-order execution, deeper pipelines, more superscalar execution units, and larger, more general caches) and on shared memory. These techniques devote runtime resources to finding implicit parallelism in a single thread. Such processors are used in systems that have evolved continuously (with backward compatibility) from single-core processors. They usually have a few cores (e.g. 2, 4, 8) and may be complemented by a manycore accelerator (such as a GPU) in a heterogeneous system.
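The trade-off between a few fast cores and many simple cores can be illustrated with a rough Amdahl's-law-style estimate. This is a minimal sketch under stated assumptions, not a model of any real chip: the function name `run_time`, the core counts, and the relative per-core speeds are all hypothetical, and memory and communication costs are ignored.

```python
def run_time(parallel_fraction, n_cores, core_speed):
    """Normalized run time of a workload on n_cores identical cores.

    Assumption: the serial part runs on one core and the parallel part
    scales perfectly across all cores (no communication overhead).
    """
    serial = (1.0 - parallel_fraction) / core_speed
    parallel = parallel_fraction / (core_speed * n_cores)
    return serial + parallel


# Hypothetical designs: a few fast cores versus many simple cores.
MULTICORE = {"n_cores": 4, "core_speed": 4.0}
MANYCORE = {"n_cores": 256, "core_speed": 1.0}


def compare(parallel_fraction):
    """Return (multicore_time, manycore_time) for one workload."""
    return (run_time(parallel_fraction, **MULTICORE),
            run_time(parallel_fraction, **MANYCORE))


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99, 0.999):
        multi, many = compare(p)
        winner = "multicore" if multi < many else "manycore"
        print(f"parallel fraction {p}: multicore {multi:.4f}, "
              f"manycore {many:.4f} -> {winner}")
```

Under these assumed numbers, the design with a few fast cores finishes sooner when half of the work is serial, while the manycore design finishes sooner once about 99% of the work is parallel.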
Cache coherency is an issue limiting the scaling of multicore processors. Manycore processors can bypass this issue with techniques such as message passing, scratchpad memory, DMA, partitioned global address space, or read-only/non-coherent caches. A manycore processor that connects its cores through an on-chip network gives software the opportunity to explicitly optimize the spatial layout of tasks (e.g. as seen in tooling developed for TrueNorth).
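To make the idea of optimizing the spatial layout of tasks concrete, the sketch below places four communicating tasks on a 2x2 mesh and compares a naive placement with the best one found by exhaustive search. It is an illustration only: the mesh size, task names, traffic volumes, and the cost model (message volume times Manhattan hop count) are assumptions, and it does not reflect how the TrueNorth tooling works.

```python
from itertools import permutations


def hops(a, b):
    """Manhattan distance between two (row, col) mesh coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def layout_cost(placement, traffic):
    """Sum of message volume times hop count over all task pairs."""
    return sum(volume * hops(placement[src], placement[dst])
               for (src, dst), volume in traffic.items())


def best_layout(tasks, cores, traffic):
    """Exhaustive search over placements; only feasible for tiny meshes."""
    best = None
    for perm in permutations(cores, len(tasks)):
        placement = dict(zip(tasks, perm))
        cost = layout_cost(placement, traffic)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best


# Hypothetical 2x2 mesh and traffic pattern (messages per unit time).
CORES = [(r, c) for r in range(2) for c in range(2)]
TASKS = ["a", "b", "c", "d"]
TRAFFIC = {("a", "d"): 10, ("b", "c"): 10, ("a", "b"): 1}

# Naive placement: tasks assigned to cores in listing order,
# which puts both heavy-traffic pairs on opposite corners.
naive = dict(zip(TASKS, CORES))
naive_cost = layout_cost(naive, TRAFFIC)

best_cost, best_placement = best_layout(TASKS, CORES, TRAFFIC)

if __name__ == "__main__":
    print("naive cost:", naive_cost)
    print("best cost:", best_cost, "with placement", best_placement)
```

Exhaustive search is used here only because the example is tiny; a real placement tool would need a heuristic, since the number of placements grows factorially with the number of cores.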
Manycore processors may have more in common (conceptually) with technologies originating in high-performance computing, such as clusters and vector processors.
GPUs can be considered a form of manycore processor, having multiple shader processing units and being suitable only for highly parallel code (high throughput, but extremely poor single-thread performance).
Suitable programming models
- Message Passing Interface (MPI)
- OpenCL or other APIs supporting compute kernels
- Partitioned global address space
- Actor model
- OpenMP 
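As an illustration of the message-passing style in the list above, the following sketch simulates a handful of cores that each keep private state and exchange data only through queues. It is a teaching sketch, not an MPI program: it uses Python threads and `queue.Queue` from the standard library, the names `core` and `parallel_sum` are hypothetical, and Python threads do not provide real hardware parallelism.

```python
import queue
import threading


def core(core_id, inbox, outbox):
    """One simulated core: private state, communicates only via messages."""
    local_sum = 0  # private to this worker, analogous to scratchpad memory
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel: no more work for this core
            break
        local_sum += sum(msg)
    outbox.put((core_id, local_sum))


def parallel_sum(data, n_cores=4, chunk=3):
    """Sum data by dealing chunks to cores and collecting partial sums."""
    inboxes = [queue.Queue() for _ in range(n_cores)]
    results = queue.Queue()
    workers = [threading.Thread(target=core, args=(i, inboxes[i], results))
               for i in range(n_cores)]
    for worker in workers:
        worker.start()

    # Deal fixed-size chunks to the cores in round-robin order.
    for k, start in enumerate(range(0, len(data), chunk)):
        inboxes[k % n_cores].put(data[start:start + chunk])
    for box in inboxes:
        box.put(None)

    for worker in workers:
        worker.join()

    # Each core sent exactly one (core_id, partial_sum) message.
    partials = dict(results.get() for _ in range(n_cores))
    return sum(partials.values()), partials


if __name__ == "__main__":
    total, partials = parallel_sum(list(range(1, 101)))
    print("total:", total)
    print("partial sums per core:", partials)
```

Because no worker reads or writes another worker's `local_sum`, there is no shared mutable data that would need to be kept coherent; the only synchronization points are the queues.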
Manycore systems and architectures
- ZettaScaler, Japanese PEZY Computing systems built on a 2,048-core chip, currently the most energy-efficient (on the Green500) and the fourth-fastest supercomputer
- Sunway TaihuLight, a Chinese supercomputer, the fastest in the world as of July 2016, using a homegrown manycore architecture
- GPUs, which can be described as manycore vector processors
- Xeon Phi coprocessor, referred to as MIC (Many Integrated Core)
- Adapteva Epiphany architecture, a manycore chip using PGAS scratchpad memory
- Coherent Logix hx3100 Processor, a 100-core DSP/GPP processor based on the HyperX architecture
- Movidius Myriad 2, a manycore vision processing unit
- Kalray, a manycore PCI accelerator for data-intensive tasks
- Teraflops Research Chip, a manycore processor using message passing
- TrueNorth, a neuromorphic processor with a manycore network-on-a-chip architecture
- Massively parallel processor array
- Asynchronous array of simple processors
- GreenArrays, a manycore processor using message passing, aimed at low-power applications
- Eyeriss, a manycore processor for convolutional neural networks in embedded vision applications
- XMOS Software Defined Quad-core Silicon XS1-G4
See also
- Vector processor
- High performance computing
- Computer cluster
- Vision processing unit
- Memory access pattern