Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.
In modern microprocessors, memory ordering characterizes the CPU's ability to reorder memory operations; it is a type of out-of-order execution. Memory reordering can be used to fully utilize the bus bandwidth of different types of memory such as caches and memory banks.
On most modern uniprocessors, memory operations are not executed in the order specified by the program code. In single-threaded programs all operations appear to have been executed in the order specified, with all out-of-order execution hidden from the programmer. In multi-threaded environments, however (or when interfacing with other hardware via memory buses), this can lead to problems. To avoid them, memory barriers can be used in these cases.
Compile-time memory ordering
The compiler has some freedom to rearrange the order of operations during compile time. However, this can lead to problems if the order of memory accesses is important.
Compile-time memory barrier implementation
These barriers prevent a compiler from reordering instructions during compile time; they do not prevent reordering by the CPU during runtime.
- The GNU inline assembler statement
asm volatile ("" ::: "memory");
or
__asm__ __volatile__ ("" ::: "memory");
forbids the GCC compiler from reordering read and write commands around it.
- The C11/C++11 command
atomic_signal_fence(memory_order_acq_rel);
forbids the compiler from reordering read and write commands around it.
- The Intel ECC compiler uses “full compiler fence” intrinsics:
__memory_barrier()
- The Microsoft Visual C++ compiler has:
_ReadWriteBarrier()
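The effect of a compile-time barrier can be sketched with the standard C++11 `std::atomic_signal_fence`. The example below is only an illustration: the names `payload`, `ready`, `observed`, `on_signal`, `publish` and `run_signal_demo` are assumptions made for this sketch and do not come from any of the compilers listed above. A signal handler runs on the same thread it interrupts, so only compiler reordering has to be ruled out, which is the case a compile-time barrier covers.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Illustrative names only; none of them come from the article.
int payload = 0;                 // ordinary data, written before the flag
std::atomic<int> ready{0};       // flag checked by the signal handler
std::atomic<int> observed{-1};   // what the handler saw

// Runs on the same thread that was interrupted, so only compiler
// reordering matters here, not CPU reordering.
extern "C" void on_signal(int) {
    if (ready.load(std::memory_order_relaxed)) {
        std::atomic_signal_fence(std::memory_order_acquire);
        observed.store(payload, std::memory_order_relaxed);
    }
}

// Stores the payload, then the flag. The signal fence keeps the compiler
// from sinking the payload store below the flag store; it is not required
// to emit any CPU barrier instruction.
void publish(int value) {
    payload = value;
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(1, std::memory_order_relaxed);
}

// Installs the handler, publishes a value, raises the signal on the
// current thread and returns the value the handler observed.
int run_signal_demo() {
    auto previous = std::signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);
    (void)previous;
    publish(42);
    std::raise(SIGINT);
    return observed.load(std::memory_order_relaxed);
}
```

If `payload` had to be visible to a handler or thread running on another core, a compile-time barrier alone would not be sufficient and a hardware barrier would be needed as well.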
Runtime memory ordering
In symmetric multiprocessing (SMP) microprocessor systems
There are several memory-consistency models for SMP systems:
- Sequential consistency (all reads and all writes are in-order)
- Relaxed consistency (some types of reordering are allowed)
  - Loads can be reordered after loads (for better working of cache coherency, better scaling)
  - Loads can be reordered after stores
  - Stores can be reordered after stores
  - Stores can be reordered after loads
- Weak consistency (reads and writes are arbitrarily reordered, limited only by explicit memory barriers)
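The difference between these models can be made concrete with the classic "store buffering" litmus test, sketched here in C++11. The function name `count_both_zero` and the variable names are hypothetical. Each thread stores to one variable and then loads the other. Under sequential consistency at least one thread must see the other's store; if stores could be reordered after loads, both threads could read 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus-test harness (names are illustrative).
//   Thread A: x = 1; r1 = y        Thread B: y = 1; r2 = x
// With sequentially consistent atomics, r1 == 0 && r2 == 0 is forbidden.
// With memory_order_relaxed instead, a store could be reordered after the
// following load and that outcome would be permitted.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread a([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread b([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        a.join();
        b.join();
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling `count_both_zero(200)` should always return 0 as written. Replacing the orderings with `std::memory_order_relaxed` removes that guarantee, although whether the forbidden outcome is actually observed depends on the hardware and on timing.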
On some CPUs
- Atomic operations can be reordered with loads and stores.
- There can be an incoherent instruction cache pipeline, which prevents self-modifying code from being executed without special instruction cache flush/reload instructions.
- Dependent loads can be reordered (this is unique to Alpha). If the processor fetches a pointer to some data after this reordering, it might not fetch the data itself but use stale data which it has already cached and not yet invalidated. Allowing this relaxation makes cache hardware simpler and faster, but leads to the requirement of memory barriers for readers and writers.
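The dependent-load case corresponds to the common pattern of publishing a pointer to freshly initialised data. The following C++11 sketch uses a release store and an acquire load so that the reader's dereference cannot observe stale data on any architecture. `Node`, `shared`, `writer`, `reader` and `run_publish_demo` are hypothetical names. C++11 also defines `memory_order_consume` for pure dependency ordering; acquire is used here on the assumption that it is the more widely and consistently implemented choice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Node { int value; };            // illustrative payload type

std::atomic<Node*> shared{nullptr};    // pointer published to the reader

// Initialises the data first, then publishes the pointer with release
// semantics so the initialisation cannot be reordered after the publish.
void writer(Node* n) {
    n->value = 42;
    shared.store(n, std::memory_order_release);
}

// Waits for the pointer, then performs the dependent load through it.
// The acquire load guarantees the reader sees the initialised value.
int reader() {
    Node* p;
    while ((p = shared.load(std::memory_order_acquire)) == nullptr) {
        std::this_thread::yield();
    }
    return p->value;
}

// Runs one reader and one writer thread and returns what the reader saw.
int run_publish_demo() {
    Node node{0};
    shared.store(nullptr, std::memory_order_relaxed);
    int seen = 0;
    std::thread r([&] { seen = reader(); });
    std::thread w([&] { writer(&node); });
    r.join();
    w.join();
    return seen;
}
```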
|Type||Alpha||ARMv7||PA-RISC||POWER||SPARC RMO||SPARC PSO||SPARC TSO||x86||x86 oostore||AMD64||IA-64||z/Architecture|
|Loads reordered after loads||Y||Y||Y||Y||Y|| || || ||Y|| ||Y|| |
|Loads reordered after stores||Y||Y||Y||Y||Y|| || || ||Y|| ||Y|| |
|Stores reordered after stores||Y||Y||Y||Y||Y||Y|| || ||Y|| ||Y|| |
|Stores reordered after loads||Y||Y||Y||Y||Y||Y||Y||Y||Y||Y||Y||Y|
|Atomic reordered with loads||Y||Y|| ||Y||Y|| || || || || ||Y|| |
|Atomic reordered with stores||Y||Y|| ||Y||Y||Y|| || || || ||Y|| |
|Dependent loads reordered||Y|| || || || || || || || || || || |
|Incoherent instruction cache pipeline||Y||Y|| ||Y||Y||Y||Y||Y||Y|| ||Y|| |
Some older x86 and AMD systems have weaker memory ordering 
SPARC memory ordering modes:
- SPARC TSO = total store order (default)
- SPARC RMO = relaxed-memory order (not supported on recent CPUs)
- SPARC PSO = partial store order (not supported on recent CPUs)
Hardware memory barrier implementation
Many architectures with SMP support have special hardware instructions for flushing reads and writes during runtime.
- x86, x86-64
lfence (asm), void _mm_lfence(void)
sfence (asm), void _mm_sfence(void)
mfence (asm), void _mm_mfence(void)
- ARMv7
dmb (asm)
dsb (asm)
isb (asm)
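As a rough sketch of how such an instruction is reached from C++, the wrapper below calls `_mm_mfence()` when compiling for x86-64 and otherwise falls back to the standard `std::atomic_thread_fence`, which compilers typically lower to the target's own barrier instruction. The names `full_hw_fence` and `store_fence_load` are hypothetical, and the use of `<immintrin.h>` as the header that declares the intrinsic is an assumption about the toolchain.

```cpp
#include <atomic>
#include <cassert>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>   // assumed to declare _mm_lfence, _mm_sfence, _mm_mfence
#endif

// Hypothetical wrapper around a full hardware memory fence.
inline void full_hw_fence() {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_mfence();   // earlier loads and stores complete before later ones
#else
    // Portable fallback: a sequentially consistent fence.
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}

// Stores a value, issues the fence, then loads the value back.
// In a single thread the fence changes nothing observable; it only
// matters for what other cores or devices can see.
int store_fence_load(int value) {
    static std::atomic<int> slot{0};
    slot.store(value, std::memory_order_relaxed);
    full_hw_fence();
    return slot.load(std::memory_order_relaxed);
}
```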
Compiler support for hardware memory barriers
Some compilers support builtins that emit hardware memory barrier instructions:
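- GCC, version 4.4.0 and later, has __sync_synchronize.
- Since C11 and C++11, an atomic_thread_fence() command has been available.
- The Microsoft Visual C++ compiler has MemoryBarrier().
- Sun Studio Compiler Suite has __machine_r_barrier, __machine_w_barrier and __machine_rw_barrier.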
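Of these, `atomic_thread_fence()` is the portable option. A minimal C++11 sketch of its use follows; `message`, `flag`, `producer`, `consumer` and `run_fence_demo` are hypothetical names. The flag itself is accessed with relaxed ordering, and the two fences supply the ordering: the release fence keeps the payload write ahead of the flag store, and the acquire fence keeps the payload read behind the flag load.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int message = 0;                  // plain, non-atomic payload
std::atomic<bool> flag{false};    // signals that the payload is ready

// Writes the payload, then raises the flag. The release fence orders the
// payload write before the relaxed flag store.
void producer() {
    message = 123;
    std::atomic_thread_fence(std::memory_order_release);
    flag.store(true, std::memory_order_relaxed);
}

// Spins until the flag is raised. The acquire fence orders the relaxed
// flag load before the payload read.
int consumer() {
    while (!flag.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return message;
}

// Runs one producer and one consumer thread; returns the consumer's view.
int run_fence_demo() {
    message = 0;
    flag.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread c([&] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Unlike a compile-time barrier, these fences constrain both the compiler and, where the target architecture requires it, the CPU.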
See also
- Memory model (programming)
- Memory barrier