Simple configuration with NUMA architecture
NUMA architecture
NUMA computers have a serious drawback: each processor element has its own separate cache memory. In a multiprocessor system, this cache memory turns out to be a bottleneck.
Explanation: Suppose processor P1 has stored the value X in cell q, and processor P2 then wants to read the contents of the same cell q. P2 may get a result different from X, because X is still held only in the cache of P1 and has not yet reached the shared memory. This issue is called the cache coherence problem.
Solution: the ccNUMA (cache-coherent NUMA) architecture
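The P1/P2 scenario above can be reproduced with a toy model. The sketch below is only an illustration, not a description of real hardware: the class ToySystem, the function stale_read_demo, the initial value 0, and the value 42 standing in for X are all hypothetical. The "coherent" mode is a crude stand-in for a coherence protocol (publish the write and invalidate other copies), which is roughly the guarantee ccNUMA adds.

```python
class ToySystem:
    """Shared memory plus one private write-back cache per processor (toy model)."""

    def __init__(self, processors, coherent):
        self.memory = {}                            # shared main memory: cell -> value
        self.caches = {p: {} for p in processors}   # one private cache per processor
        self.coherent = coherent

    def write(self, proc, cell, value):
        # The write lands in the writer's own cache first.
        self.caches[proc][cell] = value
        if self.coherent:
            # Crude stand-in for a coherence protocol: publish the value
            # and invalidate every other processor's copy of the cell.
            self.memory[cell] = value
            for other, cache in self.caches.items():
                if other != proc:
                    cache.pop(cell, None)

    def read(self, proc, cell):
        cache = self.caches[proc]
        if cell not in cache:
            # Cache miss: fetch the cell from shared memory.
            cache[cell] = self.memory.get(cell, 0)
        return cache[cell]


def stale_read_demo(coherent):
    """P1 stores X = 42 in cell q, then P2 reads q; returns what P2 sees."""
    system = ToySystem(["P1", "P2"], coherent)
    system.memory["q"] = 0           # initial contents of cell q (assumed)
    system.write("P1", "q", 42)      # X ends up in P1's cache
    return system.read("P2", "q")    # P2 reads the same cell


if __name__ == "__main__":
    print("without coherence, P2 reads:", stale_read_demo(coherent=False))
    print("with coherence, P2 reads:   ", stale_read_demo(coherent=True))
```

Without coherence, P2 misses in its own cache and fetches the old contents of q from shared memory. With the toy coherence step enabled, it sees the value written by P1.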
The problem of access heterogeneity
The NUMA architecture has non-uniform memory access (memory is distributed among modules), which in turn requires the user to understand the heterogeneity of the architecture. If accessing the memory of another node took only 5-10% longer than accessing a node's own local memory, this would hardly raise any concerns: most users would treat such a system as UMA (SMP), and almost all programs developed for SMP would work quite well. However, this is not the case for modern NUMA systems, where the difference between local and remote access times ranges from 200% to 700%.
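To get a feel for why the size of the gap matters, the average access time can be estimated as a weighted mix of local and remote accesses. The sketch below is a simple model, not a measurement. It assumes the percentages above mean extra time relative to a local access, and the function name average_access_time, the 100 ns local latency, and the 50% remote fraction are hypothetical values chosen for illustration.

```python
def average_access_time(local_ns, remote_penalty, remote_fraction):
    """Estimate the mean memory access time on a NUMA node.

    local_ns        -- latency of a local access (assumed value, in ns)
    remote_penalty  -- extra cost of a remote access as a fraction of local
                       latency (0.10 means 10% slower, 7.0 means 700% slower)
    remote_fraction -- share of accesses that go to another node (0.0 to 1.0)
    """
    remote_ns = local_ns * (1.0 + remote_penalty)
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


if __name__ == "__main__":
    local_ns = 100.0          # hypothetical local latency
    remote_fraction = 0.5     # hypothetical: half of all accesses are remote
    for penalty in (0.05, 0.10, 2.0, 7.0):
        avg = average_access_time(local_ns, penalty, remote_fraction)
        print(f"penalty {penalty:.0%}: average {avg:.1f} ns, "
              f"slowdown x{avg / local_ns:.2f}")
```

In this model the slowdown relative to purely local access is 1 + remote_fraction * remote_penalty. With half of the accesses remote, a 10% penalty costs about 5%, while a 700% penalty makes the average access 4.5 times slower, so data placement can no longer be ignored.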