Parallel programs are easier to write for shared-memory computers, but the peak performance of such machines is severely limited by the small number of processors they can support. For distributed-memory computers the opposite is true: they scale to many processors but are harder to program. One way to combine the merits of both classes is to build computers with a NUMA (Non-Uniform Memory Access) architecture.
NUMA (Non-Uniform Memory Access, also expanded as Non-Uniform Memory Architecture) is a scheme for organizing computer memory in multiprocessor systems in which the memory access time depends on the location of the memory relative to the processor.
A NUMA computer consists of a set of clusters connected to one another by an intercluster bus. Each cluster combines a processor, a memory controller, a memory module, and sometimes I/O devices, all interconnected by a local bus. When a processor needs to read or write memory, it sends a request with the target address to its memory controller. The controller examines the high-order bits of the address to determine which memory module holds the data. If the address is local, the request is placed on the local bus; otherwise it is forwarded to the remote cluster over the intercluster bus. As a result, a program stored in one memory module can be executed by any processor in the system; the only difference is the speed of execution. Local memory references are served much faster than remote ones, so the processor of the cluster where the program is stored can execute it considerably faster, by as much as an order of magnitude, than any other processor.
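The address check performed by the memory controller can be illustrated with a minimal sketch. The 32-bit address width, the use of the top 2 bits to select one of 4 clusters, and the names home_cluster and route are assumptions made for this illustration only; real machines choose their own address layout.

```python
# Minimal sketch of NUMA address routing (all sizes are assumed, not from a real machine).
ADDRESS_BITS = 32                              # assumed physical address width
CLUSTER_BITS = 2                               # assumed: top 2 bits select 1 of 4 clusters
OFFSET_BITS = ADDRESS_BITS - CLUSTER_BITS      # remaining bits address a cell in the module


def home_cluster(address: int) -> int:
    """Return the number of the cluster whose memory module holds the address."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address out of range")
    # The high-order bits identify the memory module (and therefore the cluster).
    return address >> OFFSET_BITS


def route(address: int, requesting_cluster: int) -> str:
    """Decide which bus the memory controller places the request on."""
    if home_cluster(address) == requesting_cluster:
        return "local bus"
    return "intercluster bus"


if __name__ == "__main__":
    # A processor in cluster 0 touches an address in its own module, then a remote one.
    print(route(0x0000_1000, 0))   # high-order bits 00 -> cluster 0 -> local bus
    print(route(0x4000_0000, 0))   # high-order bits 01 -> cluster 1 -> intercluster bus
```

In this sketch every processor sees the same address space, so any of them may issue either request. Only the path taken, and hence the access time, differs.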