OpenSHMEM 1.5 Implementation For Remote Memory Sharing

Intel SHMEM is moving from an open-source GitHub project to a validated product release, offering a Partitioned Global Address Space (PGAS) programming model compatible with OpenSHMEM 1.5
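A minimal host-side OpenSHMEM 1.5 program illustrates the PGAS model Intel SHMEM targets: every PE (processing element) runs the same executable and learns its rank from the runtime. The sketch below uses the standard shmem_* host API; Intel SHMEM is assumed to expose analogous entry points under its own prefix, so treat the mapping as an assumption to check against the release headers.

    // Minimal OpenSHMEM 1.5 "hello world" showing the host-side PGAS model.
    #include <cstdio>
    #include <shmem.h>

    int main() {
        shmem_init();                  // start the OpenSHMEM runtime
        int me   = shmem_my_pe();      // this PE's rank
        int npes = shmem_n_pes();      // total number of PEs
        std::printf("Hello from PE %d of %d\n", me, npes);
        shmem_finalize();
        return 0;
    }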

Unlike NVSHMEM and ROC SHMEM, which target the vendor-specific CUDA and HIP environments, Intel SHMEM's SYCL-based extension does not restrict users to a single vendor's development environment

The current OpenSHMEM memory model is defined from the point of view of a C/C++ application's host memory
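As a hedged illustration of that host-centric model, the sketch below allocates a symmetric object with shmem_malloc (a standard OpenSHMEM 1.5 call) on every PE's host-side symmetric heap and writes into the copy owned by a neighboring PE:

    #include <shmem.h>

    int main() {
        shmem_init();
        // shmem_malloc is collective: every PE allocates an object of the same
        // size at a symmetric address in its own host-side symmetric heap.
        int *sym = static_cast<int *>(shmem_malloc(sizeof(int)));
        *sym = shmem_my_pe();

        shmem_barrier_all();
        // Write this PE's rank into the symmetric object on the next PE.
        int next = (shmem_my_pe() + 1) % shmem_n_pes();
        shmem_int_p(sym, shmem_my_pe(), next);
        shmem_barrier_all();

        shmem_free(sym);
        shmem_finalize();
        return 0;
    }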

Xe-Links allow individual GPU threads to issue loads, stores, and atomic operations to memory on other GPUs within a local group of GPUs
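The sketch below is a hypothetical device-initiated transfer: a single GPU thread inside a SYCL kernel writes into a peer GPU's symmetric memory. The ishmem_* names are assumed to mirror the OpenSHMEM API with an ishmem_ prefix, as in the Intel SHMEM GitHub project; the exact headers and signatures should be verified against the release.

    #include <sycl/sycl.hpp>
    #include <ishmem.h>   // assumed Intel SHMEM header name

    int main() {
        ishmem_init();
        sycl::queue q;
        int my_pe = ishmem_my_pe();
        int peer  = (my_pe + 1) % ishmem_n_pes();

        // Symmetric allocations on the GPU-resident symmetric heap (assumed API).
        int *dst = static_cast<int *>(ishmem_malloc(sizeof(int)));
        int *src = static_cast<int *>(ishmem_malloc(sizeof(int)));

        q.single_task([=]() {
            *src = my_pe;
            // A single GPU thread writes directly into the peer GPU's memory;
            // within an Xe-Link-connected group this can map to load/store traffic.
            ishmem_int_put(dst, src, 1, peer);
        }).wait();

        ishmem_barrier_all();   // assumed to mirror shmem_barrier_all
        ishmem_free(src);
        ishmem_free(dst);
        ishmem_finalize();
        return 0;
    }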

By using hardware copy engines to overlap communication with computation, Xe-Links can run at full speed while the GPU processing cores are computing, though at the cost of some startup latency
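One way to express that overlap in standard OpenSHMEM 1.5 is a non-blocking put followed by computation and a later shmem_quiet; the runtime is then free to hand the transfer to a copy engine. In the sketch below, do_local_work is a hypothetical placeholder for the application's own computation.

    #include <cstddef>
    #include <shmem.h>

    void do_local_work();   // hypothetical placeholder for application computation

    void exchange_and_compute(double *remote_dst, const double *local_src,
                              std::size_t n, int peer) {
        // Start the transfer; the runtime may hand it to a hardware copy engine
        // so it streams over Xe-Link while the caller keeps computing.
        shmem_double_put_nbi(remote_dst, local_src, n, peer);

        do_local_work();     // overlap communication with computation

        // Block until all outstanding non-blocking puts from this PE complete.
        shmem_quiet();
    }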

Intel SHMEM requires that PEs be mapped one-to-one to SYCL devices
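A hedged sketch of how such a mapping might be set up on the SYCL side: each local process selects a distinct GPU before initializing the library. The LOCAL_RANK environment variable is an assumption here; actual launchers expose the local rank under different names.

    #include <sycl/sycl.hpp>
    #include <cstdlib>
    #include <vector>

    sycl::queue make_queue_for_local_rank() {
        // Enumerate the GPUs visible to this process.
        std::vector<sycl::device> gpus =
            sycl::device::get_devices(sycl::info::device_type::gpu);
        const char *env = std::getenv("LOCAL_RANK");   // assumed variable name
        std::size_t local_rank = env ? std::strtoul(env, nullptr, 10) : 0;
        // One PE per device: local rank i gets GPU i.
        return sycl::queue(gpus.at(local_rank % gpus.size()));
    }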

The current Intel SHMEM release includes sample programs, build and run instructions, and documentation of the programming model and API calls

Point-to-point Remote Memory Access (RMA), Atomic Memory Operations (AMO), Signaling, Memory Ordering, Teams, Collectives, Synchronization, and strided RMA operations from OpenSHMEM 1.5 and 1.6 can be invoked from both the host and the device
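As one concrete example of these categories, OpenSHMEM 1.5's put-with-signal combines an RMA transfer with a signal update that the receiving PE can wait on; the sketch below uses the standard host-side calls.

    #include <cstdint>
    #include <shmem.h>

    static uint64_t sig = 0;   // symmetric signal word (global data is symmetric)

    int main() {
        shmem_init();
        constexpr std::size_t N = 1024;
        long *buf     = static_cast<long *>(shmem_malloc(N * sizeof(long)));
        long *payload = static_cast<long *>(shmem_malloc(N * sizeof(long)));

        int me   = shmem_my_pe();
        int peer = (me + 1) % shmem_n_pes();
        for (std::size_t i = 0; i < N; ++i) payload[i] = me;

        // Deliver the data and set the receiver's signal word in one operation.
        shmem_long_put_signal(buf, payload, N, &sig, 1, SHMEM_SIGNAL_SET, peer);

        // Wait until a neighbor's put-with-signal has landed here.
        shmem_signal_wait_until(&sig, SHMEM_CMP_EQ, 1);

        shmem_free(payload);
        shmem_free(buf);
        shmem_finalize();
        return 0;
    }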