5. Plugin - OpenMPDK/SMDK GitHub Wiki

5.1 User Guide

This section explains the technical background, functionality, and testing of the SMDK plugin, libraries, and tools. It gives a practical guide to using the compatible path, the optimization path, and the CXL-CLI tool provided by SMDK.

For an introduction to the compatible and optimization paths, please refer to the User Interface section of the SMDK Architecture chapter.

5.1.1 Compatible Path

The SMDK compatible path provides ways to use CXL memory without modifying applications. For this path, SMDK provides a plugin: the compatible allocator library, which targets the heap segment of a process. This section shows how to use the compatible path on your system by preloading the SMDK compatible library through the LD_PRELOAD environment variable.

5.1.1.1 How to Use

To enable and use the SMDK compatible library, you should set the environment variables LD_PRELOAD and CXLMALLOC_CONF as below before running your applications.

LD_PRELOAD

LD_PRELOAD is an environment variable that lets you override functions of other libraries by supplying your own definitions in a shared object. It is read by the Linux dynamic linker/loader, which loads the shared libraries listed in LD_PRELOAD before any other library.
Please note the following:

  • Only shared libraries (*.so) can be preloaded with LD_PRELOAD, so you should specify libcxlmalloc.so as the LD_PRELOAD target.
  • LD_PRELOAD is an environment variable, so it affects only processes started from the environment in which it is set.

For a usage example of LD_PRELOAD, see run_heap_test.sh (/path/to/SMDK/src/test/heap_allocator/comp_api_c).
If you run $ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so and then run your application, the library is loaded before any other library. As a result, when your application runs, its heap allocation calls (e.g., malloc, calloc) are resolved to SMDK's compatible library first.
Note that you do not need to modify your application or its build script at all to use SMDK's compatible path and APIs.

$ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so
$ ./your_application (written in C, C++, Python, or Java)

CXLMALLOC_CONF

SMDK supports various configuration options, which you specify before running your application. The table below gives the description and default value of each option.
To apply a configuration, set and export the environment variable CXLMALLOC_CONF.

$ export CXLMALLOC_CONF=priority:exmem,exmem_size:4096,normal_size:4096,maxmemory_policy:interleave

Each configuration parameter name and its value are joined by a colon (:), and configurations are separated by commas (,). If the configuration string ends with a comma (e.g., ......, maxmemory_policy:interleave,), the configurations will not be applied properly.

Note: exmem_size and normal_size from the configurations below should not exceed the system's available memory (DDR and CXL) during the runtime of your application. If the above policy is not followed, the system policy for handling out-of-memory (OOM) occurrences (e.g., process kill by OOM killer) will take precedence over the allocator's policy (maxmemory_policy).

Config. Desc. Default Note
use_exmem Whether to enable CXL memory or not. FALSE true/false
priority Which type of memory is allocated to applications first. Once allocations from the higher-priority memory type reach its maximum (exmem_size or normal_size), the SMDK allocator tries to allocate from the lower-priority type. normal exmem/normal
exmem_size Maximum usage of CXL memory. If the cumulative size of CXL memory allocations exceeds this value, the SMDK allocator performs a follow-up operation according to the current memory priority and the maxmemory_policy below.
A value with no unit after the number is interpreted as MB. You can also append a unit such as m/M/g/G (e.g., 1024 == 1024M == 1G). A value of -1 means unlimited.
1048576(MB)
normal_size Maximum usage of normal memory (DRAM). If the cumulative size of DRAM allocations exceeds this value, the SMDK allocator performs a follow-up operation according to the current memory priority and the maxmemory_policy below.
A value with no unit after the number is interpreted as MB. You can also append a unit such as m/M/g/G (e.g., 1024 == 1024M == 1G). A value of -1 means unlimited.
1048576(MB)
maxmemory_policy interleave: Allocates the high (first)-priority memory type first. Once the amount specified in {high-priority-mem}_size has been allocated, allocates the low (second)-priority type. Once the amount specified in {low-priority-mem}_size has also been allocated, allocates the high-priority type again.

remain: Allocates the high-priority memory type first. Once the amount specified in {high-priority-mem}_size has been allocated, allocates the low-priority type. Regardless of the value set in {low-priority-mem}_size, the allocator then keeps allocating from the second-priority type.

oom (out-of-memory): Allocates memory in order of priority. Once every DRAM/CXL memory type has been allocated up to the amount specified in {mem-type}_size, no more memory is allocated (an error is returned).
oom
use_auto_arena_scaling Affects the following two things:
1) The number of arenas generated during initialization would be:
 false: static number.
 true: in proportion to the number of CPUs.
2) Arena allocation to threads would be:
 false: a round-robin way.
 true: determined based on CPU_ID the thread is running on.
TRUE
exmem_partition_range Sets a CXL memory expander interleave policy.
This configuration can give you the effect of CXL memory expanders' bandwidth aggregation and resource isolation.
If you specify multiple CXL.mem device nodes, SMDK allocator configures and returns a memory chunk from the memory pool of all the specified nodes. That is, as the number of nodes increases, the bandwidth would increase.
If you specify only one CXL.mem device node, SMDK allocator configures and returns a memory chunk from the memory pool of the specified node only. In this case, the bandwidth cannot be higher than the maximum bandwidth of each single device but you can achieve memory resource isolation effect by using only certain CXL.mem devices.
Nodes may be specified as N,N,N or N-N or N,N-N or N-N,N-N and so forth. You may also set this config to all, which means all CXL nodes in your system.
If a normal memory node is specified to this configuration, the value is ignored.
If a value outside the range of nodes in the system is specified, this configuration is ignored due to a setup error.
N/A
(no policy)
all
N,N
N-N
...
use_adaptive_interleaving Whether to enable Bandwidth-based Interleaving or not. FALSE true/false
adaptive_interleaving_policy bw_saturation: Sets the adaptive interleaving policy to bandwidth saturation. If DDR DRAM bandwidth is saturated, CXL DRAM is used automatically. When CXL DRAM bandwidth is also saturated, memory is allocated according to the system fallback order.
bw_order: Sets the adaptive interleaving policy to bandwidth order. With this policy, requested memory is allocated from the node with the highest bandwidth.
weighted_interleaving: Sets the adaptive interleaving policy to weighted interleaving, which is applied as the application's memory policy. The weight ratio of each node is determined dynamically at runtime from the available bandwidth reported by tierd.
bw_saturation
interleave_node Optionally, set weighted interleaving policy's interleaving nodelists. If this config isn't set, interleave all system nodes. N/A N,M
N-M

Note: priority, maxmemory_policy, exmem_size, and normal_size are closely related to each other. Please refer to the section Capacity-based Tiering for details of how each configuration behaves. Also, if use_adaptive_interleaving is true, priority, normal_size/exmem_size, and maxmemory_policy are ignored. To use the adaptive interleaving feature of the SMDK allocator, additional settings are required. Please refer to the section Bandwidth-based Tiering for more details.

Usage

You can enable the SMDK Allocator library by exporting LD_PRELOAD and CXLMALLOC_CONF environment variables as shown below.

$ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so
$ export CXLMALLOC_CONF=use_exmem:true,exmem_size:16384,normal_size:16384,maxmemory_policy:remain,use_auto_arena_scaling:false,priority:exmem
$ ./your_application

Operation Verification

Before describing the test cases and applications provided by SMDK, this section shows how to check CXL memory usage. It does so through the example of an in-memory database application (Memcached) on a system based on SMDK and CXL memory.

Configuration

Memcached server with SMDK/CXL

[SMDK CXLMALLOC_CONF]
use_exmem:true
normal_size:2048
exmem_size:20480
maxmemory_policy:remain
priority:normal

[Memcached]
--memory-limit=22528
--conn-limit=1400

Memcached client (memtier_benchmark)

[memtier_benchmark]
--threads=24
--clients=50
--requests=2000
--data-size=4096
--ratio=1:0 (100% SET W/L)

  • SMDK configuration on the Memcached server: normal_size is set to 2GB and exmem_size is set to 20GB. priority is normal, and maxmemory_policy is remain. In other words, the SMDK allocator serves heap memory requests from DRAM up to 2GB, and then from CXL memory.
  • memtier_benchmark (Memcached client) configuration: 1200 clients (24 threads x 50 client connections per thread) each send 2000 SET commands to the Memcached server. Each value is 4KB, so the total data the Memcached server has to store is around 9.2GB. The actual amount of memory used by the Memcached server is around 11GB, including internal structures and metadata.

CXL memory usage: Buddy information (/proc/buddyinfo)

One of the easiest ways to find out whether CXL memory devices have been successfully initialized and are being used properly is to check the system's buddy allocation information with the command $ cat /proc/buddyinfo. The table below is an example of /proc/buddyinfo output on a system with a page size of 4KB. It shows the number of free (unused or unallocated) chunks of each size from 4KB to 4MB, for each memory zone managed by the kernel virtual memory manager (VMM). Please note that the row "Node 1, zone Movable" indicates CXL memory.

                       4 KB   8 KB  16 KB  32 KB  64 KB 128 KB 256 KB 512 KB   1 MB   2 MB   4 MB
Node 0, zone      DMA      1      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32      3      8      3      4      4      5      3      5      4      4    436
Node 0, zone   Normal  11975   9494   6262   4306   2617   1547    930    537    332     72  45610
Node 1, zone  Movable      1      0      1      1      0      1      1      1      0      1  98300

Running test and changes of buddyinfo

  • Launch Memcached server: The table below shows the initial state of the free buddy chunks right after launching the Memcached server. The test system has three 128GB CXL memory expanders, and the Movable zone below shows 98300 free 4MB chunks (4MB * 98300 ≒ 384GB).
Node 0, zone      DMA      1      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32      3      8      3      4      4      5      3      5      4      4    436
Node 0, zone   Normal  11975   9494   6262   4306   2617   1547    930    537    332     72  45610
Node 1, zone  Movable      1      0      1      1      0      1      1      1      0      1  98300
  • Set data (~2GB): As described above, SMDK allocates from the Normal zone for the first 2GB of heap allocation requests. In the buddyinfo table, the number of free 4MB chunks in the Normal zone decreased from 45610 to 45037 (≒2.2GB). Taking the increases and decreases of the other free chunk sizes into account, 2134756KB (2.04GB) has been allocated from the Normal zone. SMDK has now begun to allocate from the Movable zone: the number of free 4MB chunks in the Movable zone decreased slightly, from 98300 to 98050.
Node 0, zone      DMA      1      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32      3      8      3      4      4      5      3      5      4      4    436
Node 0, zone   Normal    204   1291   3670   4713   3701   2348   1367    709    306     76  45037
Node 1, zone  Movable      1      0      0      0      1      1      1      1      1      0  98050
  • Set data (~11GB): The table below shows the state of the free chunks right after all SET requests from the Memcached client have been processed. Comparing the free memory of the Movable zone before and after, about 9.2GB has been allocated from the Movable zone (402639796KB → 392957800KB).
Node 0, zone      DMA      1      0      0      1      2      1      1      0      1      1      3
Node 0, zone    DMA32      3      8      3      4      4      5      3      5      4      4    436
Node 0, zone   Normal   4848   1032   2577   5042   3690   2340   1362    711    304     78  45028
Node 1, zone  Movable      0      1      0      1      1      0      1      1      1      1  95936

5.1.1.2 API List

The SMDK compatible path provides ways to use CXL memory without modifying application software. Below is the list of compatible APIs that SMDK provides. Technically, these are the standard POSIX heap APIs that a process calls dynamically to extend and shrink its heap segment.

API Desc. Note
malloc Allocates size bytes of uninitialized memory.
calloc Allocates memory for an array of num objects of size and initializes all bytes in the allocated memory to zero.
realloc Reallocates the given area of memory.
free Deallocates the space previously allocated by malloc(), calloc(), aligned_alloc(), or realloc().
posix_memalign Allocates size bytes aligned on a boundary specified by alignment, and returns a pointer to the allocated memory in memptr.
aligned_alloc Allocates size bytes of uninitialized memory whose alignment is specified by alignment. C++
new Allocates requested number of bytes. C++
delete Deallocates memory previously allocated by a matching operator new. C++
mmap, mmap64 Creates a new mapping of memory in the virtual address space of the calling process.

1. malloc

void* malloc(size_t size);

Parameters

  • size: number of bytes to allocate.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc().
  • On failure, returns a null pointer.

2. calloc

void* calloc(size_t num, size_t size);

Parameters

  • num: number of objects.
  • size: size of each object.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc().
  • On failure, returns a null pointer.

3. realloc

void *realloc(void *ptr, size_t new_size);

Parameters

  • ptr: pointer to the memory area to be reallocated.
  • new_size: new size of the array in bytes.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc(). The original pointer ptr is invalidated and any access to it is undefined behavior (even if reallocation was in-place).
  • On failure, returns a null pointer. The original pointer ptr remains valid and may need to be deallocated by calling free() or realloc().

4. free

void free(void *ptr);

Parameters

  • ptr: pointer to the memory to deallocate.

Return Value

  • N/A

5. posix_memalign

int posix_memalign(void **memptr, size_t alignment, size_t size);

Parameters

  • memptr: pointer that shall be returned. Upon success, the value pointed to by memptr shall be a multiple of alignment.
  • alignment: specifies the alignment. The value of alignment shall be a power of two multiple of sizeof(void *).
  • size: number of bytes to allocate.

Return value

  • On success, returns zero.
  • On failure, an error number shall be returned to indicate the error, and the contents of memptr shall either be left unmodified or be set to a null pointer.

6. aligned_alloc

void *aligned_alloc(size_t alignment, size_t size);

Parameters

  • alignment: specifies the alignment. Must be a valid alignment supported by the implementation.
  • size: number of bytes to allocate. An integral multiple of alignment.

Return Value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc().
  • On failure, returns a null pointer.

7. new, new[]

void * operator new(std::size_t size);
void * operator new[](std::size_t size);

Parameters

  • size: size in bytes of the requested memory block.

Return value

  • Returns a pointer suitably aligned to point to an object of the requested size.

8. delete, delete[]

void operator delete(void *ptr);
void operator delete[](void *ptr);

Parameters

  • ptr: A pointer to the memory block to be released.

Return value

  • N/A

9. mmap, mmap64

void *mmap(void *addr, size_t len, int protection, int flags, int fd, off_t offset);
void *mmap64(void *addr, size_t len, int protection, int flags, int fd, off64_t offset);

Parameters

  • addr: the starting address of the memory area to be mapped.
  • len: the length in bytes to map.
  • protection: the access allowed to this process for this mapping. (PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC)
  • flags: further defines the type of mapping desired. (MAP_SHARED, MAP_PRIVATE, MAP_FIXED)
  • fd: an open file descriptor.
  • offset: the offset into the file, in bytes, where the map should begin.

Return value

  • On success, mmap() returns a pointer to the mapped area.
  • On error, the value MAP_FAILED (that is, (void *)-1) is returned, and errno is set to indicate the error.

5.1.1.3 Intelligent Tiering Engine

As shown in the How to Use section, SMDK includes an intelligent tiering engine that supports a variety of memory use cases through application configurations: tiering by priority, capacity, and bandwidth among memory types.

5.1.1.3.1 Capacity-based Tiering (priority, size)

By setting the priority, exmem_size/normal_size, and maxmemory_policy parameters of the CXLMALLOC_CONF environment variable, you can configure the memory type priority, the maximum usage of each memory type, and the memory usage policy for your application.
The memory usage policy determines how memory is allocated once usage has reached the predefined maximum of both the first- and second-priority memory types.

  • Interleave: the application is allocated memory from the higher-priority memory type again.
  • Remain: the application continues to be allocated memory from the second-priority memory type.
  • OOM (Out Of Memory): SMDK returns an error. In this case, the application can obtain memory again only after allocated pages are reclaimed. This policy is designed for the case where the conventional Linux virtual memory subsystem restricts system or process swappiness.

Example

Consider a user scenario with the Redis in-memory database: a Redis client requests that 1GB of key-value data be stored on a Redis server running with SMDK and a CXL memory expander. Note that the maximum allocation size for each memory type is 64MB.

[Redis-server with SMDK]
priority: ExMem
exmem_size: 64MB
normal_size: 64MB
maxmemory_policy: interleave / remain / oom 
 
[Redis-client]
1MB value x 1000 keys (=around 1GB)

With the above configuration, SMDK allocates memory under each maxmemory_policy as illustrated in the following figure.

[Figure: SMDK memory allocation under each maxmemory_policy (interleave, remain, oom)]

  • Interleave: The allocator first allocates 64MB of CXL memory to the Redis server, then 64MB of NORMAL (DRAM) memory. After allocating 64MB of DRAM, it allocates CXL memory again, and this cycle repeats.
  • Remain: The allocator first allocates 64MB of CXL memory to the Redis server, then 64MB of NORMAL (DRAM) memory. After allocating all 64MB of DRAM, it does not return to CXL memory. In other words, after the initial allocation from CXL memory (because priority is exmem), it keeps allocating DRAM.
  • OOM: The allocator first allocates 64MB of CXL memory to the Redis server, then 64MB of NORMAL (DRAM) memory. After allocating all 64MB of DRAM, SMDK does not allocate any more memory (i.e., it returns an error).

5.1.1.3.2 Bandwidth-based Tiering (adaptive interleaving)

By setting the use_adaptive_interleaving and adaptive_interleaving_policy parameters of the CXLMALLOC_CONF environment variables, you can configure your applications' bandwidth-aware memory policies.

Adaptive interleaving currently provides three policies.

  • bw_saturation: When DDR DRAM bandwidth is saturated, incoming memory allocation requests are served from CXL DRAM. This is designed to mitigate imbalanced memory use on a tiered DDR/CXL memory system.
  • bw_order: Uses the node with the highest bandwidth in the system when allocating memory. This is designed to use CXL DRAM more actively by ordering allocations by bandwidth, rather than by latency as existing Linux does.
  • weighted_interleaving: Similar to the existing interleave policy, except that memory is allocated according to an interleave weight ratio instead of evenly. This is designed to improve overall system performance by taking the bandwidth difference between DDR DRAM and CXL DRAM into account.

Example

To use adaptive interleaving, first check with check_tierd.sh that the required components (monitor, kernel driver, SMDK allocator, and PMU plugin, as described in Intelligent Tiering Engine) work properly. If the check passes, you are ready to use adaptive interleaving. These components can be started all at once with the provided run_tierd.sh script, as shown below. Note that root privileges are required to load kernel modules, and you should modify the configuration file (/path/to/SMDK/lib/tierd/tierd.conf) to match your repository path.

$ cd /path/to/SMDK/lib/tierd
# Modify MLC_PATH and AMD_UPROFPCM_PATH(in case of AMD architecture)
$ vi tierd.conf
$ sudo ./run_tierd.sh

Note: If the script fails to run, make sure you are booting with the SMDK kernel. For more details, please refer to the Intelligent Tiering Engine section of the Installation chapter.

Alternatively, you can run it manually as shown below.

$ cd /path/to/SMDK/lib/tierd
$ sudo ./tierd -c ./tierd.conf

The current version of adaptive interleaving uses Intel MLC as a memory-intensive workload generator to measure the maximum bandwidth of each node. When tierd starts, it runs MLC once, which can take several minutes. Adaptive interleaving compares the real-time bandwidth against this measured value to determine whether DDR DRAM bandwidth is saturated. You therefore need to wait for tierd to finish running MLC before using the adaptive interleaving feature; you can verify this from tierd's output logs, as shown below.

$ sudo ./tierd -c ./tierd.conf

......

Monitor Constructor
Launch Monitor
Launch Bandwidth Loader Workload.. /path/to/smdk/lib/mlc/Linux/mlc

......

Bandwidth Loader Workload Finish (##.###s)
Notify...

......

You are now all ready to use adaptive interleaving. To enable an application to use the feature with SMDK Allocator, use_adaptive_interleaving parameter should be set to true in CXLMALLOC_CONF environment variable as follows.

$ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so
$ export CXLMALLOC_CONF=use_exmem:true,use_adaptive_interleaving:true,adaptive_interleaving_policy:bw_saturation
$ ./your_application

The criterion for interleaving is selected by adaptive_interleaving_policy. The available options are bw_saturation, bw_order, and weighted_interleaving; bw_saturation is the default value of this configuration parameter.

To terminate the execution of the userspace daemon, run the script below.

$ cd /path/to/SMDK/lib/tierd
$ sudo ./stop_tierd.sh

5.1.2 Optimization Path

5.1.2.1 How to Use

Unlike the compatible path, the optimization path library does not require you to set LD_PRELOAD and CXLMALLOC_CONF.
Instead, you need to

  • Include the header file (/path/to/SMDK/lib/smdk_allocator/opt_api/include/smdk_opt_api.h) in your application code.
  • Re-write your application with SMDK optimization APIs for better memory optimization.
  • Modify your build script so that your application can be built with SMDK library (Add library path for libsmalloc.so or libsmalloc.a, and libpnm.so; /path/to/SMDK/lib/smdk_allocator/lib).

Makefile modification and LD_LIBRARY_PATH

As for modifying the build script of your application,

  • If you want to link the shared library of SMDK (libsmalloc.so), you need to set the library path and library name with -L(path) and -l(name) options, respectively.
  • If you want to link the static library of SMDK (libsmalloc.a), you need to set the full path of the library to the compiler flag.
  • Regardless of the ways above, you should specify the path of the header file in which the SMDK optimization APIs are defined by using the -I option. In the example below, it is added to CFLAGS variable.
### Example Makefile to link SMDK *optimization* API library: 
 
# dynamic link
......
CFLAGS += -I/path/to/SMDK/lib/smdk_allocator/opt_api/include
LDFLAGS += -L/path/to/SMDK/lib/smdk_allocator/lib/
LIBS += -lsmalloc
......
all: $(APP)
$(APP): $(APP).o
    $(CC) -o $@ $^ $(CFLAGS) $(LDFLAGS) $(LIBS)
 
 
# static link
......
CFLAGS += -I/path/to/SMDK/lib/smdk_allocator/opt_api/include
LIBS += /path/to/SMDK/lib/smdk_allocator/lib/libsmalloc.a
......
all: $(APP)
$(APP): $(APP).o
    $(CC) -o $@ $^ $(CFLAGS) $(LDFLAGS) $(LIBS)
/* example application code */
#include "smdk_opt_api.h"
...
int main(void) {
    ......
    void *buf1 = s_malloc(SMDK_MEM_EXMEM, 4*1024); // 4KB CXL memory allocation request
    void *buf2 = s_malloc(SMDK_MEM_NORMAL, 128); // 128B DRAM allocation request
    ......
    s_free_type(SMDK_MEM_EXMEM, buf1); // or s_free(buf1);
    s_free_type(SMDK_MEM_NORMAL, buf2); // or s_free(buf2);
    ......
}

If you choose dynamic linking, make sure to specify the SMDK library's path in the LD_LIBRARY_PATH environment variable before running your application, as shown below.

$ export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib

For more information about how to use optimization API library, refer to the test applications named opt_api, opt_api_cpp, and metadata_api at /path/to/SMDK/src/test/heap_allocator.

SMALLOC_CONF

Unlike the compatible path, where many configurations can be set through the CXLMALLOC_CONF environment variable, the optimization path supports only one option, use_auto_arena_scaling, through the SMALLOC_CONF environment variable. SMALLOC_CONF is set in the same way as CXLMALLOC_CONF, described in the Compatible Path section.

Config. Desc. Default Note
use_auto_arena_scaling Affects the following two things:
1) The number of arenas generated during initialization would be:
 false: static number.
 true: in proportion to the number of CPUs.
2) Arena allocation to threads would be:
 false: a round-robin way.
 true: determined based on CPU_ID the thread is running on.
TRUE
$ export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib  #if needed
$ export SMALLOC_CONF=use_auto_arena_scaling:true
$ ./your_application

PNM API

To use the SMDK PNM API, a few more steps are required. (Currently, the PNM API works on a C++ basis.)
You need to

  • Include the header file (/path/to/SMDK/lib/smdk_allocator/opt_api/include/smdk_opt_api.hpp) in your application code.
  • Re-write your application with SMDK PNM API for better processing operations.
  • Modify the Makefile to set the library name with -l(name) option.
  • Specify the path of the PNMLibrary header files which the SMDK PNM API includes by using the -I option (/path/to/SMDK/lib/PNMLibrary-pnm-v3.0.0/build/libs/include/). In the example below, it is added to CXXFLAGS variable.
### Example Makefile to link *SMDK PNM* API library:
......
CXXFLAGS += -I/path/to/SMDK/lib/smdk_allocator/opt_api/include -I/path/to/SMDK/lib/PNMLibrary-pnm-v3.0.0/build/libs/include
LDFLAGS += -L/path/to/SMDK/lib/smdk_allocator/lib/
LIBS += -lsmalloc -lpnm
......
all: $(APP)
$(APP): $(APP).o
    $(CXX) -o $@ $^ $(CXXFLAGS) $(LDFLAGS) $(LIBS)
/* example application code for IMDB - Range Scan Operation */
#include "smdk_opt_api.hpp"
...
int main(void) {
    ......
    SmdkAllocator& allocator = SmdkAllocator::get_instance();
    allocator.process(SmdkAllocator::Device::PNM,
                      SmdkAllocator::PNMType::IMDB,
                      SmdkAllocator::Operation::ScanRange,
                      column, ranges, results);
    ......
}

Since PNMLibrary is linked dynamically, you should specify the PNMLibrary's path in LD_LIBRARY_PATH environment variable before running your application as shown below.

$ export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib

For more information about how to use SMDK PNM API, refer to the test applications named pnm at /path/to/SMDK/src/test.

5.1.2.2 API List

There are three types of API sets in the optimization path: Allocation API, Metadata API, and PNM API. For the C-based Allocation and Metadata API sets, SMDK also provides an allocator class (SmdkAllocator) that can be used from C++. The PNM API works on a C++ basis.

C API Desc. C++ API Device Type
smdk_memtype_t Datatype which represents memory type. Contains two elements SMDK_MEM_NORMAL and SMDK_MEM_EXMEM. DDR, CXL DRAM
s_malloc Allocates size bytes of uninitialized memory. SmdkAllocator::malloc(type, size) DDR, CXL DRAM
s_calloc Allocates memory for an array of num objects of size and initializes all bytes in the allocated memory to zero. SmdkAllocator::calloc(type, num, size) DDR, CXL DRAM
s_realloc Reallocates the given area of memory on designated memory type. Can get any target type of memory regardless of original location of given area. SmdkAllocator::realloc(type, ptr, size) DDR, CXL DRAM
s_free Deallocates the space previously allocated by s_malloc(), s_calloc(), s_posix_memalign(), or s_realloc(). SmdkAllocator::free(ptr) DDR, CXL DRAM
s_free_type Deallocates the space previously allocated on designated type of memory. Operates fine although target type of memory and location of pointer do not match. SmdkAllocator::free(type, ptr) DDR, CXL DRAM
s_posix_memalign Allocates size bytes aligned on a boundary specified by alignment, and returns a pointer to the allocated memory in memptr. SmdkAllocator::posix_memalign(type, memptr, alignment, size) DDR, CXL DRAM
s_get_memsize_total Returns total size of requested type of memory. SmdkAllocator::get_memsize_total(type) DDR, CXL DRAM
s_get_memsize_used Returns size of memory (bytes) allocated with SMDK APIs. SmdkAllocator::get_memsize_used(type) DDR, CXL DRAM
s_get_memsize_available Returns available size of requested type of memory. (bytes) SmdkAllocator::get_memsize_available(type) DDR, CXL DRAM
s_get_memsize_node_total Returns total size of requested type of memory of node. (bytes) DDR, CXL DRAM
s_get_memsize_node_available Returns available size of requested type of memory of node. DDR, CXL DRAM
s_stats_print Prints the three values above (total, used, and available) per type of memory. SmdkAllocator::stats_print(unit) DDR, CXL DRAM
s_stats_node_print Prints the three values above (total, used, and available) per node and per type. SmdkAllocator::stats_node_print(unit) DDR, CXL DRAM
s_enable_node_interleave Sets interleave policy of calling thread with mmap() syscall. SmdkAllocator::enable_node_interleave(nodes) DDR, CXL DRAM
s_disable_node_interleave Unsets interleave policy of calling thread. SmdkAllocator::disable_node_interleave() DDR, CXL DRAM
s_malloc_node Allocates memory from specified node. SmdkAllocator::malloc_node(type, size, node) DDR, CXL DRAM
s_free_node Deallocates the space previously allocated by s_malloc_node(). SmdkAllocator::free_node(type, mem, size) DDR, CXL DRAM
Processes IMDB Scan operation with the PNM device SmdkAllocator::process(PNM, type=IMDB, op, data, op_info, result) PNM
Processes DLRM SLS operation with the PNM device SmdkAllocator::process(PNM, type=DLRM, SLS, data, op_info, result) PNM

1. smdk_memtype_t

Data structure used as parameter of allocation functions to represent target memory types.

typedef enum {
    SMDK_MEM_NORMAL=0,
    SMDK_MEM_EXMEM
} smdk_memtype_t;

2. s_malloc

void *s_malloc(smdk_memtype_t type, size_t size);

Parameters

  • type: target memory type to allocate from. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • size: number of bytes to allocate.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free(), s_free_type() or s_realloc().
  • On failure, returns a null pointer (e.g. invalid memory type, lack of available memory, etc.).

3. s_calloc

void* s_calloc(smdk_memtype_t type, size_t num, size_t size);

Parameters

  • type: target memory type to allocate from. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • num: number of objects.
  • size: size of each object.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free(), s_free_type() or s_realloc().
  • On failure, returns a null pointer (e.g. invalid memory type, lack of available memory, etc.).

4. s_realloc

void *s_realloc(smdk_memtype_t type, void *ptr, size_t new_size);

Parameters

  • type: target memory type for the reallocated buffer. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • ptr: pointer to the memory area to be reallocated.
  • new_size: new size of the array in bytes.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free(), s_free_type() or s_realloc(). The original pointer ptr is invalidated and any access to it is undefined behavior (even if reallocation was in-place).
  • On failure, returns a null pointer (e.g. invalid memory type or address, etc.). The original pointer ptr remains valid and may need to be deallocated by calling s_free(), s_free_type() or s_realloc().

If the memory type of the original ptr differs from the requested type, this function deallocates the original buffer and returns a pointer to a new buffer allocated from the memory type you specified.


5. s_free

void s_free(void *ptr);

Parameters

  • ptr: pointer to the memory to deallocate.

Return Value

  • N/A

This function does not require the memory type of ptr, but deallocation may take slightly longer than calling s_free_type() with the correct type.


6. s_free_type

void s_free_type(smdk_memtype_t type, void *ptr);

Parameters

  • type: target memory type to free. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • ptr: pointer to the memory to deallocate.

Return Value

  • N/A

Set type to the memory type from which ptr was allocated. If the specified type differs from the actual memory type of ptr, the memory is still deallocated normally, but deallocation may take longer than when the type is specified correctly.


7. s_posix_memalign

int s_posix_memalign(smdk_memtype_t type, void **memptr, size_t alignment, size_t size);

Parameters

  • type: target memory type to allocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • memptr: pointer that shall be returned. Upon success, the value pointed to by memptr shall be a multiple of alignment.
  • alignment: specifies the alignment. The value of alignment shall be a power of two multiple of sizeof(void *).
  • size: number of bytes to allocate.

Return value

  • On success, returns zero.
  • On failure (e.g. invalid memory type or address, etc.), an error number shall be returned to indicate the error and the contents of memptr shall either be left unmodified or be set to a null pointer.
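The alignment rule above (a power of two that is also a multiple of sizeof(void *)) can be checked before calling the allocator. The following is a minimal sketch in Python, not part of SMDK; the helper name is_valid_alignment and its pointer_size parameter are hypothetical and exist only for illustration.

```python
import ctypes


def is_valid_alignment(alignment, pointer_size=ctypes.sizeof(ctypes.c_void_p)):
    """Return True if alignment is a power of two and a multiple of pointer_size.

    pointer_size defaults to sizeof(void *) on the running platform.
    """
    if alignment <= 0:
        return False
    # A power of two has exactly one bit set, so clearing the lowest set bit gives 0.
    is_power_of_two = (alignment & (alignment - 1)) == 0
    return is_power_of_two and alignment % pointer_size == 0


if __name__ == "__main__":
    for candidate in (4, 8, 24, 64, 4096):
        print(candidate, is_valid_alignment(candidate, pointer_size=8))
```

With an 8-byte pointer, 24 is rejected because it is not a power of two, and 4 is rejected because it is smaller than the pointer size.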

8. s_get_memsize_total

size_t s_get_memsize_total(smdk_memtype_t type);

Parameters

  • type: memory type to query. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)

Return value

  • Returns the system total memory of requested type based on /proc/zoneinfo (bytes).
  • Returns 0 on invalid memory type.

9. s_get_memsize_used

size_t s_get_memsize_used(smdk_memtype_t type);

Parameters

  • type: memory type to query. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)

Return value

  • Returns used memory of the requested type which is allocated by SMDK heap allocation APIs (bytes).
  • Returns 0 on invalid memory type.

10. s_get_memsize_available

size_t s_get_memsize_available(smdk_memtype_t type);

Parameters

  • type: memory type to query. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)

Return value

  • Returns system available memory of the requested type based on /proc/buddyinfo (bytes).
  • Returns 0 on invalid memory type.

11. s_get_memsize_node_total

size_t s_get_memsize_node_total(smdk_memtype_t type, int node);

Parameters

  • type: memory type to query. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • node: ID of the node to query.

Return value

  • Returns system total memory of the requested type in designated node based on /proc/zoneinfo (bytes).
  • Returns 0 on invalid memory type or node.

12. s_get_memsize_node_available

size_t s_get_memsize_node_available(smdk_memtype_t type, int node);

Parameters

  • type: memory type to query. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • node: ID of the node to query.

Return value

  • Returns system available memory of requested type in designated node based on /proc/buddyinfo (bytes).
  • Returns 0 on invalid memory type or node.
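The available-memory functions above are described as being based on /proc/buddyinfo. As a rough illustration of how free memory can be derived from that file, the sketch below sums the free blocks per order for one zone, optionally restricted to one node. It is an assumption-laden sketch rather than SMDK's implementation: the function name free_bytes, the sample text, the 4 KiB page size, and the zone name passed in are all hypothetical, and the zone that holds CXL memory depends on the kernel in use.

```python
# Sample text in the /proc/buddyinfo layout: each column after the zone name is
# the number of free blocks of 2**order pages, starting at order 0.
SAMPLE_BUDDYINFO = """\
Node 0, zone      DMA      1      1      0      0      0      0      0      0      0      1      3
Node 0, zone   Normal    100     50     25     10      5      2      1      0      0      0      1
Node 1, zone   Normal      4      2      0      0      0      0      0      0      0      0      8
"""


def free_bytes(buddyinfo_text, zone, node=None, page_size=4096):
    """Sum the free memory (in bytes) of `zone`, optionally for a single node."""
    total_pages = 0
    for line in buddyinfo_text.splitlines():
        fields = line.replace(",", " ").split()
        # Expected layout: Node <id> zone <name> <count order 0> <count order 1> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        line_node, line_zone = int(fields[1]), fields[3]
        if line_zone != zone or (node is not None and line_node != node):
            continue
        for order, count in enumerate(fields[4:]):
            total_pages += int(count) * (1 << order)
    return total_pages * page_size


if __name__ == "__main__":
    print(free_bytes(SAMPLE_BUDDYINFO, "Normal"))
    print(free_bytes(SAMPLE_BUDDYINFO, "Normal", node=1))
```

On a real system the text would come from reading /proc/buddyinfo, and the page size from os.sysconf("SC_PAGE_SIZE").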

13. s_stats_print

void s_stats_print(char unit);

Parameters

  • unit: Memory units to display statistic information. (k/K/m/M/g/G)

Return value

  • N/A

Result

  • Prints out total / used / available memory statistic information of each type of memory. Below is an example of a console screen output after executing this function on a system equipped with 64GB DRAM and 32GB CXL.mem.
SMDK Memory allocation stats:
    Type             Total              Used         Available
  Normal            62.6GB             0.0GB            57.2GB
   ExMem            32.0GB             0.0GB            32.0GB

14. s_stats_node_print

void s_stats_node_print(char unit);

Parameters

  • unit: Memory units to display statistic information. (k/K/m/M/g/G)

Return value

  • N/A

Result

  • Prints out total / used / available memory statistic information of each type of memory per node.
    Type   Node             Total         Available
  Normal      0            32.0GB            27.9GB
  ExMem       1            32.0GB            29.0GB
  ExMem       2            32.0GB            28.1GB

15. s_enable_node_interleave

void s_enable_node_interleave(char *nodes);

Parameters

  • nodes: nodes to interleave. ("a"/"a,b"/"a-b")

Return Value

  • N/A

Result

  • Sets the interleave policy of the calling thread. Once the policy is set, memory requested with the mmap syscall is allocated from the designated nodes.
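The nodes argument accepts the forms "a", "a,b", and "a-b". The sketch below shows one way such a string could be expanded into a list of node IDs, for example to validate it before passing it on. The helper parse_node_list is hypothetical and not part of SMDK; accepting mixed forms such as "0,2-3" is an assumption of this sketch, not something the documentation above states.

```python
def parse_node_list(nodes):
    """Expand a node string such as "1", "0,2" or "1-3" into a sorted list of IDs."""
    result = set()
    for part in nodes.split(","):
        part = part.strip()
        if "-" in part:
            # Range form "a-b", inclusive on both ends.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid node range: {part!r}")
            result.update(range(start, end + 1))
        else:
            result.add(int(part))
    return sorted(result)


if __name__ == "__main__":
    for text in ("1", "0,2", "1-3"):
        print(text, "->", parse_node_list(text))
```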

16. s_disable_node_interleave

void s_disable_node_interleave(void);

Parameters

  • N/A

Return Value

  • N/A

Result

  • Unsets the interleave policy of the calling thread.

17. s_malloc_node

void* s_malloc_node(smdk_memtype_t type, size_t size, char *node);

Parameters

  • type: target memory types to allocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • size: size to allocate.
  • node: node to allocate memory from.

Return value

  • On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free_node().
  • On failure, returns a null pointer (e.g. invalid memory type or node, lack of available memory, etc.).

Result

  • Allocates the requested size of memory from the designated node. Specify exactly one node. The node parameter must also be consistent with the type parameter.

18. s_free_node

void s_free_node(smdk_memtype_t type, void* mem, size_t size);

Parameters

  • type: memory type of the memory to deallocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • mem: address of memory to deallocate.
  • size: size of memory to deallocate.

Result

  • Deallocates memory which is allocated by s_malloc_node(). The parameter type is necessary for metadata management. As directly mapped memory is not managed by SMDK allocator library, this function requires size parameter.

19. SmdkAllocator::process

void SmdkAllocator::process(Device device, PNMType type, Operation op,
                            const pnm::imdb::compressed_vector &column,
                            const pnm::imdb::Ranges &ranges,
                            pnm::imdb::BitVectors &results)

void SmdkAllocator::process(Device device, PNMType type, Operation op,
                            const pnm::imdb::compressed_vector &column,
                            const pnm::imdb::Ranges &ranges,
                            pnm::imdb::IndexVectors &results)

void SmdkAllocator::process(Device device, PNMType type, Operation op,
                            const pnm::imdb::compressed_vector &column,
                            const pnm::imdb::Predictors &predictors,
                            pnm::imdb::BitVectors &results)

void SmdkAllocator::process(Device device, PNMType type, Operation op,
                            const pnm::imdb::compressed_vector &column,
                            const pnm::imdb::Predictors &predictors,
                            pnm::imdb::IndexVectors &results)

template <typename T>
void SmdkAllocator::process(Device dev, PNMType type, Operation op,
                            const SlsTable<T> &table,
                            const SlsParam &op_param,
                            std::vector<T> &results)

Parameters

  • device: device to process operation. (PNM)
  • type: target application type of the PNM device. (IMDB / DLRM)
  • op: operation to process. (ScanRange / ScanList / Sls)
  • column, table: data on which the operation is performed.
  • ranges, predictors, op_param: information about the operation.
  • results: reference to the container in which the operation outputs are stored.

Return value

  • N/A

Result

  • Processes operation with specified device. Currently, SMDK PNM API supports PNM device.

Notice

  • For IMDB Scan operation, (Range / List) scans with (Bit / Index) vector as outputs are supported.
  • For DLRM SLS operation, data types of float and uint32_t are supported.
  • Data structures used as parameters of DLRM SLS operation are described below.
template <typename T>
struct SlsTable
{
    const std::vector<T> &tables;
    uint32_t tables_count;
    uint32_t rows_count;
    uint32_t feature_size;
    sls_user_preferences alloc_option = SLS_ALLOC_AUTO;

    pnm::operations::SlsOperation::Type data_type() const;
};

struct SlsParam
{
    uint32_t n_batch;
    const std::vector<uint32_t> &lengths;
    const std::vector<uint32_t> &indices;
};
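For readers unfamiliar with SLS (SparseLengthsSum), the sketch below shows the general form of the operation on a single embedding table: for each batch entry, lengths gives how many indices belong to it, and the rows selected by those indices are summed element-wise. This is a plain reference sketch in Python under stated assumptions, not the PNM implementation. The function sparse_lengths_sum is hypothetical, it handles one table only, and it assumes a flat row-major table; how tables, lengths, and indices are laid out across multiple tables in SlsTable and SlsParam is not described here.

```python
def sparse_lengths_sum(table, feature_size, lengths, indices):
    """Reference SLS over one flat, row-major table.

    table:        flat list of rows_count * feature_size values
    feature_size: number of values per row
    lengths:      number of indices consumed by each batch entry
    indices:      row indices, concatenated for all batch entries
    Returns a flat list of len(lengths) * feature_size sums.
    """
    results = []
    cursor = 0
    for length in lengths:
        accumulator = [0] * feature_size
        for row in indices[cursor:cursor + length]:
            base = row * feature_size
            for feature in range(feature_size):
                accumulator[feature] += table[base + feature]
        cursor += length
        results.extend(accumulator)
    return results


if __name__ == "__main__":
    # Three rows of two features each: [1, 2], [10, 20], [100, 200].
    table = [1, 2, 10, 20, 100, 200]
    # Batch entry 0 sums rows 0 and 2; batch entry 1 takes row 1 alone.
    print(sparse_lengths_sum(table, 2, lengths=[2, 1], indices=[0, 2, 1]))
```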

5.1.2.3 Language Binding

5.1.2.3.1 C++

To use SMDK's optimization path in a C++ application, link the SMDK optimization library (libsmalloc.so or libsmalloc.a) when you build the application and include the smdk_opt_api.hpp header file in your code. The header defines the SmdkAllocator class, which is implemented as a singleton so that only one instance exists at runtime. The example below shows basic usage.

#include "smdk_opt_api.hpp"
......

int main(void) {
    ......
    SmdkAllocator& allocator = SmdkAllocator::get_instance();
    void *buf = allocator.malloc(type, size);
    allocator.free(buf);
    ......
    buf = allocator.malloc_node(type, size, node);
    allocator.free_node(type, buf, size);
    ......
    allocator.stats_print('G');
    ......
}

The member functions of class SmdkAllocator are as follows.

Function Desc. Reference API in C
get_instance Returns a SMDK allocator instance. N/A
malloc Allocates size bytes of uninitialized memory. s_malloc
calloc Allocates memory for an array of num objects of size and initializes all bytes in the allocated memory to zero. s_calloc
realloc Reallocates the given area of memory on designated memory type. Can get any target type of memory regardless of original location of given area. s_realloc
free Deallocates the space previously allocated by malloc(), calloc(), posix_memalign() or realloc(). s_free, s_free_type
posix_memalign Allocates size bytes aligned on a boundary specified by alignment, and returns a pointer to the allocated memory in memptr. s_posix_memalign
get_memsize_total Returns total size of requested type of memory. s_get_memsize_total
get_memsize_used Returns size of memory allocated with SMDK APIs. s_get_memsize_used
get_memsize_available Returns available size of requested type of memory. s_get_memsize_available
stats_print Prints out above three data (total, used, and available) per type. s_stats_print
stats_node_print Prints out above three data (total, used, and available) per node and per type. s_stats_node_print
enable_node_interleave Sets interleave policy of calling thread with mmap() syscall. s_enable_node_interleave
disable_node_interleave Unsets interleave policy of calling thread. s_disable_node_interleave
malloc_node Allocates memory from specified node. s_malloc_node
free_node Deallocates the space previously allocated by malloc_node(). s_free_node

Since most of the member functions are the same as SMDK optimization APIs in C, only the newly added or changed functions are described in detail below. Please refer to the previous content (API List) for other functions.

1. get_instance

static SmdkAllocator& get_instance();

Parameters

  • N/A

Return value

  • Returns a SMDK allocator instance. The creation of the instance is limited to one.

2. free

void free(void *ptr);
void free(smdk_memtype_t type, void *ptr);

Parameters

  • type: target memory type to free. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
  • ptr: pointer to the memory to deallocate.

Return value

  • N/A

5.1.2.3.2 Python3

SMDK includes the py_smdk package, which can be imported in Python3 environments. You can build the py_smdk package with the commands below.

$ cd /path/to/SMDK/lib
$ ./build_lib.sh py_smdk #_py_smdk.so is generated in /path/to/SMDK/lib/smdk_allocator/opt_api/py_smdk_pkg

The built package and its py_smalloc.py module provide an interface to the SMDK optimization APIs, so a Python3 application can access CXL memory through the optimization path by importing the py_smdk package. The following environment variables may need to be set to import the package.

# 1. LD_LIBRARY_PATH
# You need to specify the path where the SMDK *optimization* library is located.
# If you copied the library to your system's library path, you do not need to set below.
export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib
# 2. PYTHONPATH
# You also need to specify the py_smdk package's path so that your python3 application can recognize the package.
# If you copied the py_smdk package to a default module search path on your system (listed in sys.path), you do not need to set the variable below.
export PYTHONPATH=/path/to/SMDK/lib/smdk_allocator/opt_api/py_smdk_pkg
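Before importing the package, it can be useful to confirm that Python can locate it with the current environment. The sketch below uses only the standard library; the helper names py_smdk_importable and describe_environment are hypothetical. Note that importlib.util.find_spec only checks that the package can be found on the module search path. It does not load the package, so it does not verify that the shared library referenced by LD_LIBRARY_PATH can be loaded.

```python
import importlib.util
import os


def py_smdk_importable():
    """Return True if a top-level package named py_smdk is on the module search path."""
    return importlib.util.find_spec("py_smdk") is not None


def describe_environment():
    """Collect the two environment variables discussed above and the lookup result."""
    return {
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        "PYTHONPATH": os.environ.get("PYTHONPATH", ""),
        "py_smdk_importable": py_smdk_importable(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```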

Below is the way to enable SMDK optimization path in a Python3 application.

Creating class mem_obj or class mem_obj_node
This is a way to create and utilize an object that can store and load your data. (Defined in the module /path/to/py_smdk_pkg/py_smdk/py_smalloc.py.)
Please refer to the example below.

from py_smdk import py_smalloc
memtype = py_smalloc.SMDK_MEM_EXMEM
smdk_obj = py_smalloc.mem_obj(memtype, "hello SMDK")
print(smdk_obj.data) # or print(smdk_obj.get())

del smdk_obj # free() is called on deletion; you can also call smdk_obj.free() explicitly

In addition to the examples described above, you can get a CXL / NORMAL mem object by specifying its size.

smdk_obj = py_smalloc.mem_obj(memtype, size=4096)
smdk_obj.set("hello SMDK")

Also, you can get a CXL / NORMAL mem object from a specific memory node.

smdk_obj = py_smalloc.mem_obj_node(memtype, node, "hello SMDK")

If you want to update the data stored in the assigned object, you can overwrite the new data through set method like below. The SMDK allocator re-allocates the internal memory buffer according to the size of the data.

smdk_obj.set("Scalable Memory Development Kit")

The assigned mem_obj can store a variety of Python data types. However, please note that Python-specific features that change the size of a data structure are limited (e.g., list.append(data)). Please refer to opt_api_python for more use cases.

The following is defined in the py_smalloc module of the py_smdk package.

1. Classes
1) py_smalloc.mem_obj

class py_smalloc.mem_obj(self, mem_type, data=None, size=None)

You can create an object with memory space allocated from the mem_type you specify. You can specify the data to write to mem_obj.data, or you can specify the free memory chunk size of mem_obj.data. If you specify both, the size of mem_obj.data is max(getsizeof(data), size). Below is the list of methods this class includes.

set(data, mem_type=None)
  # Set the data to mem_obj.data.
  # If the size of the data is greater than the previously stored data, mem_obj.data is realloc-ed.
  # If the specified mem_type differs from self.mem_type, mem_obj.data is also realloc-ed.

get()
  # Returns mem_obj.data. You can call this method, or you can access mem_obj.data directly.

resize(mem_type, size)
  # The size (self.size) of mem_obj.data will be changed to the size you newly specified.
  # The existing data is maintained, but if the size becomes smaller, the data may be damaged.

free()
  # Free mem_obj.data to return the memory used to the system.
  # This method is also called when you delete(del) this object.
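The sizing rule stated above for mem_obj, max(getsizeof(data), size), can be illustrated without SMDK. The sketch below assumes that getsizeof refers to sys.getsizeof from the standard library; the helper effective_size is hypothetical and only mirrors the rule as written, including the cases where just one of data or size is given.

```python
import sys


def effective_size(data=None, size=None):
    """Return the buffer size implied by the rule max(getsizeof(data), size)."""
    candidates = []
    if data is not None:
        candidates.append(sys.getsizeof(data))
    if size is not None:
        candidates.append(size)
    if not candidates:
        raise ValueError("at least one of data or size must be given")
    return max(candidates)


if __name__ == "__main__":
    text = "hello SMDK"
    print(sys.getsizeof(text))             # size of the Python object itself
    print(effective_size(text, size=4096))  # the requested size wins when it is larger
    print(effective_size(text, size=1))     # otherwise the object size wins
```

Because sys.getsizeof reports the size of the whole Python object, even a short string needs more bytes than its character count suggests.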

2) py_smalloc.mem_obj_node

class py_smalloc.mem_obj_node(self, mem_type, node, data=None, size=None)

The only difference between mem_obj_node and mem_obj is that mem_obj_node requires a memory node when the object is created. The methods of mem_obj_node, listed below, behave almost the same as those of mem_obj.

set(data)
  # Set the data to mem_obj_node.data.
  # If the size of the data is greater than the previously stored data, mem_obj_node.data is realloc-ed.

get()
  # Returns mem_obj_node.data. You can call this method, or you can access mem_obj_node.data directly.

resize(size)
  # The size (self.size) of mem_obj_node.data will be changed to the size you newly specified.
  # The existing data is maintained, but if the size becomes smaller, the data may be damaged.

free()
  # Free mem_obj_node.data to return the memory used to the system.
  # This method is also called when you delete(del) this object.

2. Functions
The functions below serve as interfaces to the optimization APIs of the same name. See the API List above for instructions and usage for each function.

py_smalloc.s_stats_print(unit)
py_smalloc.s_stats_node_print(unit)
py_smalloc.get_memsize_total(smdk_memtype)
py_smalloc.get_memsize_used(smdk_memtype)
py_smalloc.get_memsize_available(smdk_memtype)
py_smalloc.get_memsize_node_total(smdk_memtype, node)
py_smalloc.get_memsize_node_available(smdk_memtype, node)
py_smalloc.enable_node_interleave(nodes)
py_smalloc.disable_node_interleave()

3. Constants

# Constants used for 'mem_obj' and 'mem_obj_node' classes that separates memory types in SMDK allocator.
py_smalloc.SMDK_MEM_NORMAL
py_smalloc.SMDK_MEM_EXMEM

5.1.3 CXL-CLI

CXL-CLI is an extension of Intel CXL-CLI that works with SMDK. SMDK extends this tool to provide additional commands (e.g. timestamp, poison, event, identify, fw-update, etc.) defined in CXL specification. It also provides an interface that allows you to group CXL memory devices easily and control SMDK kernel specific features. From SMDK v1.3, commands for checking node-to-node memory latency and manipulating the CXL Swap function are also supported. From SMDK v1.4, commands for manipulating CXL Cache function are also supported.
The newly added commands are described below.

Note1: Running CXL-CLI requires root privileges.
Note2: As of SMDK v2.1, the set-alert-config command is provided by upstream ndctl, which has included it since version 79. Please refer to the ndctl documentation for more details.

5.1.3.1 CXL specification-based commands

5.1.3.1.1 Commands

Command Option Description
inject-poison <mem0> -a <dpa> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-a, --address <dpa>     DPA to inject or clear poison (hex value)
-l, --length <dpa length>     length in bytes from the DPA specified by '-a' to inject or clear poison (hex value)
Injects poison into a requested physical address.
clear-poison <mem0> -a <dpa> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-a, --address <dpa>     DPA to inject or clear poison (hex value)
-l, --length <dpa length>     length in bytes from the DPA specified by '-a' to inject or clear poison (hex value)
Clears poison from the requested physical address.
set-timestamp <mem0> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Sets the timestamp on the device. It is recommended that the host set the timestamp after every Conventional or CXL Reset. Otherwise, the timestamp may be inaccurate.
get-timestamp <mem0> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Gets the timestamp from the device. Timestamp is initialized via the set-timestamp command.
get-event-record <mem0> -t <event_type> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-t, --type <n>     type of event 1: info, 2: warning, 3: failure, 4: fatal
Retrieves the next event records that may exist in the device’s requested event log.
clear-event-record <mem0> -t <event_type> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-t, --type <n>     type of event 1: info, 2: warning, 3: failure, 4: fatal
-a, --all     clear all event
-n, --num_handle <n>     event handle number to clear
Provides a mechanism for the host to clear events that it has consumed from the device’s Event Log.
identify <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Retrieves basic information about the memory device(s), and displays the result. (e.g. FW revision, capacity, event log size, QoS telemetry capabilities, etc.)
get-health-info <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Gets the current instantaneous health of the device(s) and displays the result. (e.g. health status, life used, device temperature, etc.)
get-alert-config <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Retrieves the device's critical and programmable warning alert configuration. (e.g. valid alerts, alert thresholds, etc.)
set-alert-config - Allows the host to configure programmable warning alert thresholds optionally.
get-firmware-info <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Retrieves information about the device(s) FW. (e.g. FW slots info, slot#N FW revision, etc.)
transfer-firmware <mem0> -i <FW package> -s <slot number> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-i, --input <file>     filename of FW package to transfer
-s, --slot <n>     slot number to transfer FW package
Transfers all or part of a FW package from the caller to the device. FW packages shall be 128-byte aligned.
activate-firmware <mem0> -s <slot number> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-s, --slot <n>     slot number to activate FW package
--online     enable online activation
Makes a FW previously stored on the device (slot) with the transfer FW command, the active FW.
get-shutdown-state <mem0> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Gets current Shutdown State (dirty or clean).
set-shutdown-state <mem0> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
--clean     set shutdown state to clean (default: dirty)
Sets current Shutdown State to dirty or clean.
get-scan-media-caps <mem0> [<mem1>..<memN>] -a <dpa> -l <length> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-a, --address <dpa>     starting DPA where to retrieve scan media capabilities
-l, --length <length>     range of physical addresses, in units of 64B
Retrieves capabilities and options for the scan-media feature based on the requested range.
scan-media <mem0> [<mem1>..<memN>] -a <dpa> -l <length> [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-a, --address <dpa>     starting DPA where to start the scan
-l, --length <length>     range of physical addresses, in units of 64B
Initiates a scan of a portion of CXL devices' media for locations that are poisoned or result in poison by host access.
get-scan-media <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Retrieves an unordered list of poisoned memory locations, in response to the scan-media command.
sanitize-memdev <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-e, --secure-erase     secure erase a memdev
-s, --sanitize     sanitize a memdev
Sanitizes the device to securely re-purpose or decommission it.
get-sld-qos-control <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Retrieves the SLD’s QoS control parameters.
set-sld-qos-control <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
-e, --egress_port_congestion     enable egress port congestion
-d, --throughput_reduction     enable temporary throughput reduction
-m, --egress_moderate_percent <n>     Threshold in % to indicate 'moderate'
-s, --egress_severe_percent <n>     Threshold in % to indicate 'severe'
-i, --backpressure_sample_interval <n>     Interval in ns to take sample(1-15)
Sets the SLD’s QoS control parameters.
get-sld-qos-status <mem0> [<mem1>..<memN>] [<options>]
-v, --verbose     turn on debug
-S, --serial     use serial numbers to id memdevs
Retrieves the SLD’s QoS status, i.e. Backpressure Average Percentage.
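Note that the -l option of get-scan-media-caps and scan-media is expressed in units of 64 bytes rather than in bytes, so a value of 0x1000 covers 0x1000 × 64 B = 256 KiB. The sketch below converts between the two; the helper names bytes_to_scan_units and scan_units_to_bytes are hypothetical and not part of CXL-CLI, and rounding a partial unit up is a choice made for this sketch.

```python
SCAN_UNIT_BYTES = 64  # the -l option counts 64-byte units


def bytes_to_scan_units(num_bytes):
    """Return the number of 64-byte units needed to cover num_bytes (rounded up)."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    return (num_bytes + SCAN_UNIT_BYTES - 1) // SCAN_UNIT_BYTES


def scan_units_to_bytes(units):
    """Return the number of bytes covered by the given count of 64-byte units."""
    return units * SCAN_UNIT_BYTES


if __name__ == "__main__":
    print(hex(bytes_to_scan_units(256 * 1024)))  # units to pass to -l for 256 KiB
    print(scan_units_to_bytes(0x1000))           # bytes covered by -l 0x1000
```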

5.1.3.1.2 Examples

Poison commands
# ./cxl inject-poison mem0 -a 0x1000
cxl memdev: cmd_inject_poison: inject-poison 1 mem

# ./cxl clear-poison mem0 -a 0x1000
cxl memdev: cmd_clear_poison: clear-poison 1 mem
Timestamp commands
# ./cxl get-timestamp mem1
1/1/1970 09:00:00
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem

# ./cxl set-timestamp mem1
cxl memdev: cmd_set_timestamp: set-timestamp 1 mem

# ./cxl get-timestamp mem1
7/24/2023 16:33:21
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
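The first get-timestamp output above, 1/1/1970 09:00:00, is what an unset (zero) timestamp looks like when it is printed for a UTC+9 time zone. The sketch below reproduces that formatting under two assumptions: that the device timestamp counts nanoseconds since 1970-01-01 00:00:00 UTC, as the CXL specification defines it, and that the tool prints the value in the host's local time zone. The helper format_cxl_timestamp and its utc_offset_hours parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def format_cxl_timestamp(ns_since_epoch, utc_offset_hours=0):
    """Format a nanosecond epoch timestamp as M/D/YYYY HH:MM:SS in a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    moment = datetime.fromtimestamp(ns_since_epoch // 1_000_000_000, tz)
    return f"{moment.month}/{moment.day}/{moment.year} {moment:%H:%M:%S}"


if __name__ == "__main__":
    print(format_cxl_timestamp(0, utc_offset_hours=9))  # an unset timestamp seen from UTC+9
    print(format_cxl_timestamp(0))                      # the same instant in UTC
```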
Event commands
# ./cxl get-event-record mem0 -t 3
 Received 2 event records from device
No. 1
UUID                             : 601dcbb3-9c064eab-b8af4e9b-fb5c9624 (DRAM Event)
Physical address                 : 0x1000
Memory Event Desc                : Unknown
Memory Event Type                : Data Path Error
Transaction Type                 : Host Read
Event Record Flags               : Failure Event
Event Timestamp                  : 7/26/2023 13:42:35
Handle                           : 1

No. 2
UUID                             : 601dcbb3-9c064eab-b8af4e9b-fb5c9624 (DRAM Event)
Physical address                 : 0x1000
Memory Event Desc                : Unknown
Memory Event Type                : Data Path Error
Transaction Type                 : Host Read
Event Record Flags               : Failure Event
Event Timestamp                  : 7/26/2023 13:42:47
Handle                           : 2

Overflow Error Count             : 0
cxl memdev: cmd_get_event_record: get-event-record 1 mem

# ./cxl clear-event-record mem0 -t 3 -n 2
cxl memdev: cmd_clear_event_record: clear_event_record 1 mem

# ./cxl get-event-record mem0 -t 3
 Received 1 event records from device
No. 1
UUID                             : 601dcbb3-9c064eab-b8af4e9b-fb5c9624 (DRAM Event)
Physical address                 : 0x1000
Memory Event Desc                : Unknown
Memory Event Type                : Data Path Error
Transaction Type                 : Host Read
Event Record Flags               : Failure Event
Event Timestamp                  : 7/26/2023 13:42:35
Handle                           : 1

Overflow Error Count             : 0
cxl memdev: cmd_get_event_record: get-event-record 1 mem

# ./cxl clear-event-record mem0 -t 3 -a
cxl memdev: cmd_clear_event_record: clear_event_record 1 mem

# ./cxl get-event-record mem0 -t 3
  Received 0 event records from device
Overflow Error Count             : 0
cxl memdev: cmd_get_event_record: get_event_record 1 mem
Identify memory device command
# ./cxl identify mem0
CXL Identify Memory Device "mem0"
FW Revision                              : fw_1234
Total Capacity                           : 128.00 GB
Volatile Only Capacity                   : 128.00 GB
Persistent Only Capacity                 : 0 B
Partition Alignment                      : 0 B
......
cxl memdev: cmd_identify: identified 1 mem
Health info and alerts commands
# ./cxl get-health-info mem0
CXL Get Health Information Memory Device "mem0"
Health Status                    : Normal
Media Status                     : Normal
Life Used                        : 4 % (Normal)
Device Temperature               : 32 C (Normal)
Corrected Volatile Error Count   : 0 (Normal)
Corrected Persistent Error Count : 0 (Normal)
Dirty Shutdown Count             : 0
cxl memdev: cmd_get_health_info: get-health-info 1 mem

# ./cxl get-alert-config mem0
CXL Get Alert Configuration Memory Device "mem0"
Life Used Threshold - Critical                        : 75 %
                    - Warning                         : Not Set
Device Over-Temperature Threshold - Critical          : 100 C
                                  - Warning           : 80 C
Device Under-Temperature Threshold - Critical         : -30 C
                                   - Warning          : Not Supported
Corrected Volatile Memory Error Threshold - Warning   : Not Supported
Corrected Persistent Memory Error Threshold - Warning : Not Supported
cxl memdev: cmd_get_alert_config: get-alert-config 1 mem

# ./cxl set-alert-config mem0 --life-used-threshold=50 --life-used-alert=on
{
  "memdev":"mem0",
  "ram_size":"128.00 GiB (137.44 GB)",
  "alert_config":{
  ......
    "life_used_prog_warn_threshold":50,
  ......
}
cxl memdev: cmd_set_alert_config: set alert configuration for 1 mem

# ./cxl get-alert-config mem0
CXL Get Alert Configuration Memory Device "mem0"
Life Used Threshold - Critical                        : 75 %
                    - Warning                         : 50 %
Device Over-Temperature Threshold - Critical          : 100 C
                                  - Warning           : 80 C
Device Under-Temperature Threshold - Critical         : -30 C
                                   - Warning          : Not Supported
Corrected Volatile Memory Error Threshold - Warning   : Not Supported
Corrected Persistent Memory Error Threshold - Warning : Not Supported
cxl memdev: cmd_get_alert_config: get-alert-config 1 mem
Firmware update commands
# ./cxl get-firmware-info mem0
Supported FW Slots           : 2
Slot 1 FW revision           : fw_1234 (Active)
Slot 2 FW revision           :
Online Activation Capability : Supported
cxl memdev: cmd_get_firmware_info: get-firmware-info 1 mem

# ./cxl transfer-firmware mem0 -i fw_5678.bin -s 2
cxl memdev: cmd_transfer_firmware: transfer-firmware 1 mem

# ./cxl activate-firmware mem0 -s 2 --online
cxl memdev: cmd_activate_firmware: activate-firmware 1 mem

# ./cxl get-firmware-info mem0
Supported FW Slots           : 2
Slot 1 FW revision           : fw_1234
Slot 2 FW revision           : fw_5678 (Active)
Online Activation Capability : Supported
cxl memdev: cmd_get_firmware_info: get-firmware-info 1 mem
Shutdown State commands
# ./cxl get-health-info mem0
......
Dirty Shutdown Count             : 0
cxl memdev: cmd_get_health_info: get-health-info 1 mem

# ./cxl set-shutdown-state mem0
cxl memdev: cmd_set_shutdown_state: set-shutdown-state 1 mem

# ./cxl get-shutdown-state mem0
Shutdown State: Dirty
cxl memdev: cmd_get_shutdown_state: get-shutdown-state 1 mem

# ./cxl get-health-info mem0
......
Dirty Shutdown Count             : 1
cxl memdev: cmd_get_health_info: get-health-info 1 mem
Scan Media commands
# ./cxl get-scan-media-caps mem0 -a 0x100 -l 0x1000
Estimated Scan Media Time(ms): 256
cxl memdev: cmd_get_scan_media_caps: get-scan-media-caps 1 mem

# ./cxl scan-media mem0 -a 0x100 -l 0x1000
cxl memdev: cmd_scan_media: scan-media 1 mem

# ./cxl get-scan-media mem0
No poison address
cxl memdev: cmd_get_scan_media: get-scan-media 1 mem
Sanitize commands
# ./cxl set-alert-config mem0 --over-temperature-threshold=40 --over-temperature-alert=on
{
  "memdev":"mem0",
  "ram_size":"128.00 GiB (137.44 GB)",
  "alert_config":{
  ......
    "dev_over_temperature_prog_warn_threshold":40,
  ......
}
cxl memdev: cmd_set_alert_config: set alert configuration for 1 mem

# ./cxl set-timestamp mem0
7/24/2023 17:17:21
cxl memdev: cmd_set_timestamp: set-timestamp 1 mem

# ./cxl get-alert-config mem0
......
Device Over-Temperature Threshold - Critical          : 100 C
                                  - Warning           : 40 C

# ./cxl get-timestamp mem0
7/24/2023 17:17:25
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem

# ./cxl sanitize-memdev mem0
cxl memdev: cmd_sanitize_memdev: sanitation started on 1 mem device

# ./cxl get-alert-config mem0
......
Device Over-Temperature Threshold - Critical          : 100 C
                                  - Warning           : 85 C

# ./cxl get-timestamp mem0
1/1/1970 09:00:00
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
SLD QoS commands
# ./cxl get-sld-qos-control mem0
Egress Port Congestion: Disable
Temporary Throughput Reduction: Disable
Egress Moderate Percentage: 10%
Egress Severe Percentage: 25%
Backpressure Sample Interval: 8
cxl memdev: cmd_get_sld_qos_control: get-sld-qos-control 1 mem

# ./cxl set-sld-qos-control mem0 -e
cxl memdev: cmd_set_sld_qos_control: set-sld-qos-control 1 mem
# ./cxl set-sld-qos-control mem0 -m 50 -s 75
cxl memdev: cmd_set_sld_qos_control: set-sld-qos-control 1 mem

# ./cxl get-sld-qos-control mem0
Egress Port Congestion: Enable
Temporary Throughput Reduction: Disable
Egress Moderate Percentage: 50%
Egress Severe Percentage: 75%
Backpressure Sample Interval: 8
cxl memdev: cmd_get_sld_qos_control: get-sld-qos-control 1 mem

# ./cxl get-sld-qos-status mem0
Backpressure Average Percent: 0%
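
The plain-text reports printed by commands such as get-health-info use a "Field : Value" layout, so they are straightforward to post-process in scripts. The following Python sketch parses such a report into a dictionary. It assumes the output format matches the sample shown above; parse_health_info and life_used_percent are illustrative helper names, not part of SMDK or CXL-CLI.

```python
import re

# Sample text copied from the get-health-info example above (shortened).
SAMPLE = """\
CXL Get Health Information Memory Device "mem0"
Health Status                    : Normal
Life Used                        : 4 % (Normal)
Device Temperature               : 32 C (Normal)
Dirty Shutdown Count             : 0
cxl memdev: cmd_get_health_info: get-health-info 1 mem
"""


def parse_health_info(text):
    """Return {field: value} for lines shaped like 'Name   : value'."""
    fields = {}
    for line in text.splitlines():
        # Skip the trailing status line, which starts with 'cxl memdev:'.
        if line.startswith("cxl "):
            continue
        match = re.match(r"^(.+?)\s+:\s+(.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def life_used_percent(fields):
    """Extract the integer percentage from a value such as '4 % (Normal)'."""
    return int(fields["Life Used"].split("%")[0])


info = parse_health_info(SAMPLE)
print(info["Health Status"], life_used_percent(info))
```

In a real script the text would come from the captured standard output of the command rather than from a constant.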

5.1.3.2 CXL device grouping commands

By adding the -V (--soft_interleaving) option to the create-region, destroy-region, and list commands, you can group CXL devices and retrieve the device information supported by SMDK.
Note: From SMDK v2.0, additional grouping CMDs (e.g. group-list, group-node, etc.) are not supported.

5.1.3.2.1 Commands

Command Option Description
create-region -V(--soft_interleaving)
  -G(--group) <'node' or 'noop'>: Makes CXL device(s) to be logically represented as Node or Noop Partition.
  -N(--target_node) <node_id> -w(--ways) <num_dev> <cxl0> [<cxl1>..<cxlN>]: Makes CXL device(s) to be grouped into the specified target node.
destroy-region -V(--soft_interleaving)
  -N(--target_node) <node_id>: Removes all CXL devices from the specified node group.
  -w(--ways) <num_dev> <cxl0> [<cxl1>..<cxlN>]: Removes the CXL device(s) from the node group to which the device(s) belongs.
list -V(--soft_interleaving)
  -n(--list_node) [node_id]: Displays CXL device(s) configuration status for node(s) of the system.
    - If the node id is not specified, it shows all the nodes from the system.
    - Otherwise it shows only the node specified.
  -C(--list_dev) [cxlN]: Displays CXL device(s) information and grouping status.
    - If the device name is not specified, it shows information from all of CXL devices in the system.
    - Otherwise it only shows information from the device specified.

5.1.3.2.2 Examples

Node partition
/* node 0 : CPU #1 + DDR Memory #1
   node 1 : CXL #1, #2, #3 */

# ./cxl create-region -V -G node

# ./cxl list -V --list_node
[
   {
    "node_id" : -1,
    "devices" : [ ]
   }
   {
    "node_id" : 0,
    "devices" : [ ]
   }
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  "cxl1"  "cxl2"  ]
   }
]

# cat /proc/buddyinfo
Node 0, zone      DMA      0      0      0      0      0      0      0      0      1      1      3
Node 0, zone    DMA32      7      5      5      5      7      6      3      4      5      3    437
Node 0, zone   Normal   2521   2397   1219   4419   2046    912    407    192    106     58  13430
Node 1, zone  Movable      0      0      0      0      0      0      0      0      0      0  98304
Noop partition
/* node 0 : CPU #1 + DDR Memory #1
node 1 : CXL #1
node 2 : CXL #2
node 3 : CXL #3 */

# ./cxl create-region -V -G noop

# ./cxl list -V --list_node
[
   {
    "node_id" : -1,
    "devices" : [ ]
   }
   {
    "node_id" : 0,
    "devices" : [ ]
   }
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  ]
   }
   {
    "node_id" : 2,
    "devices" : [ "cxl1"  ]
   }
   {
    "node_id" : 3,
    "devices" : [ "cxl2"  ]
   }
]

# cat /proc/buddyinfo
Node 0, zone      DMA      0      0      0      0      0      0      0      0      1      1      3
Node 0, zone    DMA32      7      5      5      5      7      6      3      4      5      3    437
Node 0, zone   Normal   2907   1686    828   4416   2038    913    407    192    106     58  13430
Node 1, zone  Movable      0      0      0      0      0      0      0      0      0      0  32768
Node 2, zone  Movable      0      0      0      0      0      0      0      0      0      0  32768
Node 3, zone  Movable      0      0      0      0      0      0      0      0      0      0  32768
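
Each count column in /proc/buddyinfo is the number of free blocks of 2^order pages, starting at order 0, so the free memory of a zone can be computed from one line. The Python sketch below does this for the sample lines above. It assumes a 4 KiB base page size (typical on x86-64, but not stated in this guide); free_bytes_per_zone is an illustrative name.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages

# Two lines copied from the Noop partition example above.
SAMPLE = """\
Node 0, zone   Normal   2907   1686    828   4416   2038    913    407    192    106     58  13430
Node 1, zone  Movable      0      0      0      0      0      0      0      0      0      0  32768
"""


def free_bytes_per_zone(text, page_size=PAGE_SIZE):
    """Return {(node, zone): free_bytes} computed from buddyinfo lines."""
    result = {}
    for line in text.splitlines():
        parts = line.replace(",", "").split()
        # Expected shape: ['Node', '<id>', 'zone', '<name>', count_order0, ...]
        if len(parts) < 5 or parts[0] != "Node":
            continue
        node, zone = int(parts[1]), parts[3]
        counts = [int(count) for count in parts[4:]]
        # A block of order k spans 2**k pages.
        pages = sum(count << order for order, count in enumerate(counts))
        result[(node, zone)] = pages * page_size
    return result


free = free_bytes_per_zone(SAMPLE)
print(free[(1, "Movable")] / 2**30, "GiB free on node 1")
```

Under that page-size assumption, 32768 free order-10 blocks correspond to 128 GiB, which agrees with the 128.00 GiB ram_size reported for mem0 earlier in this section, and the 98304 blocks in the Node partition example correspond to three such devices.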
List CXL memory (partitioning) information
/* list -V --list_node */

# ./cxl list -V --list_node
[
   {
    "node_id" : -1,
    "devices" : [ ]
   }
   {
    "node_id" : 0,
    "devices" : [ ]
   }
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  ]
   }
   {
    "node_id" : 2,
    "devices" : [ "cxl1"  ]
   }
   {
    "node_id" : 3,
    "devices" : [ "cxl2"  ]
   }
]

# ./cxl list -V --list_node 2
[
   {
    "node_id" : 2,
    "devices" : [ "cxl1"  ]
   }
]


/* list -V --list_dev */

# ./cxl list -V --list_dev
[
   {
      "id":"cxl0",
      "start_address":"0x2080000000",
      "size":"0x2000000000",
      "node_id":"1",
      "socket_id":"0",
      "state":"online",
      "memdev":
      {
         "memdev_id":"0",
         "pci_bus_addr":"0000:16:00.0",
         "pci_cur_link_speed":"32.0 GT/s PCIe",
         "pci_cur_link_width":"8",
      }
   }
   {
      "id":"cxl1",
      "start_address":"0x4080000000",
      "size":"0x2000000000",
      "node_id":"2",
      "socket_id":"0",
      "state":"online",
      "memdev":
      {
         "memdev_id":"1",
         "pci_bus_addr":"0000:27:00.0",
         "pci_cur_link_speed":"32.0 GT/s PCIe",
         "pci_cur_link_width":"8",
      }
   }
   {
      "id":"cxl2",
      "start_address":"0x6080000000",
      "size":"0x2000000000",
      "node_id":"3",
      "socket_id":"0",
      "state":"online",
      "memdev":
      {
         "memdev_id":"2",
         "pci_bus_addr":"0000:38:00.0",
         "pci_cur_link_speed":"32.0 GT/s PCIe",
         "pci_cur_link_width":"8",
      }
   }
]

# ./cxl list -V --list_dev cxl1
[
   {
      "id":"cxl1",
      "start_address":"0x4080000000",
      "size":"0x2000000000",
      "node_id":"2",
      "socket_id":"0",
      "state":"online",
      "memdev":
      {
         "memdev_id":"1",
         "pci_bus_addr":"0000:27:00.0",
         "pci_cur_link_speed":"32.0 GT/s PCIe",
         "pci_cur_link_width":"8",
      }
   }
]
Add device to specific node
# ./cxl list -V --list_node 1
[
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  ]
   }
]

# ./cxl create-region -V -N 1 -w 1 cxl1

# ./cxl list -V --list_node 1
[
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  "cxl1"  ]
   }
]
Remove device from specific node
/* destroy-region -V (target_node) */

# ./cxl list -V --list_node 1
[
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  "cxl1"  ]
   }
]

# ./cxl destroy-region -V -N 1

# ./cxl list -V --list_node 1
[
   {
    "node_id" : 1,
    "devices" : [ ]
   }
]


/* destroy-region -V (remove dev) */

# ./cxl list -V --list_node 1
[
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  "cxl1"  "cxl2"  ]
   }
]

# ./cxl destroy-region -V -w 1 cxl1

# ./cxl list -V --list_node 1
[
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  "cxl2"  ]
   }
]
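
The list -V --list_node output shown in the examples above looks like JSON but, as rendered on this page, has no commas between objects or between device names, so a strict JSON parser would reject it. The following Python sketch extracts the node-to-device mapping with regular expressions instead. It assumes the layout of the samples above; parse_node_list is an illustrative name, not an SMDK API.

```python
import re

# Shortened sample in the same layout as the examples above.
SAMPLE = """\
[
   {
    "node_id" : -1,
    "devices" : [ ]
   }
   {
    "node_id" : 1,
    "devices" : [ "cxl0"  "cxl2"  ]
   }
]
"""

# One match per node object: the node id, then the raw text between [ and ].
NODE_RE = re.compile(
    r'"node_id"\s*:\s*(-?\d+)\s*,\s*"devices"\s*:\s*\[([^\]]*)\]'
)


def parse_node_list(text):
    """Return {node_id: [device names]} from 'cxl list -V --list_node' text."""
    nodes = {}
    for node_id, devices in NODE_RE.findall(text):
        nodes[int(node_id)] = re.findall(r'"([^"]+)"', devices)
    return nodes


print(parse_node_list(SAMPLE))
```

A script could call this before and after create-region or destroy-region to confirm that a device actually moved.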

5.1.3.3 CXL Swap commands

5.1.3.3.1 Commands

Command Option Description
enable-cxlswap N/A Enables SMDK CXL Swap function at runtime.
disable-cxlswap N/A Disables SMDK CXL Swap function at runtime.
check-cxlswap N/A Provides information about whether the SMDK CXL Swap is enabled and size of the swap space currently in use.
flush-cxlswap N/A Flushes all swapped out pages in CXL pool.
Note: CXL Swap should be disabled before running this command.

5.1.3.3.2 Examples

enable-cxlswap / check-cxlswap
# ./cxl enable-cxlswap
Success: CXLSwap is enabled.
# ./cxl check-cxlswap
CXLSwap: enabled

CXLSwap Used      : 428 kB
CXLSwap Pages     : 131
disable-cxlswap / check-cxlswap
# ./cxl disable-cxlswap
Success: CXLSwap is disabled.

# ./cxl check-cxlswap
CXLSwap: disabled
flush-cxlswap
/* Grouping cmds are not available when the CXL (swap) pool is in use. The flush-cxlswap command is helpful in this situation. */
# ./cxl disable-cxlswap && ./cxl flush-cxlswap
Success: CXLSwap is disabled.
Success: CXLSwap is flushed.
# ./cxl create-region -V -G node
cxl region: cmd_create_region: created 2 regions
(done)
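
The order of operations in the flush-cxlswap example (disable, then flush, then regroup) can be captured in a small wrapper. The Python sketch below uses only the commands shown above; the ./cxl location and the regroup_as_node helper are assumptions for illustration. The runner is injectable so the sequence can be inspected without touching any device.

```python
import subprocess

CXL = "./cxl"  # assumption: run from the directory that holds the cxl binary


def regroup_as_node(run=None):
    """Disable and flush CXL Swap, then group CXL devices into one node."""
    if run is None:
        # Default runner: raises CalledProcessError on a non-zero exit status,
        # which stops the sequence before the next step.
        run = lambda argv: subprocess.run(argv, check=True)
    steps = [
        [CXL, "disable-cxlswap"],   # flush requires CXL Swap to be disabled
        [CXL, "flush-cxlswap"],     # release pages held in the CXL pool
        [CXL, "create-region", "-V", "-G", "node"],
    ]
    for argv in steps:
        run(argv)
    return steps


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    regroup_as_node(run=lambda argv: print(" ".join(argv)))
```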

5.1.3.4 CXL Cache commands

5.1.3.4.1 Commands

Command Option Description
enable-cxlcache N/A Enables SMDK CXL Cache function at runtime.
disable-cxlcache N/A Disables SMDK CXL Cache function at runtime.
check-cxlcache N/A Provides information about whether the SMDK CXL Cache is enabled and size of CXL Cache space currently in use.
flush-cxlcache N/A Flushes all cached pages out from CXL cache pool.
Note: CXL Cache should be disabled before running this command.

5.1.3.4.2 Examples

enable-cxlcache / check-cxlcache
# ./cxl enable-cxlcache
Success: CXLCache is enabled.
# ./cxl check-cxlcache
CXLCache: enabled

CXLCache Used      : 0 kB
CXLCache Pages     : 0
disable-cxlcache / check-cxlcache
# ./cxl disable-cxlcache
Success: CXLCache is disabled.

# ./cxl check-cxlcache
CXLCache: disabled
flush-cxlcache
# ./cxl disable-cxlcache && ./cxl flush-cxlcache
Success: CXLCache is disabled.
Success: CXLCache is flushed.


5.1.3.5 Other commands

5.1.3.5.1 Commands

Command Option Description
get-latency-matrix [<options>] Measures and reports the latency between memory initiator node(s) and target node(s).
  --size <MB>: size (range) of the test buffer in MiB (default: 20000 MiB)
  --stride <B>: stride length in bytes (default: 64 B). This value cannot be larger than the size.
  --random: measure latencies with random access (default: sequential access)
  --no-change-prefetcher: do not change the HW prefetcher before starting the test (default: turn off the HW prefetcher before the test)
  --iteration <n>: iterate n times (default: 1)

5.1.3.5.2 Examples

get-latency-matrix
# ./cxl get-latency-matrix
            Numa node           (unit: ns)
Numa node       0       1       2
        0       ......
        1       ......
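
When get-latency-matrix is driven from a script, the constraint that the stride cannot exceed the buffer size can be checked before the command is launched. The Python sketch below builds the argument list from the options in the table above. latency_matrix_argv is an illustrative helper, and the ./cxl location is an assumption.

```python
def latency_matrix_argv(size_mib=None, stride_bytes=None, random=False,
                        change_prefetcher=True, iterations=None):
    """Build the argv for 'cxl get-latency-matrix' from optional settings."""
    # Defaults quoted from the option table: 20000 MiB buffer, 64 B stride.
    effective_size = (20000 if size_mib is None else size_mib) * 1024 * 1024
    effective_stride = 64 if stride_bytes is None else stride_bytes
    if effective_stride > effective_size:
        raise ValueError("stride cannot be larger than the buffer size")

    argv = ["./cxl", "get-latency-matrix"]
    if size_mib is not None:
        argv += ["--size", str(size_mib)]
    if stride_bytes is not None:
        argv += ["--stride", str(stride_bytes)]
    if random:
        argv.append("--random")
    if not change_prefetcher:
        argv.append("--no-change-prefetcher")
    if iterations is not None:
        argv += ["--iteration", str(iterations)]
    return argv


print(" ".join(latency_matrix_argv(size_mib=1024, random=True, iterations=3)))
```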


5.2 Test

5.2.1 Compatible Path

5.2.1.1 Compatible library

5.2.1.1.1 comp_api_c

The tests below check whether the standard heap allocation APIs, such as malloc, calloc, realloc, and posix_memalign, work properly in the SMDK compatible path.

1. run_heap_test.sh

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_c
$ ./run_heap_test.sh <-e | -n>
(Example) $ ./run_heap_test.sh -e

Options

Options Desc.
-e Gives CXL memory a priority.
-n Gives DRAM a priority.

Result

......
prio = [normal->exmem]
exmem_size = 1000 normal_size = 1000
maxmemory_policy = interleave
malloc: 0x7f8494b00900
free: 0x7f8494b00900
calloc: 0x7f8494b00940
free: 0x7f8494b00940
malloc: 0x7f8494b00ec0
realloc: 0x7f8494100880
free: 0x7f8494100880
posix_memalign: 0x7f8495009010
free: 0x7f8495009010

2. run_multi_thread.sh

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_c
$ ./run_multi_thread.sh [options...]
(Example) $ ./run_multi_thread.sh size 4096 iter 1000 nthreads 4

Options

Options Desc. Default
size <byte> Memory allocation size (bytes) per request. 1024
iter <n> Number of times memory allocation requests are repeated. 1048576
nthreads <n> Number of threads to run specified tests. 10

Result

*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=64
g_arena_pool[0].type_mem=normal
......
[TEST START] smdk compatible API multi-thread malloc test
[TEST PARAMETERS] size=4096 iter=1000 nthreads=4
thread 1 start
thread 2 start
thread 3 start
thread 4 start
thread 4 done
thread 3 done
thread 2 done
thread 1 done

5.2.1.1.2 comp_api_conf

To verify that the SMDK allocator can be configured properly, several test cases with different memory allocation configurations are provided. They cover the use of CXL memory, memory capacities (size), priorities, auto arena scaling, the memory allocation policy applied once all memory of the specified size has been allocated, and the interleaving and binding policies for CXL.mem devices (exmem_partition_range).

Note: exmem_partition_range lets you obtain higher bandwidth from multiple CXL.mem devices, or isolate CXL resources between the tenants in your system. Please refer to the test_config_cxl.sh script and the result section below for examples of setting exmem_partition_range.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_conf
$ ./test_config_cxl.sh

Result

### Result example for the case in which three CXL.mem devices are mounted on nodes 1, 2, and 3, respectively.


run test - t1
 
use_exmem:true
--------------------------------
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......

run test - t15						    # exmem_partition_range:"all"

use_exmem:true,priority:exmem,exmem_partition_range:all
--------------------------------
......

run test - t16						    # exmem_partition_range:"0,1,2"

use_exmem:true,priority:exmem,exmem_partition_range:0,1,2
--------------------------------
[Warning] node 0 is not ExMem node
......


run test - t17						    # exmem_partition_range:"1-3"

use_exmem:true,priority:exmem,exmem_partition_range:1-3
--------------------------------
......


cf. # exmem_partition_range:"0,2-4"
use_exmem:true,priority:exmem,exmem_partition_range:0,2-4
--------------------------------
libnuma: Warning: node argument 4 out of range
[Warning] Invalid value for "exmem_partition_range"=0,2-4. This option will be ignored.
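
The exmem_partition_range values in these results ("all", "0,1,2", "1-3", "0,2-4") follow a comma-and-dash node list syntax. The Python sketch below expands such a string and reproduces the two kinds of warning seen in the logs. It is not SMDK's parser: the function names are hypothetical, and dropping a non-ExMem node while keeping the rest is an assumption, since the log for t16 shows only the warning.

```python
def parse_node_range(spec):
    """Expand '0,1,2', '1-3' or '0,2-4' into a sorted list of node ids."""
    nodes = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError(f"descending range: {part}")
            nodes.update(range(low, high + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def select_exmem_nodes(spec, exmem_nodes, max_node):
    """Return (selected nodes or None if the option is ignored, warnings)."""
    if spec == "all":
        return sorted(exmem_nodes), []
    requested = parse_node_range(spec)
    if any(node > max_node for node in requested):
        # Mirrors the out-of-range case, where the whole option is ignored.
        return None, [f'Invalid value for "exmem_partition_range"={spec}. '
                      "This option will be ignored."]
    warnings = [f"node {node} is not ExMem node"
                for node in requested if node not in exmem_nodes]
    # Assumption: non-ExMem nodes are dropped and the remaining nodes are used.
    selected = [node for node in requested if node in exmem_nodes]
    return selected, warnings


# Three CXL.mem devices on nodes 1, 2 and 3, as in the result example above.
for spec in ("all", "0,1,2", "1-3", "0,2-4"):
    print(spec, select_exmem_nodes(spec, {1, 2, 3}, max_node=3))
```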

5.2.1.1.3 comp_api_cpp

This test case checks whether the C++ new and delete operators work properly in the SMDK compatible path.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_cpp
$ ./run_test.sh <-e | -n>
(Example) $ ./run_test.sh -e

Options

Options Desc.
-e Gives CXL memory a priority.
-n Gives DRAM a priority. (default)

Result

......
prio = [exmem->normal]
normal_size = 2048 MB
exmem_size = 2048 MB
maxmemory_policy = interleave
exmem_partition_range =
use_auto_arena_scaling = 1
test start
test start
ptr: 0x7f6b25c09010     value: 3
heap size: 4
ptr: 0x7f6b25c1f000     value: 0
ptr: 0x7f6b25c1f004     value: 1
ptr: 0x7f6b25c1f008     value: 2
ptr: 0x7f6b25c1f00c     value: 3
ptr: 0x7f6b25c1f010     value: 4
ptr: 0x7f6b25c09010     value: 10
ptr: 0x7f6b25c20000
test done

5.2.1.1.4 comp_api_java

SMDK provides a Java binding for the compatible path, and the test script (run_java_test.sh) in this directory verifies that feature.

The script takes two options, one from each group. Please refer to the options table below.

Note: JUnit is used in this test, and the absolute path of the junit4.jar (junit.jar) archive file is specified in the test script. Since the path may vary depending on the system, you may need to modify it to run this test properly.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_java
$ ./run_java_test.sh <-e | -n> <-a | -j>
(Example) $ ./run_java_test.sh -e -a

Options

Options Desc.
Option1 (default: -n)
-e Gives CXL memory a priority.
-n Gives DRAM a priority.
Option2 (default: none)
-a Runs Java application (javaTest/javaHeapTest.java) with SMDK allocator library.
-j Runs JNI application (jniTest/javaJNITest.java) with SMDK allocator library.

Result

[Case1: ./run_java_test.sh -e -a]
use_exmem:true,exmem_size:65536,normal_size:65536,maxmemory_policy:remain,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
 
test4GBytes: benchmark is requesting GC (record used memory)...
test4GBytes: used=4318078464, loopCount=0, total=5368709120
......
test100MBytes: benchmark is requesting GC (record used memory)...
test100MBytes: used=132163152, loopCount=0, total=447741952
......
 
 
[Case2: ./run_java_test.sh -e -j]
use_exmem:true,exmem_size:65536,normal_size:65536,maxmemory_policy:remain,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
 
malloc(0): pid=6713 0x7f22e84c8680
malloc(1): pid=6713 0x7f22e88c9940
......
free(0): pid=6713 0x7f22e84c8680
free(1): pid=6713 0x7f22e88c9940
 
......
MAP_EXMEM
addr[0x7f22e02e6000], one=49 zero=48
addr[0x7f22e02e6000]
munmap success
......

5.2.1.1.5 comp_api_python

SMDK provides a Python binding for the compatible path. The test script (run_heapmon.sh) in this directory verifies that feature.

The script takes two options, one from each group. Please refer to the options table below.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_python
$ ./run_heapmon.sh <-e | -n> <-l | -a>
(Example) $ ./run_heapmon.sh -e -a

Options

Options Desc.
Option1 (default: -n)
-e Gives CXL memory a priority.
-n Gives DRAM a priority.
Option2 (default: none)
-l Runs Python application (heapmon.py) with standard libc library, not SMDK allocator. Option1 (-e or -n) will be ignored.
-a Runs Python application (heapmon.py) with SMDK allocator library.

Result

use_exmem:true,exmem_size:16384,normal_size:16384,maxmemory_policy:remain,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
 
Before allocation: Total 1.58 GB, not changed
After allocation: Total 18.21 GB, 16.62 GB increased
After 1st deallocation: Total 18.21 GB, not changed
After 2nd deallocation: Total 1.59 GB, 16.62 GB decreased
After 3rd deallocation: Total 1.59 GB, not changed
After gc: Total 1.58 GB, 256.00 KB decreased
......

5.2.1.1.6 syscall

The SMDK compatible API library transparently supports the mmap system call interface as well as the heap management APIs (malloc, calloc, etc.).

Like run_heap_test.sh above, the run_mmap_test.sh script in this directory preloads the SMDK allocator library and sets the required configuration for running test_syscall, which calls mmap 100 times with a length of 4MB each.

Command lines

$ cd /path/to/SMDK/src/test/syscall
$ ./run_mmap_test.sh <-e | -n>
(Example) $ ./run_mmap_test.sh -e

Options

Options Desc.
-e Gives CXL memory a priority.
-n Gives DRAM a priority.

Result

cxlmalloc - test_syscall
use_exmem:true,exmem_size:4096,normal_size:4096,maxmemory_policy:interleave,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
 
addr[0x7f3211e00000], one='1' zero='0'
addr[0x7f3211a00000], one='1' zero='0'
addr[0x7f3211600000], one='1' zero='0'
addr[0x7f3211200000], one='1' zero='0'
addr[0x7f3210e00000], one='1' zero='0'
......
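
The access pattern described above (repeated 4 MB anonymous mappings, each written and read back) can be sketched in a few lines. The Python example below illustrates the pattern only; it is not the test_syscall source, and it makes no claim about whether mappings created by the Python runtime are redirected by the preloaded library. The offsets written are an arbitrary choice for the sketch.

```python
import mmap

LENGTH = 4 * 1024 * 1024  # 4 MB per mapping, as in the description above
COUNT = 100               # number of mappings, as in the description above


def map_and_touch(count=COUNT, length=LENGTH):
    """Create anonymous mappings, write '1' and '0', and verify the readback."""
    verified = 0
    for _ in range(count):
        region = mmap.mmap(-1, length)  # fileno -1 requests an anonymous mapping
        try:
            region[0:1] = b"1"
            region[1:2] = b"0"
            if region[0:1] == b"1" and region[1:2] == b"0":
                verified += 1
        finally:
            region.close()  # unmap before creating the next mapping
    return verified


print(map_and_touch(), "mappings verified")
```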

5.2.1.2 Adaptive Interleaving

Note that you have to set MLC_PATH (and AMD_UPROFPCM_PATH if needed) in /path/to/SMDK/lib/tierd/tierd.conf before running the tests below.

5.2.1.2.1 run_tierd_allocator_test.sh

Command lines

$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_allocator_test.sh

Result

run testcase - t1

PASS

......

run testcase - t7

PASS


Total 7 TCs executed:  7  PASSED, 0  FAILED

5.2.1.2.2 run_tierd_daemon_test.sh

Command lines

$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_daemon_test.sh

Result

Configurations for tierd:

MLC_PATH=/path/to/smdk/lib/mlc/Linux/mlc
AMD_UPROFPCM_PATH=/path/to/smdk/lib/tierd/AMDuProf_Linux_x64_4.0.341/bin/AMDuProfPcm

Run testcase - run tierd without kmem.ko

PASS

Run testcase - tierd should generate /run/tierd/nodeX

PASS

Run testcase - check SIGINT termination

PASS

Run testcase - tierd should run well even if /dev/tierd is removed.

PASS

Run testcase - tierd should check /run/tierd/nodeX existance.

./run_tierd_daemon_test.sh: line 75: 60056 Killed                  $TIERD -c $TIERD_CONFPATH &> /dev/null
PASS

/path/to/SMDK/src/test/tierd
PASS

5.2.1.2.3 run_tierd_driver_test.sh

Command lines

$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_driver_test.sh

Result

/path/to/SMDK/src/test/tierd
PASS

5.2.1.2.4 run_tierd_plugin_test.sh

Command lines

$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_plugin_test.sh

Result

PASS

5.2.2 Optimization Path

5.2.2.1 Optimization Library

5.2.2.1.1 opt_api_c

This test verifies the basic operations of the optimization APIs such as s_malloc and s_free. You can run any of the 8 pre-defined tests by passing options to the script.

The test code also offers hints on how to use SMDK's optimization APIs in your applications.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_c
$ ./run_test_opt_api.sh test <n> [options...]
(Example) $ ./run_test_opt_api.sh test 8 size 1024 iter 1000000 nthreads 1

Options

Options Desc. Default
test <n> Selects test id (see the list below). none
  1: Basic optimization API running test
  2: Multiple s_malloc and s_free_type requests with specified mem type (default: normal)
  3: Multiple s_malloc requests alternating two memory types and s_free_type
  4: Multiple s_malloc requests with random memory types and s_free_type
  5: Memtype exception cases
  6: Multiple s_realloc requests by different memtype with old pointer
  7: Multiple s_malloc and s_free_type requests with different memtype
  8: Multiple s_malloc requests with random memory types and s_free
size <n> Memory allocation size per request. 8(B)
iter <n> Number of times memory allocation requests are repeated. 10
nthreads <n> Number of threads to run specified tests. 1
time Displays test execution time after the test is completed. false
vsizes Variable memory allocation request sizes; 8B, 64B, 512B, 4KB and 2MB. false
perthreadcpu Sets different cpu affinities for each thread. (applied only when nthreads > 1) false
exmem Sets memtype to EXMEM. (otherwise it is set to NORMAL) false
repeat <n> Number of times the test is repeated. 1

Result

g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
[Test Parameters] size=1024 ,iter=1000000, nthreads=1, mem type=0
[Test 8(tid=0)] Start
mem_used_normal(before malloc): 0
mem_used_exmem(before malloc): 0
mem_used_normal(after malloc): 511436800
mem_used_exmem(after malloc): 512573440
mem_used_normal(after free): 14336
mem_used_exmem(after free): 20480
[Test 8(tid=0)] End
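
The figures in this sample result can be cross-checked with simple arithmetic. The Python sketch below compares the requested bytes (size x iter from the example command) with the reported usage after malloc. Reading the small excess as allocator overhead is an interpretation, not something the log states; summarize is an illustrative name.

```python
def summarize(size, iterations, used_normal, used_exmem):
    """Compare requested bytes with the usage reported after malloc."""
    requested = size * iterations
    total_used = used_normal + used_exmem
    return {
        "requested": requested,
        "total_used": total_used,
        "excess": total_used - requested,        # bytes beyond the request
        "exmem_share": used_exmem / total_used,  # fraction placed in ExMem
    }


# Values copied from the example command and the sample result above.
report = summarize(size=1024, iterations=1_000_000,
                   used_normal=511_436_800, used_exmem=512_573_440)
for key, value in report.items():
    print(key, value)
```

For test 8, which requests random memory types, a roughly even split between the normal and ExMem pools is what one would expect, and the sample values are consistent with that.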

5.2.2.1.2 opt_api_cpp

In addition to memory allocation functions such as s_malloc and s_calloc, the SMDK optimization path provides a C++ memory allocation class (class SmdkAllocator).

This test verifies that you can instantiate the class and obtain the requested type of memory from the allocator.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_cpp
$ ./run_test_opt_api_cpp.sh

Result

g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
Test(basic functional test) starts
Test(basic functional test) ends
Test(malloc-free test) starts
Test(malloc-free test) ends
Test(malloc-memstat-free test) starts
......
After Malloc
        type: 0
        total: 66571993088
        used: 10240001024
        available: 51565993984
......
Test(malloc-memstat-free test) ends

5.2.2.1.3 opt_api_python

The SMDK allocator library is also available for Python3 applications. The script in this section tests how to use CXL memory in a Python application through the py_smdk package of SMDK.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_python
$ ./run_test_opt_api_py.sh

Result

g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
hello smdk
nice to meet you!
test 0 done
test 1 done
......
test 11 done

5.2.2.1.4 opt_api_nodectl

The test scripts here are for APIs (s_malloc_node, s_free_node, s_enable_node_interleave, and s_disable_node_interleave) of the optimization path.

1. run_alloc_onnode.sh

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_nodectl
$ ./run_alloc_onnode.sh [options...]

Options

Options Desc. Default
size <byte> Memory allocation size (bytes) per request. 67108864
iter <n> Number of times memory allocation requests are repeated. 10
nthreads <n> Number of threads to run specified tests. 1
node <n> A character string of CXL memory node. 1

Result

g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
mem used : 335544320
thread1 malloc test over
SMDK Memory allocation stats:
    Type             Total              Used         Available
  Normal            62.0GB             0.0GB            35.0GB
   ExMem            32.0GB             0.0GB            32.0GB

2. run_policy_test.sh

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_nodectl
$ ./run_policy_test.sh [options...]

Options

Options Desc. Default
size <byte> Memory allocation size (bytes) per request. 67108864
iter <n> Number of times memory allocation requests are repeated. 10
nthreads <n> Number of threads to run specified tests. 1
node <n> A character string list of CXL memory nodes (e.g., 1-2, 3) 0-1

Result

g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
[TEST START] smdk smalloc test under node interleave policy
[TEST PARAMETERS] nodes=1-3 size=67108864 iter=10 nthreads=1
create- thread1
thread1 malloc test start
......
[Warning] s_enable_node_interleave:invalid node(s).(1-3)
thread1 malloc test over

5.2.2.1.5 metadata_api

This test verifies the basic operation of SMDK's metadata APIs such as s_get_memsize_total and s_get_memsize_available. In particular, it checks whether the s_get_memsize_used function reports memory usage correctly in each pre-defined case. The test code also offers hints on how to use SMDK's metadata APIs in your applications.

Command lines

$ cd /path/to/SMDK/src/test/heap_allocator/metadata_api
$ ./run_test_meta_api.sh

Result

g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
SMDK Memory allocation stats:
    Type             Total              Used         Available
  Normal            62.6GB             0.0GB            57.3GB
   ExMem            32.0GB             0.0GB            27.3GB
[test] size=4096, total=1GiB, type=0
        mem_used (before malloc) = 0
        mem_requested = 1073741824
        mem_available (before malloc) = 61626081280
 
        mem_used (after malloc) = 1073766400
        mem_available (after malloc) = 60334178304
 
        mem_used (after free) = 61440
        mem_available (after free) = 60400619520
......
SMDK Memory allocation stats:
  Normal            62.6GB             0.0GB            56.4GB
   ExMem            32.0GB             0.0GB            31.9GB

5.2.2.2 PNM API

5.2.2.2.1 pnm_imdb

This test runs and verifies the SMDK PNM API for IMDB applications, covering Range scan and List scan operations with bit vector (BV) and index vector (IV) outputs.

Command lines

$ cd /path/to/SMDK/src/test/pnm
$ ./run_test_pnm_imdb.sh

Result

insmod /lib/modules/6.9.0-smdk/kernel/drivers/pnm/imdb_resource/imdb_resource.ko
g_arena_pool[0].nr_arena=32
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
[Test #1] RangeScan - Output BV starts
Sub Test starts
  [Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......

[Test #2] RangeScan - Output IV starts
Sub Test starts
  [Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......

[Test #3] ListScan - Output BV starts
Sub Test starts
  [Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......

[Test #4] ListScan - Output IV starts
Sub Test starts
  [Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......

PASS

5.2.2.2.2 pnm_dlrm

This test runs and verifies the SMDK PNM API for DLRM applications, covering SLS operations with two data types, Float and Uint32.

Command lines

$ cd /path/to/SMDK/src/test/pnm
$ ./run_test_pnm_dlrm.sh

Result

insmod /lib/modules/6.9.0-smdk/kernel/drivers/pnm/sls_resource/sls_resource.ko
[Test #1] SLS - Data Type: Float starts
Sub Test starts
  [Table Info] Tables count: 50, Rows number: 500000, Feature size: 16
  [SLSOp Info] Batch: 16, Max_lookup: 500, Min_lookup: 1, Alloc_option: REPLICATE_ALL
Sub Test done - PASS
......

[Test #2] SLS - Data Type: Uint32 starts
Sub Test starts
  [Table Info] Tables count: 50, Rows number: 500000, Feature size: 16
  [SLSOp Info] Batch: 16, Max_lookup: 500, Min_lookup: 1, Alloc_option: REPLICATE_ALL
Sub Test done - PASS
......

PASS

5.2.3 CXL-CLI

5.2.3.1 test_cli_cmd.sh

This test runs and verifies the new CXL spec-based commands that SMDK added, including poison, timestamp, event, identify, health, alert, shutdown state, and QoS control.

Command lines

$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_cmd.sh <poison_address>
(Example) $ ./test_cli_cmd.sh 10000

Options

Options Desc.
poison_address Poison injection address. (hexadecimal, should be greater than 0x1000)

Result

[set-timestamp]
$ cxl set-timestamp mem0
cxl memdev: cmd_set_timestamp: set-timestamp 1 mem
[get-timestamp]
$ cxl get-timestamp mem0
8/5/2022 9:19:12
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
......

5.2.3.2 test_cli_background_cmd.sh

This test runs and verifies the new CXL spec-based background commands that SMDK added, such as scan media and sanitize.

Command lines

$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_background_cmd.sh <scan_media_address>
(Example) $ ./test_cli_background_cmd.sh 10000

Options

Options Desc.
scan_media_address Scan media start address. (hexadecimal, should be greater than 0x1000)

Result

[get-scan-media-caps]
$ cxl get-scan-media-caps mem0 -a 0x10000 -l 0x80
Estimated Scan Media Time(ms): 256
cxl memdev: cmd_get_scan_media_caps: get-scan-media-caps 1 mem

[scan-media]
$ cxl scan-media mem0 -a 0x10000 -l 0x80
cxl memdev: cmd_scan_media: scan-media 1 mem
......
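
Both CLI test scripts take their address argument as a hexadecimal string, and the sample result shows the argument 10000 from the example command appearing as -a 0x10000. A wrapper can validate the argument before invoking the scripts. The Python sketch below does so; parse_address is an illustrative helper, not part of the SMDK test suite.

```python
MIN_ADDRESS = 0x1000  # the option tables require a value greater than this


def parse_address(arg):
    """Interpret arg as hexadecimal, with or without a leading '0x'."""
    value = int(arg, 16)  # raises ValueError for non-hexadecimal input
    if value <= MIN_ADDRESS:
        raise ValueError(f"address {arg!r} must be greater than {MIN_ADDRESS:#x}")
    return value


address = parse_address("10000")
print(hex(address))
```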

5.2.3.3 test_cli_group_cmd.sh

This test runs and verifies the new CXL node grouping (i.e. CXL memory partitioning) functions provided by the CLI's create-region, destroy-region, and list commands as extended by SMDK.

Command lines

$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_group_cmd.sh

Result

[create-region -V -G node]
        /proc/buddyinfo
Node 0, zone      DMA      0      0      0      0      0      0      0      0      1      1      3
......
        /proc/iomem CXL related info
1080000000-187fffffff : hmem.2
  1080000000-187fffffff : Soft Reserved
......

[destroy-region -V -N 1]
        /proc/buddyinfo
......

5.2.3.4 test_cli_exception.sh

This script does not exercise the normal operation of the commands. Instead, it checks that the CLI tool returns an error and prints a proper informational message when the user passes incorrect input to the new CXL-CLI commands.

Command lines

$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_exception.sh

Result

......
  Error: unknown option 'inval'

 usage: cxl destroy-region <region0> ... [<options>]

    -b, --bus <bus name>  Limit operation to the specified bus
    -d, --decoder <root decoder name>
                          Limit to / use the specified root decoder
        --debug           turn on debug
    -f, --force           destroy region even if currently active
    -V, --soft_interleaving
                          destory(remove) soft-interelaving node(s)
    -N, --target_node <n>
                          soft-interleaving node id to remove cxl devices followed
    -w, --ways <n>        number of cxldevs participating in the soft-interleaving node
......

Test Total: 15, Test Pass: 15
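
When the script is run unattended, the summary line can be checked mechanically. The sketch below is an illustration only and is not part of SMDK: the function name `all_tests_passed` is hypothetical, and it assumes the summary line keeps the exact `Test Total: N, Test Pass: M` format shown above.

```shell
# Hypothetical helper, not shipped with SMDK.
# Reads test output on stdin and succeeds only if the last summary line
# reports the same number for "Test Total" and "Test Pass".
all_tests_passed() {
    summary=$(grep -E 'Test Total: [0-9]+, Test Pass: [0-9]+' | tail -n 1)
    [ -n "$summary" ] || return 1         # no summary line found
    total=$(printf '%s\n' "$summary" | sed -E 's/.*Test Total: ([0-9]+).*/\1/')
    pass=$(printf '%s\n' "$summary" | sed -E 's/.*Test Pass: ([0-9]+).*/\1/')
    [ "$total" -eq "$pass" ]
}

# Example of how it might be used (hypothetical):
#   ./test_cli_exception.sh | all_tests_passed && echo "all exception tests passed"
```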

5.2.3.5 test_cli_cxlswap_cmd.sh

This test runs and verifies the CXL Swap control commands that SMDK added: enable-cxlswap, disable-cxlswap, check-cxlswap, and flush-cxlswap.

Command lines

$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_cxlswap_cmd.sh

Result

[disable-cxlswap]

[CXLSwap status] enabled

[CXL-CLI] disable-cxlswap
Success: CXLSwap is disabled.

[CXLSwap status] disabled

PASSED


[enable-cxlswap]

[CXLSwap status] disabled
......

5.2.3.6 test_cli_cxlcache_cmd.sh

This test runs and verifies the CXL Cache control commands that SMDK added: enable-cxlcache, disable-cxlcache, check-cxlcache, and flush-cxlcache.

Command lines

$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_cxlcache_cmd.sh

Result

[disable-cxlcache]

[CXLCache status] enabled

[CXL-CLI] disable-cxlcache
Success: CXLCache is disabled.

[CXLCache status] disabled

PASSED


[enable-cxlcache]

[CXLCache status] disabled
......