5. Plugin - OpenMPDK/SMDK GitHub Wiki
This section explains the technical background, functionality, and testing of the SMDK plugin, libraries, and tools. It gives practical guidance on using the compatible path, the optimization path, and the CXL-CLI tool provided by SMDK.
For an introduction to the compatible and optimization paths, please refer to the User Interface section of the SMDK Architecture chapter.
The SMDK compatible path provides methods to utilize CXL memory without modifying applications. SMDK provides a plugin for the compatible path: the compatible allocator library, which targets the heap segment of a process. This section introduces how to use the compatible path of SMDK on your system by preloading the SMDK compatible library through the LD_PRELOAD environment variable.
To enable and use the SMDK compatible library, you should set the environment variables LD_PRELOAD and CXLMALLOC_CONF as below before running your applications.
LD_PRELOAD
LD_PRELOAD is an environment variable that lets you override functions of other libraries with the functions defined in a shared object you specify. The Linux dynamic linker/loader reads it and loads the listed shared libraries before any other library.
Please check the following:
- Only shared libraries (*.so) can be preloaded by LD_PRELOAD, so you should specify libcxlmalloc.so as the LD_PRELOAD target.
- LD_PRELOAD is an environment variable, so it affects only the shell in which it is set and the processes launched from that shell.
Refer to run_heap_test.sh (/path/to/SMDK/src/test/heap_allocator/comp_api_c) for an example of LD_PRELOAD usage.
If you run $ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so and then run your application, libcxlmalloc.so is loaded before any other library. As a result, your application uses the heap allocation APIs (e.g., malloc, calloc) of SMDK's compatible library in preference to those of other libraries.
Note that you do not need to modify your application and its build script at all to enable and use SMDK's compatible path and APIs.
$ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so
$ ./your_application   # written in C, C++, Python, or Java
CXLMALLOC_CONF
SMDK supports various configurations, allowing you to specify each option before running your own applications. You can find the description and default value of each configuration in the table below.
To apply the following configurations, set the environment variable CXLMALLOC_CONF and export it.
$ export CXLMALLOC_CONF=priority:exmem,exmem_size:4096,normal_size:4096,maxmemory_policy:interleave
The name of a configuration parameter and its value should be joined by a colon (:), and configurations are separated by a comma (,). If the configuration string ends with a comma (e.g., ......, maxmemory_policy:interleave,), the configurations will not be applied properly.
Note: exmem_size and normal_size from the configurations below should not exceed the system's available memory (DDR and CXL) during the runtime of your application. If these limits are exceeded, the system policy for handling out-of-memory (OOM) occurrences (e.g., process kill by the OOM killer) will take precedence over the allocator's policy (maxmemory_policy).
Config. | Desc. | Default | Note |
---|---|---|---|
use_exmem | Whether to enable CXL memory or not. | FALSE | true/false |
priority | Which type of memory is allocated to applications first. Once allocation from the higher-priority memory type reaches its maximum (exmem_size or normal_size), the SMDK allocator allocates from the lower-priority type. | normal | exmem/normal |
exmem_size | Maximum usage of CXL memory. If the cumulative allocated size of CXL memory allocation exceeds this value, SMDK Allocator performs a follow-up operation according to the current memory priority and the maxmemory_policy below. When specifying the exmem_size value (number), it is recognized as MB when there is no unit after the number. Or you can specify units such as m/M/g/G after the number. (e.g., 1024 == 1024M == 1G). If it is specified as -1, it has the meaning of unlimited. | 1048576(MB) | |
normal_size | Maximum usage of normal memory. If the cumulative allocated size of DRAM allocation exceeds this value, SMDK Allocator performs a follow-up operation according to the current memory priority and the maxmemory_policy below. When specifying the normal_size value (number), it is recognized as MB when there is no unit after the number. Or you can specify units such as m/M/g/G after the number. (e.g., 1024 == 1024M == 1G). If it is specified as -1, it has the meaning of unlimited. | 1048576(MB) | |
maxmemory_policy | interleave: Allocates a high(first)-priority type of memory first, then allocate a low(second)-priority type when allocating all of the memory specified in {high-priority-mem}_size. If all of memory is allocated as much as specified in {low-priority-mem}_size, a high-priority memory type is allocated again. remain: Allocates a high-priority type of memory first, and allocate a low-priority type when allocating all of the memory specified in {high-priority-mem}_size. Regardless of the value set in {low-priority-mem}_size, allocator will allocate memory from the second priority type continuously. oom (out-of-memory): Allocates memory in order of priority. When all types of DRAM/CXL memory are allocated as much as specified in {mem-type}_size, the memory is no longer allocated (return error). | oom | |
use_auto_arena_scaling | Affects the following two things: 1) The number of arenas generated during initialization would be: false: static number. true: in proportion to the number of CPUs. 2) Arena allocation to threads would be: false: a round-robin way. true: determined based on CPU_ID the thread is running on. | TRUE | |
exmem_partition_range | Sets a CXL memory expander interleave policy. This configuration can give you the effect of CXL memory expanders' bandwidth aggregation and resource isolation. If you specify multiple CXL.mem device nodes, SMDK allocator configures and returns a memory chunk from the memory pool of all the specified nodes. That is, as the number of nodes increases, the bandwidth would increase. If you specify only one CXL.mem device node, SMDK allocator configures and returns a memory chunk from the memory pool of the specified node only. In this case, the bandwidth cannot be higher than the maximum bandwidth of each single device but you can achieve memory resource isolation effect by using only certain CXL.mem devices. Nodes may be specified as N,N,N or N-N or N,N-N or N-N,N-N and so forth. You may also set this config to all, which means all CXL nodes in your system. If a normal memory node is specified to this configuration, the value is ignored. If a value outside the range of nodes in the system is specified, this configuration is ignored due to a setup error. | N/A (no policy) | all N,N N-N ... |
use_adaptive_interleaving | Whether to enable Bandwidth-based Interleaving or not. | FALSE | true/false |
adaptive_interleaving_policy | bw_saturation: Set an adaptive interleaving policy as bandwidth saturation. If DDR DRAM bandwidth is saturated, automatically use CXL DRAM. When the CXL DRAM bandwidth is also saturated, then allocated by the System Fallback Order. bw_order: Set an adaptive interleaving policy as bandwidth order. In this policy, requested memory is allocated from the highest Bandwidth node. weighted_interleaving: Set an adaptive interleaving policy as weighted interleaving. Set application's memory policy as weighted interleaving. The weight ratio of each node is determined by available bandwidth reported by tierd during runtime dynamically. | bw_saturation | |
interleave_node | Optionally, set weighted interleaving policy's interleaving nodelists. If this config isn't set, interleave all system nodes. | N/A | N,M N-M |
Note: priority, maxmemory_policy, exmem_size, and normal_size are closely related configurations. Please refer to the section Capacity-based Interleaving for details of the operations under each configuration. Also, if use_adaptive_interleaving is true, priority, normal/exmem_size, and maxmemory_policy are ignored. In order to utilize the adaptive interleaving feature of SMDK Allocator, additional settings are required. Please refer to the section Bandwidth-based Interleaving for more details.
Usage
You can enable the SMDK Allocator library by exporting LD_PRELOAD and CXLMALLOC_CONF environment variables as shown below.
$ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so
$ export CXLMALLOC_CONF=use_exmem:true,exmem_size:16384,normal_size:16384,maxmemory_policy:remain,use_auto_arena_scaling:false,priority:exmem
$ ./your_application
Operation Verification
Before describing the test cases and applications provided by SMDK, this section shows how to check CXL memory usage. The example uses the in-memory database application Memcached on a system with SMDK and CXL memory.
Configuration
Memcached server with SMDK/CXL | Memcached client (memtier_benchmark) |
---|---|
[SMDK CXLMALLOC_CONF] use_exmem:true normal_size:2048 exmem_size:20480 maxmemory_policy:remain priority:normal [Memcached] --memory-limit=22528 --conn-limit=1400 | --threads=24 --clients=50 --requests=2000 --data-size=4096 --ratio=1:0 (100% SET W/L) |
- SMDK configuration in Memcached server: normal_size is set to 2GB and the exmem_size is set to 20GB. priority is normal, and maxmemory_policy is remain. In other words, SMDK allocator allocates DRAM for heap memory requests up to 2GB, and then allocates CXL memory.
- memtier_benchmark (Memcached client) configuration: 1200 clients (24 threads x 50 client connections per thread) each send 2000 SET commands to the Memcached server. Each value is 4KB, so the total data size the Memcached server has to store is around 9.2GB. The actual amount of memory used by the Memcached server is around 11GB, including internal structures and metadata.
CXL memory usage: Buddy information (/proc/buddyinfo)
One of the easiest ways to find out whether CXL memory devices have been successfully initialized and are being used properly is to check the system's buddy allocation information using the command $ cat /proc/buddyinfo. The table below is an example of the information from /proc/buddyinfo on a system with a page size of 4KB. It shows the number of free (not used or allocated) chunks from 4KB to 4MB for each memory zone managed by the kernel virtual memory manager (VMM). Please note that the row "Node 1, zone Movable" indicates CXL memory.
Zone | 4 KB | 8 KB | 16 KB | 32 KB | 64 KB | 128 KB | 256 KB | 512 KB | 1 MB | 2 MB | 4 MB |
---|---|---|---|---|---|---|---|---|---|---|---|
Node 0, zone DMA | 1 | 0 | 0 | 1 | 2 | 1 | 1 | 0 | 1 | 1 | 3 |
Node 0, zone DMA32 | 3 | 8 | 3 | 4 | 4 | 5 | 3 | 5 | 4 | 4 | 436 |
Node 0, zone Normal | 11975 | 9494 | 6262 | 4306 | 2617 | 1547 | 930 | 537 | 332 | 72 | 45610 |
Node 1, zone Movable | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 98300 |
Running test and changes of buddyinfo
- Launch Memcached server: The table below shows the initial status of free buddy chunks right after starting the Memcached server. The test system has three 128GB CXL memory expanders, and the Movable zone below shows 98300 free 4MB chunks (4MB * 98300 ≒ 384GB).
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 11975 9494 6262 4306 2617 1547 930 537 332 72 45610
Node 1, zone Movable 1 0 1 1 0 1 1 1 0 1 98300
- Set data (~2GB): As described above, SMDK allocates from the NORMAL zone for the first 2GB of heap allocation requests. The buddyinfo output shows that the number of free 4MB chunks in the NORMAL zone decreased from 45610 to 45037 (≒2.2GB). Taking the changes in the other free chunk sizes into account, 2134756KB (2.04GB) has been allocated from the NORMAL zone. SMDK then begins to allocate from the Movable zone: the number of free 4MB chunks there decreased slightly, from 98300 to 98050.
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 204 1291 3670 4713 3701 2348 1367 709 306 76 45037
Node 1, zone Movable 1 0 0 0 1 1 1 1 1 0 98050
- Set data (~11GB): The table below shows a free chunk’s status right after all the data set requests from Memcached client are processed. Calculating the free memory usage of the Movable zone, the memory size allocated from Movable zone is 9.2GB (402639796KB → 392957800KB).
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 4848 1032 2577 5042 3690 2340 1362 711 304 78 45028
Node 1, zone Movable 0 1 0 1 1 0 1 1 1 1 95936
The SMDK compatible path provides ways to utilize CXL memory without modifying application software. Below is the list of compatible APIs that SMDK provides. Technically, these are the standard POSIX heap APIs that a process calls dynamically to extend and shrink its heap segment.
API | Desc. | Note |
---|---|---|
malloc | Allocates size bytes of uninitialized memory. | |
calloc | Allocates memory for an array of num objects of size and initializes all bytes in the allocated memory to zero. | |
realloc | Reallocates the given area of memory. | |
free | Deallocates the space previously allocated by malloc(), calloc(), aligned_alloc(), or realloc(). | |
posix_memalign | Allocates size bytes aligned on a boundary specified by alignment, and returns a pointer to the allocated memory in memptr. | |
aligned_alloc | Allocates size bytes of uninitialized memory whose alignment is specified by alignment. | C++ |
new | Allocates requested number of bytes. | C++ |
delete | Deallocates memory previously allocated by a matching operator new. | C++ |
mmap, mmap64 | Creates a new mapping of memory in the virtual address space of the calling process. | |
1. malloc
void* malloc(size_t size);
Parameters
- size: number of bytes to allocate.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc().
- On failure, returns a null pointer.
2. calloc
void* calloc(size_t num, size_t size);
Parameters
- num: number of objects.
- size: size of each object.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc().
- On failure, returns a null pointer.
3. realloc
void *realloc(void *ptr, size_t new_size);
Parameters
- ptr: pointer to the memory area to be reallocated.
- new_size: new size of the array in bytes.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc(). The original pointer ptr is invalidated and any access to it is undefined behavior (even if reallocation was in-place).
- On failure, returns a null pointer. The original pointer ptr remains valid and may need to be deallocated by calling free() or realloc().
4. free
void free(void *ptr);
Parameters
- ptr: pointer to the memory to deallocate.
Return Value
- N/A
5. posix_memalign
int posix_memalign(void **memptr, size_t alignment, size_t size);
Parameters
- memptr: pointer that shall be returned. Upon success, the value pointed to by memptr shall be a multiple of alignment.
- alignment: specifies the alignment. The value of alignment shall be a power of two multiple of sizeof(void *).
- size: number of bytes to allocate.
Return value
- On success, returns zero.
- On failure, an error number shall be returned to indicate the error, and the contents of memptr shall either be left unmodified or be set to a null pointer.
6. aligned_alloc
void *aligned_alloc(size_t alignment, size_t size);
Parameters
- alignment: specifies the alignment. Must be a valid alignment supported by the implementation.
- size: number of bytes to allocate. An integral multiple of alignment.
Return Value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling free() or realloc().
- On failure, returns a null pointer.
7. new, new[]
void * operator new(std::size_t size);
void * operator new[](std::size_t size);
Parameters
- size: size in bytes of the requested memory block.
Return value
- Returns a pointer suitably aligned to point to an object of the requested size.
8. delete, delete[]
void operator delete(void *ptr);
void operator delete[](void *ptr);
Parameters
- ptr: A pointer to the memory block to be released.
Return value
- N/A
9. mmap, mmap64
void *mmap(void *addr, size_t len, int protection, int flags, int fd, off_t offset);
void *mmap64(void *addr, size_t len, int protection, int flags, int fd, off64_t offset);
Parameters
- addr: the starting address of the memory area to be mapped.
- len: the length in bytes to map.
- protection: the access allowed to this process for this mapping. (PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC)
- flags: further defines the type of mapping desired. (MAP_SHARED, MAP_PRIVATE, MAP_FIXED)
- fd: an open file descriptor.
- offset: the offset into the file, in bytes, where the map should begin.
Return value
- On success, mmap() returns a pointer to the mapped area.
- On error, the value MAP_FAILED (that is, (void *)-1) is returned, and errno is set to indicate the error.
As described in the How to Use section, SMDK includes an intelligent tiering engine that supports a variety of memory use cases through application-level configuration: tiering priorities, capacities, and bandwidth among memories.
By setting the priority, exmem_size/normal_size, and maxmemory_policy parameters of the CXLMALLOC_CONF environment variable, you can configure which memory type your application uses first, the maximum usage of each type, and the memory usage policy.
The memory usage policy determines how memory is allocated once usage exceeds the pre-defined maximum of both the first- and second-priority memory types.
- Interleave: the application is allocated memory from the higher-priority memory type again.
- Remain: the application continues to be allocated memory from the second-priority memory type.
- OOM (Out Of Memory): SMDK returns an error. In this case, the application can obtain memory again only after allocated pages are reclaimed. This policy takes into account that the conventional Linux virtual memory subsystem restricts the system or process swappiness.
Example
This example assumes a user scenario with the Redis in-memory database: a Redis client requests that 1GB of key-value data be stored on a Redis server running with SMDK on a system with a CXL memory expander. Note that the maximum size for each memory type is set to 64MB.
[Redis-server with SMDK]
priority: ExMem
exmem_size: 64MB
normal_size: 64MB
maxmemory_policy: interleave / remain / oom
[Redis-client]
1MB value x 1000 keys (=around 1GB)
With the above configuration, SMDK allocates memory under each maxmemory_policy as follows.
- Interleave: Allocator allocates 64MB from CXL memory to the Redis-server first, then allocates 64MB of NORMAL (DRAM) memory. After allocating 64MB of DDR memory, it allocates CXL memory again. It repeats.
- Remain: Allocator allocates 64MB from CXL memory to the Redis-server first, then allocates 64MB of NORMAL (DRAM) memory. After allocating all 64MB of DDR memory, it does not allocate CXL memory again. In other words, after the initial allocation of CXL memory (because the ‘priority’ is CXL), it allocates DRAM continuously.
- OOM: Allocator allocates 64MB from CXL memory to the Redis-server first, then allocates 64MB of NORMAL (DRAM) memory. After allocating all 64MB of DDR memory, SMDK does not allocate memory at all (i.e., Returns error).
By setting the use_adaptive_interleaving and adaptive_interleaving_policy parameters of the CXLMALLOC_CONF environment variable, you can configure bandwidth-aware memory policies for your applications.
Adaptive interleaving currently provides three policies.
- bw_saturation: When DDR DRAM bandwidth is saturated, incoming memory allocation requests are served from CXL DRAM. This is designed to mitigate imbalanced memory use on a tiered DDR/CXL memory system.
- bw_order: Uses the node with the highest bandwidth in the system when allocating memory. This is designed to use CXL DRAM more actively through a bandwidth-based allocation order rather than the latency-based order of existing Linux.
- weighted_interleaving: Similar to the existing interleave policy, except that memory is allocated according to an interleave weight ratio rather than evenly. This is designed to improve the overall performance of the system by taking the bandwidth difference between DDR DRAM and CXL DRAM into account.
Example
In order to use adaptive interleaving, first check through check_tierd.sh that the required components (monitor, kernel driver, SMDK allocator, and PMU plugin, as described in Intelligent Tiering Engine) work properly. If the check passes, you are ready to use adaptive interleaving. These components can be started at once through the provided run_tierd.sh script as shown below. Note that root privileges are required to load kernel modules, and you should modify the contents of the configuration file (/path/to/SMDK/lib/tierd/tierd.conf) to match your repository path.
$ cd /path/to/SMDK/lib/tierd
# Modify MLC_PATH and AMD_UPROFPCM_PATH(in case of AMD architecture)
$ vi tierd.conf
$ sudo ./run_tierd.sh
Note: If the script fails to run, make sure you are booting with the SMDK kernel. For more details, please refer to the Intelligent Tiering Engine section of the Installation chapter.
Alternatively, you can run it manually as shown below.
$ cd /path/to/SMDK/lib/tierd
$ sudo ./tierd -c ./tierd.conf
The current version of adaptive interleaving uses Intel MLC as a memory-intensive workload generator to measure the maximum bandwidth of each node. When tierd starts, it runs MLC once for the first time, which can take several minutes. Adaptive interleaving compares the real-time bandwidth to this measured value to determine whether DDR DRAM bandwidth is saturated. It means you need to wait for tierd to finish executing MLC before using adaptive interleaving feature, which can be verified by tierd's output logs like below.
$ sudo ./tierd -c ./tierd.conf
......
Monitor Constructor
Launch Monitor
Launch Bandwidth Loader Workload.. /path/to/smdk/lib/mlc/Linux/mlc
......
Bandwidth Loader Workload Finish (##.###s)
Notify...
......
You are now all ready to use adaptive interleaving. To enable an application to use the feature with SMDK Allocator, use_adaptive_interleaving parameter should be set to true in CXLMALLOC_CONF environment variable as follows.
$ export LD_PRELOAD=/path/to/SMDK/lib/smdk_allocator/lib/libcxlmalloc.so
$ export CXLMALLOC_CONF=use_exmem:true,use_adaptive_interleaving:true,adaptive_interleaving_policy:bw_saturation
$ ./your_application
The criterion for interleaving is selected by adaptive_interleaving_policy. The available options are bw_saturation, bw_order, and weighted_interleaving; bw_saturation is the default value of this configuration parameter.
To terminate the execution of the userspace daemon, run the script below.
$ cd /path/to/SMDK/lib/tierd
$ sudo ./stop_tierd.sh
Unlike the compatible path, you do not need to set LD_PRELOAD and CXLMALLOC_CONF to use the optimization path library.
Instead, you need to
- Include the header file (/path/to/SMDK/lib/smdk_allocator/opt_api/include/smdk_opt_api.h) in your application code.
- Re-write your application with SMDK optimization APIs for better memory optimization.
- Modify your build script so that your application can be built with SMDK library (Add library path for libsmalloc.so or libsmalloc.a, and libpnm.so; /path/to/SMDK/lib/smdk_allocator/lib).
Makefile modification and LD_LIBRARY_PATH
As for modifying the build script of your application,
- If you want to link the shared library of SMDK (libsmalloc.so), you need to set the library path and library name with -L(path) and -l(name) options, respectively.
- If you want to link the static library of SMDK (libsmalloc.a), you need to set the full path of the library to the compiler flag.
- Regardless of the ways above, you should specify the path of the header file in which the SMDK optimization APIs are defined by using the -I option. In the example below, it is added to CFLAGS variable.
### Example Makefile to link SMDK *optimization* API library:
# dynamic link
......
CFLAGS += -I/path/to/SMDK/lib/smdk_allocator/opt_api/include
LDFLAGS += -L/path/to/SMDK/lib/smdk_allocator/lib/
LIBS += -lsmalloc
......
all: $(APP)
$(APP): $(APP).o
	$(CC) -o $@ $^ $(CFLAGS) $(LDFLAGS) $(LIBS)
# static link
......
CFLAGS += -I/path/to/SMDK/lib/smdk_allocator/opt_api/include
LIBS += /path/to/SMDK/lib/smdk_allocator/lib/libsmalloc.a
......
all: $(APP)
$(APP): $(APP).o
	$(CC) -o $@ $^ $(CFLAGS) $(LDFLAGS) $(LIBS)
/* example application code */
#include "smdk_opt_api.h"
...
int main(void) {
......
void *buf1 = s_malloc(SMDK_MEM_EXMEM, 4*1024); // 4KB CXL memory allocation request
void *buf2 = s_malloc(SMDK_MEM_NORMAL, 128); // 128B DRAM allocation request
......
s_free_type(SMDK_MEM_EXMEM, buf1); // or s_free(buf1);
s_free_type(SMDK_MEM_NORMAL, buf2); // or s_free(buf2);
......
}
If you choose dynamic linking, make sure to specify the SMDK library's path in the LD_LIBRARY_PATH environment variable before running your application, as shown below.
$ export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib
For more information about how to use optimization API library, refer to the test applications named opt_api, opt_api_cpp, and metadata_api at /path/to/SMDK/src/test/heap_allocator.
SMALLOC_CONF
Unlike the compatible path, where many configurations can be set through the CXLMALLOC_CONF environment variable, the optimization path only supports the option use_auto_arena_scaling through the SMALLOC_CONF environment variable. SMALLOC_CONF is set in the same way as CXLMALLOC_CONF, described in the Compatible Path section.
Config. | Desc. | Default | Note |
---|---|---|---|
use_auto_arena_scaling | Affects the following two things: 1) The number of arenas generated during initialization would be: false: static number. true: in proportion to the number of CPUs. 2) Arena allocation to threads would be: false: a round-robin way. true: determined based on CPU_ID the thread is running on. | TRUE | |
$ export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib #if needed
$ export SMALLOC_CONF=use_auto_arena_scaling:true
$ ./your_application
PNM API
To use the SMDK PNM API, a few more steps are required. (Currently, the PNM API is available only in C++.)
You need to
- Include the header file (/path/to/SMDK/lib/smdk_allocator/opt_api/include/smdk_opt_api.hpp) in your application code.
- Re-write your application with SMDK PNM API for better processing operations.
- Modify the Makefile to set the library name with -l(name) option.
- Specify the path of the PNMLibrary header files which the SMDK PNM API includes by using the -I option (/path/to/SMDK/lib/PNMLibrary-pnm-v3.0.0/build/libs/include/). In the example below, it is added to CXXFLAGS variable.
### Example Makefile to link *SMDK PNM* API library:
......
CXXFLAGS += -I/path/to/SMDK/lib/smdk_allocator/opt_api/include -I/path/to/SMDK/lib/PNMLibrary-pnm-v3.0.0/build/libs/include
LDFLAGS += -L/path/to/SMDK/lib/smdk_allocator/lib/
LIBS += -lsmalloc -lpnm
......
all: $(APP)
$(APP): $(APP).o
	$(CXX) -o $@ $^ $(CXXFLAGS) $(LDFLAGS) $(LIBS)
/* example application code for IMDB - Range Scan Operation */
#include "smdk_opt_api.hpp"
...
int main(void) {
......
SmdkAllocator& allocator = SmdkAllocator::get_instance();
allocator.process(SmdkAllocator::Device::PNM,
SmdkAllocator::PNMType::IMDB,
SmdkAllocator::Operation::ScanRange,
column, ranges, results);
......
}
Since PNMLibrary is linked dynamically, you should specify the PNMLibrary's path in LD_LIBRARY_PATH environment variable before running your application as shown below.
$ export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib
For more information about how to use SMDK PNM API, refer to the test applications named pnm at /path/to/SMDK/src/test.
There are three API sets in the optimization path: Allocation API, Metadata API, and PNM API. The Allocation and Metadata API sets are C-based, and SMDK additionally provides an allocator class (SmdkAllocator) for using them from C++. The PNM API is available only in C++.
C API | Desc. | C++ API | Device Type |
---|---|---|---|
smdk_memtype_t | Datatype which represents memory type. Contains two elements SMDK_MEM_NORMAL and SMDK_MEM_EXMEM. | | DDR, CXL DRAM |
s_malloc | Allocates size bytes of uninitialized memory. | SmdkAllocator::malloc(type, size) | DDR, CXL DRAM |
s_calloc | Allocates memory for an array of num objects of size and initializes all bytes in the allocated memory to zero. | SmdkAllocator::calloc(type, num, size) | DDR, CXL DRAM |
s_realloc | Reallocates the given area of memory on designated memory type. Can get any target type of memory regardless of original location of given area. | SmdkAllocator::realloc(type, ptr, size) | DDR, CXL DRAM |
s_free | Deallocates the space previously allocated by s_malloc(), s_calloc(), s_posix_memalign(), or s_realloc(). | SmdkAllocator::free(ptr) | DDR, CXL DRAM |
s_free_type | Deallocates the space previously allocated on designated type of memory. Operates fine although target type of memory and location of pointer do not match. | SmdkAllocator::free(type, ptr) | DDR, CXL DRAM |
s_posix_memalign | Allocates size bytes aligned on a boundary specified by alignment, and returns a pointer to the allocated memory in memptr. | SmdkAllocator::posix_memalign(type, memptr, alignment, size) | DDR, CXL DRAM |
s_get_memsize_total | Returns total size of requested type of memory. | SmdkAllocator::get_memsize_total(type) | DDR, CXL DRAM |
s_get_memsize_used | Returns size of memory (bytes) allocated with SMDK APIs. | SmdkAllocator::get_memsize_used(type) | DDR, CXL DRAM |
s_get_memsize_available | Returns available size of requested type of memory. (bytes) | SmdkAllocator::get_memsize_available(type) | DDR, CXL DRAM |
s_get_memsize_node_total | Returns total size of requested type of memory of node. (bytes) | | DDR, CXL DRAM |
s_get_memsize_node_available | Returns available size of requested type of memory of node. | | DDR, CXL DRAM |
s_stats_print | Prints out above three data (total, used, and available) per type of memory. | SmdkAllocator::stats_print(unit) | DDR, CXL DRAM |
s_stats_node_print | Prints out above three data (total, used, and available) per node and per type. | SmdkAllocator::stats_node_print(unit) | DDR, CXL DRAM |
s_enable_node_interleave | Sets interleave policy of calling thread with mmap() syscall. | SmdkAllocator::enable_node_interleave(nodes) | DDR, CXL DRAM |
s_disable_node_interleave | Unsets interleave policy of calling thread. | SmdkAllocator::disable_node_interleave() | DDR, CXL DRAM |
s_malloc_node | Allocates memory from specified node. | SmdkAllocator::malloc_node(type, size, node) | DDR, CXL DRAM |
s_free_node | Deallocates the space previously allocated by s_malloc_node(). | SmdkAllocator::free_node(type, mem, size) | DDR, CXL DRAM |
 | Processes IMDB Scan operation with the PNM device | SmdkAllocator::process(PNM, type=IMDB, op, data, op_info, result) | PNM |
 | Processes DLRM SLS operation with the PNM device | SmdkAllocator::process(PNM, type=DLRM, SLS, data, op_info, result) | PNM |
1. smdk_memtype_t
Enumeration used as a parameter of the allocation functions to represent the target memory type.
typedef enum {
SMDK_MEM_NORMAL=0,
SMDK_MEM_EXMEM
} smdk_memtype_t;
2. s_malloc
void *s_malloc(smdk_memtype_t type, size_t size);
Parameters
- type: target memory types to allocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- size: number of bytes to allocate.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free(), s_free_type() or s_realloc().
- On failure, returns a null pointer (e.g. invalid memory type, lack of available memory, etc.).
3. s_calloc
void* s_calloc(smdk_memtype_t type, size_t num, size_t size);
Parameters
- type: target memory types to allocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- num: number of objects.
- size: size of each object.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free(), s_free_type() or s_realloc().
- On failure, returns a null pointer (e.g. invalid memory type, lack of available memory, etc.).
4. s_realloc
void *s_realloc(smdk_memtype_t type, void *ptr, size_t new_size);
Parameters
- type: target memory types to reallocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- ptr: pointer to the memory area to be reallocated.
- new_size: new size of the array in bytes.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free(), s_free_type() or s_realloc(). The original pointer ptr is invalidated and any access to it is undefined behavior (even if reallocation was in-place).
- On failure, returns a null pointer (e.g. invalid memory type or address, etc.). The original pointer ptr remains valid and may need to be deallocated by calling s_free(), s_free_type() or s_realloc().
If the memory type of the original ptr differs from the requested type, this function deallocates the original ptr and returns a pointer to a new buffer of the type you specified.
5. s_free
void s_free(void *ptr);
Parameters
- ptr: pointer to the memory to deallocate.
Return Value
- N/A
With this function you do not need to specify the memory type of ptr, but deallocation may take slightly longer than calling s_free_type() with the correct type.
6. s_free_type
void s_free_type(smdk_memtype_t type, void *ptr);
Parameters
- type: target memory types to free. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- ptr: pointer to the memory to deallocate.
Return Value
- N/A
Set type to the memory type from which ptr was actually allocated. If the specified type differs from the actual one, the memory is still deallocated normally, but deallocation may take slightly longer than when the type is specified correctly.
7. s_posix_memalign
int s_posix_memalign(smdk_memtype_t type, void **memptr, size_t alignment, size_t size);
Parameters
- type: target memory type to allocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- memptr: pointer that shall be returned. Upon success, the value pointed to by memptr shall be a multiple of alignment.
- alignment: specifies the alignment. The value of alignment shall be a power-of-two multiple of sizeof(void *).
- size: number of bytes to allocate.
Return value
- On success, returns zero.
- On failure (e.g. invalid memory type or address, etc.), an error number shall be returned to indicate the error and the contents of memptr shall either be left unmodified or be set to a null pointer.
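The alignment rule above can be checked before calling the API. The following is a minimal Python sketch of that rule only; the helper name `is_valid_alignment` is hypothetical and not part of SMDK, and how s_posix_memalign itself reports an invalid alignment is not covered here.

```python
import ctypes

# sizeof(void *) on the platform running this script
POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)


def is_valid_alignment(alignment):
    """Return True if alignment is a power of two and a multiple of sizeof(void *)."""
    if alignment <= 0:
        return False
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    is_power_of_two = (alignment & (alignment - 1)) == 0
    return is_power_of_two and alignment % POINTER_SIZE == 0


if __name__ == "__main__":
    for candidate in (1, 24, 64, 4096):
        print(candidate, is_valid_alignment(candidate))
```

Because sizeof(void *) is itself a power of two, the two conditions together are equivalent to "a power-of-two multiple of sizeof(void *)".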
8. s_get_memsize_total
size_t s_get_memsize_total(smdk_memtype_t type);
Parameters
- type: memory type requesting. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
Return value
- Returns the system total memory of requested type based on /proc/zoneinfo (bytes).
- Returns 0 on invalid memory type.
9. s_get_memsize_used
size_t s_get_memsize_used(smdk_memtype_t type);
Parameters
- type: memory type requesting. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
Return value
- Returns used memory of the requested type which is allocated by SMDK heap allocation APIs (bytes).
- Returns 0 on invalid memory type.
10. s_get_memsize_available
size_t s_get_memsize_available(smdk_memtype_t type);
Parameters
- type: memory type requesting. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
Return value
- Returns system available memory of the requested type based on /proc/buddyinfo (bytes).
- Returns 0 on invalid memory type.
11. s_get_memsize_node_total
size_t s_get_memsize_node_total(smdk_memtype_t type, int node);
Parameters
- type: memory type requesting. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- node: number of node requesting.
Return value
- Returns system total memory of the requested type in designated node based on /proc/zoneinfo (bytes).
- Returns 0 on invalid memory type or node.
12. s_get_memsize_node_available
size_t s_get_memsize_node_available(smdk_memtype_t type, int node);
Parameters
- type: memory type requesting. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- node: number of node requesting.
Return value
- Returns system available memory of requested type in designated node based on /proc/buddyinfo (bytes).
- Returns 0 on invalid memory type or node.
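To illustrate how an "available" figure can be derived from /proc/buddyinfo, here is a small Python sketch. It is not SMDK's implementation: the function name is hypothetical, a 4 KiB base page size is assumed, and the mapping from a memory type to particular nodes or zones is left out.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def free_bytes_per_node_zone(buddyinfo_text, page_size=PAGE_SIZE):
    """Map (node, zone) -> free bytes, given text in /proc/buddyinfo format."""
    result = {}
    for line in buddyinfo_text.strip().splitlines():
        fields = line.replace(",", " ").split()
        # fields: ["Node", "<id>", "zone", "<name>", <order-0 count>, <order-1 count>, ...]
        node, zone = int(fields[1]), fields[3]
        counts = [int(c) for c in fields[4:]]
        # Each free block of order k spans 2**k contiguous pages.
        pages = sum(count << order for order, count in enumerate(counts))
        result[(node, zone)] = pages * page_size
    return result


sample = """\
Node 0, zone      DMA      0      0      0      0      0      0      0      0      1      1      3
Node 1, zone  Movable      0      0      0      0      0      0      0      0      0      0  98304
"""

free = free_bytes_per_node_zone(sample)
print(free[(1, "Movable")] / 2**30)  # free GiB in node 1, zone Movable
```

On a real system the input would be the contents of /proc/buddyinfo instead of the `sample` string.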
13. s_stats_print
void s_stats_print(char unit);
Parameters
- unit: Memory units to display statistic information. (k/K/m/M/g/G)
Return value
- N/A
Result
- Prints out total / used / available memory statistic information of each type of memory. Below is an example of a console screen output after executing this function on a system equipped with 64GB DRAM and 32GB CXL.mem.
SMDK Memory allocation stats:
Type Total Used Available
Normal 62.6GB 0.0GB 57.2GB
ExMem 32.0GB 0.0GB 32.0GB
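The unit parameter selects how byte counts are scaled for display. A rough Python sketch of such a conversion follows; it assumes binary units (1 GB = 2^30 bytes) and one decimal place as in the sample output, neither of which is confirmed by the source, and `format_size` is a hypothetical helper rather than an SMDK function.

```python
# Assumption: units are powers of 1024.
UNIT_SHIFT = {"k": 10, "m": 20, "g": 30}


def format_size(num_bytes, unit):
    """Format a byte count in the given unit (k/K/m/M/g/G) with one decimal place."""
    shift = UNIT_SHIFT.get(unit.lower())
    if shift is None:
        raise ValueError("unsupported unit: %r" % unit)
    return "%.1f%sB" % (num_bytes / (1 << shift), unit.upper())


if __name__ == "__main__":
    print(format_size(32 * 2**30, "G"))
    print(format_size(1536, "k"))
```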
14. s_stats_node_print
void s_stats_node_print(char unit);
Parameters
- unit: Memory units to display statistic information. (k/K/m/M/g/G)
Return value
- N/A
Result
- Prints out total / used / available memory statistic information of each type of memory per node.
Type Node Total Available
Normal 0 32.0GB 27.9GB
ExMem 1 32.0GB 29.0GB
ExMem 2 32.0GB 28.1GB
15. s_enable_node_interleave
void s_enable_node_interleave(char *nodes);
Parameters
- nodes: nodes to interleave. ("a"/"a,b"/"a-b")
Return Value
- N/A
Result
- Sets the interleave policy of the calling thread. After the policy is set, memory of the requested size can be allocated from the designated nodes by using the mmap syscall.
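The three accepted forms of the nodes string ("a", "a,b", "a-b") can be expanded into explicit node IDs as sketched below in Python. This only illustrates the string format; `parse_nodes` is a hypothetical helper, and the sketch also accepts mixtures such as "0,2-3", which the source does not confirm for SMDK.

```python
def parse_nodes(nodes):
    """Expand a node string such as "1", "0,2" or "1-3" into a sorted list of node IDs."""
    ids = set()
    for part in nodes.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError("invalid node range: %r" % part)
            ids.update(range(first, last + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


if __name__ == "__main__":
    for text in ("1", "0,2", "1-3"):
        print(text, "->", parse_nodes(text))
```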
16. s_disable_node_interleave
void s_disable_node_interleave(void);
Parameters
- N/A
Return Value
- N/A
Result
- Unsets the interleave policy of the calling thread.
17. s_malloc_node
void* s_malloc_node(smdk_memtype_t type, size_t size, char *node);
Parameters
- type: target memory types to allocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- size: size to allocate.
- node: node to allocate memory.
Return value
- On success, returns the pointer to the beginning of the newly allocated memory. To avoid a memory leak, the returned pointer must be deallocated by calling s_free_node().
- On failure, returns a null pointer (e.g. invalid memory type or node, lack of available memory, etc.).
Result
- Allocates the requested size of memory from the designated node. Specify exactly one node, and make sure that the node matches the type parameter.
18. s_free_node
void s_free_node(smdk_memtype_t type, void* mem, size_t size);
Parameters
- type: target memory type to deallocate. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- mem: address of memory to deallocate.
- size: size of memory to deallocate.
Result
- Deallocates memory which is allocated by s_malloc_node(). The parameter type is necessary for metadata management. As directly mapped memory is not managed by SMDK allocator library, this function requires size parameter.
19. SmdkAllocator::process
void SmdkAllocator::process(Device device, PNMType type, Operation op,
const pnm::imdb::compressed_vector &column,
const pnm::imdb::Ranges &ranges,
pnm::imdb::BitVectors &results)
void SmdkAllocator::process(Device device, PNMType type, Operation op,
const pnm::imdb::compressed_vector &column,
const pnm::imdb::Ranges &ranges,
pnm::imdb::IndexVectors &results)
void SmdkAllocator::process(Device device, PNMType type, Operation op,
const pnm::imdb::compressed_vector &column,
const pnm::imdb::Predictors &predictors,
pnm::imdb::BitVectors &results)
void SmdkAllocator::process(Device device, PNMType type, Operation op,
const pnm::imdb::compressed_vector &column,
const pnm::imdb::Predictors &predictors,
pnm::imdb::IndexVectors &results)
template <typename T>
void SmdkAllocator::process(Device dev, PNMType type, Operation op,
const SlsTable<T> &table,
const SlsParam &op_param,
std::vector<T> &results)
Parameters
- device: device to process operation. (PNM)
- type: target application type of the PNM device. (IMDB / DLRM)
- op: operation to process. (ScanRange / ScanList / Sls)
- column, table: data on which the operation is performed.
- ranges, predictors, op_param: information about the operation.
- results: pointer to the results where the operation outputs are stored.
Return value
- N/A
Result
- Processes operation with specified device. Currently, SMDK PNM API supports PNM device.
Notice
- For IMDB Scan operation, (Range / List) scans with (Bit / Index) vector as outputs are supported.
- For DLRM SLS operation, data types of float and uint32_t are supported.
- Data structures used as parameters of DLRM SLS operation are described below.
template <typename T>
struct SlsTable
{
const std::vector<T> &tables;
uint32_t tables_count;
uint32_t rows_count;
uint32_t feature_size;
sls_user_preferences alloc_option = SLS_ALLOC_AUTO;
pnm::operations::SlsOperation::Type data_type() const;
};
struct SlsParam
{
uint32_t n_batch;
const std::vector<uint32_t> &lengths;
const std::vector<uint32_t> &indices;
};
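For readers unfamiliar with the operation, the following Python sketch shows what an SLS computation produces, assuming SLS refers to the usual SparseLengthsSum operation of DLRM. It is a plain reference model for a single table held as a list of rows, not the PNM library's code; `sparse_lengths_sum` is a hypothetical name, and the flat std::vector layout of SlsTable is not reproduced.

```python
def sparse_lengths_sum(table, indices, lengths):
    """Sum groups of table rows.

    table   : list of rows, each row a list of feature_size numbers
    indices : row indices to gather, concatenated over all batch entries
    lengths : number of indices consumed by each batch entry
    Returns one summed row per batch entry.
    """
    if sum(lengths) != len(indices):
        raise ValueError("sum(lengths) must equal len(indices)")
    feature_size = len(table[0]) if table else 0
    results = []
    pos = 0
    for n in lengths:
        acc = [0] * feature_size
        for idx in indices[pos:pos + n]:
            acc = [a + r for a, r in zip(acc, table[idx])]
        results.append(acc)
        pos += n
    return results


table = [[1, 2], [10, 20], [100, 200]]
out = sparse_lengths_sum(table, indices=[0, 2, 1], lengths=[2, 1])
print(out)  # [[101, 202], [10, 20]]
```

In this model, n_batch corresponds to len(lengths) and feature_size to the length of each row.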
To enable and use SMDK's optimization path in a C++ application, link the SMDK optimization library (libsmalloc.so or libsmalloc.a) when you build your application and include the smdk_opt_api.hpp header file in your application code. The header file defines the SmdkAllocator class, which is implemented as a singleton so that only one instance of the class exists at runtime. The example code below shows how to use it.
#include "smdk_opt_api.hpp"
......
int main(void) {
......
SmdkAllocator& allocator = SmdkAllocator::get_instance();
void *buf = allocator.malloc(type, size);
allocator.free(buf);
......
buf = allocator.malloc_node(type, size, node);
allocator.free_node(type, buf, size);
......
allocator.stats_print('G');
......
}
The member functions of class SmdkAllocator are as follows.
Function | Desc. | Reference API in C |
---|---|---|
get_instance | Returns a SMDK allocator instance. | N/A |
malloc | Allocates size bytes of uninitialized memory. | s_malloc |
calloc | Allocates memory for an array of num objects of size and initializes all bytes in the allocated memory to zero. | s_calloc |
realloc | Reallocates the given area of memory on designated memory type. Can get any target type of memory regardless of original location of given area. | s_realloc |
free | Deallocates the space previously allocated by malloc(), calloc(), posix_memalign() or realloc(). | s_free s_free_type |
posix_memalign | Allocates size bytes aligned on a boundary specified by alignment, and returns a pointer to the allocated memory in memptr. | s_posix_memalign |
get_memsize_total | Returns total size of requested type of memory. | s_get_memsize_total |
get_memsize_used | Returns size of memory allocated with SMDK APIs. | s_get_memsize_used |
get_memsize_available | Returns available size of requested type of memory. | s_get_memsize_available |
stats_print | Prints out above three data (total, used, and available) per type. | s_stats_print |
stats_node_print | Prints out above three data (total, used, and available) per node and per type. | s_stats_node_print |
enable_node_interleave | Sets interleave policy of calling thread with mmap() syscall. | s_enable_node_interleave |
disable_node_interleave | Unsets interleave policy of calling thread. | s_disable_node_interleave |
malloc_node | Allocates memory from specified node. | s_malloc_node |
free_node | Deallocates the space previously allocated by malloc_node(). | s_free_node |
Since most of the member functions are the same as SMDK optimization APIs in C, only the newly added or changed functions are described in detail below.
Please refer to the previous content (API List) for other functions.
1. get_instance
static SmdkAllocator& get_instance();
Parameters
- N/A
Return value
- Returns a SMDK allocator instance. The creation of the instance is limited to one.
2. free
void free(void *ptr);
void free(smdk_memtype_t type, void *ptr);
Parameters
- type: target memory type to free. (SMDK_MEM_NORMAL / SMDK_MEM_EXMEM)
- ptr: pointer to the memory to deallocate.
Return value
- N/A
SMDK includes py_smdk packages that can be imported in Python3 environments. You can build py_smdk package through the commands below.
$ cd /path/to/SMDK/lib
$ ./build_lib.sh py_smdk #_py_smdk.so is generated in /path/to/SMDK/lib/smdk_allocator/opt_api/py_smdk_pkg
The built package and its py_smalloc.py module provide an interface to SMDK optimization APIs. So you can access CXL memory through the optimization path of SMDK in Python3 application by importing py_smdk package in your Python3 application. Setting the following environment variables would be required to import the py_smdk package.
# 1. LD_LIBRARY_PATH
# You need to specify the path where the SMDK *optimization* library is located.
# If you copied the library to your system's library path, you do not need to set below.
export LD_LIBRARY_PATH=/path/to/SMDK/lib/smdk_allocator/lib
# 2. PYTHONPATH
# You also need to specify the py_smdk package's path so that your python3 application can recognize the package.
# If you copied the py_smdk package to a default path on your system (listed in sys.path), you do not need to set below.
export PYTHONPATH=/path/to/SMDK/lib/smdk_allocator/opt_api/py_smdk_pkg
Below is the way to enable SMDK optimization path in a Python3 application.
Creating class mem_obj or class mem_obj_node
This is a way to create and utilize an object that can store and load your data. (Defined in the module /path/to/py_smdk_pkg/py_smdk/py_smalloc.py.)
Please refer to the example below.
from py_smdk import py_smalloc
memtype = py_smalloc.SMDK_MEM_EXMEM
smdk_obj = py_smalloc.mem_obj(memtype, "hello SMDK")
print(smdk_obj.data) # or print(smdk_obj.get())
del smdk_obj # frees the buffer; or call smdk_obj.free() explicitly
In addition to the examples described above, you can get a CXL / NORMAL mem object by specifying its size.
smdk_obj = py_smalloc.mem_obj(memtype, size=4096)
smdk_obj.set("hello SMDK")
Also, you can get a CXL / NORMAL mem object from a specific memory node.
py_smalloc.mem_obj_node(memtype, node, "hello SMDK")
If you want to update the data stored in the assigned object, you can overwrite it with new data through the set method as shown below. The SMDK allocator re-allocates the internal memory buffer according to the size of the data.
smdk_obj.set("Scalable Memory Development Kit")
The assigned mem_obj can store a variety of Python data types. However, please note that Python-specific features that involve changes in the size of the data structures would be limited (e.g., list.append(data)).
Please refer to opt_api_python for more use cases.
The following is defined in the py_smalloc module of the py_smdk package.
1. Classes
1) py_smalloc.mem_obj
class py_smalloc.mem_obj(self, mem_type, data=None, size=None)
You can create an object with memory space allocated from the mem_type you specify. You can specify the data to write to mem_obj.data, or you can specify the free memory chunk size of mem_obj.data. If you specify both, the size of mem_obj.data is max(getsizeof(data), size). Below is the list of methods this class includes.
set(data, mem_type=None)
# Set the data to mem_obj.data.
# If the size of the data is greater than the previously stored data, mem_obj.data is realloc-ed.
# If the specified mem_type is different from self.mem_type, mem_obj.data is also realloc-ed.
get()
# Returns mem_obj.data. You can call this method, or you can access mem_obj.data directly.
resize(mem_type, size)
# The size (self.size) of mem_obj.data will be changed to the size you newly specified.
# The existing data is maintained, but if the size becomes smaller, the data may be damaged.
free()
# Free mem_obj.data to return the memory used to the system.
# This method is also called when you delete(del) this object.
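The sizing rule stated for the constructor, max(getsizeof(data), size), can be tried out without the package. The Python sketch below assumes that getsizeof refers to sys.getsizeof and that an omitted argument contributes zero; `planned_size` is a hypothetical helper, not part of py_smalloc.

```python
import sys


def planned_size(data=None, size=None):
    """Buffer size implied by max(getsizeof(data), size), treating omitted values as 0."""
    data_size = sys.getsizeof(data) if data is not None else 0
    return max(data_size, size or 0)


if __name__ == "__main__":
    print(planned_size("hello SMDK", 4096))  # the explicit size dominates
    print(planned_size("x" * 10000, 4096))   # the data size dominates
```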
2) py_smalloc.mem_obj_node
class py_smalloc.mem_obj_node(self, mem_type, node, data=None, size=None)
The difference between mem_obj_node and mem_obj is that mem_obj_node requires you to specify a memory node when you create an object. Its methods, listed below, behave almost the same as those of mem_obj.
set(data)
# Set the data to mem_obj_node.data.
# If the size of the data is greater than the previously stored data, mem_obj_node.data is realloc-ed.
get()
# Returns mem_obj_node.data. You can call this method, or you can access mem_obj_node.data directly.
resize(size)
# The size (self.size) of mem_obj_node.data will be changed to the size you newly specified.
# The existing data is maintained, but if the size becomes smaller, the data may be damaged.
free()
# Free mem_obj_node.data to return the memory used to the system.
# This method is also called when you delete(del) this object.
2. Functions
The functions below serve as interfaces to the optimization APIs of the same name.
See the optimization API descriptions above for instructions and usage for each function.
py_smalloc.s_stats_print(unit)
py_smalloc.s_stats_node_print(unit)
py_smalloc.get_memsize_total(smdk_memtype)
py_smalloc.get_memsize_used(smdk_memtype)
py_smalloc.get_memsize_available(smdk_memtype)
py_smalloc.get_memsize_node_total(smdk_memtype, node)
py_smalloc.get_memsize_node_available(smdk_memtype, node)
py_smalloc.enable_node_interleave(nodes)
py_smalloc.disable_node_interleave()
3. Constants
# Constants used for 'mem_obj' and 'mem_obj_node' classes that separates memory types in SMDK allocator.
py_smalloc.SMDK_MEM_NORMAL
py_smalloc.SMDK_MEM_EXMEM
CXL-CLI is an extension of Intel CXL-CLI that works with SMDK. SMDK extends this tool to provide additional commands (e.g. timestamp, poison, event, identify, fw-update, etc.) defined in CXL specification. It also provides an interface that allows you to group CXL memory devices easily and control SMDK kernel specific features.
From SMDK v1.3, commands for checking node-to-node memory latency and manipulating the CXL Swap function are also supported.
From SMDK v1.4, commands for manipulating CXL Cache function are also supported.
The newly added commands are described below.
Note1: Running CXL-CLI requires root privileges.
Note2: As of SMDK v2.1, the set-alert-config command is integrated into upstream ndctl (since ndctl version 79). Please refer to the ndctl documentation for more details.
Command | Option | Description |
---|---|---|
inject-poison | <mem0> -a <dpa> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -a, --address <dpa> DPA to inject or clear poison (hex value) -l, --length <dpa length> length in bytes from the DPA specified by '-a' to inject or clear poison (hex value) | Injects poison into a requested physical address. |
clear-poison | <mem0> -a <dpa> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -a, --address <dpa> DPA to inject or clear poison (hex value) -l, --length <dpa length> length in bytes from the DPA specified by '-a' to inject or clear poison (hex value) | Clears poison from the requested physical address. |
set-timestamp | <mem0> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Sets the timestamp on the device. It is recommended that the host set the timestamp after every Conventional or CXL Reset. Otherwise, the timestamp may be inaccurate. |
get-timestamp | <mem0> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Gets the timestamp from the device. Timestamp is initialized via the set-timestamp command. |
get-event-record | <mem0> -t <event_type> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -t, --type <n> type of event 1: info, 2: warning, 3: failure, 4: fatal | Retrieves the next event records that may exist in the device’s requested event log. |
clear-event-record | <mem0> -t <event_type> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -t, --type <n> type of event 1: info, 2: warning, 3: failure, 4: fatal -a, --all clear all event -n, --num_handle <n> event handle number to clear | Provides a mechanism for the host to clear events that it has consumed from the device’s Event Log. |
identify | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Retrieves basic information about the memory device(s), and displays the result. (e.g. FW revision, capacity, event log size, QoS telemetry capabilities, etc.) |
get-health-info | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Gets the current instantaneous health of the device(s) and displays the result. (e.g. health status, life used, device temperature, etc.) |
get-alert-config | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Retrieves the device's critical and programmable warning alert configuration. (e.g. valid alerts, alert thresholds, etc.) |
set-alert-config | - | Allows the host to configure programmable warning alert thresholds optionally. |
get-firmware-info | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Retrieves information about the device(s) FW. (e.g. FW slots info, slot#N FW revision, etc.) |
transfer-firmware | <mem0> -i <FW package> -s <slot number> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -i, --input <file> filename of FW package to transfer -s, --slot <n> slot number to transfer FW package | Transfers all or part of a FW package from the caller to the device. FW packages shall be 128-byte aligned. |
activate-firmware | <mem0> -s <slot number> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -s, --slot <n> slot number to activate FW package --online enable online activation | Makes a FW previously stored on the device (slot) with the transfer FW command, the active FW. |
get-shutdown-state | <mem0> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Gets current Shutdown State (dirty or clean). |
set-shutdown-state | <mem0> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs --clean set shutdown state to clean (default: dirty) | Sets current Shutdown State to dirty or clean. |
get-scan-media-caps | <mem0> [<mem1>..<memN>] -a <dpa> -l <length> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -a, --address <dpa> starting DPA where to retrieve scan media capabilities -l, --length <length> range of physical addresses, in units of 64B | Retrieves capabilities and options for the scan-media feature based on the requested range. |
scan-media | <mem0> [<mem1>..<memN>] -a <dpa> -l <length> [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -a, --address <dpa> starting DPA where to start the scan -l, --length <length> range of physical addresses, in units of 64B | Initiates a scan of a portion of CXL devices' media for locations that are poisoned or result in poison by host access. |
get-scan-media | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Retrieves an unordered list of poisoned memory locations, in response to the scan-media command. |
sanitize-memdev | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -e, --secure-erase secure erase a memdev -s, --sanitize sanitize a memdev | Sanitizes the device to securely re-purpose or decommission it. |
get-sld-qos-control | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Retrieves the SLD’s QoS control parameters. |
set-sld-qos-control | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs -e, --egress_port_congestion enable egress port congestion -d, --throughput_reduction enable temporary throughput reduction -m, --egress_moderate_percent <n> Threshold in % to indicate 'moderate' -s, --egress_severe_percent <n> Threshold in % to indicate 'severe' -i, --backpressure_sample_interval <n> Interval in ns to take sample(1-15) | Sets the SLD’s QoS control parameters. |
get-sld-qos-status | <mem0> [<mem1>..<memN>] [<options>] -v, --verbose turn on debug -S, --serial use serial numbers to id memdevs | Retrieves the SLD’s QoS status, i.e. Backpressure Average Percentage. |
# ./cxl inject-poison mem0 -a 0x1000
cxl memdev: cmd_inject_poison: inject-poison 1 mem
# ./cxl clear-poison mem0 -a 0x1000
cxl memdev: cmd_clear_poison: clear-poison 1 mem
# ./cxl get-timestamp mem1
1/1/1970 09:00:00
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
# ./cxl set-timestamp mem1
cxl memdev: cmd_set_timestamp: set-timestamp 1 mem
# ./cxl get-timestamp mem1
7/24/2023 16:33:21
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
# ./cxl get-event-record mem0 -t 3
Received 2 event records from device
No. 1
UUID : 601dcbb3-9c064eab-b8af4e9b-fb5c9624 (DRAM Event)
Physical address : 0x1000
Memory Event Desc : Unknown
Memory Event Type : Data Path Error
Transaction Type : Host Read
Event Record Flags : Failure Event
Event Timestamp : 7/26/2023 13:42:35
Handle : 1
No. 2
UUID : 601dcbb3-9c064eab-b8af4e9b-fb5c9624 (DRAM Event)
Physical address : 0x1000
Memory Event Desc : Unknown
Memory Event Type : Data Path Error
Transaction Type : Host Read
Event Record Flags : Failure Event
Event Timestamp : 7/26/2023 13:42:47
Handle : 2
Overflow Error Count : 0
cxl memdev: cmd_get_event_record: get-event-record 1 mem
# ./cxl clear-event-record mem0 -t 3 -n 2
cxl memdev: cmd_clear_event_record: clear_event_record 1 mem
# ./cxl get-event-record mem0 -t 3
Received 1 event records from device
No. 1
UUID : 601dcbb3-9c064eab-b8af4e9b-fb5c9624 (DRAM Event)
Physical address : 0x1000
Memory Event Desc : Unknown
Memory Event Type : Data Path Error
Transaction Type : Host Read
Event Record Flags : Failure Event
Event Timestamp : 7/26/2023 13:42:35
Handle : 1
Overflow Error Count : 0
cxl memdev: cmd_get_event_record: get-event-record 1 mem
# ./cxl clear-event-record mem0 -t 3 -a
cxl memdev: cmd_clear_event_record: clear_event_record 1 mem
# ./cxl get-event-record mem0 -t 3
Received 0 event records from device
Overflow Error Count : 0
cxl memdev: cmd_get_event_record: get_event_record 1 mem
# ./cxl identify mem0
CXL Identify Memory Device "mem0"
FW Revision : fw_1234
Total Capacity : 128.00 GB
Volatile Only Capacity : 128.00 GB
Persistent Only Capacity : 0 B
Partition Alignment : 0 B
......
cxl memdev: cmd_identify: identified 1 mem
# ./cxl get-health-info mem0
CXL Get Health Information Memory Device "mem0"
Health Status : Normal
Media Status : Normal
Life Used : 4 % (Normal)
Device Temperature : 32 C (Normal)
Corrected Volatile Error Count : 0 (Normal)
Corrected Persistent Error Count : 0 (Normal)
Dirty Shutdown Count : 0
cxl memdev: cmd_get_health_info: get-health-info 1 mem
# ./cxl get-alert-config mem0
CXL Get Alert Configuration Memory Device "mem0"
Life Used Threshold - Critical : 75 %
- Warning : Not Set
Device Over-Temperature Threshold - Critical : 100 C
- Warning : 80 C
Device Under-Temperature Threshold - Critical : -30 C
- Warning : Not Supported
Corrected Volatile Memory Error Threshold - Warning : Not Supported
Corrected Persistent Memory Error Threshold - Warning : Not Supported
cxl memdev: cmd_get_alert_config: get-alert-config 1 mem
# ./cxl set-alert-config mem0 --life-used-threshold=50 --life-used-alert=on
{
"memdev":"mem0",
"ram_size":"128.00 GiB (137.44 GB)",
"alert_config":{
......
"life_used_prog_warn_threshold":50,
......
}
cxl memdev: cmd_set_alert_config: set alert configuration for 1 mem
# ./cxl get-alert-config mem0
CXL Get Alert Configuration Memory Device "mem0"
Life Used Threshold - Critical : 75 %
- Warning : 50 %
Device Over-Temperature Threshold - Critical : 100 C
- Warning : 80 C
Device Under-Temperature Threshold - Critical : -30 C
- Warning : Not Supported
Corrected Volatile Memory Error Threshold - Warning : Not Supported
Corrected Persistent Memory Error Threshold - Warning : Not Supported
cxl memdev: cmd_get_alert_config: get-alert-config 1 mem
# ./cxl get-firmware-info mem0
Supported FW Slots : 2
Slot 1 FW revision : fw_1234 (Active)
Slot 2 FW revision :
Online Activation Capability : Supported
cxl memdev: cmd_get_firmware_info: get-firmware-info 1 mem
# ./cxl transfer-firmware mem0 -i fw_5678.bin -s 2
cxl memdev: cmd_transfer_firmware: transfer-firmware 1 mem
# ./cxl activate-firmware mem0 -s 2 --online
cxl memdev: cmd_activate_firmware: activate-firmware 1 mem
# ./cxl get-firmware-info mem0
Supported FW Slots : 2
Slot 1 FW revision : fw_1234
Slot 2 FW revision : fw_5678 (Active)
Online Activation Capability : Supported
cxl memdev: cmd_get_firmware_info: get-firmware-info 1 mem
# ./cxl get-health-info mem0
......
Dirty Shutdown Count : 0
cxl memdev: cmd_get_health_info: get-health-info 1 mem
# ./cxl set-shutdown-state mem0
cxl memdev: cmd_set_shutdown_state: set-shutdown-state 1 mem
# ./cxl get-shutdown-state mem0
Shutdown State: Dirty
cxl memdev: cmd_get_shutdown_state: get-shutdown-state 1 mem
# ./cxl get-health-info mem0
......
Dirty Shutdown Count : 1
cxl memdev: cmd_get_health_info: get-health-info 1 mem
# ./cxl get-scan-media-caps mem0 -a 0x100 -l 0x1000
Estimated Scan Media Time(ms): 256
cxl memdev: cmd_get_scan_media_caps: get-scan-media-caps 1 mem
# ./cxl scan-media mem0 -a 0x100 -l 0x1000
cxl memdev: cmd_scan_media: scan-media 1 mem
# ./cxl get-scan-media mem0
No poison address
cxl memdev: cmd_get_scan_media: get-scan-media 1 mem
# ./cxl set-alert-config mem0 --over-temperature-threshold=40 --over-temperature-alert=on
{
"memdev":"mem0",
"ram_size":"128.00 GiB (137.44 GB)",
"alert_config":{
......
"dev_over_temperature_prog_warn_threshold":40,
......
}
cxl memdev: cmd_set_alert_config: set alert configuration for 1 mem
# ./cxl set-timestamp mem0
7/24/2023 17:17:21
cxl memdev: cmd_set_timestamp: set-timestamp 1 mem
# ./cxl get-alert-config mem0
......
Device Over-Temperature Threshold - Critical : 100 C
- Warning : 40 C
# ./cxl get-timestamp mem0
7/24/2023 17:17:25
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
# ./cxl sanitize-memdev mem0
cxl memdev: cmd_sanitize_memdev: sanitation started on 1 mem device
# ./cxl get-alert-config mem0
......
Device Over-Temperature Threshold - Critical : 100 C
- Warning : 85 C
# ./cxl get-timestamp mem0
1/1/1970 09:00:00
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
# ./cxl get-sld-qos-control mem0
Egress Port Congestion: Disable
Temporary Throughput Reduction: Disable
Egress Moderate Percentage: 10%
Egress Severe Percentage: 25%
Backpressure Sample Interval: 8
cxl memdev: cmd_get_sld_qos_control: get-sld-qos-control 1 mem
# ./cxl set-sld-qos-control mem0 -e
cxl memdev: cmd_set_sld_qos_control: set-sld-qos-control 1 mem
# ./cxl set-sld-qos-control mem0 -m 50 -s 75
cxl memdev: cmd_set_sld_qos_control: set-sld-qos-control 1 mem
# ./cxl get-sld-qos-control mem0
Egress Port Congestion: Enable
Temporary Throughput Reduction: Disable
Egress Moderate Percentage: 50%
Egress Severe Percentage: 75%
Backpressure Sample Interval: 8
cxl memdev: cmd_get_sld_qos_control: get-sld-qos-control 1 mem
# ./cxl get-sld-qos-status mem0
Backpressure Average Percent: 0%
By adding the -V (--soft_interleaving) option to the create-region, destroy-region, and list commands, you can group CXL devices and list the device information supported by SMDK.
Note: From SMDK v2.0 onward, the additional grouping commands (e.g. group-list, group-node) are not supported.
Command | Option | Description |
---|---|---|
create-region -V(--soft_interleaving) | -G(--group) <'node' or 'noop'> | Logically represents the CXL device(s) as a Node or Noop partition. |
create-region -V(--soft_interleaving) | -N(--target_node) <node_id> -w(--ways) <num_dev> <cxl0> [<cxl1>..<cxlN>] | Groups the CXL device(s) into the specified target node. |
destroy-region -V(--soft_interleaving) | -N(--target_node) <node_id> | Removes all CXL devices from the specified node group. |
destroy-region -V(--soft_interleaving) | -w(--ways) <num_dev> <cxl0> [<cxl1>..<cxlN>] | Removes the CXL device(s) from the node group to which they belong. |
list -V(--soft_interleaving) | -n(--list_node) [node_id] | Displays the CXL device configuration status for the node(s) of the system. If no node id is specified, all nodes of the system are shown; otherwise only the specified node is shown. |
list -V(--soft_interleaving) | -C(--list_dev) [cxlN] | Displays CXL device information and grouping status. If no device name is specified, information for all CXL devices in the system is shown; otherwise only the specified device is shown. |
/* node 0 : CPU #1 + DDR Memory #1
node 1 : CXL #1, #2, #3 */
# ./cxl create-region -V -G node
# ./cxl list -V --list_node
[
{
"node_id" : -1,
"devices" : [ ]
}
{
"node_id" : 0,
"devices" : [ ]
}
{
"node_id" : 1,
"devices" : [ "cxl0" "cxl1" "cxl2" ]
}
]
# cat /proc/buddyinfo
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 7 5 5 5 7 6 3 4 5 3 437
Node 0, zone Normal 2521 2397 1219 4419 2046 912 407 192 106 58 13430
Node 1, zone Movable 0 0 0 0 0 0 0 0 0 0 98304
/* node 0 : CPU #1 + DDR Memory #1
node 1 : CXL #1
node 2 : CXL #2
node 3 : CXL #3 */
# ./cxl create-region -V -G noop
# ./cxl list -V --list_node
[
{
"node_id" : -1,
"devices" : [ ]
}
{
"node_id" : 0,
"devices" : [ ]
}
{
"node_id" : 1,
"devices" : [ "cxl0" ]
}
{
"node_id" : 2,
"devices" : [ "cxl1" ]
}
{
"node_id" : 3,
"devices" : [ "cxl2" ]
}
]
# cat /proc/buddyinfo
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 7 5 5 5 7 6 3 4 5 3 437
Node 0, zone Normal 2907 1686 828 4416 2038 913 407 192 106 58 13430
Node 1, zone Movable 0 0 0 0 0 0 0 0 0 0 32768
Node 2, zone Movable 0 0 0 0 0 0 0 0 0 0 32768
Node 3, zone Movable 0 0 0 0 0 0 0 0 0 0 32768
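Each column of /proc/buddyinfo counts the free blocks of one order, so the outputs above can be converted into free capacity per node. The following is a minimal sketch, assuming 4 KiB base pages and the usual buddyinfo layout (one column per order, starting at order 0). The function name is illustrative and is not part of SMDK.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages (typical on x86_64)


def free_bytes_per_zone(buddyinfo_text, page_size=PAGE_SIZE):
    """Return {(node_id, zone_name): free_bytes} from /proc/buddyinfo text."""
    result = {}
    for line in buddyinfo_text.strip().splitlines():
        fields = line.replace(",", " ").split()
        # Expected layout: Node <id> zone <name> <count order 0> ... <count order N>
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node_id, zone = int(fields[1]), fields[3]
        counts = [int(c) for c in fields[4:]]
        # A free block of order k spans 2**k contiguous pages.
        free_pages = sum(count << order for order, count in enumerate(counts))
        result[(node_id, zone)] = free_pages * page_size
    return result


# Movable-zone lines copied from the noop example above.
SAMPLE = """\
Node 1, zone Movable 0 0 0 0 0 0 0 0 0 0 32768
Node 2, zone Movable 0 0 0 0 0 0 0 0 0 0 32768
Node 3, zone Movable 0 0 0 0 0 0 0 0 0 0 32768
"""

for (node, zone), nbytes in sorted(free_bytes_per_zone(SAMPLE).items()):
    print(f"node {node} {zone}: {nbytes / 2**30:.0f} GiB free")
```

Under these assumptions each node of the noop example holds 128 GiB of free memory, and the single CXL node of the `-G node` example (98304 blocks of the highest order) holds 384 GiB.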
/* list -V --list_node */
# ./cxl list -V --list_node
[
{
"node_id" : -1,
"devices" : [ ]
}
{
"node_id" : 0,
"devices" : [ ]
}
{
"node_id" : 1,
"devices" : [ "cxl0" ]
}
{
"node_id" : 2,
"devices" : [ "cxl1" ]
}
{
"node_id" : 3,
"devices" : [ "cxl2" ]
}
]
# ./cxl list -V --list_node 2
[
{
"node_id" : 2,
"devices" : [ "cxl1" ]
}
]
/* list -V --list_dev */
# ./cxl list -V --list_dev
[
{
"id":"cxl0",
"start_address":"0x2080000000",
"size":"0x2000000000",
"node_id":"1",
"socket_id":"0",
"state":"online",
"memdev":
{
"memdev_id":"0",
"pci_bus_addr":"0000:16:00.0",
"pci_cur_link_speed":"32.0 GT/s PCIe",
"pci_cur_link_width":"8",
}
}
{
"id":"cxl1",
"start_address":"0x4080000000",
"size":"0x2000000000",
"node_id":"2",
"socket_id":"0",
"state":"online",
"memdev":
{
"memdev_id":"1",
"pci_bus_addr":"0000:27:00.0",
"pci_cur_link_speed":"32.0 GT/s PCIe",
"pci_cur_link_width":"8",
}
}
{
"id":"cxl2",
"start_address":"0x6080000000",
"size":"0x2000000000",
"node_id":"3",
"socket_id":"0",
"state":"online",
"memdev":
{
"memdev_id":"2",
"pci_bus_addr":"0000:38:00.0",
"pci_cur_link_speed":"32.0 GT/s PCIe",
"pci_cur_link_width":"8",
}
}
]
# ./cxl list -V --list_dev cxl1
[
{
"id":"cxl1",
"start_address":"0x4080000000",
"size":"0x2000000000",
"node_id":"2",
"socket_id":"0",
"state":"online",
"memdev":
{
"memdev_id":"1",
"pci_bus_addr":"0000:27:00.0",
"pci_cur_link_speed":"32.0 GT/s PCIe",
"pci_cur_link_width":"8",
}
}
]
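The start_address and size fields reported by `list -V --list_dev` are hexadecimal byte values. As a small worked example, the sketch below converts the values from the output above into address ranges and sizes in GiB. The helper name is hypothetical, and the values are copied by hand instead of being parsed from the tool output.

```python
# start_address and size strings copied from the `list -V --list_dev` output above
devices = {
    "cxl0": ("0x2080000000", "0x2000000000"),
    "cxl1": ("0x4080000000", "0x2000000000"),
    "cxl2": ("0x6080000000", "0x2000000000"),
}


def address_range(start_hex, size_hex):
    """Return (first_byte, last_byte, size_in_gib) for one device."""
    start, size = int(start_hex, 16), int(size_hex, 16)
    return start, start + size - 1, size / 2**30


for name, (start_hex, size_hex) in devices.items():
    first, last, gib = address_range(start_hex, size_hex)
    print(f"{name}: {first:#x}-{last:#x} ({gib:.0f} GiB)")
```

In this example each device covers 128 GiB, and each range ends exactly where the next device's range begins.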
# ./cxl list -V --list_node 1
[
{
"node_id" : 1,
"devices" : [ "cxl0" ]
}
]
# ./cxl create-region -V -N 1 -w 1 cxl1
# ./cxl list -V --list_node 1
[
{
"node_id" : 1,
"devices" : [ "cxl0" "cxl1" ]
}
]
/* destroy-region -V (target_node) */
# ./cxl list -V --list_node 1
[
{
"node_id" : 1,
"devices" : [ "cxl0" "cxl1" ]
}
]
# ./cxl destroy-region -V -N 1
# ./cxl list -V --list_node 1
[
{
"node_id" : 1,
"devices" : [ ]
}
]
/* destroy-region -V (remove dev) */
# ./cxl list -V --list_node 1
[
{
"node_id" : 1,
"devices" : [ "cxl0" "cxl1" "cxl2" ]
}
]
# ./cxl destroy-region -V -w 1 cxl1
# ./cxl list -V --list_node 1
[
{
"node_id" : 1,
"devices" : [ "cxl0" "cxl2" ]
}
]
Command | Option | Description |
---|---|---|
enable-cxlswap | N/A | Enables SMDK CXL Swap function at runtime. |
disable-cxlswap | N/A | Disables SMDK CXL Swap function at runtime. |
check-cxlswap | N/A | Provides information about whether the SMDK CXL Swap is enabled and size of the swap space currently in use. |
flush-cxlswap | N/A | Flushes all swapped out pages in CXL pool. Note: CXL Swap should be disabled before running this command. |
# ./cxl enable-cxlswap
Success: CXLSwap is enabled.
# ./cxl check-cxlswap
CXLSwap: enabled
CXLSwap Used : 428 kB
CXLSwap Pages : 131
# ./cxl disable-cxlswap
Success: CXLSwap is disabled.
# ./cxl check-cxlswap
CXLSwap: disabled
/* Grouping cmds are not available when cxl (swap) pool is in use. *flush-cxlswap* command is helpful in this situation. */
# ./cxl disable-cxlswap && ./cxl flush-cxlswap
Success: CXLSwap is disabled.
Success: CXLSwap is flushed.
# ./cxl create-region -V -G node
cxl region: cmd_create_region: created 2 regions
(done)
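The disable, flush, and regroup sequence above can also be scripted. The following is a minimal sketch that only reuses the commands shown in the transcript. It assumes the cxl binary is at ./cxl and that it exits with a non-zero status on failure, which is not confirmed here. The function names are illustrative.

```python
import subprocess

CXL = "./cxl"  # assumed path to the SMDK-built cxl binary, as in the transcripts above


def run_cli(args):
    """Run one CXL-CLI command and raise CalledProcessError if it fails."""
    subprocess.run([CXL, *args], check=True)


def regroup_as_node(run=run_cli):
    """Disable and flush CXL Swap, then group the CXL devices into one node."""
    steps = [
        ["disable-cxlswap"],                    # CXL Swap must be disabled first
        ["flush-cxlswap"],                      # flush is only valid once swap is disabled
        ["create-region", "-V", "-G", "node"],  # grouping needs an unused pool
    ]
    for args in steps:
        run(args)
    return steps


# Dry run: print the commands instead of executing them.
regroup_as_node(run=lambda args: print(CXL, *args))
```

Passing a different `run` callable keeps the ordering logic separate from command execution, so the sequence can be checked on a machine without CXL hardware.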
Command | Option | Description |
---|---|---|
enable-cxlcache | N/A | Enables SMDK CXL Cache function at runtime. |
disable-cxlcache | N/A | Disables SMDK CXL Cache function at runtime. |
check-cxlcache | N/A | Provides information about whether the SMDK CXL Cache is enabled and size of CXL Cache space currently in use. |
flush-cxlcache | N/A | Flushes all cached pages out from CXL cache pool. Note: CXL Cache should be disabled before running this command. |
# ./cxl enable-cxlcache
Success: CXLCache is enabled.
# ./cxl check-cxlcache
CXLCache: enabled
CXLCache Used : 0 kB
CXLCache Pages : 0
# ./cxl disable-cxlcache
Success: CXLCache is disabled.
# ./cxl check-cxlcache
CXLCache: disabled
# ./cxl disable-cxlcache && ./cxl flush-cxlcache
Success: CXLCache is disabled.
Success: CXLCache is flushed.
Command | Option | Description |
---|---|---|
get-latency-matrix | [<options>] --size <MB>: size (range) of the test buffer in MiB (default: 20000 MiB); --stride <B>: stride length in bytes (default: 64 B), which cannot be larger than the size; --random: measure latencies with random access (default: sequential access); --no-change-prefetcher: do not change the HW prefetcher before starting the test (default: turn off the HW prefetcher before the test); --iteration <n>: iterate n times (default: 1) | Measures and reports the latency between memory initiator node(s) and target node(s). |
# ./cxl get-latency-matrix
Numa node (unit: ns)
Numa node 0 1 2
0 ......
1 ......
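To illustrate how the get-latency-matrix options combine, here is a minimal sketch that builds a command line from the options and defaults listed in the table above and enforces the rule that the stride cannot exceed the buffer size. The helper and its parameter names are hypothetical, and it assumes each option value is passed as a separate argument.

```python
def latency_matrix_args(size_mib=20000, stride_bytes=64, random_access=False,
                        change_prefetcher=True, iterations=1):
    """Build an argv list for `cxl get-latency-matrix` from the documented options."""
    if stride_bytes > size_mib * 1024 * 1024:
        raise ValueError("stride cannot be larger than the buffer size")
    args = ["./cxl", "get-latency-matrix",
            "--size", str(size_mib),
            "--stride", str(stride_bytes),
            "--iteration", str(iterations)]
    if random_access:
        args.append("--random")                 # default is sequential access
    if not change_prefetcher:
        args.append("--no-change-prefetcher")   # default turns the HW prefetcher off
    return args


print(" ".join(latency_matrix_args(size_mib=1024, random_access=True)))
```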
The tests below are to check whether the standard heap allocation APIs such as malloc, calloc, realloc, posix_memalign, etc., are properly working in SMDK compatible path.
1. run_heap_test.sh
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_c
$ ./run_heap_test.sh <-e | -n>
(Example) $ ./run_heap_test.sh -e
Options
Options | Desc. |
---|---|
-e | Gives CXL memory a priority. |
-n | Gives DRAM a priority. |
Result
......
prio = [normal->exmem]
exmem_size = 1000 normal_size = 1000
maxmemory_policy = interleave
malloc: 0x7f8494b00900
free: 0x7f8494b00900
calloc: 0x7f8494b00940
free: 0x7f8494b00940
malloc: 0x7f8494b00ec0
realloc: 0x7f8494100880
free: 0x7f8494100880
posix_memalign: 0x7f8495009010
free: 0x7f8495009010
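The same sequence of heap calls can be reproduced from Python through ctypes, which is a convenient way to try a preloaded allocator without compiling anything. This is a sketch and not the code of run_heap_test.sh. It assumes a Linux system, where `ctypes.CDLL(None)` looks symbols up in the process's global scope, so a library loaded through LD_PRELOAD should be found before libc.

```python
import ctypes

libc = ctypes.CDLL(None)  # global symbol scope of this process

libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.calloc.restype = ctypes.c_void_p
libc.calloc.argtypes = [ctypes.c_size_t, ctypes.c_size_t]
libc.realloc.restype = ctypes.c_void_p
libc.realloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
libc.posix_memalign.restype = ctypes.c_int
libc.posix_memalign.argtypes = [ctypes.POINTER(ctypes.c_void_p),
                                ctypes.c_size_t, ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def _require(ptr, name):
    """Raise if an allocation call returned NULL (ctypes maps NULL to None)."""
    if not ptr:
        raise MemoryError(f"{name} returned NULL")
    return ptr


def exercise_heap_apis(size=1024, alignment=64):
    """Call each heap API once, free the result, and return the addresses seen."""
    seen = {}

    seen["malloc"] = _require(libc.malloc(size), "malloc")
    libc.free(seen["malloc"])

    seen["calloc"] = _require(libc.calloc(1, size), "calloc")
    libc.free(seen["calloc"])

    p = _require(libc.malloc(size), "malloc")
    seen["realloc"] = _require(libc.realloc(p, size * 2), "realloc")
    libc.free(seen["realloc"])

    aligned = ctypes.c_void_p()
    rc = libc.posix_memalign(ctypes.byref(aligned), alignment, size)
    if rc != 0:
        raise MemoryError(f"posix_memalign failed with error {rc}")
    seen["posix_memalign"] = aligned.value
    libc.free(aligned)

    return seen


for name, addr in exercise_heap_apis().items():
    print(f"{name}: {addr:#x}")
```

Running the script with and without LD_PRELOAD and CXLMALLOC_CONF set gives a quick comparison of which allocator serves the calls.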
2. run_multi_thread.sh
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_c
$ ./run_multi_thread.sh [options...]
(Example) $ ./run_multi_thread.sh size 1024 iter 1000 nthreads 4
Options
Options | Desc. | Default |
---|---|---|
size <byte> | Memory allocation size (bytes) per request. | 1024 |
iter <n> | Number of times memory allocation requests are repeated. | 1048576 |
nthreads <n> | Number of threads to run specified tests. | 10 |
Result
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=64
g_arena_pool[0].type_mem=normal
......
[TEST START] smdk compatible API multi-thread malloc test
[TEST PARAMETERS] size=4096 iter=1000 nthreads=4
thread 1 start
thread 2 start
thread 3 start
thread 4 start
thread 4 done
thread 3 done
thread 2 done
thread 1 done
To verify that the SMDK allocator can be configured properly, several test cases with different memory allocation configurations are provided. They cover whether CXL memory is used, memory capacities (sizes), priorities, auto arena scaling, the allocation policy applied once all memory of the specified size has been allocated, and the interleaving and binding policy across CXL.mem devices (exmem_partition_range).
Note: exmem_partition_range lets you obtain higher bandwidth from multiple CXL.mem devices, or isolate CXL resources between the tenants of your system. Please refer to the test_config_cxl.sh script and the Result section below for example settings of exmem_partition_range.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_conf
$ ./test_config_cxl.sh
Result
### Result example for a system in which three CXL.mem devices are mounted on nodes 1, 2, and 3, respectively.
run test - t1
use_exmem:true
--------------------------------
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
run test - t15 # exmem_partition_range:"all"
use_exmem:true,priority:exmem,exmem_partition_range:all
--------------------------------
......
run test - t16 # exmem_partition_range:"0,1,2"
use_exmem:true,priority:exmem,exmem_partition_range:0,1,2
--------------------------------
[Warning] node 0 is not ExMem node
......
run test - t17 # exmem_partition_range:"1-3"
use_exmem:true,priority:exmem,exmem_partition_range:1-3
--------------------------------
......
cf. # exmem_partition_range:"0,2-4"
use_exmem:true,priority:exmem,exmem_partition_range:0,2-4
--------------------------------
libnuma: Warning: node argument 4 out of range
[Warning] Invalid value for "exmem_partition_range"=0,2-4. This option will be ignored.
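The exmem_partition_range values in the log above ("all", "0,1,2", "1-3", "0,2-4") follow a comma-and-dash node list syntax. The sketch below expands such a string and separates the nodes that are ExMem nodes from those that are not. It is not SMDK's parser and only classifies nodes. As the log shows, SMDK itself warns about node 0 and ignores the whole option when a node is out of range. The function names and the set of ExMem nodes are assumptions for illustration.

```python
def parse_node_range(spec):
    """Expand a node list such as "0,2-4" into [0, 2, 3, 4]."""
    nodes = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            nodes.extend(range(first, last + 1))
        else:
            nodes.append(int(part))
    return nodes


def check_partition_range(spec, exmem_nodes):
    """Split the requested nodes into (usable, rejected) for a set of ExMem nodes."""
    if spec == "all":
        return sorted(exmem_nodes), []
    requested = parse_node_range(spec)
    usable = [n for n in requested if n in exmem_nodes]
    rejected = [n for n in requested if n not in exmem_nodes]
    return usable, rejected


EXMEM_NODES = {1, 2, 3}  # assumed: the three-device layout of the result above
for spec in ("all", "0,1,2", "1-3", "0,2-4"):
    usable, rejected = check_partition_range(spec, EXMEM_NODES)
    print(f"exmem_partition_range:{spec} -> usable={usable} rejected={rejected}")
```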
This test case checks whether the C++ new and delete operators work properly in the SMDK compatible path.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_cpp
$ ./run_test.sh <-e | -n>
(Example) $ ./run_test.sh -e
Options
Options | Desc. | Default |
---|---|---|
-e | Gives CXL memory a priority. | -n |
-n | Gives DRAM a priority. | -n |
Result
......
prio = [exmem->normal]
normal_size = 2048 MB
exmem_size = 2048 MB
maxmemory_policy = interleave
exmem_partition_range =
use_auto_arena_scaling = 1
test start
test start
ptr: 0x7f6b25c09010 value: 3
heap size: 4
ptr: 0x7f6b25c1f000 value: 0
ptr: 0x7f6b25c1f004 value: 1
ptr: 0x7f6b25c1f008 value: 2
ptr: 0x7f6b25c1f00c value: 3
ptr: 0x7f6b25c1f010 value: 4
ptr: 0x7f6b25c09010 value: 10
ptr: 0x7f6b25c20000
test done
SMDK provides a Java binding for the compatible path, and the test script (run_java_test.sh) in this directory verifies the feature.
The script takes two options, one from each option group. Please refer to the options table below.
Note: JUnit is used in this test, and the absolute path of the junit4.jar (junit.jar) archive is specified in the test script. Since the path may vary between systems, you may need to modify it to run this test properly.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_java
$ ./run_java_test.sh <-e | -n> <-a | -j>
(Example) $ ./run_java_test.sh -e -a
Options
Option1 | Option2 | Desc. | Default |
---|---|---|---|
-e | | Gives CXL memory a priority. | -n |
-n | | Gives DRAM a priority. | -n |
| | -a | Runs Java application (javaTest/javaHeapTest.java) with SMDK allocator library. | none |
| | -j | Runs JNI application (jniTest/javaJNITest.java) with SMDK allocator library. | none |
Result
[Case1: ./run_java_test.sh -e -a]
use_exmem:true,exmem_size:65536,normal_size:65536,maxmemory_policy:remain,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
test4GBytes: benchmark is requesting GC (record used memory)...
test4GBytes: used=4318078464, loopCount=0, total=5368709120
……
test100MBytes: benchmark is requesting GC (record used memory)...
test100MBytes: used=132163152, loopCount=0, total=447741952
......
[Case2: ./run_java_test.sh -e -j]
use_exmem:true,exmem_size:65536,normal_size:65536,maxmemory_policy:remain,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
malloc(0): pid=6713 0x7f22e84c8680
malloc(1): pid=6713 0x7f22e88c9940
……
free(0): pid=6713 0x7f22e84c8680
free(1): pid=6713 0x7f22e88c9940
……
MAP_EXMEM
addr[0x7f22e02e6000], one=49 zero=48
addr[0x7f22e02e6000]
munmap success
……
SMDK provides a Python binding for the compatible path. The test script (run_heapmon.sh) in this directory verifies the feature.
The script takes two options, one from each option group. Please refer to the options table below.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/comp_api_python
$ ./run_heapmon.sh <-e | -n> <-l | -a>
(Example) $ ./run_heapmon.sh -e -a
Options
Option1 | Option2 | Desc. | Default |
---|---|---|---|
-e | | Gives CXL memory a priority. | -n |
-n | | Gives DRAM a priority. | -n |
| | -l | Runs Python application (heapmon.py) with standard libc library, not SMDK allocator. Option1 (-e or -n) will be ignored. | none |
| | -a | Runs Python application (heapmon.py) with SMDK allocator library. | none |
Result
use_exmem:true,exmem_size:16384,normal_size:16384,maxmemory_policy:remain,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
Before allocation: Total 1.58 GB, not changed
After allocation: Total 18.21 GB, 16.62 GB increased
After 1st deallocation: Total 18.21 GB, not changed
After 2nd deallocation: Total 1.59 GB, 16.62 GB decreased
After 3rd deallocation: Total 1.59 GB, not changed
After gc: Total 1.58 GB, 256.00 KB decreased
......
The SMDK compatible API library transparently supports the mmap system call interface as well as the heap management APIs (malloc, calloc, etc.).
Like run_heap_test.sh above, the run_mmap_test.sh script in this directory preloads the SMDK allocator library and sets the required configuration for running test_syscall, which calls mmap 100 times with a length of 4MB each.
Command lines
$ cd /path/to/SMDK/src/test/syscall
$ ./run_mmap_test.sh <-e | -n>
(Example) $ ./run_mmap_test.sh -e
Options
Options | Desc. |
---|---|
-e | Gives CXL memory a priority. |
-n | Gives DRAM a priority. |
Result
cxlmalloc - test_syscall
use_exmem:true,exmem_size:4096,normal_size:4096,maxmemory_policy:interleave,priority:exmem
*** use_adaptive_interleaving is disabled
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
addr[0x7f3211e00000], one='1' zero='0'
addr[0x7f3211a00000], one='1' zero='0'
addr[0x7f3211600000], one='1' zero='0'
addr[0x7f3211200000], one='1' zero='0'
addr[0x7f3210e00000], one='1' zero='0'
......
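For comparison, the pattern described above (100 mappings of 4MB each, written and read back) can be sketched with Python's mmap module. This is not the source of test_syscall, and the byte pattern it checks is an assumption. Whether the SMDK library intercepts these particular calls depends on how it interposes mmap, which is not covered here.

```python
import mmap

MAP_LENGTH = 4 * 1024 * 1024  # 4MB per mapping, as in the description above
MAP_COUNT = 100


def map_write_read(count=MAP_COUNT, length=MAP_LENGTH):
    """Create `count` anonymous mappings, write to each, read back, and unmap."""
    verified = 0
    for _ in range(count):
        # fileno -1 with MAP_ANONYMOUS requests memory that is not backed by a file.
        region = mmap.mmap(-1, length,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
        try:
            region[0:1] = b"1"
            # Anonymous mappings start zero-filled, so the untouched byte reads as 0.
            if region[0:1] == b"1" and region[1:2] == b"\x00":
                verified += 1
        finally:
            region.close()  # unmaps the region
    return verified


print(f"{map_write_read()} of {MAP_COUNT} mappings verified")
```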
Note that you have to set MLC_PATH (and AMD_UPROFPCM_PATH if needed) in /path/to/SMDK/lib/tierd/tierd.conf before running the tests below.
Command lines
$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_allocator_test.sh
Result
run testcase - t1
PASS
......
run testcase - t7
PASS
Total 7 TCs executed: 7 PASSED, 0 FAILED
Command lines
$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_daemon_test.sh
Result
Configurations for tierd:
MLC_PATH=/path/to/smdk/lib/mlc/Linux/mlc
AMD_UPROFPCM_PATH=/path/to/smdk/lib/tierd/AMDuProf_Linux_x64_4.0.341/bin/AMDuProfPcm
Run testcase - run tierd without kmem.ko
PASS
Run testcase - tierd should generate /run/tierd/nodeX
PASS
Run testcase - check SIGINT termination
PASS
Run testcase - tierd should run well even if /dev/tierd is removed.
PASS
Run testcase - tierd should check /run/tierd/nodeX existance.
./run_tierd_daemon_test.sh: line 75: 60056 Killed $TIERD -c $TIERD_CONFPATH &> /dev/null
PASS
/path/to/SMDK/src/test/tierd
PASS
Command lines
$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_driver_test.sh
Result
/path/to/SMDK/src/test/tierd
PASS
Command lines
$ cd /path/to/SMDK/src/test/tierd
$ ./run_tierd_plugin_test.sh
Result
PASS
This test verifies the basic operations of the optimization APIs such as s_malloc, s_free, etc. You can run any of the 8 pre-defined tests by passing options to the script.
The test code also gives hints on how to use SMDK's optimization APIs in your own applications.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_c
$ ./run_test_opt_api.sh test <n> [options...]
(Example) $ ./run_test_opt_api.sh test 8 size 1024 iter 1000000 nthreads 1
Options
Options | Desc. | Default |
---|---|---|
test <n> | Selects test id. 1: Basic optimization API running test; 2: Multiple s_malloc and s_free_type requests with specified mem type (default: normal); 3: Multiple s_malloc requests alternating two memory types and s_free_type; 4: Multiple s_malloc requests with random memory types and s_free_type; 5: Memtype exception cases; 6: Multiple s_realloc requests by different memtype with old pointer; 7: Multiple s_malloc and s_free_type requests with different memtype; 8: Multiple s_malloc requests with random memory types and s_free | none |
size <n> | Memory allocation size per request. | 8(B) |
iter <n> | Number of times memory allocation requests are repeated. | 10 |
nthreads <n> | Number of threads to run specified tests. | 1 |
time | Displays test execution time after the test is completed. | false |
vsizes | Variable memory allocation request sizes; 8B, 64B, 512B, 4KB and 2MB. | false |
perthreadcpu | Sets different cpu affinities for each thread. (applied only when nthreads > 1) | false |
exmem | Sets memtype to EXMEM. (otherwise it is set to NORMAL) | false |
repeat <n> | Number of times the test is repeated. | 1 |
Result
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
[Test Parameters] size=1024 ,iter=1000000, nthreads=1, mem type=0
[Test 8(tid=0)] Start
mem_used_normal(before malloc): 0
mem_used_exmem(before malloc): 0
mem_used_normal(after malloc): 511436800
mem_used_exmem(after malloc): 512573440
mem_used_normal(after free): 14336
mem_used_exmem(after free): 20480
[Test 8(tid=0)] End
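As a worked check of the Test 8 result above, the reported counters can be compared with the amount of memory requested (size multiplied by iter). The numbers below are copied from the sample output; the variable names are illustrative.

```python
size, iterations = 1024, 1_000_000        # from "[Test Parameters]"
used_normal_after_malloc = 511_436_800    # mem_used_normal(after malloc)
used_exmem_after_malloc = 512_573_440     # mem_used_exmem(after malloc)

requested = size * iterations
used_total = used_normal_after_malloc + used_exmem_after_malloc
exmem_share = used_exmem_after_malloc / used_total

print(f"requested : {requested}")
print(f"used total: {used_total} ({used_total - requested} bytes above the request)")
print(f"ExMem share of used memory: {exmem_share:.1%}")
```

In this run the two memory types each served roughly half of the requests, which is consistent with Test 8 choosing the memory type at random.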
In addition to memory allocation functions such as s_malloc, s_calloc, etc., the SMDK optimization path provides a C++ memory allocation class (class SmdkAllocator).
This test verifies that you can instantiate the class and obtain the requested type of memory from the allocator.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_cpp
$ ./run_test_opt_api_cpp.sh
Result
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
Test(basic functional test) starts
Test(basic functional test) ends
Test(malloc-free test) starts
Test(malloc-free test) ends
Test(malloc-memstat-free test) starts
......
After Malloc
type: 0
total: 66571993088
used: 10240001024
available: 51565993984
......
Test(malloc-memstat-free test) ends
The SMDK allocator library is also available for Python3 applications. The script in this section tests the use of CXL memory in a Python application through the py_smdk package of SMDK.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_python
$ ./run_test_opt_api_py.sh
Result
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
hello smdk
nice to meet you!
test 0 done
test 1 done
......
test 11 done
The test scripts here exercise the following optimization path APIs: s_malloc_node, s_free_node, s_enable_node_interleave, and s_disable_node_interleave.
1. run_alloc_onnode.sh
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_nodectl
$ ./run_alloc_onnode.sh [options...]
Options
Options | Desc. | Default |
---|---|---|
size <byte> | Memory allocation size (bytes) per request. | 67108864 |
iter <n> | Number of times memory allocation requests are repeated. | 10 |
nthreads <n> | Number of threads to run specified tests. | 1 |
node <n> | A character string naming the CXL memory node. | 1 |
Result
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
mem used : 335544320
thread1 malloc test over
SMDK Memory allocation stats:
Type Total Used Available
Normal 62.0GB 0.0GB 35.0GB
ExMem 32.0GB 0.0GB 32.0GB
2. run_policy_test.sh
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/opt_api_nodectl
$ ./run_policy_test.sh [options...]
Options
Options | Desc. | Default |
---|---|---|
size <byte> | Memory allocation size (bytes) per request. | 67108864 |
iter <n> | Number of times memory allocation requests are repeated. | 10 |
nthreads <n> | Number of threads to run specified tests. | 1 |
node <n> | A character string list of CXL memory nodes (e.g., 1-2, 3) | 0-1 |
Result
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
......
[TEST START] smdk smalloc test under node interleave policy
[TEST PARAMETERS] nodes=1-3 size=67108864 iter=10 nthreads=1
create- thread1
thread1 malloc test start
......
[Warning] s_enable_node_interleave:invalid node(s).(1-3)
thread1 malloc test over
This test verifies the basic operation of SMDK's metadata APIs such as s_get_memsize_total, s_get_memsize_available, etc. In particular, it checks whether the s_get_memsize_used function reports memory usage correctly in each pre-defined case. The test code also gives hints on how to use SMDK's metadata APIs in your applications.
Command lines
$ cd /path/to/SMDK/src/test/heap_allocator/metadata_api
$ ./run_test_meta_api.sh
Result
g_arena_pool[0].nr_arena=20
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
SMDK Memory allocation stats:
Type Total Used Available
Normal 62.6GB 0.0GB 57.3GB
ExMem 32.0GB 0.0GB 27.3GB
[test] size=4096, total=1GiB, type=0
mem_used (before malloc) = 0
mem_requested = 1073741824
mem_available (before malloc) = 61626081280
mem_used (after malloc) = 1073766400
mem_available (after malloc) = 60334178304
mem_used (after free) = 61440
mem_available (after free) = 60400619520
…
SMDK Memory allocation stats:
Normal 62.6GB 0.0GB 56.4GB
ExMem 32.0GB 0.0GB 31.9GB
This test runs and verifies the SMDK PNM APIs related to IMDB applications, such as Range scan and List scan operations with bit vector (BV) and index vector (IV) outputs.
Command lines
$ cd /path/to/SMDK/src/test/pnm
$ ./run_test_pnm_imdb.sh
Result
insmod /lib/modules/6.9.0-smdk/kernel/drivers/pnm/imdb_resource/imdb_resource.ko
g_arena_pool[0].nr_arena=32
g_arena_pool[0].type_mem=normal
......
use_auto_arena_scaling = 1
[Test #1] RangeScan - Output BV starts
Sub Test starts
[Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......
[Test #2] RangeScan - Output IV starts
Sub Test starts
[Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......
[Test #3] ListScan - Output BV starts
Sub Test starts
[Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......
[Test #4] ListScan - Output IV starts
Sub Test starts
[Test Info] Column Generator: random, Bit Compression: 2
Sub Test done - PASS
......
PASS
This test runs and verifies the SMDK PNM APIs related to DLRM applications, such as SLS operations with two different data types, Float and Uint32.
Command lines
$ cd /path/to/SMDK/src/test/pnm
$ ./run_test_pnm_dlrm.sh
Result
insmod /lib/modules/6.9.0-smdk/kernel/drivers/pnm/sls_resource/sls_resource.ko
[Test #1] SLS - Data Type: Float starts
Sub Test starts
[Table Info] Tables count: 50, Rows number: 500000, Feature size: 16
[SLSOp Info] Batch: 16, Max_lookup: 500, Min_lookup: 1, Alloc_option: REPLICATE_ALL
Sub Test done - PASS
......
[Test #2] SLS - Data Type: Uint32 starts
Sub Test starts
[Table Info] Tables count: 50, Rows number: 500000, Feature size: 16
[SLSOp Info] Batch: 16, Max_lookup: 500, Min_lookup: 1, Alloc_option: REPLICATE_ALL
Sub Test done - PASS
......
PASS
This test is to run and verify the new CXL spec-based commands that SMDK added, including poison, timestamp, event, identify, health, alert, shutdown state and QoS control.
Command lines
$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_cmd.sh <poison_address>
(Example) $ ./test_cli_cmd.sh 10000
Options
Options | Desc. |
---|---|
poison_address | Poison inject address. (hexadecimal, should be greater than 0x1000) |
Result
[set-timestamp]
$ cxl set-timestamp mem0
cxl memdev: cmd_set_timestamp: set-timestamp 1 mem
[get-timestamp]
$ cxl get-timestamp mem0
8/5/2022 9:19:12
cxl memdev: cmd_get_timestamp: get-timestamp 1 mem
......
This test is to run and verify the new CXL spec-based background commands that SMDK added, such as scan media and sanitize.
Command lines
$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_background_cmd.sh <scan_media_address>
(Example) $ ./test_cli_background_cmd.sh 10000
Options
Options | Desc. |
---|---|
scan_media_address | Scan media start address. (hexadecimal, should be greater than 0x1000) |
Result
[get-scan-media-caps]
$ cxl get-scan-media-caps mem0 -a 0x10000 -l 0x80
Estimated Scan Media Time(ms): 256
cxl memdev: cmd_get_scan_media_caps: get-scan-media-caps 1 mem
[scan-media]
$ cxl scan-media mem0 -a 0x10000 -l 0x80
cxl memdev: cmd_scan_media: scan-media 1 mem
......
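Both test_cli_cmd.sh and test_cli_background_cmd.sh take their address argument as a hexadecimal value greater than 0x1000, and the sample output shows the argument 10000 being passed on as 0x10000. A minimal sketch of that validation, with a hypothetical helper name, could look like this:

```python
MIN_ADDRESS = 0x1000  # both scripts ask for an address greater than 0x1000


def parse_hex_address(text):
    """Interpret a script argument such as "10000" as hexadecimal and validate it."""
    value = int(text, 16)  # accepts "10000" as well as "0x10000"
    if value <= MIN_ADDRESS:
        raise ValueError(
            f"address {value:#x} must be greater than {MIN_ADDRESS:#x}")
    return value


print(hex(parse_hex_address("10000")))
```

Note that the argument 10000 therefore denotes byte address 65536, not ten thousand.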
This test is to run and verify the new CXL node grouping (i.e. CXL memory partitioning) functions with CLI's create-region, destroy-region and list CMDs extended by SMDK.
Command lines
$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_group_cmd.sh
Result
[create-region -V -G node]
/proc/buddyinfo
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
......
/proc/iomem CXL related info
1080000000-187fffffff : hmem.2
1080000000-187fffffff : Soft Reserved
......
[destroy-region -V -N 1]
/proc/buddyinfo
......
This script does not exercise the commands themselves. Instead, it checks whether the CLI tool returns an error and displays an informational message properly when the user gives incorrect input to CXL-CLI's new commands.
Command lines
$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_exception.sh
Result
......
Error: unknown option 'inval'
usage: cxl destroy-region <region0> ... [<options>]
-b, --bus <bus name> Limit operation to the specified bus
-d, --decoder <root decoder name>
Limit to / use the specified root decoder
--debug turn on debug
-f, --force destroy region even if currently active
-V, --soft_interleaving
destory(remove) soft-interelaving node(s)
-N, --target_node <n>
soft-interleaving node id to remove cxl devices followed
-w, --ways <n> number of cxldevs participating in the soft-interleaving node
......
Test Total: 15, Test Pass: 15
This test is to run and verify the CXL Swap control commands that SMDK added, including enable-cxlswap, disable-cxlswap, check-cxlswap and flush-cxlswap.
Command lines
$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_cxlswap_cmd.sh
Result
[disable-cxlswap]
[CXLSwap status] enabled
[CXL-CLI] disable-cxlswap
Success: CXLSwap is disabled.
[CXLSwap status] disabled
PASSED
[enable-cxlswap]
[CXLSwap status] disabled
......
This test is to run and verify the CXL Cache control commands that SMDK added, including enable-cxlcache, disable-cxlcache, check-cxlcache and flush-cxlcache.
Command lines
$ cd /path/to/SMDK/src/test/cxl_cli
$ ./test_cli_cxlcache_cmd.sh
Result
[disable-cxlcache]
[CXLCache status] enabled
[CXL-CLI] disable-cxlcache
Success: CXLCache is disabled.
[CXLCache status] disabled
PASSED
[enable-cxlcache]
[CXLCache status] disabled
......