NFS Cache Validation
NFS Cache is a feature that caches data from an NFS server on the local disk of the client machine. This document describes how to configure NFS Cache, and reports the results of reading Apache Arrow files stored on another server with NFS Cache enabled.
- An NFSv4 server must be running.
- The kernel must support fscache and cachefiles.
lsmod | grep -E 'fscache|cachefiles'

sudo dnf install -y cachefilesd

sudo mkdir /var/cache/fscache

The default SELinux configuration prevents the cache from being saved. You should adjust the SELinux configuration properly, but that is omitted in this explanation; here SELinux is simply switched to permissive mode.

sudo setenforce 0

sudo vi /etc/cachefilesd.conf

Configuration
## Specify the cache directory created with mkdir above
dir /var/cache/fscache
## Specify cache tag
tag mycache
sudo systemctl enable cachefilesd
sudo systemctl start cachefilesd

sudo mount -t nfs -o fsc <NFS Server Host>:<NFS Server Path> <Mount path>

The lineorder table of the Star Schema Benchmark was exported to an Apache Arrow file, placed on an NFS server, mounted on the client with FS-Cache enabled (the fsc mount option), and referenced through Arrow Fdw. We then measured the time required to run the entire Star Schema Benchmark.
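Before generating data, it may be worth confirming that the NFS mount is really using FS-Cache. The following is a minimal sketch, not part of the original procedure. It assumes the kernel exposes /proc/fs/nfsfs/volumes with an FSC column as the last field of each row; the exact layout can differ between kernel versions. The function name fsc_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any NFS volume listed in the given file
# has "yes" in its last column (assumed to be FSC).
# Defaults to /proc/fs/nfsfs/volumes when no file is given.
fsc_enabled() {
    awk 'NR > 1 && $NF == "yes" { found = 1 } END { exit !found }' \
        "${1:-/proc/fs/nfsfs/volumes}"
}

# Only meaningful on a client with an NFS mount; skipped elsewhere.
if [ -r /proc/fs/nfsfs/volumes ]; then
    if fsc_enabled; then
        echo "FS-Cache is enabled on at least one NFS volume"
    else
        echo "no NFS volume reports FSC=yes; check the fsc mount option"
    fi
fi
```

If no volume reports yes, the usual suspects are a missing fsc mount option or a cachefilesd service that is not running.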
Data generation
./dbgen -s 400 -Ta

Arrow Fdw setup
IMPORT FOREIGN SCHEMA arroworder FROM SERVER arrow_fdw
INTO public
OPTIONS (file '/mnt/0/lineorder.arrow');
We measured the time (in milliseconds) taken to execute all queries, #1 through #13. The measurement was repeated three times, and the results are shown in the graph below.

The first run is slow because the target file is not yet in the cache and must be retrieved over the network. From the second run onward, the file is served from the local cache, and the queries clearly run faster.
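The three timed runs described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the original does not say how the queries were executed, so running each query file with psql, the helper names elapsed_ms and run_all_queries, and the directory and database names in the usage comment are all hypothetical. It also relies on GNU date for nanosecond timestamps.

```shell
#!/bin/sh
# Hypothetical helper: print the wall-clock time of a command in milliseconds.
# Uses GNU date's %N (nanoseconds); the command's own output is discarded.
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Hypothetical helper: run every *.sql file in a directory once through psql
# (an assumption) and print the summed elapsed time in milliseconds.
#   $1 = directory holding the query files, $2 = database name
run_all_queries() {
    dir=$1
    db=$2
    total=0
    for q in "$dir"/*.sql; do
        [ -f "$q" ] || continue          # skip when the glob matches nothing
        t=$(elapsed_ms psql -d "$db" -f "$q")
        total=$(( total + t ))
    done
    echo "$total"
}

# Example usage (hypothetical paths and database name):
#   for run in 1 2 3; do
#       echo "run $run: $(run_all_queries ./ssbm_queries ssbm) ms"
#   done
```

With a cold cache, the first printed total would be expected to be the largest, since the Arrow file still has to cross the network.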