# Ceph Cache Tiering

Add the new SSD OSDs to the cluster:

*(figure: cache-tier)*

From the admin node, within the working directory (`cd cluster-ceph`), run the following command:

ceph-deploy osd create ceph-node1-$GN:vdX ceph-node2-$GN:vdX ceph-node3-$GN:vdX  

Replace `vdX` with the correct device name.
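For example, if the new SSD appears as `/dev/vdc` on every node (the device letter here is only an assumption; check with `lsblk` on the nodes), the command would look like:

    # illustrative only: assumes the SSD is /dev/vdc on each node
    ceph-deploy osd create ceph-node1-$GN:vdc ceph-node2-$GN:vdc ceph-node3-$GN:vdc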

### Set up the cache pool

  1. Get the crush map

    To get the CRUSH map for your cluster, execute the following:

    ceph osd getcrushmap -o {compiled-crushmap-filename}
    
  2. Decompile the crush map

    To decompile a CRUSH map, execute the following:

    crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
    
  3. Modify the crush map: move osd.3, osd.4 and osd.5 into dedicated host buckets, group them under a new root `ssd`, and add a new rule `ssd_ruleset`:

    host ceph-node1-0-ssd {
        id -5       # do not change unnecessarily
        alg straw
        hash 0      # rjenkins1
        item osd.3 weight 0.044
    }
    host ceph-node2-0-ssd {
        id -6       # do not change unnecessarily
        alg straw
        hash 0      # rjenkins1
        item osd.4 weight 0.044
    }
    host ceph-node3-0-ssd {
        id -7       # do not change unnecessarily
        alg straw
        hash 0      # rjenkins1
        item osd.5 weight 0.044
    }

    root ssd {
        id -8       # do not change unnecessarily
        alg straw
        hash 0      # rjenkins1
        item ceph-node1-0-ssd weight 0.044
        item ceph-node2-0-ssd weight 0.044
        item ceph-node3-0-ssd weight 0.044
    }
    rule ssd_ruleset {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 0 type host
        step emit
    }

 
Note: the IDs of the added entities may differ in your cluster; make sure each ID is unique within the map.

Now you can apply the modified crush map:

4. Compile the crush map

crushtool -c {decompiled-crush-map-filename} -o {compiled-crush-map-filename}

5. Set the new crush map

To set the CRUSH map for your cluster, execute the following:

ceph osd setcrushmap -i {compiled-crushmap-filename}
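As a recap of steps 1-5, a minimal round trip could look like the following; the filenames `crushmap.bin`, `crushmap.txt` and `crushmap-new.bin` are only illustrative choices:

    # grab and decompile the current CRUSH map (filenames are arbitrary)
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: add the ssd host buckets, the ssd root and the ssd_ruleset rule
    vi crushmap.txt
    # recompile and inject the modified map
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin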


Note: you can disable updating of the crush map when the OSD daemons start by adding the following to `ceph.conf`:

    [osd]
    osd crush update on start = false
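If you edit the `ceph.conf` kept in the admin node's working directory, one way to distribute the change (assuming the ceph-deploy workflow used so far) is to push the configuration to the nodes:

    # assumption: the ceph.conf in the admin working directory is the copy to distribute
    ceph-deploy --overwrite-conf config push ceph-node1-$GN ceph-node2-$GN ceph-node3-$GN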


Check the cluster status:

    ceph status
    ceph osd tree


The new cluster structure should look like this:

    $ ceph osd tree
    ID WEIGHT  TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
    -8 0.13197 root ssd
    -5 0.04399     host ceph-node1-0-ssd
     3 0.04399         osd.3                      up  1.00000          1.00000
    -6 0.04399     host ceph-node2-0-ssd
     4 0.04399         osd.4                      up  1.00000          1.00000
    -7 0.04399     host ceph-node3-0-ssd
     5 0.04399         osd.5                      up  1.00000          1.00000
    -1 0.13197 root default
    -2 0.04399     host ceph-node1-0
     0 0.04399         osd.0                      up  1.00000          1.00000
    -3 0.04399     host ceph-node2-0
     1 0.04399         osd.1                      up  1.00000          1.00000
    -4 0.04399     host ceph-node3-0
     2 0.04399         osd.2                      up  1.00000          1.00000


6. Create the pool 'ssdcache'

Create the pool "ssdcache" and assign the rule "ssd_ruleset" to the pool (the rule was defined above with `ruleset 1`):

    ceph osd pool create ssdcache <num_pg>
    ceph osd pool set ssdcache crush_ruleset 1
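For example, on a small test cluster like this one a modest PG count is enough; the value 128 below is only an illustrative assumption:

    # 128 placement groups is an assumption - size pg_num for your own cluster
    ceph osd pool create ssdcache 128
    ceph osd pool set ssdcache crush_ruleset 1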


Check the results:

ceph osd dump
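To narrow the output down to the new pool (a convenience filter, not part of the original steps), you can grep for it; the `ssdcache` line should reference the ssd rule (ruleset 1):

    ceph osd dump | grep ssdcache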


7. Create the cache tier

Associate a backing storage pool with a cache pool:

ceph osd tier add {storagepool} {cachepool}

To set the cache mode, execute the following:

ceph osd tier cache-mode {cachepool} {cache-mode}

Set the overlay on the storage pool, so that all client I/O is routed to the cache pool:

ceph osd tier set-overlay {storagepool} {cachepool}
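Concretely, for this tutorial the backing pool is `rbd` and the cache pool is `ssdcache`. The sketch below assumes the `writeback` cache mode (the usual choice for read/write workloads; the mode is an assumption, not stated above):

    # assumption: writeback cache mode, with ssdcache in front of the rbd pool
    ceph osd tier add rbd ssdcache
    ceph osd tier cache-mode ssdcache writeback
    ceph osd tier set-overlay rbd ssdcache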


8. Configure the cache tier

There are several parameters that control the sizing of the cache tier. `target_max_bytes` and `target_max_objects` set the maximum size of the cache tier in bytes or in number of objects; when either limit is reached, the cache tier is considered full. `cache_target_dirty_ratio` controls when flushing starts: when the fraction of dirty data (in bytes or objects) reaches this ratio, the tiering agent begins to flush objects to the backing pool. `cache_target_full_ratio` works the same way, but triggers eviction instead.

    ceph osd pool set {cache-pool-name} target_max_bytes {#bytes}
    ceph osd pool set {cache-pool-name} target_max_objects {#objects}
    ceph osd pool set {cache-pool-name} cache_target_dirty_ratio {0.0..1.0}
    ceph osd pool set {cache-pool-name} cache_target_full_ratio {0.0..1.0}


There are other cache tiering parameters, such as `cache_min_flush_age` and `cache_min_evict_age`; they are not required and can be set as needed.

### Test it!
Configure the cache tier as follows:

    ceph osd pool set ssdcache hit_set_type bloom
    ceph osd pool set ssdcache hit_set_count 1
    ceph osd pool set ssdcache hit_set_period 3600
    ceph osd pool set ssdcache target_max_bytes 10000000000
    ceph osd pool set ssdcache target_max_objects 10000
    ceph osd pool set ssdcache cache_min_flush_age 300
    ceph osd pool set ssdcache cache_min_evict_age 600
    ceph osd pool set ssdcache cache_target_dirty_ratio 0.4
    ceph osd pool set ssdcache cache_target_full_ratio 0.8

Create a temporary 500 MB file that we will write to the rbd pool; because of the overlay, it will actually land in the cache pool (ssdcache):

    dd if=/dev/zero of=/tmp/file1 bs=1M count=500
Put this file into the rbd pool:
rados -p rbd put object1 /tmp/file1
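As a quick sanity check (not part of the original steps), you can verify that the object is currently held in the cache pool and not yet in the backing pool:

    # object1 should show up in ssdcache, not yet in rbd
    rados -p ssdcache ls
    rados -p rbd ls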

After 300 seconds (`cache_min_flush_age`), the cache-tiering agent flushes object1 from ssdcache to the backing rbd pool; once the eviction age (600 seconds) is also reached, object1 is removed from ssdcache:

    rados -p ssdcache ls
    rados -p rbd ls
    date

