# Ceph Cache Tiering
From the admin node, inside the working directory (`cd cluster-ceph`), run the following command to create the three new OSDs (osd.3, osd.4 and osd.5) that will back the cache tier:
ceph-deploy osd create ceph-node1-$GN:vdX ceph-node2-$GN:vdX ceph-node3-$GN:vdX
Replace `vdX` with the right device name.
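For example, assuming the new disk appears as `/dev/vdb` on every node (check with `lsblk` on each node; the name may differ), the command would be:

```
ceph-deploy osd create ceph-node1-$GN:vdb ceph-node2-$GN:vdb ceph-node3-$GN:vdb
```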
- Get the CRUSH map
To get the CRUSH map for your cluster, execute the following:
ceph osd getcrushmap -o {compiled-crushmap-filename}
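For instance, using `crushmap.bin` as the output filename (the name is arbitrary; the same one is reused in the examples below):

```
ceph osd getcrushmap -o crushmap.bin
```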
- Decompile the CRUSH map
To decompile a CRUSH map, execute the following:
crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
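Continuing with the example filename from the previous step, this produces a plain-text map that can be edited with any text editor:

```
crushtool -d crushmap.bin -o crushmap.txt
```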
- Modify the CRUSH map: move osd.3, osd.4 and osd.5 into dedicated buckets, group them under a new root `ssd`, and add a new rule `ssd_ruleset`:
host ceph-node1-0-ssd {
    id -5    # do not change unnecessarily
    alg straw
    hash 0   # rjenkins1
    item osd.3 weight 0.044
}
host ceph-node2-0-ssd {
    id -6    # do not change unnecessarily
    alg straw
    hash 0   # rjenkins1
    item osd.4 weight 0.044
}
host ceph-node3-0-ssd {
    id -7    # do not change unnecessarily
    alg straw
    hash 0   # rjenkins1
    item osd.5 weight 0.044
}
root ssd {
    id -8    # do not change unnecessarily
    alg straw
    hash 0   # rjenkins1
    item ceph-node1-0-ssd weight 0.044
    item ceph-node2-0-ssd weight 0.044
    item ceph-node3-0-ssd weight 0.044
}
rule ssd_ruleset {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take ssd
    step chooseleaf firstn 0 type host
    step emit
}
Note: the IDs of the added entities may differ in your map; make sure every bucket and rule uses an ID that is unique within the map.
Now you can apply the modified CRUSH map:
1. Compile the crush map
crushtool -c {decompiled-crush-map-filename} -o {compiled-crush-map-filename}
1. Set the new crush map
To set the CRUSH map for your cluster, execute the following:
ceph osd setcrushmap -i {compiled-crushmap-filename}
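With the example filenames used above, the two steps look like this: the edited text map is compiled into a new binary map, which is then injected into the cluster.

```
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin
```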
Note: you can prevent the OSD daemons from updating the CRUSH map when they start by adding the following to ceph.conf:
[osd]
osd crush update on start = false
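If you add this option to the ceph.conf kept in the working directory on the admin node, you can distribute the updated file to the cluster nodes with ceph-deploy (node names follow the naming scheme used in the rest of this tutorial):

```
ceph-deploy --overwrite-conf config push ceph-node1-$GN ceph-node2-$GN ceph-node3-$GN
```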
Check the cluster status:
ceph status
ceph osd tree
The new cluster structure should look like this:
$ ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-8 0.13197 root ssd
-5 0.04399 host ceph-node1-0-ssd
3 0.04399 osd.3 up 1.00000 1.00000
-6 0.04399 host ceph-node2-0-ssd
4 0.04399 osd.4 up 1.00000 1.00000
-7 0.04399 host ceph-node3-0-ssd
5 0.04399 osd.5 up 1.00000 1.00000
-1 0.13197 root default
-2 0.04399 host ceph-node1-0
0 0.04399 osd.0 up 1.00000 1.00000
-3 0.04399 host ceph-node2-0
1 0.04399 osd.1 up 1.00000 1.00000
-4 0.04399 host ceph-node3-0
2 0.04399 osd.2 up 1.00000 1.00000
1. Create the pool 'ssdcache'
Create the pool "ssdcache" and assign the rule "ssd_ruleset" to the pool:
ceph osd pool create ssdcache <num_pg>
ceph osd pool set ssdcache crush_ruleset <ruleset_id>
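For example, with 128 placement groups (choose a PG count suitable for your cluster) and ruleset 1, the number assigned to `ssd_ruleset` in the CRUSH map above:

```
ceph osd pool create ssdcache 128
ceph osd pool set ssdcache crush_ruleset 1
```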
Check the results:
ceph osd dump
1. Create the cache tier
Associate a backing storage pool with a cache pool:
ceph osd tier add {storagepool} {cachepool}
To set the cache mode, execute the following:
ceph osd tier cache-mode {cachepool} {cache-mode}
Set the overlay on the storage pool, so that all client I/O is routed to the cache pool:
ceph osd tier set-overlay {storagepool} {cachepool}
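In this tutorial the backing storage pool is `rbd` (the pool used in the test below) and the cache pool is `ssdcache`; with `writeback` as the cache mode (the mode that matches the flush/evict behaviour described below), the three steps become:

```
ceph osd tier add rbd ssdcache
ceph osd tier cache-mode ssdcache writeback
ceph osd tier set-overlay rbd ssdcache
```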
1. Configure the cache tier
There are several parameters that control the sizing of the cache tier. `target_max_bytes` and `target_max_objects` set the maximum size of the cache tier in bytes or in number of objects; when either limit is reached, the cache tier is considered full. `cache_target_dirty_ratio` controls when flushing starts: when the fraction of dirty data (in bytes or objects) reaches this ratio, the tiering agent begins to flush. `cache_target_full_ratio` works the same way, but triggers the evict operation instead.
ceph osd pool set {cache-pool-name} target_max_bytes {#bytes}
ceph osd pool set {cache-pool-name} target_max_objects {#objects}
ceph osd pool set {cache-pool-name} cache_target_dirty_ratio {0.0..1.0}
ceph osd pool set {cache-pool-name} cache_target_full_ratio {0.0..1.0}
There are other cache-tiering parameters, such as `cache_min_flush_age` and `cache_min_evict_age`; these are optional and can be set as needed.
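You can read back any of these settings with `ceph osd pool get`, for example on the `ssdcache` pool:

```
ceph osd pool get ssdcache target_max_bytes
ceph osd pool get ssdcache cache_target_dirty_ratio
```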
### Test it!
Configure the cache tier (the `ssdcache` pool created above) as follows:
ceph osd pool set ssdcache hit_set_type bloom
ceph osd pool set ssdcache hit_set_count 1
ceph osd pool set ssdcache hit_set_period 3600
ceph osd pool set ssdcache target_max_bytes 10000000000
ceph osd pool set ssdcache target_max_objects 10000
ceph osd pool set ssdcache cache_min_flush_age 300
ceph osd pool set ssdcache cache_min_evict_age 600
ceph osd pool set ssdcache cache_target_dirty_ratio 0.4
ceph osd pool set ssdcache cache_target_full_ratio 0.8
Create a temporary file of 500 MB that we will write to the rbd pool; since the overlay is set, the write will initially land in the cache pool:
# dd if=/dev/zero of=/tmp/file1 bs=1M count=500
Put this file into the rbd pool:
rados -p rbd put object1 /tmp/file1
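Right after the put, the object should be visible in the cache pool, since the overlay routes all writes there; a quick check (assuming the `ssdcache` cache pool from the previous steps):

```
rados -p ssdcache ls
rados df
```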
After 300 seconds (the `cache_min_flush_age` set above), the cache-tiering agent will flush object1 from the cache pool to the rbd pool; once evicted, object1 will also be removed from the cache pool. Follow the migration with:
rados -p rbd ls
rados -p ssdcache ls
date
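To follow the migration without re-typing the commands, you can wrap the same checks in `watch` (available on most Linux distributions):

```
watch -n 30 'rados -p rbd ls; echo ---; rados -p ssdcache ls'
```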