# Performance

The cache replacement policy must maximize the cache hit rate while minimizing the computational and space overhead of implementing the policy.

An analysis of hit rate, throughput, and latency is provided, comparing a reference LRU implementation that uses a global lock with the .NET MemoryCache, ConcurrentLru, and ConcurrentLfu.

## Choosing between the ConcurrentLru and ConcurrentLfu algorithms

Each algorithm makes tradeoffs, and the better choice depends on the application. In general:

- ConcurrentLru is better suited to smaller caches (thousands of items rather than millions) and machines with fewer CPU cores (e.g. fewer than 16), making it a good fit for containers or desktop computers. It is very lightweight, uses the least memory, and does not spawn threads. For highly concurrent workloads on high core count machines (e.g. servers), it will maintain high throughput but consume more CPU due to thrashing spin waits in the centralized ConcurrentQueues.
- ConcurrentLfu scales better to a higher number of CPU cores, has the highest hit rate, and higher peak read and update throughput. However, buffered writes can introduce a slight throughput disadvantage for write-heavy workloads. It uses more memory due to the internal buffers, but less CPU under heavy load, particularly as core count increases. For larger caches, the overhead of the buffers becomes insignificant. Cache maintenance is performed in the background, and the highest throughput is achieved using a dedicated thread (see the construction sketch after this list).
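
As a rough sketch of how each cache might be constructed (the ConcurrentLfu constructor overload taking an IScheduler, and the BackgroundThreadScheduler type, are assumptions that may vary by library version):

```csharp
using System;
using System.Collections.Generic;
using BitFaster.Caching.Lfu;
using BitFaster.Caching.Lru;
using BitFaster.Caching.Scheduler;

public static class CacheChoiceExample
{
    public static void Main()
    {
        // ConcurrentLru: lightweight, no background threads; a good
        // default for smaller caches and lower core counts.
        var lru = new ConcurrentLru<int, string>(capacity: 1000);

        // ConcurrentLfu: maintenance is scheduled in the background;
        // a dedicated thread gives the highest throughput. This overload
        // is an assumption and may differ between library versions.
        var lfu = new ConcurrentLfu<int, string>(
            concurrencyLevel: Environment.ProcessorCount,
            capacity: 1000,
            scheduler: new BackgroundThreadScheduler(),
            comparer: EqualityComparer<int>.Default);

        // Both expose the same lookup pattern: return the cached value,
        // invoking the factory only on a miss.
        string a = lru.GetOrAdd(1, k => $"value{k}");
        string b = lfu.GetOrAdd(1, k => $"value{k}");
        Console.WriteLine($"{a} {b}");
    }
}
```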

In most applications hit rate is the key metric, and it is best to measure it in the context of your application using a representative workload.
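
For example, hit rate can be observed by replaying a representative key distribution against a cache sized as in production. This sketch assumes the cache exposes metrics through an optional Metrics property, which may depend on how the cache is constructed and on the library version; the workload parameters are hypothetical:

```csharp
using System;
using BitFaster.Caching.Lru;

public static class HitRateExample
{
    public static void Main()
    {
        var cache = new ConcurrentLru<int, string>(capacity: 128);

        // Hypothetical workload: 100,000 lookups uniformly distributed
        // over 512 keys, so only a fraction of the keys fit in the cache.
        var random = new Random(42);
        for (int i = 0; i < 100_000; i++)
        {
            int key = random.Next(0, 512);
            cache.GetOrAdd(key, k => k.ToString());
        }

        // HitRatio is hits / total lookups recorded by the cache.
        if (cache.Metrics.HasValue)
        {
            Console.WriteLine($"Hit ratio: {cache.Metrics.Value.HitRatio:P2}");
        }
    }
}
```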