Using Cache - internetarchive/openlibrary GitHub Wiki
Cache is a way of taking expensive values we've computed and storing them in resident memory for a period of time. A good example is "top books this week": the result of such a query is unlikely to change if we re-run it multiple times a second, whereas re-running it that often may put a significant (even untenable) load on the CPU or the database. Therefore, we cache the computed value for some period of time (e.g. 1 hour).
Let's assume you have an expensive function like fibonacci:
    def fibonacci(n):
        # Recurse only for n > 1; fibonacci(1) is 1 and fibonacci(0) is 0.
        return (
            fibonacci(n - 1) + fibonacci(n - 2)
            if n > 1
            else (1 if n > 0 else 0)
        )
The `openlibrary.core.cache.memcache_memoize` decorator returns a cache-enabled version of `fibonacci` (which we'll call `cached_fibonacci`). When called, it first checks whether a recent enough, precomputed value exists in cache for the given input arguments before proceeding to call `fibonacci`. For instance, if we call `cached_fibonacci(4)`, `memcache_memoize` will look in the cache for a record corresponding to the function `fibonacci` with the argument `4` and, if a pre-computed value exists, return it. If no such value exists or the value in cache is too old, `fibonacci` will be called and the resulting value will be inserted or updated in the cache under the function `fibonacci` and the argument `4`, with an updated timestamp to be evaluated by the next lookup.
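The lookup-then-recompute flow described above can be illustrated with a minimal in-process sketch. This is not the Open Library implementation: `timed_memoize`, `slow_square`, and the fake clock are hypothetical names invented for illustration, and a plain dict stands in for memcached.

```python
import time


def timed_memoize(func, timeout, clock=time.time):
    """Return a version of `func` that reuses results younger than `timeout` seconds."""
    store = {}  # maps positional args -> (value, time the value was computed)

    def cached(*args):
        now = clock()
        entry = store.get(args)
        if entry is not None:
            value, computed_at = entry
            if now - computed_at < timeout:
                return value  # fresh enough: skip the expensive call
        value = func(*args)  # missing or stale: recompute
        store[args] = (value, now)  # store with an updated timestamp
        return value

    return cached


calls = []


def slow_square(n):
    calls.append(n)  # record every real invocation
    return n * n


fake_now = [0.0]  # hypothetical controllable clock, so the example needs no sleeping
cached_square = timed_memoize(slow_square, timeout=300, clock=lambda: fake_now[0])

cached_square(4)     # miss: calls slow_square
cached_square(4)     # hit: served from the dict
fake_now[0] = 301.0  # pretend more than 5 minutes have passed
cached_square(4)     # stale: calls slow_square again
```

After these three calls, `slow_square` has run only twice: the second call was answered from the stored value, and the third recomputed because the stored timestamp was older than the timeout.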
In reality, the `key_prefix` parameter to `memcache_memoize` (and not the function's name) is used in conjunction with the input parameters to the cached function in order to compute the cache key. For instance, a real example may look like this:
cached_fibonacci = cache.memcache_memoize(fibonacci, key_prefix='fib', timeout=60*5)
This call to `memcache_memoize` returns a cache-enabled version of `fibonacci` that will cache each computed value for 5 minutes, using the key prefix `fib` combined with whatever `args` and `kwargs` get passed into `fibonacci`. In practice, for our example, we might expect the cache key for `cached_fibonacci(4)` to look like `fib-4`.
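To make the idea of a key built from a prefix plus arguments concrete, here is a small sketch. `make_cache_key` is a hypothetical helper and the key format is an assumption chosen to match the `fib-4` illustration above; the exact format produced by `memcache_memoize` may differ.

```python
import json


def make_cache_key(key_prefix, args, kwargs):
    """Build a string key from a prefix plus the call's arguments (hypothetical format)."""
    parts = [json.dumps(arg, sort_keys=True) for arg in args]
    # Sort keyword arguments so f(a=1, b=2) and f(b=2, a=1) share one key.
    parts += [
        f"{name}={json.dumps(value, sort_keys=True)}"
        for name, value in sorted(kwargs.items())
    ]
    return "-".join([key_prefix, *parts])


key_a = make_cache_key("fib", [4], {})
key_b = make_cache_key("myfunc", ["x"], {"kwarg1": 2, "kwarg2": 3})
key_c = make_cache_key("myfunc", ["x"], {"kwarg2": 3, "kwarg1": 2})
```

Because the prefix is part of every key, two cached functions that receive the same arguments do not collide as long as their prefixes differ, which is why each use of the decorator should pick a distinct `key_prefix`.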
See `generic_carousel`'s use of `cache.memcache_memoize` in `plugins/openlibrary/home.py`, which caches any result computed by the `get_ia_carousel_books` function for 2 minutes.
First, go into the `web` docker container, where you have access to python and openlibrary:
docker compose exec web python
import time
import web
import infogami
from openlibrary.config import load_config
load_config('conf/openlibrary.yml') # for local dev
# load_config('/olsystem/etc/openlibrary.yml') # for prod
infogami._setup()
from infogami import config
from openlibrary.core import cache
def my_function(arg1, arg2, kwarg1="a", kwarg2="b"):
    return (arg1, arg2, kwarg1, kwarg2)
# Create a cached version of `my_function`
my_cached_function = cache.memcache_memoize(my_function, key_prefix='myfunc', timeout=60*5)
# An example call:
# Calling `my_cached_function` will first check cache to see if a pre-computed value already exists.
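# It does so by first computing a key based on the `key_prefix` and the arguments passed in.
# If no fresh value is found, it calls `my_function` and adds an entry to the cache under
# a key that looks something like "myfunc-arg1-arg2-kwarg1-kwarg2",
# saving its value as well as the timestamp for when it was computed.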
# The next time we call it, the cache system will check the time to see whether to return
# the cached value or whether to call `my_function` again and update the value in cache.
my_cached_function('arg1', 'arg2', kwarg1=2, kwarg2=3)
# Sometimes we may want to manually overwrite a function's cached value for a key:
# memcache_set takes a list of args and a dict of kwargs (i.e. the inputs for `my_function`) and uses them to create
# a cache key like "myfunc-arg1-arg2-kwarg1-kwarg2", which is what we use to fetch the function's pre-computed
# value from cache.
my_cached_function.memcache_set(['arg1', 'arg2'], {'kwarg1': 2, 'kwarg2': 3}, 'THIS IS MY NEW VALUE', time.time())
# Similarly, we can read the cached entry for that key back:
my_cached_function.memcache_get(['arg1', 'arg2'], {'kwarg1': 2, 'kwarg2': 3})