Caching

scalable.cacheable(return_type: type[GenericType] | Callable[[...], Any] | None = None, void: bool = False, check_output: bool = False, recompute: bool = False, store: bool = True, **arg_types: type[GenericType]) Callable[[Callable[[...], Any]], Callable[[...], Any]] | Callable[[...], Any]

Decorator function to cache the output of a function.

This decorator caches a function’s output for a given set of arguments. The cache key is a hash of several things: the function’s name, its code content, its arguments, and anything else that the arguments’ hash() implementations take into account. Every argument is wrapped in a type class so that hash() can be called on it; these type classes can be, and often are, custom. When a type class is not specified it is estimated, and the estimate is not guaranteed to be correct for more exotic data types, so it is best practice to specify the type class of the return value along with the type classes of all the arguments.
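To make the idea of a combined hash concrete, here is a minimal standalone sketch. It is not scalable's implementation: the helper name cache_key is hypothetical, the function's bytecode and constants stand in for its "code content", and repr() stands in for the hash that a type class would supply for each argument.

```python
import hashlib


def cache_key(func, args, kwargs):
    """Build one digest from a function's name, code, and arguments."""
    parts = [
        func.__qualname__,
        # Bytecode and constants stand in for the function's "code content".
        func.__code__.co_code.hex(),
        repr(func.__code__.co_consts),
    ]
    # repr() is an assumption: it replaces the per-argument type-class hash.
    parts.extend(repr(value) for value in args)
    parts.extend(f"{name}={kwargs[name]!r}" for name in sorted(kwargs))
    # A NUL separator keeps adjacent parts from running together.
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()


def add(a, b):
    return a + b


def scale(a, factor=2):
    return a * factor
```

Under these assumptions, the same function with the same arguments always maps to the same key, while changing the function or any argument produces a different key.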

Parameters:
  • return_type (Any) – The type class for the return value of the function. Usually one of ValueType, FileType, DirType, or ObjectType, but custom classes with a defined hash() function can be used as well. Defaults to None. If None, the return type is estimated, which is not guaranteed to be correct.

  • void (bool, optional) – Whether the function is void, that is, whether it does not return a value. The default is False.

  • check_output (bool, optional) – Whether to check that the output of the function still has the same hash as when it was stored. Useful to ensure that entities like files haven’t been modified since they were initially stored. The default is False.

  • recompute (bool, optional) – Whether to recompute the value instead of reusing a cached result. The default is False.

  • store (bool, optional) – Whether to store the value in the cache or not. The default is True.

  • arg_types (dict) – The type classes for the arguments of the function. The keys are the argument names and the values are the type classes. If no type class is given for an argument, it is estimated, which is not guaranteed to be correct.
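The interaction between recompute and store can be sketched with a small in-memory decorator. This is only an illustration under stated assumptions: simple_cacheable and _CACHE are hypothetical names, the real decorator's storage and hashing are not shown, and the reading of recompute as "ignore any cached result" is an interpretation of the parameter description above.

```python
import functools

_CACHE = {}  # hypothetical in-memory stand-in for the real cache


def simple_cacheable(recompute=False, store=True):
    """Toy decorator mirroring the recompute and store flags."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Arguments must be hashable in this simplified sketch.
            key = (func.__qualname__, args, tuple(sorted(kwargs.items())))
            if not recompute and key in _CACHE:
                return _CACHE[key]
            result = func(*args, **kwargs)
            if store:
                _CACHE[key] = result
            return result
        return wrapper
    return decorator


calls = []  # records every real execution of the functions below


@simple_cacheable()
def square(x):
    calls.append("square")
    return x * x


@simple_cacheable(recompute=True)
def fresh_square(x):
    calls.append("fresh_square")
    return x * x
```

In this sketch, calling square(4) twice runs the body once, while fresh_square(4) runs its body on every call and overwrites the stored value each time.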

Examples

>>> @cacheable
... def func(arg1, arg2):
...     ...
>>> @cacheable()
... def func(arg1, arg2):
...     ...
>>> @cacheable(void=True)
... def func(arg1, arg2):
...     ...
>>> @cacheable(ValueType)
... def func(arg1, arg2):
...     ...
>>> @cacheable(return_type=DirType, arg1=UtilityType, arg2=FileType)
... def func(arg1, arg2):
...     ...
>>> @cacheable(return_type=ValueType, recompute=False, store=True, arg1=DirType, arg2=FileType)
... def func(arg1, arg2):
...     ...

class scalable.GenericType(value: Any)

Bases: object

The GenericType class is a base class for all types that can be hashed.

Parameters:

value (Any) – The value to be hashed.

class scalable.FileType(value: Any)

Bases: GenericType

The FileType class is used to hash files.

Parameters:

value (str) – The path to the file.
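
One plausible way to hash a file from its path is to digest its contents in chunks. The sketch below assumes content-based hashing, which the documentation does not confirm for FileType; the function name hash_file is hypothetical.

```python
import hashlib


def hash_file(path, chunk_size=65536):
    """Return a SHA-256 hex digest of the file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With a content-based hash like this, editing the file changes the digest, which is the kind of change check_output is described as detecting.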

class scalable.DirType(value: Any)

Bases: GenericType

The DirType class is used to hash directories.

Parameters:

value (str) – The path to the directory.
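
A directory can be hashed along the same lines by walking it in a deterministic order and digesting each file's relative path and contents. This is a sketch under that assumption, not a description of how DirType works; hash_directory is a hypothetical name.

```python
import hashlib
import os


def hash_directory(root):
    """Return a SHA-256 hex digest over relative paths and file contents."""
    digest = hashlib.sha256()
    for current, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes os.walk's order deterministic
        for name in sorted(filenames):
            full_path = os.path.join(current, name)
            # Include the relative path so renames change the digest too.
            digest.update(os.path.relpath(full_path, root).encode())
            digest.update(b"\0")
            with open(full_path, "rb") as handle:
                digest.update(handle.read())
            digest.update(b"\0")
    return digest.hexdigest()
```

In this sketch, adding, removing, renaming, or editing any file under the directory yields a different digest.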

class scalable.ValueType(value: Any)

Bases: GenericType

Hash for generic primitive values (int, str, float, bytes, bool).

class scalable.ObjectType(value: Any)

Bases: GenericType

Hash for composite objects (lists, tuples, dicts), falling through to pickle for anything else.

Notes

The original implementation silently swallowed any exception when sorting dict keys. We narrow that to TypeError and log a debug message so unexpected errors surface during development.
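
The narrowed exception handling described in this note might look roughly like the following. It is a standalone sketch, not the library's code: hash_dict is a hypothetical name, and repr() is used for serialization purely to keep the example simple.

```python
import hashlib
import logging

logger = logging.getLogger(__name__)


def hash_dict(mapping):
    """Hash a dict so that insertion order does not change the digest."""
    try:
        items = sorted(mapping.items(), key=lambda pair: pair[0])
    except TypeError:
        # Keys of mixed, non-comparable types (e.g. int and str) land here.
        logger.debug("dict keys are not sortable; hashing in insertion order")
        items = list(mapping.items())
    # Any other exception propagates instead of being silently swallowed.
    return hashlib.sha256(repr(items).encode()).hexdigest()
```

Catching only TypeError keeps the fallback for unsortable keys while letting any other failure raise as usual.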

class scalable.UtilityType(value: Any)

Bases: GenericType

Hash for numpy arrays and pandas dataframes.

More utility data types can be added by subclassing or registering.
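
As an illustration of the subclassing route, the sketch below defines a custom type class with its own hash. So that it runs on its own, it uses a hypothetical stand-in base class instead of importing scalable.GenericType, and SetType is an invented example; the only interface assumed is a constructor taking value and a working hash().

```python
import hashlib


class GenericTypeStandIn:
    """Hypothetical stand-in for scalable.GenericType (assumed interface)."""

    def __init__(self, value):
        self.value = value


class SetType(GenericTypeStandIn):
    """Hypothetical custom type class: hashes a set regardless of order."""

    def __hash__(self):
        # Sort the elements first so equal sets always hash identically.
        canonical = repr(sorted(self.value)).encode()
        return int.from_bytes(hashlib.sha256(canonical).digest()[:8], "big")
```

If the real base class behaves as assumed, such a class could then be passed wherever a type class is expected, for example as the type class of an argument.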