Data Caching

Caching is optional in the Yuzuha Protocol, but if implemented it must satisfy the requirements on this page. A non-caching implementation may return freshly computed values on every call; a caching implementation must behave identically to a non-caching one from the caller's perspective.


1. Correctness (Determinism)

Requirement: A cached result must be bitwise-equal to the value that would be returned by a fresh computation with no cache present.

Rationale: Callers must be able to rely on the cache being a pure performance optimization with no effect on results. This rules out approximate or lossily compressed caches; lossless compression is acceptable because it reproduces the original bytes exactly.
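Bitwise equality is stricter than numerical equality, which matters when checking a cache against a fresh computation. The following is a minimal sketch using only the standard library; bitwise_equal is a hypothetical helper, not part of the protocol, and it assumes results can be viewed as flat sequences of doubles. For NumPy arrays, the analogous check would compare dtype, shape, and the output of tobytes().

```python
import struct

def bitwise_equal(a, b):
    """Return True if two float sequences have identical IEEE 754 bit patterns."""
    if len(a) != len(b):
        return False
    packed_a = struct.pack(f"<{len(a)}d", *a)
    packed_b = struct.pack(f"<{len(b)}d", *b)
    return packed_a == packed_b

fresh = [0.5, -0.25, 3.0]      # stand-in for a freshly computed result
cached = list(fresh)           # stand-in for the value read back from a cache
assert bitwise_equal(fresh, cached)

# 0.0 and -0.0 compare equal numerically, but their bit patterns differ,
# so a cache that turned one into the other would violate the requirement.
assert 0.0 == -0.0
assert not bitwise_equal([0.0], [-0.0])
```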


2. Key Spaces

The cache key for each cached function must include at least the components listed below. Additional implementation-specific key components (e.g. a format version number) are permitted but must not affect the correctness guarantee.

Canonical Basis Cache

The cache key for canonical_basis(spec) must uniquely determine the result. A sufficient key is:

key = (
    tuple((edge.rep, edge.dir) for edge in spec.edges),
    tuple(spec.alphas)           # internal representation sequence
)

For SU(2), this reduces to the tuple of (twice_spin, direction_sign) pairs for external edges plus the tuple of doubled internal spins.
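As an illustration, the key construction might look like the sketch below. Edge and Spec are hypothetical stand-ins for whatever types an implementation uses; only the attribute names rep, dir, edges, and alphas come from the key definition above, and the spin values are arbitrary.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Edge:               # hypothetical stand-in for an external edge
    rep: int              # for SU(2): twice the spin
    dir: int              # direction sign, +1 or -1

@dataclass(frozen=True)
class Spec:               # hypothetical stand-in for a basis specification
    edges: Tuple[Edge, ...]
    alphas: Tuple[int, ...]   # for SU(2): doubled internal spins

def canonical_basis_key(spec):
    """Build a hashable key from the external edges and internal labels."""
    return (
        tuple((edge.rep, edge.dir) for edge in spec.edges),
        tuple(spec.alphas),
    )

spec = Spec(edges=(Edge(1, 1), Edge(1, 1), Edge(2, -1)), alphas=())
key = canonical_basis_key(spec)
assert key == (((1, 1), (1, 1), (2, -1)), ())

# The key is built only from tuples and integers, so it can index a dict directly.
lookup = {key: "cached basis placeholder"}
```

Flipping a single direction sign or changing one internal label yields a different key, which is what prevents two distinct specs from sharing a cache entry.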

X-Symbol Cache

The cache key for compute_x_symbol(spec_a, spec_b, contraction) must uniquely determine both the returned array and the returned spec_c. A sufficient key is:

key = (
    tuple((e.rep, e.dir) for e in spec_a.edges),
    tuple(spec_a.alphas),
    tuple((e.rep, e.dir) for e in spec_b.edges),
    tuple(spec_b.alphas),
    tuple(contraction.axes_a),
    tuple(contraction.axes_b),
)
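To show how such a key could drive a cache, here is a minimal in-memory sketch. make_cached and fake_compute are hypothetical names introduced for illustration, and the SimpleNamespace objects only mimic the attributes used in the key; a real implementation would wrap the actual compute_x_symbol and store arrays.

```python
from types import SimpleNamespace

def x_symbol_key(spec_a, spec_b, contraction):
    """Build the X-symbol cache key from both specs and the contraction axes."""
    return (
        tuple((e.rep, e.dir) for e in spec_a.edges),
        tuple(spec_a.alphas),
        tuple((e.rep, e.dir) for e in spec_b.edges),
        tuple(spec_b.alphas),
        tuple(contraction.axes_a),
        tuple(contraction.axes_b),
    )

def make_cached(compute):
    """Wrap `compute` so repeated calls with the same key reuse the stored result."""
    store = {}
    def cached(spec_a, spec_b, contraction):
        key = x_symbol_key(spec_a, spec_b, contraction)
        if key not in store:
            store[key] = compute(spec_a, spec_b, contraction)
        return store[key]
    return cached

calls = []
def fake_compute(spec_a, spec_b, contraction):
    calls.append(1)  # count how often the expensive path runs
    return ("array placeholder", "spec_c placeholder")

edge = SimpleNamespace(rep=1, dir=1)
spec_a = SimpleNamespace(edges=[edge, edge], alphas=[])
spec_b = SimpleNamespace(edges=[edge], alphas=[])
contraction = SimpleNamespace(axes_a=[1], axes_b=[0])

cached = make_cached(fake_compute)
first = cached(spec_a, spec_b, contraction)
second = cached(spec_a, spec_b, contraction)
assert first == second and len(calls) == 1
```

Note that this sketch hands back the stored object itself. With mutable arrays, an implementation would likely need to return a copy or a read-only view so that a caller cannot alter the cached value.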

R-Symbol Cache

The cache key for compute_r_symbol(spec, permutation) must uniquely determine both the returned array and the returned spec_permuted. A sufficient key is:

key = (
    tuple((e.rep, e.dir) for e in spec.edges),
    tuple(spec.alphas),
    tuple(permutation),
)

3. Thread Safety

Requirement: All cache read and write operations must be thread-safe. Concurrent calls from different threads with the same or different keys must not produce data races, corruption, or incorrect results.

Acceptable strategies include:

  • Per-cache mutex protecting all database operations
  • Connection pool with per-connection serialization
  • Immutable (write-once) on-disk stores
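The first strategy, a per-cache mutex, could be sketched as follows. LockedCache and its table layout are assumptions made for this example and do not describe the reference implementation; the sketch uses Python's sqlite3 with one shared connection whose every operation runs under a single lock.

```python
import sqlite3
import threading

class LockedCache:
    """A tiny key-value cache where one mutex guards all database access."""

    def __init__(self, path=":memory:"):
        self._lock = threading.Lock()
        # check_same_thread=False lets several threads share the connection;
        # the lock below is what actually serializes their access.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entries (key TEXT PRIMARY KEY, value BLOB)"
        )

    def get(self, key):
        with self._lock:
            row = self._conn.execute(
                "SELECT value FROM entries WHERE key = ?", (key,)
            ).fetchone()
        return None if row is None else row[0]

    def put(self, key, value):
        with self._lock:
            self._conn.execute(
                "INSERT OR REPLACE INTO entries (key, value) VALUES (?, ?)",
                (key, value),
            )
            self._conn.commit()

cache = LockedCache()

def worker(n):
    for i in range(20):
        cache.put(repr((n, i)), bytes([n, i]))   # distinct keys per thread
        cache.put("shared", b"same")             # same key from every thread

threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert cache.get(repr((3, 19))) == bytes([3, 19])
assert cache.get("shared") == b"same"
```

A single lock is the simplest option but serializes all readers as well; a connection pool trades that simplicity for more concurrency.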

4. Persistence

Recommendation: Caches should be persistent across Python sessions (i.e. stored on disk rather than in memory only). This is a recommendation, not a hard requirement.

If a persistent cache is implemented:

  • The cache file format must be versioned or include a format identifier so that incompatible format changes can be detected and the old cache discarded safely.
  • The implementation must handle a corrupted or missing cache file gracefully, falling back to fresh computation.
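One way to meet both points with SQLite is to store a format identifier in the database's user_version field and to treat any database error on open as a signal to discard the file. The sketch below assumes that approach; open_cache, FORMAT_VERSION, and the entries table are hypothetical names, not the reference implementation's schema.

```python
import os
import sqlite3
import tempfile

FORMAT_VERSION = 1  # hypothetical format identifier for this sketch

def open_cache(path):
    """Open the cache at `path`, discarding it if corrupted or incompatible.

    Returns a connection, or None if no usable cache could be created, in
    which case the caller should fall back to fresh computation.
    """
    for _ in range(2):  # the second pass runs after a bad file was removed
        conn = None
        try:
            conn = sqlite3.connect(path)
            (version,) = conn.execute("PRAGMA user_version").fetchone()
            if version not in (0, FORMAT_VERSION):  # 0 means a brand-new file
                raise sqlite3.DatabaseError("incompatible cache format")
            conn.execute(
                "CREATE TABLE IF NOT EXISTS entries (key TEXT PRIMARY KEY, value BLOB)"
            )
            conn.execute(f"PRAGMA user_version = {FORMAT_VERSION}")
            conn.commit()
            return conn
        except sqlite3.DatabaseError:
            if conn is not None:
                conn.close()
            try:
                os.remove(path)  # discard the unusable cache file
            except OSError:
                pass
    return None

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.db")
    with open(path, "wb") as fh:
        fh.write(b"not a database" * 100)  # simulate a corrupted cache file
    conn = open_cache(path)
    recovered = conn is not None
    if conn is not None:
        conn.close()

assert recovered
```

Discarding the whole file is the bluntest recovery strategy. It is safe here only because every cached value can be recomputed from its key.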

5. Cache Invalidation

The Yuzuha Protocol does not specify when a cache must be invalidated. In practice, a cache remains valid as long as the implementations of canonical_basis, compute_x_symbol, and compute_r_symbol do not change in a way that alters their outputs for the same inputs.

If an implementation update changes outputs, it must either:

  • Increment a format version tag in the cache key space, or
  • Provide a documented procedure for clearing the old cache (e.g. deleting the database file or calling a clear_caches function)
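The first option can be as small as prepending a version number to every key, so that entries written by an older implementation are never looked up again. A minimal sketch, with versioned_key and IMPLEMENTATION_VERSION as hypothetical names:

```python
IMPLEMENTATION_VERSION = 1  # hypothetical tag, bumped whenever outputs change

def versioned_key(base_key, version=IMPLEMENTATION_VERSION):
    """Prefix a cache key with the version tag of the code that produced it."""
    return (version,) + tuple(base_key)

store = {}
base = (((1, 1), (1, 1)), ())            # e.g. a canonical basis key
store[versioned_key(base, version=1)] = "result from version 1"

# After a bump to version 2, the old entry is no longer reachable.
assert versioned_key(base, version=2) not in store
assert versioned_key(base, version=1) in store
```

Stale entries stay in storage until they are cleared, so this approach trades disk space for not needing an explicit migration step.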


6. Isolation for Testing

Requirement: If a caching layer is implemented, the implementation must provide a mechanism for test code to use an isolated, temporary cache so that tests do not interfere with production data or with each other.

In the SU(2) reference implementation, this is provided by TestCacheContext:

with yuzuha.TestCacheContext():
    # All cache operations use a unique temporary directory
    x, spec_c = yuzuha.compute_x_symbol(spec_a, spec_b, contraction)
# Temporary cache cleaned up on exit
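For implementations that need to provide their own equivalent, a context manager of this kind can be built from the standard library. The sketch below is not how TestCacheContext is implemented; temporary_cache, get_cache_dir, and the default path are hypothetical, and the sketch assumes the cache location is held in a single piece of module state.

```python
import contextlib
import os
import tempfile

# Hypothetical module state holding the active cache directory.
_state = {"cache_dir": os.path.join(os.path.expanduser("~"), ".example_cache")}

def get_cache_dir():
    """Return the directory that cache operations should currently use."""
    return _state["cache_dir"]

@contextlib.contextmanager
def temporary_cache():
    """Redirect the cache to a fresh temporary directory for the `with` block."""
    previous = _state["cache_dir"]
    with tempfile.TemporaryDirectory() as tmp:
        _state["cache_dir"] = tmp
        try:
            yield tmp
        finally:
            _state["cache_dir"] = previous  # restore even if the test fails

before = get_cache_dir()
with temporary_cache() as tmp:
    inside = get_cache_dir()
    existed = os.path.isdir(tmp)
after = get_cache_dir()

assert inside == tmp and existed
assert after == before and not os.path.exists(tmp)
```

Because the state is process-wide, this sketch isolates tests that run one after another; tests running in parallel threads of the same process would need per-thread state instead.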

7. Cache Management API

If a persistent cache is implemented, the following utility functions are recommended (present in the SU(2) reference implementation):

| Function | Description |
|---|---|
| set_cache_path(path) | Override the cache directory location |
| reset_caches() | Close in-memory cache connections and reset state |
| clear_all_caches() | Delete all cached entries (keep schema) |
| get_cache_stats() → dict | Return entry counts and file sizes per cache |
| print_cache_stats() | Print a human-readable cache summary |

8. SU(2) Reference Implementation Summary

| Cache | Storage | File | Key | Value |
|---|---|---|---|---|
| Canonical basis | SQLite (Rust) | cgbasis.db | (external edges, internal spins) | ndarray in .npy format |
| X-symbol | SQLite (Python) | xsymbol.db | (spec_a, spec_b, contraction) | (ndarray, spec_c) |
| R-symbol | SQLite (Python) | rsymbol.db | (spec, permutation) | (ndarray, spec_permuted) |

See Database and Cache Management for the full Python API.