Implement memo garbage collection #137

Merged
merged 2 commits into from
Mar 27, 2024

nicholasjng
Collaborator

@nicholasjng nicholasjng commented Mar 27, 2024

Two steps:

  1. Add back the compressed parameter representation to nnbench's runner. There is unfortunately no other way around this, since we have no way out after the record gets persisted.
  2. Implement another memo cache API, getting a memo ID (or None) for a memoized value. This is needed for eviction in teardown tasks, which are passed the memo values and not the memos themselves.

The documentation on memos was amended to showcase a dealloc in a teardown task.

The repro on a ~1GB array:

import gc
import logging
import numpy as np

import nnbench
from nnbench.types import Memo, cached_memo
from nnbench.types.memo import evict_memo, get_memo_by_value, memo_cache_size


logging.basicConfig()
logger = logging.getLogger("nnbench")
logger.setLevel(logging.DEBUG)


class MyMemo(Memo[np.ndarray]):
    @cached_memo
    def __call__(self) -> np.ndarray:
        return np.random.random_sample((10000, 10000))


def tearDown(state, params):
    logger.debug(f"Current memo cache size: {memo_cache_size()}")
    logger.debug("Evicting memo for benchmark parameter 'a':")
    m = get_memo_by_value(params["a"])
    if m is not None:
        evict_memo(m)
        gc.collect()
    logger.debug(f"New memo cache size: {memo_cache_size()}")


@nnbench.product(a=[MyMemo(), MyMemo(), MyMemo(), MyMemo()], tearDown=tearDown)
def matrixmult(a: np.ndarray, b: np.ndarray):
    return a @ b


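if __name__ == "__main__":
    # Right-hand operand of the matrix-vector product, passed as parameter "b".
    rhs = np.random.random_sample((10000,))

    runner = nnbench.BenchmarkRunner()
    res = runner.run(__name__, params={"b": rhs})
    # Print the parameters as stored in the first benchmark record.
    print(res.benchmarks[0]["parameters"])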
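
The last line of the repro prints the parameters stored in the benchmark record. One way a record can avoid keeping large values alive is to store a compact stand-in for each parameter instead of the value itself. The sketch below illustrates that general idea only. It is an assumption, not nnbench's code, and `compress_params` and its placeholder format are hypothetical.

```python
from typing import Any


def compress_params(params: dict[str, Any]) -> dict[str, Any]:
    """Keep small scalars as-is; replace everything else by a short description."""
    compressed: dict[str, Any] = {}
    for name, value in params.items():
        if value is None or isinstance(value, (bool, int, float, str)):
            compressed[name] = value
        else:
            # Store only the type name, so the record holds no reference
            # to the (possibly very large) payload.
            t = type(value)
            compressed[name] = f"<{t.__module__}.{t.__qualname__}>"
    return compressed


record_params = compress_params({"a": bytearray(1024), "n": 3, "tag": "run-1"})
```

Because the compressed dict contains only scalars and strings, persisting or retaining the record does not extend the lifetime of the original parameter values.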

Image proof that it works (from the memray flamegraph):
[Screenshot of the memray flamegraph, taken 2024-03-27 at 15:46:13]

(The deallocs are the orange downward spikes.)
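
The same effect can be observed on a small scale with the standard library alone. The following is a minimal sketch that uses a plain dict as a stand-in for the memo cache and `tracemalloc` instead of memray, so it is not how the screenshot was produced. It only shows that traced memory drops once the cache gives up the last reference to a value.

```python
import gc
import tracemalloc

cache = {}  # stand-in for the memo cache (hypothetical)

tracemalloc.start()
cache["a"] = bytearray(10_000_000)  # ~10 MB payload referenced only by the cache
held, _ = tracemalloc.get_traced_memory()

del cache["a"]  # drop the last reference, as an eviction would
gc.collect()  # only strictly needed if the value were part of a reference cycle
released, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

freed = held - released
print(f"held: {held} bytes, after eviction: {released} bytes")
```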

Closes #105.

Unfortunately, there is no longer any other way to get garbage collection.
@nicholasjng nicholasjng added the enhancement New feature or request label Mar 27, 2024
@nicholasjng nicholasjng self-assigned this Mar 27, 2024
@nicholasjng nicholasjng linked an issue Mar 27, 2024 that may be closed by this pull request
@nicholasjng nicholasjng merged commit 7df827d into main Mar 27, 2024
5 checks passed
@nicholasjng nicholasjng deleted the memo-gc branch March 27, 2024 14:49