Cache processed activities #42

Merged
3 commits merged into marcusvolz:main on Feb 11, 2024

hugovk
Collaborator

@hugovk hugovk commented Jan 14, 2024

This speeds up the "Processing data..." step by caching the generated pandas DataFrame as a pickle file on disk.

For example, on an 8-core Mac, processing all 3,699 of my GPX files takes 34 s on the first pass and creates a 305 MB cache file on disk (the GPX files total 822 MB). On the second run, loading the cache file takes less than 2 s.

For 580 GPX files from 2023, the first pass takes 4 s and creates a 50 MB cache file.
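
To illustrate the general load-or-compute pattern described above, here is a minimal sketch. It is not the code from this PR: the names `load_or_process`, `expensive_parse`, and `activities.pkl` are hypothetical, and it uses the standard-library `pickle` module on a plain dict so that it runs without extra dependencies. With pandas, `DataFrame.to_pickle` and `pandas.read_pickle` would play the same roles.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


def load_or_process(cache_path: Path, process: Callable[[], Any]) -> Any:
    """Return the cached result if present, otherwise compute and cache it.

    Hypothetical helper for illustration only.
    """
    if cache_path.exists():
        # Fast path: deserialize the previously processed result.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    # Slow path: run the expensive step, then save the result for next time.
    result = process()
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the expensive step actually runs


def expensive_parse() -> dict:
    """Stand-in for parsing many GPX files (assumed, not from the PR)."""
    calls.append(1)
    return {"lat": [1.0, 2.0], "lon": [3.0, 4.0]}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "activities.pkl"
    first = load_or_process(cache, expensive_parse)   # computes and writes the cache
    second = load_or_process(cache, expensive_parse)  # reads the cache instead
```

This sketch never invalidates the cache, so a real implementation would also need some way to detect that the input files have changed. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.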

Also adds some type hints.

@hugovk hugovk added the enhancement label Jan 14, 2024
@hugovk hugovk force-pushed the cache-processed-activities branch from e99c44c to 0e705cd Compare February 9, 2024 16:54
@marcusvolz
Owner

Fantastic!

@hugovk hugovk merged commit 157d0de into marcusvolz:main Feb 11, 2024
19 checks passed
@hugovk hugovk deleted the cache-processed-activities branch February 11, 2024 11:42