Memory usage by setupDiscovrExperiment and clusterDiscovrExperiment #8

Open
mjdufort opened this issue Feb 4, 2022 · 0 comments

These two functions have very high peak memory usage, which limits the size of the datasets that can be analyzed with these tools. I suggest changing how the data are stored in order to reduce peak memory usage. Possible solutions include:

  1. keeping cached versions of the data on disk rather than in system memory
  2. optimizing the deduplication steps in clusterDiscovrExperiment
  3. running the clustering step on one sample at a time, without keeping the other samples' data in system memory
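
A rough sketch of how options 1 and 3 could fit together, written in Python purely to illustrate the pattern. It is not the package's code: `cache_samples`, `cluster_per_sample`, and `toy_cluster` are hypothetical names, and the "clustering" is a placeholder that just counts values on each side of 0.5. The idea is to write each sample to its own file as it arrives, then load, cluster, and discard one sample at a time, so peak memory is roughly one sample plus the small per-sample summaries.

```python
import pickle
import tempfile
from pathlib import Path


def cache_samples(samples, cache_dir):
    """Write each (sample_id, events) pair to its own file; return {sample_id: path}.

    `samples` can be a generator, so the full dataset never has to be in memory.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for sample_id, events in samples:
        path = cache_dir / f"{sample_id}.pkl"
        with path.open("wb") as fh:
            pickle.dump(events, fh)
        paths[sample_id] = path
    return paths


def cluster_per_sample(paths, cluster_fn):
    """Load one cached sample at a time, cluster it, and keep only the result."""
    results = {}
    for sample_id, path in paths.items():
        with path.open("rb") as fh:
            events = pickle.load(fh)
        results[sample_id] = cluster_fn(events)
        del events  # drop the reference before the next sample is loaded
    return results


def toy_cluster(events):
    """Hypothetical stand-in for a real clustering step."""
    high = sum(1 for value in events if value >= 0.5)
    return {"low": len(events) - high, "high": high}


def sample_stream():
    """Hypothetical data source that yields one sample at a time."""
    yield "sampleA", [0.1, 0.7, 0.9]
    yield "sampleB", [0.2, 0.3, 0.8, 0.6]


with tempfile.TemporaryDirectory() as tmp:
    paths = cache_samples(sample_stream(), tmp)
    summaries = cluster_per_sample(paths, toy_cluster)
```

The trade-off is extra disk I/O in exchange for a lower memory ceiling; whether that is worthwhile depends on sample size and how often the cached data are re-read.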