Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reduce memory_footprint for sparse PCA transform #5964

Merged
merged 3 commits into from
Jul 28, 2024

Conversation

Intron7
Copy link
Contributor

@Intron7 Intron7 commented Jul 12, 2024

The sparse PCA still densified X during the transform step. This defeats the purpose of a sparse PCA in a sense. However

precomputed_mean_impact = self.mean_ @ self.components_.T
mean_impact = cp.ones((X.shape[0], 1)) @ precomputed_mean_impact.reshape(1, -1)
X_transformed = X.dot(self.components_.T) -mean_impact

is the same as

X = X - self.mean_
X_transformed = X.dot(self.components_.T)

The new implementation is faster (but mainly due to the fact that we don't have to rely on cupy's to_array()) and uses a lot less memory.

@Intron7 Intron7 requested a review from a team as a code owner July 12, 2024 14:35
Copy link

copy-pr-bot bot commented Jul 12, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the Cython / Python Cython or Python issue label Jul 12, 2024
@lowener
Copy link
Contributor

lowener commented Jul 15, 2024

/ok to test

@lowener lowener added 3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jul 15, 2024
@Intron7
Copy link
Contributor Author

Intron7 commented Jul 15, 2024

ok I added the right year in the cupyright

@lowener
Copy link
Contributor

lowener commented Jul 15, 2024

/ok to test

@Intron7
Copy link
Contributor Author

Intron7 commented Jul 17, 2024

I think I can't fix the issue on the PR since it's in regards to the new layout

@dantegd
Copy link
Member

dantegd commented Jul 17, 2024

@Intron7 yeah, I think the issue is that we just merged the PR to nest the python package one level lower #5944 so it should be an easy resolution of conflict I hope

@Intron7 Intron7 closed this Jul 18, 2024
@Intron7 Intron7 force-pushed the sparse_pca-update branch from 121945b to 99b20f6 Compare July 18, 2024 10:07
@Intron7 Intron7 reopened this Jul 18, 2024
@Intron7
Copy link
Contributor Author

Intron7 commented Jul 18, 2024

@lowener I merged in the updated main branch

@dantegd
Copy link
Member

dantegd commented Jul 20, 2024

/ok to test

@dantegd
Copy link
Member

dantegd commented Jul 28, 2024

/ok to test

@dantegd dantegd changed the title reduce memory_footprint for transform reduce memory_footprint for sparse PCA transform Jul 28, 2024
@dantegd
Copy link
Member

dantegd commented Jul 28, 2024

/ok to test

@dantegd
Copy link
Member

dantegd commented Jul 28, 2024

/merge

@rapids-bot rapids-bot bot merged commit d4535d2 into rapidsai:branch-24.08 Jul 28, 2024
61 of 63 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants