adding sparse support to TreeSHAP in lightgbm #3000
@imatiach-msft ping me when this is ready to review.
@guolinke thank you, pinging as this PR is ready for review, also tagging @slundberg for review
I will look once they rebuild. Sorry, they are still new and GitHub Actions is also still a bit rough to work with.
@imatiach-msft I see the two Windows R4.0 builds are now failing. From the logs, I don't think it's a result of this PR. Will try to reproduce tonight and get it resolved quickly.
@jameslamb FYI, there are a lot of PRs failing with these GitHub Actions jobs (R 4).
gah! They don't look like networking issues, but I'm not sure. Investigating in #3191
Ok @imatiach-msft now that we've merged #3193, I think if you merge that into this branch the R CI jobs will be working
@jameslamb done, thanks!
close-reopen for CI, getting ".ci/test.sh: line 150: pytest: command not found"
which job had that failure? Could you share a link?
great to see this merged, thank you for the great reviews!
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Adding sparse support to the TreeSHAP algorithm in LightGBM. When the input is a sparse matrix, the feature importances should be returned as a sparse matrix as well. This should improve both performance and memory usage for very sparse datasets.
Unlike the other predict APIs, the size of the sparse matrix result is hard to determine before prediction runs, so we allocate the output arrays on the native side and expose an additional API to deallocate them.
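
To illustrate why the output size is only known after the contributions have been computed, here is a minimal sketch in plain Python. It is not the code from this PR: `contribs_to_csr` is a hypothetical helper and the input values are made up. It packs per-row contribution vectors into CSR-style arrays, and the number of stored entries falls out of the loop rather than being known up front.

```python
def contribs_to_csr(rows):
    """Pack per-row contribution vectors into CSR arrays.

    Hypothetical helper for illustration only; not a LightGBM API.
    Returns (indptr, indices, data) in the usual CSR layout.
    """
    indptr = [0]   # indptr[i]..indptr[i+1] delimits row i in indices/data
    indices = []   # column index of each stored value
    data = []      # the stored (nonzero) contribution values
    for row in rows:
        for col, value in enumerate(row):
            if value != 0.0:
                indices.append(col)
                data.append(value)
        # The running total of stored entries is only known at this point.
        indptr.append(len(indices))
    return indptr, indices, data


# Made-up contributions: 2 samples, 3 features plus a final expected-value column.
rows = [
    [0.0, 0.5, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.1],
]
indptr, indices, data = contribs_to_csr(rows)
nnz = indptr[-1]  # number of stored entries, discovered after the pass
```

Because `nnz` depends on which features actually contribute for each row, a caller cannot preallocate exact-size buffers the way it can for a dense result of fixed shape, which is the motivation for allocating on the native side and pairing that with an explicit deallocation call.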