adding sparse support to TreeSHAP in lightgbm #3000

imatiach-msft · 2020-04-16T04:48:08Z

Adds sparse support to the TreeSHAP algorithm in LightGBM. The feature importances computed for a sparse matrix are now returned as a sparse matrix as well, which should improve both performance and memory usage for very sparse datasets.
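To illustrate why a sparse result helps, here is a minimal sketch in plain Python of the CSR layout (`indptr`, `indices`, `data`). The helper `dense_to_csr` and the sample values are hypothetical and are not taken from this PR. The layout assumes one column per feature plus a final expected-value column, as in LightGBM's dense contribution output.

```python
from typing import List, Tuple


def dense_to_csr(rows: List[List[float]]) -> Tuple[List[int], List[int], List[float]]:
    """Convert dense rows into CSR arrays (indptr, indices, data)."""
    indptr = [0]
    indices: List[int] = []
    data: List[float] = []
    for row in rows:
        for col, value in enumerate(row):
            if value != 0.0:
                indices.append(col)
                data.append(value)
        # indptr[i + 1] marks where row i ends in indices/data.
        indptr.append(len(indices))
    return indptr, indices, data


# Hypothetical contributions: 3 rows, 5 features plus a final expected-value column.
contribs = [
    [0.0, 0.25, 0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.5],
    [-0.125, 0.0, 0.0, 0.0, 0.0, 0.5],
]
indptr, indices, data = dense_to_csr(contribs)
dense_cells = sum(len(row) for row in contribs)  # cells a dense result would store
stored_values = len(data)                        # values the CSR result stores
```

With these made-up values the dense result holds 18 cells while the CSR arrays hold 5 values, and the gap widens as the number of mostly-zero feature columns grows.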

Unlike the other predict APIs, the size of the sparse matrix result is hard to determine before prediction, so we allocate the data on the native side and expose an additional API to deallocate the sparse matrix arrays.
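The ownership pattern described above can be sketched in plain Python. Everything here is a hypothetical stand-in: `FakeNativeLib`, `predict_sparse`, `read`, `free`, and `predict_sparse_copy` are illustrative names and not LightGBM's C API. The point is only that the producer allocates output whose size it learns during prediction, and the caller copies the result and then releases it through a dedicated call.

```python
from typing import Dict, List, Tuple

Csr = Tuple[List[int], List[int], List[float]]


class FakeNativeLib:
    """Hypothetical stand-in for a native library that owns its output buffers."""

    def __init__(self) -> None:
        self._live: Dict[int, Csr] = {}  # handle -> buffers still allocated
        self._next_handle = 1

    def predict_sparse(self, rows: List[List[float]]) -> int:
        # The number of stored values is only known after scanning every row,
        # so the library allocates the output itself and hands back a handle.
        indptr: List[int] = [0]
        indices: List[int] = []
        data: List[float] = []
        for row in rows:
            for col, value in enumerate(row):
                if value != 0.0:
                    indices.append(col)
                    data.append(value)
            indptr.append(len(indices))
        handle = self._next_handle
        self._next_handle += 1
        self._live[handle] = (indptr, indices, data)
        return handle

    def read(self, handle: int) -> Csr:
        return self._live[handle]

    def free(self, handle: int) -> None:
        # Explicit deallocation entry point; call it exactly once per handle.
        del self._live[handle]


def predict_sparse_copy(lib: FakeNativeLib, rows: List[List[float]]) -> Csr:
    handle = lib.predict_sparse(rows)
    try:
        indptr, indices, data = lib.read(handle)
        # Copy into caller-owned lists before releasing the library's buffers.
        return list(indptr), list(indices), list(data)
    finally:
        lib.free(handle)


lib = FakeNativeLib()
result = predict_sparse_copy(lib, [[0.0, 1.5], [0.0, 0.0], [2.0, 0.0]])
leaked = len(lib._live)  # buffers the library still holds after the call
```

Wrapping the copy in `try`/`finally` keeps the release call paired with the allocation even if the copy step fails.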

include/LightGBM/c_api.h

src/application/predictor.hpp

guolinke · 2020-04-25T04:31:26Z

@imatiach-msft ping me when this is ready to review.

imatiach-msft · 2020-04-30T21:51:08Z

@guolinke thank you, pinging as this PR is ready for review, also tagging @slundberg for review

src/boosting/gbdt.cpp

jameslamb · 2020-06-26T15:10:34Z

> @jameslamb for some reason I still see "GitHub Actions / r-package (windows-latest, MINGW, R 3.6) (pull_request)" builds fail

I will look once they rebuild. Sorry, they are still new and GitHub Actions is also still a bit rough to work with.

jameslamb · 2020-06-26T19:52:46Z

@imatiach-msft I see the two Windows R4.0 builds are now failing. From the logs, I don't think it's a result of this PR. Will try to reproduce tonight and get it resolved quickly.

StrikerRUS · 2020-06-26T20:08:06Z

@jameslamb FYI, there are a lot of PRs failing with these GitHub Actions jobs (R 4). master is also failing. Network issues again?..

jameslamb · 2020-06-27T02:19:03Z

> @jameslamb FYI, there are a lot of PRs failing with these GitHub Actions jobs (R 4). master is also failing. Network issues again?..

gah! They don't look like networking issues, but I'm not sure. Investigating in #3191

jameslamb · 2020-06-27T20:25:15Z

Ok @imatiach-msft, now that we've merged #3193, I think if you merge that into this branch the R CI jobs will be working

imatiach-msft · 2020-06-28T03:06:32Z

@jameslamb done, thanks!

imatiach-msft · 2020-06-28T04:20:06Z

close-reopen for CI, getting ".ci/test.sh: line 150: pytest: command not found"

jameslamb · 2020-06-28T04:30:53Z

> close-reopen for CI, getting ".ci/test.sh: line 150: pytest: command not found"

which job had that failure? Could you share a link?

imatiach-msft · 2020-07-06T14:23:17Z

great to see this merged, thank you for the great reviews!

github-actions · 2023-08-24T12:20:53Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

imatiach-msft requested review from btrotta, chivee and guolinke as code owners April 16, 2020 04:48

jameslamb added the feature label Apr 20, 2020

imatiach-msft force-pushed the ilmat/feature-imp-sparse branch 2 times, most recently from 39e14fc to 44f9bf9 Compare April 24, 2020 21:45

guolinke reviewed Apr 25, 2020

imatiach-msft force-pushed the ilmat/feature-imp-sparse branch from 44f9bf9 to 166dd64 Compare April 28, 2020 21:37

imatiach-msft requested review from henry0312, StrikerRUS and wxchan as code owners April 28, 2020 21:37

imatiach-msft force-pushed the ilmat/feature-imp-sparse branch 2 times, most recently from b8a932b to 6f7439d Compare April 30, 2020 02:59

imatiach-msft requested review from jameslamb and Laurae2 as code owners April 30, 2020 02:59

imatiach-msft force-pushed the ilmat/feature-imp-sparse branch 6 times, most recently from b6c6af1 to 3f9760b Compare April 30, 2020 21:44

imatiach-msft changed the title ~~[WIP] adding sparse support to TreeSHAP in lightgbm~~ adding sparse support to TreeSHAP in lightgbm Apr 30, 2020

imatiach-msft force-pushed the ilmat/feature-imp-sparse branch 4 times, most recently from 4041429 to 9d37bb9 Compare May 1, 2020 21:53

imatiach-msft added 12 commits June 27, 2020 23:04

920a578 adding sparse support to TreeSHAP in lightgbm
4e1f4e7 updating based on comments
342ebaf updated based on comments, used fromiter instead of frombuffer
4f26c9d updated based on comments
bda9c9d fixed limits import order
4286737 fix sparse feature contribs to work with more than int32 max rows
6d616c6 really fixed int64 max error and build warnings
bb843fc added sparse test with >int32 max rows
998d71b fixed python side reshape check on sparse data
e773d9b updated based on latest comments
b33e78e fixed comments
777ac74 added CSC INT32_MAX validation to test, fixed comments

imatiach-msft force-pushed the ilmat/feature-imp-sparse branch from 92801a2 to 777ac74 Compare June 28, 2020 03:04

imatiach-msft closed this Jun 28, 2020

imatiach-msft reopened this Jun 28, 2020

StrikerRUS merged commit 9f367d1 into microsoft:master Jun 28, 2020

imatiach-msft mentioned this pull request Sep 11, 2020

fix sparse multiclass local feature contributions and add test #3382

Merged

StrikerRUS mentioned this pull request Jan 28, 2021

[dask] Add type hints in Dask package #3866

Merged

jameslamb mentioned this pull request Jun 14, 2021

[dask] Make output of feature contribution predictions for sparse matrices match those from sklearn estimators (fixes #3881) #4378

Merged

github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
