Support for SHAP #219
Comments
Hi @mortonjt, thanks for opening this feature request! Adding a SHAP wrapper has been on the unwritten issue list for some time now 😁 Would you be interested in working on this method?
Technically this would be possible but maybe not desirable, as it complicates installation. How large are the SHAP packages? I think we should make SHAP a dependency if the license is compatible, and as long as it does not introduce conflicts. CC: @ebolyen @misialq for any thoughts on this.
Yes! I agree: output the SHAP values, and these can then be passed to various other plots... this also gives more flexibility in case other relevant visualization options are added in other Q2 plugins. cc: @adamovanja
Closing this issue since the related PRs have been closed.
Re-opening as this has been requested again on the QIIME 2 forum.
Addition Description
SHAP (SHapley Additive exPlanations) is one of the state-of-the-art methods for computing feature importance using concepts from game theory.
Briefly, for each prediction, SHAP estimates how much each feature contributed to that prediction by computing leave-one-feature-out contributions across all possible subsets of features (making it optimal while remaining scalable). Shapley values can be positive or negative, indicating whether a feature contributed "positively" or "negatively" to a prediction. See the original paper for details, as well as the follow-up solution for tree-ensemble methods.
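For reference, the quantity being estimated is the Shapley value of feature $i$: the weighted average of its marginal contribution over all subsets $S$ of the remaining features, where $N$ is the full feature set and $v(S)$ denotes the model's prediction using only the features in $S$:

$$
\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr]
$$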
Current Behavior
Feature importance is currently estimated by leave-one-feature-out estimation on only the full table (i.e., for 1000 features, importance is computed from 1000 iterations, each leaving out a single feature). The resulting importances are strictly positive, so directionality cannot be inferred; it is also suboptimal.
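For context, a minimal sketch of what a strictly positive, global importance score looks like, using scikit-learn's built-in impurity-based importances as a stand-in (an assumption for illustration, not necessarily the plugin's actual implementation; `model` and `X` are illustrative names):

```python
# Hedged sketch (assumed scikit-learn workflow, not the plugin's code):
# global importances are non-negative, with no per-prediction sign.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# One strictly non-negative score per feature; directionality is lost.
print(model.feature_importances_)
```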
Proposed Behavior
It would be useful to have a separate method that computes Shapley values for Gradient Boosting or Random Forest classifiers.
The syntax is simple, requiring only two lines of additional code after fitting the model (see here); I have verified that this code is functional. A hedged sketch follows below.
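The sketch assumes the `shap` package and a fitted scikit-learn tree ensemble; `model` and `X` are illustrative names, not the plugin's API:

```python
# Hedged sketch, not the plugin's actual code: compute per-sample,
# per-feature Shapley values for a fitted tree ensemble with `shap`.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# The two additional lines: signed contributions of each feature
# to each individual prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
```

The resulting `shap_values` (one row per sample, one signed value per feature; some multi-class ensembles return a list of such arrays) could then be saved as an artifact and passed to downstream Q2 visualizers, per the discussion above.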
Questions
Comments
References
- Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems 30.
- Lundberg, S. M., et al. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2, 56–67.