Joint feature importance #156

Merged
christophM merged 1 commit into giuseppec:master on Jan 12, 2021
@grantirv commented Jan 7, 2021

Motivation
Sometimes features have a natural grouping, for example when they are all sourced from a specific data provider. It is therefore useful to know the joint importance of one or more groups of features. If that data provider is very expensive, for instance, you may want to decide on a cost/benefit basis whether you can drop those features from your model.

Additionally, for large datasets, the feature importance calculation is computationally expensive. It would therefore be useful to specify which features you want importance scores calculated on rather than have to calculate the full set.

Implementation
This PR adds an argument features to the FeatureImp$new call. It works as follows:

  • If NULL (the default), behavior is unchanged, i.e. scores are calculated for all features individually.
  • If it is a character vector of feature names, scores are calculated only for that subset of features.
  • If it is a list of character vectors, the joint importance of each group is calculated.
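
The three cases above can be illustrated with a small, language-agnostic sketch. This is not the code added by this PR (which is in R, in R/FeatureImp.R); it is a minimal pure-Python illustration of the idea, assuming that joint importance is obtained by permuting all columns of a group together with the same row shuffle. The function name `joint_permutation_importance`, the use of a dict for named groups, and the choice to report importance as the difference in loss are all assumptions made for this example.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def joint_permutation_importance(predict, rows, y, features=None, loss=mae, seed=0):
    """Hypothetical helper: permutation importance for features or feature groups.

    rows     -- list of dicts mapping feature name -> value
    features -- None (every feature individually), a list of feature names
                (only that subset), or a dict label -> list of names (groups)
    Returns a dict mapping each feature/group label to
    (loss after permutation) - (baseline loss).
    """
    rng = random.Random(seed)
    if features is None:
        groups = {name: [name] for name in rows[0]}
    elif isinstance(features, dict):
        groups = {label: list(names) for label, names in features.items()}
    else:
        groups = {name: [name] for name in features}

    baseline = loss(y, [predict(r) for r in rows])
    scores = {}
    for label, names in groups.items():
        # One shuffle per group: every column in the group takes its values
        # from the same donor row, so the group is permuted jointly.
        order = list(range(len(rows)))
        rng.shuffle(order)
        permuted = []
        for i, j in enumerate(order):
            row = dict(rows[i])
            for name in names:
                row[name] = rows[j][name]
            permuted.append(row)
        scores[label] = loss(y, [predict(r) for r in permuted]) - baseline
    return scores


# Toy data: the model uses "a" and "b" but ignores "c".
rows = [{"a": float(i), "b": float((7 * i) % 50), "c": float(i % 3)} for i in range(50)]
predict = lambda r: 2.0 * r["a"] + 3.0 * r["b"]
y = [predict(r) for r in rows]

all_scores = joint_permutation_importance(predict, rows, y)                    # default: all features
subset_scores = joint_permutation_importance(predict, rows, y, features=["a"])  # subset only
group_scores = joint_permutation_importance(
    predict, rows, y, features={"provider_x": ["a", "b"], "unused": ["c"]}      # joint groups
)
```

In this toy setup the group containing only the ignored feature scores zero, while the group the model depends on gets a positive score. Restricting `features` to a subset also skips the permutation and prediction passes for everything else, which is where the saving on large datasets would come from.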

Allows for the calculation of joint importance scores for one or more groups of features. You can now specify which features or groups you want importance scores calculated for. This is useful for large datasets where permuting all features is computationally expensive.
@codecov-io

Codecov Report

Merging #156 (b446afe) into master (9fdecc0) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #156      +/-   ##
==========================================
+ Coverage   90.66%   90.72%   +0.06%     
==========================================
  Files          17       17              
  Lines        1628     1639      +11     
==========================================
+ Hits         1476     1487      +11     
  Misses        152      152              

| Impacted Files | Coverage Δ |
|---|---|
| R/FeatureImp.R | 97.36% <100.00%> (+0.28%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@grantirv marked this pull request as ready for review January 7, 2021 19:33
@christophM
Collaborator

That's awesome and a new feature many people might be looking for.

@christophM merged commit 814cdb9 into giuseppec:master Jan 12, 2021