Anchor documentation improvements #711
Codecov Report
@@ Coverage Diff @@
## master #711 +/- ##
==========================================
+ Coverage 80.55% 81.06% +0.51%
==========================================
Files 105 105
Lines 11790 11794 +4
==========================================
+ Hits 9497 9561 +64
+ Misses 2293 2233 -60
Probability for a pixel to be represented by the average value of its superpixel.
Probability for a pixel to be represented by the average value of its superpixel. The missingness of a
superpixel (i.e. querying the model on a reduced input) is simulated by randomly turning it on and off
with a probability `p_sample`.
I think if `images_background` is provided, we replace with the other image, not the average.
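For reference, the two behaviours discussed here can be sketched as follows. This is a minimal illustration, not alibi's implementation: `perturb_image`, `segments` and `background` are hypothetical names, and it assumes that `p_sample` is the probability of switching a superpixel off.

```python
import numpy as np


def perturb_image(image, segments, p_sample, background=None, rng=None):
    """Randomly switch superpixels off and fill them in (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.unique(segments)
    # Assumption: each superpixel is switched off independently with probability p_sample.
    off = rng.random(labels.size) < p_sample
    perturbed = image.astype(float).copy()
    for label, is_off in zip(labels, off):
        if not is_off:
            continue
        mask = segments == label
        if background is None:
            # Fill the superpixel with its own average value (per channel).
            perturbed[mask] = image[mask].mean(axis=0)
        else:
            # Superimpose the matching region of another image instead.
            perturbed[mask] = background[mask]
    return perturbed, off
```

With `background=None` the masked regions are averaged; passing a second image of the same shape reproduces the "replace with the other image" case mentioned in the comment.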
f'Now returning the best non-eligible result. The desired precision threshold might not be '
f'achieved due to the quantile-based discretisation of the numerical features since the '
f'synthetic instances satisfying the anchor are constructed by sampling the numerical '
f'features from their corresponding, potentially large, quantile intervals.')
I'm sort of tempted to suggest just saying something like: "The resolution of bins may be too large to ensure we find an anchor of required precision. Note that higher resolution may not be possible due to sparsity of samples.". But I'm not sure... I think the above is fine as well.
Margin between lower confidence bound and minimum precision of upper bound.
Multi-armed bandit parameter used to select candidate anchors in each iteration. The multi-armed bandit
algorithm tries to find the potentially best (i.e. highest precision) `beam_size` candidate anchors from a
list of anchors created by including a new predicate in the candidate anchors form the previous iteration.
form -> from
the algorithm returns with a probability of at least `1 - delta` an anchor :math:`A` with a precision lower
than the precision of the highest precision anchor in the current iteration, :math:`A^\\star`,
with a maximum error tolerance of `tau`. A bigger value for `tau` means faster convergence but also looser
anchor conditions.
Perhaps mention that the aim of the algorithm is to return an anchor you're confident is good rather than the best anchor you're less confident in. Hence it doesn't necessarily return the best anchor; instead, it manages a trade-off between confidence and precision.
Ignore, I misunderstood.
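For what it's worth, the guarantee described in the docstring above can be written compactly. This is only a sketch in the docstring's notation; the symbol `prec` for anchor precision is introduced here for illustration and does not appear in the diff.

```latex
P\big(\mathrm{prec}(A) \ge \mathrm{prec}(A^\star) - \tau\big) \ge 1 - \delta
```

Read this way, a larger `tau` loosens the condition on the returned anchor, which matches the "faster convergence but also looser anchor conditions" wording.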
coverage_samples
Number of samples used to estimate coverage from during result search.
beam_size
The number of anchors extended at each step of new anchors construction.
The number of anchors extended (i.e. candidate anchors returned by the multi-armed bandit)
Not sure what you mean by extended? Perhaps: "Number of anchors selected by the MAB algorithm in each generation of building the anchors..."
The number of anchors extended at each step of new anchors construction.
The number of anchors extended (i.e. candidate anchors returned by the multi-armed bandit)
at each step of new anchors construction. A bigger beam width can lead to a better overall anchor at the
expense of more computation time.
Perhaps mention that the beam size aims to prevent the anchor being in a local maximum...
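To make the role of `beam_size` concrete, here is a minimal sketch of a single beam-search step. It is not the library's code: `beam_search_step` is a hypothetical helper, and it assumes an exact `precision` scorer is available, whereas the docstring describes precision being estimated by the multi-armed bandit.

```python
from typing import Callable, FrozenSet, List, Sequence


def beam_search_step(
    candidates: Sequence[FrozenSet[str]],
    predicates: Sequence[str],
    precision: Callable[[FrozenSet[str]], float],
    beam_size: int,
) -> List[FrozenSet[str]]:
    """Extend every candidate by one predicate and keep the best beam_size."""
    # Build all one-predicate extensions; the set removes duplicates.
    extended = {c | {p} for c in candidates for p in predicates if p not in c}
    # Rank by precision (highest first), breaking ties deterministically.
    ranked = sorted(extended, key=lambda a: (-precision(a), sorted(a)))
    return ranked[:beam_size]


# Toy scorer (an assumption for illustration): precision is a sum of weights.
weights = {"a": 0.5, "b": 0.3, "c": 0.1}


def toy_precision(anchor: FrozenSet[str]) -> float:
    return sum(weights[p] for p in anchor)


step1 = beam_search_step([frozenset()], list(weights), toy_precision, beam_size=2)
step2 = beam_search_step(step1, list(weights), toy_precision, beam_size=2)
```

With `beam_size=1` this reduces to a greedy search that commits to a single candidate per step; keeping several candidates is what gives the search a chance to escape a local maximum, at the cost of scoring more anchors.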
doc/source/methods/Anchors.ipynb
Outdated
"There are some edge cases that a practitioner should be aware:\n",
"\n",
"- An anchor with many predicates and a small coverage might indicate that the explained input lies near the decision boundary. Many more predicates are needed to ensure that an instance keeps the predicted label since minor perturbations may push the prediction to another class.\n",
"- An empty anchor with a coverage of 1 indicates that there is no salient subset of features that is necessary for the prediction to hold. In other words, with high probability (as measured by the precision), the predicted class of the data point does not change regardless of the perturbations applied to it."
- Perhaps add that this is likely to occur if the data set is very unbalanced
- Also maybe link to the FAQs on this
f'Now returning the best non-eligible result. The desired precision threshold might not be '
f'achieved due to the quantile-based discretisation of the numerical features. The '
f'resolution of the bins may be too large to find an anchor of required precision. '
f'Note that higher resolution may or may not be easily achieved depending on the underling '
underling -> underlying
With respect to the message itself, should we replace the last sentence with something actionable? E.g. mention increasing the number of bins in `disc_perc` (whilst keeping the caveat that it may not help if the numerical feature distributions are skewed).
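To illustrate why more bins can help, here is a small sketch of quantile-based discretisation. It assumes `disc_perc` is a tuple of percentiles used as bin edges; `quantile_bins` and `bin_of` are hypothetical helpers written for this comment, not functions from the library.

```python
import numpy as np


def quantile_bins(values, disc_perc):
    """Return the bin edges implied by the percentiles in disc_perc."""
    return np.percentile(values, disc_perc)


def bin_of(value, edges):
    """Return the (low, high] interval that contains value."""
    idx = np.searchsorted(edges, value, side="left")
    low = -np.inf if idx == 0 else edges[idx - 1]
    high = np.inf if idx == len(edges) else edges[idx]
    return low, high


ages = np.arange(0, 101)  # toy, uniformly spread feature values

coarse = quantile_bins(ages, (25, 50, 75))
fine = quantile_bins(ages, tuple(range(10, 100, 10)))

coarse_interval = bin_of(40, coarse)  # wide interval around the value 40
fine_interval = bin_of(40, fine)      # narrower interval around the same value
```

On this uniform toy feature the finer percentiles shrink the interval that a predicate on the value 40 would cover. On a skewed feature, several percentiles can land close together, so adding bins may not narrow the intervals where it matters, which is the caveat worth keeping in the message.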
delta
Used to compute `beta`.
Significant threshold. `1 - delta` represents the confidence threshold for the anchor precision
Significant -> Significance?
@@ -49,6 +49,40 @@
"As highlighted by the above example, an anchor explanation consists of *if-then rules*, called the anchors, which sufficiently guarantee the explanation locally and try to maximize the area for which the explanation holds. This means that as long as the anchor holds, the prediction should remain the same regardless of the values of the features not present in the anchor. Going back to the sentiment example: as long as *not good* is present, the sentiment is negative, regardless of the other words in the movie review."
- Should we have a "loose" definition of a "predicate" as well before using it in the definitions of precision/coverage?
- With respect to the FAQ link, I think we discussed that the 2nd reason for the empty anchor is actually not possible ("The predicted class of the data point always changes regardless of the perturbations applied to it."). Would you be able to edit the FAQ entry to reflect that?
Thanks Robert, left a few comments.
"For a more intuitive understanding of what the method tries to achieve, we will loosely define a few concepts and explain some insights we get from an anchor explanation.\n",
"\n",
"A **predicate** represents an expression involving a single feature. Some examples of predicates for a tabular dataset having features such as *Age*, *Relationship*, and *Occupation* are: \n",
"\n",
" - `28 < Age < 50`\n",
" - `Relationship = Husband`\n",
" - `Occupation = Blue-Collar`\n",
"\n",
"A **rule** represents a set of predicates connected by the `AND` operator. Considering all the predicate examples above, we can construct the following rule: `28 < Age < 50 AND Relationship = Husband AND Occupation = Blue-Collar`. Note that a rule selects/refers to a particular subpopulation from the given dataset.\n",
"\n",
"We can now define the notion of an **anchor**. Following the definition from [Ribeiro et al. (2018)](https://homes.cs.washington.edu/~marcotcr/aaai18.pdf), \"an **anchor** explanation is a **rule** that sufficiently 'anchors' the prediction locally – such that changes to the rest of the feature values of the instance do not matter\".\n",
"\n",
"As previously mentioned, the power of the Anchors over other local explanations methods comes from the objective formulation which is to maximize the **coverage** under the **precision** constraints. \n",
Nice explanation!
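If a concrete example would help readers, the predicate, rule, coverage and precision definitions could be illustrated along these lines. This is a simplified sketch on made-up data: it scores the rule on the dataset rows themselves, whereas the method estimates precision on sampled instances that satisfy the anchor. All values and variable names are hypothetical.

```python
import numpy as np

# Toy tabular data with the three features used in the notebook text.
age = np.array([30, 45, 60, 35, 29, 52])
relationship = np.array(
    ["Husband", "Husband", "Wife", "Husband", "Own-child", "Husband"]
)
occupation = np.array(
    ["Blue-Collar", "Blue-Collar", "Sales", "Blue-Collar", "Sales", "White-Collar"]
)
# Hypothetical model predictions for each row.
predictions = np.array([1, 1, 0, 0, 0, 1])

# Each predicate is a boolean mask over a single feature.
predicates = [
    (age > 28) & (age < 50),
    relationship == "Husband",
    occupation == "Blue-Collar",
]

# A rule is the AND of its predicates and selects a subpopulation.
rule = np.logical_and.reduce(predicates)

# Coverage: fraction of rows to which the rule applies.
coverage = rule.mean()

# Precision: among covered rows, the fraction with the explained prediction.
target = 1
precision = (predictions[rule] == target).mean()
```

Here the rule covers three of the six rows and two of those three share the target prediction, so adding predicates trades coverage for (hopefully) higher precision.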
doc/source/methods/Anchors.ipynb
Outdated
"\n",
"- An anchor with many predicates and a small coverage might indicate that the explained input lies near the decision boundary. Many more predicates are needed to ensure that an instance keeps the predicted label since minor perturbations may push the prediction to another class.\n",
"- An empty anchor with a coverage of 1 indicates that there is no salient subset of features that is necessary for the prediction to hold. In other words, with high probability (as measured by the precision), the predicted class of the data point does not change regardless of the perturbations applied to it. This behaviour can be typical for very imbalanced datasets.\n",
"\n",
"Check [FAQ](https://docs.seldon.io/projects/alibi/en/stable/overview/faq.html#anchor-explanations) for further details."
"Check [FAQ](https://docs.seldon.io/projects/alibi/en/latest/overview/faq.html#anchor-explanations) for further details."
Not sure how this sneaked in, but it should be `stable`.
Looks good. I think I confused myself: the FAQ has already dropped the wrong justification for an empty anchor, but the rendered docs show `stable`, so the change won't be visible until the next release. Just to clarify, all docs links should point to `stable`, as that's the version the vast majority of people will be using.
This PR addresses the following updates:
- Use-case insights, as suggested by the product team.