`GradientSimilarity` - allow models of any input type #912

RobertSamoilescu · 2023-04-18T23:16:38Z

This PR extends the support for the GradientSimilarity explainer to support other data types apart from np.ndarray, torch.Tensor and tf.Tensor.

This extension is useful when a user defines a custom model for text modality, for example, for which the input is a list of strings. Note that for tensorflow we can get around this issue by casting the input to a numpy array of string. This is because tensorflow can convert from numpy string to tensorflow string tensors. Unfortunately, for pytorch the string type is not supported.

codecov · 2023-04-19T00:04:10Z

Codecov Report

Merging #912 (94dc01a) into master (92f808a) will decrease coverage by 0.05%.
The diff coverage is 85.45%.

❗ Current head 94dc01a differs from pull request most recent head 3368bea. Consider uploading reports for the commit 3368bea to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #912      +/-   ##
==========================================
- Coverage   85.32%   85.28%   -0.05%     
==========================================
  Files          74       74              
  Lines        8804     8834      +30     
==========================================
+ Hits         7512     7534      +22     
- Misses       1292     1300       +8

Impacted Files	Coverage Δ
alibi/explainers/similarity/base.py	`87.95% <82.85%> (-5.04%)`	⬇️
alibi/explainers/similarity/grad.py	`93.15% <84.61%> (-2.51%)`	⬇️
alibi/explainers/similarity/backends/__init__.py	`83.33% <100.00%> (ø)`
...ibi/explainers/similarity/backends/pytorch/base.py	`95.74% <100.00%> (ø)`
.../explainers/similarity/backends/tensorflow/base.py	`100.00% <100.00%> (ø)`

mauicv · 2023-04-19T08:43:54Z

Can you give me an idea of the use case by describing a simple example? Is it just that if you have a model with a text embedding layer you might want X_train to be a set of strings instead of processed tokens?

alibi/explainers/similarity/base.py

mauicv

LGTM for the most part. I have some minor questions but once they're answered I'm happy to merge.

One thing is that a lot of the logic is conditional statements of the form if instance(x, np.ndarray) ... else ... which if may be possible to refactor into something more modular. If the logic is really generic then can we just add a method that deals with each input type:

def _format(self, x):
    if isinstance(x, Tensors):
        # other formatting that makes sense here
        return x[None]
    elif isinstance(x, List):
        return [x]

This way if we add new cases later then it's just a case of extending this function rather than making lots of small changes... I'm not sure the above is possible though, just something to think about!

jklaise

Looks good, a few minor comments.

Was wondering if there would be much effort to add a small end-to-end test calculating explanations on a model operating on List[Any]?

alibi/explainers/similarity/base.py

mauicv

LGTM

RobertSamoilescu added 4 commits April 18, 2023 23:22

Included support for List[Any]

c2f95ee

Fixed grad implementation to support List[Any]

f22996a

Fixed typo.

765c358

Fixed type check.

c64f96c

RobertSamoilescu requested review from mauicv and jklaise April 18, 2023 23:16