# TP Explainable AI at SET


This tutorial gives an overview of the most popular explainable AI (xAI) techniques. As we saw during the presentation, these techniques fall broadly into two kinds:
1. _Post-hoc_ explanation methods, which are used to analyze existing models
2. _By-design_ explainable models, which embed explanations into their decision process

For the _post-hoc_ methods, we will use the [Captum](https://captum.ai/) library. Part of this tutorial is adapted from the original Captum [tutorial on CIFAR10](https://captum.ai/tutorials/CIFAR_TorchVision_Interpret).
For the by-design model, we will use the [CaBRNet](https://git.frama-c.com/pub/cabrnet) library, developed at CEA.

## Preliminaries

### Environment setup

Install all dependencies in a dedicated virtual environment. A `setup.sh` script is provided at the root of the session repository. This section checks that the downloaded packages are correctly set up and that the pretrained models behave as expected.


In [None]:
import matplotlib.pyplot as plt
import numpy as np
import os
import ipyplot
from IPython.core.display import SVG

%matplotlib inline

import captum
from captum.attr import visualization as viz

import torch
import torchvision
import torchvision.transforms.v2 as transforms

import cabrnet 
from zenodo_get import zenodo_get
from IPython.display import IFrame, Image, display

For this session, we will use a reduced subset of the [CUB200](http://www.vision.caltech.edu/datasets/cub_200_2011/) dataset. This avoids unnecessary training and inference time.

In [None]:
transform = transforms.Compose(
    [transforms.ToImage(), transforms.ToDtype(torch.float32, scale=True),
     transforms.Resize((224,224),antialias=True),
     transforms.Normalize(mean=[0.485, 0.456, 0.406],std=[0.229, 0.224, 0.225]),
    ])
tinyCub = torchvision.datasets.ImageFolder(root="./data/cub_train_tiny", transform=transform)

We will also load a model pretrained on CUB200 (a ResNet 50) for the post-hoc explanations.

In [None]:
modelPostHoc = torch.load('./models/r50_CUB200_i448.pth', map_location='cpu')
_ = modelPostHoc.eval()  # inference mode: use stored batch-norm statistics, disable dropout

### Sanity checks

We will begin by loading some images from the dataset, passing them through the model and checking whether the predictions are correct.

In [None]:
def imshow(img):
    img = img / 4.3 + 0.4     # hackish unnormalization
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

loader = torch.utils.data.DataLoader(tinyCub, batch_size=16)
classes = list(map(lambda x: x.split(".")[1], tinyCub.classes))
[imgs, targets] = next(iter(loader))
res = modelPostHoc(imgs)
imshow(torchvision.utils.make_grid(imgs,nrow=4))
print(f"Ground truth classes:  {' ' .join('%2s' % targets[j].item()+ ' ' + classes[targets[j]] for j in range(5) )}")
_, predicted = torch.max(res, 1)
print(f"Predicted classes:  {' ' .join('%2s' % predicted[j].item()+ ' ' + classes[predicted[j]] for j in range(5) )}")
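
The `imshow` helper above undoes the normalization only approximately. As a minimal sketch, assuming the same mean and standard deviation as in `transforms.Normalize`, an exact inverse can be written as follows (the name `unnormalize` is hypothetical and not part of the session code):

```python
import numpy as np

# Mean and standard deviation passed to transforms.Normalize in this tutorial.
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])


def unnormalize(img_chw):
    """Invert channel-wise normalization on a (C, H, W) array and clip to [0, 1]."""
    img = img_chw * STD[:, None, None] + MEAN[:, None, None]
    return np.clip(img, 0.0, 1.0)


# Round-trip check on a random image: normalize it, then undo the normalization.
rng = np.random.default_rng(0)
raw = rng.random((3, 4, 4))
normalized = (raw - MEAN[:, None, None]) / STD[:, None, None]
restored = unnormalize(normalized)
print(np.allclose(restored, raw))
```

Inside `imshow`, one could then call `unnormalize(img.numpy())` instead of `img / 4.3 + 0.4`.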

Finally, we will compute the accuracy on the dataset. We should have an accuracy of about 61%.

In [None]:
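correct = 0
total = 0
with torch.no_grad():  # no gradients are needed for evaluation
    for img, target in loader:
        _, predicted = torch.max(modelPostHoc(img), 1)
        # Count per sample so that a smaller final batch is weighted correctly
        correct += (predicted == target).sum().item()
        total += target.size(0)
print(f"Accuracy: {correct / total * 100:.2f}%")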
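
Accuracy is the fraction of correctly classified samples. The toy sketch below uses made-up predictions, not model outputs, and counts correct predictions per sample, which stays valid when the last batch is smaller than the others:

```python
def accuracy(batches):
    """Fraction of correct predictions over an iterable of (predicted, target) lists."""
    correct = 0
    total = 0
    for predicted, target in batches:
        correct += sum(p == t for p, t in zip(predicted, target))
        total += len(target)
    return correct / total


# Two batches of different sizes: 3 correct out of 4, then 1 correct out of 2.
batches = [([0, 1, 2, 2], [0, 1, 2, 3]), ([5, 4], [5, 5])]
overall = accuracy(batches)
print(f"Accuracy: {overall * 100:.2f}%")
```

Averaging the two per-batch accuracies (75% and 50%) would give 62.5%, whereas 4 of the 6 samples are actually correct.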

## Post Hoc explanation methods


All the following approaches perform _feature attribution_: given a sample $x$ with features $x^i$, a program $f$ and a prediction $y$, they aim to answer the following question: "which $x^i$ contributed the most to $f(x)=y$?"

![](post-hoc.png)



### Saliency maps

We will first apply the simplest attribution method: [backpropagating the gradient](https://arxiv.org/abs/1312.6034) of $y$ on the chosen sample $x$:

$\frac{\partial{f(x)}}{\partial{x}}$

Most modern deep learning frameworks compute it automatically.

Note that you can change the `sign` parameter to `"all"` to see the sign variations for all following methods. 

In [None]:
input = imgs[0].unsqueeze(0)
target = targets[0]
input.requires_grad = True

In [None]:
original_image = np.transpose((imgs[0].cpu().detach().numpy() / 4.3) + 0.4, (1, 2, 0))
saliency = captum.attr.Saliency(modelPostHoc)
grads = saliency.attribute(inputs=input, target=0, abs=False)
grads = np.transpose(grads.squeeze(0).cpu().detach().numpy(), (1, 2, 0))
_ = viz.visualize_image_attr(None, original_image, 
                      method="original_image", title="Original Image")
_ = viz.visualize_image_attr(grads, original_image, method="blended_heat_map", sign="absolute_value", 
                             outlier_perc=5, show_colorbar=True, 
                             title="Overlayed Gradient Magnitudes")

We see that the gradients focus a lot on the neck and the tail, but also on the top corners of the image. Although this may describe how the neural network makes its decision, it may not match the human decision process for classifying a bird.
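
To make the definition concrete, here is a minimal sketch of gradient-based attribution on a toy function. The model `f`, its weights and the finite-difference helper are illustrative assumptions; Captum relies on automatic differentiation instead.

```python
def f(x):
    """Toy scalar model: a weighted sum of squared features."""
    weights = [0.5, -2.0, 0.0]
    return sum(w * xi ** 2 for w, xi in zip(weights, x))


def saliency(func, x, eps=1e-5):
    """Approximate the gradient of func at x with central finite differences."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((func(up) - func(down)) / (2 * eps))
    return grads


x = [1.0, 1.0, 1.0]
grads = saliency(f, x)
# Rank the features by gradient magnitude, most influential first.
ranking = sorted(range(len(x)), key=lambda i: abs(grads[i]), reverse=True)
print(grads, ranking)
```

Ranking by magnitude corresponds to `sign="absolute_value"` in the visualization above: the second feature has a negative gradient but still ranks first.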

### Saliency maps with SmoothGrad

Given $x$, [SmoothGrad](https://arxiv.org/abs/1706.03825) aims to compute an average of the gradients in a neighborhood $x^{*}$ of $x$ to reduce the influence of sharp, local variations. This averaged gradient can be approximated as follows:

$$
\nabla_{x^{*}}y \approx \frac{1}{n}\sum_{i=1}^{n}\nabla_x f(x+\epsilon_i), \quad \epsilon_i \sim \mathcal{N}(0,\sigma^2)
$$

There are two parameters here:
1. $\sigma$: the standard deviation of the Gaussian noise
2. $n$: the number of noisy samples drawn by SmoothGrad

Experiment by changing those parameters and calling the `attribute` method again. It may take a long time if you increase the number of samples, so start with increments of 5.


In [None]:
n_samples = 50
sigma = 0.1

In [None]:
saliency = captum.attr.Saliency(modelPostHoc)
nt = captum.attr.NoiseTunnel(saliency)
attrs = nt.attribute(inputs=input, target=0, nt_type='smoothgrad_sq', nt_samples=n_samples, stdevs=sigma)
attrs= np.transpose(attrs.squeeze(0).cpu().detach().numpy(), (1, 2, 0))
_ = viz.visualize_image_attr(attrs, original_image, method="blended_heat_map", sign="absolute_value", 
                             outlier_perc=10, show_colorbar=True, 
                             title="Overlayed Gradient Magnitudes \n with SmoothGrad Squared")

With averaged gradients, the interpretation seems much less noisy. With a sufficiently high number of samples and a low standard deviation, the gradient seems to vary a lot around the neck to the tail, with some specks on the corner of the image and the beak.
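
The averaging idea can be sketched on a one-dimensional toy function whose gradient oscillates quickly. The function and parameter values below are illustrative assumptions. This sketch averages the raw gradients, whereas the cells above use `nt_type='smoothgrad_sq'`, which averages squared attributions.

```python
import math
import random


def grad_f(x):
    """Gradient of f(x) = x + sin(50 x) / 50: a unit slope plus a fast oscillation."""
    return 1.0 + math.cos(50.0 * x)


def smoothgrad(grad, x, sigma, n_samples, seed=0):
    """Average the gradient over n_samples Gaussian perturbations of x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += grad(x + rng.gauss(0.0, sigma))
    return total / n_samples


x0 = 0.3
raw = grad_f(x0)
smooth = smoothgrad(grad_f, x0, sigma=0.1, n_samples=2000)
print(f"raw gradient: {raw:.3f}, smoothed gradient: {smooth:.3f}")
```

The raw gradient depends strongly on the exact value of `x0`, while the smoothed estimate stays close to the underlying slope of 1.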

### Integrated gradients

The previous approaches have limitations. In particular, there are situations where the gradient is the same for different input values (for instance when the model saturates), so the gradient alone cannot tell them apart.

To tackle this issue, [Integrated Gradients](https://arxiv.org/abs/1703.01365) accumulates the gradients along the straight line between a baseline image $x^{'}$ and the image $x$, and scales the result by the difference $x - x^{'}$.

$$
IG_i = (x_i - x^{'}_i) \int_{\alpha=0}^{1} \nabla_{x_i}
f(x^{'}+\alpha(x-x^{'}))d\alpha
$$
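
In practice the integral is approximated by a finite sum. Here is a minimal sketch on a toy function, using a midpoint Riemann sum (the function, the baseline and the number of steps are illustrative assumptions, and Captum may use a different integration rule):

```python
def f(x):
    """Toy model with an interaction term: f(x0, x1) = x0^2 + 3 * x0 * x1."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def integrated_gradients(grad, x, baseline, n_steps=50):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    averaged = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps
        # Point on the straight line between the baseline and x.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad(point)
        for i in range(len(x)):
            averaged[i] += g[i] / n_steps
    return [(xi - b) * a for xi, b, a in zip(x, baseline, averaged)]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
# Completeness: the attributions should sum to f(x) - f(baseline).
delta = sum(attributions) - (f(x) - f(baseline))
print(attributions, delta)
```

The gap `delta` between the sum of the attributions and $f(x) - f(x^{'})$ measures the quality of the approximation; the value reported by Captum with `return_convergence_delta=True` plays an analogous role.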

In [None]:
ig = captum.attr.IntegratedGradients(modelPostHoc)
attributions, delta = ig.attribute(inputs=input,  baselines=input*0, target=0, return_convergence_delta=True)
attributions = np.transpose(attributions.squeeze().cpu().detach().numpy(), (1, 2, 0))
print('Approximation delta: ', abs(delta))
_ = viz.visualize_image_attr(attributions, original_image, method="blended_heat_map",sign="absolute_value",
                          show_colorbar=True, title="Overlayed Integrated Gradients")

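Integrated Gradients show how much each pixel contributes to the change in the model output between the baseline and the actual image. Here the baseline is `input*0`, which in normalized space corresponds to a uniform image at the dataset mean rather than a black or white one.
We will now combine Integrated Gradients with SmoothGrad.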

In [None]:
n_samples = 50
sigma = 0.1

In [None]:
ig = captum.attr.IntegratedGradients(modelPostHoc)
nt = captum.attr.NoiseTunnel(ig)
# Use the same target class (0) as in the previous cells so that the maps are comparable
attributions_smoothgrad = nt.attribute(inputs=input, baselines=input * 0, target=0, nt_type='smoothgrad_sq', nt_samples=n_samples, stdevs=sigma)
attributions_smoothgrad = np.transpose(attributions_smoothgrad.squeeze(0).cpu().detach().numpy(), (1, 2, 0))
_ = viz.visualize_image_attr(attributions_smoothgrad, original_image, method="blended_heat_map", sign="absolute_value", 
                             outlier_perc=10, show_colorbar=True, 
                             title="Overlayed Integrated Gradients \n with SmoothGrad Squared")

We note that Integrated Gradients combined with SmoothGrad provide much more focused attributions.

Overall, these three approaches yield seemingly similar results. But the following questions remain:

* why did an explanation method choose this particular zone of the image?
* how can we state that one explanation method is more representative of the network behaviour than another?
* what do we do with the explanations?

## Explainable by design: ProtoTree with the CaBRNet library


We will now look at another class of interpretable models: models that are interpretable by design. We will focus on the [ProtoTree](https://arxiv.org/abs/2012.02046) architecture.


We will consider the following ProtoTree parameters:
* tree depth
* the effect of pruning


![](prototree.png)


### Preliminary

We downloaded the model and the corresponding generated prototypes. For this session, we also provide pre-made configuration files.
First, set up the paths to the configuration files and instantiate the model.

In [None]:
# Paths to the model, prototypes and output directories
from cabrnet.generic.model import ProtoClassifier
root_cabrnet_config=os.path.join("models","cabrnet","cabrnet")
root_model=os.path.join(root_cabrnet_config,"model")
root_protos=os.path.join(root_cabrnet_config,"prototypes")
root_out=os.path.join("outs")

# Configuration files
path_to_model_config=os.path.join(root_model,"model.yml")
path_to_visu_config=os.path.join(root_protos,"visualization.yml")

path_to_state_dict=os.path.join(root_model,"model_state.pth")

img_path =os.path.join("data","cub_train_tiny","001.Black_footed_Albatross","Black_Footed_Albatross_0051_796103.jpg")

model = ProtoClassifier.build_from_config(config_file=path_to_model_config,state_dict_path=path_to_state_dict)

We loaded a pretrained ProtoTree using CaBRNet, as well as two configuration files. Let us look at `model.yml`:

In [None]:
!cat $path_to_model_config

This file defines the architecture of a ProtoTree. Consider the _classifier_ section. Among several parameters, it defines `depth`, the depth of the soft decision tree used in ProtoTree. The higher this parameter, the deeper the tree (and thus the higher the number of prototypes). Here, 9 was chosen after cross-validation on this dataset. We will examine the influence of changing the depth on another model.
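
As a rough illustration of how `depth` drives the model size, the sketch below counts nodes under the assumption of a complete binary tree with one prototype per internal node, as described in the ProtoTree paper. Whether CaBRNet counts depth in exactly the same way is an assumption, and `tree_size` is a hypothetical helper.

```python
def tree_size(depth):
    """Number of internal nodes and leaves in a complete binary tree of a given depth."""
    n_internal = 2 ** depth - 1
    n_leaves = 2 ** depth
    return n_internal, n_leaves


for depth in (4, 9, 10):
    n_prototypes, n_leaves = tree_size(depth)
    print(f"depth={depth}: {n_prototypes} prototypes, {n_leaves} leaves")
```

Each extra level roughly doubles the number of prototypes, which is why the depth is worth tuning.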

Note that we did not put anything under the "weights" section, as we are loading an already pretrained model through the `model_state.pth`.

#### Evaluate the ProtoTree performance

The snippet of code below calls the CaBRNet `evaluate` method on the model to perform a basic inference and collect some stats. This should take less than one minute.

In [None]:
stats = model.evaluate(dataloader=loader, device='cpu', progress_bar_position=0)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")


The accuracy should be above $0.98$. On this test set, the ProtoTree performs similarly to a classical model. It brings the additional benefit of being interpretable, as we will see now.

#### Explain local

We will first examine the inference pipeline of a ProtoTree. We will need:
* a specific image, preprocessed in the same way as during the ProtoTree's training
* a model
* a way to visualize the similarity computed at each node

A pre-made configuration file for the visualizer is available at `path_to_visu_config`.

In [None]:
!cat $path_to_visu_config

<div style="color:red"> TODO: explain briefly all parameters and provide a configuration function to change the test_patch viz </div>


In [None]:
from cabrnet.generic.model import SimilarityVisualizer
!rm -rf $root_out/test_patches # removing existing folder
visualizer = SimilarityVisualizer.build_from_config(config_file=path_to_visu_config,target="test_patch")
model.explain(prototype_dir_path=root_protos,output_dir_path=root_out,img_path=img_path,preprocess=transform,device="cpu",visualizer=visualizer)

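# Use a dedicated name so that the `imgs` batch loaded earlier is not overwritten
patch_files = sorted(os.listdir(os.path.join(root_out, "test_patches")))
patch_imgs = [Image(filename=os.path.join(root_out, "test_patches", name)) for name in patch_files]
display(*patch_imgs)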
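
The role of the per-node similarities can be sketched with a toy soft decision tree. This is a simplified illustration based on the ProtoTree paper, not CaBRNet code: it assumes that the similarity at each node is the probability of going to the right child, and the similarity values are made up.

```python
def leaf_probabilities(similarities, depth):
    """Probability of reaching each leaf of a complete binary soft decision tree.

    similarities[i] is the probability of routing to the RIGHT child of internal
    node i, with nodes stored in breadth-first (heap) order.
    """
    n_leaves = 2 ** depth
    probs = []
    for leaf in range(n_leaves):
        node = 0
        p = 1.0
        # Read the leaf index bit by bit, most significant bit first.
        for level in reversed(range(depth)):
            go_right = (leaf >> level) & 1
            s = similarities[node]
            p *= s if go_right else 1.0 - s
            node = 2 * node + 1 + go_right
        probs.append(p)
    return probs


# Depth-2 tree: the root (node 0) and its two children (nodes 1 and 2).
similarities = [0.9, 0.2, 0.7]
probs = leaf_probabilities(similarities, depth=2)
print(probs, sum(probs))
```

The image is routed mostly to the right-most leaf because the prototypes of node 0 and node 2 are both found with high similarity.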

#### Explain global

Given the extracted prototypes, we can also render the whole decision structure of the ProtoTree as a global explanation.

In [None]:
model = ProtoClassifier.build_from_config(config_file=path_to_model_config,state_dict_path=path_to_state_dict)
model.explain_global(prototype_dir_path=root_protos,output_dir_path=root_out)


In [None]:
IFrame(os.path.join(root_out,"global_explanation.pdf"), width=800, height=200)

### On the effect of pruning

In [None]:
# TODO: 
# * load the model with no pruning
# * use protolib.model.prune() with several threshold
# * provide global explanation with several thresholds 
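
Pending the cell above, the pruning idea can be sketched on a toy tree. The sketch follows the rule described in the ProtoTree paper (drop leaves whose highest class probability does not exceed a threshold, then remove the nodes made superfluous). The nested-dict layout and the function names are hypothetical and do not reflect the CaBRNet API.

```python
def prune(node, threshold):
    """Return a pruned copy of the tree, or None if the whole subtree is removed.

    A leaf is a list of class probabilities; an internal node is a dict with
    "left" and "right" children (a hypothetical layout for illustration).
    """
    if isinstance(node, list):
        return node if max(node) > threshold else None
    left = prune(node["left"], threshold)
    right = prune(node["right"], threshold)
    if left is None and right is None:
        return None
    if left is None:
        return right  # the parent is superfluous: keep only the surviving child
    if right is None:
        return left
    return {"left": left, "right": right}


def count_leaves(node):
    """Count the leaves remaining in a (possibly pruned) tree."""
    if node is None:
        return 0
    if isinstance(node, list):
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


tree = {
    "left": {"left": [0.9, 0.1], "right": [0.5, 0.5]},
    "right": {"left": [0.4, 0.6], "right": [0.55, 0.45]},
}
for threshold in (0.0, 0.58, 0.8):
    print(threshold, count_leaves(prune(tree, threshold)))
```

Raising the threshold keeps only the most confident leaves, which makes the global explanation smaller at the possible cost of accuracy.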

## Wrapping up

<div style="color:red"> TODO: by-design models are a bit more cumbersome to train and use, but they provide an easier to grasp decision process </div>

