Visualizing Adversarial Examples on Convolutional Networks

This is my B.Sc. thesis at CAMP, TUM, under the supervision of Professor Nassir Navab and Dr. Federico Tombari, with Magda Paschali as my advisor. It is a package for attacking and visualizing convolutional networks, with the purpose of understanding and comparing the effects of adversarial examples on such networks.


  1. Intro
  2. Tools
    1. Visualization Methods
    2. Attack Types
    3. Convolutional Network & Training Choices
  3. Code Structure
    1. Other Functions
    2. TL;DR: Step By Step Instructions
  4. Requirements
  5. References


Intro

To be updated soon.


Visualization Methods

Guided Back Prop

  1. Instructions: Run the function runGBackProp for the Guided Backprop method, or runVanillaBP for Vanilla Backprop. For instance:
runGBackProp(choose_network = 'ResNet50',
                 isTrained = True,
                 training = "Normal",
                 target_example = 4,
                 attack_type = 'LBFGS')


  • For more information on choose_network, isTrained, training, and structure, see the Convolutional Networks section.
  • For more information on attack_type, see the Attack Types section.
  • target_example lets you choose among 6 sample images drawn from ImageNet if you are using a pretrained PyTorch network. For a custom network this argument is redundant, because a random image is drawn from the CIFAR10 test set on every run. To change this behavior, edit the get_params function.
  2. Reference

J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for Simplicity: The All Convolutional Net.

K. Simonyan, A. Vedaldi, A. Zisserman. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps.

Smooth Grad

  1. Instructions: Run the function runsmoothGrad for the Smooth Grad method. For instance:
runsmoothGrad(choose_network = 'VGG19',
                 isTrained = True,
                 training = "Normal",
                 target_example = 4,
                 attack_type = 'SalMap')


  • For more information on choose_network, isTrained, training, and structure, see the Convolutional Networks section.
  • For more information on attack_type, see the Attack Types section.
  • target_example lets you choose among 6 sample images drawn from ImageNet if you are using a pretrained PyTorch network. For a custom network this argument is redundant, because a random image is drawn from the CIFAR10 test set on every run. To change this behavior, edit the get_params function.
  2. Reference

D. Smilkov, N. Thorat, N. Kim, F. Viégas, M. Wattenberg. SmoothGrad: removing noise by adding noise
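
As a rough illustration of the idea behind Smooth Grad (not the code used in this package), the sketch below averages a gradient over several noisy copies of the input. It is framework-free on purpose; smooth_grad, toy_grad and all parameter names are hypothetical, and in practice grad_fn would be the gradient of the class score with respect to the image.

```python
import random


def smooth_grad(grad_fn, x, n_samples=25, sigma=0.1, seed=0):
    """Average grad_fn over n_samples Gaussian-perturbed copies of x.

    grad_fn: maps a list of floats to a gradient list of the same length.
    x:       the input as a flat list of floats.
    """
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        # Perturb every input value with zero-mean Gaussian noise.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        grad = grad_fn(noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]


# Toy score f(x) = sum(x_i ** 2), whose gradient is 2 * x (an assumption
# made only so the example runs without a network).
def toy_grad(x):
    return [2.0 * xi for xi in x]


saliency = smooth_grad(toy_grad, [1.0, -2.0, 0.5])
```

Averaging over noisy inputs suppresses gradient fluctuations that change sign from one sample to the next, which is what makes the resulting map look cleaner than a single gradient.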

Grad Cam

  1. Instructions: Run the function runGradCam for the Grad Cam method, or runGGradCam for Guided Grad Cam. For instance:
runGradCam(choose_network = 'ResNet50',
                 isTrained = True,
                 target_example = 4,
                 attack_type = 'SalMap')


  • For more information on choose_network, isTrained, training, and structure, see the Convolutional Networks section.
  • For more information on attack_type, see the Attack Types section.
  • target_example lets you choose among 6 sample images drawn from ImageNet if you are using a pretrained PyTorch network. For a custom network this argument is redundant, because a random image is drawn from the CIFAR10 test set on every run. To change this behavior, edit the get_params function.
  2. Reference

R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, and D. Batra. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.
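
For readers unfamiliar with Grad-CAM, the following is a minimal sketch of its weighting step as described in the paper above, written with plain nested lists so it runs without PyTorch. It is not the implementation used by runGradCam; grad_cam and the toy values are hypothetical.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps into one heatmap.

    activations, gradients: lists of K maps, each an H x W nested list.
    gradients[k] holds d(class score) / d(activations[k]).
    """
    cam = None
    for fmap, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        flat = [g for row in grad for g in row]
        weight = sum(flat) / len(flat)
        weighted = [[weight * a for a in row] for row in fmap]
        if cam is None:
            cam = weighted
        else:
            cam = [[c + w for c, w in zip(crow, wrow)]
                   for crow, wrow in zip(cam, weighted)]
    # ReLU: keep only the regions that increase the class score.
    return [[max(0.0, c) for c in row] for row in cam]


# Two 2x2 feature maps: the first supports the class, the second opposes it.
acts = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(acts, grads)  # [[1.0, 0.0], [0.0, 1.0]]
```

In a real network the heatmap is computed at the last convolutional layer and then upsampled to the input resolution before being overlaid on the image.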

Interpretable Explanations

  1. Instructions: Run the function runExplain for the Interpretable Explanations method.


  • For more information on choose_network, isTrained, training, and structure, see the Convolutional Networks section.
  • For more information on attack_type, see the Attack Types section.
  • target_example lets you choose among 6 sample images drawn from ImageNet if you are using a pretrained PyTorch network. For a custom network this argument is redundant, because a random image is drawn from the CIFAR10 test set on every run. To change this behavior, edit the get_params function.
  • iters sets the number of iterations for optimizing the interpretable mask. For a clean output, choose a value above 100.
  2. Reference

R. Fong, A. Vedaldi. Interpretable Explanations of Black Boxes by Meaningful Perturbations.
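
The instructions for runExplain do not come with a sample call. As a sketch only, the argument set below mirrors the other examples in this README. The keyword names (in particular training and iters) are assumptions taken from the bullet list above and have not been checked against the function signature.

```python
# Hypothetical argument set for runExplain; values follow the other
# examples in this README and are not taken from the package itself.
explain_kwargs = dict(
    choose_network="ResNet50",
    isTrained=True,
    training="Normal",
    target_example=4,
    attack_type="FGSM",
    iters=150,  # above 100, as recommended for a clean mask
)

# With the package importable, the call would presumably look like:
# runExplain(**explain_kwargs)
```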

Inverted Image Representations

  1. Instructions

Note that this method is only implemented for the PyTorch pretrained AlexNet or VGG19. The method is also not supported by any of the comparison functions. Use with caution!

Run the function runInvRep for the Inverted Image Representations method. For instance:

runInvRep(choose_network = 'AlexNet',
              isTrained = True,
              target_example = 4,
              target_layer = 10,
              attack_type = 'FGSM')


  • For more information on choose_network, isTrained, training, and structure, see the Convolutional Networks section.
  • For more information on attack_type, see the Attack Types section.
  • target_example lets you choose among 6 sample images drawn from ImageNet if you are using a pretrained PyTorch network. For a custom network this argument is redundant, because a random image is drawn from the CIFAR10 test set on every run. To change this behavior, edit the get_params function.
  • target_layer sets the number of the layer from which the inversion algorithm starts.
  2. Reference

A. Mahendran, A. Vedaldi. Understanding Deep Image Representations by Inverting Them.

Deep Dream

  1. Instructions

Note that this method is only implemented for the PyTorch pretrained AlexNet or VGG19. The method is also not supported by any of the comparison functions. Use with caution!

Run the function runDeepDream for the Deep Dream method. For instance:

runDeepDream(choose_network = 'VGG19',
                 isTrained = True,
                 target_example = 3,
                 attack_type = 'FGSM',
                 cnn_layer = 34,
                 filter_pos = 94,
                 iters = 50)
  • For more information on choose_network, isTrained, training, and structure, see the Convolutional Networks section.
  • For more information on attack_type, see the Attack Types section.
  • target_example lets you choose among 6 sample images drawn from ImageNet if you are using a pretrained PyTorch network. For a custom network this argument is redundant, because a random image is drawn from the CIFAR10 test set on every run. To change this behavior, edit the get_params function.
  • cnn_layer
  • filter_pos
  • iters
  2. Reference

D. Smilkov, N. Thorat, N. Kim, F. Viégas, M. Wattenberg. SmoothGrad: removing noise by adding noise

Deep Image Prior

To be added soon.

Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky. Deep Image Prior.

Attack Types

All attacks are implemented using the Foolbox package.


FGSM

Alexey Kurakin, Ian Goodfellow, Samy Bengio. Adversarial examples in the physical world.


PGD

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks.

Single Pixel

Nina Narodytska, Shiva Prasad Kasiviswanathan. Simple Black-Box Adversarial Perturbations for Deep Networks.


Boundary

Wieland Brendel, Jonas Rauber, Matthias Bethge. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models.

Deep Fool

Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Pascal Frossard. DeepFool: a simple and accurate method to fool deep neural networks.


LBFGS

Pedro Tabacof, Eduardo Valle. Exploring the Space of Adversarial Images.

Saliency Map

Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, Ananthram Swami. The Limitations of Deep Learning in Adversarial Settings.
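
Since every attack here goes through Foolbox, none of the following is needed to run the package. Purely to illustrate what the simplest attack in the list, FGSM, does to an input, here is a framework-free sketch. fgsm_step is a hypothetical helper, the gradient is supplied by hand, and this is not how Foolbox implements the attack.

```python
def fgsm_step(x, grad, epsilon=0.03, lo=0.0, hi=1.0):
    """Move each value of x by epsilon in the direction of the loss gradient.

    x:    flat list of pixel values in [lo, hi].
    grad: gradient of the loss with respect to x, same length as x.
    """
    def sign(v):
        return (v > 0) - (v < 0)

    # Step along the sign of the gradient, then clip to the valid pixel range.
    return [min(hi, max(lo, xi + epsilon * sign(gi))) for xi, gi in zip(x, grad)]


# The third pixel would leave the valid range and is clipped to 1.0.
x_adv = fgsm_step([0.5, 0.5, 0.99], [2.0, -3.0, 1.0], epsilon=0.1)
```

Only the sign of the gradient is used, so every pixel changes by at most epsilon, which keeps the perturbation small in the L-infinity sense.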

Convolutional Networks


AlexNet

The only available option is the PyTorch model pretrained on ImageNet.

  • Set choose_network = 'AlexNet'.
  • Set isTrained = True if you want to work with the pretrained PyTorch model. You may set isTrained = False to run the model with random weights as a sanity check.


VGG19

There are 3 available training options.

  • PyTorch model, pretrained on ImageNet.

    • Set choose_network = 'VGG19'.
    • Set isTrained = True if you want to work with the pretrained PyTorch model. You may set isTrained = False to run the model with random weights as a sanity check.
  • Normal custom training on CIFAR10. If the corresponding ckpt file doesn't exist in the customization/trainedmodels directory, you need to train the model first.

    • Set choose_network = 'Custom'.
    • Set structure = 'VGG19'.
    • Set training = 'Normal'.
  • Custom adversarial training on CIFAR10. If the corresponding ckpt file doesn't exist in the customization/trainedmodels directory, you need to train the model first.

    • Set choose_network = 'Custom'.
    • Set structure = 'VGG19'.
    • Set training = 'Adversarial'.


ResNet50

There are 3 available training options.

  • PyTorch model, pretrained on ImageNet.
    • Set choose_network = 'ResNet50'.
    • Set isTrained = True if you want to work with the pretrained PyTorch model. You may set isTrained = False to run the model with random weights as a sanity check.
  • Normal custom training on CIFAR10. If the corresponding ckpt file doesn't exist in the customization/trainedmodels directory, you need to train the model first.
    • Set choose_network = 'Custom'.
    • Set structure = 'ResNet50'.
    • Set training = 'Normal'.
  • Custom adversarial training on CIFAR10. If the corresponding ckpt file doesn't exist in the customization/trainedmodels directory, you need to train the model first.
    • Set choose_network = 'Custom'.
    • Set structure = 'ResNet50'.
    • Set training = 'Adversarial'.

Code Structure

Other Functions

Comparison Functions

There are 4 functions written for making the following comparisons:

  • Among Attacks (CompareAttacks): For a specific network, one can see how different attacks are visualized using the same visualization method. It is executed by entering:
compareAttacks(vizmethod = 'Explain',  
                   choose_network = 'Custom',  
                   image_index = 4,  


  • Among Visualization Methods (CompareVisualization): For a specific network and attack type, one can compare chosen visualization methods. It is called in the same way as compareAttacks:
compareVisualizations(attackmethod = 'Boundary',  
                          choose_network = 'Custom',  
                          image_index = 5,  
  • Among Networks (CompareNetworks): For a specific attack, one can see how different networks are visualized using the same visualization method. It is called in the same way as compareAttacks:
compareNetworks(attackmethod = 'PGD',
                    vizmethod = 'GradCam',  
                    image_index = 3,  
                    training='Normal') # or `Adversarial`
  • Among Training (CompareTraining): For a selected attack and network, one can compare how different training methods affect the chosen visualization. Currently normal and adversarial training are available; distillation will be added soon. In addition, as a sanity check, the visualization for a noisy input and for an untrained network can be shown. It is called in the same way as compareAttacks:
compareTraining(attackmethod = 'SinglePixel',
                    vizmethod = 'VanillaBP',  
                    structure = 'ResNet50',  
                    image_index = 2)


An extension to runGradCam which allows you to compare the following Grad Cam visualizations:

  • Natural Input Image with the correct class prediction (Ground truth)
  • Adversarial Input Image with the adversarial class prediction
  • Adversarial Input Image with the correct class prediction (Ground truth)
  • Natural Input Image with the adversarial class prediction (The wrong network prediction when fed the adversarial image)

The arguments are similar to Grad Cam and the output will look like this: Output


An extension to runExplain which allows you to compare the following Interpretable Explanations visualizations:

  • Natural Input Image with the correct class prediction (Ground truth)
  • Adversarial Input Image with the adversarial class prediction
  • Natural Input Image with the adversarial class prediction (The wrong network prediction when fed the adversarial image)

The arguments are similar to Interpretable Explanations and the output will look like this: Output

Step By Step Instructions

  1. open
  2. Choose your function amongst the following available ones:
  • Single Visualization:

runGradCam, runGradCam2, runGGradCam, runsmoothGrad, runExplain, runExplain2, runVanillaBP, runInvRep, runDeepDream.

  • Comparison Visualizations:

compareTraining, compareVisualizations, compareNetworks, compareAttacks

  3. As explained above, there is a function for each type of visualization or comparison. The arguments common to all functions are:
  • Choose Network: Choose a pretrained ResNet50, VGG19 or AlexNet, or a Custom network.

  • Training: Choose either Normal or Adversarial.

  • Structure: Having chosen 'Custom' network, choose its structure from 'ResNet50' and 'VGG19'.

  • Attack Type: Can be chosen from: FGSM, LBFGS, PGD, RPGD, Boundary, DeepFool, SinglePixel, SalMap

  • Example Index (only for ImageNet): Choose a number from 0-6 to select an image from input_images. If you are using a network trained on CIFAR10, the example is chosen randomly.
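
Because several of these arguments are free-form strings, a typo such as 'SinglePixle' is easy to make. The sketch below shows one way to check the common arguments before calling any of the run or compare functions. check_common_args and the constant names are hypothetical and not part of this package; the allowed values are copied from the list above.

```python
# Allowed values, taken from the argument list in this README.
NETWORKS = {"AlexNet", "VGG19", "ResNet50", "Custom"}
STRUCTURES = {"VGG19", "ResNet50"}
TRAININGS = {"Normal", "Adversarial"}
ATTACKS = {"FGSM", "LBFGS", "PGD", "RPGD", "Boundary",
           "DeepFool", "SinglePixel", "SalMap"}


def check_common_args(choose_network, attack_type, training="Normal", structure=None):
    """Raise ValueError on an unknown value; return True otherwise."""
    if choose_network not in NETWORKS:
        raise ValueError(f"unknown network: {choose_network!r}")
    if attack_type not in ATTACKS:
        raise ValueError(f"unknown attack type: {attack_type!r}")
    if training not in TRAININGS:
        raise ValueError(f"unknown training: {training!r}")
    # structure only matters for a custom network.
    if choose_network == "Custom" and structure not in STRUCTURES:
        raise ValueError("a Custom network needs structure 'VGG19' or 'ResNet50'")
    return True


check_common_args("Custom", "PGD", training="Adversarial", structure="VGG19")
```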


Requirements

python = 3.5
torch >= 0.4.0
torchvision >= 0.1.9
numpy >= 1.13.0
opencv >= 3.1.0
foolbox >= 1.3.1


References

  1. Convolutional Neural Network Visualizations By Utku Ozbulak.
  2. CIFAR10 Adversarial Examples Challenge By Madry Lab.
  3. Train CIFAR10 with PyTorch By kuangliu.


