Fault injection tool for the reliability assessment of deep learning algorithms such as neural networks

cad-polito-it/SFIadvancedmodels

Fault Injection Tool for the Reliability Assessment of Deep Learning Algorithms

Overview

SFIadvancedmodels is an open-source tool for testing the resilience of deep learning algorithms against random hardware faults. It is intended to run advanced statistical fault injection (SFI) analyses by extending the fault models known from the literature.

Project structure

This project is structured as follows:

  • requirements.txt: packages to install in a virtual environment to run the application
  • main.py: The main entry point of the application. It generates the fault lists, runs the fault injection (FI) campaigns, saving the output feature maps (OFMs) and the network outputs (golden and faulty), and performs the final FI analysis
  • SETTINGS.py: Configuration file to set preferences
  • utils.py: Utility functions and helper modules
  • faultManager/: Contains the files used to manage the FI campaigns
  • ofmapManager/: Saves the OFM of the golden network
  • dlModels/: Directory where models and weights are stored

Setup

To get started, first clone the repository from GitHub:

git clone https://github.com/cad-polito-it/SFIadvancedmodels.git

Creating a Python Environment

It is recommended to create a virtual environment to manage your dependencies. You can do this using venv:

python3 -m venv environment_name

source environment_name/bin/activate

Installing Dependencies

Once your virtual environment is activated, install the required packages listed in requirements.txt:

pip install -r requirements.txt

Usage

To generate the fault list, run a fault injection campaign, or analyze the data, edit the SETTINGS.py file to configure your experiments, then run:

python3 main.py

Note that the injected faults are permanent: each one simulates a stuck-at fault in the memory where the model weights are stored.
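
As an illustration of what a stuck-at fault on a stored weight means, here is a minimal sketch. It is not the tool's own injection code. It assumes weights are stored as 32-bit floats, that bit 0 is the least significant bit and bit 31 the sign bit, and the function name `stuck_at` is hypothetical.

```python
import numpy as np


def stuck_at(weights: np.ndarray, index: tuple, bit: int, value: int) -> np.ndarray:
    """Return a float32 copy of `weights` with one bit of one element forced to `value`.

    Assumptions (not taken from the tool): float32 storage, bit 0 = LSB, bit 31 = sign.
    """
    faulty = weights.astype(np.float32, copy=True)
    bits = faulty.view(np.uint32)      # same bytes, reinterpreted as unsigned integers
    mask = np.uint32(1 << bit)
    if value:
        bits[index] |= mask            # stuck-at-1: the bit always reads 1
    else:
        bits[index] &= ~mask           # stuck-at-0: the bit always reads 0
    return faulty


weights = np.ones((4, 1, 3, 3), dtype=np.float32)
faulty = stuck_at(weights, (3, 0, 2, 1), bit=31, value=1)   # force the sign bit to 1
unchanged = stuck_at(weights, (3, 0, 2, 1), bit=31, value=0)  # bit is already 0
```

Unlike a bit flip, a stuck-at fault leaves the weight untouched whenever the stored bit already holds the stuck value, as in the second call.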

Outputs

The code is divided into four individually activatable parts that produce different outputs, controlled by boolean variables in the SETTINGS.py file:

  • FAULT_LIST_GENERATION: Generates a fault list for the selected network based on the set parameters.
  • FAULTS_INJECTION: Loads the fault list and executes the fault injection campaign, saving outputs or golden/corrupted OFMs based on the preferences set.
  • FI_ANALYSIS: Analyzes the corrupted outputs against the golden ones and returns the number of masked, non-critical, and critical (SDC-1) faults.
  • FI_ANALYSIS_SUMMARY: When injecting a large number of faults or using large datasets, the previous analysis can produce very large and hard-to-handle CSV files. This variable activates a script that summarizes the previously generated data to make it more accessible.
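
For example, a SETTINGS.py excerpt that runs the first three parts and skips the summary might look like the sketch below. Only the four flag names come from this README; the chosen values and the `enabled_stages` helper are illustrative assumptions.

```python
# Hypothetical excerpt of SETTINGS.py: the values are an example, not defaults.
FAULT_LIST_GENERATION = True   # build the fault list
FAULTS_INJECTION = True        # run the FI campaign
FI_ANALYSIS = True             # classify each inference
FI_ANALYSIS_SUMMARY = False    # skip the per-fault summary

# Illustrative helper (not part of the tool): list the parts that would run.
flags = {
    "FAULT_LIST_GENERATION": FAULT_LIST_GENERATION,
    "FAULTS_INJECTION": FAULTS_INJECTION,
    "FI_ANALYSIS": FI_ANALYSIS,
    "FI_ANALYSIS_SUMMARY": FI_ANALYSIS_SUMMARY,
}
enabled_stages = [name for name, enabled in flags.items() if enabled]
print(enabled_stages)
```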

The output of the SFI campaign is stored in the output folder. In detail:

  • output/clean_feature_maps: Stores the clean feature maps
  • output/clean_ouput: Stores the clean output
  • output/fault_list: The fault list used for the injections
  • output/faulty_feature_maps: Stores the faulty feature maps
  • output/faulty_ouput: Stores the faulty output
  • results/: Stores the analysis of the outputs
  • results_summary/: Stores the summarized analysis of the outputs

The files are named as follows:

  • clean FM: batch_[batch_id]_layer_[layer_name].npz. This file contains the clean output feature map of layer [layer_name] given the input batch [batch_id].
  • clean output: clean_output.npy. This file contains the clean output for all the input batches.
  • faulty FM: fault_[fault_id]_batch_[batch_id]_layer_[layer_name].npz. This file contains the faulty output feature map of layer [layer_name] given the input batch [batch_id] when the fault [fault_id] is injected.
  • faulty output: [fault_model]/batch_[batch_id].npy. This file contains the faulty output given the input batch [batch_id] for all the faults injected.

The files are either .npy or .npz arrays. The dimensions are the following:

  • clean FM: BxKxHxW
  • clean output: NxBxC
  • faulty FM: BxKxHxW
  • faulty output: FxBxC

Where F is the length of the fault list, N is the number of batches, B is the batch size, C is the number of classes, K is the number of channels of an OFM, H is the height of an OFM and W is the width.

To load the FM arrays, call np.load(file_name)['arr_0']. To load the output arrays, call np.load(file_name, allow_pickle=True).
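
The two loading calls can be tried end to end with the sketch below. It writes placeholder arrays with the documented shapes into a temporary directory and reads them back. The concrete sizes and the layer name in the file name are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Illustrative sizes: N batches, batch size B, C classes, K channels, H x W feature map.
N, B, C, K, H, W = 5, 2, 10, 3, 4, 4

with tempfile.TemporaryDirectory() as tmp:
    fm_path = os.path.join(tmp, "batch_0_layer_features.0.npz")
    out_path = os.path.join(tmp, "clean_output.npy")

    # An array passed positionally to np.savez is stored under the key 'arr_0'.
    np.savez(fm_path, np.zeros((B, K, H, W), dtype=np.float32))
    np.save(out_path, np.zeros((N, B, C), dtype=np.float32))

    # Feature maps: open the .npz archive and pick the 'arr_0' entry.
    with np.load(fm_path) as archive:
        clean_fm = archive["arr_0"]

    # Outputs: a plain .npy file, loaded as described above.
    clean_output = np.load(out_path, allow_pickle=True)

print(clean_fm.shape, clean_output.shape)
```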

Fault list

The generated fault lists are CSV files with a specific format to which the FI refers in order to inject faults into the neural model. The structure is as follows:

FL example for a VGG-11 model with GTSRB dataset

Injection  Layer       TensorIndex     Bit
0          features.0  "(3, 0, 2, 1)"  15
...        ...         ...             ...

  • Injection: Column indicating the injection number.
  • Layer: The layer in which the fault is injected.
  • TensorIndex: Coordinate of the weight where the fault is injected.
  • Bit: Corrupted bit that is flipped.
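
A fault list in this format can be read with the Python standard library, as in the minimal sketch below. It assumes the CSV is comma-separated with a header row matching the column names above, which this README does not state explicitly. The function name `read_fault_list` is hypothetical.

```python
import ast
import csv
import io

# Sample content mirroring the example row above (comma-separated layout is an assumption).
FAULT_LIST_CSV = '''Injection,Layer,TensorIndex,Bit
0,features.0,"(3, 0, 2, 1)",15
'''


def read_fault_list(handle):
    """Parse fault-list rows into dictionaries with typed fields."""
    faults = []
    for row in csv.DictReader(handle):
        faults.append({
            "injection": int(row["Injection"]),
            "layer": row["Layer"],
            # The index is stored as text such as "(3, 0, 2, 1)"; turn it into a tuple.
            "tensor_index": ast.literal_eval(row["TensorIndex"]),
            "bit": int(row["Bit"]),
        })
    return faults


faults = read_fault_list(io.StringIO(FAULT_LIST_CSV))
print(faults[0])
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` object.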

Analysis

The analysis files produced by the FI_ANALYSIS option are stored in the results/ folder, organized by dataset, model, and batch size: results/dataset-name/model-name/batch-size/. Each folder contains two files:

  • fault_statistics.txt: A text file reporting the total number of masked, non-critical, and critical (SDC-1) inferences.
  • output_analysis.csv: A CSV file containing all the information regarding the classification of each fault for every inference.

Each inference is classified into one of three categories:

  • masked: Inferences where the fault is masked, that is, it does not alter the output.
  • non-critical: Inferences where the fault alters the output but not the prediction.
  • critical (SDC-1): Inferences where the fault alters the final prediction.

The output_analysis.csv is organized as follows:

fault  batch  image  output
0      0      0      1
0      0      1      0
0      0      2      0
0      0      3      2
...    ...    ...    ...
16663  9      1024   1

  • fault: Unique identifier of the injected fault, corresponding to the Injection column in the fault list used.
  • batch: Batch containing the dataset images used for inference.
  • image: Image in the batch on which the inference was performed.
  • output: Classification of the injected fault by comparing the golden outputs with the corrupted ones obtained from the image inference. The returned values are 0 for a masked fault, 1 for a non-critical fault, and 2 for a critical fault (SDC-1).
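
The three-way classification can be sketched as follows. This is an illustration, not the tool's own implementation. It assumes each output is a one-dimensional vector of class scores, that "masked" means the golden and faulty vectors are exactly equal, and that the prediction is the index of the highest score. The function name `classify` is hypothetical.

```python
import numpy as np

MASKED, NON_CRITICAL, CRITICAL = 0, 1, 2   # codes used in the output column


def classify(golden: np.ndarray, faulty: np.ndarray) -> int:
    """Classify one inference by comparing golden and faulty class scores."""
    if np.array_equal(golden, faulty):
        return MASKED                      # output unchanged
    if np.argmax(golden) == np.argmax(faulty):
        return NON_CRITICAL                # output changed, same predicted class
    return CRITICAL                        # predicted class changed (SDC-1)


golden = np.array([0.1, 0.7, 0.2])
results = [
    classify(golden, np.array([0.1, 0.7, 0.2])),   # identical scores
    classify(golden, np.array([0.2, 0.6, 0.2])),   # scores differ, class 1 still wins
    classify(golden, np.array([0.8, 0.1, 0.1])),   # class 0 now wins
]
print(results)
```

A tolerance-based comparison (for example np.allclose) would be a different design choice and would count small numerical deviations as masked.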

Summarized analysis

The output_analysis.csv file is verbose: when many faults are injected or many images are used for inference, it becomes hard to read. To address this, the FI_ANALYSIS_SUMMARY option generates a new CSV file named model-name_summary.csv inside the results_summary/dataset-name/model-name/batch-size/ folder. This file combines the original fault list with the summarized results for each fault obtained from the previous analysis. The CSV is organized as follows:

Injection  Layer  TensorIndex      Bit  n_injections  masked  non_critical  critical
0          conv1  "(7, 0, 2, 1)"   15   10000         10000   0             0
1          conv1  "(14, 0, 2, 0)"  5    10000         10000   0             0
2          conv1  "(27, 0, 0, 0)"  13   10000         701     9298          1
3          conv1  "(14, 2, 2, 0)"  12   10000         9998    2             0
...        ...    ...              ...  ...           ...     ...           ...

  • Injection: Column indicating the injection number.
  • Layer: The layer in which the fault is injected.
  • TensorIndex: Coordinate of the weight where the fault is injected.
  • Bit: Corrupted bit that is flipped.
  • n_injections: Number of summarized inferences, that is, the entire test dataset executed with the fault injected.
  • masked: Number of inferences in which the fault was masked.
  • non_critical: Number of inferences in which the fault was non-critical.
  • critical: Number of inferences in which the fault was critical (SDC-1).
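
The per-fault counts can be derived from output_analysis.csv rows as in the sketch below. It is a minimal illustration, not the tool's summary script. The first four rows are taken from the output_analysis.csv example, the last row is invented to show a second fault, and the function name `summarize` is hypothetical.

```python
from collections import Counter, defaultdict

# (fault, batch, image, output) rows; output codes: 0 masked, 1 non-critical, 2 critical.
rows = [
    (0, 0, 0, 1),
    (0, 0, 1, 0),
    (0, 0, 2, 0),
    (0, 0, 3, 2),
    (1, 0, 0, 0),   # invented row for a second fault
]


def summarize(rows):
    """Count, for each fault, how many inferences fall in each category."""
    counts = defaultdict(Counter)
    for fault, _batch, _image, output in rows:
        counts[fault][output] += 1
    return {
        fault: {
            "n_injections": sum(counter.values()),
            "masked": counter[0],
            "non_critical": counter[1],
            "critical": counter[2],
        }
        for fault, counter in counts.items()
    }


summary = summarize(rows)
print(summary[0])
```

Joining these counts with the fault list on the Injection column gives rows with the same columns as the summary CSV.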

Acknowledgments

This study was carried out within the FAIR - Future Artificial Intelligence Research project and received funding from the European Union Next-GenerationEU (PIANO NAZIONALE DI RIPRESA E RESILIENZA (PNRR) – MISSIONE 4 COMPONENTE 2, INVESTIMENTO 1.3 – D.D. 1555 11/10/2022, PE00000013). This manuscript reflects only the authors’ views and opinions; neither the European Union nor the European Commission can be considered responsible for them.
