FaultPredictionOnGPGPUs

This repository includes the experiment materials and results for the soft error vulnerability study for GPGPU applications.

The BenchmarkGPGPUApplications folder contains the benchmark applications used in the experiments. These applications belong to the PolyBench benchmark suite.
The gpgpu-sim folder contains the simulation results of the benchmark applications.
The with_GPGPUsim and with_Nsight folders contain the fault-rate prediction results obtained with GPGPU-Sim 4.0 metrics and Nsight Compute metrics, respectively.
The plots folder contains plots of the correlations among features, the correlations between features and fault rates, and the prediction results.
The data_metrics_GPUsim.xls and data_metrics_NSC.xls files contain the profiling metrics obtained from GPGPU-Sim and Nsight Compute, respectively.

Required Python libraries (the latest version of each can be used; csv ships with the Python standard library, and sklearn is installed as the scikit-learn package):

pandas
numpy
csv
sklearn
xlsxwriter
openpyxl
matplotlib
seaborn

To run the experiments:

$ python3 plotsCorrelatorFaults.py --arg

arg can be one of:

--fault_rates_plot -> plots the fault rates for each fault type
--corr_results_heat_map -> plots the correlation results (Spearman and Pearson) among the features and faults as a heat map
--corr_results_features_and_faults -> plots the same correlation results between the faults and the features
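As a reminder of what the two coefficients measure, here is a minimal, dependency-free sketch. It is not the repository's code: the helper names (`pearson`, `ranks`, `spearman`) and the sample values are made up for illustration. With pandas, `DataFrame.corr(method="pearson")` and `DataFrame.corr(method="spearman")` compute the same quantities.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up numbers: a profiling metric and a fault rate that grows
# monotonically, but not linearly, with it.
metric = [1.0, 2.0, 3.0, 4.0, 5.0]
fault_rate = [1.0, 4.0, 9.0, 16.0, 25.0]

pearson_r = pearson(metric, fault_rate)
spearman_r = spearman(metric, fault_rate)
print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

Spearman reaches 1 for any strictly increasing relationship, while Pearson stays below 1 unless the relationship is linear, which is one reason to report both.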

For the classification experiments:

$ python3 plotsCorrelatorFaults.py --arg1 --arg2 --arg3

--arg1:

--gpgpu-sim -> prediction experiments with GPGPU-Sim metrics
--nsight-compute -> prediction experiments with Nsight Compute metrics

--arg2:

--crash -> crash classification results
--sdc -> SDC classification results

--arg3:

--all_features -> prediction study with all features
--sel_features -> prediction study with selected features
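The three flag groups above each offer two alternatives, so a classification run picks one option per group. The sketch below shows one way such a layout could be parsed with argparse. It is a hypothetical illustration, not the script's actual parser: the function name `build_parser` and the destination names `source`, `target`, and `features` are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser: exactly one flag from each of three groups."""
    parser = argparse.ArgumentParser(
        description="Sketch of the classification experiment flags"
    )

    # arg1: which profiling tool the metrics come from.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--gpgpu-sim", dest="source",
                        action="store_const", const="gpgpu-sim")
    source.add_argument("--nsight-compute", dest="source",
                        action="store_const", const="nsight-compute")

    # arg2: which fault type is classified.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--crash", dest="target",
                        action="store_const", const="crash")
    target.add_argument("--sdc", dest="target",
                        action="store_const", const="sdc")

    # arg3: all features or only the selected ones.
    features = parser.add_mutually_exclusive_group(required=True)
    features.add_argument("--all_features", dest="features",
                          action="store_const", const="all")
    features.add_argument("--sel_features", dest="features",
                          action="store_const", const="selected")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["--gpgpu-sim", "--crash", "--sel_features"])
print(args.source, args.target, args.features)
```

With this layout, omitting a group or passing both flags of one group makes argparse exit with a usage error.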

For the regression experiments:

$ python3 plotsCorrelatorFaults.py --arg1 --arg2

--arg1:

--gpgpu-sim -> prediction experiments with GPGPU-Sim metrics
--nsight-compute -> prediction experiments with Nsight Compute metrics

--arg2:

--masked -> prediction experiment for masked faults
--others -> prediction experiment for SDCs and crashes

Three types of faults are examined in this study:

1 - Crash Faults
2 - Silent Data Corruptions (SDCs)
3 - Masked Faults
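A fault rate can be read as the fraction of injected faults that end in a given outcome. The sketch below assumes that definition. The outcome labels, the function name `fault_rates`, and the sample data are hypothetical and are not taken from the repository.

```python
from collections import Counter

# Assumed labels for the three outcome types.
OUTCOMES = ("crash", "sdc", "masked")


def fault_rates(outcomes):
    """Return the share of each outcome type among all injections."""
    if not outcomes:
        raise ValueError("no fault injection outcomes given")
    counts = Counter(outcomes)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = len(outcomes)
    return {name: counts[name] / total for name in OUTCOMES}


# Made-up campaign: 10 injections into one benchmark.
injections = ["masked"] * 6 + ["sdc"] * 3 + ["crash"]
rates = fault_rates(injections)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

Under this definition, the three rates of one application always sum to 1.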

To predict the crash and SDC fault rates, different classification methods are used, while a regression method is used to predict masked fault rates. Further details on the prediction approaches, the collected metrics, and the feature selection stages are given in our paper, which will be published soon.
