DamienCode404/Anonym_stats_tests

Anonym_stats_tests

The aim of this project is to anonymize a set of real patient data, a practice frequently used in science to respect medical confidentiality. The anonymized data can then be reused, for example, for statistical testing. Our anonymization method changes the values of every variable (except Time) in series.txt by a random amount between -8% and +8%. The first method draws a different random value (between -8 and 8) for each patient; the second method uses the same random value for all patients.
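
The two methods can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the dictionary layout and the `seed` parameter are hypothetical, and it assumes that one percentage is drawn per patient and applied to all of that patient's variables and time points.

```python
import random


def perturb(values, percent):
    """Shift every value by the same relative amount, given in percent."""
    factor = 1.0 + percent / 100.0
    return [v * factor for v in values]


def anonymize(patients, max_percent=8.0, shared=False, seed=None):
    """Perturb each series by a random amount in [-max_percent, +max_percent].

    patients: dict mapping a patient id to {variable name: list of values}
              (hypothetical layout, chosen for the example).
    shared=False: one percentage per patient (first method).
    shared=True:  a single percentage reused for all patients (second method).
    The Time variable is copied unchanged.
    """
    rng = random.Random(seed)
    shared_percent = rng.uniform(-max_percent, max_percent)
    result = {}
    for pid, series in patients.items():
        if shared:
            percent = shared_percent
        else:
            percent = rng.uniform(-max_percent, max_percent)
        result[pid] = {
            name: list(vals) if name == "Time" else perturb(vals, percent)
            for name, vals in series.items()
        }
    return result


# Toy data, invented for the example.
patients = {
    "p1": {"Time": [0, 1, 2], "HR": [70.0, 72.0, 71.0]},
    "p2": {"Time": [0, 1, 2], "HR": [80.0, 79.0, 83.0]},
}
per_patient = anonymize(patients, shared=False, seed=0)
same_for_all = anonymize(patients, shared=True, seed=0)
```

With `shared=True` every patient is scaled by the same factor, so ratios between patients are preserved; with `shared=False` each patient gets its own factor.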

Install Python dependencies using the install.txt file

pip install -r install.txt

Move to the Makefile directory and run Make

cd anonym_meth
make

Try out the program with a toy version in the exemple_meth directory

python3 launch_toy_exemple.py

Execute the launch.py file in the anonym_meth directory, specifying the 4 arguments

python3 launch.py ../data/multivariate/ ../data/CSV_output/ ../gener_simulated_data_meth/ 1000

Choice of statistical tests and parameter selection

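Voulez-vous effectuer des tests stats ? (O/n): O
Voulez-vous passer par une interface graphique ? (O/n): O

The prompts are in French. The first asks whether to run the statistical tests, the second whether to use a graphical interface. Answer O (oui, yes) or n (non, no).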

A graphical interface will appear, giving you a choice of several parameters

[Screenshot of the graphical interface, 2024-02-25 11-59-50]

If you're on a server and can't display an interface

Use the parameters.txt file in the anonym_meth folder to fill in the variables to launch the statistics calculation.

OUTPUT

stats :

  • This directory contains the mean, maximum, minimum and standard deviation of each physiological variable for each patient (anonymized and real).
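
As an illustration of the per-patient summary described above, here is a minimal sketch using only the standard library. The `summarize` helper and the data layout are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption; the project may use the population formula instead.

```python
import statistics


def summarize(series):
    """Return mean, max, min and standard deviation for each variable of one patient.

    series: {variable name: list of values}; the Time variable is skipped.
    """
    return {
        name: {
            "mean": statistics.mean(vals),
            "max": max(vals),
            "min": min(vals),
            # Sample standard deviation (divides by n - 1); an assumption here.
            "std": statistics.stdev(vals),
        }
        for name, vals in series.items()
        if name != "Time"
    }


# Toy patient, invented for the example.
real = {
    "Time": [0, 1, 2, 3],
    "HR": [70.0, 72.0, 71.0, 75.0],
    "MAP": [90.0, 92.0, 91.0, 89.0],
}
summary = summarize(real)
print(summary["HR"])
```

Running the same function on the real and on the anonymized version of a patient gives the two sets of values that the later tests compare.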

tests : Test results for each statistic obtained from the stats folder :

  • statKS,pvalKS: Kolmogorov-Smirnov test statistic and p-value.
  • statWMW_p,pvalWMW_p: test statistic and p-value of the Wilcoxon / Mann-Whitney paired test.
  • statWMW_up,pvalWMW_up: test statistic and p-value of the Wilcoxon / Mann-Whitney unpaired test.
  • Euclidean distance.
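
The sketch below shows one way these quantities could be computed with scipy and numpy for a single statistic (for example the per-patient means of one variable). It is an illustration under assumptions: the `compare` helper and the sample values are invented, and the mapping of the paired test to `scipy.stats.wilcoxon` and of the unpaired test to `scipy.stats.mannwhitneyu` is a guess at what the project does.

```python
import numpy as np
from scipy import stats


def compare(real, anonymized):
    """Compare one statistic between real and anonymized patients.

    real, anonymized: equal-length sequences where position i refers to the
    same patient before and after anonymization.
    """
    real = np.asarray(real, dtype=float)
    anonymized = np.asarray(anonymized, dtype=float)

    # Two-sample Kolmogorov-Smirnov test on the two distributions.
    stat_ks, pval_ks = stats.ks_2samp(real, anonymized)
    # Paired test: Wilcoxon signed-rank on the per-patient differences.
    stat_p, pval_p = stats.wilcoxon(real, anonymized)
    # Unpaired test: Mann-Whitney U (Wilcoxon rank-sum).
    stat_up, pval_up = stats.mannwhitneyu(real, anonymized, alternative="two-sided")
    # Euclidean distance between the two vectors.
    euclidean = float(np.linalg.norm(real - anonymized))

    return {
        "statKS": float(stat_ks), "pvalKS": float(pval_ks),
        "statWMW_p": float(stat_p), "pvalWMW_p": float(pval_p),
        "statWMW_up": float(stat_up), "pvalWMW_up": float(pval_up),
        "euclidean": euclidean,
    }


# Invented per-patient mean values for one variable.
real_means = [72.0, 80.5, 65.2, 90.1, 77.3, 68.4]
anon_means = [73.1, 79.0, 66.0, 88.7, 78.9, 67.5]
result = compare(real_means, anon_means)
```

Note that `scipy.stats.wilcoxon` raises an error when every paired difference is zero, which cannot happen for perturbed data unless the drawn percentage is exactly 0.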

normalized dissimilarities for each patient + boxplot :

  • For each anonymized patient, a number of real patients are randomly selected (10 recommended). The univariate dynamic time warping (DTW) dissimilarity between the anonymized patient and each selected real patient is computed for each physiological parameter (HR, SBP, MAP, DBP). The multivariate DTW (DTWm) is the mean of these 4 univariate DTWs. For each anonymized patient, we keep the minimum, DTWm_min, over the selected real patients. Finally, we compute the mean (E) and standard deviation (S) of all DTWm_min values and normalize each dissimilarity: normalized DTWm_min = (DTWm_min - E) / S.
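
The procedure above can be sketched in pure Python as follows. This is a simplified illustration, not the project's implementation: the `dtw` function is a textbook dynamic-programming DTW with absolute difference as local cost (the dtaidistance library listed in the resources uses its own cost and options), the helper names and data layout are hypothetical, and S is assumed to be the sample standard deviation.

```python
import random
import statistics

VARIABLES = ("HR", "SBP", "MAP", "DBP")


def dtw(a, b):
    """Textbook DTW between two sequences, with |x - y| as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost matrix
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            curr.append(abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1]))
        prev = curr
    return prev[-1]


def dtw_multivariate(p, q):
    """DTWm: mean of the univariate DTWs over the 4 physiological variables."""
    return statistics.mean([dtw(p[v], q[v]) for v in VARIABLES])


def normalized_min_dissimilarities(anonymized, real, n_candidates=10, seed=None):
    """Return {anonymized patient id: (DTWm_min - E) / S}."""
    rng = random.Random(seed)
    real_ids = list(real)
    minima = {}
    for pid, series in anonymized.items():
        # Randomly select real patients to compare against.
        candidates = rng.sample(real_ids, min(n_candidates, len(real_ids)))
        minima[pid] = min(dtw_multivariate(series, real[c]) for c in candidates)
    values = list(minima.values())
    e = statistics.mean(values)
    s = statistics.stdev(values)  # assumption: sample standard deviation
    return {pid: (d - e) / s for pid, d in minima.items()}


# Synthetic patients, invented for the example.
_rng = random.Random(1)


def fake_patient():
    return {v: [_rng.gauss(80, 5) for _ in range(20)] for v in VARIABLES}


real = {f"r{i}": fake_patient() for i in range(15)}
anonymized = {f"a{i}": fake_patient() for i in range(5)}
scores = normalized_min_dissimilarities(anonymized, real, n_candidates=10, seed=0)
```

After normalization the scores have mean 0 and standard deviation 1, which is what makes them comparable on a single boxplot. The sketch needs at least two anonymized patients with different DTWm_min values, otherwise S is undefined or zero.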

graphs :

  • Graphical comparison of values obtained in the stats directory.

Resources

  • See the repo for the progress bar in the cpp section here
  • tqdm : A Fast, Extensible Progress Bar for Python and CLI
  • customtkinter : A modern and customizable python UI-library based on Tkinter
  • dtaidistance : Library for time series distances
  • numpy : The fundamental package for scientific computing
  • pandas : Open source data analysis and manipulation tool
  • scipy : Fundamental algorithms for scientific computing
  • seaborn : Statistical data visualization
