The aim of this project is to anonymize a set of real patient data, a practice frequently used in science to respect medical confidentiality. The anonymized data can then be reused, for example, for statistical testing. Our anonymization method changes the values of the variables (except Time) in series.txt by a random amount between -8% and +8%. The first method draws a different random value (between -8 and 8) for each patient; the second method uses the same random value for all patients.
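The two methods described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the uniform distribution, and the choice of one percentage applied to all values of a patient are assumptions, and the sample values are made up.

```python
import random

MAX_PERCENT = 8.0  # bound from the description: between -8% and +8%


def perturb(series, percent):
    """Shift every value of one patient's series by `percent` percent."""
    factor = 1.0 + percent / 100.0
    return [value * factor for value in series]


def anonymize_per_patient(patients, rng):
    """Method 1: draw a different random percentage for each patient."""
    return {
        pid: perturb(series, rng.uniform(-MAX_PERCENT, MAX_PERCENT))
        for pid, series in patients.items()
    }


def anonymize_shared(patients, rng):
    """Method 2: draw one random percentage and apply it to every patient."""
    percent = rng.uniform(-MAX_PERCENT, MAX_PERCENT)
    return {pid: perturb(series, percent) for pid, series in patients.items()}


# Hypothetical values for one variable of two patients.
# The Time column is left out because it is not modified.
patients = {"p1": [80.0, 82.0, 79.0], "p2": [65.0, 66.0, 70.0]}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
per_patient = anonymize_per_patient(patients, rng)
shared = anonymize_shared(patients, rng)
```

With the second method, the ratio between anonymized and original values is identical across patients, which is the only difference from the first method.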
pip install -r install.txt
cd anonym_meth
make
python3 launch_toy_exemple.py
python3 launch.py ../data/multivariate/ ../data/CSV_output/ ../gener_simulated_data_meth/ 1000
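Voulez-vous effectuer des tests stats ? (O/n): O    (prompt in French: "Do you want to run statistical tests?", O = yes, n = no)
Voulez-vous passer par une interface graphique ? (O/n): O    (prompt in French: "Do you want to use a graphical interface?", O = yes, n = no)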
To launch the statistics calculation, fill in the variables in the parameters.txt file in the anonym_meth folder.
- This directory contains the mean, maximum, minimum and standard deviation of each physiological variable per patient (anonymous and real).
- statKS, pvalKS: test statistic and p-value of the Kolmogorov-Smirnov test.
- statWMW_p, pvalWMW_p: test statistic and p-value of the Wilcoxon / Mann-Whitney paired test.
- statWMW_up, pvalWMW_up: test statistic and p-value of the Wilcoxon / Mann-Whitney unpaired test.
- Euclidean distance.
- For each anonymous patient, a number of real patients are randomly selected (10 recommended). The univariate dynamic time warping (DTW) distance between the anonymous patient and each selected real patient is computed for each physiological parameter (HR, SBP, MAP, DBP). The multivariate DTW (DTWm) is the mean of these 4 univariate DTWs. We keep the minimum, DTWm_min, over the 10 real patients. Finally, we compute the mean (E) and standard deviation (S) of all DTWm_min values and normalize each dissimilarity: normalized DTWm_min = (DTWm_min - E) / S.
- Graphical comparison of values obtained in the stats directory.
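The DTW-based dissimilarity described in the list above can be sketched as follows. This is a hedged, standard-library illustration: it uses a textbook DTW with absolute difference as local cost rather than the dtaidistance implementation, the sample standard deviation is assumed for S, and all function names and the synthetic patients are hypothetical.

```python
import math
import random
import statistics

VARIABLES = ("HR", "SBP", "MAP", "DBP")


def dtw(a, b):
    """Textbook dynamic-programming DTW with |x - y| as local cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_multivariate(p, q):
    """DTWm: mean of the univariate DTWs over the four variables."""
    return statistics.mean(dtw(p[v], q[v]) for v in VARIABLES)


def dtwm_min(anon, real_patients, n_draws, rng):
    """Smallest DTWm between one anonymous patient and n_draws random real patients."""
    sample = rng.sample(real_patients, min(n_draws, len(real_patients)))
    return min(dtw_multivariate(anon, real) for real in sample)


def normalize(values):
    """(DTWm_min - E) / S, with E the mean and S the sample standard deviation."""
    e = statistics.mean(values)
    s = statistics.stdev(values)
    return [(v - e) / s for v in values]


def make_patient(rng, length=20):
    """Synthetic patient with made-up values, for illustration only."""
    return {v: [rng.uniform(50, 150) for _ in range(length)] for v in VARIABLES}


rng = random.Random(0)
real = [make_patient(rng) for _ in range(12)]
anon = [make_patient(rng) for _ in range(5)]
minima = [dtwm_min(a, real, 10, rng) for a in anon]
normalized = normalize(minima)
```

After normalization the values have mean 0 and standard deviation 1, so an anonymous patient with a strongly negative score is unusually close to one of the sampled real patients.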
- See the repo for the progress bar used in the cpp section
- tqdm: A Fast, Extensible Progress Bar for Python and CLI
- customtkinter: A modern and customizable Python UI library based on Tkinter
- dtaidistance: Library for time series distances
- numpy: The fundamental package for scientific computing
- pandas: Open source data analysis and manipulation tool
- scipy: Fundamental algorithms for scientific computing
- seaborn: Statistical data visualization
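As an illustration of how the test columns listed earlier (statKS/pvalKS, statWMW_p/pvalWMW_p, statWMW_up/pvalWMW_up) could be produced with scipy, here is a minimal sketch. Which scipy functions the project actually calls is an assumption: `wilcoxon` is used here for the paired test and `mannwhitneyu` for the unpaired one, and the data are made up.

```python
import random

from scipy import stats

# Made-up series for one variable: a "real" patient and an "anonymous"
# version shifted by +5% (an arbitrary value inside the -8%..+8% range).
rng = random.Random(0)
real = [rng.gauss(80, 5) for _ in range(50)]
anonymous = [x * 1.05 for x in real]

# Two-sample Kolmogorov-Smirnov test
statKS, pvalKS = stats.ks_2samp(real, anonymous)

# Paired test (Wilcoxon signed-rank)
statWMW_p, pvalWMW_p = stats.wilcoxon(real, anonymous)

# Unpaired test (Mann-Whitney U)
statWMW_up, pvalWMW_up = stats.mannwhitneyu(real, anonymous, alternative="two-sided")

print(statKS, pvalKS)
print(statWMW_p, pvalWMW_p)
print(statWMW_up, pvalWMW_up)
```

Because every value is shifted in the same direction here, the paired test gives a very small p-value even though the two distributions overlap heavily, which is worth keeping in mind when comparing the paired and unpaired columns.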