We've created a system in which you can easily select and combine both pre-processing and learning algorithms from state-of-the-art machine learning toolboxes, and arrange them in simple or parallel pipeline data streams.
In addition, you can parametrize your training and testing workflow by choosing cross-validation schemes, performance metrics and hyperparameter optimization metrics from a list of pre-registered options.
Importantly, you can integrate custom solutions not only into your data processing pipeline but also into any part of the model training and evaluation process, including custom hyperparameter optimization strategies.
For a detailed description, visit our website and read the documentation, or read our paper in PLOS ONE.
To use PHOTONAI, you only need to have your favourite Python IDE ready. Then install the latest stable version via pip:
pip install photonai
# Or try out the latest features if you don't rely on a stable version, using:
pip install --upgrade git+https://github.com/wwu-mmll/photonai.git@develop
You can set up a full-stack machine learning pipeline in a few lines of code:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import KFold
from photonai import Hyperpipe, PipelineElement, FloatRange, Categorical, IntegerRange
# DESIGN YOUR PIPELINE
my_pipe = Hyperpipe('basic_svm_pipe',  # the name of your pipeline
                    # which optimizer PHOTONAI shall use
                    optimizer='sk_opt',
                    optimizer_params={'n_configurations': 25},
                    # the performance metrics of your interest
                    metrics=['accuracy', 'precision', 'recall', 'balanced_accuracy'],
                    # after hyperparameter optimization, this metric declares the winner config
                    best_config_metric='accuracy',
                    # repeat hyperparameter optimization three times
                    outer_cv=KFold(n_splits=3),
                    # test each configuration five times respectively
                    inner_cv=KFold(n_splits=5),
                    verbosity=1,
                    project_folder='./tmp/')

# first normalize all features
my_pipe.add(PipelineElement('StandardScaler'))

# then do feature selection using a PCA
my_pipe += PipelineElement('PCA',
                           hyperparameters={'n_components': IntegerRange(5, 20)},
                           test_disabled=True)

# engage and optimize the good old SVM for Classification
my_pipe += PipelineElement('SVC',
                           hyperparameters={'kernel': Categorical(['rbf', 'linear']),
                                            'C': FloatRange(0.5, 2)},
                           gamma='scale')

# train pipeline
X, y = load_breast_cancer(return_X_y=True)
my_pipe.fit(X, y)
We pre-registered diverse preprocessing and learning algorithms from state-of-the-art toolboxes, e.g. scikit-learn, Keras and imbalanced-learn, so that you can rapidly build custom pipelines.
With PHOTONAI you can seamlessly switch between diverse hyperparameter optimization strategies, such as (random) grid search or Bayesian optimization (scikit-optimize, smac3).
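To make the difference between exhaustive and random grid search concrete, here is a minimal pure-Python sketch. It is not PHOTONAI code: the search space, the function names and `toy_score` (a stand-in for a cross-validated metric) are all hypothetical, and Bayesian optimization is not covered.

```python
import itertools
import random

# Hypothetical search space, loosely mirroring the SVC example above.
search_space = {
    "kernel": ["rbf", "linear"],
    "C": [0.5, 1.0, 1.5, 2.0],
}


def grid_search(space):
    """Yield every combination of the listed values (exhaustive grid)."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_grid_search(space, n_configurations, seed=0):
    """Yield a random subset of the grid, without repeats."""
    rng = random.Random(seed)
    all_configs = list(grid_search(space))
    n = min(n_configurations, len(all_configs))
    yield from rng.sample(all_configs, n)


def toy_score(config):
    """Stand-in for a cross-validated metric; NOT a real model evaluation."""
    bonus = 0.05 if config["kernel"] == "rbf" else 0.0
    return 0.9 - abs(config["C"] - 1.5) * 0.1 + bonus


def optimize(configs):
    """Return the configuration with the highest score."""
    return max(configs, key=toy_score)


best_grid = optimize(grid_search(search_space))
best_random = optimize(random_grid_search(search_space, n_configurations=3))
print("grid search:", best_grid)
print("random grid search:", best_random)
```

The exhaustive grid always finds the best listed configuration but evaluates all of them. The random variant evaluates only a fixed budget of configurations, so it may miss the optimum.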
You can build custom sequences of processing and learning algorithms with a simple syntax. PHOTONAI offers extended pipeline functionality such as parallel sequences, custom callbacks in between pipeline elements, AND and OR operations, as well as the possibility to flexibly position data augmentation, class balancing or learning algorithms anywhere in the pipeline.
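The idea behind AND and OR operations can be sketched in a few lines of plain Python. This is a conceptual illustration only, not PHOTONAI's implementation: `sequence`, `and_element`, `or_element` and the toy transformers are hypothetical names. Here an AND element applies all branches to the same input and concatenates their features, while an OR element picks exactly one alternative, so the choice can be treated like a hyperparameter.

```python
from itertools import chain


def sequence(*steps):
    """Apply the steps one after another."""
    def run(rows):
        for step in steps:
            rows = step(rows)
        return rows
    return run


def and_element(*branches):
    """Run all branches on the same input and concatenate their features per row."""
    def run(rows):
        outputs = [branch(rows) for branch in branches]
        return [list(chain.from_iterable(out[i] for out in outputs))
                for i in range(len(rows))]
    return run


def or_element(options, choice):
    """Select exactly one of the named alternatives."""
    return options[choice]


# Toy transformers operating on a list of feature rows.
def square(rows):
    return [[value * value for value in row] for row in rows]


def negate(rows):
    return [[-value for value in row] for row in rows]


def first_column(rows):
    return [[row[0]] for row in rows]


data = [[1.0, 2.0], [3.0, 4.0]]

stacked = and_element(square, first_column)(data)
pipeline = sequence(
    and_element(square, first_column),
    or_element({"square": square, "negate": negate}, choice="negate"),
)
print(stacked)
print(pipeline(data))
```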
PHOTONAI provides a standardized format for sharing and loading optimized pipelines across platforms with only one line of code.
While you concentrate on selecting appropriate processing steps, learning algorithms, hyperparameters and training parameters, PHOTONAI automates the nested cross-validated optimization and evaluation loop for any custom pipeline.
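For readers unfamiliar with nested cross-validation, the loop being automated looks roughly like the sketch below. It is a simplified pure-Python illustration under stated assumptions, not PHOTONAI's code: the toy one-dimensional classifier, its `shift` hyperparameter and all function names are hypothetical. The inner folds choose the hyperparameter using only the outer training data, and the outer test fold then gives a performance estimate that the selection step never saw.

```python
import random
import statistics


def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def fit_boundary(xs, ys, shift):
    """Toy 1-D classifier: midpoint of the two class means, moved by `shift`."""
    mean_0 = statistics.mean([x for x, y in zip(xs, ys) if y == 0])
    mean_1 = statistics.mean([x for x, y in zip(xs, ys) if y == 1])
    return (mean_0 + mean_1) / 2 + shift


def evaluate(xs, ys, train_idx, test_idx, shift):
    """Fit on train_idx and return the accuracy on test_idx."""
    boundary = fit_boundary([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx], shift)
    hits = sum((xs[i] > boundary) == (ys[i] == 1) for i in test_idx)
    return hits / len(test_idx)


def nested_cv(xs, ys, candidate_shifts, outer_splits=3, inner_splits=5):
    """Return one (best_shift, outer test accuracy) pair per outer fold."""
    results = []
    for outer_train, outer_test in k_fold_indices(len(xs), outer_splits):
        def inner_score(shift):
            # Inner loop: validate on folds drawn from the outer training set only.
            scores = []
            for inner_train, inner_val in k_fold_indices(len(outer_train), inner_splits):
                train_idx = [outer_train[i] for i in inner_train]
                val_idx = [outer_train[i] for i in inner_val]
                scores.append(evaluate(xs, ys, train_idx, val_idx, shift))
            return statistics.mean(scores)

        best_shift = max(candidate_shifts, key=inner_score)
        # Refit on the full outer training set, test on the held-out outer fold.
        test_accuracy = evaluate(xs, ys, outer_train, outer_test, best_shift)
        results.append((best_shift, test_accuracy))
    return results


# Synthetic data: classes alternate, class 1 is centred at 2.0, class 0 at 0.0.
rng = random.Random(0)
ys = [i % 2 for i in range(60)]
xs = [rng.gauss(2.0 * y, 1.0) for y in ys]

results = nested_cv(xs, ys, candidate_shifts=[-0.5, 0.0, 0.5])
for fold, (shift, acc) in enumerate(results):
    print(f"outer fold {fold}: best shift={shift}, test accuracy={acc:.2f}")
```

The default split counts mirror the `outer_cv=KFold(n_splits=3)` and `inner_cv=KFold(n_splits=5)` settings from the example pipeline.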
PHOTONAI comes with extensive logging of all information in the training, testing and hyperparameter optimization process. In addition, optimum performances and the hyperparameter optimization progress are visualized in the PHOTONAI Explorer.