Skip to content

PHOTONAI is a high level python API for designing and optimizing machine learning pipelines.

License

Notifications You must be signed in to change notification settings

wwu-mmll/photonai

Repository files navigation

PHOTON LOGO

GitHub Workflow Status Coverage Status Github Contributors Github Commits PyPI Version License Twitter URL

PHOTONAI is a high level python API for designing and optimizing machine learning pipelines.

We've created a system in which you can easily select and combine both pre-processing and learning algorithms from state-of-the-art machine learning toolboxes, and arrange them in simple or parallel pipeline data streams.

In addition, you can parametrize your training and testing workflow choosing cross-validation schemes, performance metrics and hyperparameter optimization metrics from a list of pre-registered options.

Importantly, you can integrate custom solutions into your data processing pipeline, but also for any part of the model training and evaluation process including custom hyperparameter optimization strategies.

For a detailed description, visit our website and read the documentation

or you can read our paper in PLOS ONE


Getting Started

In order to use PHOTONAI you only need to have your favourite Python IDE ready. Then install the latest stable version simply via pip

pip install photonai
# Or try out the latest features if you don't rely on a stable version, using:
pip install --upgrade git+https://github.com/wwu-mmll/photonai.git@develop

You can setup a full stack machine learning pipeline in a few lines of code:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import KFold

from photonai import Hyperpipe, PipelineElement, FloatRange, Categorical, IntegerRange

# DESIGN YOUR PIPELINE
my_pipe = Hyperpipe('basic_svm_pipe',  # the name of your pipeline
                    # which optimizer PHOTONAI shall use
                    optimizer='sk_opt',
                    optimizer_params={'n_configurations': 25},
                    # the performance metrics of your interest
                    metrics=['accuracy', 'precision', 'recall', 'balanced_accuracy'],
                    # after hyperparameter optimization, this metric declares the winner config
                    best_config_metric='accuracy',
                    # repeat hyperparameter optimization three times
                    outer_cv=KFold(n_splits=3),
                    # test each configuration five times respectively,
                    inner_cv=KFold(n_splits=5),
                    verbosity=1,
                    project_folder='./tmp/')


# first normalize all features
my_pipe.add(PipelineElement('StandardScaler'))

# then do feature selection using a PCA
my_pipe += PipelineElement('PCA', 
                           hyperparameters={'n_components': IntegerRange(5, 20)}, 
                           test_disabled=True)

# engage and optimize the good old SVM for Classification
my_pipe += PipelineElement('SVC', 
                           hyperparameters={'kernel': Categorical(['rbf', 'linear']),
                                            'C': FloatRange(0.5, 2)}, gamma='scale')

# train pipeline
X, y = load_breast_cancer(return_X_y=True)
my_pipe.fit(X, y)

Features

Easy access to established ML implementations

We pre-registered diverse preprocessing and learning algorithms from state-of-the-art toolboxes e.g. scikit-learn, keras and imbalanced learn to rapidly build custom pipelines

Hyperparameter Optimization

With PHOTONAI you can seamlessly switch between diverse hyperparameter optimization strategies, such as (random) grid-search or bayesian optimization (scikit-optimize, smac3).

Extended ML Pipeline

You can build custom sequences of processing and learning algorithms with a simple syntax. PHOTONAI offers extended pipeline functionality such as parallel sequences, custom callbacks in-between pipeline elements, AND- and OR- Operations, as well as the possibility to flexibly position data augmentation, class balancing or learning algorithms anywhere in the pipeline.

Model Sharing

PHOTONAI provides a standardized format for sharing and loading optimized pipelines across platforms with only one line of code.

Automation

While you concentrate on selecting appropriate processing steps, learning algorithms, hyperparameters and training parameters, PHOTONAI automates the nested cross-validated optimization and evaluation loop for any custom pipeline.

Results Visualization

PHOTONAI comes with extensive logging of all information in the training, testing and hyperparameter optimization process. In addition, optimum performances and the hyperparameter optimization progress are visualized in the PHOTONAI Explorer.

For more use cases, examples, contribution guidelines and API details visit our website

About

PHOTONAI is a high level python API for designing and optimizing machine learning pipelines.

Resources

License

Stars

Watchers

Forks

Languages