CNTK 2.0 Python API

Amit Agarwal edited this page Oct 8, 2016 · 103 revisions

The first cut of the CNTK v2 Python and C++ APIs is now available. These APIs let you define CNTK models programmatically and drive their training and evaluation, using either the built-in data readers or user-supplied data in native Python numpy or C++ arrays.

[Note: This is an alpha release meant for early users to try the bits and provide feedback on the usability and functional aspects of the API. There are currently a few known limitations and rough edges (listed at the bottom of this page) that are being addressed.]

Installation:

Pip Package:

[Note: If you previously installed an earlier version of the CNTK 2.0 Python pip package, you can skip steps 1 through 3 below and jump directly to step 4 to update the existing package in your Python 3.5 (on Linux) or Python 3.4 (on Windows) environment.]

  1. Follow the instructions on the CNTK GitHub wiki page "CNTK Binary Download and Configuration" to install the prerequisites for running a CNTK binary installation on your machine.

    [Note: Please only follow the prerequisites section – download of the binaries is not required since they are part of the pip package you will install in the next step.]

  2. On Linux, install Anaconda Python 3.5.

    On Windows, the bindings currently only work with Python 3.4, so either install Anaconda Python 3.4 or create a Python 3.4.3 environment in your existing Python 3.5 Anaconda or Miniconda installation using the following commands:

    conda create --name cntk python=3.4.3 numpy scipy

    activate cntk

    [Note: Make sure that the Python version installed above is the one you use for the remainder of the instructions.]

  3. Upgrade pip: python -m pip install --upgrade pip

    [Note: If you get an error about insufficient permissions, run the command from an elevated command prompt.]

  4. Install the CNTK 2.0 alpha pip package (version 2.0a3 in the URLs below):

    Windows: pip install --upgrade https://cntk.ai/PipPackages/gpu/cntk-2.0a3-cp34-cp34m-win_amd64.whl

    Linux: pip install --upgrade https://cntk.ai/PipPackages/gpu/cntk-2.0a3-cp34-cp34m-linux_x86_64.whl

  5. Clone the CNTK repository (master branch), or update your existing clone, to get the Python examples and the training data files they use.

  6. Include the examples directory in PYTHONPATH:

    Windows: setx PYTHONPATH [CNTK repo root]\bindings\python\examples;%PYTHONPATH%

    Linux: export PYTHONPATH=[CNTK repo root]/bindings/python/examples:$PYTHONPATH

  7. Verify PYTHONPATH is appropriately updated (on Windows this will require launching a new command window) and run an example from inside the [CNTK clone root]/bindings/python directory to verify your installation:

    python examples/NumpyInterop/FeedForwardNet.py
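Before running the example, the environment set up in the steps above can be sanity-checked from Python. The following is a minimal sketch, not part of CNTK: the helper names and the REPO_ROOT environment variable are hypothetical, and the only facts it relies on are the ones stated above (Python 3.4 on Windows, Python 3.5 on Linux, a package named cntk, and the examples directory on PYTHONPATH).

```python
import importlib.util
import os
import sys

# Versions taken from the steps above: Python 3.4 on Windows, 3.5 on Linux.
EXPECTED_PYTHON = {"win32": (3, 4), "linux": (3, 5)}


def expected_python(platform):
    """Return the (major, minor) version the instructions call for, or None."""
    for prefix, version in EXPECTED_PYTHON.items():
        if platform.startswith(prefix):
            return version
    return None


def examples_on_path(pythonpath, examples_dir, sep=os.pathsep):
    """True if examples_dir is one of the entries of a PYTHONPATH string."""
    def norm(path):
        return os.path.normcase(os.path.normpath(path))

    entries = [norm(p) for p in pythonpath.split(sep) if p]
    return norm(examples_dir) in entries


def cntk_installed():
    """True if a module named 'cntk' can be located, without importing it."""
    return importlib.util.find_spec("cntk") is not None


if __name__ == "__main__":
    print("python version:", sys.version_info[:2],
          "expected:", expected_python(sys.platform))
    print("cntk installed:", cntk_installed())
    # REPO_ROOT is a hypothetical variable pointing at your CNTK clone.
    repo_root = os.environ.get("REPO_ROOT", "")
    examples = os.path.join(repo_root, "bindings", "python", "examples")
    print("examples on PYTHONPATH:",
          examples_on_path(os.environ.get("PYTHONPATH", ""), examples))
```

The check uses find_spec rather than a plain import so that a broken native extension does not abort the script; a successful result only shows that the package is present, not that it loads.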

From CNTK sources (build latest bits):

  1. If you do not already have a CNTK development environment set up on your machine, follow the instructions on the CNTK GitHub wiki to do so:

    Setting up CNTK on Windows

    Setting up CNTK on Linux

  2. Do a build of CNTK (Release flavor). On Linux you must build the GPU SKU.

  3. Install SWIG:

    Windows: SWIG 3.0.10

    Linux: Run the [CNTK clone root]/bindings/python/cntk/swig_install.sh script

  4. On Linux, install Anaconda Python 3.5.

    On Windows, the bindings currently only work with Python 3.4, so either install Anaconda Python 3.4 or create a Python 3.4.3 environment in your existing Python 3.5 Anaconda or Miniconda installation using the following commands:

    conda create --name cntk python=3.4.3 numpy scipy

    activate cntk

    [Note: Make sure that the Python version installed above is the one you use for the remainder of the instructions.]

  5. If you previously installed any version of the CNTK 2.0 pip-package on your machine, uninstall it:

    pip uninstall cntk

  6. On Linux:

    cd [CNTK clone root]/bindings/python

    swig -version (Make sure swig with version >= 3.0.10 is in path)

    python ./setup.py build_ext (Ignore any warnings reported by this step - they are currently expected)

    cp ./build/lib.linux-x86_64-3.5/_cntk_py.cpython-35m-x86_64-linux-gnu.so .

    export PYTHONPATH=[CNTK clone root]/bindings/python:$PYTHONPATH

    python examples/NumpyInterop/FeedForwardNet.py

    If your build and setup succeeded, you should see the following output on the console:

    Minibatch: 0, Train Loss: 0.7915553283691407, Train Evaluation Criterion: 0.48

    Minibatch: 20, Train Loss: 0.6266774368286133, Train Evaluation Criterion: 0.48

    Minibatch: 40, Train Loss: 1.0378565979003906, Train Evaluation Criterion: 0.64

    Minibatch: 60, Train Loss: 0.6558118438720704, Train Evaluation Criterion: 0.56

    On Windows:

    Follow the instructions in “section # a” of [CNTK clone root]/bindings/python/readme.txt for setup. Then run an example from inside the [CNTK clone root]/bindings/python directory to verify your installation:

    python examples/NumpyInterop/FeedForwardNet.py
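Two of the checks in the build steps above (the SWIG version requirement and the shape of the expected console output) can be scripted. The sketch below is illustrative only: the function names are hypothetical, it assumes that `swig -version` prints a line of the form "SWIG Version x.y.z", and it only checks that log lines have the format shown above, not that the numbers match.

```python
import re
import shutil
import subprocess

MIN_SWIG = (3, 0, 10)  # minimum SWIG version named in the build steps

# Shape of the console lines listed in the Linux steps above.
LOG_LINE = re.compile(
    r"Minibatch: (\d+), Train Loss: ([0-9.eE+-]+), "
    r"Train Evaluation Criterion: ([0-9.eE+-]+)"
)


def parse_swig_version(text):
    """Return (major, minor, patch) from `swig -version` output, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def installed_swig_version():
    """Version of the swig found on PATH, or None if swig is not found."""
    if shutil.which("swig") is None:
        return None
    output = subprocess.check_output(["swig", "-version"],
                                     universal_newlines=True)
    return parse_swig_version(output)


def parse_log(text):
    """Return (minibatch, loss, criterion) tuples for each matching line."""
    return [(int(mb), float(loss), float(crit))
            for mb, loss, crit in LOG_LINE.findall(text)]


if __name__ == "__main__":
    version = installed_swig_version()
    print("swig:", version,
          "ok:", version is not None and version >= MIN_SWIG)

    sample = (
        "Minibatch: 0, Train Loss: 0.7915553283691407, "
        "Train Evaluation Criterion: 0.48\n"
        "Minibatch: 20, Train Loss: 0.6266774368286133, "
        "Train Evaluation Criterion: 0.48\n"
    )
    for row in parse_log(sample):
        print(row)
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "3.0.8" would sort after "3.0.10".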

Documentation:

The API documentation is still in progress; detailed operator documentation and tutorials will become available very soon. Currently the main form of documentation is the docstrings, which are available for most of the Python APIs and are displayed by IntelliSense. The core C++ API header, CNTKLibrary.h, also has fairly detailed documentation for the entire API, in case you want to look up something that does not yet have a Python docstring even though its SWIG-generated Python wrapper is available for use.

Examples:

Currently, the best way to learn about the APIs is to look at the following examples in the [CNTK clone root]/bindings/python/examples directory:

  • MNIST: A fully connected feed-forward model for classification of MNIST images. (follow the instructions in Examples/Image/MNIST/README.md)

  • CifarResNet: An image classification ResNet model for training on the CIFAR image dataset. (follow the instructions in Examples/Image/Miscellaneous/CIFAR-10/README.md to get the CIFAR dataset and convert it to the CNTK supported format)

  • SequenceClassification: An LSTM sequence classification model for text data.

  • Sequence2Sequence: A sequence-to-sequence grapheme-to-phoneme translation model that trains on the CMUDict corpus.

  • NumpyInterop: A numpy interop example showing how to train a simple feed-forward network with training data fed using numpy arrays.
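To give a feel for the kind of data the NumpyInterop example works with, here is a minimal sketch of building one minibatch of features and one-hot labels as float32 numpy arrays. It is not taken from the example script: the function name, the synthetic data, and the sizes are assumptions made for illustration, and the CNTK training calls are left out because the alpha API is expected to change.

```python
import numpy as np


def make_minibatch(rng, batch_size, input_dim, num_classes):
    """Return (features, one_hot_labels) as float32 numpy arrays."""
    labels = rng.randint(0, num_classes, size=batch_size)
    # Shift each sample by its class index so the classes are separable.
    features = rng.randn(batch_size, input_dim) + labels[:, None]
    one_hot = np.zeros((batch_size, num_classes), dtype=np.float32)
    one_hot[np.arange(batch_size), labels] = 1.0
    return features.astype(np.float32), one_hot


rng = np.random.RandomState(0)  # fixed seed so runs are repeatable
x, y = make_minibatch(rng, batch_size=25, input_dim=2, num_classes=2)
print(x.shape, y.shape)  # (25, 2) (25, 2)
```

Each row of the label array has exactly one nonzero entry, which is the usual encoding for a classification target; arrays like these are what a numpy-fed training loop would consume one minibatch at a time.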

Known issues and limitations:

This is an alpha release meant for early users to try the bits and provide feedback on the usability and functional aspects of the API. These bits have undergone limited testing so far, so expect some rough edges. Also please expect the API to undergo changes over the coming weeks, which may break backwards compatibility of programs written against the alpha release.

  • Only a subset of the planned functionality is available; features like distributed training, automatic learning rate and minibatch size search, and API extensibility will become available over the next few weeks.

  • Python 2.7 support is currently unavailable but will be part of the upcoming beta release.

  • On Windows, only Python 3.4 is supported, not Python 3.5, since the latter requires Visual Studio 2015, which CNTK has not yet migrated to. This will also be addressed before the upcoming beta release.

  • The core API itself is implemented in C++ for speed and efficiency, and the Python bindings are created through SWIG. We are increasingly creating thin Python wrappers for the APIs to attach docstrings to, but this is a work in progress, and for some of the APIs you may directly encounter SWIG-generated API definitions (which are not the prettiest to read).

  • Shape and dimension inference support is currently unavailable and the shapes of all Variable objects have to be fully specified.
