CNTK 2.0 Python API
The first cut of the CNTK v2 Python and C++ APIs is now available. These APIs let you define CNTK models programmatically and drive their training and evaluation, using either the built-in data readers or user-supplied data in native Python NumPy or C++ arrays.
[Note: This is an alpha release meant for early users to try the bits and provide feedback on the usability and functional aspects of the API. There are currently a few known limitations and rough edges (listed at the bottom of this page) that are being addressed.]
[Note: If you previously installed an earlier version of the CNTK 2.0 Python pip package, you can skip steps 1 through 3 below and jump directly to step 4 to update your existing CNTK 2.0 package installation in your Python 3.5 (on Linux) or Python 3.4 (on Windows) environment.]
-
Follow the instructions on the CNTK GitHub wiki page CNTK Binary Download and Configuration to install the prerequisites for running a CNTK binary installation on your machine.
[Note: Please only follow the prerequisites section – download of the binaries is not required since they are part of the pip package you will install in the next step.]
-
On Linux, install Anaconda Python 3.5.
On Windows, the bindings currently work only with Python 3.4, so either install Anaconda Python 3.4 or create a Python 3.4.3 environment in your existing Python 3.5 Anaconda or Miniconda installation using the following commands:
conda create --name cntk python=3.4.3 numpy scipy
activate cntk
[Note: Make sure that the python version installed above is what you use for the remainder of the instructions.]
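As a quick sanity check for the note above, the following minimal sketch compares the running interpreter against the versions this page names (Python 3.5 on Linux, Python 3.4 on Windows). The `EXPECTED` table and the `check_python_version` helper are illustrative names and are not part of CNTK.

```python
import platform
import sys

# Versions named on this page; the table itself is an illustrative assumption.
EXPECTED = {"Linux": (3, 5), "Windows": (3, 4)}


def check_python_version(system=None, version=None):
    """Return (ok, message) for the given or current platform and Python version."""
    system = system or platform.system()
    if version is not None:
        version = tuple(version)
    else:
        version = tuple(sys.version_info[:2])
    expected = EXPECTED.get(system)
    if expected is None:
        return False, "No expected Python version listed for %s" % system
    ok = version == expected
    message = "%s: running Python %d.%d, expected %d.%d" % (
        (system,) + version + expected
    )
    return ok, message


if __name__ == "__main__":
    ok, message = check_python_version()
    print(("OK " if ok else "MISMATCH ") + message)
```

Running this script from the activated environment prints either an OK line or a MISMATCH line, which makes it easy to spot a shell that is still using a different interpreter.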
-
Upgrade pip: python -m pip install --upgrade pip
[Note: If you get an error about insufficient permissions, run the command from an elevated command prompt]
-
Download the CNTK 2.0 alpha2 pip package for your platform and unzip locally:
Install the package: pip install --upgrade [Local unzipped pip package path]
-
Get a clone (or update your existing clone) of the CNTK repository (master branch) to get the python examples and training data files used in these examples.
-
Include the examples directory in PYTHONPATH:
Windows: setx PYTHONPATH [CNTK repo root]\bindings\python\examples;%PYTHONPATH%
Linux: export PYTHONPATH=[CNTK repo root]/bindings/python/examples:$PYTHONPATH
-
Verify PYTHONPATH is appropriately updated (on Windows this will require launching a new command window) and run an example from inside the [CNTK clone root]/bindings/python directory to verify your installation:
python examples/NumpyInterop/FeedForwardNet.py
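One hedged way to confirm that PYTHONPATH was updated is to check, from a new shell, whether the examples directory appears among its entries. The sketch below assumes the `bindings/python/examples` layout named in the previous step; the helper name `examples_dir_on_pythonpath` is hypothetical, and the repository root is a value you supply yourself.

```python
import os


def examples_dir_on_pythonpath(repo_root, pythonpath=None):
    """Return True if <repo_root>/bindings/python/examples is a PYTHONPATH entry."""
    if pythonpath is None:
        pythonpath = os.environ.get("PYTHONPATH", "")

    def normalize(path):
        # Normalize separators and case so equivalent paths compare equal.
        return os.path.normcase(os.path.normpath(path))

    target = normalize(os.path.join(repo_root, "bindings", "python", "examples"))
    # PYTHONPATH entries are separated by os.pathsep (";" on Windows, ":" on Linux).
    entries = [normalize(p) for p in pythonpath.split(os.pathsep) if p]
    return target in entries
```

Call it with the path to your CNTK clone, for example `examples_dir_on_pythonpath(my_repo_root)`, where `my_repo_root` is a placeholder for your own path. A result of `False` on Windows usually means the command window was opened before `setx` was run.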
-
If you do not already have a CNTK development environment set up on your machine, follow the instructions on the CNTK GitHub wiki to do so.
-
Do a build of CNTK (Release flavor). On Linux you must build the GPU SKU.
-
Install SWIG:
Windows: SWIG 3.0.10
Linux: Run the [CNTK clone root]/bindings/python/cntk/swig/swig_install.sh script
-
On Linux, install Anaconda Python 3.5.
On Windows, the bindings currently work only with Python 3.4, so either install Anaconda Python 3.4 or create a Python 3.4.3 environment in your existing Python 3.5 Anaconda or Miniconda installation using the following commands:
conda create --name cntk python=3.4.3 numpy scipy
activate cntk
[Note: Make sure that the python version installed above is what you use for the remainder of the instructions.]
-
On Linux:
cd [CNTK clone root]/bindings/python/cntk/swig
./swig.sh
cd ../..
python ./setup.py build_ext
cp ./build/lib.linux-x86_64-3.5/_cntk_py.cpython-35m-x86_64-linux-gnu.so .
export PYTHONPATH=[CNTK clone root]/bindings/python:$PYTHONPATH
python examples/NumpyInterop/FeedForwardNet.py
If your build and setup succeeded, you should see the following output on the console:
Minibatch: 0, Train Loss: 0.7915553283691407, Train Evaluation Criterion: 0.48
Minibatch: 20, Train Loss: 0.6266774368286133, Train Evaluation Criterion: 0.48
Minibatch: 40, Train Loss: 1.0378565979003906, Train Evaluation Criterion: 0.64
Minibatch: 60, Train Loss: 0.6558118438720704, Train Evaluation Criterion: 0.56
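If you capture the console output to a file, a small parser can confirm that the progress lines have the format shown above. This is a minimal sketch: the `parse_progress` helper and the regular expression are illustrative, not part of CNTK, and the sketch checks only the line format, not the numeric values.

```python
import re

# Matches lines such as:
# "Minibatch: 0, Train Loss: 0.79, Train Evaluation Criterion: 0.48"
LINE_RE = re.compile(
    r"Minibatch:\s*(\d+),\s*Train Loss:\s*([0-9.eE+-]+),\s*"
    r"Train Evaluation Criterion:\s*([0-9.eE+-]+)"
)


def parse_progress(text):
    """Return a list of (minibatch, train_loss, eval_criterion) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            rows.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
    return rows


if __name__ == "__main__":
    sample = (
        "Minibatch: 0, Train Loss: 0.7915553283691407, "
        "Train Evaluation Criterion: 0.48\n"
        "Minibatch: 20, Train Loss: 0.6266774368286133, "
        "Train Evaluation Criterion: 0.48\n"
    )
    for row in parse_progress(sample):
        print(row)
```

An empty result from `parse_progress` on the captured output suggests the example did not reach the training loop.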
On Windows:
Follow the instructions in the [CNTK clone root]/bindings/python/readme.txt file to build the Python bindings. For the build to succeed, set the PYTHON_LIB and SWIG variables in the “bindings/python/cntk/swig/swig.bat” file to the corresponding installation locations on your machine.
Then follow the instructions in “section # a” of [CNTK clone root]/bindings/python/readme.txt to set up PATH and PYTHONPATH appropriately, and run the example from inside the [CNTK clone root]/bindings/python directory to verify your installation:
python examples/NumpyInterop/FeedForwardNet.py
The API documentation is still in progress; detailed operator documentation and tutorials will become available very soon. For now, the main form of documentation is the docstrings, which are available for most of the Python APIs and are displayed by IntelliSense. The core C++ API header, CNTKLibrary.h, also has fairly detailed documentation for the entire API, in case you want to look up something that does not yet have a Python docstring even though its SWIG-generated Python wrapper is available for use.
Currently, the best way to learn about the APIs is to look at the following examples in the [CNTK clone root]/bindings/python/examples directory:
-
MNIST: A fully connected feed-forward model for classification of MNIST images. (follow the instructions in Examples/Image/MNIST/README.md)
-
CifarResNet: An image classification ResNet model for training on the CIFAR image dataset. (Follow the instructions in Examples/Image/Miscellaneous/CIFAR-10/README.md to get the CIFAR dataset and convert it to the CNTK-supported format.)
-
SequenceClassification: An LSTM sequence classification model for text data.
-
Sequence2Sequence: A sequence-to-sequence grapheme-to-phoneme translation model that trains on the CMUDict corpus.
-
NumpyInterop: A NumPy interop example showing how to train a simple feed-forward network with training data fed using NumPy arrays.
This is an alpha release meant for early users to try the bits and provide feedback on the usability and functional aspects of the API. These bits have undergone limited testing so far, so expect some rough edges. Also please expect the API to undergo changes over the coming weeks, which may break backwards compatibility of programs written against the alpha release.
-
Only a subset of the planned functionality is available; features like distributed training, automatic learning rate (LR) and minibatch (MB) size search, and API extensibility will become available over the next few weeks.
-
Python 2.7 support is currently unavailable but will be part of the upcoming beta release.
-
On Windows, only Python 3.4 is supported, not Python 3.5, because the latter requires Visual Studio 2015, to which CNTK has not yet migrated. This will also be addressed before the upcoming beta release.
-
A known memory leak related message appears upon the first use of the API:
swig/python detected a memory leak of type 'std::vector< CNTK::Parameter,std::allocator< CNTK::Parameter > > ', no destructor found.
-
The core API itself is implemented in C++ for speed and efficiency, and the Python bindings are created through SWIG. We are increasingly creating thin Python wrappers for the APIs to attach docstrings to, but this is a work in progress, and for some of the APIs you may directly encounter SWIG-generated API definitions (which are not the prettiest to read).
-
Shape and dimension inference support is currently unavailable and the shapes of all Variable objects have to be fully specified.
-
There is a known issue with the current model saving functionality due to which the names of Functions and Variables are not retained across save and load.