Iteration Plan (September - October 2017) #2410

cha-zhang · 2017-09-25T22:44:59Z

This plan captures our work from mid September to end of October. We will ship around November 22nd. Major work items of this iteration include ONNX support in CNTK, MKL integration, and many others.

Endgame

November 8: Code freeze for the end game
November 22: Release date

Planned items

We plan to ship these items at the end of this iteration.

Legend of annotations:

Icon	Description
	Item not started
	Item finished
🏃	Work in progress
✋	Blocked
💪	Stretch

Documentation

Finalize learner design and fix related documentation

System

Support import/export ONNX format models
A network optimization API that helps model compression via SVD, quantization, etc.
16bit support for training on Volta GPU (limited functionality)
C# high-level API design (no implementation)
Reader improvement for large data sets (sequential reader)

Examples

Faster R-CNN object detection
- Clean up the code to use arbitrary input image size
- C++ implementation of some Python layers
- Usability improvement
New example for natural language processing (NLP)
New tutorial on WGAN and LS-GAN
Semantic segmentation (stretch goal)

Operations

Specify frequency in the number of epochs and minibatches for progress report, validation, checkpoints
Improve statistics for distributed evaluation

Performance

Intel MKL update to improve inference speed on CPU by around 2x on AlexNet

Others

Continue work on Deep Learning Explained course on edX.

kyoro1 · 2017-09-26T00:24:50Z

@cha-zhang Can we assume that parallel learning for Faster R-CNN will be implemented in this sprint?
I posted my comments on Fast R-CNN in the issue. I'm not tied to that issue, though; I'd like to know whether "Faster R-CNN" can include an even faster implementation in this sprint :)

arijit17 · 2017-09-26T06:36:07Z

> Continue work on Deep Learning Explained course on edX.

Does it mean an advanced course is coming up?

grzsz · 2017-09-26T09:16:05Z

Will the new release be available for .NET Core 2.0?

cha-zhang · 2017-09-26T14:14:13Z

@arijit17 No, we are not working on an advanced course at this moment. It's there just to indicate some routine maintenance needed for the course.

cha-zhang · 2017-09-26T14:15:39Z

@kyoro1 Yes, faster implementation is on the roadmap, but we first want to achieve full parity.

cha-zhang · 2017-09-26T14:17:24Z

@grzsz We are making some fixes for the C# low-level API as well during this iteration (didn't mention above). .netcore2.0 compatibility is not a very high priority at this moment. How important is this?

helloguo · 2017-09-26T16:58:05Z

> We are making some fixes for the C# low-level API as well during this iteration (didn't mention above).

@cha-zhang Is this C# support a language binding, or will the APIs be implemented in C#?

cha-zhang · 2017-09-26T17:01:05Z

@helloguo The C# API is a SWIG-generated binding.

helloguo · 2017-09-26T17:59:11Z

@cha-zhang Thank you for your clarification.

The example Evaluation code shows the target framework is .NET Framework, which is Windows only. So can I assume these C# APIs are Windows only at this moment? If yes, are you planning to support Linux as well (e.g. using .NET Core since it supports Windows, Linux and macOS)?

liqunfu · 2017-09-26T20:34:33Z

@helloguo People have raised this .NET Core issue in #2346 and #2352. We are investigating. We are not sure whether we can get it into this release, but if we can, we will update this iteration plan.

Dozer3D · 2017-09-26T22:04:39Z

Regarding the usability improvements to Faster R-CNN, would this include a GPU-enabled version of the proposal layer UDF? Otherwise I find the Faster R-CNN example already quite usable as it is. Since the 'STORE_EVAL_MODEL_WITH_NATIVE_UDF' option was added, it has everything you need to include it in a native C++ Windows-based product, for example (i.e. without the need for Python dependencies). The only problem is that evaluation is very slow because we are stuck using the CPU.

main76 · 2017-09-27T03:24:42Z

> A network optimization API that helps model compression via SVD, quantization, etc.

Awesome! Is there a way to get early access?

cha-zhang · 2017-09-27T03:54:46Z

@master76 We have some prototype code, but it is not written as a CNTK API. So the answer to your question is no; you will have to wait until the end of the iteration. Thanks!
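
For readers unfamiliar with the quantization half of that work item, here is a minimal sketch in plain Python. It shows generic uniform 8-bit quantization of a weight list and is not the CNTK API or the team's prototype; the function names `quantize` and `dequantize` and the sample weights are hypothetical, chosen only for illustration.

```python
def quantize(weights, num_bits=8):
    """Map floats to unsigned integer codes with uniform (affine) quantization.

    Hypothetical helper for illustration; not part of CNTK.
    """
    lo, hi = min(weights), max(weights)
    levels = (1 << num_bits) - 1           # 255 distinct steps for 8 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [lo + c * scale for c in codes]


# Made-up weights, standing in for one layer's parameters.
weights = [-0.51, -0.02, 0.0, 0.13, 0.77]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)

# Rounding to the nearest code bounds the error by about half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

Each weight is stored as one byte plus a shared `scale` and offset, which is where the compression comes from; the price is a reconstruction error of at most roughly `scale / 2` per weight.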

grzsz · 2017-09-27T05:36:30Z

> @grzsz We are making some fixes for the C# low-level API as well during this iteration (didn't mention above). .netcore2.0 compatibility is not a very high priority at this moment. How important is this?

@cha-zhang
As with everything, it depends :) I can use C++/Python, but I suppose many people want or have to stick to .NET Core 2 and will choose a competitor or a home-made solution, even though CNTK was their first choice because of its assumed platform support.

JimSEOW · 2017-09-27T09:09:09Z

@cha-zhang
Can you please elaborate on "Continue work on Deep Learning Explained course on edX."?

Is there a plan or milestone?
edX's CNTK course is an important way to promote and explain CNTK's "comprehensive extensive coverage of Deep Learning Topics".

It could be useful to use this thread to get feedback on what goes into the edX course.

Use this thread or a dedicated one to discuss:

what has gone in so far,
what users think about it,
what new topics are yet to be included.

rhy-ama · 2017-09-27T11:37:14Z

#2422

What is the medium-term plan for NN debugging facilities?

Can we output a few more metrics using the existing TensorBoard facilities in the next release under "improve statistics for distributed evaluation"? A good start would be a weight histogram.

cha-zhang · 2017-09-27T16:49:22Z

@JimSEOW Sure let's create a dedicated thread for edX course.

As I mentioned earlier, for this iteration, we are just doing maintenance. Maybe I'll remove it from the list.

clintjcampbell · 2017-09-28T22:06:49Z

Does ONNX mean that the model format will stabilize in the near future, so that models I have already trained will continue to work with future versions of CNTK? At least once ONNX is implemented?

cha-zhang · 2017-09-29T03:25:22Z

@clintjcampbell Yes, once ONNX is implemented the format will be stable. ONNX itself is still evolving, but in a few weeks it should stabilize and be backward compatible.

cha-zhang · 2017-09-29T03:30:06Z

@rhy-ama A weight histogram is not part of "improve statistics for distributed evaluation". That item specifically refers to improving the printed training statistics during distributed evaluation.

NN debugging facility is not in the current plan. The team is busy delivering a major milestone that sets a few things to relatively lower priority. If someone could contribute this, it would be great!

e-thereal · 2017-10-10T11:07:05Z

On the note of Improve statistics: It was possible in BrainScript to specify multiple metrics that were all evaluated and reported during training, but it seems that you can only monitor the loss and one metric using the Python API. It would be great to add the old BrainScript feature of multiple metrics back to the Python API.

skynode · 2017-10-16T00:35:25Z

> We are making some fixes for the C# low-level API as well during this iteration (didn't mention above). .netcore2.0 compatibility is not a very high priority at this moment. How important is this?

This is super important to us. We would like to be able to reuse and maintain C# across the dev spectrum especially for business continuity. Plus there are performance improvements on .NET Core 2.0 which we would like to take advantage of without further optimization of our codebase. Please consider making it high priority.

Thank you for your time and efforts!

cha-zhang · 2017-10-16T03:48:21Z

@skynode Please refer to #2352.

mhjabreel · 2017-10-19T17:15:08Z

Hi @cha-zhang,

I am willing to implement a high-level API for C#. In fact, I have already started, and I have implemented the following layers:

Linear
Convolution: Conv1D, Conv2D and Conv3D
Pooling: Max(Pool1D, Pool2D and Pool3D) and Avg(Pool1D, Pool2D and Pool3D)

You can find it in this link:
https://github.com/mhjabreel/DeepSharp

Regards,

Mohammed

cha-zhang · 2017-10-25T16:32:54Z

Hi, we have to postpone the release date for this iteration to Nov. 14. We added one week to wrap up a few features under implementation, and another week to fix some bugs reported in GitHub issues. Sorry for the delay!

IvanFarkas · 2017-10-28T18:36:33Z

I highly recommend the Deep Learning Explained course on edX.
Waiting patiently for the advanced course.

.NET Core 2.0 support is very important.
I hope CUDA 9 support and VS 2017 build is part of this iteration.

mstockfo · 2017-11-08T18:19:42Z

Does "C++ implementation of some Python layers" for Faster R-CNN object detection include GPU-enabled evaluation from C#?

ddurschlag · 2017-11-12T17:53:49Z

These features sound awesome. Are we still looking at getting them sometime this week? Is there a list of open issues for the release that someone who knows C# well could contribute to?

cha-zhang · 2017-11-12T18:31:21Z

The new ship date for v2.3 is Nov. 14, as updated in the message above.

The C# high-level API design task is now blocked due to internal deadlines. We encourage the community to build high level API on top of the current low level one and share. You may use a similar design as CNTK's high level API, or feel free to mimic other high level APIs such as Keras/Gluon.

Starting next iteration, we will be making some changes to the release procedure. We are working hard to enable nightly releases (ETA before the end of this year). Official releases will then be done as needed. Please comment if you have suggestions. Thanks!

bencherian · 2017-11-14T18:36:47Z

Is 2.3 release still planned for today?

ebarsoumMS · 2017-11-14T18:39:53Z

No, it has been delayed by one week. We will release it on Nov. 22 because of some changes that we need to include.

whatever1983 · 2017-11-14T19:06:19Z

Well, that is a bummer. You might as well delay it until you are ready to release CUDA 9, cuDNN 7, and stable fp16 training. It is pretty amazing that MXNet 0.12 beat both CNTK and TensorFlow on CUDA 9 fp16 support, although it lacks Keras 2.0 support.

ebarsoumMS · 2017-11-14T19:09:20Z

Cuda9 and cuDNN7 will follow next.

Dozer3D · 2017-11-14T19:41:24Z

@ebarsoumMS , thank you for keeping us informed. The iteration plan included three improvements to the Faster RCNN example:

Clean up the code to use arbitrary input image size
C++ implementation of some Python layers
Usability improvement

Have these made it into the upcoming release?

ebarsoumMS · 2017-11-14T19:45:51Z

Adding @spandantiwari to comment. Arbitrary input image size is in, and we fixed most ops to work with arbitrary sizes.

spandantiwari · 2017-11-14T19:58:48Z

@Dozer3D - In this iteration we have worked quite a bit to support free static axes (arbitrary input image size) in convolutional pipelines. Convolution, pooling and other nodes that may be used in a typical pipeline now support free static axes, and we have improved the performance of convolution with free static axes. However, FasterRCNN training using free static axes is not completely ready yet; we are still testing it to match the numbers stated in the paper. The C++ implementation of ProposalLayer.py is also in the works. These will most probably not make it into the 2.3 release. Having said that, this model and making it work fast (especially inference) is still on our priorities.
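
To illustrate why convolution and pooling can work with an arbitrary input image size, here is a small sketch in plain Python. It only applies the standard output-size arithmetic for a sliding window and does not use CNTK; the helper names `output_size` and `pipeline_shape` and the two-layer pipeline are hypothetical.

```python
def output_size(input_size, kernel, stride=1, pad=0):
    """Output length along one axis for a convolution or pooling window."""
    return (input_size + 2 * pad - kernel) // stride + 1


def pipeline_shape(height, width):
    """Hypothetical pipeline: 3x3 convolution (pad 1), then 2x2 pooling (stride 2)."""
    h = output_size(height, kernel=3, stride=1, pad=1)
    w = output_size(width, kernel=3, stride=1, pad=1)
    h = output_size(h, kernel=2, stride=2)
    w = output_size(w, kernel=2, stride=2)
    return h, w


# The same layer parameters apply to any image size; only the output shape changes.
shapes = {size: pipeline_shape(*size) for size in [(224, 224), (600, 850)]}
print(shapes)
```

Because the kernel weights do not depend on the spatial size of the input, the output shape can be derived when the data arrives instead of being fixed when the network is built.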

ddurschlag · 2017-11-14T20:12:20Z

@ebarsoumMS My understanding is that CUDA 9 is required to eliminate .NET Framework dependencies and provide a .NET Standard version of CNTK. Is that correct? If so, is that likely to happen for 2.3 next week, or at some future point? If at a future point, is there any estimate of when?

Being able to use CNTK effectively in a container would be super useful, and my impression was this wasn't TOO far away...

Dozer3D · 2017-11-15T00:33:52Z

@spandantiwari thank you for that informative reply. We have created two datasets and trained faster RCNN networks with CNTK 2.2 to solve three problems for a client, but currently only one of these is usable without the GPU, and then only just. Having faster GPU and faster CPU inference would be much appreciated (I assume decreasing the input image size would also speed up the CPU processing)

So nothing for us in 2.3, but a good chance of something before, say, the end of January?

> Having said that, this model and making it work fast (especially inference) is still on our priorities.

Thank you. As a traditional Windows programmer/solutions provider who knows very little about machine learning, I find Faster RCNN to be a very practical tool for solving many real problems for our customers.

mathias-brandewinder · 2017-11-15T05:20:07Z

@cha-zhang looking forward to the next release :)

Given that you "encourage the community to build high level API on top of the current low level one and share", I figure I would mention that I started working with some F# community members on exploring what a high-level, script-friendly F# DSL on top of CNTK could look like.

Got some of the C# samples converted to F# scripts already, very close to the original C# version here:

https://github.com/mathias-brandewinder/CNTK.FSharp/tree/master/examples

... and currently trying out something loosely Keras-inspired. Plenty of rough edges, and I'm not sure yet if the direction is right, but here is what the MNIST CNN sample looks like as of today, with the interesting part highlighted:

https://github.com/mathias-brandewinder/CNTK.FSharp/blob/a0e9794697afacce65c95c66f5d899a9dd71cbf7/examples/MNIST-CNN.fsx#L89-L123

kodonnell · 2017-11-15T07:58:20Z

@spandantiwari - we're also exploring FasterRCNN. If the improvements aren't going to be released in the next week or so, could you please create a document somewhere with a recommended approach? I'm new to CNTK, but with some direction I may be able to help (especially if there are some examples e.g. 'convert the python layers [files <...>] to C++ in the same way as was done for PR <...>' ... or 'see C++ layer <...> for an example').

cha-zhang · 2017-11-22T17:38:10Z

For those of you who are exploring FasterRCNN, we have a branch chazhang/faster_rcnn that updates the Faster RCNN with free static axis. The code is tangled with Fast RCNN, and Fast RCNN hasn't been verified, so we won't release it in this iteration. On the other hand, FasterRCNN is now functional with arbitrary input image size, tested on Pascal data set. We don't see much accuracy improvement with this, though.

Most code was actually contributed by @spandantiwari. Thanks!

kodonnell · 2017-11-22T18:33:05Z

Thanks @cha-zhang . Could you please provide feedback on the best way to implement some of the C++ layers as per here?

As an aside, pip installs in code might want reconsidering before merging.

cha-zhang · 2017-11-22T18:39:07Z

@kodonnell Are you asking about using C++ to implement the proposal layer instead of Python?

kodonnell · 2017-11-22T19:00:14Z

@cha-zhang I'm referring to the original iteration plan:

> C++ implementation of some Python layers

I don't even know what those layers are, hence why I'm asking for a starter = ) From other issues I've read, it sounds like implementing this will make evaluation of Faster RCNN a lot faster.

cha-zhang · 2017-11-22T19:34:38Z

Yes, that's the proposal layer. The current custom proposal layer is in Python and can be written in C++ instead.

You can refer to the binary convolution example for how to write a C++ custom layer:
https://github.com/Microsoft/CNTK/tree/master/Examples/Extensibility/BinaryConvolution

Dozer3D · 2017-11-22T19:38:34Z

I am confused now :-(

> The current custom proposal layer is in Python and can be written in C++ instead.

It is my understanding that evaluation using C++ only (no Python) already works and was implemented in 2.2 using a UDF by @pkranen (see #2234).

i.e. set __C.STORE_EVAL_MODEL_WITH_NATIVE_UDF = True

This does seem to work, except it runs on the CPU only (very slow), and not GPU. If you set the device to a GPU it throws an exception because the GPU version of that layer hasn't been written.

i.e. in the file "cntk\Examples\Extensibility\ProposalLayer\ProposalLayerLib\ProposalLayerLib.h" we have the following code:

    if (computeDevice.Type() != DeviceKind::CPU)
        throw std::runtime_error("ProposalLayer: only CPU evaluation is supported at the moment.");

cha-zhang · 2017-11-22T19:44:20Z

@Dozer3D I think I was referring to training. If eval only, then yes, we have a C++ version already.

We are not satisfied with the training speed of Faster RCNN. More work is needed.

kodonnell · 2017-11-22T20:19:51Z

@cha-zhang - might it pay to start a new issue (or update the docs somewhere) to have a single place referring to all the improvements intended for Faster RCNN (with some useful detail to encourage PRs), so it's a little clearer? There are quite a few threads (including the 'pollution' of this one) which I, for one, find hard to follow.

sigfrid696 · 2019-06-07T15:39:47Z

> I am confused now :-(
>
> The current custom proposal layer is in Python and can be written in C++ instead.
>
> It is my understanding that evaluation using C++ only (no Python) already works and was implemented in 2.2 using a UDF by @pkranen (see #2234).
>
> i.e. set __C.STORE_EVAL_MODEL_WITH_NATIVE_UDF = True
>
> This does seem to work, except it runs on the CPU only (very slow), and not GPU. If you set the device to a GPU it throws an exception because the GPU version of that layer hasn't been written.
>
> i.e. in the file "cntk\Examples\Extensibility\ProposalLayer\ProposalLayerLib\ProposalLayerLib.h" we have the following code:
>
>     if (computeDevice.Type() != DeviceKind::CPU)
>         throw std::runtime_error("ProposalLayer: only CPU evaluation is supported at the moment.");

Is there any GPU support in the ProposalLayerLib C++ implementation?
I'm running CNTK 2.7 and it seems there still isn't any.
When is this support planned for release?

cha-zhang added the iteration plan label Sep 29, 2017

cha-zhang mentioned this issue Sep 29, 2017

Evaluating FasterRCNN error #2234

Closed

cesarsouza mentioned this issue Nov 21, 2017

SampleApp goes to break mode in the last version (CNTK backend) cesarsouza/keras-sharp#13

Open

ebarsoumMS closed this as completed Dec 2, 2017
