Commit

Merge pull request #1 from NVIDIA/master
update repo
gkucsko authored Jun 2, 2020
2 parents 142bed9 + 78b6bef commit 79bc099
Showing 501 changed files with 55,926 additions and 11,293 deletions.
9 changes: 9 additions & 0 deletions .gitignore
@@ -8,6 +8,8 @@ result
tests/data/asr
.DS_Store
bert.pt.json
work
fastspeech_output

# Byte-compiled / optimized / DLL files
__pycache__/
@@ -21,6 +23,7 @@ __pycache__/
# Distribution / packaging
.idea
.Python
wandb
build/
develop-eggs/
dist/
@@ -152,3 +155,9 @@ examples/*/wandb
examples/*/data
wandb
dump.py

docs/sources/source/test_build/

# Checkpoints, config files and temporary files created in tutorials.
examples/neural_graphs/*.chkpt
examples/neural_graphs/*.yml
2 changes: 0 additions & 2 deletions .lgtm.yml

This file was deleted.

31 changes: 31 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,31 @@
# =============================================================================
# Copyright (c) 2020 NVIDIA. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# =============================================================================

# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required field.
version: 2

# Build documentation in the docs/ directory with Sphinx.
sphinx:
  configuration: docs/sources/source/conf.py

# Set the version of Python and requirements required to build your docs
python:
  version: 3.7
  install:
    - requirements: requirements/requirements_docs.txt
72 changes: 65 additions & 7 deletions CHANGELOG.md
@@ -70,14 +70,69 @@ To release a new version, please update the changelog as followed:
## [Unreleased]

### Added
- Added NeMoModels class. Implemented in ASR collection: ASRConvCTCModel, and QuartzNet and JasperNet as its children - @okuchaiev
- Added multi-dataset data-layer and dataset.
([PR #538](https://github.com/NVIDIA/NeMo/pull/538)) - @yzhang123
- Online Data Augmentation for ASR Collection. ([PR #565](https://github.com/NVIDIA/NeMo/pull/565)) - @titu1994
- Speed augmentation on CPU, TimeStretch augmentation on CPU+GPU ([PR #594](https://github.com/NVIDIA/NeMo/pull/594)) - @titu1994
- Added TarredAudioToTextDataLayer, which allows for loading ASR datasets with tarred audio. Existing datasets can be converted with the `convert_to_tarred_audio_dataset.py` script. ([PR #602](https://github.com/NVIDIA/NeMo/pull/602))
- Online audio augmentation notebook in ASR examples ([PR #605](https://github.com/NVIDIA/NeMo/pull/605)) - @titu1994
- ContextNet Encoder + Decoder Initial Support ([PR #630](https://github.com/NVIDIA/NeMo/pull/630)) - @titu1994
- Added finetuning with Megatron-LM ([PR #601](https://github.com/NVIDIA/NeMo/pull/601)) - @ekmb
- Added documentation for 8 kHz model ([PR #632](https://github.com/NVIDIA/NeMo/pull/632)) - @jbalam-nv


### Changed
- quartznet and jasper ASR examples reworked into speech2text.py and speech2text_infer.py - @okuchaiev
- Syncs across workers at each step to check for NaN or inf loss. Terminates all workers if stop\_on\_nan\_loss is set (as before), lets Apex deal with it if apex.amp optimization level is O1 or higher, and skips the step across workers otherwise. ([PR #637](https://github.com/NVIDIA/NeMo/pull/637)) - @redoctopus
- Updated the callback system. Old callbacks will be deprecated in version 0.12. ([PR #615](https://github.com/NVIDIA/NeMo/pull/615)) - @blisc
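
The NaN/inf handling described in the entry for PR #637 can be read as a small decision table. The sketch below is only an illustration of that wording, not the NeMo implementation: `NanAction`, `decide_nan_action`, and the list of per-worker losses are hypothetical names, and the cross-worker sync is simulated with a plain Python list instead of a real distributed all-reduce.

```python
import math
from enum import Enum


class NanAction(Enum):
    """Hypothetical labels for the outcomes described in the changelog entry."""

    CONTINUE = "continue"          # every worker reported a finite loss
    TERMINATE = "terminate"        # stop_on_nan_loss is set
    DEFER_TO_AMP = "defer_to_amp"  # apex.amp level O1 or higher handles it
    SKIP_STEP = "skip_step"        # skip this step on all workers


def decide_nan_action(worker_losses, stop_on_nan_loss, amp_opt_level="O0"):
    """Pick one action for all workers from their per-step losses.

    Assumption: `worker_losses` stands in for the result of a cross-worker
    sync; in real distributed training this would be a collective operation.
    """
    any_bad = any(not math.isfinite(loss) for loss in worker_losses)
    if not any_bad:
        return NanAction.CONTINUE
    if stop_on_nan_loss:
        return NanAction.TERMINATE
    # Opt levels are written "O0".."O3"; "O1 or higher" means the digit is >= 1.
    if int(amp_opt_level[1:]) >= 1:
        return NanAction.DEFER_TO_AMP
    return NanAction.SKIP_STEP
```

Because the decision is computed from the combined view of all workers, every worker reaches the same action, which is the point of syncing at each step.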

### Dependencies Update

### Deprecated

### Fixed

### Removed

### Security

### Contributors

## [0.10.2] - 2020-05-05

### Added
- The Neural Graph is a high-level abstract concept empowering the users to build graphs consisting of many, interconnected Neural Modules. A user in his/her application can build any number of graphs, potentially spanning over the same modules. The import/export options combined with the lightweight API make Neural Graphs a perfect tool for rapid prototyping and experimentation. ([PR #413](https://github.com/NVIDIA/NeMo/pull/413)) - @tkornuta

## [0.10.0] - 2020-04-03

### Added
- Roberta and Albert support added to GLUE script, data caching also added.
([PR #413](https://github.com/NVIDIA/NeMo/pull/413)) - @ekmb
- text classification notebook added
([PR #382](https://github.com/NVIDIA/NeMo/pull/382)) - @ericharper
- New Neural Type System documentation. Also added decorator to generate docs for input/output ports.
([PR #370](https://github.com/NVIDIA/NeMo/pull/370)) - @okuchaiev
- New Neural Type System and its tests.
([PR #307](https://github.com/NVIDIA/NeMo/pull/307)) - @okuchaiev
- Named tensors tuple module's output for graph construction.
([PR #268](https://github.com/NVIDIA/NeMo/pull/268)) - @stasbel
- Introduced the `deprecated` decorator.
([PR #298](https://github.com/NVIDIA/NeMo/pull/298)) - @tkornuta-nvidia
- Implemented new mechanisms for importing and exporting of module configuration (init_params) to configuration (yml)
files, along with unit tests, examples and tutorials
([PR #339](https://github.com/NVIDIA/NeMo/pull/339)) - @tkornuta-nvidia
- Speech Commands support.
([PR #375](https://github.com/NVIDIA/NeMo/pull/375)) - @titu1994
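
To illustrate what a `deprecated` decorator like the one listed above typically does, here is a minimal sketch built only on the standard library. It is not NeMo's implementation (the changelog notes that NeMo's version depends on `wrapt`); the parameters `version` and `explanation` and the function `old_add` are assumptions made for this example.

```python
import functools
import warnings


def deprecated(version=None, explanation=None):
    """Hypothetical decorator factory: warn each time the wrapped function runs."""

    def decorator(func):
        message = f"Function `{func.__name__}` is deprecated"
        if version is not None:
            message += f" and will be removed in version {version}"
        message += "."
        if explanation:
            message += f" {explanation}"

        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller, not at this wrapper
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated(version="0.12", explanation="Use `new_add` instead.")
def old_add(a, b):
    return a + b
```

Calling `old_add(1, 2)` still returns the sum, and additionally emits a `DeprecationWarning` that callers can surface with the standard `warnings` filters.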

### Changed
- Refactoring of `nemo_nlp` collections:
([PR #368](https://github.com/NVIDIA/NeMo/pull/368)) - @VahidooX, @yzhang123, @ekmb
- renaming and restructuring of files, folder, and functions in `nemo_nlp`
- losses cleaned up. LossAggregatorNM moved to nemo/backends/pytorch/common/losses
([PR #316](https://github.com/NVIDIA/NeMo/pull/316)) - @VahidooX, @yzhang123, @ekmb
- renaming and restructuring of files, folder, and functions in `nemo_nlp`
- Updated licenses
- All collections changed to use New Neural Type System.
([PR #307](https://github.com/NVIDIA/NeMo/pull/307)) - @okuchaiev
- Additional Collections Repositories merged into core `nemo_toolkit` package.
@@ -86,28 +141,30 @@ To release a new version, please update the changelog as followed:
([PR #284](https://github.com/NVIDIA/NeMo/pull/284)) - @stasbel
- NeMo is no longer using pep8 code style rules. Code style rules are now enforced with `isort` and `black` incorporated into CI checks.
([PR #286](https://github.com/NVIDIA/NeMo/pull/286)) - @stasbel
- Major cleanup of Neural Module constructors (init), aiming at increasing the framework robustness: cleanup of NeuralModule initialization logic, refactor of trainer/actions (getting rid of local_params), fixes of several examples and unit tests, extraction and storing of initial parameters (init_params).
- Major cleanup of Neural Module constructors (init), aiming at increasing the framework robustness: cleanup of NeuralModule initialization logic, refactor of trainer/actions (getting rid of local_params), fixes of several examples and unit tests, extraction and storing of initial parameters (init_params).
([PR #309](https://github.com/NVIDIA/NeMo/pull/309)) - @tkornuta-nvidia
- Refactoring of `nemo_nlp` collections:
([PR #316](https://github.com/NVIDIA/NeMo/pull/316)) - @VahidooX, @yzhang123, @ekmb
- renaming of files and restructuring of folder in `nemo_nlp`
- Updated licenses
- Updated nemo's use of the logging library. `from nemo import logging` is now the recommended way of using the nemo logger. `neural_factory.logger` and all other instances of logger are now deprecated and planned for removal in the next version. Please see PR 267 for complete change information.
([PR #267](https://github.com/NVIDIA/NeMo/pull/267), [PR #283](https://github.com/NVIDIA/NeMo/pull/283), [PR #305](https://github.com/NVIDIA/NeMo/pull/305), [PR #311](https://github.com/NVIDIA/NeMo/pull/311)) - @blisc
- Changed Distributed Data Parallel from Apex to Torch
([PR #336](https://github.com/NVIDIA/NeMo/pull/336)) - @blisc

- Added TRADE (dialogue state tracking model) on MultiWOZ dataset
([PR #322](https://github.com/NVIDIA/NeMo/pull/322)) - @chiphuyen, @VahidooX
- Question answering:
([PR #390](https://github.com/NVIDIA/NeMo/pull/390)) - @yzhang123
- Changed question answering task to use Roberta and Albert as alternative backends to Bert
- Added inference mode that does not require ground truth labels

### Dependencies Update
- Added dependency on `wrapt` (the new version of the `deprecated` warning) - @tkornuta-nvidia, @DEKHTIARJonathan

### Deprecated

### Fixed
- Critical fix of the training action on CPU
- Critical fix of the training action on CPU
([PR #308](https://github.com/NVIDIA/NeMo/pull/309)) - @tkornuta-nvidia
- Fixed issue in Tacotron 2 prenet
([PR #444](https://github.com/NVIDIA/NeMo/pull/444)) - @blisc

### Removed
- gradient_predivide_factor arg of train() now has no effect
@@ -166,7 +223,8 @@ This release also includes nemo_asr'' and nemo_nlp'' collections for Speech Reco

Please refer to the documentation here: https://nvidia.github.io/NeMo/

[Unreleased]: https://github.com/NVIDIA/NeMo/compare/v0.9.0...master
[Unreleased]: https://github.com/NVIDIA/NeMo/compare/v0.10.0...master
[0.10.0]: https://github.com/NVIDIA/NeMo/compare/v0.9.0...v0.10.0
[0.9.0]: https://github.com/NVIDIA/NeMo/compare/v0.8.2...v0.9.0
[0.8.2]: https://github.com/NVIDIA/NeMo/compare/v0.8.1...v0.8.2
[0.8.1]: https://github.com/NVIDIA/NeMo/compare/r0.8...v0.8.1
14 changes: 8 additions & 6 deletions CONTRIBUTING.md
@@ -4,9 +4,9 @@

2) Make sure you sign your commits. E.g. use ``git commit -s`` when committing

3) Make sure all unittests finish successfully before sending PR
3) Make sure all unittests finish successfully before sending PR: run ``python -m unittest`` from NeMo's root folder

4) Send your Pull Request to `master` branch
4) Send your Pull Request to the `master` branch


# Collection Guidelines
@@ -28,9 +28,8 @@ Please note that CI needs to pass for all the modules and collections.
1. **Sensible**: code should make sense. If you think a piece of code might be confusing, write comments.

## Python style
We follow [PEP 8 style guide](https://www.python.org/dev/peps/pep-0008/) and we incorporate [pycodestyle](https://pypi.org/project/pycodestyle/) into our CI pipeline to check for style. Make sure that your code passes PEP 8 before creating a Pull Request.

There are several tools to automatically format your code to be PEP 8 compliant, such as [autopep8](https://github.com/hhatto/autopep8). Your text editor might support its own auto PEP 8 plugin.
We use ``black`` as our style guide. To check whether your code will pass the style check, run ``python setup.py style`` from the NeMo repo folder; if it does not pass, run ``python setup.py style --fix``.

1. Avoid wild import: ``from X import *`` unless in ``X.py``, ``__all__`` is defined.
1. Minimize the use of ``**kwargs``.
@@ -47,7 +46,10 @@ There are several tools to automatically format your code to be PEP 8 compliant,
1. If a comment lasts multiple lines, use ``'''`` instead of ``#``.
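
The rule about wild imports and ``__all__`` can be seen in a small, self-contained sketch. Everything here is hypothetical and exists only for illustration: the in-memory module ``shapes_demo`` and its functions are not part of NeMo. The sketch shows that when ``__all__`` is defined, ``from X import *`` brings in only the listed names.

```python
import sys
import types

# Source of a throwaway module (hypothetical name: shapes_demo).
# Only `area` is listed in __all__, so only `area` is exported by `import *`.
source = '''
__all__ = ["area"]

def area(w, h):
    return w * h

def helper():
    return "public name, but not listed in __all__"
'''

# Register the module in memory so the import statement below can find it.
module = types.ModuleType("shapes_demo")
exec(source, module.__dict__)
sys.modules["shapes_demo"] = module

# Run the wildcard import in a fresh namespace and inspect what arrived.
namespace = {}
exec("from shapes_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
```

Without ``__all__``, the same wildcard import would also pull in ``helper``, which is why the guideline allows ``from X import *`` only when ``X.py`` declares its public names explicitly.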

## Nemo style
1. If you import a module from the same collection, use relative path instead of absolute path. For example, inside ``nemo_nlp``, use ``.utils`` instead of ``nemo_nlp.utils``.
1. Use absolute paths.
1. Before accessing something, always make sure that it exists.
1. Right inheritance. For example, if a module doesn't have any trainable weights, don't inherit from TrainableNM.
1. Naming consistency, both within NeMo and between NeMo and external literature. E.g. use the name ``logits`` for ``log_probs``, ``hidden_size`` for ``d_model``.
1. Make an effort to use the right Neural Types when designing your neural modules. If a type you need does not
exist, you can introduce one. See the documentation on how to do this.
1. When creating input/output ports for your modules, use the "add_port_docs" decorator to nicely generate docs for them.
24 changes: 17 additions & 7 deletions Dockerfile
@@ -30,14 +30,23 @@ RUN apt-get update && \
python-dev && \
rm -rf /var/lib/apt/lists/*

# install onnx trt open source plugins
# install trt
ENV PATH=$PATH:/usr/src/tensorrt/bin
WORKDIR /tmp/onnx-trt
COPY scripts/docker/onnx-trt.patch .
RUN git clone -n https://github.com/onnx/onnx-tensorrt.git && cd onnx-tensorrt && \
git checkout 8716c9b && git submodule update --init --recursive && patch -f < ../onnx-trt.patch && \
mkdir build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr -DGPU_ARCHS="60 70 75" && \
make -j16 && make install && mv -f /usr/lib/libnvonnx* /usr/lib/x86_64-linux-gnu/ && ldconfig && rm -rf /tmp/onnx-tensorrt
WORKDIR /tmp/trt-oss
ARG NV_REPO=https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64

RUN cd /tmp/trt-oss
ARG DEB=libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
RUN curl -sL --output ${DEB} ${NV_REPO}/${DEB}
ARG DEB=libnvinfer7_7.0.0-1+cuda10.2_amd64.deb
RUN curl -sL --output ${DEB} ${NV_REPO}/${DEB}
ARG DEB=libnvinfer-plugin7_7.0.0-1+cuda10.2_amd64.deb
RUN curl -sL --output ${DEB} ${NV_REPO}/${DEB}
ARG DEB=libnvonnxparsers7_7.0.0-1+cuda10.2_amd64.deb
RUN curl -sL --output ${DEB} ${NV_REPO}/${DEB}
ARG DEB=python-libnvinfer_7.0.0-1+cuda10.2_amd64.deb
RUN curl -sL --output ${DEB} ${NV_REPO}/${DEB}
RUN dpkg -i *.deb && cd ../.. && rm -rf /tmp/trt-oss

# install nemo dependencies
WORKDIR /tmp/nemo
@@ -52,6 +61,7 @@ COPY . .
FROM nemo-deps as nemo
ARG NEMO_VERSION
ARG BASE_IMAGE

# Check that NEMO_VERSION is set. Build will fail without this. Expose NEMO and base container
# version information as runtime environment variable for introspection purposes
RUN /usr/bin/test -n "$NEMO_VERSION" && \