You can help improve the Triton FIL backend in any of the following ways:
- Submitting a bug report, feature request, or documentation issue
- Proposing and implementing a new feature
- Implementing a feature or bug-fix for an outstanding issue
When submitting a bug report, please include a minimal reproducible example. Ideally, this should be a snippet of code that other developers can copy, paste, and immediately run to try to reproduce the error. Please:
- Do include import statements and any other code necessary to immediately run your example
- Avoid examples that require other developers to download models or data unless you cannot reproduce the problem with synthetically-generated data
To contribute code to this project, please follow these steps:
- Find an issue to work on or submit an issue documenting the problem you would like to work on.
- Comment on the issue saying that you plan to work on it.
- Review the implementation details section below for information to help you make your changes in a way that is consistent with the rest of the codebase.
- Code!
- Create your pull request.
- Wait for other developers to review your code and update your PR as needed.
- Once a PR is approved, it will be merged into the main branch.
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
- To sign off on a commit, simply use the `--signoff` (or `-s`) option when committing your changes:

  ```
  $ git commit -s -m "Add cool feature."
  ```

  This will append the following to your commit message:

  ```
  Signed-off-by: Your Name <[email protected]>
  ```
- Full text of the DCO:

  ```
  Developer Certificate of Origin
  Version 1.1

  Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
  1 Letterman Drive
  Suite D4700
  San Francisco, CA, 94129

  Everyone is permitted to copy and distribute verbatim copies of this
  license document, but changing it is not allowed.
  ```

  ```
  Developer's Certificate of Origin 1.1

  By making a contribution to this project, I certify that:

  (a) The contribution was created in whole or in part by me and I
      have the right to submit it under the open source license
      indicated in the file; or

  (b) The contribution is based upon previous work that, to the best
      of my knowledge, is covered under an appropriate open source
      license and I have the right under that license to submit that
      work with modifications, whether created in whole or in part
      by me, under the same open source license (unless I am
      permitted to submit under a different license), as indicated
      in the file; or

  (c) The contribution was provided directly to me by some other
      person who certified (a), (b) or (c) and I have not modified
      it.

  (d) I understand and agree that this project and the contribution
      are public and that a record of the contribution (including all
      personal information I submit with it, including my sign-off) is
      maintained indefinitely and may be redistributed consistent with
      this project or the open source license(s) involved.
  ```
This repo provides scripts for building the Triton server Docker image as well as a test image which can be used to validate the server image. These images can be built in both CPU-only and GPU-enabled modes, and it is important to ensure that both builds work correctly.
The simplest way to do this is to use the `ci/local/build.sh` script, which will build images for both modes, generate example models to use with them, and run all available tests. For more rapid iteration, however, it is sometimes useful to perform these steps separately; each step is described below in further detail.
Note that each run of `ci/local/build.sh` will generate uniquely-named log files in `qa/logs`, so it may be helpful to periodically clean out this directory. Example models are cached between runs, but developers may force regeneration of models by setting the environment variable `RETRAIN=1` when invoking the script.
To minimize build and testing time, it is often beneficial to build the backend
libraries on the host and then copy them into a testing container rather than
building entirely within Docker. To do so, invoke the following from within the `rapids_triton_dev` conda environment defined by `conda/environments/rapids_triton_dev.yml`:

```
HOST_BUILD=1 ./ci/local/build.sh
```
This option will also minimize disk consumption by intermediate build layers produced during repeated Docker builds.
By default, `ci/local/build.sh` invokes a relatively small number of examples for each test, while `ci/gitlab/build.sh` (used in CI) invokes many more. To perform more rigorous local tests, set the environment variable `TEST_PROFILE` to `ci`:

```
TEST_PROFILE=ci ./ci/local/build.sh
```
For most development workflows, it is recommended that you simply invoke `./ci/local/build.sh`. The following details are provided only if this script does not meet your specific workflow needs.
To build the server image, test image, or both, developers may invoke the `build.sh` script in the root of the repo. By default, both the server image and test image are generated, but just the server image may be generated by specifying the `server` target (`./build.sh server`). Using the `--cpu-only` flag will invoke CPU-only mode for the build. Remember that a CPU-only build cannot handle GPU-enabled models during testing.
For most development workflows, the FIL backend is built on top of an existing Triton server image, but it may be useful to test a build of the entire server image using Triton's own build infrastructure. For this purpose, `build.sh` can also be invoked with the `--buildpy` flag, which will invoke Triton's `build.py` script on the current FIL backend branch. Note that this requires that the branch be pushed to the FIL backend GitHub repo, so it is typically only invoked by internal developers and CI to confirm that there are no incompatibilities with Triton's build infrastructure. Invoking Triton's `build.py` requires additional dependencies, which may be installed using `conda/environments/buildpy.yml`.
To generate example models manually, build the test image and run the following (assuming you are using the default image name, `triton_fil_test`):

```
docker run --rm -t --gpus all \
  -e RETRAIN=1 \
  -e OWNER_ID=$(id -u) \
  -e OWNER_GID=$(id -g) \
  -v "${PWD}/qa/L0_e2e/model_repository:/qa/L0_e2e/model_repository" \
  -v "${PWD}/qa/L0_e2e/cpu_model_repository:/qa/L0_e2e/cpu_model_repository" \
  triton_fil_test \
  bash -c 'conda run -n triton_test /qa/generate_example_models.sh'
```
To run all tests for a given build, developers can simply invoke the test image with its default entrypoint as follows:

```
docker run -t --gpus all \
  -v "${PWD}/qa/logs:/qa/logs" \
  -v "${PWD}/qa/L0_e2e/model_repository:/qa/L0_e2e/model_repository" \
  -v "${PWD}/qa/L0_e2e/cpu_model_repository:/qa/L0_e2e/cpu_model_repository" \
  --rm triton_fil_test
```
It is occasionally useful to run tests on a Triton server image which has been obtained from another source, such as a container registry. To test such an image, it is recommended that developers make use of the `ci/gitlab/build.sh` script, which is used for building and testing individual configurations in CI. To run tests on a pre-built image, use the environment variable `PREBUILT_SERVER_TAG`:

```
PREBUILT_SERVER_TAG=nvcr.io/nvidia/tritonserver:22.05-py3 ./ci/gitlab/build.sh
```

This will build a test image based on this pre-built server image and run it. For CPU-only builds, the environment variable `CPU_ONLY` should also be set to `1`.
The fastest way to iterate on a build during development is to build the backend on the host and then copy the resultant libraries into the test Docker image. To make use of this build path, invoke the following from the `rapids_triton_dev` conda environment defined in `conda/environments/rapids_triton_dev.yml`:

```
HOST_BUILD=1 ./ci/local/build.sh
```

This will build the libraries on the host, build a test image based on those libraries, and then invoke the test image.
To analyze the impact of a change on performance, you may wish to run the basic benchmarking script described in the benchmarking docs.
Contributions to the FIL backend should:
- Adhere to Almost-Always-Auto style
- Prefer STL algorithms to raw loops wherever possible
- Use C++ types except where explicitly interfacing with C code (e.g. `std::size_t` as opposed to `size_t`)
- Avoid depending on transitive includes