Get embedding sizes refactor #1127

Merged
merged 11 commits into main from get-embedding-sizes-fix on Sep 21, 2021

Conversation

jperez999
Contributor

No description provided.

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #1127 of commit e2f9d5700e3e5811df6cdef789f5032b5d9aa961, no merge conflicts.
Running as SYSTEM
Setting status of e2f9d5700e3e5811df6cdef789f5032b5d9aa961 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3489/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1127/*:refs/remotes/origin/pr/1127/* # timeout=10
 > git rev-parse e2f9d5700e3e5811df6cdef789f5032b5d9aa961^{commit} # timeout=10
Checking out Revision e2f9d5700e3e5811df6cdef789f5032b5d9aa961 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e2f9d5700e3e5811df6cdef789f5032b5d9aa961 # timeout=10
Commit message: "get embedding sizes now working"
 > git rev-list --no-walk 5333ebff2ed0a69be248f36577b2257ec2255c1b # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3137795139931049970.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.4)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (58.0.4)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
Terminated
Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8067199914536865628.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #1127 of commit df397656eb01c05c19e12dabe8ff2f7b15aa3488, no merge conflicts.
Running as SYSTEM
Setting status of df397656eb01c05c19e12dabe8ff2f7b15aa3488 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3501/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1127/*:refs/remotes/origin/pr/1127/* # timeout=10
 > git rev-parse df397656eb01c05c19e12dabe8ff2f7b15aa3488^{commit} # timeout=10
Checking out Revision df397656eb01c05c19e12dabe8ff2f7b15aa3488 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f df397656eb01c05c19e12dabe8ff2f7b15aa3488 # timeout=10
Commit message: "Merge branch 'main' into get-embedding-sizes-fix"
 > git rev-list --no-walk bbf74327e67177bdb82fea187ba7aae8193b40d3 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins554783608279010990.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.4)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (58.0.4)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.h' under directory 'cpp'
warning: no files found matching '*.cu' under directory 'cpp'
warning: no files found matching '*.cuh' under directory 'cpp'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+73.gdf39765 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+73.gdf39765 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+73.gdf39765 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+73.gdf39765 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+73.gdf39765 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+73.gdf39765
Searching for protobuf==3.17.3
Best match: protobuf 3.17.3
Adding protobuf 3.17.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tensorflow-metadata==1.2.0
Best match: tensorflow-metadata 1.2.0
Processing tensorflow_metadata-1.2.0-py3.8.egg
tensorflow-metadata 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tensorflow_metadata-1.2.0-py3.8.egg
Searching for pyarrow==4.0.1
Best match: pyarrow 4.0.1
Adding pyarrow 4.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.54.0
Best match: numba 0.54.0
Processing numba-0.54.0-py3.8-linux-x86_64.egg
numba 0.54.0 is already the active version in easy-install.pth
Installing pycc script to /var/jenkins_home/.local/bin
Installing numba script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg
Searching for pandas==1.2.5
Best match: pandas 1.2.5
Processing pandas-1.2.5-py3.8-linux-x86_64.egg
pandas 1.2.5 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Processing distributed-2021.4.1-py3.8.egg
distributed 2021.4.1 is already the active version in easy-install.pth
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/distributed-2021.4.1-py3.8.egg
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for googleapis-common-protos==1.53.0
Best match: googleapis-common-protos 1.53.0
Processing googleapis_common_protos-1.53.0-py3.8.egg
googleapis-common-protos 1.53.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/googleapis_common_protos-1.53.0-py3.8.egg
Searching for absl-py==0.12.0
Best match: absl-py 0.12.0
Processing absl_py-0.12.0-py3.8.egg
absl-py 0.12.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/absl_py-0.12.0-py3.8.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==58.0.4
Best match: setuptools 58.0.4
Adding setuptools 58.0.4 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.37.0
Best match: llvmlite 0.37.0
Processing llvmlite-0.37.0-py3.8-linux-x86_64.egg
llvmlite 0.37.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/llvmlite-0.37.0-py3.8-linux-x86_64.egg
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.8.1
Best match: fsspec 2021.8.1
Processing fsspec-2021.8.1-py3.8.egg
fsspec 2021.8.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.8.1-py3.8.egg
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+73.gdf39765
Running black --check
All done! ✨ 🍰 ✨
128 files would be left unchanged.
Running flake8
Running isort
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:504:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:67:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
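The two I1101 notes above are informational; pylint's own message spells out the fix if they ever need silencing. A sketch of the suggested config entry, assuming a .pylintrc at the repo root (this log does not confirm where the repo keeps its pylint config):

# .pylintrc (hypothetical location; could equally live in setup.cfg)
[MASTER]
extension-pkg-allow-list=nvtabular_cpp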

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1526 items / 1 skipped / 1525 selected

tests/unit/test_dask_nvt.py ............................................ [ 2%]
..................................................................... [ 7%]
tests/unit/test_io.py .................................................. [ 10%]
........................................................................ [ 15%]
..........ssssssss.....................................................s [ 20%]
s [ 20%]
tests/unit/test_notebooks.py ...F.. [ 20%]
tests/unit/test_tf4rec.py . [ 20%]
tests/unit/test_tools.py ...................... [ 22%]
tests/unit/test_triton_inference.py .............................. [ 24%]
tests/unit/columns/test_column_schemas.py .............................. [ 26%]
................................................... [ 29%]
tests/unit/columns/test_column_selector.py .................... [ 30%]
tests/unit/framework_utils/test_tf_feature_columns.py . [ 30%]
tests/unit/framework_utils/test_tf_layers.py ........................... [ 32%]
................................................... [ 35%]
tests/unit/framework_utils/test_torch_layers.py . [ 35%]
tests/unit/loader/test_dataloader_backend.py .. [ 36%]
tests/unit/loader/test_tf_dataloader.py ................................ [ 38%]
........................................s.. [ 40%]
tests/unit/loader/test_torch_dataloader.py ............................. [ 42%]
...................................................FF.. [ 46%]
tests/unit/ops/test_column_similarity.py ........................ [ 48%]
tests/unit/ops/test_ops.py ............................................. [ 50%]
........................................................................ [ 55%]
........................................................................ [ 60%]
........................................................................ [ 65%]
........................................................................ [ 69%]
........................................................................ [ 74%]
............................................. [ 77%]
tests/unit/ops/test_ops_schema.py ................................FFFF.. [ 80%]
..........................................................FFFF.......... [ 84%]
..................................................FFFF.................. [ 89%]
.......................... [ 91%]
tests/unit/workflow/test_cpu_workflow.py ...... [ 91%]
tests/unit/workflow/test_workflow.py ................................... [ 93%]
.......................................................... [ 97%]
tests/unit/workflow/test_workflow_node.py ........... [ 98%]
tests/unit/workflow/test_workflow_ops.py .. [ 98%]
tests/unit/workflow/test_workflow_schemas.py ....................... [100%]

=================================== FAILURES ===================================
____________________________ test_movielens_example ____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-28/test_movielens_example0')

def test_movielens_example(tmpdir):
    _get_random_movielens_data(tmpdir, 10000, dataset="movie")
    _get_random_movielens_data(tmpdir, 10000, dataset="ratings")
    _get_random_movielens_data(tmpdir, 5000, dataset="ratings", valid=True)

    triton_model_path = os.path.join(tmpdir, "models")
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["MODEL_PATH"] = triton_model_path

    notebook_path = os.path.join(
        dirname(TEST_PATH),
        "examples/getting-started-movielens/",
        "02-ETL-with-NVTabular.ipynb",
    )
    _run_notebook(tmpdir, notebook_path)

    def _modify_tf_nb(line):
        return line.replace(
            # don't require graphviz/pydot
            "tf.keras.utils.plot_model(model)",
            "# tf.keras.utils.plot_model(model)",
        )

    def _modify_tf_triton(line):
        # models are already preloaded
        line = line.replace("triton_client.load_model", "# triton_client.load_model")
        line = line.replace("triton_client.unload_model", "# triton_client.unload_model")
        return line

    notebooks = []
    try:
        import torch  # noqa

        notebooks.append("03-Training-with-PyTorch.ipynb")
    except Exception:
        pass
    try:
        import nvtabular.inference.triton  # noqa
        import nvtabular.loader.tensorflow  # noqa

        notebooks.append("03-Training-with-TF.ipynb")
        has_tf = True

    except Exception:
        has_tf = False

    for notebook in notebooks:
        notebook_path = os.path.join(
            dirname(TEST_PATH),
            "examples/getting-started-movielens/",
            notebook,
        )
        if notebook == "03-Training-with-TF.ipynb":
            _run_notebook(tmpdir, notebook_path, transform=_modify_tf_nb)
        else:
>           _run_notebook(tmpdir, notebook_path)

tests/unit/test_notebooks.py:211:


tests/unit/test_notebooks.py:305: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/usr/lib/python3.8/subprocess.py:415: in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/usr/bin/python', '/tmp/pytest-of-jenkins/pytest-28/test_movielens_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7fa176467c10>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
>           raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/usr/bin/python', '/tmp/pytest-of-jenkins/pytest-28/test_movielens_example0/notebook.py']' returned non-zero exit status 1.

/usr/lib/python3.8/subprocess.py:516: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-28/test_movielens_example0/notebook.py", line 60, in
EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)
ValueError: too many values to unpack (expected 2)
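This failure is the behavior change under test in this PR: the notebook still unpacks two values, while get_embedding_sizes now appears to return a single mapping. A minimal sketch of the dict-style call, assuming the refactored function returns {column_name: (cardinality, embedding_dim)}; the tiny dataframe and column names below are illustrative, not taken from the notebook:

import pandas as pd
import nvtabular as nvt
from nvtabular import ops

# Illustrative frame: one single-hot and one multi-hot (list) categorical.
# (nvt.Dataset also accepts cudf frames; pandas is used here for brevity.)
df = pd.DataFrame(
    {
        "userId": [1, 2, 3, 4],
        "genres": [["a"], ["a", "b"], ["b"], ["c"]],
    }
)

workflow = nvt.Workflow(["userId", "genres"] >> ops.Categorify())
workflow.fit(nvt.Dataset(df))

# Assumption: post-refactor this returns one dict keyed by column name
# rather than the (single-hot, multi-hot) tuple the notebook unpacks.
sizes = nvt.ops.get_embedding_sizes(workflow)
print(sizes)  # e.g. {"userId": (5, 16), "genres": (4, 16)}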
____________________________ test_mh_model_support _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-28/test_mh_model_support0')

def test_mh_model_support(tmpdir):
    df = cudf.DataFrame(
        {
            "Authors": [["User_A"], ["User_A", "User_E"], ["User_B", "User_C"], ["User_C"]],
            "Reviewers": [["User_A"], ["User_A", "User_E"], ["User_B", "User_C"], ["User_C"]],
            "Engaging User": ["User_B", "User_B", "User_A", "User_D"],
            "Null_User": ["User_B", "User_B", "User_A", "User_D"],
            "Post": [1, 2, 3, 4],
            "Cont1": [0.3, 0.4, 0.5, 0.6],
            "Cont2": [0.3, 0.4, 0.5, 0.6],
            "Cat1": ["A", "B", "A", "C"],
        }
    )
    cat_names = ["Cat1", "Null_User", "Authors", "Reviewers"]  # , "Engaging User"]
    cont_names = ["Cont1", "Cont2"]
    label_name = ["Post"]
    out_path = os.path.join(tmpdir, "train/")
    os.mkdir(out_path)

    cats = cat_names >> ops.Categorify()
    conts = cont_names >> ops.Normalize()

    processor = nvt.Workflow(cats + conts + label_name)
    df_out = processor.fit_transform(nvt.Dataset(df)).to_ddf().compute()
    data_itr = torch_dataloader.TorchAsyncItr(
        nvt.Dataset(df_out),
        cats=cat_names,
        conts=cont_names,
        labels=label_name,
        batch_size=2,
    )
    emb_sizes = nvt.ops.get_embedding_sizes(processor)
    # check  for correct  embedding representation
>   assert len(emb_sizes[1].keys()) == 2  # Authors, Reviewers

E KeyError: 1

tests/unit/loader/test_torch_dataloader.py:547: KeyError
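The KeyError is the same refactor seen from the other side: emb_sizes[1] indexes the old (single-hot, multi-hot) tuple. With a single returned mapping, the multi-hot entries would be picked out by name instead; a hedged sketch of the adjusted assertion, reusing processor and the column names from the test body above:

emb_sizes = nvt.ops.get_embedding_sizes(processor)
# Select the multi-hot (list) columns by name instead of by tuple index.
mh_sizes = {name: emb_sizes[name] for name in ("Authors", "Reviewers")}
assert len(mh_sizes) == 2  # Authors, Reviewers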
____________________________ test_horovod_multigpu _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-28/test_horovod_multigpu0')

@pytest.mark.skipif(importlib.util.find_spec("horovod") is None, reason="needs horovod")
@pytest.mark.skipif(
    cupy.cuda.runtime.getDeviceCount() <= 1, reason="This unittest requires multiple gpu's to run"
)
def test_horovod_multigpu(tmpdir):

    json_sample = {
        "conts": {},
        "cats": {
            "genres": {
                "dtype": None,
                "cardinality": 50,
                "min_entry_size": 1,
                "max_entry_size": 5,
                "multi_min": 2,
                "multi_max": 4,
                "multi_avg": 3,
            },
            "movieId": {
                "dtype": None,
                "cardinality": 500,
                "min_entry_size": 1,
                "max_entry_size": 5,
            },
            "userId": {"dtype": None, "cardinality": 500, "min_entry_size": 1, "max_entry_size": 5},
        },
        "labels": {"rating": {"dtype": None, "cardinality": 2}},
    }
    cols = datagen._get_cols_from_schema(json_sample)
    df_gen = datagen.DatasetGen(datagen.UniformDistro(), gpu_frac=0.0001)

    target_path = os.path.join(tmpdir, "input/")
    os.mkdir(target_path)
    df_files = df_gen.full_df_create(10000, cols, output=target_path)

    # process them
    cat_features = ColumnSelector(["userId", "movieId", "genres"]) >> nvt.ops.Categorify()
    ratings = ColumnSelector(["rating"]) >> (lambda col: (col > 3).astype("int8"))
    output = cat_features + ratings

    proc = nvt.Workflow(output)
    train_iter = nvt.Dataset(df_files, part_size="10MB")
    proc.fit(train_iter)

    target_path_train = os.path.join(tmpdir, "train/")
    os.mkdir(target_path_train)

    proc.transform(train_iter).to_parquet(output_path=target_path_train, out_files_per_proc=5)

    # add new location
    target_path = os.path.join(tmpdir, "workflow/")
    os.mkdir(target_path)
    proc.save(target_path)

    curr_path = os.path.abspath(__file__)
    repo_root = os.path.relpath(os.path.normpath(os.path.join(curr_path, "../../../..")))
    hvd_example_path = os.path.join(repo_root, "examples/multi-gpu-movielens/torch_trainer.py")

    with subprocess.Popen(
        [
            "horovodrun",
            "-np",
            "2",
            "-H",
            "localhost:2",
            "python",
            hvd_example_path,
            "--dir_in",
            f"{tmpdir}",
            "--batch_size",
            "1024",
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    ) as process:
        process.wait()
        stdout, stderr = process.communicate()
        print(str(stdout))
        print(str(stderr))
      assert "Training complete" in str(stdout)

E assert 'Training complete' in "b''"
E + where "b''" = str(b'')

tests/unit/loader/test_torch_dataloader.py:663: AssertionError
----------------------------- Captured stdout call -----------------------------
b''
b'[1,1]:Traceback (most recent call last):\n[1,1]: File "./examples/multi-gpu-movielens/torch_trainer.py", line 47, in <module>\n[1,1]: EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)\n[1,1]:ValueError: too many values to unpack (expected 2)\n[1,0]:Traceback (most recent call last):\n[1,0]: File "./examples/multi-gpu-movielens/torch_trainer.py", line 47, in <module>\n[1,0]: EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)\n[1,0]:ValueError: too many values to unpack (expected 2)\n--------------------------------------------------------------------------\nPrimary job terminated normally, but 1 process returned\na non-zero exit code. Per user-direction, the job has been aborted.\n--------------------------------------------------------------------------\n--------------------------------------------------------------------------\nmpirun detected that one or more processes exited with non-zero status, thus causing\nthe job to be terminated. The first process to do so was:\n\n Process name: [[35548,1],0]\n Exit code: 1\n--------------------------------------------------------------------------\n'
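Both horovod ranks die on the same get_embedding_sizes unpacking that broke the movielens notebook, so the single-dict call sketched earlier would apply to torch_trainer.py as well.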
______________ test_schema_out[selection0-op8-tags0-properties0] _______________

tags = [], properties = {}, selection = ['1']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...mension': 16}} == {}
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
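This and the seven following test_schema_out failures are all for the same op (op8, HashBucket) and follow one pattern: the op now writes computed metadata ("domain", "embedding_sizes") into the output column schema on top of whatever properties the caller supplied, so strict equality with the input properties no longer holds. A subset-style check would tolerate the added keys; a sketch of one possible test adjustment, not the fix actually committed:

# Caller-supplied properties must survive, but ops may add computed
# entries of their own, such as "domain" and "embedding_sizes".
for key, value in properties.items():
    assert schema1.properties[key] == value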
______________ test_schema_out[selection0-op8-tags0-properties1] _______________

tags = [], properties = {'p1': '1'}, selection = ['1']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...6}, 'p1': '1'} == {'p1': '1'}
E Omitting 1 identical items, use -vv to show
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection0-op8-tags1-properties0] _______________

tags = ['TAG1', 'TAG2'], properties = {}, selection = ['1']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...mension': 16}} == {}
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection0-op8-tags1-properties1] _______________

tags = ['TAG1', 'TAG2'], properties = {'p1': '1'}, selection = ['1']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...6}, 'p1': '1'} == {'p1': '1'}
E Omitting 1 identical items, use -vv to show
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection1-op8-tags0-properties0] _______________

tags = [], properties = {}, selection = ['2', '3']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...mension': 16}} == {}
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection1-op8-tags0-properties1] _______________

tags = [], properties = {'p1': '1'}, selection = ['2', '3']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...6}, 'p1': '1'} == {'p1': '1'}
E Omitting 1 identical items, use -vv to show
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection1-op8-tags1-properties0] _______________

tags = ['TAG1', 'TAG2'], properties = {}, selection = ['2', '3']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...mension': 16}} == {}
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection1-op8-tags1-properties1] _______________

tags = ['TAG1', 'TAG2'], properties = {'p1': '1'}, selection = ['2', '3']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...6}, 'p1': '1'} == {'p1': '1'}
E Omitting 1 identical items, use -vv to show
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection2-op8-tags0-properties0] _______________

tags = [], properties = {}, selection = ['1', '2', '3', '4']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...mension': 16}} == {}
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection2-op8-tags0-properties1] _______________

tags = [], properties = {'p1': '1'}, selection = ['1', '2', '3', '4']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...6}, 'p1': '1'} == {'p1': '1'}
E Omitting 1 identical items, use -vv to show
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection2-op8-tags1-properties0] _______________

tags = ['TAG1', 'TAG2'], properties = {}, selection = ['1', '2', '3', '4']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...mension': 16}} == {}
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
______________ test_schema_out[selection2-op8-tags1-properties1] _______________

tags = ['TAG1', 'TAG2'], properties = {'p1': '1'}
selection = ['1', '2', '3', '4']
op = <nvtabular.ops.hash_bucket.HashBucket object at 0x7fa08c903610>

@pytest.mark.parametrize("properties", [{}, {"p1": "1"}])
@pytest.mark.parametrize("tags", [[], ["TAG1", "TAG2"]])
@pytest.mark.parametrize(
    "op",
    [
        ops.Bucketize([1]),
        ops.Rename(postfix="_trim"),
        ops.Categorify(),
        ops.Categorify(encode_type="combo"),
        ops.Clip(0),
        ops.DifferenceLag("1"),
        ops.FillMissing(),
        ops.Groupby(["1"]),
        ops.HashBucket(1),
        ops.HashedCross(1),
        ops.JoinGroupby(["1"]),
        ops.ListSlice(0),
        ops.LogOp(),
        ops.Normalize(),
        ops.TargetEncoding(["1"]),
        ops.AddMetadata(tags=["excellent"], properties={"domain": {"min": 0, "max": 20}}),
    ],
)
@pytest.mark.parametrize("selection", [["1"], ["2", "3"], ["1", "2", "3", "4"]])
def test_schema_out(tags, properties, selection, op):
    # Create columnSchemas
    column_schemas = []
    all_cols = []
    for x in range(5):
        all_cols.append(str(x))
        column_schemas.append(ColumnSchema(str(x), tags=tags, properties=properties))

    # Turn to Schema
    schema = Schema(column_schemas)

    # run schema through op
    selector = ColumnSelector(selection)
    new_schema = op.compute_output_schema(schema, selector)

    # should have dtype float
    for col_name in selector.names:
        names_group = [name for name in new_schema.column_schemas if col_name in name]
        if names_group:
            for name in names_group:
                schema1 = new_schema.column_schemas[name]

                # should not be exactly the same name, having gone through operator
                assert schema1.dtype == op.output_dtype()
                if name in selector.names:
>                   assert schema1.properties == properties

E AssertionError: assert {'domain': {'...6}, 'p1': '1'} == {'p1': '1'}
E Omitting 1 identical items, use -vv to show
E Left contains 2 more items:
E {'domain': {'max': 1, 'min': 0},
E 'embedding_sizes': {'cardinality': 1, 'dimension': 16}}
E Use -v to get the full diff

tests/unit/ops/test_ops_schema.py:57: AssertionError
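
The four test_schema_out[...-op8-...] failures above share a single cause: after this refactor, HashBucket attaches embedding metadata ('domain' and 'embedding_sizes') to the columns it produces, so the test's strict equality check against the input properties no longer holds. A minimal sketch of the new behaviour, built only from names that appear in the failing test (the nvtabular.columns import paths are an assumption for this branch):

    from nvtabular import ops
    from nvtabular.columns.schema import ColumnSchema, Schema    # assumed path
    from nvtabular.columns.selector import ColumnSelector        # assumed path

    # Reproduce the failing case: one column pushed through HashBucket(1).
    schema = Schema([ColumnSchema("1", tags=[], properties={})])
    out_schema = ops.HashBucket(1).compute_output_schema(schema, ColumnSelector(["1"]))
    props = out_schema.column_schemas["1"].properties

    # The op now layers embedding metadata over the input properties,
    # matching the diff printed in the assertion errors above.
    assert props["domain"] == {"min": 0, "max": 1}
    assert props["embedding_sizes"] == {"cardinality": 1, "dimension": 16}
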
=============================== warnings summary ===============================
tests/unit/test_dask_nvt.py: 3 warnings
tests/unit/test_io.py: 24 warnings
tests/unit/test_tf4rec.py: 2 warnings
tests/unit/test_tools.py: 2 warnings
tests/unit/test_triton_inference.py: 5 warnings
tests/unit/loader/test_tf_dataloader.py: 50 warnings
tests/unit/loader/test_torch_dataloader.py: 16 warnings
tests/unit/ops/test_column_similarity.py: 7 warnings
tests/unit/ops/test_ops.py: 74 warnings
tests/unit/workflow/test_workflow.py: 31 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
tests/unit/workflow/test_workflow_schemas.py: 1 warning
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))

tests/unit/test_io.py::test_validate_dataset_bad_schema
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:1105: UserWarning: Unable to sample column dtypes to infer nvt.Dataset schema, schema is empty.
warnings.warn(

tests/unit/test_io.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/__init__.py:38: DeprecationWarning: ColumnGroup is deprecated, use ColumnSelector instead
warnings.warn("ColumnGroup is deprecated, use ColumnSelector instead", DeprecationWarning)

tests/unit/test_io.py: 24 warnings
tests/unit/loader/test_torch_dataloader.py: 54 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/node.py:47: FutureWarning: The ["a", "b", "c"] >> ops.Operator syntax for creating a ColumnGroup has been deprecated in NVTabular 21.09 and will be removed in a future version.
warnings.warn(

tests/unit/test_io.py: 36 warnings
tests/unit/workflow/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py:89: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/workflow/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 36 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:511: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:125: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:126: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.medians[col])

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/ops/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/indexing.py:1637: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_block(indexer, value, name)

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:54: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:55: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.fill_val)

tests/unit/ops/test_ops.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:190: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[tmp] = _arange(len(df), like_df=df, dtype="int32")

tests/unit/ops/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/ops/test_ops.py::test_groupby_op[id-True]
tests/unit/ops/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))

tests/unit/workflow/test_cpu_workflow.py: 78 warnings
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 33 6 1 46% 32->36, 48-145
nvtabular/__init__.py 18 0 0 0 100%
nvtabular/columns/__init__.py 2 0 0 0 100%
nvtabular/columns/schema.py 209 17 103 20 88% 46->62, 49, 51, 53-56, 58, 98->109, 104, 147, 174, 260->267, 262, 263->265, 275, 292->297, 295->297, 308, 332, 339, 348, 351, 356->355
nvtabular/columns/selector.py 74 1 34 0 99% 121
nvtabular/dispatch.py 273 55 132 22 78% 36-40, 45-47, 53-63, 70-71, 99-101, 106-109, 113-118, 125, 144, 155, 161, 166->168, 179, 202-205, 244, 247, 253, 269, 276, 307->312, 310, 313, 316->320, 353, 364-367, 394-397, 427, 431, 472, 496, 498, 505
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 134 78 90 15 39% 30, 99, 103, 114-130, 140, 143-158, 162, 166-167, 173-198, 207-217, 220-227, 229->233, 234, 239-279, 282
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/tensorflow/tfrecords_to_parquet.py 58 58 30 0 0% 16-111
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 15 14 1 52% 50, 74-82, 85-95
nvtabular/framework_utils/torch/models.py 45 6 28 10 75% 56, 57->61, 62, 67, 87->89, 90-91, 93->96, 96->100, 103, 107->109
nvtabular/framework_utils/torch/utils.py 75 10 30 9 82% 51->53, 53->55, 64, 70, 71->76, 75, 109, 118-120, 129-131
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 385 215 180 11 44% 82-86, 141-174, 195-218, 263-307, 338, 364-372, 380-387, 406, 428-444, 485-489, 527-537, 583-623, 629-645, 649-716, 723->726, 726->722, 762-772, 781, 791, 812, 818-844, 850-876, 883, 888-894
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 176 176 98 0 0% 27-332
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/inference/triton/model_pt.py 101 101 40 0 0% 27-220
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 18 72 11 87% 111, 114, 150, 235-246, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 353 76 166 28 75% 46-47, 257, 259, 272, 281, 299-313, 436->510, 441-444, 450-457, 462-506, 510->519, 570-571, 572->576, 619, 741, 743, 745, 751, 755-757, 759, 819-820, 847, 854-855, 861, 867, 963-964, 1081-1086, 1092, 1171, 1180
nvtabular/io/dataset_engine.py 24 1 0 0 96% 48
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 551 45 180 26 89% 34-35, 57, 76, 80->92, 89, 112, 122->127, 140, 142, 166->170, 173-179, 225-233, 248, 254, 272->274, 287, 306-316, 457-462, 500-505, 621->628, 689->694, 695-696, 816, 820, 824, 830, 862, 879, 883, 890->892, 1000->exit, 1010->1015, 1020->1030, 1035, 1057, 1080-1081
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 330 15 140 13 94% 111, 128, 143-144, 242->244, 254-258, 304-305, 344->348, 345->344, 419, 423-424, 454, 534, 559, 567
nvtabular/loader/tensorflow.py 163 22 52 7 86% 58, 66-69, 84, 98, 308, 344, 359-361, 390-392, 402-410, 413-416
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/add_metadata.py 9 0 0 0 100%
nvtabular/ops/bucketize.py 37 10 18 3 69% 53-55, 59->exit, 62-65, 84-87, 94
nvtabular/ops/categorify.py 624 74 332 48 85% 245, 247, 264, 268, 276, 284, 286, 313, 332-333, 357, 366, 377->381, 385-392, 474-475, 499-504, 591, 603-605, 622, 715, 733, 769, 847-848, 863-867, 868->832, 886, 894, 901->exit, 925, 928->931, 983, 988, 1004->1008, 1015-1018, 1029, 1033, 1035, 1042, 1047-1050, 1128, 1130, 1200->1223, 1206->1223, 1224-1229, 1266, 1285->1290, 1289, 1299->1296, 1304->1296, 1311, 1314, 1322-1332
nvtabular/ops/clip.py 18 2 6 3 79% 44, 52->54, 55
nvtabular/ops/column_similarity.py 118 25 38 5 74% 19-20, 78->exit, 108, 134, 198-199, 208-210, 218-234, 251->254, 255, 265
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 31 1 8 1 95% 69->71, 94
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 91 12 36 3 82% 63-67, 93, 121, 147, 151, 162-165
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 119 3 70 4 96% 73, 84, 94->96, 106->111, 141
nvtabular/ops/hash_bucket.py 41 2 20 2 93% 72, 106->112, 118
nvtabular/ops/hashed_cross.py 36 4 15 3 86% 53, 66, 81, 91
nvtabular/ops/internal/__init__.py 3 0 0 0 100%
nvtabular/ops/internal/concat_columns.py 11 0 0 0 100%
nvtabular/ops/internal/identity.py 6 1 0 0 83% 42
nvtabular/ops/internal/subset_columns.py 13 1 0 0 92% 53
nvtabular/ops/join_external.py 89 7 36 6 90% 20-21, 113, 115, 117, 159, 176->178, 215
nvtabular/ops/join_groupby.py 101 7 36 4 92% 108, 115, 124, 131->130, 215-216, 219-220
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 66 24 26 1 58% 21-22, 53-54, 104-118, 126-137
nvtabular/ops/logop.py 13 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 81 10 14 1 86% 70, 78-79, 85, 118-119, 141-142, 146, 157
nvtabular/ops/operator.py 66 3 14 1 95% 111, 189, 196
nvtabular/ops/rename.py 41 3 22 3 90% 47, 88-90
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 153 11 66 4 91% 167->171, 175->184, 232-233, 236-237, 249-255, 346->349, 362
nvtabular/tags.py 16 0 0 0 100%
nvtabular/tools/init.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 321
nvtabular/tools/dataset_inspector.py 50 7 18 1 79% 32-39
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow/__init__.py 2 0 0 0 100%
nvtabular/workflow/node.py 240 18 116 10 89% 55, 93->98, 146, 248->252, 288, 302, 311, 329-334, 339, 388-389, 400->395, 453-458
nvtabular/workflow/workflow.py 221 15 112 7 93% 28-29, 47, 139, 195, 222-224, 332, 347-348, 366-367, 502, 514

TOTAL 7521 1547 3025 353 77%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 76.78%
=========================== short test summary info ============================
SKIPPED [1] ../../../../../usr/local/lib/python3.8/dist-packages/dask_cudf/io/tests/test_s3.py:16: could not import 's3fs': No module named 's3fs'
SKIPPED [8] tests/unit/test_io.py:555: could not import 'uavro': No module named 'uavro'
SKIPPED [2] tests/unit/test_io.py:914: Dask>=2021.07.1 required for file aggregation
SKIPPED [1] tests/unit/loader/test_tf_dataloader.py:521: not working correctly in ci environment
==== 15 failed, 1500 passed, 12 skipped, 794 warnings in 2067.71s (0:34:27) ====
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins3787678591802848262.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #1127 of commit cc1fe6709c19c9ccca1df772fba08dde2972ab63, no merge conflicts.
Running as SYSTEM
Setting status of cc1fe6709c19c9ccca1df772fba08dde2972ab63 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3503/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1127/*:refs/remotes/origin/pr/1127/* # timeout=10
 > git rev-parse cc1fe6709c19c9ccca1df772fba08dde2972ab63^{commit} # timeout=10
Checking out Revision cc1fe6709c19c9ccca1df772fba08dde2972ab63 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f cc1fe6709c19c9ccca1df772fba08dde2972ab63 # timeout=10
Commit message: "Merge branch 'get-embedding-sizes-fix' of https://github.com/jperez999/NVTabular into get-embedding-sizes-fix"
 > git rev-list --no-walk d1dd81f4e577dede3376d36c3bcea9de2919a943 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7330693510522698979.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.4)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (58.0.4)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.h' under directory 'cpp'
warning: no files found matching '*.cu' under directory 'cpp'
warning: no files found matching '*.cuh' under directory 'cpp'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+75.gcc1fe67 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+75.gcc1fe67 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+75.gcc1fe67 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+75.gcc1fe67 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+75.gcc1fe67 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+75.gcc1fe67
Searching for protobuf==3.17.3
Best match: protobuf 3.17.3
Adding protobuf 3.17.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tensorflow-metadata==1.2.0
Best match: tensorflow-metadata 1.2.0
Processing tensorflow_metadata-1.2.0-py3.8.egg
tensorflow-metadata 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tensorflow_metadata-1.2.0-py3.8.egg
Searching for pyarrow==4.0.1
Best match: pyarrow 4.0.1
Adding pyarrow 4.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.54.0
Best match: numba 0.54.0
Processing numba-0.54.0-py3.8-linux-x86_64.egg
numba 0.54.0 is already the active version in easy-install.pth
Installing pycc script to /var/jenkins_home/.local/bin
Installing numba script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg
Searching for pandas==1.2.5
Best match: pandas 1.2.5
Processing pandas-1.2.5-py3.8-linux-x86_64.egg
pandas 1.2.5 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Processing distributed-2021.4.1-py3.8.egg
distributed 2021.4.1 is already the active version in easy-install.pth
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/distributed-2021.4.1-py3.8.egg
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for googleapis-common-protos==1.53.0
Best match: googleapis-common-protos 1.53.0
Processing googleapis_common_protos-1.53.0-py3.8.egg
googleapis-common-protos 1.53.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/googleapis_common_protos-1.53.0-py3.8.egg
Searching for absl-py==0.12.0
Best match: absl-py 0.12.0
Processing absl_py-0.12.0-py3.8.egg
absl-py 0.12.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/absl_py-0.12.0-py3.8.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==58.0.4
Best match: setuptools 58.0.4
Adding setuptools 58.0.4 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.37.0
Best match: llvmlite 0.37.0
Processing llvmlite-0.37.0-py3.8-linux-x86_64.egg
llvmlite 0.37.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/llvmlite-0.37.0-py3.8-linux-x86_64.egg
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.8.1
Best match: fsspec 2021.8.1
Processing fsspec-2021.8.1-py3.8.egg
fsspec 2021.8.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.8.1-py3.8.egg
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+75.gcc1fe67
Running black --check
All done! ✨ 🍰 ✨
128 files would be left unchanged.
Running flake8
Running isort
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:504:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:67:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1526 items / 1 skipped / 1525 selected

tests/unit/test_dask_nvt.py ............................................ [ 2%]
..................................................................... [ 7%]
tests/unit/test_io.py .................................................. [ 10%]
........................................................................ [ 15%]
..........ssssssss.....................................................s [ 20%]
s [ 20%]
tests/unit/test_notebooks.py ...F.. [ 20%]
tests/unit/test_tf4rec.py . [ 20%]
tests/unit/test_tools.py ...................... [ 22%]
tests/unit/test_triton_inference.py .............................. [ 24%]
tests/unit/columns/test_column_schemas.py .............................. [ 26%]
................................................... [ 29%]
tests/unit/columns/test_column_selector.py .................... [ 30%]
tests/unit/framework_utils/test_tf_feature_columns.py . [ 30%]
tests/unit/framework_utils/test_tf_layers.py ........................... [ 32%]
................................................... [ 35%]
tests/unit/framework_utils/test_torch_layers.py . [ 35%]
tests/unit/loader/test_dataloader_backend.py .. [ 36%]
tests/unit/loader/test_tf_dataloader.py ................................ [ 38%]
........................................s.. [ 40%]
tests/unit/loader/test_torch_dataloader.py ............................. [ 42%]
...................................................FF.. [ 46%]
tests/unit/ops/test_column_similarity.py ........................ [ 48%]
tests/unit/ops/test_ops.py ............................................. [ 50%]
........................................................................ [ 55%]
........................................................................ [ 60%]
........................................................................ [ 65%]
........................................................................ [ 69%]
........................................................................ [ 74%]
............................................. [ 77%]
tests/unit/ops/test_ops_schema.py ...................................... [ 80%]
........................................................................ [ 84%]
........................................................................ [ 89%]
.......................... [ 91%]
tests/unit/workflow/test_cpu_workflow.py ...... [ 91%]
tests/unit/workflow/test_workflow.py ................................... [ 93%]
.......................................................... [ 97%]
tests/unit/workflow/test_workflow_node.py ........... [ 98%]
tests/unit/workflow/test_workflow_ops.py .. [ 98%]
tests/unit/workflow/test_workflow_schemas.py ....................... [100%]

=================================== FAILURES ===================================
____________________________ test_movielens_example ____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-30/test_movielens_example0')

def test_movielens_example(tmpdir):
    _get_random_movielens_data(tmpdir, 10000, dataset="movie")
    _get_random_movielens_data(tmpdir, 10000, dataset="ratings")
    _get_random_movielens_data(tmpdir, 5000, dataset="ratings", valid=True)

    triton_model_path = os.path.join(tmpdir, "models")
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["MODEL_PATH"] = triton_model_path

    notebook_path = os.path.join(
        dirname(TEST_PATH),
        "examples/getting-started-movielens/",
        "02-ETL-with-NVTabular.ipynb",
    )
    _run_notebook(tmpdir, notebook_path)

    def _modify_tf_nb(line):
        return line.replace(
            # don't require graphviz/pydot
            "tf.keras.utils.plot_model(model)",
            "# tf.keras.utils.plot_model(model)",
        )

    def _modify_tf_triton(line):
        # models are already preloaded
        line = line.replace("triton_client.load_model", "# triton_client.load_model")
        line = line.replace("triton_client.unload_model", "# triton_client.unload_model")
        return line

    notebooks = []
    try:
        import torch  # noqa

        notebooks.append("03-Training-with-PyTorch.ipynb")
    except Exception:
        pass
    try:
        import nvtabular.inference.triton  # noqa
        import nvtabular.loader.tensorflow  # noqa

        notebooks.append("03-Training-with-TF.ipynb")
        has_tf = True

    except Exception:
        has_tf = False

    for notebook in notebooks:
        notebook_path = os.path.join(
            dirname(TEST_PATH),
            "examples/getting-started-movielens/",
            notebook,
        )
        if notebook == "03-Training-with-TF.ipynb":
            _run_notebook(tmpdir, notebook_path, transform=_modify_tf_nb)
        else:
>           _run_notebook(tmpdir, notebook_path)

tests/unit/test_notebooks.py:211:


tests/unit/test_notebooks.py:305: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/usr/lib/python3.8/subprocess.py:415: in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/usr/bin/python', '/tmp/pytest-of-jenkins/pytest-30/test_movielens_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f3346fa2d90>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
>           raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/usr/bin/python', '/tmp/pytest-of-jenkins/pytest-30/test_movielens_example0/notebook.py']' returned non-zero exit status 1.

/usr/lib/python3.8/subprocess.py:516: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-30/test_movielens_example0/notebook.py", line 60, in <module>
EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)
ValueError: too many values to unpack (expected 2)
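
This traceback (and the horovod failure further down) points at the interface change in this PR: get_embedding_sizes no longer returns a (single-hot, multi-hot) pair of tables here, so unpacking its result into two names raises ValueError. A hedged sketch of how the notebook call site could adapt, assuming the refactored function returns a single dict keyed by column name:

    # Old call site, which now raises "too many values to unpack":
    # EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)

    # Assumed new shape: one dict of {column_name: (cardinality, embedding_dim)}.
    EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)
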
____________________________ test_mh_model_support _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-30/test_mh_model_support0')

def test_mh_model_support(tmpdir):
    df = cudf.DataFrame(
        {
            "Authors": [["User_A"], ["User_A", "User_E"], ["User_B", "User_C"], ["User_C"]],
            "Reviewers": [["User_A"], ["User_A", "User_E"], ["User_B", "User_C"], ["User_C"]],
            "Engaging User": ["User_B", "User_B", "User_A", "User_D"],
            "Null_User": ["User_B", "User_B", "User_A", "User_D"],
            "Post": [1, 2, 3, 4],
            "Cont1": [0.3, 0.4, 0.5, 0.6],
            "Cont2": [0.3, 0.4, 0.5, 0.6],
            "Cat1": ["A", "B", "A", "C"],
        }
    )
    cat_names = ["Cat1", "Null_User", "Authors", "Reviewers"]  # , "Engaging User"]
    cont_names = ["Cont1", "Cont2"]
    label_name = ["Post"]
    out_path = os.path.join(tmpdir, "train/")
    os.mkdir(out_path)

    cats = cat_names >> ops.Categorify()
    conts = cont_names >> ops.Normalize()

    processor = nvt.Workflow(cats + conts + label_name)
    df_out = processor.fit_transform(nvt.Dataset(df)).to_ddf().compute()
    data_itr = torch_dataloader.TorchAsyncItr(
        nvt.Dataset(df_out),
        cats=cat_names,
        conts=cont_names,
        labels=label_name,
        batch_size=2,
    )
    emb_sizes = nvt.ops.get_embedding_sizes(processor)
    # check  for correct  embedding representation
>   assert len(emb_sizes[1].keys()) == 2  # Authors, Reviewers

E KeyError: 1

tests/unit/loader/test_torch_dataloader.py:547: KeyError
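
This KeyError is the same change seen from the caller's side: emb_sizes is indexed with the integer 1, which only works when get_embedding_sizes returns a tuple. If the refactored function instead returns a flat dict (an assumption consistent with the ValueError above), the multi-hot columns would have to be selected by name:

    emb_sizes = nvt.ops.get_embedding_sizes(processor)
    # Pick out the multi-hot list columns by name rather than via emb_sizes[1].
    mh_sizes = {name: emb_sizes[name] for name in ["Authors", "Reviewers"]}
    assert len(mh_sizes) == 2
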
____________________________ test_horovod_multigpu _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-30/test_horovod_multigpu0')

@pytest.mark.skipif(importlib.util.find_spec("horovod") is None, reason="needs horovod")
@pytest.mark.skipif(
    cupy.cuda.runtime.getDeviceCount() <= 1, reason="This unittest requires multiple gpu's to run"
)
def test_horovod_multigpu(tmpdir):

    json_sample = {
        "conts": {},
        "cats": {
            "genres": {
                "dtype": None,
                "cardinality": 50,
                "min_entry_size": 1,
                "max_entry_size": 5,
                "multi_min": 2,
                "multi_max": 4,
                "multi_avg": 3,
            },
            "movieId": {
                "dtype": None,
                "cardinality": 500,
                "min_entry_size": 1,
                "max_entry_size": 5,
            },
            "userId": {"dtype": None, "cardinality": 500, "min_entry_size": 1, "max_entry_size": 5},
        },
        "labels": {"rating": {"dtype": None, "cardinality": 2}},
    }
    cols = datagen._get_cols_from_schema(json_sample)
    df_gen = datagen.DatasetGen(datagen.UniformDistro(), gpu_frac=0.0001)

    target_path = os.path.join(tmpdir, "input/")
    os.mkdir(target_path)
    df_files = df_gen.full_df_create(10000, cols, output=target_path)

    # process them
    cat_features = ColumnSelector(["userId", "movieId", "genres"]) >> nvt.ops.Categorify()
    ratings = ColumnSelector(["rating"]) >> (lambda col: (col > 3).astype("int8"))
    output = cat_features + ratings

    proc = nvt.Workflow(output)
    train_iter = nvt.Dataset(df_files, part_size="10MB")
    proc.fit(train_iter)

    target_path_train = os.path.join(tmpdir, "train/")
    os.mkdir(target_path_train)

    proc.transform(train_iter).to_parquet(output_path=target_path_train, out_files_per_proc=5)

    # add new location
    target_path = os.path.join(tmpdir, "workflow/")
    os.mkdir(target_path)
    proc.save(target_path)

    curr_path = os.path.abspath(__file__)
    repo_root = os.path.relpath(os.path.normpath(os.path.join(curr_path, "../../../..")))
    hvd_example_path = os.path.join(repo_root, "examples/multi-gpu-movielens/torch_trainer.py")

    with subprocess.Popen(
        [
            "horovodrun",
            "-np",
            "2",
            "-H",
            "localhost:2",
            "python",
            hvd_example_path,
            "--dir_in",
            f"{tmpdir}",
            "--batch_size",
            "1024",
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    ) as process:
        process.wait()
        stdout, stderr = process.communicate()
        print(str(stdout))
        print(str(stderr))
>       assert "Training complete" in str(stdout)

E assert 'Training complete' in "b''"
E + where "b''" = str(b'')

tests/unit/loader/test_torch_dataloader.py:663: AssertionError
----------------------------- Captured stdout call -----------------------------
b''
b'[1,0]:Traceback (most recent call last):\n[1,0]: File "./examples/multi-gpu-movielens/torch_trainer.py", line 47, in <module>\n[1,0]: EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)\n[1,0]:ValueError: too many values to unpack (expected 2)\n[1,1]:Traceback (most recent call last):\n[1,1]: File "./examples/multi-gpu-movielens/torch_trainer.py", line 47, in <module>\n[1,1]: EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)\n[1,1]:ValueError: too many values to unpack (expected 2)\n--------------------------------------------------------------------------\nPrimary job terminated normally, but 1 process returned\na non-zero exit code. Per user-direction, the job has been aborted.\n--------------------------------------------------------------------------\n--------------------------------------------------------------------------\nmpirun detected that one or more processes exited with non-zero status, thus causing\nthe job to be terminated. The first process to do so was:\n\n Process name: [[59441,1],1]\n Exit code: 1\n--------------------------------------------------------------------------\n'
=============================== warnings summary ===============================
tests/unit/test_dask_nvt.py: 3 warnings
tests/unit/test_io.py: 24 warnings
tests/unit/test_tf4rec.py: 2 warnings
tests/unit/test_tools.py: 2 warnings
tests/unit/test_triton_inference.py: 5 warnings
tests/unit/loader/test_tf_dataloader.py: 50 warnings
tests/unit/loader/test_torch_dataloader.py: 16 warnings
tests/unit/ops/test_column_similarity.py: 7 warnings
tests/unit/ops/test_ops.py: 74 warnings
tests/unit/workflow/test_workflow.py: 31 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
tests/unit/workflow/test_workflow_schemas.py: 1 warning
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))

tests/unit/test_io.py::test_validate_dataset_bad_schema
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:1105: UserWarning: Unable to sample column dtypes to infer nvt.Dataset schema, schema is empty.
warnings.warn(

tests/unit/test_io.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/__init__.py:38: DeprecationWarning: ColumnGroup is deprecated, use ColumnSelector instead
warnings.warn("ColumnGroup is deprecated, use ColumnSelector instead", DeprecationWarning)

tests/unit/test_io.py: 24 warnings
tests/unit/loader/test_torch_dataloader.py: 54 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/node.py:47: FutureWarning: The ["a", "b", "c"] >> ops.Operator syntax for creating a ColumnGroup has been deprecated in NVTabular 21.09 and will be removed in a future version.
warnings.warn(

tests/unit/test_io.py: 36 warnings
tests/unit/workflow/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py:89: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/workflow/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 36 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:511: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:125: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:126: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.medians[col])

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/ops/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/indexing.py:1637: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_block(indexer, value, name)

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:54: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:55: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.fill_val)

tests/unit/ops/test_ops.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:190: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[tmp] = _arange(len(df), like_df=df, dtype="int32")

tests/unit/ops/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/ops/test_ops.py::test_groupby_op[id-True]
tests/unit/ops/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))
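Here dask warns that head() only inspected the first partition, which the preceding filter left empty. A minimal sketch of the workaround the message names, on a toy dask dataframe:

import pandas as pd
import dask.dataframe as dd

ddf = dd.from_pandas(pd.DataFrame({"x": [1, 2, 3]}), npartitions=3)
filtered = ddf[ddf.x > 2]  # the first partition may come back empty
row = filtered.head(1, npartitions=-1)  # search every partition, not just the first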

tests/unit/workflow/test_cpu_workflow.py: 78 warnings
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 33 6 1 46% 32->36, 48-145
nvtabular/__init__.py 18 0 0 0 100%
nvtabular/columns/__init__.py 2 0 0 0 100%
nvtabular/columns/schema.py 209 17 103 20 88% 46->62, 49, 51, 53-56, 58, 98->109, 104, 147, 174, 260->267, 262, 263->265, 275, 292->297, 295->297, 308, 332, 339, 348, 351, 356->355
nvtabular/columns/selector.py 74 1 34 0 99% 121
nvtabular/dispatch.py 273 55 132 22 78% 36-40, 45-47, 53-63, 70-71, 99-101, 106-109, 113-118, 125, 144, 155, 161, 166->168, 179, 202-205, 244, 247, 253, 269, 276, 307->312, 310, 313, 316->320, 353, 364-367, 394-397, 427, 431, 472, 496, 498, 505
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 134 78 90 15 39% 30, 99, 103, 114-130, 140, 143-158, 162, 166-167, 173-198, 207-217, 220-227, 229->233, 234, 239-279, 282
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/tensorflow/tfrecords_to_parquet.py 58 58 30 0 0% 16-111
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 15 14 1 52% 50, 74-82, 85-95
nvtabular/framework_utils/torch/models.py 45 6 28 10 75% 56, 57->61, 62, 67, 87->89, 90-91, 93->96, 96->100, 103, 107->109
nvtabular/framework_utils/torch/utils.py 75 10 30 9 82% 51->53, 53->55, 64, 70, 71->76, 75, 109, 118-120, 129-131
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 385 215 180 11 44% 82-86, 141-174, 195-218, 263-307, 338, 364-372, 380-387, 406, 428-444, 485-489, 527-537, 583-623, 629-645, 649-716, 723->726, 726->722, 762-772, 781, 791, 812, 818-844, 850-876, 883, 888-894
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 176 176 98 0 0% 27-332
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/inference/triton/model_pt.py 101 101 40 0 0% 27-220
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 18 72 11 87% 111, 114, 150, 235-246, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 353 76 166 28 75% 46-47, 257, 259, 272, 281, 299-313, 436->510, 441-444, 450-457, 462-506, 510->519, 570-571, 572->576, 619, 741, 743, 745, 751, 755-757, 759, 819-820, 847, 854-855, 861, 867, 963-964, 1081-1086, 1092, 1171, 1180
nvtabular/io/dataset_engine.py 24 1 0 0 96% 48
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 551 45 180 26 89% 34-35, 57, 76, 80->92, 89, 112, 122->127, 140, 142, 166->170, 173-179, 225-233, 248, 254, 272->274, 287, 306-316, 457-462, 500-505, 621->628, 689->694, 695-696, 816, 820, 824, 830, 862, 879, 883, 890->892, 1000->exit, 1010->1015, 1020->1030, 1035, 1057, 1080-1081
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 330 15 140 13 94% 111, 128, 143-144, 242->244, 254-258, 304-305, 344->348, 345->344, 419, 423-424, 454, 534, 559, 567
nvtabular/loader/tensorflow.py 163 22 52 7 86% 58, 66-69, 84, 98, 308, 344, 359-361, 390-392, 402-410, 413-416
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/add_metadata.py 9 0 0 0 100%
nvtabular/ops/bucketize.py 37 10 18 3 69% 53-55, 59->exit, 62-65, 84-87, 94
nvtabular/ops/categorify.py 624 74 332 48 85% 245, 247, 264, 268, 276, 284, 286, 313, 332-333, 357, 366, 377->381, 385-392, 474-475, 499-504, 591, 603-605, 622, 715, 733, 769, 847-848, 863-867, 868->832, 886, 894, 901->exit, 925, 928->931, 983, 988, 1004->1008, 1015-1018, 1029, 1033, 1035, 1042, 1047-1050, 1128, 1130, 1200->1223, 1206->1223, 1224-1229, 1266, 1285->1290, 1289, 1299->1296, 1304->1296, 1311, 1314, 1322-1332
nvtabular/ops/clip.py 18 2 6 3 79% 44, 52->54, 55
nvtabular/ops/column_similarity.py 118 25 38 5 74% 19-20, 78->exit, 108, 134, 198-199, 208-210, 218-234, 251->254, 255, 265
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 31 1 8 1 95% 69->71, 94
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 91 12 36 3 82% 63-67, 93, 121, 147, 151, 162-165
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 119 3 70 4 96% 73, 84, 94->96, 106->111, 141
nvtabular/ops/hash_bucket.py 41 2 20 2 93% 72, 106->112, 118
nvtabular/ops/hashed_cross.py 36 4 15 3 86% 53, 66, 81, 91
nvtabular/ops/internal/__init__.py 3 0 0 0 100%
nvtabular/ops/internal/concat_columns.py 11 0 0 0 100%
nvtabular/ops/internal/identity.py 6 1 0 0 83% 42
nvtabular/ops/internal/subset_columns.py 13 1 0 0 92% 53
nvtabular/ops/join_external.py 89 7 36 6 90% 20-21, 113, 115, 117, 159, 176->178, 215
nvtabular/ops/join_groupby.py 101 7 36 4 92% 108, 115, 124, 131->130, 215-216, 219-220
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 66 24 26 1 58% 21-22, 53-54, 104-118, 126-137
nvtabular/ops/logop.py 13 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 81 10 14 1 86% 70, 78-79, 85, 118-119, 141-142, 146, 157
nvtabular/ops/operator.py 66 3 14 1 95% 111, 189, 196
nvtabular/ops/rename.py 41 3 22 3 90% 47, 88-90
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 153 11 66 4 91% 167->171, 175->184, 232-233, 236-237, 249-255, 346->349, 362
nvtabular/tags.py 16 0 0 0 100%
nvtabular/tools/__init__.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 321
nvtabular/tools/dataset_inspector.py 50 7 18 1 79% 32-39
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow/__init__.py 2 0 0 0 100%
nvtabular/workflow/node.py 240 18 116 10 89% 55, 93->98, 146, 248->252, 288, 302, 311, 329-334, 339, 388-389, 400->395, 453-458
nvtabular/workflow/workflow.py 221 15 112 7 93% 28-29, 47, 139, 195, 222-224, 332, 347-348, 366-367, 502, 514

TOTAL 7521 1547 3025 353 77%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 76.78%
=========================== short test summary info ============================
SKIPPED [1] ../../../../../usr/local/lib/python3.8/dist-packages/dask_cudf/io/tests/test_s3.py:16: could not import 's3fs': No module named 's3fs'
SKIPPED [8] tests/unit/test_io.py:555: could not import 'uavro': No module named 'uavro'
SKIPPED [2] tests/unit/test_io.py:914: Dask>=2021.07.1 required for file aggregation
SKIPPED [1] tests/unit/loader/test_tf_dataloader.py:521: not working correctly in ci environment
==== 3 failed, 1512 passed, 12 skipped, 794 warnings in 2119.99s (0:35:19) =====
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins9150767269801821051.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #1127 of commit c76f67b8049d053658ab327c8969199735341105, no merge conflicts.
Running as SYSTEM
Setting status of c76f67b8049d053658ab327c8969199735341105 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3509/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1127/*:refs/remotes/origin/pr/1127/* # timeout=10
 > git rev-parse c76f67b8049d053658ab327c8969199735341105^{commit} # timeout=10
Checking out Revision c76f67b8049d053658ab327c8969199735341105 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c76f67b8049d053658ab327c8969199735341105 # timeout=10
Commit message: "Merge branch 'main' into get-embedding-sizes-fix"
 > git rev-list --no-walk 015c9d1b59ba1d6ff668b3d2161937ccfd960f77 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1510096760308657384.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.4)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (58.0.4)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.h' under directory 'cpp'
warning: no files found matching '*.cu' under directory 'cpp'
warning: no files found matching '*.cuh' under directory 'cpp'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+78.gc76f67b -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+78.gc76f67b -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+78.gc76f67b -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+78.gc76f67b -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+78.gc76f67b is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+78.gc76f67b
Searching for protobuf==3.17.3
Best match: protobuf 3.17.3
Adding protobuf 3.17.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tensorflow-metadata==1.2.0
Best match: tensorflow-metadata 1.2.0
Processing tensorflow_metadata-1.2.0-py3.8.egg
tensorflow-metadata 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tensorflow_metadata-1.2.0-py3.8.egg
Searching for pyarrow==4.0.1
Best match: pyarrow 4.0.1
Adding pyarrow 4.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.54.0
Best match: numba 0.54.0
Processing numba-0.54.0-py3.8-linux-x86_64.egg
numba 0.54.0 is already the active version in easy-install.pth
Installing pycc script to /var/jenkins_home/.local/bin
Installing numba script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg
Searching for pandas==1.2.5
Best match: pandas 1.2.5
Processing pandas-1.2.5-py3.8-linux-x86_64.egg
pandas 1.2.5 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Processing distributed-2021.4.1-py3.8.egg
distributed 2021.4.1 is already the active version in easy-install.pth
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/distributed-2021.4.1-py3.8.egg
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for googleapis-common-protos==1.53.0
Best match: googleapis-common-protos 1.53.0
Processing googleapis_common_protos-1.53.0-py3.8.egg
googleapis-common-protos 1.53.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/googleapis_common_protos-1.53.0-py3.8.egg
Searching for absl-py==0.12.0
Best match: absl-py 0.12.0
Processing absl_py-0.12.0-py3.8.egg
absl-py 0.12.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/absl_py-0.12.0-py3.8.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==58.0.4
Best match: setuptools 58.0.4
Adding setuptools 58.0.4 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.37.0
Best match: llvmlite 0.37.0
Processing llvmlite-0.37.0-py3.8-linux-x86_64.egg
llvmlite 0.37.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/llvmlite-0.37.0-py3.8-linux-x86_64.egg
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.8.1
Best match: fsspec 2021.8.1
Processing fsspec-2021.8.1-py3.8.egg
fsspec 2021.8.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.8.1-py3.8.egg
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+78.gc76f67b
Running black --check
All done! ✨ 🍰 ✨
128 files would be left unchanged.
Running flake8
Running isort
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:504:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:67:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1530 items / 1 skipped / 1529 selected

tests/unit/test_dask_nvt.py ............................................ [ 2%]
..................................................................... [ 7%]
tests/unit/test_io.py .................................................. [ 10%]
........................................................................ [ 15%]
..........ssssssss.....................................................s [ 20%]
s [ 20%]
tests/unit/test_notebooks.py ...F.. [ 20%]
tests/unit/test_tf4rec.py . [ 20%]
tests/unit/test_tools.py ...................... [ 22%]
tests/unit/test_triton_inference.py .............................. [ 23%]
tests/unit/columns/test_column_schemas.py .............................. [ 25%]
................................................... [ 29%]
tests/unit/columns/test_column_selector.py .................... [ 30%]
tests/unit/framework_utils/test_tf_feature_columns.py . [ 30%]
tests/unit/framework_utils/test_tf_layers.py ........................... [ 32%]
................................................... [ 35%]
tests/unit/framework_utils/test_torch_layers.py . [ 35%]
tests/unit/loader/test_dataloader_backend.py .. [ 35%]
tests/unit/loader/test_tf_dataloader.py ................................ [ 38%]
........................................s.. [ 40%]
tests/unit/loader/test_torch_dataloader.py ............................. [ 42%]
....................................................... [ 46%]
tests/unit/ops/test_column_similarity.py ........................ [ 47%]
tests/unit/ops/test_ops.py ............................................. [ 50%]
........................................................................ [ 55%]
........................................................................ [ 60%]
........................................................................ [ 64%]
........................................................................ [ 69%]
........................................................................ [ 74%]
................................................. [ 77%]
tests/unit/ops/test_ops_schema.py ...................................... [ 80%]
........................................................................ [ 84%]
........................................................................ [ 89%]
.......................... [ 91%]
tests/unit/workflow/test_cpu_workflow.py ...... [ 91%]
tests/unit/workflow/test_workflow.py ................................... [ 93%]
.......................................................... [ 97%]
tests/unit/workflow/test_workflow_node.py ........... [ 98%]
tests/unit/workflow/test_workflow_ops.py .. [ 98%]
tests/unit/workflow/test_workflow_schemas.py ....................... [100%]

=================================== FAILURES ===================================
____________________________ test_movielens_example ____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-36/test_movielens_example0')

def test_movielens_example(tmpdir):
    _get_random_movielens_data(tmpdir, 10000, dataset="movie")
    _get_random_movielens_data(tmpdir, 10000, dataset="ratings")
    _get_random_movielens_data(tmpdir, 5000, dataset="ratings", valid=True)

    triton_model_path = os.path.join(tmpdir, "models")
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["MODEL_PATH"] = triton_model_path

    notebook_path = os.path.join(
        dirname(TEST_PATH),
        "examples/getting-started-movielens/",
        "02-ETL-with-NVTabular.ipynb",
    )
    _run_notebook(tmpdir, notebook_path)

    def _modify_tf_nb(line):
        return line.replace(
            # don't require graphviz/pydot
            "tf.keras.utils.plot_model(model)",
            "# tf.keras.utils.plot_model(model)",
        )

    def _modify_tf_triton(line):
        # models are already preloaded
        line = line.replace("triton_client.load_model", "# triton_client.load_model")
        line = line.replace("triton_client.unload_model", "# triton_client.unload_model")
        return line

    notebooks = []
    try:
        import torch  # noqa

        notebooks.append("03-Training-with-PyTorch.ipynb")
    except Exception:
        pass
    try:
        import nvtabular.inference.triton  # noqa
        import nvtabular.loader.tensorflow  # noqa

        notebooks.append("03-Training-with-TF.ipynb")
        has_tf = True

    except Exception:
        has_tf = False

    for notebook in notebooks:
        notebook_path = os.path.join(
            dirname(TEST_PATH),
            "examples/getting-started-movielens/",
            notebook,
        )
        if notebook == "03-Training-with-TF.ipynb":
            _run_notebook(tmpdir, notebook_path, transform=_modify_tf_nb)
        else:
          _run_notebook(tmpdir, notebook_path)

tests/unit/test_notebooks.py:211:


tests/unit/test_notebooks.py:305: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/usr/lib/python3.8/subprocess.py:415: in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/usr/bin/python', '/tmp/pytest-of-jenkins/pytest-36/test_movielens_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f65e6a84430>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
          raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/usr/bin/python', '/tmp/pytest-of-jenkins/pytest-36/test_movielens_example0/notebook.py']' returned non-zero exit status 1.

/usr/lib/python3.8/subprocess.py:516: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-36/test_movielens_example0/notebook.py", line 60, in
EMBEDDING_TABLE_SHAPES, MH_EMBEDDING_TABLE_SHAPES = nvt.ops.get_embedding_sizes(proc)
ValueError: too many values to unpack (expected 2)
=============================== warnings summary ===============================
tests/unit/test_dask_nvt.py: 3 warnings
tests/unit/test_io.py: 24 warnings
tests/unit/test_tf4rec.py: 2 warnings
tests/unit/test_tools.py: 2 warnings
tests/unit/test_triton_inference.py: 5 warnings
tests/unit/loader/test_tf_dataloader.py: 50 warnings
tests/unit/loader/test_torch_dataloader.py: 16 warnings
tests/unit/ops/test_column_similarity.py: 7 warnings
tests/unit/ops/test_ops.py: 74 warnings
tests/unit/workflow/test_workflow.py: 31 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
tests/unit/workflow/test_workflow_schemas.py: 1 warning
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))

tests/unit/test_io.py::test_validate_dataset_bad_schema
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:1118: UserWarning: Unable to sample column dtypes to infer nvt.Dataset schema, schema is empty.
warnings.warn(

tests/unit/test_io.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/__init__.py:38: DeprecationWarning: ColumnGroup is deprecated, use ColumnSelector instead
warnings.warn("ColumnGroup is deprecated, use ColumnSelector instead", DeprecationWarning)

tests/unit/test_io.py: 24 warnings
tests/unit/loader/test_torch_dataloader.py: 54 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/node.py:47: FutureWarning: The ["a", "b", "c"] >> ops.Operator syntax for creating a ColumnGroup has been deprecated in NVTabular 21.09 and will be removed in a future version.
warnings.warn(

tests/unit/test_io.py: 36 warnings
tests/unit/workflow/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py:89: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/workflow/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 36 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:512: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/ops/test_column_similarity.py: 12 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/column_similarity.py:109: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[name] = similarities

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:125: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:126: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.medians[col])

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/ops/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/indexing.py:1637: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_block(indexer, value, name)

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:54: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:55: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.fill_val)

tests/unit/ops/test_ops.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:190: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[tmp] = _arange(len(df), like_df=df, dtype="int32")

tests/unit/ops/test_ops.py::test_join_external[True-True-left-host-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-left-device-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-inner-host-pandas-parquet]
tests/unit/ops/test_ops.py::test_join_external[True-True-inner-device-pandas-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:171: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
_ext.drop_duplicates(ignore_index=True, inplace=True)

tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/ops/test_ops.py::test_groupby_op[keys0-True]
tests/unit/ops/test_ops.py::test_groupby_op[keys0-False]
tests/unit/ops/test_ops.py::test_groupby_op[id-True]
tests/unit/ops/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))

tests/unit/workflow/test_cpu_workflow.py: 78 warnings
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/__init__.py 18 0 0 0 100%
nvtabular/columns/__init__.py 2 0 0 0 100%
nvtabular/columns/schema.py 209 17 103 20 88% 46->62, 49, 51, 53-56, 58, 98->109, 104, 147, 174, 260->267, 262, 263->265, 275, 292->297, 295->297, 308, 332, 339, 348, 351, 356->355
nvtabular/columns/selector.py 74 1 34 0 99% 121
nvtabular/dispatch.py 280 55 138 23 78% 36-40, 45-47, 53-63, 70-71, 99-101, 106-109, 113-118, 125, 144, 155, 161, 166->168, 179, 202-205, 244->246, 253, 256, 262, 278, 285, 316->321, 319, 322, 325->329, 362, 373-376, 402-405, 435, 439, 480, 504, 506, 513
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 134 78 90 15 39% 30, 99, 103, 114-130, 140, 143-158, 162, 166-167, 173-198, 207-217, 220-227, 229->233, 234, 239-279, 282
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/tensorflow/tfrecords_to_parquet.py 58 58 30 0 0% 16-111
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 2 14 2 91% 50, 91
nvtabular/framework_utils/torch/models.py 45 1 28 4 93% 57->61, 87->89, 93->96, 103
nvtabular/framework_utils/torch/utils.py 75 5 30 5 90% 51->53, 64, 71->76, 75, 118-120
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 385 215 180 11 44% 82-86, 141-174, 195-218, 263-307, 338, 364-372, 380-387, 406, 428-444, 485-489, 527-537, 583-623, 629-645, 649-716, 723->726, 726->722, 762-772, 781, 791, 812, 818-844, 850-876, 883, 888-894
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 176 176 98 0 0% 27-332
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/inference/triton/model_pt.py 101 101 40 0 0% 27-220
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 18 72 11 87% 111, 114, 150, 235-246, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 361 76 174 28 76% 47-48, 258, 260, 273, 282, 300-314, 437->511, 442-445, 451-458, 463-507, 511->520, 571-572, 573->577, 620, 742, 744, 746, 752, 756-758, 760, 820-821, 848, 855-856, 862, 868, 964-965, 1082-1087, 1093, 1185, 1194
nvtabular/io/dataset_engine.py 24 1 0 0 96% 48
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 551 45 180 26 89% 34-35, 57, 76, 80->92, 89, 112, 122->127, 140, 142, 166->170, 173-179, 225-233, 248, 254, 272->274, 287, 306-316, 457-462, 500-505, 621->628, 689->694, 695-696, 816, 820, 824, 830, 862, 879, 883, 890->892, 1000->exit, 1010->1015, 1020->1030, 1035, 1057, 1080-1081
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 330 13 140 11 95% 128, 143-144, 242->244, 254-258, 304-305, 344->348, 345->344, 419, 423-424, 454, 559, 567
nvtabular/loader/tensorflow.py 163 22 52 7 86% 58, 66-69, 84, 98, 308, 344, 359-361, 390-392, 402-410, 413-416
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/add_metadata.py 9 0 0 0 100%
nvtabular/ops/bucketize.py 37 10 18 3 69% 53-55, 59->exit, 62-65, 84-87, 94
nvtabular/ops/categorify.py 626 70 334 47 86% 245, 247, 264, 268, 276, 284, 286, 313, 332-333, 357, 366, 377->381, 385-392, 474-475, 499-504, 622, 715, 733, 769, 847-848, 863-867, 868->832, 886, 894, 901->exit, 925, 928->931, 983, 988, 1010->1014, 1016->973, 1022-1025, 1037, 1041, 1043, 1050, 1055-1058, 1136, 1138, 1208->1231, 1214->1231, 1232-1237, 1274, 1293->1298, 1297, 1307->1304, 1312->1304, 1319, 1322, 1330-1340
nvtabular/ops/clip.py 18 2 6 3 79% 44, 52->54, 55
nvtabular/ops/column_similarity.py 118 25 38 5 74% 19-20, 78->exit, 108, 134, 198-199, 208-210, 218-234, 251->254, 255, 265
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 31 1 8 1 95% 69->71, 94
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 91 12 36 3 82% 63-67, 93, 121, 147, 151, 162-165
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 119 3 70 4 96% 73, 84, 94->96, 106->111, 141
nvtabular/ops/hash_bucket.py 41 2 20 2 93% 72, 106->112, 118
nvtabular/ops/hashed_cross.py 36 4 15 3 86% 53, 66, 81, 91
nvtabular/ops/internal/__init__.py 3 0 0 0 100%
nvtabular/ops/internal/concat_columns.py 11 0 0 0 100%
nvtabular/ops/internal/identity.py 6 1 0 0 83% 42
nvtabular/ops/internal/subset_columns.py 13 1 0 0 92% 53
nvtabular/ops/join_external.py 89 7 36 6 90% 20-21, 113, 115, 117, 159, 176->178, 215
nvtabular/ops/join_groupby.py 101 7 36 4 92% 108, 115, 124, 131->130, 215-216, 219-220
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 66 24 26 1 58% 21-22, 53-54, 104-118, 126-137
nvtabular/ops/logop.py 13 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 81 10 14 1 86% 70, 78-79, 85, 118-119, 141-142, 146, 157
nvtabular/ops/operator.py 66 3 14 1 95% 111, 189, 196
nvtabular/ops/rename.py 41 3 22 3 90% 47, 88-90
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 153 11 66 4 91% 167->171, 175->184, 232-233, 236-237, 249-255, 346->349, 362
nvtabular/tags.py 16 0 0 0 100%
nvtabular/tools/__init__.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 321
nvtabular/tools/dataset_inspector.py 50 7 18 1 79% 32-39
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow/__init__.py 2 0 0 0 100%
nvtabular/workflow/node.py 240 18 116 10 89% 55, 93->98, 146, 248->252, 288, 302, 311, 329-334, 339, 388-389, 400->395, 453-458
nvtabular/workflow/workflow.py 221 15 112 7 93% 28-29, 47, 139, 195, 222-224, 332, 347-348, 366-367, 502, 514

TOTAL 7538 1485 3041 342 78%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 77.69%
=========================== short test summary info ============================
SKIPPED [1] ../../../../../usr/local/lib/python3.8/dist-packages/dask_cudf/io/tests/test_s3.py:16: could not import 's3fs': No module named 's3fs'
SKIPPED [8] tests/unit/test_io.py:555: could not import 'uavro': No module named 'uavro'
SKIPPED [2] tests/unit/test_io.py:914: Dask>=2021.07.1 required for file aggregation
SKIPPED [1] tests/unit/loader/test_tf_dataloader.py:521: not working correctly in ci environment
==== 1 failed, 1518 passed, 12 skipped, 808 warnings in 2398.52s (0:39:58) =====
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins6901837598072310846.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #1127 of commit 6675990fcf757141c11dd257bb59f984d10fecb5, no merge conflicts.
Running as SYSTEM
Setting status of 6675990fcf757141c11dd257bb59f984d10fecb5 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3512/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1127/*:refs/remotes/origin/pr/1127/* # timeout=10
 > git rev-parse 6675990fcf757141c11dd257bb59f984d10fecb5^{commit} # timeout=10
Checking out Revision 6675990fcf757141c11dd257bb59f984d10fecb5 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6675990fcf757141c11dd257bb59f984d10fecb5 # timeout=10
Commit message: "Merge branch 'get-embedding-sizes-fix' of https://github.com/jperez999/NVTabular into get-embedding-sizes-fix"
 > git rev-list --no-walk 8c42d9db4bfaae9baf91c8115abd1d68490bca58 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7372782873294563199.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.4)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (58.0.4)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.h' under directory 'cpp'
warning: no files found matching '*.cu' under directory 'cpp'
warning: no files found matching '*.cuh' under directory 'cpp'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+80.g6675990 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+80.g6675990 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+80.g6675990 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+80.g6675990 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+80.g6675990 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+80.g6675990
Searching for protobuf==3.17.3
Best match: protobuf 3.17.3
Adding protobuf 3.17.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tensorflow-metadata==1.2.0
Best match: tensorflow-metadata 1.2.0
Processing tensorflow_metadata-1.2.0-py3.8.egg
tensorflow-metadata 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tensorflow_metadata-1.2.0-py3.8.egg
Searching for pyarrow==4.0.1
Best match: pyarrow 4.0.1
Adding pyarrow 4.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.54.0
Best match: numba 0.54.0
Processing numba-0.54.0-py3.8-linux-x86_64.egg
numba 0.54.0 is already the active version in easy-install.pth
Installing pycc script to /var/jenkins_home/.local/bin
Installing numba script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg
Searching for pandas==1.2.5
Best match: pandas 1.2.5
Processing pandas-1.2.5-py3.8-linux-x86_64.egg
pandas 1.2.5 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Processing distributed-2021.4.1-py3.8.egg
distributed 2021.4.1 is already the active version in easy-install.pth
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/distributed-2021.4.1-py3.8.egg
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for googleapis-common-protos==1.53.0
Best match: googleapis-common-protos 1.53.0
Processing googleapis_common_protos-1.53.0-py3.8.egg
googleapis-common-protos 1.53.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/googleapis_common_protos-1.53.0-py3.8.egg
Searching for absl-py==0.12.0
Best match: absl-py 0.12.0
Processing absl_py-0.12.0-py3.8.egg
absl-py 0.12.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/absl_py-0.12.0-py3.8.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==58.0.4
Best match: setuptools 58.0.4
Adding setuptools 58.0.4 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.37.0
Best match: llvmlite 0.37.0
Processing llvmlite-0.37.0-py3.8-linux-x86_64.egg
llvmlite 0.37.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/llvmlite-0.37.0-py3.8-linux-x86_64.egg
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.8.1
Best match: fsspec 2021.8.1
Processing fsspec-2021.8.1-py3.8.egg
fsspec 2021.8.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.8.1-py3.8.egg
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+80.g6675990
Running black --check
All done! ✨ 🍰 ✨
128 files would be left unchanged.
Running flake8
Running isort
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:504:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:67:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1530 items / 1 skipped / 1529 selected

tests/unit/test_dask_nvt.py ............................................ [ 2%]
...........................................................F......... [ 7%]
tests/unit/test_io.py .................................................. [ 10%]
........................................................................ [ 15%]
..........ssssssss.....................................................s [ 20%]
s [ 20%]
tests/unit/test_notebooks.py ...... [ 20%]
tests/unit/test_tf4rec.py . [ 20%]
tests/unit/test_tools.py ...................... [ 22%]
tests/unit/test_triton_inference.py .............................. [ 23%]
tests/unit/columns/test_column_schemas.py .............................. [ 25%]
................................................... [ 29%]
tests/unit/columns/test_column_selector.py .................... [ 30%]
tests/unit/framework_utils/test_tf_feature_columns.py . [ 30%]
tests/unit/framework_utils/test_tf_layers.py ..F........................ [ 32%]
................................................... [ 35%]
tests/unit/framework_utils/test_torch_layers.py . [ 35%]
tests/unit/loader/test_dataloader_backend.py .. [ 35%]
tests/unit/loader/test_tf_dataloader.py ................................ [ 38%]
........................................s.. [ 40%]
tests/unit/loader/test_torch_dataloader.py ............................. [ 42%]
....................................................... [ 46%]
tests/unit/ops/test_column_similarity.py ........................ [ 47%]
tests/unit/ops/test_ops.py ............................................. [ 50%]
........................................................................ [ 55%]
........................................................................ [ 60%]
........................................................................ [ 64%]
........................................................................ [ 69%]
........................................................................ [ 74%]
................................................. [ 77%]
tests/unit/ops/test_ops_schema.py ...................................... [ 80%]
........................................................................ [ 84%]
........................................................................ [ 89%]
.......................... [ 91%]
tests/unit/workflow/test_cpu_workflow.py ...... [ 91%]
tests/unit/workflow/test_workflow.py ................................... [ 93%]
.......................................................... [ 97%]
tests/unit/workflow/test_workflow_node.py ........... [ 98%]
tests/unit/workflow/test_workflow_ops.py .. [ 98%]
tests/unit/workflow/test_workflow_schemas.py ....................... [100%]

=================================== FAILURES ===================================
_________ test_dask_preproc_cpu[None-Shuffle.PER_WORKER-csv-no-header] _________

client = <Client: 'tcp://127.0.0.1:43141' processes=2 threads=16, memory=125.83 GiB>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-39/test_dask_preproc_cpu_None_Shu2')
datasets = {'cats': local('/tmp/pytest-of-jenkins/pytest-39/cats0'), 'csv': local('/tmp/pytest-of-jenkins/pytest-39/csv0'), 'csv-...ocal('/tmp/pytest-of-jenkins/pytest-39/csv-no-header0'), 'parquet': local('/tmp/pytest-of-jenkins/pytest-39/parquet0')}
engine = 'csv-no-header', shuffle = <Shuffle.PER_WORKER: 1>, cpu = None

@pytest.mark.parametrize("engine", ["parquet", "csv", "csv-no-header"])
@pytest.mark.parametrize("shuffle", [Shuffle.PER_WORKER, None])
@pytest.mark.parametrize("cpu", [None, True])
def test_dask_preproc_cpu(client, tmpdir, datasets, engine, shuffle, cpu):
    paths = glob.glob(str(datasets[engine]) + "/*." + engine.split("-")[0])
    if engine == "parquet":
        df1 = cudf.read_parquet(paths[0])[mycols_pq]
        df2 = cudf.read_parquet(paths[1])[mycols_pq]
    elif engine == "csv":
        df1 = cudf.read_csv(paths[0], header=0)[mycols_csv]
        df2 = cudf.read_csv(paths[1], header=0)[mycols_csv]
    else:
        df1 = cudf.read_csv(paths[0], names=allcols_csv)[mycols_csv]
        df2 = cudf.read_csv(paths[1], names=allcols_csv)[mycols_csv]
    df0 = cudf.concat([df1, df2], axis=0)

    if engine in ("parquet", "csv"):
        dataset = Dataset(paths, part_size="1MB", cpu=cpu)
    else:
        dataset = Dataset(paths, names=allcols_csv, part_size="1MB", cpu=cpu)

    # Simple transform (normalize)
    cat_names = ["name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]
    conts = cont_names >> ops.FillMissing() >> ops.Normalize()
    workflow = Workflow(conts + cat_names + label_name, client=client)
    transformed = workflow.fit_transform(dataset)

    # Write out dataset
    output_path = os.path.join(tmpdir, "processed")
    transformed.to_parquet(output_path=output_path, shuffle=shuffle, out_files_per_proc=4)

    # Check the final result
  df_disk = dd_read_parquet(output_path, engine="pyarrow").compute()

tests/unit/test_dask_nvt.py:273:


../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/base.py:285: in compute
(result,) = compute(self, traverse=False, **kwargs)
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/base.py:567: in compute
results = schedule(dsk, keys, **kwargs)
../../../.local/lib/python3.8/site-packages/distributed/client.py:2666: in get
results = self.gather(packed, asynchronous=asynchronous, direct=direct)
../../../.local/lib/python3.8/site-packages/distributed/client.py:1975: in gather
return self.sync(
../../../.local/lib/python3.8/site-packages/distributed/client.py:843: in sync
return sync(
../../../.local/lib/python3.8/site-packages/distributed/utils.py:353: in sync
raise exc.with_traceback(tb)
../../../.local/lib/python3.8/site-packages/distributed/utils.py:336: in f
result[0] = yield future
../../../.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg/tornado/gen.py:762: in run
value = future.result()
../../../.local/lib/python3.8/site-packages/distributed/client.py:1840: in _gather
raise exception.with_traceback(traceback)
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/core.py:381: in read_parquet_part
dfs = [
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/core.py:382: in <listcomp>
func(fs, rg, columns.copy(), index, **toolz.merge(kwargs, kw))
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/arrow.py:599: in read_partition
arrow_table = cls._read_table(
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/arrow.py:2007: in _read_table
return _read_table_from_path(
../../../.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/io/parquet/arrow.py:406: in _read_table_from_path
return pq.ParquetFile(fil).read_row_groups(
/usr/local/lib/python3.8/dist-packages/pyarrow/parquet.py:217: in __init__
self.reader.open(source, use_memory_map=memory_map,
pyarrow/_parquet.pyx:949: in pyarrow._parquet.ParquetReader.open
???


???
E OSError: Couldn't deserialize thrift: TProtocolException: Invalid data

pyarrow/error.pxi:112: OSError
----------------------------- Captured stderr call -----------------------------
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))
distributed.worker - WARNING - Compute Failed
Function: read_parquet_part
args: (<fsspec.implementations.local.LocalFileSystem object at 0x7fb8192f4dc0>, <bound method ArrowDatasetEngine.read_partition of <class 'dask.dataframe.io.parquet.arrow.ArrowLegacyEngine'>>, Empty DataFrame
Columns: [x, y, id, name-string, label]
Index: [], [(('/tmp/pytest-of-jenkins/pytest-39/test_dask_preproc_cpu_None_Shu2/processed/part_2.parquet', [0], []), {})], ['x', 'y', 'id', 'name-string', 'label'], None, {'partitions': <pyarrow.parquet.ParquetPartitions object at 0x7fb6849ecf10>, 'categories': [], 'filters': None})
kwargs: {}
Exception: OSError("Couldn't deserialize thrift: TProtocolException: Invalid data\n")

____________________ test_dense_embedding_layer[mean-stack] ____________________

aggregation = 'stack', combiner = 'mean'

@pytest.mark.parametrize("aggregation", ["stack", "concat"])
@pytest.mark.parametrize("combiner", ["sum", "mean"])  # TODO: add sqrtn
def test_dense_embedding_layer(aggregation, combiner):
    raw_good_columns = get_good_feature_columns()
    scalar_numeric, vector_numeric, one_hot, multi_hot = raw_good_columns
    one_hot_embedding = tf.feature_column.indicator_column(one_hot)
    multi_hot_embedding = tf.feature_column.embedding_column(multi_hot, 8, combiner=combiner)

    # should raise ValueError if passed categorical columns
    with pytest.raises(ValueError):
        embedding_layer = layers.DenseFeatures(raw_good_columns, aggregation=aggregation)

    if aggregation == "stack":
        # can't pass numeric to stack aggregation unless dims are 1
        with pytest.raises(ValueError):
            embedding_layer = layers.DenseFeatures(
                [
                    scalar_numeric,
                    vector_numeric,
                    one_hot_embedding,
                    multi_hot_embedding,
                ],
                aggregation=aggregation,
            )
        # can't have mismatched dims with stack aggregation
        with pytest.raises(ValueError):
            embedding_layer = layers.DenseFeatures(
                [one_hot_embedding, multi_hot_embedding], aggregation=aggregation
            )

        # reset b embedding to have matching dims
        multi_hot_embedding = tf.feature_column.embedding_column(multi_hot, 100, combiner=combiner)
        cols = [one_hot_embedding, multi_hot_embedding]
    else:
        cols = [scalar_numeric, vector_numeric, one_hot_embedding, multi_hot_embedding]

    embedding_layer = layers.DenseFeatures(cols, aggregation=aggregation)
    inputs = {
        "scalar_continuous": tf.keras.Input(name="scalar_continuous", shape=(1,), dtype=tf.float32),
        "vector_continuous": tf.keras.Input(
            name="vector_continuous__values", shape=(1,), dtype=tf.float32
        ),
        "one_hot": tf.keras.Input(name="one_hot", shape=(1,), dtype=tf.int64),
        "multi_hot": (
            tf.keras.Input(name="multi_hot__values", shape=(1,), dtype=tf.int64),
            tf.keras.Input(name="multi_hot__nnzs", shape=(1,), dtype=tf.int64),
        ),
    }
    if aggregation == "stack":
        inputs.pop("scalar_continuous")
        inputs.pop("vector_continuous")

    output = embedding_layer(inputs)
    model = tf.keras.Model(inputs=inputs, outputs=output)
    model.compile("sgd", "mse")

    # TODO: check for out-of-range categorical behavior
    scalar = np.array([0.1, -0.2, 0.3], dtype=np.float32)
    vector = np.random.randn(3, 128).astype("float32")
    one_hot = np.array([44, 21, 32])
    multi_hot_values = np.array([0, 2, 1, 4, 1, 3, 1])
    multi_hot_nnzs = np.array([1, 2, 4])
    x = {
        "scalar_continuous": scalar[:, None],
        "vector_continuous": vector.flatten()[:, None],
        "one_hot": one_hot[:, None],
        "multi_hot": (multi_hot_values[:, None], multi_hot_nnzs[:, None]),
    }
    if aggregation == "stack":
        x.pop("scalar_continuous")
        x.pop("vector_continuous")

    multi_hot_embedding_table = embedding_layer.embedding_tables["multi_hot"].numpy()
    multi_hot_embedding_rows = _compute_expected_multi_hot(
        multi_hot_embedding_table, multi_hot_values, multi_hot_nnzs, combiner
    )

    # check that shape and values match up
    y_hat = model(x).numpy()
    assert y_hat.shape[0] == 3
    if aggregation == "stack":
        assert len(y_hat.shape) == 3
        # len of columns is 2 because of mh (vals, nnzs) struct
        assert y_hat.shape[1] == (len(x))
        assert y_hat.shape[2] == 100
      np.testing.assert_allclose(y_hat[:, 0], multi_hot_embedding_rows, rtol=1e-05)

E AssertionError:
E Not equal to tolerance rtol=1e-05, atol=0
E
E Mismatched elements: 1 / 300 (0.333%)
E Max absolute difference: 1.4901161e-08
E Max relative difference: 1.8782282e-05
E x: array([[ 1.060052e-01, 1.583474e-01, -8.539265e-02, -2.269728e-03,
E 2.466256e-02, 6.518302e-03, 6.934836e-02, 2.290908e-01,
E -1.516305e-01, 5.413985e-02, 1.011277e-01, -6.577917e-02,...
E y: array([[ 1.060052e-01, 1.583474e-01, -8.539265e-02, -2.269728e-03,
E 2.466256e-02, 6.518302e-03, 6.934836e-02, 2.290908e-01,
E -1.516305e-01, 5.413985e-02, 1.011277e-01, -6.577917e-02,...

tests/unit/framework_utils/test_tf_layers.py:139: AssertionError
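
The mismatch above is one element out of 300, off by at most ~1.9e-5 relative — just past the test's rtol=1e-5, which is characteristic float32 reduction-order noise rather than an embedding-lookup bug. A minimal sketch of the failing comparison and a tolerance that absorbs such noise (the rtol=1e-4 value below is illustrative, not the project's chosen setting):

import numpy as np

expected = np.float32(1.0)
y_hat = np.float32(1.0 + 1.9e-5)  # off by ~1.9e-5 relative, as in the log above

# Fails at the test's tolerance:
#   np.testing.assert_allclose(y_hat, expected, rtol=1e-5)
# Passes once the tolerance allows a few extra float32 ulps:
np.testing.assert_allclose(y_hat, expected, rtol=1e-4)
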
=============================== warnings summary ===============================
tests/unit/test_dask_nvt.py: 3 warnings
tests/unit/test_io.py: 24 warnings
tests/unit/test_tf4rec.py: 2 warnings
tests/unit/test_tools.py: 2 warnings
tests/unit/test_triton_inference.py: 5 warnings
tests/unit/loader/test_tf_dataloader.py: 50 warnings
tests/unit/loader/test_torch_dataloader.py: 16 warnings
tests/unit/ops/test_column_similarity.py: 7 warnings
tests/unit/ops/test_ops.py: 74 warnings
tests/unit/workflow/test_workflow.py: 31 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
tests/unit/workflow/test_workflow_schemas.py: 1 warning
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))
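
This NumbaPerformanceWarning fires when a CUDA kernel is launched with a one-block grid, leaving most of the GPU's 56 SMs idle (hence the "2 * SM count (112)" in the message). A generic sketch of sizing the grid from the data — not NVTabular's own kernels:

from numba import cuda
import numpy as np

@cuda.jit
def scale(x, a):
    i = cuda.grid(1)  # global thread index
    if i < x.size:
        x[i] *= a

x = cuda.to_device(np.ones(1 << 20, dtype=np.float32))
threads = 256
blocks = (x.size + threads - 1) // threads  # 4096 blocks, well above 2 * SM count
scale[blocks, threads](x, np.float32(2.0))
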

tests/unit/test_io.py::test_validate_dataset_bad_schema
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:1123: UserWarning: Unable to sample column dtypes to infer nvt.Dataset schema, schema is empty.
warnings.warn(

tests/unit/test_io.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/__init__.py:38: DeprecationWarning: ColumnGroup is deprecated, use ColumnSelector instead
warnings.warn("ColumnGroup is deprecated, use ColumnSelector instead", DeprecationWarning)

tests/unit/test_io.py: 24 warnings
tests/unit/loader/test_torch_dataloader.py: 54 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/node.py:47: FutureWarning: The ["a", "b", "c"] >> ops.Operator syntax for creating a ColumnGroup has been deprecated in NVTabular 21.09 and will be removed in a future version.
warnings.warn(
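
Both deprecation warnings point at the same migration: construct the selector explicitly instead of piping a bare list of column names into an operator. A sketch, assuming ColumnSelector is re-exported at the package root as the deprecation message implies:

import nvtabular as nvt
from nvtabular import ops

# Deprecated since 21.09:
#   cats = ["a", "b", "c"] >> ops.Categorify()
# Explicit selector instead:
cats = nvt.ColumnSelector(["a", "b", "c"]) >> ops.Categorify()
workflow = nvt.Workflow(cats)
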

tests/unit/test_io.py: 36 warnings
tests/unit/workflow/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py:89: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/workflow/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 36 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:515: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(
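
All three warnings share one remedy, already visible in the test code above (Workflow(conts + cat_names + label_name, client=client)): pass the distributed client to the object explicitly instead of relying on the global one. A minimal sketch, with import paths assumed:

from dask.distributed import Client
import nvtabular as nvt
from nvtabular import ops

client = Client()  # or attach to an existing cluster

# An explicit client enables distributed execution; omitting it falls back
# to the single-threaded scheduler even when a global client exists.
features = nvt.ColumnSelector(["x", "y", "id"]) >> ops.FillMissing() >> ops.Normalize()
workflow = nvt.Workflow(features, client=client)
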

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/ops/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/indexing.py:1637: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_block(indexer, value, name)

tests/unit/ops/test_ops.py: 80 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[tmp] = _arange(len(df), like_df=df, dtype="int32")
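
Both SettingWithCopyWarnings come from chained indexing: slicing first and assigning into the result, which may write to a temporary copy. A standalone illustration of the pattern and the single .loc write pandas recommends:

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

sub = df[df["a"] > 1]  # the slice may be a copy of df's data
sub["b"] = 0           # chained assignment -> SettingWithCopyWarning

df.loc[df["a"] > 1, "b"] = 0  # one .loc call writes into df directly
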

tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/ops/test_ops.py::test_groupby_op[keys0-True]
tests/unit/ops/test_ops.py::test_groupby_op[keys0-False]
tests/unit/ops/test_ops.py::test_groupby_op[id-True]
tests/unit/ops/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))
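
head() only inspects the first partition by default, so a filter that empties that partition triggers this warning even though later partitions still hold rows. A small reproduction with the npartitions workaround the message suggests:

import dask.dataframe as dd
import pandas as pd

ddf = dd.from_pandas(pd.DataFrame({"a": range(4)}), npartitions=4)
filtered = ddf[ddf["a"] > 2]      # only the last partition matches

filtered.head(1)                  # warns: first partition yields 0 rows
filtered.head(1, npartitions=-1)  # scans every partition, no warning
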

tests/unit/workflow/test_cpu_workflow.py: 78 warnings
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/__init__.py 18 0 0 0 100%
nvtabular/columns/__init__.py 2 0 0 0 100%
nvtabular/columns/schema.py 209 17 103 20 88% 46->62, 49, 51, 53-56, 58, 98->109, 104, 147, 174, 260->267, 262, 263->265, 275, 292->297, 295->297, 308, 332, 339, 348, 351, 356->355
nvtabular/columns/selector.py 74 1 34 0 99% 121
nvtabular/dispatch.py 290 55 144 23 79% 36-40, 45-47, 53-63, 70-71, 114-116, 121-124, 128-133, 140, 159, 170, 176, 181->183, 194, 217-220, 259->261, 268, 271, 277, 293, 300, 331->336, 334, 337, 340->344, 377, 388-391, 417-420, 450, 454, 495, 519, 521, 528
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 134 78 90 15 39% 30, 99, 103, 114-130, 140, 143-158, 162, 166-167, 173-198, 207-217, 220-227, 229->233, 234, 239-279, 282
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/tensorflow/tfrecords_to_parquet.py 58 58 30 0 0% 16-111
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 2 14 2 91% 50, 91
nvtabular/framework_utils/torch/models.py 45 1 28 4 93% 57->61, 87->89, 93->96, 103
nvtabular/framework_utils/torch/utils.py 75 5 30 5 90% 51->53, 64, 71->76, 75, 118-120
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 385 210 180 13 45% 82-86, 141-174, 195-218, 263-307, 338, 364-372, 380-387, 406, 428-444, 485-489, 527-537, 583-623, 629-645, 649-716, 723->726, 726->722, 762-772, 781, 791, 812, 818-844, 850-876, 883, 889->892, 893
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 176 176 98 0 0% 27-332
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/inference/triton/model_pt.py 101 101 40 0 0% 27-220
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 18 72 11 87% 111, 114, 150, 235-246, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 364 76 176 28 76% 48-49, 259, 261, 274, 283, 303-317, 440->514, 445-448, 454-461, 466-510, 514->523, 574-575, 576->580, 623, 745, 747, 749, 755, 759-761, 763, 823-824, 851, 858-859, 865, 871, 967-968, 1085-1090, 1096, 1190, 1199
nvtabular/io/dataset_engine.py 24 1 0 0 96% 48
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 551 45 180 26 89% 34-35, 57, 76, 80->92, 89, 112, 122->127, 140, 142, 166->170, 173-179, 225-233, 248, 254, 272->274, 287, 306-316, 457-462, 500-505, 621->628, 689->694, 695-696, 816, 820, 824, 830, 862, 879, 883, 890->892, 1000->exit, 1010->1015, 1020->1030, 1035, 1057, 1080-1081
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 330 13 140 11 95% 128, 143-144, 242->244, 254-258, 304-305, 344->348, 345->344, 419, 423-424, 454, 559, 567
nvtabular/loader/tensorflow.py 163 22 52 7 86% 58, 66-69, 84, 98, 308, 344, 359-361, 390-392, 402-410, 413-416
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/add_metadata.py 9 0 0 0 100%
nvtabular/ops/bucketize.py 37 10 18 3 69% 53-55, 59->exit, 62-65, 84-87, 94
nvtabular/ops/categorify.py 626 67 334 48 86% 245, 247, 264, 268, 276, 284, 286, 313, 332-333, 357, 366, 377->381, 385-392, 474-475, 500-501, 622, 715, 733, 769, 847-848, 863-867, 868->832, 886, 894, 901->exit, 925, 928->931, 983, 988, 1010->1014, 1016->973, 1022-1025, 1037, 1041, 1043, 1050, 1055-1058, 1136, 1138, 1208->1231, 1214->1231, 1232-1237, 1274, 1293->1298, 1297, 1307->1304, 1312->1304, 1319, 1322, 1330-1340
nvtabular/ops/clip.py 18 2 6 3 79% 44, 52->54, 55
nvtabular/ops/column_similarity.py 118 25 38 5 74% 19-20, 78->exit, 108, 134, 198-199, 208-210, 218-234, 251->254, 255, 265
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 31 1 8 1 95% 69->71, 94
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 91 12 36 3 82% 63-67, 93, 121, 147, 151, 162-165
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 119 3 70 4 96% 73, 84, 94->96, 106->111, 141
nvtabular/ops/hash_bucket.py 41 2 20 2 93% 72, 106->112, 118
nvtabular/ops/hashed_cross.py 36 4 15 3 86% 53, 66, 81, 91
nvtabular/ops/internal/__init__.py 3 0 0 0 100%
nvtabular/ops/internal/concat_columns.py 11 0 0 0 100%
nvtabular/ops/internal/identity.py 6 1 0 0 83% 42
nvtabular/ops/internal/subset_columns.py 13 1 0 0 92% 53
nvtabular/ops/join_external.py 92 18 36 7 76% 20-21, 114, 116, 118, 135-161, 177->179, 216->227, 221
nvtabular/ops/join_groupby.py 101 7 36 4 92% 108, 115, 124, 131->130, 215-216, 219-220
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 66 24 26 1 58% 21-22, 53-54, 104-118, 126-137
nvtabular/ops/logop.py 13 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 81 10 14 1 86% 70, 78-79, 85, 118-119, 141-142, 146, 157
nvtabular/ops/operator.py 66 1 14 1 98% 111
nvtabular/ops/rename.py 41 3 22 3 90% 47, 88-90
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 153 11 66 4 91% 167->171, 175->184, 232-233, 236-237, 249-255, 346->349, 362
nvtabular/tags.py 16 0 0 0 100%
nvtabular/tools/__init__.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 321
nvtabular/tools/dataset_inspector.py 50 7 18 1 79% 32-39
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow/__init__.py 2 0 0 0 100%
nvtabular/workflow/node.py 240 18 116 10 89% 55, 93->98, 146, 248->252, 288, 302, 311, 329-334, 339, 388-389, 400->395, 453-458
nvtabular/workflow/workflow.py 221 15 112 7 93% 28-29, 47, 139, 195, 222-224, 332, 347-348, 366-367, 502, 514

TOTAL 7554 1486 3049 346 78%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 77.69%
=========================== short test summary info ============================
SKIPPED [1] ../../../../../usr/local/lib/python3.8/dist-packages/dask_cudf/io/tests/test_s3.py:16: could not import 's3fs': No module named 's3fs'
SKIPPED [8] tests/unit/test_io.py:555: could not import 'uavro': No module named 'uavro'
SKIPPED [2] tests/unit/test_io.py:914: Dask>=2021.07.1 required for file aggregation
SKIPPED [1] tests/unit/loader/test_tf_dataloader.py:521: not working correctly in ci environment
==== 2 failed, 1517 passed, 12 skipped, 762 warnings in 2211.65s (0:36:51) =====
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1713918211077614622.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #1127 of commit 69ee7369a54f55e87c5c435d35d598bc28ea9f12, no merge conflicts.
Running as SYSTEM
Setting status of 69ee7369a54f55e87c5c435d35d598bc28ea9f12 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/3513/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --force --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/1127/*:refs/remotes/origin/pr/1127/* # timeout=10
 > git rev-parse 69ee7369a54f55e87c5c435d35d598bc28ea9f12^{commit} # timeout=10
Checking out Revision 69ee7369a54f55e87c5c435d35d598bc28ea9f12 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 69ee7369a54f55e87c5c435d35d598bc28ea9f12 # timeout=10
Commit message: "remove unnecessary added param for call is checked within function"
 > git rev-list --no-walk 6675990fcf757141c11dd257bb59f984d10fecb5 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins5773827164776042572.sh
Installing NVTabular
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.2.4)
Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (58.0.4)
Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.0)
Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.7.1)
running develop
running egg_info
creating nvtabular.egg-info
writing nvtabular.egg-info/PKG-INFO
writing dependency_links to nvtabular.egg-info/dependency_links.txt
writing requirements to nvtabular.egg-info/requires.txt
writing top-level names to nvtabular.egg-info/top_level.txt
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.h' under directory 'cpp'
warning: no files found matching '*.cu' under directory 'cpp'
warning: no files found matching '*.cuh' under directory 'cpp'
adding license file 'LICENSE'
writing manifest file 'nvtabular.egg-info/SOURCES.txt'
running build_ext
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17
building 'nvtabular_cpp' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/cpp
creating build/temp.linux-x86_64-3.8/cpp/nvtabular
creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+81.g69ee736 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+81.g69ee736 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+81.g69ee736 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.6.0+81.g69ee736 -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0
creating build/lib.linux-x86_64-3.8
x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> 
Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto
Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .)
nvtabular 0.6.0+81.g69ee736 is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular
Processing dependencies for nvtabular==0.6.0+81.g69ee736
Searching for protobuf==3.17.3
Best match: protobuf 3.17.3
Adding protobuf 3.17.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for tensorflow-metadata==1.2.0
Best match: tensorflow-metadata 1.2.0
Processing tensorflow_metadata-1.2.0-py3.8.egg
tensorflow-metadata 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tensorflow_metadata-1.2.0-py3.8.egg
Searching for pyarrow==4.0.1
Best match: pyarrow 4.0.1
Adding pyarrow 4.0.1 to easy-install.pth file
Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for tqdm==4.61.2
Best match: tqdm 4.61.2
Processing tqdm-4.61.2-py3.8.egg
tqdm 4.61.2 is already the active version in easy-install.pth
Installing tqdm script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tqdm-4.61.2-py3.8.egg
Searching for numba==0.54.0
Best match: numba 0.54.0
Processing numba-0.54.0-py3.8-linux-x86_64.egg
numba 0.54.0 is already the active version in easy-install.pth
Installing pycc script to /var/jenkins_home/.local/bin
Installing numba script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg
Searching for pandas==1.2.5
Best match: pandas 1.2.5
Processing pandas-1.2.5-py3.8-linux-x86_64.egg
pandas 1.2.5 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg
Searching for distributed==2021.4.1
Best match: distributed 2021.4.1
Processing distributed-2021.4.1-py3.8.egg
distributed 2021.4.1 is already the active version in easy-install.pth
Installing dask-ssh script to /var/jenkins_home/.local/bin
Installing dask-scheduler script to /var/jenkins_home/.local/bin
Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/distributed-2021.4.1-py3.8.egg
Searching for dask==2021.4.1
Best match: dask 2021.4.1
Processing dask-2021.4.1-py3.8.egg
dask 2021.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg
Searching for PyYAML==5.4.1
Best match: PyYAML 5.4.1
Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg
PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg
Searching for six==1.15.0
Best match: six 1.15.0
Adding six 1.15.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for googleapis-common-protos==1.53.0
Best match: googleapis-common-protos 1.53.0
Processing googleapis_common_protos-1.53.0-py3.8.egg
googleapis-common-protos 1.53.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/googleapis_common_protos-1.53.0-py3.8.egg
Searching for absl-py==0.12.0
Best match: absl-py 0.12.0
Processing absl_py-0.12.0-py3.8.egg
absl-py 0.12.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/absl_py-0.12.0-py3.8.egg
Searching for numpy==1.20.2
Best match: numpy 1.20.2
Adding numpy 1.20.2 to easy-install.pth file
Installing f2py script to /var/jenkins_home/.local/bin
Installing f2py3 script to /var/jenkins_home/.local/bin
Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages
Searching for setuptools==58.0.4
Best match: setuptools 58.0.4
Adding setuptools 58.0.4 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages
Searching for llvmlite==0.37.0
Best match: llvmlite 0.37.0
Processing llvmlite-0.37.0-py3.8-linux-x86_64.egg
llvmlite 0.37.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/llvmlite-0.37.0-py3.8-linux-x86_64.egg
Searching for pytz==2021.1
Best match: pytz 2021.1
Adding pytz 2021.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for python-dateutil==2.8.2
Best match: python-dateutil 2.8.2
Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages
Searching for zict==2.0.0
Best match: zict 2.0.0
Processing zict-2.0.0-py3.8.egg
zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg
Searching for tornado==6.1
Best match: tornado 6.1
Processing tornado-6.1-py3.8-linux-x86_64.egg
tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg
Searching for toolz==0.11.1
Best match: toolz 0.11.1
Processing toolz-0.11.1-py3.8.egg
toolz 0.11.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/toolz-0.11.1-py3.8.egg
Searching for tblib==1.7.0
Best match: tblib 1.7.0
Processing tblib-1.7.0-py3.8.egg
tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg
Searching for sortedcontainers==2.4.0
Best match: sortedcontainers 2.4.0
Processing sortedcontainers-2.4.0-py3.8.egg
sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg
Searching for psutil==5.8.0
Best match: psutil 5.8.0
Processing psutil-5.8.0-py3.8-linux-x86_64.egg
psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg
Searching for msgpack==1.0.2
Best match: msgpack 1.0.2
Processing msgpack-1.0.2-py3.8-linux-x86_64.egg
msgpack 1.0.2 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/msgpack-1.0.2-py3.8-linux-x86_64.egg
Searching for cloudpickle==1.6.0
Best match: cloudpickle 1.6.0
Processing cloudpickle-1.6.0-py3.8.egg
cloudpickle 1.6.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/cloudpickle-1.6.0-py3.8.egg
Searching for click==8.0.1
Best match: click 8.0.1
Processing click-8.0.1-py3.8.egg
click 8.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/click-8.0.1-py3.8.egg
Searching for partd==1.2.0
Best match: partd 1.2.0
Processing partd-1.2.0-py3.8.egg
partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg
Searching for fsspec==2021.8.1
Best match: fsspec 2021.8.1
Processing fsspec-2021.8.1-py3.8.egg
fsspec 2021.8.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/fsspec-2021.8.1-py3.8.egg
Searching for HeapDict==1.0.1
Best match: HeapDict 1.0.1
Processing HeapDict-1.0.1-py3.8.egg
HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg
Searching for locket==0.2.1
Best match: locket 0.2.1
Processing locket-0.2.1-py3.8.egg
locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg
Finished processing dependencies for nvtabular==0.6.0+81.g69ee736
Running black --check
All done! ✨ 🍰 ✨
128 files would be left unchanged.
Running flake8
Running isort
Skipped 2 files
Running bandit
Running pylint
************* Module nvtabular.ops.categorify
nvtabular/ops/categorify.py:504:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
************* Module nvtabular.ops.fill
nvtabular/ops/fill.py:67:15: I1101: Module 'nvtabular_cpp' has no 'inference' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)
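(The two I1101 notes above are informational; pylint's own suggested remedy is a one-line config change. A minimal sketch, assuming the project keeps a .pylintrc at the repo root — pylint releases before 2.7 spell the option extension-pkg-whitelist:)

    [MASTER]
    extension-pkg-allow-list=nvtabular_cpp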


Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb
Building docs
make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.6) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
/usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document
warn("Container node skipped: type={0}".format(mdnode.t))
make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs'
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml
plugins: cov-2.12.1, forked-1.3.0, xdist-2.3.0
collected 1530 items / 1 skipped / 1529 selected

tests/unit/test_dask_nvt.py ............................................ [ 2%]
..................................................................... [ 7%]
tests/unit/test_io.py .................................................. [ 10%]
........................................................................ [ 15%]
..........ssssssss.....................................................s [ 20%]
s [ 20%]
tests/unit/test_notebooks.py ...... [ 20%]
tests/unit/test_tf4rec.py . [ 20%]
tests/unit/test_tools.py ...................... [ 22%]
tests/unit/test_triton_inference.py .............................. [ 23%]
tests/unit/columns/test_column_schemas.py .............................. [ 25%]
................................................... [ 29%]
tests/unit/columns/test_column_selector.py .................... [ 30%]
tests/unit/framework_utils/test_tf_feature_columns.py . [ 30%]
tests/unit/framework_utils/test_tf_layers.py ........................... [ 32%]
................................................... [ 35%]
tests/unit/framework_utils/test_torch_layers.py . [ 35%]
tests/unit/loader/test_dataloader_backend.py .. [ 35%]
tests/unit/loader/test_tf_dataloader.py ................................ [ 38%]
........................................s.. [ 40%]
tests/unit/loader/test_torch_dataloader.py ............................. [ 42%]
....................................................... [ 46%]
tests/unit/ops/test_column_similarity.py ........................ [ 47%]
tests/unit/ops/test_ops.py ............................................. [ 50%]
........................................................................ [ 55%]
........................................................................ [ 60%]
........................................................................ [ 64%]
........................................................................ [ 69%]
........................................................................ [ 74%]
................................................. [ 77%]
tests/unit/ops/test_ops_schema.py ...................................... [ 80%]
........................................................................ [ 84%]
........................................................................ [ 89%]
.......................... [ 91%]
tests/unit/workflow/test_cpu_workflow.py ...... [ 91%]
tests/unit/workflow/test_workflow.py ................................... [ 93%]
.......................................................... [ 97%]
tests/unit/workflow/test_workflow_node.py ........... [ 98%]
tests/unit/workflow/test_workflow_ops.py .. [ 98%]
tests/unit/workflow/test_workflow_schemas.py ....................... [100%]

=============================== warnings summary ===============================
tests/unit/test_dask_nvt.py: 3 warnings
tests/unit/test_io.py: 24 warnings
tests/unit/test_tf4rec.py: 2 warnings
tests/unit/test_tools.py: 2 warnings
tests/unit/test_triton_inference.py: 5 warnings
tests/unit/loader/test_tf_dataloader.py: 50 warnings
tests/unit/loader/test_torch_dataloader.py: 16 warnings
tests/unit/ops/test_column_similarity.py: 7 warnings
tests/unit/ops/test_ops.py: 74 warnings
tests/unit/workflow/test_workflow.py: 31 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
tests/unit/workflow/test_workflow_schemas.py: 1 warning
/var/jenkins_home/.local/lib/python3.8/site-packages/numba-0.54.0-py3.8-linux-x86_64.egg/numba/cuda/compiler.py:865: NumbaPerformanceWarning: Grid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.
warn(NumbaPerformanceWarning(msg))

tests/unit/test_io.py::test_validate_dataset_bad_schema
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:1123: UserWarning: Unable to sample column dtypes to infer nvt.Dataset schema, schema is empty.
warnings.warn(

tests/unit/test_io.py: 96 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/__init__.py:38: DeprecationWarning: ColumnGroup is deprecated, use ColumnSelector instead
warnings.warn("ColumnGroup is deprecated, use ColumnSelector instead", DeprecationWarning)

tests/unit/test_io.py: 24 warnings
tests/unit/loader/test_torch_dataloader.py: 54 warnings
tests/unit/workflow/test_workflow_node.py: 1 warning
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/node.py:47: FutureWarning: The ["a", "b", "c"] >> ops.Operator syntax for creating a ColumnGroup has been deprecated in NVTabular 21.09 and will be removed in a future version.
warnings.warn(

tests/unit/test_io.py: 36 warnings
tests/unit/workflow/test_workflow.py: 44 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py:89: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 52 warnings
tests/unit/workflow/test_workflow.py: 35 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:372: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(

tests/unit/test_io.py: 36 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:515: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled.
warnings.warn(
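(The three warning groups above all point at the same opt-in: distributed execution only kicks in when a client is passed explicitly. A minimal sketch of that pattern, assuming a local dask.distributed cluster — the path and column name are illustrative, not taken from this CI run:)

    from dask.distributed import Client
    import nvtabular as nvt

    client = Client()  # or Client("scheduler-address:8786") for an existing cluster
    features = nvt.ColumnSelector(["x"]) >> nvt.ops.Normalize()
    workflow = nvt.Workflow(features, client=client)  # client= enables distributed execution
    workflow.fit(nvt.Dataset("data.parquet"))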

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:125: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-parquet-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-0.1]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.01]
tests/unit/ops/test_ops.py::test_fill_median[True-True-op_columns1-csv-no-header-0.1]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:126: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.medians[col])

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
tests/unit/ops/test_ops.py::test_fill_missing[True-False-parquet]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/indexing.py:1637: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_block(indexer, value, name)

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:54: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[f"{col}_filled"] = df[col].isna()

tests/unit/ops/test_ops.py::test_fill_missing[True-True-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py:55: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col] = df[col].fillna(self.fill_val)
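(These SettingWithCopyWarning groups fire because the frame being mutated may be a view of another frame; assigning through .loc, or taking an explicit .copy() first, makes the intent unambiguous. A minimal, self-contained sketch of the .loc pattern the warnings recommend — the frame and column names are illustrative:)

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"col": [1.0, np.nan, 3.0]})
    df.loc[:, "col_filled"] = df["col"].isna()               # instead of df["col_filled"] = ...
    df.loc[:, "col"] = df["col"].fillna(df["col"].median())  # instead of df[col] = df[col].fillna(...)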

tests/unit/ops/test_ops.py: 80 warnings
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/join_external.py:191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[tmp] = _arange(len(df), like_df=df, dtype="int32")

tests/unit/ops/test_ops.py::test_filter[parquet-0.1-True]
tests/unit/ops/test_ops.py::test_filter[parquet-0.1-False]
tests/unit/ops/test_ops.py::test_groupby_op[keys0-True]
tests/unit/ops/test_ops.py::test_groupby_op[keys0-False]
tests/unit/ops/test_ops.py::test_groupby_op[id-True]
tests/unit/ops/test_ops.py::test_groupby_op[id-False]
/var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.4.1-py3.8.egg/dask/dataframe/core.py:6610: UserWarning: Insufficient elements for head. 1 elements requested, only 0 elements available. Try passing larger npartitions to head.
warnings.warn(msg.format(n, len(r)))

tests/unit/workflow/test_cpu_workflow.py: 78 warnings
/var/jenkins_home/.local/lib/python3.8/site-packages/pandas-1.2.5-py3.8-linux-x86_64.egg/pandas/core/frame.py:3191: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[k1] = value[k2]

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

examples/multi-gpu-movielens/torch_trainer.py 65 0 6 1 99% 32->36
nvtabular/__init__.py 18 0 0 0 100%
nvtabular/columns/__init__.py 2 0 0 0 100%
nvtabular/columns/schema.py 209 17 103 20 88% 46->62, 49, 51, 53-56, 58, 98->109, 104, 147, 174, 260->267, 262, 263->265, 275, 292->297, 295->297, 308, 332, 339, 348, 351, 356->355
nvtabular/columns/selector.py 74 1 34 0 99% 121
nvtabular/dispatch.py 290 55 144 23 79% 36-40, 45-47, 53-63, 70-71, 114-116, 121-124, 128-133, 140, 159, 170, 176, 181->183, 194, 217-220, 259->261, 268, 271, 277, 293, 300, 331->336, 334, 337, 340->344, 377, 388-391, 417-420, 450, 454, 495, 519, 521, 528
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 134 78 90 15 39% 30, 99, 103, 114-130, 140, 143-158, 162, 166-167, 173-198, 207-217, 220-227, 229->233, 234, 239-279, 282
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 85 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 20 1 43% 49, 74-103, 106-110, 113
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 37-38, 41-60, 71-84, 87
nvtabular/framework_utils/tensorflow/tfrecords_to_parquet.py 58 58 30 0 0% 16-111
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 32 2 14 2 91% 50, 91
nvtabular/framework_utils/torch/models.py 45 1 28 4 93% 57->61, 87->89, 93->96, 103
nvtabular/framework_utils/torch/utils.py 75 5 30 5 90% 51->53, 64, 71->76, 75, 118-120
nvtabular/inference/__init__.py 0 0 0 0 100%
nvtabular/inference/triton/__init__.py 385 210 180 13 45% 82-86, 141-174, 195-218, 263-307, 338, 364-372, 380-387, 406, 428-444, 485-489, 527-537, 583-623, 629-645, 649-716, 723->726, 726->722, 762-772, 781, 791, 812, 818-844, 850-876, 883, 889->892, 893
nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103
nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84
nvtabular/inference/triton/model.py 176 176 98 0 0% 27-332
nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100%
nvtabular/inference/triton/model_pt.py 101 101 40 0 0% 27-220
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 88 88 30 0 0% 16-189
nvtabular/io/csv.py 57 6 20 5 86% 22-23, 99, 103->107, 108, 110, 124
nvtabular/io/dask.py 183 18 72 11 87% 111, 114, 150, 235-246, 398, 408, 425->428, 436, 440->442, 442->438, 447, 449
nvtabular/io/dataframe_engine.py 61 5 28 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125
nvtabular/io/dataset.py 364 76 176 28 76% 48-49, 259, 261, 274, 283, 303-317, 440->514, 445-448, 454-461, 466-510, 514->523, 574-575, 576->580, 623, 745, 747, 749, 755, 759-761, 763, 823-824, 851, 858-859, 865, 871, 967-968, 1085-1090, 1096, 1190, 1199
nvtabular/io/dataset_engine.py 24 1 0 0 96% 48
nvtabular/io/hugectr.py 45 2 24 2 91% 34, 74->97, 101
nvtabular/io/parquet.py 551 45 180 26 89% 34-35, 57, 76, 80->92, 89, 112, 122->127, 140, 142, 166->170, 173-179, 225-233, 248, 254, 272->274, 287, 306-316, 457-462, 500-505, 621->628, 689->694, 695-696, 816, 820, 824, 830, 862, 879, 883, 890->892, 1000->exit, 1010->1015, 1020->1030, 1035, 1057, 1080-1081
nvtabular/io/shuffle.py 31 6 16 5 77% 42, 44-45, 49, 59, 63
nvtabular/io/writer.py 175 13 68 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 288-290
nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 330 13 140 11 95% 128, 143-144, 242->244, 254-258, 304-305, 344->348, 345->344, 419, 423-424, 454, 559, 567
nvtabular/loader/tensorflow.py 163 22 52 7 86% 58, 66-69, 84, 98, 308, 344, 359-361, 390-392, 402-410, 413-416
nvtabular/loader/tf_utils.py 55 10 20 5 80% 29->32, 32->34, 39->41, 43, 50-51, 58-60, 66-70
nvtabular/loader/torch.py 81 13 16 2 78% 25-27, 30-36, 111, 149-150
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/add_metadata.py 9 0 0 0 100%
nvtabular/ops/bucketize.py 37 10 18 3 69% 53-55, 59->exit, 62-65, 84-87, 94
nvtabular/ops/categorify.py 626 67 334 48 86% 245, 247, 264, 268, 276, 284, 286, 313, 332-333, 357, 366, 377->381, 385-392, 474-475, 500-501, 622, 715, 733, 769, 847-848, 863-867, 868->832, 886, 894, 901->exit, 925, 928->931, 983, 988, 1010->1014, 1016->973, 1022-1025, 1037, 1041, 1043, 1050, 1055-1058, 1136, 1138, 1208->1231, 1214->1231, 1232-1237, 1274, 1293->1298, 1297, 1307->1304, 1312->1304, 1319, 1322, 1330-1340
nvtabular/ops/clip.py 18 2 6 3 79% 44, 52->54, 55
nvtabular/ops/column_similarity.py 118 25 38 5 74% 19-20, 78->exit, 108, 134, 198-199, 208-210, 218-234, 251->254, 255, 265
nvtabular/ops/data_stats.py 56 2 22 3 94% 91->93, 95, 97->87, 102
nvtabular/ops/difference_lag.py 31 1 8 1 95% 69->71, 94
nvtabular/ops/dropna.py 8 0 0 0 100%
nvtabular/ops/fill.py 91 12 36 3 82% 63-67, 93, 121, 147, 151, 162-165
nvtabular/ops/filter.py 20 1 6 1 92% 49
nvtabular/ops/groupby.py 119 3 70 4 96% 73, 84, 94->96, 106->111, 141
nvtabular/ops/hash_bucket.py 41 2 20 2 93% 72, 106->112, 118
nvtabular/ops/hashed_cross.py 36 4 15 3 86% 53, 66, 81, 91
nvtabular/ops/internal/__init__.py 3 0 0 0 100%
nvtabular/ops/internal/concat_columns.py 11 0 0 0 100%
nvtabular/ops/internal/identity.py 6 1 0 0 83% 42
nvtabular/ops/internal/subset_columns.py 13 1 0 0 92% 53
nvtabular/ops/join_external.py 92 18 36 7 76% 20-21, 114, 116, 118, 135-161, 177->179, 216->227, 221
nvtabular/ops/join_groupby.py 101 7 36 4 92% 108, 115, 124, 131->130, 215-216, 219-220
nvtabular/ops/lambdaop.py 39 6 18 6 79% 59, 63, 77, 89, 94, 103
nvtabular/ops/list_slice.py 66 24 26 1 58% 21-22, 53-54, 104-118, 126-137
nvtabular/ops/logop.py 13 0 0 0 100%
nvtabular/ops/moments.py 65 0 20 0 100%
nvtabular/ops/normalize.py 81 10 14 1 86% 70, 78-79, 85, 118-119, 141-142, 146, 157
nvtabular/ops/operator.py 66 1 14 1 98% 111
nvtabular/ops/rename.py 41 3 22 3 90% 47, 88-90
nvtabular/ops/stat_operator.py 8 0 0 0 100%
nvtabular/ops/target_encoding.py 153 11 66 4 91% 167->171, 175->184, 232-233, 236-237, 249-255, 346->349, 362
nvtabular/tags.py 16 0 0 0 100%
nvtabular/tools/__init__.py 0 0 0 0 100%
nvtabular/tools/data_gen.py 236 1 62 1 99% 321
nvtabular/tools/dataset_inspector.py 50 7 18 1 79% 32-39
nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168
nvtabular/utils.py 102 43 46 8 52% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153
nvtabular/worker.py 82 5 38 7 90% 24-25, 82->99, 91, 92->99, 99->102, 108, 110, 111->113
nvtabular/workflow/__init__.py 2 0 0 0 100%
nvtabular/workflow/node.py 240 18 116 10 89% 55, 93->98, 146, 248->252, 288, 302, 311, 329-334, 339, 388-389, 400->395, 453-458
nvtabular/workflow/workflow.py 221 15 112 7 93% 28-29, 47, 139, 195, 222-224, 332, 347-348, 366-367, 502, 514

TOTAL 7554 1486 3049 346 78%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 77.69%
=========================== short test summary info ============================
SKIPPED [1] ../../../../../usr/local/lib/python3.8/dist-packages/dask_cudf/io/tests/test_s3.py:16: could not import 's3fs': No module named 's3fs'
SKIPPED [8] tests/unit/test_io.py:555: could not import 'uavro': No module named 'uavro'
SKIPPED [2] tests/unit/test_io.py:914: Dask>=2021.07.1 required for file aggregation
SKIPPED [1] tests/unit/loader/test_tf_dataloader.py:521: not working correctly in ci environment
========= 1519 passed, 12 skipped, 776 warnings in 2097.73s (0:34:57) ==========
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8434363626674895033.sh

Contributor

@karlhigley karlhigley left a comment


This looks good! There's one stray chunk of code commented out in the middle, but I'm approving and merging anyway in the interest of time. We can submit a second PR to remove the comments.

Comment on lines +583 to +586
# nodes = list(set(nvt.workflow.node.iter_nodes([output_node])))
# for current in reversed(nodes):
# if current.op and hasattr(current.op, "get_embedding_sizes"):
# output.update(current.op.get_embedding_sizes(current.output_schema.column_names))

Suggested change
# nodes = list(set(nvt.workflow.node.iter_nodes([output_node])))
# for current in reversed(nodes):
# if current.op and hasattr(current.op, "get_embedding_sizes"):
# output.update(current.op.get_embedding_sizes(current.output_schema.column_names))
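(For context, the lines suggested for removal implement a graph walk over the workflow: visit every node reachable from the output and collect embedding sizes from any op that exposes them, e.g. Categorify. A self-contained sketch of that pattern, reconstructed from the snippet above — this is the old approach, not the refactored implementation:)

    import nvtabular as nvt

    def embedding_sizes_from_graph(output_node):
        # walk every node reachable from the output and ask each op that
        # reports embedding sizes for its (cardinality, dimension) pairs
        output = {}
        nodes = list(set(nvt.workflow.node.iter_nodes([output_node])))
        for current in reversed(nodes):
            if current.op and hasattr(current.op, "get_embedding_sizes"):
                output.update(current.op.get_embedding_sizes(current.output_schema.column_names))
        return output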

@karlhigley karlhigley merged commit a4b3def into NVIDIA-Merlin:main Sep 21, 2021
Comment on lines -1102 to +1120
-        sampled_dtypes = self.sample_dtypes(n)
-        dtypes = dict(zip(sampled_dtypes.index, sampled_dtypes))
+        _ddf = self.to_ddf()
+        dtypes = {
+            col_name: {"dtype": dtype, "is_list": False}
+            for col_name, dtype in _ddf.dtypes.items()
+        }
+        for partition_index in range(_ddf.npartitions):
+            _head = _ddf.partitions[partition_index].head(n)
+
+            if len(_head):
+                for col in _head.columns:
+                    dtypes[col] = {
+                        "dtype": dispatch._list_val_dtype(_head[col]) or _head[col].dtype,
+                        "is_list": dispatch._is_list_dtype(_head[col]),
+                    }
Collaborator


I added a special optimization in #1119 to avoid loading any of the partitions from remote storage (which is super slow). It looks like this change will now skip that optimization. I'll need to double-check whether this introduces a performance regression in the Criteo benchmark on GCP. If it does, I suggest we fix this ASAP.

Collaborator


Update: I addressed the previous comment in #1119. While making that change, I also realized that the highlighted code above will always read every partition in the dataset (which I am assuming is due to a missing break statement).
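(A sketch of what that fix might look like — the loop from the diff above with the presumed missing break added; the function name is hypothetical and this is not the actual follow-up patch:)

    from nvtabular import dispatch

    def sample_partition_dtypes(_ddf, n, dtypes):
        # scan partitions until one returns rows, then record per-column dtype info
        for partition_index in range(_ddf.npartitions):
            _head = _ddf.partitions[partition_index].head(n)
            if len(_head):
                for col in _head.columns:
                    dtypes[col] = {
                        "dtype": dispatch._list_val_dtype(_head[col]) or _head[col].dtype,
                        "is_list": dispatch._is_list_dtype(_head[col]),
                    }
                break  # presumed missing statement: without it, every partition is read
        return dtypes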

mikemckiernan pushed a commit that referenced this pull request Nov 24, 2022
* almost completely working embeddings sizes

* get embedding sizes now working

* fix error in test logic

* fix bugs in tests in ops

* joinexternal now casts all to dataset to infer and propagate schema

* remove unnecessary added param for call is checked within function