fix tensorflow pip install (doesn't work on any JP/L4T Versions) #760
base: dev
Conversation
Avoid the CUDA link step if tensorflow >= 2.18 and jax >= 0.4.34. TensorFlow and JAX now use hermetic CUDA, which natively supports Jetson.
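As a sketch of that version gate, a hypothetical guard in the install script could look like the following. The helper name `needs_cuda_link` and the `sort -V` comparison are assumptions for illustration, not the repo's actual code:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: only run the CUDA link step for TensorFlow
# versions that predate hermetic CUDA (i.e. strictly before 2.18).
needs_cuda_link() {
    tf_version="$1"
    # sort -V orders dotted version strings numerically; tf_version is
    # "before 2.18" if it sorts first and is not 2.18 itself.
    first="$(printf '%s\n%s\n' "$tf_version" "2.18" | sort -V | head -n1)"
    [ "$first" = "$tf_version" ] && [ "$tf_version" != "2.18" ]
}

if needs_cuda_link "2.16.1"; then
    echo "pre-hermetic TensorFlow: run link_cuda.sh"
else
    echo "hermetic CUDA: skip the link step"
fi
```

With 2.16.1 this takes the link branch; with 2.18 or newer it skips it.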
What is the proper way to build tensorflow2 using jetson-containers?
On Fri, Dec 27, 2024 at 1:43 AM Johnny ***@***.***> wrote:
Build command:
***@***.***:~/jetson-containers/logs/20241226_221017/build$ cat ros_humble-ros-base-l4t-r36.3.0-tensorflow2.sh
#!/usr/bin/env bash
DOCKER_BUILDKIT=0 docker build --network=host --tag ros:humble-ros-base-l4t-r36.3.0-tensorflow2 \
--file /home/aero/jetson-containers/packages/ml/tensorflow/Dockerfile \
--build-arg BASE_IMAGE=ros:humble-ros-base-l4t-r36.3.0-protobuf_cpp \
--build-arg TENSORFLOW_VERSION="2.16.1" \
--build-arg TENSORFLOW_URL="https://developer.download.nvidia.com/compute/redist/jp/v60/tensorflow/tensorflow-2.16.1+nv24.06-cp310-cp310-linux_aarch64.whl" \
--build-arg TENSORFLOW_WHL="tensorflow-2.16.1+nv24.06-cp310-cp310-linux_aarch64.whl" \
--build-arg PYTHON_VERSION_MAJOR="3" \
--build-arg PYTHON_VERSION_MINOR="10" \
--build-arg FORCE_BUILD="off" \
/home/aero/jetson-containers/packages/ml/tensorflow \
2>&1 | tee /home/aero/jetson-containers/logs/20241226_221017/build/ros_humble-ros-base-l4t-r36.3.0-tensorflow2.txt; exit ${PIPESTATUS[0]}
Result:
cat ros_humble-ros-base-l4t-r36.3.0-tensorflow2.txt
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
BuildKit is currently disabled; enable it by removing the DOCKER_BUILDKIT=0
environment-variable.
Sending build context to Docker daemon 61.95kB
Step 1/5 : ARG BASE_IMAGE
Step 2/5 : FROM ${BASE_IMAGE}
---> 7d00558e15a1
Step 3/5 : ARG TENSORFLOW_URL TENSORFLOW_WHL HDF5_DIR="/usr/lib/aarch64-linux-gnu/hdf5/serial/" MAKEFLAGS=-j$(nproc) FORCE_BUILD
---> Running in 7997ed85083c
---> Removed intermediate container 7997ed85083c
---> 401a01062b21
Step 4/5 : COPY install.sh /tmp/tensorflow/
---> 5eb0c716e22e
Step 5/5 : RUN /tmp/tensorflow/install.sh
---> Running in 3fab2073b17b
+ bash /tmp/TENSORFLOW/link_cuda.sh
bash: /tmp/TENSORFLOW/link_cuda.sh: No such file or directory
The command '/bin/sh -c /tmp/tensorflow/install.sh' returned a non-zero code: 127
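The missing-file failure above suggests the Dockerfile step only copied install.sh into /tmp/tensorflow/. A minimal sketch of the fix (the exact COPY line in the repo may differ):

```dockerfile
# Sketch: copy link_cuda.sh alongside install.sh so the RUN step can
# find it; filenames are taken from the build log above.
COPY install.sh link_cuda.sh /tmp/tensorflow/
RUN /tmp/tensorflow/install.sh
```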
Which then revealed the following error in the version comparison (after resolving the previous error by copying link_cuda.sh into the Docker build context):
+ apt-get clean
+ '[' off == on ']'
++ echo ' <= 2.16.1'
++ bc
(standard_in) 1: syntax error
+ '[' -eq 1 ']'
/tmp/tensorflow/install.sh: line 47: [: -eq: unary operator expected
+ pip3 install --no-cache-dir --verbose
Which then led to a successful install/test on JP6.1 (CUDA 12.6).
The root cause was that the version comparison used bc, which only supports plain decimal numbers; the fix moves to dpkg --compare-versions.
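The failure mode and the fix can be demonstrated in isolation. This is a standalone sketch loosely mirroring the log above, not the actual install.sh:

```shell
# bc only understands plain decimal numbers, so a dotted version such
# as "2.16.1" is a syntax error: bc prints the error to stderr and
# nothing to stdout, and a later unquoted test like `[ $result -eq 1 ]`
# collapses to the "[: -eq: unary operator expected" seen in the log.
result="$(echo "2.16.1 <= 2.18" | bc 2>/dev/null)"
echo "bc result: '${result}'"   # empty: the comparison never happened

# The fix: dpkg --compare-versions parses dotted version strings
# natively ("le" means less than or equal).
if dpkg --compare-versions "2.16.1" le "2.18"; then
    echo "2.16.1 <= 2.18"
fi
```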
It is unknown whether link_cuda.sh is needed for the install, but fixing the COPY resulted in a successful build. link_cuda.sh needs modifications to support different CUDA versions; it is left unchanged since the intended direction is not known.
Now it is only supported on JetPack >= 6.
@johnnynunez
It's updated with JetPack 6.1.
Sorry, I meant jetson-containers as a whole.
Sorry, yes, I will get the cached TF2 path working again. Those Dockerfiles/configs changed a lot recently to add the ability to build TF2 + JAX wheels from source (fortunately Johnny did this, as official support is now up to us), and I also migrated our pip server to HTTPS.
From: Johnny ***@***.***>
Sent: Friday, December 27, 2024 1:08:24 PM
Subject: Re: [dusty-nv/jetson-containers] fix tensorflow pip install (doesn't work on any JP/L4T Versions) (PR #760)
@johnnynunez I understand, this project is a huge undertaking. Are there plans to get it up to date with JetPack 6.1? It seems like there need to be some optimizations around using prebuilt wheels instead of building from source for all JetPack versions.
it's updated with jetpack 6.1
https://pypi.jetson-ai-lab.dev/jp6/cu126