Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ported Marvell armhf build on amd64 host for debian buster to use cross-comp… #8035

Merged
merged 37 commits into from
Jul 21, 2022

Conversation

gregshpit
Copy link
Contributor

@gregshpit gregshpit commented Jul 1, 2021

…ilation instead of qemu emulation

The following Sonic modules PRs are part of this PR:
sonic-net/sonic-sairedis#852
opencomputeproject/SAI#1267
sonic-net/sonic-mgmt-framework#88
sonic-net/sonic-swss-common#501
sonic-net/sonic-telemetry#80

New PRs as part of bullseye port:
sonic-net/sonic-utilities#2233
sonic-net/sonic-linkmgrd#91

Why I did it

Current armhf Sonic build on amd64 host uses qemu emulation. Due to the nature of the emulation it takes a very long time, about 22-24 hours to complete the build. The change I did to reduce the building time by porting Sonic armhf build on amd64 host for Marvell platform for debian buster to use cross-compilation on arm64 host for armhf target. The overall Sonic armhf building time using cross-compilation reduced to about 6 hours.

How I did it

The cross-compilation environment is installed inside slave docker. All debian packages are cross-compiled using arm-linux-gnueabihf-gcc/g++ cross-compilers. Also the most build time consuming python dependencies packages are prebuilt for armhf target once using cross-compilation during slave docker build and then used for python wheels tests and inside dockers components.
The environment variable CROSS_BUILD_ENVIRON is set to 'y' for cross-compilation build and is checked to determine the build environment.

How to verify it

The Sonic configure and build for the armhf cross-compilation is as following:
NOJESSIE=1 NOSTRETCH=1 CROSS_BLDENV=1 make configure PLATFORM=marvell-armhf PLATFORM_ARCH=armhf
NOJESSIE=1 NOSTRETCH=1 CROSS_BLDENV=1 make target/sonic-marvell-armhf.bin

Description for the changelog

Ported Marvell armhf build on amd64 host for debian buster to use cross-compilation instead of qemu emulation. This reduced building time from 22-24 hours to 6 hours.

A picture of a cute animal (not mandatory but encouraged)

RUN cd /python_virtualenv && python3 -m virtualenv -p /usr/bin/python env2
RUN cd /python_virtualenv && python3 -m virtualenv --copies -p /usr/bin/python3 env3

RUN PATH=/python_virtualenv/env2/bin/:$PATH pip2 install setuptools==40.8.0 wheel==0.35.1 fastentrypoints pytest pytest-cov pytest-runner==4.4 nose==1.3.7 mockredispy==2.9.3 mock==3.0.5 j2cli==0.3.10 PyYAML==5.4.1 pexpect==4.6.0 Pympler==0.8 ctypesgen==1.0.2 natsort redis
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when the pip package versions obsolete ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when the pip package versions obsolete ?

These are python dependencies specific versions for building Sonic packages. They did exist in the Dockerfile before my change. I guess that they have to be updated if one of the Sonic packages requires it.

@gregshpit
Copy link
Contributor Author

Dear reviewers,

Long time has passed since this PR was raised. Please make progress on the review.

Thanks,

Gregory

slave.mk Outdated
CROSS_HOST_TYPE = arm-linux-gnueabihf
GOARCH=arm
else ifeq ($(CONFIGURED_ARCH),arm64)
CROSS_HOST_TYPE = aarch64-linux-gnueabi
Copy link
Contributor

@vivekrnv vivekrnv Aug 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @gregshpit, Thanks for your contribution. It seems that you've also added support to cross compile non-platform specific packages to arm64 as well. Am i correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to write generic changes to support arm64 as well. However I did not test it as opposed to armhf. So there may be some issue with arm64.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, makes sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, shouldn't this be aarch64-linux-gnu ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Done.

@dgsudharsan
Copy link
Collaborator

@gregshpit Can you please resolve the conflicts?
@lguohan @qiluo-msft @xumia Can we have this reviewed?

FROM debian:buster
COPY --from=qemu /usr/bin/qemu-arm-static /usr/bin
{%- elif CONFIGURED_ARCH == "arm64" and CROSS_BUILD_ENVIRON == "y" %}
FROM multiarch/qemu-user-static:x86_64-arm-5.0.0-2 as qemu
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be FROM multiarch/qemu-user-static:x86_64-aarch64-5.0.0-2 as qemu

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Done.

@gregshpit
Copy link
Contributor Author

@gregshpit Can you please resolve the conflicts?
@lguohan @qiluo-msft @xumia Can we have this reviewed?

Done.

@gregshpit
Copy link
Contributor Author

@gregshpit Can you please resolve the conflicts?
@lguohan @qiluo-msft @xumia Can we have this reviewed?

Any update on the review ?

@@ -25,7 +25,11 @@ $(addprefix $(DEST)/, $(MAIN_TARGET)): $(DEST)/% :
stg import -s ../patch/series

# Build source and Debian packages
ifeq ($(CROSS_BUILD_ENVIRON), y)
Copy link
Contributor

@vivekrnv vivekrnv Sep 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To cross-compile kdump tools from source, a flag TARGET has to be passed to the makefile. Check README here: https://github.com/makedumpfile/makedumpfile/tree/0820a55bf9a0d1f6769398b686a328e5979542b5

But that isn't being set current, and i'm seeing compilation failures while trying for arm64. Has this been observed while testing for armhf?

to fix this, i had to edit the override_dh_auto_build inside the src/kdump-tools/makedumpfile-1.6.1/debian/rules to include TARGET=<CROSS_TARGET> flag and the compilation went through.

Can you check and add this flag?

#Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in armhf kdump tools builds fine without errors, just double checked it now. Trying to figure out what went wrong in arm64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I added TARGET flag and it fixed arm64 build. Thanks for the help.

@@ -38,8 +38,20 @@ $(addprefix $(DEST)/, $(MAIN_TARGET)): $(DEST)/% :
# and go into learning mode.
sed -i 's/\/usr\/sbin\/ntpd {/\/usr\/sbin\/ntpd flags=(attach_disconnected complain) {/' debian/apparmor-profile

ifeq ($(CROSS_BUILD_ENVIRON), y)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Did you encounter this error while compiling for armhf?

make[6]: Entering directory '/sonic/src/ntp/ntp-4.2.8p12/include/isc'
make[6]: Nothing to be done for 'all'.
make[6]: Leaving directory '/sonic/src/ntp/ntp-4.2.8p12/include/isc'
make[6]: Entering directory '/sonic/src/ntp/ntp-4.2.8p12/include'
make[6]: *** No rule to make target 'adjtime.h', needed by 'all-am'.  Stop.
make[6]: Leaving directory '/sonic/src/ntp/ntp-4.2.8p12/include'
make[5]: *** [Makefile:616: all-recursive] Error 1
make[5]: Leaving directory '/sonic/src/ntp/ntp-4.2.8p12/include'
make[4]: *** [Makefile:667: all-recursive] Error 1
make[4]: Leaving directory '/sonic/src/ntp/ntp-4.2.8p12'
make[3]: *** [Makefile:599: all] Error 2
make[3]: Leaving directory '/sonic/src/ntp/ntp-4.2.8p12'
dh_auto_build: make -j64 returned exit code 2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Does it happen in arm64 build ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just built successfully ntp debian package for arm64.
I noticed that the created Makefile.in as well as Makefile in src/ntp/ntp-4.2.8p12/include directory does not contain adjtime.h in the noinst_HEADERS variable however Makefile.am does contain it in the noinst_HEADERS variable. So it looks like automake removed adjtime.h file from the noinst_HEADERS variable when building Makefile.in .
Hope this helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dug through a bit and then found a patch which removed adjtime.h line from the Makefile.am.

Anyway, i've ran a make clean followed by this build and it worked :) Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great ! Is there anything else that prevents this PR merging ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only tested the changes you've made for arm64 (not armhf). For now, i could successfully build the buster slave-docker and most of the non-platform specific debs. I've gave comments on the debs which are failing. I've not yet tested for the files/ & whl/ and docker packages yet.

But as for the overall scope of this PR, i'm not the official reviewer and couldn't sign-off

As a general advice, If the CI for this PR could include a build for marvell-armhf with cross compilation flags enabled, the reviewers will have confidence in merging the PR. @lguohan, is it possible?

slave.mk Outdated
{ sudo DEBIAN_FRONTEND=noninteractive dpkg -i $(DEBS_PATH)/$* $(LOG) && rm -d $(DEBS_PATH)/dpkg_lock && break; } || { rm -d $(DEBS_PATH)/dpkg_lock && sudo lsof /var/lib/dpkg/lock-frontend && ps aux && exit 1 ; }
else
# Relocate debian packages python libraries to the cross python virtual env location
{ sudo DEBIAN_FRONTEND=noninteractive dpkg -i $(if $(findstring $(LINUX_HEADERS),$*),--force-depends) $(DEBS_PATH)/$* $(LOG) && rm -rf tmp && mkdir tmp && dpkg -x $(DEBS_PATH)/$* tmp && (sudo cp -rf tmp/usr/lib/python2.7/dist-packages/* /python_virtualenv/env2/lib/python2.7/site-packages/ 2>/dev/null || true) && (sudo cp -rf tmp/usr/lib/python3/dist-packages/* /python_virtualenv/env3/lib/python3.7/site-packages/ 2>/dev/null || true) && rm -d $(DEBS_PATH)/dpkg_lock && break; } || { rm -d $(DEBS_PATH)/dpkg_lock && exit 1 ; }
Copy link
Contributor

@vivekrnv vivekrnv Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might also need, python 2.6
target/debs/buster/python-ptf_0.9-1_all.deb is failing at my end

[ REASON ] :      target/debs/buster/python-ptf_0.9-1_all.deb does not exist
[ FLAGS  FILE    ] : []
[ FLAGS  DEPENDS ] : []
[ FLAGS  DIFF    ] : []
/sonic/src/ptf /sonic
dpkg-buildpackage: info: source package ptf
dpkg-buildpackage: info: source version 0.9-1
dpkg-buildpackage: info: source distribution unstable
dpkg-buildpackage: info: source changed by Guohan Lu <[email protected]>
dpkg-architecture: warning: specified GNU system type aarch64-linux-gnu does not match CC system type x86_64-linux-gnu, try setting a correct CC environment variable
 dpkg-source --before-build .
dpkg-buildpackage: info: host architecture arm64
dpkg-checkbuilddeps: error: Unmet build dependencies: python-all (>= 2.6.6-3)
dpkg-buildpackage: warning: build dependencies/conflicts unsatisfied; aborting
dpkg-buildpackage: warning: (Use -d flag to override.)

@@ -27,7 +27,11 @@ $(addprefix $(DEST)/, $(MAIN_TARGET)): $(DEST)/% :
# (https://jira.apache.org/jira/browse/THRIFT-3650)
patch -p1 < ../patch/0001-Revert-THRIFT-3650-incorrect-union-serialization.patch
patch -p1 < ../patch/0002-cve-2017-1000487.patch
ifeq ($(CROSS_BUILD_ENVIRON), y)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libthrift deb in itself is building properly, but the *-install is failing. seems there is some dependency missing.

[ FAIL LOG START ] [ target/debs/buster/libthrift-0.11.0_0.11.0-4_arm64.deb-install ]
Selecting previously unselected package libthrift-0.11.0:arm64.
(Reading database ... 183765 files and directories currently installed.)
Preparing to unpack .../libthrift-0.11.0_0.11.0-4_arm64.deb ...
Unpacking libthrift-0.11.0:arm64 (0.11.0-4) ...
dpkg: dependency problems prevent configuration of libthrift-0.11.0:arm64:
 libthrift-0.11.0:arm64 depends on libgcc1-arm64-cross (>= 1:3.0).

dpkg: error processing package libthrift-0.11.0:arm64 (--install):
 dependency problems - leaving unconfigured
Processing triggers for libc-bin (2.28-10) ...
Errors were encountered while processing:
 libthrift-0.11.0:arm64
[  FAIL LOG END  ] [ target/debs/buster/libthrift-0.11.0_0.11.0-4_arm64.deb-install ] 

@@ -18,7 +18,11 @@ $(addprefix $(DEST)/, $(MAIN_TARGET)): $(DEST)/% :
patch -p1 < ../patches/freeradius_libeap_deprecated_openssl_1_0.patch
cp ../patches/config.sub .
cp ../patches/config.guess .
ifeq ($(CROSS_BUILD_ENVIRON), y)
./configure --disable-static --enable-libtool-lock --libdir=$(CROSS_PKGS_LIB_PATH) --libexecdir=$(CROSS_PKGS_LIB_PATH) --host=$(CROSS_HOST_TYPE)
Copy link
Contributor

@vivekrnv vivekrnv Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deb is failing for arm64, (target/debs/buster/libpam-radius-auth_1.4.1-1_arm64.deb)

=== configuring in libltdl (/sonic/src/radius/pam/freeradius/freeradius-server/libltdl)
configure: running /bin/bash ./configure '--prefix=/usr/local'  '--disable-static' '--enable-libtool-lock' '--libdir=/usr/lib/aarch64-linux-gnu' '--libexecdir=/usr/lib/aarch64-linux-gnu' '--host=aarch64-linux-gnu' 'host_alias=aarch64-linux-gnu' '--enable-ltdl-install' --cache-file=/dev/null --srcdir=.
configure: WARNING: If you wanted to set the --build type, don't use --host.
    If a cross compiler is detected then cross compile mode will be used.
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for aarch64-linux-gnu-strip... aarch64-linux-gnu-strip
checking for aarch64-linux-gnu-gcc... aarch64-linux-gnu-gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether aarch64-linux-gnu-gcc accepts -g... yes
checking for aarch64-linux-gnu-gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of aarch64-linux-gnu-gcc... none
checking for an ANSI C-conforming const... yes
checking for inline... inline
checking build system type... x86_64-unknown-linux-gnu
checking host system type... Invalid configuration `aarch64-linux-gnu': machine `aarch64' not recognized
configure: error: /bin/bash ./config.sub aarch64-linux-gnu failed
configure: error: ./configure failed for libltdl

#Closed

@@ -40,11 +48,88 @@ RUN echo "deb [arch=arm64] http://deb.debian.org/debian buster main contrib non-
echo 'deb [arch=arm64] http://ftp.debian.org/debian buster-backports main' >> /etc/apt/sources.list && \
echo "deb [arch=arm64] http://packages.trafficmanager.net/debian/debian buster main contrib non-free" >> /etc/apt/sources.list && \
echo "deb [arch=arm64] http://packages.trafficmanager.net/debian/debian buster-updates main contrib non-free" >> /etc/apt/sources.list
{%- elif CONFIGURED_ARCH == "armhf" and CROSS_BUILD_ENVIRON == "y" %}
Copy link
Contributor

@vivekrnv vivekrnv Sep 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If i have to install a package lets say (libpopt-dev) which has both the amd64 & arm64 variants available.

If i install it using "apt-get install libpopt-dev -y", which arch will be fetched from? I'm asking this because from what i understand the the "/etc/apt/sources.list" file is not modified if CROSS_BUILD_ENVIRON == y.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you install "apt-get install libpopt-dev -y" then amd64 will be installed (as the sonic-slave image architecture which is amd64). If you install "apt-get install libpopt-dev:arm64 -y" then arm64 package will be installed. The same logic for the ':armhf' package suffix. You can see it in the sonic-slave-buster/Dockerfile.j2 file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a sample from the Dockerfile.j2:

RUN apt-get update && apt-get install -y libelf-dev:$arch libdw-dev:$arch libbz2-dev:$arch liblzo2-dev:$arch libedit-dev:$arch libevent-dev:$arch libopts25-dev:$arch ...

where $arch is either arm64 or armhf according to the target architecture.

@gregshpit
Copy link
Contributor Author

Hi @lguohan @qiluo-msft @xumia ,

Is there anything that prevents merging of this PR ? What can I do to facilitate the merging ?

RUN PATH=/python_virtualenv/env3/bin/:$PATH python3 -m pip uninstall -y enum34
RUN PATH=/python_virtualenv/env3/bin/:$PATH pip3 install --force-reinstall --no-cache-dir coverage
{%- endif %}

RUN apt-get update && apt-get install -y \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reference to comment given above:

Here is a sample from the Dockerfile.j2:
RUN apt-get update && apt-get install -y libelf-dev:$arch libdw-dev:$arch libbz2-dev:$arch liblzo2-dev:$arch libedit-dev:$arch libevent-dev:$arch libopts25-dev:$arch ...
where $arch is either arm64 or armhf according to the target architecture.

In that case, shoudn't we be adding :arch extensions to those packages which are target specific.

Because otherwise, we might face these problems in the future.
For Eg: I am working to compile a package which depend on "libusb-1.0-0-dev:arm64" and we already install this package here

# For BFN sdk build
        libusb-1.0-0-dev \ (host arch package will be fetched here and thus it won't work for cross compilation scenario )

But also, packages like these need not be target specific, they have to be based on host arch

# For build image
        cpio \
        squashfs-tools \
        zip \

Any idea/plans on solving this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually there is no problem in installing the same package for both host and other architectures. They are installed in different directories. You can see in Dockerfile.j2 that some packages are already installed for both host and armhf/arm64 architectures.

build_debian.sh Outdated
@@ -113,6 +113,10 @@ sudo LANG=C chroot $FILESYSTEM_ROOT apt-get -y upgrade
echo '[INFO] Install packages for building image'
sudo LANG=C chroot $FILESYSTEM_ROOT apt-get -y install makedev psmisc

if [[ $CROSS_BUILD_ENVIRON == y ]]; then
sudo LANG=C chroot $FILESYSTEM_ROOT dpkg --add-architecture $CONFIGURED_ARCH
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sudo

sudo LANG=C chroot $FILESYSTEM_ROOT dpkg --add-architecture $CONFIGURED_ARCH


Could you indent inner code block? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still applicable. Minor issue.

ifneq ($(CROSS_BLDENV),)
SLAVE_BASE_IMAGE = $(SLAVE_DIR)-march-$(CONFIGURED_ARCH)
MULTIARCH_QEMU_ENVIRON = n
CROSS_BUILD_ENVIRON = y
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CROSS_BUILD_ENVIRON

The 2 variable names are confusing: CROSS_BUILD_ENVIRON vs CROSS_BLDENV. What is the difference? Can we use one? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CROSS_BLDENV is used only on the building/configuring command line to trigger cross-compilation build instead of the default qemu build. Like this:
NOJESSIE=1 NOSTRETCH=1 BLDENV=buster CROSS_BLDENV=1 make target/sonic-marvell-armhf.bin

CROSS_BUILD_ENVIRON is internal variabled used everywhere in the Makefiles.

Please see the PR description above for more info.

slave.mk Outdated
CROSS_BIN_PATH = /usr/$(CROSS_HOST_TYPE)/bin/
CROSS_PKGS_LIB_PATH = /usr/lib/$(CROSS_HOST_TYPE)

CROSS_PERL_CORE_PATH = $(CROSS_PKGS_LIB_PATH)/perl/5.28.1/CORE/
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5.28.1

CROSS_PERL_CORE_PATH = $(CROSS_PKGS_LIB_PATH)/perl/5.28.1/CORE/


Why you need to use fixed version? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

slave.mk Outdated
{ sudo -E pip$($*_PYTHON_VERSION) install $(PYTHON_WHEELS_PATH)/$* $(LOG) && rm -d $(PYTHON_WHEELS_PATH)/pip_lock && break; } || { rm -d $(PYTHON_WHEELS_PATH)/pip_lock && exit 1 ; }
else
# Link python script and data expected location to the cross python virtual env istallation locations
{ PATH=$(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION)):${PATH} sudo -E $(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION))/pip$($*_PYTHON_VERSION) install --no-deps $(PYTHON_WHEELS_PATH)/$* $(LOG) && $(if $(findstring $(SONIC_CONFIG_ENGINE_PY3),$*),(sudo ln -s $(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION))/sonic-cfggen /usr/local/bin/sonic-cfggen 2>/dev/null || true), true ) && $(if $(findstring $(SONIC_YANG_MODELS_PY3),$*),(sudo ln -s $(VIRTENV_BASE_CROSS_PYTHON3)/yang-models /usr/local/yang-models 2>/dev/null || true), true ) && rm -d $(PYTHON_WHEELS_PATH)/pip_lock && break; } || { rm -d $(PYTHON_WHEELS_PATH)/pip_lock && exit 1 ; }
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SONIC_CONFIG_ENGINE_PY3

Why treat package differently? slave just provides a common environment to build any packages in general. #Closed

Copy link
Contributor Author

@gregshpit gregshpit Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is wrong here that python3 version appears explicitly or that I do some specific handling for SONIC_CONFIG_ENGINE_PY3 in slave.mk ?
SONIC_CONFIG_ENGINE_PY3 and also SONIC_YANG_MODELS_PY3 are special because components using them expect them to be installed in /usr/local/bin/sonic-cfggen and /usr/local/yang-models respectively. However when using cross-compilation these python scripts are installed in some other python virtual environment location. In order to keep them accessible through expected /usr/local/... location I linked the built scripts to these locations.
This should be done every time SONIC_CONFIG_ENGINE_PY3 or SONIC_YANG_MODELS_PY3 wheels are installed.
I could have done it inside these wheels building environment but I did not find any hook for "wheel installation" there except for that in slave.mk .

Please advise how to improve this implementation.
Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that you only need virtual env in build, not runtime. How about buiding the python wheel with virtual env, but let slave container install the wheel outside virtual env.

Not an expert on the solution. The concern is that this is not general and not scalable.


PRE_BUILT_TARGET_DIR=$1

declare -a pkgs=("grpcio==1.38.0" "grpcio-tools==1.20.0" "m2crypto==0.36.0" "bitarray==1.5.3" "lxml==4.6.3")
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkgs

What is the motivation of this script? It use completely new way to handle dependencies and versions, and it will challenge the design sonic-net/SONiC#684 (@xumia to check).

If the purpose is to cross compile some pypi packages with c++ code, will this answer help https://stackoverflow.com/a/61019582/2514803?

We prefer the ways to manage dependencies and versions, such as

  1. apt-get install
  2. pip install
  3. wget/curl
  4. git clone

#Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script cross-compiles dockers' python packages heaviest (by building time) dependencies.
Dockers install a number of python packages which in turn install their dependencies. Installing the dependencies takes a big amount of time using qemu emulation. And also all dockers that use these heavy dependencies build them from scratch again and again and waste the building time.
I wanted to prebuild these dependencies using cross-compiling - much faster than using qemu emulation. And copy them to the docker image (through Dockerfile.j2) to be used (without building them) by the python packages installed in the docker image. This approach saves a lot of building time.
So what is wrong that I use specific dependencies version ?
But dockers' Dockerfile.j2 also contains specific python packages versions, like this:
RUN pip3 install thrift==0.13.0
So these are specific python dependencies versions of the specific python packages versions.
Please instruct me how can I improve this approach.
Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xumia The concern is how reproducible build support pip3 download? Could you check?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only support pip3 install now, but the constraint option (-c <constraints.txt>) is also supported by pip3 download.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xumia has some idea how to implement reproducible build for pip download. I can close this comment.

DOCKER_HOST="unix:///dockerfs/var/run/docker.sock"
SONIC_NATIVE_DOCKERD_FOR_DOCKERFS=" -H unix:///dockerfs/var/run/docker.sock "
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SONIC_NATIVE_DOCKERD_FOR_DOCKERFS

This is defined at Makefile.work L214, could you reuse? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand in Makefile.work this variable has different meaning. It is a list of commands to be executed. In sonic_debian_extension.j2 file it is docker daemon socket to connect to.

@@ -3,7 +3,11 @@
include /usr/share/dpkg/pkg-info.mk

PACKAGE_PRE_NAME := mrvlprestera
ifeq ($(CROSS_BUILD_ENVIRON), y)
KVERSION ?= $(KVERSION)
Copy link
Collaborator

@qiluo-msft qiluo-msft Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KVERSION ?= $(KVERSION)


Is this line needed? Seems a noop. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

@@ -11,10 +11,20 @@ DISTRO=$5
DOCKERFILE_PATH=$(dirname "$DOCKERFILE_TARGE")
BUILDINFO_PATH="${DOCKERFILE_PATH}/buildinfo"
BUILDINFO_VERSION_PATH="${BUILDINFO_PATH}/versions"
if [[ $CROSS_BUILD_ENVIRON == y ]]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$CROSS_BUILD_ENVIRON

This file is to implement reproducible feature sonic-net/SONiC#684. Why you add here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to copy prebuilt (cross-compiled) python wheels to the dockers directory in order to be installed in the dockers Docker.j2 file. See for example dockers/docker-config-engine-buster/Dockerfile.j2.
Please suggest what and where to do it if this file is inappropriate.
Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you treat them as normal Makefile target and normal dependencies of others?

@gregshpit
Copy link
Contributor Author

Hi @saiarcot895,

While building target/sonic-marvell-armhf.bin image I received the following error:

+ sudo dpkg --root=./fsroot-marvell-armhf -i target/debs/bullseye/openssh-server_8.4p1-5_armhf.deb
dpkg: warning: downgrading openssh-server:armhf from 1:8.4p1-5+deb11u1 to 1:8.4p1-5
(Reading database ... 42580 files and directories currently installed.)
Preparing to unpack .../openssh-server_8.4p1-5_armhf.deb ...
Unpacking openssh-server:armhf (1:8.4p1-5) over (1:8.4p1-5+deb11u1) ...
dpkg: dependency problems prevent configuration of openssh-server:armhf:
 openssh-server:armhf depends on openssh-client (= 1:8.4p1-5); however:
  Version of openssh-client:armhf on system is 1:8.4p1-5+deb11u1.

dpkg: error processing package openssh-server:armhf (--install):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 openssh-server:armhf
+ clean_sys

The same build a week ago installed correct openssh-client (= 1:8.4p1-5) package.
As far as I can see there was a new bullseye release a few days ago which could break the build.
Can you confirm it ?

Besides this error and target/python-wheels/bullseye/sonic_ycabled-1.0-py3-none-any.whl known issue the rest of armhf Sonic was successfully cross-built with your changes which I incorporated yesterday. These are good new.

Thanks,

Gregory

@gregshpit
Copy link
Contributor Author

Hi @saiarcot895,

Every single day something is changing in the upstream and it causes and will cause me an extra merging and fixing work. Can we make an effort and close all issues and finaĺly merge into the upstream next week ?

Thanks,

Gregory

@saiarcot895
Copy link
Contributor

It looks like there is no prebuilt manylinux armhf grpcio-tools package (only arm64 package is available for Python 3.9), see here https://pypi.org/project/grpcio-tools/#files .

Yeah, for grpcio-tools at least, there's no manylinux armhf package. In this case, grpcio-tools would be locally compiled.

After applying your changes target/python-wheels/bullseye/sonic_ycabled-1.0-py3-none-any.whl testing still fails with the same old error: ``ImportError: /usr/lib/arm-linux-gnueabihf/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /sonic/src/sonic-platform-daemons/sonic-ycabled/.eggs/grpcio_tools-1.47.0-py3.9-linux-armv7l.egg/grpc_tools/_protoc_compiler.cpython-39-arm-linux-gnueabihf.so) `

I had the fix for this in a separate commit, saiarcot895/sonic-buildimage@aba74f2. Bringing that commit in should get rid of that error.

I've also encountered this issue.

That is why I replaced this line which caused problems: pip(__PYTHON_VERSION) install . && pip(__PYTHON_VERSION) uninstall --yes python$($*_PYTHON_VERSION) setup.py --name with this one which does almost the same as far as I understand and runs fine: python(*_PYTHON_VERSION) setup.py build

Ah, I see. I think we had that syntax before, but changed it because there was a prerelease version of some package getting pulled in which caused tests to fail (I don't remember the exact details). If this syntax works, then we can keep it for the crosscompile build at least.

The same build a week ago installed correct openssh-client (= 1:8.4p1-5) package. As far as I can see there was a new bullseye release a few days ago which could break the build. Can you confirm it ?

Yeah there was a change this past weekend for the openssh-client package, and has since been fixed in the master branch of sonic-buildimage.

@gregshpit
Copy link
Contributor Author

Thanks. I'll complete all the fixes on Sunday according to your suggestion and then we'll be hopefully ready for the merge. Let me know if there is anything else to fix.

Gregory

@gregshpit
Copy link
Contributor Author

It looks like there is no prebuilt manylinux armhf grpcio-tools package (only arm64 package is available for Python 3.9), see here https://pypi.org/project/grpcio-tools/#files .

Yeah, for grpcio-tools at least, there's no manylinux armhf package. In this case, grpcio-tools would be locally compiled.

After applying your changes target/python-wheels/bullseye/sonic_ycabled-1.0-py3-none-any.whl testing still fails with the same old error: ``ImportError: /usr/lib/arm-linux-gnueabihf/libstdc++.so.6: version GLIBCXX_3.4.29' not found (required by /sonic/src/sonic-platform-daemons/sonic-ycabled/.eggs/grpcio_tools-1.47.0-py3.9-linux-armv7l.egg/grpc_tools/_protoc_compiler.cpython-39-arm-linux-gnueabihf.so) `

I had the fix for this in a separate commit, saiarcot895/sonic-buildimage@aba74f2. Bringing that commit in should get rid of that error.

I don't understand. Your fix in this commit saiarcot895/sonic-buildimage@aba74f2 deals with manylinux wheels only. But as I discovered problematic packages grpcio-tools and grpcio do not have manylinux version. So your fix seems to be of no help. Do I still need to incorporate it ?
We also agreed that we then need to locally compile the both packages. How do I do it and where this compilation is placed ? In the sonic-slave-bullseye/Dockerfile.j2 file or in src/sonic-platform-daemons/sonic-ycabled ? Can you be more specific on how it is recommended to do it ? The both packages use version 1.47.0. Should I download sources for this fixed version for the both packages ?

Thanks,

Gregory

@saiarcot895
Copy link
Contributor

So the code that that patch is changing is related to what types/versions of prebuilt wheels can be used. For example, each version of grpcio-tools has many types of prebuilt wheels available on pypi (cp310-cp310-win_amd64, cp310-cp310-win32, cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64, cp310-cp310-manylinux_2_17_i686.manylinux2014_i686, cp310-cp310-manylinux_2_17_aarch64, cp310-cp310-macosx_10_10_x86_64, cp310-cp310-linux_armv7l, and so on). Obviously, not all of those can be used on the current platform. So this function returns the list of types that can be used on the current platform. The yield from _manylinux.platform_tags(linux, arch) line returns all of the manylinux-based types that can be used. The yield from _musllinux.platform_tags(arch) line, in our case, returns nothing, since we're using glibc, not musl, as our libc library. The yield linux line return just the current system type (linux_armv7l, as Python sees it).

The following code shows a partial list of the wheel types that this function returns (the full list is very long)

>>> from pip._internal.models.target_python import TargetPython
>>> target_python = TargetPython()
>>> target_python.get_tags()
[<cp39-cp39-manylinux_2_31_armv7l @ 1056301832>, <cp39-cp39-manylinux_2_30_armv7l @ 1056302568>, <cp39-cp39-manylinux_2_29_armv7l @ 1056302696>, <cp39-cp39-manylinux_2_28_armv7l @ 1056302792>, <cp39-cp39-manylinux_2_27_armv7l @ 1056302888>, ..., <cp39-cp39-linux_armv7l @ 1056304008>, <cp39-abi3-manylinux_2_31_armv7l @ 1056304104>, <cp39-abi3-manylinux_2_30_armv7l @ 1056304200>, <cp39-abi3-manylinux_2_29_armv7l @ 1056304296>, <cp39-abi3-manylinux_2_28_armv7l @ 1056304392>, <cp39-abi3-manylinux_2_27_armv7l @ 1056304488>, ..., <cp39-abi3-linux_armv7l @ 1056326152>, <cp39-none-manylinux_2_31_armv7l @ 1056326248>, <cp39-none-manylinux_2_30_armv7l @ 1056326344>, <cp39-none-manylinux_2_29_armv7l @ 1056326440>, <cp39-none-manylinux_2_28_armv7l @ 1056326536>, <cp39-none-manylinux_2_27_armv7l @ 1056326632>, ..., <cp39-none-linux_armv7l @ 1056327784>, ...

Here, the <cp39-cp39-linux_armv7l @ 1056304008> is the problematic entry for this. This tells pip that it can use the grpcio_tools-1.47.0-cp39-cp39-linux_armv7l.whl prebuilt wheel, except that isn't a manylinux wheel, and because of that, it doesn't work in the Debian Bullseye environment. Therefore, we need to tell pip not to use that type of wheel. Besides having a specific exclusion for just this package, a more general way to do this is to tell pip not to use anything besides manylinux wheels (in other words, not to use anything that isn't guaranteed to work on our system).

With the patch applied, the non-manylinux entries disappear from that list. When that happens, pip will see that none of the wheel types in our list match the wheel types that are available from pypi for grpcio-tools (and possibly other packages as well). That means pip is then forced to locally build the package in this environment, since there's no prebuilt wheel for it, and so pip will download the sources for grpcio-tools and build it.

@gregshpit
Copy link
Contributor Author

Oh great !!! Thanks for your prompt answer and the detailed explanation . I'll try it today.
Have a nice weekend !

Gregory

@gregshpit
Copy link
Contributor Author

Hi @saiarcot895,

I made all fixes we had discussed lately and also merged with the upstream. Don't have anything else to add.
I'm testing Sonic armhf cross-build run now.

You are welcome to try to cross-build it yourself and if no issues found then please merge it into the upstream.

Don't forget to take these two new PRs (see this PR description above):
sonic-net/sonic-utilities#2233
sonic-net/sonic-linkmgrd#91

and also old sonic-net/sonic-mgmt-framework#88 PR which has not been merged into the upstream yet.

Thanks,

Gregory

@gregshpit
Copy link
Contributor Author

Hi @saiarcot895,

2 CI tests failed after my last commit. Could you please take a look whether this is related to my changes or not. There is some failure in sonic-frr build as far as I can see.

Traceback (most recent call last):
  File "/sonic/src/sonic-frr/frr/build/../yang/embedmodel.py", line 16, in <module>
    os.makedirs(outdir)
  File "/usr/lib/python3.9/os.py", line 225, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: '/sonic/src/sonic-frr/frr/build/yang/ietf'

Thanks,

Gregory

@saiarcot895
Copy link
Contributor

I restarted the pipeline, I'm guessing it's some transient failure.

For the submodules, the standard practice is to include the submodule updates in this PR (since they're required here), if they haven't been already updated in sonic-buildimage. The submodules for the three PRs mentioned have already been updated in sonic-buildimage with your changes, so nothing needs to be done there.

One change I had to do for docker-dhcp-relay and docker-macsec to build was to remove --no-deps from the python wheel installation command. This is so that the CLI tests could be run. Otherwise, I get errors about lazy_object_proxy missing (snippet of error message):

__________________ ERROR collecting test_config_dhcp_relay.py __________________
ImportError while importing test module '/sonic/dockers/docker-dhcp-relay/cli-plugin-tests/test_config_dhcp_relay.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/python_virtualenv/env3/lib/python3.9/site-packages/_pytest/python.py:608: in _importtestmodule
    mod = import_path(self.path, mode=importmode, root=self.config.rootpath)
/python_virtualenv/env3/lib/python3.9/site-packages/_pytest/pathlib.py:533: in import_path
    importlib.import_module(module_name)
/usr/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:986: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:680: in _load_unlocked
    ???
/python_virtualenv/env3/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
    exec(co, module.__dict__)
test_config_dhcp_relay.py:13: in <module>
    import dhcp_relay
../cli/config/plugins/dhcp_relay.py:2: in <module>
    import utilities_common.cli as clicommon
/python_virtualenv/env3/lib/python3.9/site-packages/utilities_common/cli.py:10: in <module>
    import lazy_object_proxy
E   ModuleNotFoundError: No module named 'lazy_object_proxy'
diff --git a/slave.mk b/slave.mk
index 57b16c02e..9691271e3 100644
--- a/slave.mk
+++ b/slave.mk
@@ -857,7 +857,7 @@ ifneq ($(CROSS_BUILD_ENVIRON),y)
        { sudo -E pip$($*_PYTHON_VERSION) install $(PYTHON_WHEELS_PATH)/$* $(LOG) && rm -d $(PYTHON_WHEELS_PATH)/pip_lock && break; } || { rm -d $(PYTHON_WHEELS_PATH)/pip_lock && exit 1 ; }
 else
        # Link python script and data expected location to the cross python virtual env istallation locations
-       { PATH=$(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION)):${PATH} sudo -E $(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION))/pip$($*_PYTHON_VERSION) install --no-deps $(PYTHON_WHEELS_PATH)/$* $(LOG) && $(if $(findstring $(SONIC_CONFIG_ENGINE_PY3),$*),(sudo ln -s $(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION))/sonic-cfggen /usr/local/bin/sonic-cfggen 2>/dev/null || true), true ) && $(if $(findstring $(SONIC_YANG_MODELS_PY3),$*),(sudo ln -s $(VIRTENV_BASE_CROSS_PYTHON3)/yang-models /usr/local/yang-models 2>/dev/null || true), true ) && rm -d $(PYTHON_WHEELS_PATH)/pip_lock && break; } || { rm -d $(PYTHON_WHEELS_PATH)/pip_lock && exit 1 ; }
+       { PATH=$(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION)):${PATH} sudo -E $(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION))/pip$($*_PYTHON_VERSION) install $(PYTHON_WHEELS_PATH)/$* $(LOG) && $(if $(findstring $(SONIC_CONFIG_ENGINE_PY3),$*),(sudo ln -s $(VIRTENV_BIN_CROSS_PYTHON$($*_PYTHON_VERSION))/sonic-cfggen /usr/local/bin/sonic-cfggen 2>/dev/null || true), true ) && $(if $(findstring $(SONIC_YANG_MODELS_PY3),$*),(sudo ln -s $(VIRTENV_BASE_CROSS_PYTHON3)/yang-models /usr/local/yang-models 2>/dev/null || true), true ) && rm -d $(PYTHON_WHEELS_PATH)/pip_lock && break; } || { rm -d $(PYTHON_WHEELS_PATH)/pip_lock && exit 1 ; }
 endif
        fi
        done

Was there a reason --no-deps was added?

@gregshpit
Copy link
Contributor Author

gregshpit commented Jul 19, 2022

Don't remember why --no-deps was added. I've also encountered missing lazy_object_proxy module and added its installation in the bullseye Dockerfile.j2 in one of the last commits. So my today's cross-build from scratch after all changes applied completed successfully without errors.

@saiarcot895
Copy link
Contributor

I think unless the non-crosscompile slave container is installing a python package, the crosscompile slave container shouldn't be installing it.

@gregshpit
Copy link
Contributor Author

No problem. If removal of "--no-deps" does not bring up other issues then fine. Will you check it ?

@gregshpit
Copy link
Contributor Author

I'm rebuilding without "--no-deps" and no issues so far. Do you want me to update the PR accordingly removing --no-depsfrom slave.mk and also lazy_object_proxy from the Dockefile.j2 ?

@gregshpit
Copy link
Contributor Author

BTW the 2 tests still failed.

@saiarcot895
Copy link
Contributor

Yes please, remove the --no-deps and remove the installation of lazy_object_proxy.

The failure in the build this time appears to be a network issue. I won't restart it since there'll be a new commit, and that'll anyways run all of the pipelines.

@gregshpit
Copy link
Contributor Author

Thanks. I'll remove soon and commit. Is there anything else that prevents merging with the upstream ?

…lazy_object_proxy arm python3 package instalation
@gregshpit
Copy link
Contributor Author

Just made a new commit with the 2 removals.

Gregory

@gregshpit
Copy link
Contributor Author

All tests completed successfully. What's next ?

@saiarcot895 saiarcot895 merged commit 5df0949 into sonic-net:master Jul 21, 2022
@gregshpit
Copy link
Contributor Author

Hi @saiarcot895,

Thanks a lot for merging my PR and for all the extensive assistance you provided.
BTW did you add a new CI test for cross-building armhf Sonic to prevent further changes that can break this build ?

Thanks,

Gregory

@radha-danda
Copy link

@saiarcot895, @yxieca, PR has to be backported to 202012 branch and 202205 branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.