Rebuild hatchling-1.18.0-GCCcore-12.3.0 #669

Merged
1 commit merged on Aug 22, 2024

Conversation

Neves-P
Member

@Neves-P Neves-P commented Aug 14, 2024

eessi-bot bot commented Aug 14, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

eessi-bot bot commented Aug 14, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

@Neves-P
Member Author

Neves-P commented Aug 14, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1

eessi-bot bot commented Aug 14, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from Neves-P

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

eessi-bot bot commented Aug 14, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 14, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/16216

date job status comment
Aug 14 15:17:45 UTC 2024 submitted job id 16216 awaits release by job manager
Aug 14 15:18:14 UTC 2024 released job awaits launch by Slurm scheduler
Aug 14 15:23:17 UTC 2024 running job 16216 is running
Aug 14 15:59:53 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-16216.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 14 15:59:53 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 17/17 test case(s) from 17 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-16216.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account Neves-P has NO permission to send commands to the bot

@Neves-P
Member Author

Neves-P commented Aug 15, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/16216
date job status comment
Aug 14 15:17:45 UTC 2024 submitted job id 16216 awaits release by job manager
Aug 14 15:18:14 UTC 2024 released job awaits launch by Slurm scheduler
Aug 14 15:23:17 UTC 2024 running job 16216 is running
Aug 14 15:59:53 UTC 2024 finished
😢 FAILURE (click triangle for details)

Details
✅ job output file slurm-16216.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts

Aug 14 15:59:53 UTC 2024 test result
😁 SUCCESS (click triangle for details)

Looking at the build log we see:

ERROR: Could not install packages due to an OSError.
Consider using the `--user` option or check the permissions.
Traceback (most recent call last):
  ...
  File "/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_n1/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/site-packages/pip/_vendor/distlib/scripts.py", line 293, in _write_script
    self._fileop.write_binary_file(outname, script_bytes)
  File "/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_n1/software/Python/3.11.3-GCCcore-12.3.0/lib/python3.11/site-packages/pip/_vendor/distlib/util.py", line 555, in write_binary_file
    os.remove(path)
PermissionError: [Errno 13] Permission denied: '/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_n1/software/hatchling/1.18.0-GCCcore-12.3.0/bin/hatchling'
 (at easybuild/tools/run.py:682 in parse_cmd_output)

This is almost certainly the same issue that @bedroge reported in #556 and worked around here: #555 (comment)

@Neves-P
Member Author

Neves-P commented Aug 16, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from Neves-P

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/16313

date job status comment
Aug 16 09:17:01 UTC 2024 submitted job id 16313 awaits release by job manager
Aug 16 09:17:12 UTC 2024 released job awaits launch by Slurm scheduler
Aug 16 09:23:14 UTC 2024 running job 16313 is running
Aug 16 10:00:35 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-16313.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1723800239.tar.gz
size: 0 MiB (531065 bytes)
entries: 367
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
hatchling/1.18.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/neoverse_n1/software
hatchling/1.18.0-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/neoverse_n1
no other files in tarball
Aug 16 10:00:35 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 18/18 test case(s) from 18 check(s) (1 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-16313.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case

@Neves-P
Member Author

Neves-P commented Aug 16, 2024

Failed test:

================================================================================
SUMMARY OF FAILURES
--------------------------------------------------------------------------------
FAILURE INFO for EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos (run: 1/1)
  * Description:
  * System partition: BotBuildTests:default
  * Environment: default
  * Stage directory: /project/60006/SHARED/jobs/2024.08/pr_669/event_47bfab50-5bb0-11ef-89f6-9f388b4fa992/run_000/linux_aarch64_neoverse_n1/eessi.io-2023.06-software/reframe_runs/stage/BotBuildTests/default/default/EESSI_LAMMPS_lj_04ff9ece
  * Node list: aarch64-neoverse-n1-node1.int.aws-rocky88-202310
  * Job type: local (id=27215)
  * Dependencies (conceptual): []
  * Dependencies (actual): []
  * Maintainers: []
  * Failing phase: sanity
  * Rerun with '-n /04ff9ece -p default --system BotBuildTests:default -r'
  * Reason: sanity error: 4.340000000002675e-05 >= 1e-06
--- rfm_job.out (first 10 lines) ---
LAMMPS (2 Aug 2023 - Update 2)
  using 1 OpenMP thread(s) per MPI task
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
Created orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924)
  2 by 2 by 4 MPI processor grid
Created 32000 atoms
  using lattice units in orthogonal box = (0 0 0) to (33.591924 33.591924 33.591924)
  create_atoms CPU = 0.001 seconds
Generated 0 of 0 mixed pair_coeff terms from geometric mixing rule
Neighbor list info ...
--- rfm_job.out ---
--- rfm_job.err (first 10 lines) ---
--- rfm_job.err ---
--------------------------------------------------------------------------------
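
The reported reason reads as a tolerance comparison: the value 4.340000000002675e-05 is not below the 1e-06 threshold, so the sanity phase fails. A minimal sketch of that comparison in shell follows. The helper name `within_tolerance` is hypothetical, and which LAMMPS quantity the test suite actually compares is not shown in this log.

```shell
# Hypothetical helper: succeeds when a deviation is strictly below a tolerance.
# awk is used because plain shell arithmetic cannot handle scientific notation.
within_tolerance() {
    awk -v dev="$1" -v tol="$2" 'BEGIN { exit !(dev + 0 < tol + 0) }'
}

# Values copied from the failure message above.
if within_tolerance 4.340000000002675e-05 1e-06; then
    echo "sanity check would pass"
else
    echo "sanity check would fail"
fi
```

With the two values from the message, the `else` branch is taken, which matches the reported sanity error.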

@Neves-P
Member Author

Neves-P commented Aug 16, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from Neves-P

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-neoverse_n1 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/16316

date job status comment
Aug 16 10:54:17 UTC 2024 submitted job id 16316 awaits release by job manager
Aug 16 10:54:55 UTC 2024 released job awaits launch by Slurm scheduler
Aug 16 10:55:57 UTC 2024 running job 16316 is running
Aug 16 11:33:34 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-16316.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1723805793.tar.gz
size: 0 MiB (529696 bytes)
entries: 367
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
hatchling/1.18.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/neoverse_n1/software
hatchling/1.18.0-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/neoverse_n1
no other files in tarball
Aug 16 11:33:34 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 18/18 test case(s) from 18 check(s) (1 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-16316.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case
Aug 21 17:11:19 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-neoverse_n1-1723805793.tar.gz to S3 bucket succeeded

@Neves-P
Member Author

Neves-P commented Aug 16, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512 from Neves-P

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/16320

date job status comment
Aug 16 12:46:34 UTC 2024 submitted job id 16320 awaits release by job manager
Aug 16 12:46:42 UTC 2024 released job awaits launch by Slurm scheduler
Aug 16 12:52:45 UTC 2024 running job 16320 is running
Aug 16 13:09:02 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-16320.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 16 13:09:02 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 18/18 test case(s) from 18 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-16320.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@Neves-P
Member Author

Neves-P commented Aug 16, 2024

My mistake, did not add write permissions recursively.

bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512
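
A minimal sketch of what adding write permissions recursively could look like, assuming the fix amounts to a `chmod -R u+w` on the existing installation directory (the thread does not show the actual command used). The helper name `make_tree_user_writable` and the temporary demo directory are hypothetical. Recursion matters because deleting a file such as bin/hatchling requires write permission on its parent directory, so changing only the top-level directory would not be enough.

```shell
# Hypothetical helper: give the owner write permission on a whole directory tree.
make_tree_user_writable() {
    local install_dir="$1"
    if [ ! -d "${install_dir}" ]; then
        echo "not a directory: ${install_dir}" >&2
        return 1
    fi
    # -R recurses into subdirectories, so bin/ and the files in it are covered too.
    chmod -R u+w "${install_dir}"
}

# Demo on a throwaway tree (not a real EESSI path).
demo_dir="$(mktemp -d)"
mkdir -p "${demo_dir}/bin"
touch "${demo_dir}/bin/hatchling"
chmod -R a-w "${demo_dir}"             # simulate a read-only installation
make_tree_user_writable "${demo_dir}"
rm "${demo_dir}/bin/hatchling" && echo "removal succeeded"
rm -rf "${demo_dir}"
```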

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512 from Neves-P

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account Neves-P has NO permission to send commands to the bot

eessi-bot bot commented Aug 16, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/213

date job status comment
Aug 16 21:22:45 UTC 2024 submitted job id 213 awaits release by job manager
Aug 16 21:23:33 UTC 2024 released job awaits launch by Slurm scheduler
Aug 16 21:27:36 UTC 2024 running job 213 is running
Aug 16 21:28:37 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-213.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 16 21:28:37 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-213.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

boegel added the 2023.06-software.eessi.io label (2023.06 version of software.eessi.io) on Aug 17, 2024
@boegel
Contributor

boegel commented Aug 17, 2024

Hmm, I only see this in the build log:

== hatchling/1.18.0-GCCcore-12.3.0 is already installed (module found), skipping

Aren't we missing a --rebuild option here?

@Neves-P
Member Author

Neves-P commented Aug 17, 2024

Aren't we missing a --rebuild option here?

Interesting, I figured just adding the easyconfig to the rebuild easystack would trigger it as a rebuild in zen4 as well 🤔 Maybe I'm missing a step...

@casparvl
Collaborator

I don't think it should be different for zen4. I don't have time to check the logs now, but we should check whether it correctly goes into the else branch here:

if [[ -z "${changed_easystacks_rebuilds}" ]]; then

This should remove the installation in the overlay. Since the same overlay is then used at build time (or should be, at least), we don't actually need --rebuild: the previous installation is already gone when EasyBuild is invoked on the easystack file.
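
The branch being described can be sketched roughly as follows. This is a simplified illustration, not the actual bot script: only the `changed_easystacks_rebuilds` variable comes from the quoted line, while `handle_rebuilds`, `remove_rebuild_installations`, and the example file name are hypothetical stand-ins. The sketch uses the POSIX `[ -z ... ]` form of the quoted `[[ -z ... ]]` test.

```shell
# Hypothetical stand-in for the real removal step run inside the container.
remove_rebuild_installations() {
    echo "removing installations listed in: $1"
}

# Mirrors the quoted test: an empty list means there is nothing to remove,
# otherwise the else branch runs the removal before the build starts.
handle_rebuilds() {
    local changed_easystacks_rebuilds="$1"
    if [ -z "${changed_easystacks_rebuilds}" ]; then
        echo "no rebuild easystacks changed, nothing to remove"
    else
        remove_rebuild_installations "${changed_easystacks_rebuilds}"
    fi
}

handle_rebuilds ""
handle_rebuilds "example-rebuild-easystack.yml"
```

If the removal in the else branch fails, the old module is still present when the build runs, which would be consistent with the "already installed (module found), skipping" message seen in the build log.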

@casparvl
Collaborator

Checked the logs, the removal step fails, since some fakeroot-related dir doesn't exist:

Executing command to remove software:
./eessi_container.sh --verbose --access rw --mode run --container docker://ghcr.io/eessi/build-node:debian11 --repository eessi.io-2023.06-software --save /project/60006/SHARED/jobs/2024.08/pr_669/event_aa65ed90-5c15-11ef-9f64-c9653d7e0953/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/previous_tmp/removal_step --storage /tmp/bot/EESSI --fakeroot
                     -- ./EESSI-remove-software.sh "" "" 2>&1 | tee -a remove.outerr.yEdX
checking for duplicates: 'eessi.io-2023.06-software' and 'eessi.io-2023.06-software'
repo 'eessi.io-2023.06-software' is not an EESSI CVMFS repository...
Using /tmp/bot/EESSI/eessi.vxf0XHi8WL as tmp directory (to resume session add '--resume /tmp/bot/EESSI/eessi.vxf0XHi8WL').
EESSI_TMPDIR=/tmp/bot/EESSI/eessi.vxf0XHi8WL
HOST_INJECTIONS=/tmp/bot/EESSI/eessi.vxf0XHi8WL/opt-eessi
SINGULARITY_CACHEDIR=/project/def-users/bot/.cache/containers/x86_64
Pulling container image from docker://ghcr.io/eessi/build-node:debian11 to /tmp/bot/EESSI/eessi.vxf0XHi8WL/ghcr.io_eessi_build_node_debian11.sif
INFO:    Environment variable SINGULARITY_TMPDIR is set, but APPTAINER_TMPDIR is preferred
INFO:    Environment variable SINGULARITY_CACHEDIR is set, but APPTAINER_CACHEDIR is preferred
INFO:    Using cached SIF image
CONTAINER=/tmp/bot/EESSI/eessi.vxf0XHi8WL/ghcr.io_eessi_build_node_debian11.sif
EESSI_CVMFS_VAR_LIB=/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-lib-cvmfs
EESSI_CVMFS_VAR_RUN=/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-run-cvmfs
SINGULARITY_HOME=/project/60006/SHARED/jobs/2024.08/pr_669/event_aa65ed90-5c15-11ef-9f64-c9653d7e0953/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software:/eessi_bot_job
BIND_PATHS=/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-lib-cvmfs:/var/lib/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-run-cvmfs:/var/run/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/opt-eessi:/opt/eessi,/tmp/bot/EESSI/eessi.vxf0XHi8WL:/tmp

BIND_PATHS before processing REPOSITORIES
  BIND_PATHS=/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-lib-cvmfs:/var/lib/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-run-cvmfs:/var/run/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/opt-eessi:/opt/eessi,/tmp/bot/EESSI/eessi.vxf0XHi8WL:/tmp

process CVMFS repo spec 'eessi.io-2023.06-software'
default.local --> /etc/cvmfs/default.local
eessi.io/eessi.io.pub --> /etc/cvmfs/keys/eessi.io/eessi.io.pub
eessi.io.conf --> /etc/cvmfs/domain.d/eessi.io.conf
BIND_PATHS after processing 'eessi.io-2023.06-software'
  BIND_PATHS=/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-lib-cvmfs:/var/lib/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-run-cvmfs:/var/run/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/opt-eessi:/opt/eessi,/tmp/bot/EESSI/eessi.vxf0XHi8WL:/tmp,/tmp/bot/EESSI/eessi.vxf0XHi8WL/repos_cfg/default.local:/etc/cvmfs/default.local,/tmp/bot/EESSI/eessi.vxf0XHi8WL/repos_cfg/eessi.io/eessi.io.pub:/etc/cvmfs/keys/eessi.io/eessi.io.pub,/tmp/bot/EESSI/eessi.vxf0XHi8WL/repos_cfg/eessi.io.conf:/etc/cvmfs/domain.d/eessi.io.conf

add fusemount options for CVMFS repo 'eessi.io-2023.06-software'
repo 'eessi.io-2023.06-software' is not an EESSI CVMFS repository...
TMP directory contents:
total 369488
-rwxr-xr-x. 1 bot bot 378355712 Aug 16 21:27 ghcr.io_eessi_build_node_debian11.sif
drwxr-xr-x. 2 bot bot         6 Aug 16 21:27 opt-eessi
drwxr-xr-x. 3 bot bot        81 Aug 16 21:27 repos_cfg
drwxr-xr-x. 4 bot bot        47 Aug 16 21:27 software.eessi.io
drwxr-xr-x. 2 bot bot         6 Aug 16 21:27 var-lib-cvmfs
drwxr-xr-x. 2 bot bot         6 Aug 16 21:27 var-run-cvmfs
SINGULARITY_BIND=/project/def-users/SHARED/build-logs,/project/def-users/bot/shared,/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-lib-cvmfs:/var/lib/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/var-run-cvmfs:/var/run/cvmfs,/tmp/bot/EESSI/eessi.vxf0XHi8WL/opt-eessi:/opt/eessi,/tmp/bot/EESSI/eessi.vxf0XHi8WL:/tmp,/tmp/bot/EESSI/eessi.vxf0XHi8WL/repos_cfg/default.local:/etc/cvmfs/default.local,/tmp/bot/EESSI/eessi.vxf0XHi8WL/repos_cfg/eessi.io/eessi.io.pub:/etc/cvmfs/keys/eessi.io/eessi.io.pub,/tmp/bot/EESSI/eessi.vxf0XHi8WL/repos_cfg/eessi.io.conf:/etc/cvmfs/domain.d/eessi.io.conf
Launching container with command (next line):
singularity  run --fakeroot --fusemount container:cvmfs2 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch --fusemount container:cvmfs2 software.eessi.io /cvmfs_ro/software.eessi.io --fusemount container:fuse-overlayfs -o lowerdir=/cvmfs_ro/software.eessi.io -o upperdir=/tmp/software.eessi.io/overlay-upper -o workdir=/tmp/software.eessi.io/overlay-work /cvmfs/software.eessi.io /tmp/bot/EESSI/eessi.vxf0XHi8WL/ghcr.io_eessi_build_node_debian11.sif ./EESSI-remove-software.sh
INFO:    Environment variable SINGULARITY_BIND is set, but APPTAINER_BIND is preferred
INFO:    Environment variable SINGULARITY_HOME is set, but APPTAINER_HOME is preferred
INFO:    Environment variable SINGULARITY_TMPDIR is set, but APPTAINER_TMPDIR is preferred
INFO:    User not listed in /etc/subuid, trying root-mapped namespace
INFO:    Environment variable SINGULARITY_BIND is set, but APPTAINER_BIND is preferred
INFO:    Environment variable SINGULARITY_HOME is set, but APPTAINER_HOME is preferred
INFO:    Environment variable SINGULARITY_TMPDIR is set, but APPTAINER_TMPDIR is preferred
INFO:    Using fakeroot command combined with root-mapped namespace
unknown argument ignored: lazytime
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
FATAL:   exec /.singularity.d/libs/fakeroot failed: fork/exec /.singularity.d/libs/fakeroot: no such file or directory
Saved contents of tmp directory '/tmp/bot/EESSI/eessi.vxf0XHi8WL' to tarball '/project/60006/SHARED/jobs/2024.08/pr_669/event_aa65ed90-5c15-11ef-9f64-c9653d7e0953/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/previous_tmp/removal_step/-1723843637.tgz' (to resume session add '--resume /project/60006/SHARED/jobs/2024.08/pr_669/event_aa65ed90-5c15-11ef-9f64-c9653d7e0953/run_000/linux_x86_64_amd_zen4/eessi.io-2023.06-software/previous_tmp/removal_step/-1723843637.tgz')

@boegel
Contributor

boegel commented Aug 20, 2024

A workaround has been put in place for the fakeroot problem for now (see apptainer/apptainer#2189).

@boegel
Contributor

boegel commented Aug 20, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted

eessi-bot bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

eessi-bot bot commented Aug 20, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_669/224

date job status comment
Aug 20 08:08:37 UTC 2024 submitted job id 224 awaits release by job manager
Aug 20 08:08:56 UTC 2024 released job awaits launch by Slurm scheduler
Aug 20 11:22:20 UTC 2024 running job 224 is running
Aug 20 11:54:20 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-224.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1724152985.tar.gz
size: 0 MiB (518951 bytes)
entries: 366
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
hatchling/1.18.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
hatchling/1.18.0-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/amd/zen4
no other files in tarball
Aug 20 11:54:20 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 13/13 test case(s) from 13 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-224.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 21 17:11:20 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen4-1724152985.tar.gz to S3 bucket succeeded

@boegel
Contributor

boegel commented Aug 20, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/a64fx

eessi-bot bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • no jobs were submitted

eessi-build-deploy-bot-deucalion bot commented Aug 20, 2024

Updates by the bot instance boegel-bot-deucalion (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

eessi-bot bot commented Aug 20, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • no jobs were submitted

eessi-build-deploy-bot-deucalion bot commented Aug 20, 2024

New job on instance boegel-bot-deucalion for architecture aarch64-a64fx for repository eessi.io-2023.06-software in job dir /home/kehoste/project_dir/bot/jobs/2024.08/pr_669/72265

date job status comment
Aug 20 19:09:33 UTC 2024 submitted job id 72265 awaits release by job manager
Aug 20 19:10:14 UTC 2024 released job awaits launch by Slurm scheduler
Aug 20 19:11:19 UTC 2024 running job 72265 is running
Aug 20 19:19:45 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-72265.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 20 19:19:45 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-72265.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

Contributor

@boegel boegel left a comment

lgtm

@boegel
Contributor

boegel commented Aug 20, 2024

Hmm, also seeing this when rebuilding hatchling for a64fx:

== hatchling/1.18.0-GCCcore-12.3.0 is already installed (module found), skipping

Same problem as we saw with zen4:

FATAL:   could not use fakeroot: no mapping entry found in /etc/subuid for kehoste

The problem is that this time there's no easy fix, since we don't control the configuration of the worker node we're running on...

So it seems like we need to re-visit the use of --fakeroot, since we can't/shouldn't assume it works everywhere?

Thoughts here @casparvl?
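
One way to avoid assuming that --fakeroot works everywhere would be a pre-flight check before launching the container. The sketch below only tests for a user entry in /etc/subuid, which is the condition named in the FATAL message above. The helper name `has_subuid_mapping` is hypothetical, and this is an assumption about a possible approach, not something the bot does. Note that the zen4 log shows Apptainer falling back to a root-mapped namespace when the user is not listed, so a missing entry does not always mean failure, and entries keyed by numeric UID are not covered here.

```shell
# Hypothetical pre-flight check: is there a subuid entry for the given user?
# The second argument (file to inspect) defaults to /etc/subuid.
has_subuid_mapping() {
    local user="$1"
    local subuid_file="${2:-/etc/subuid}"
    [ -r "${subuid_file}" ] && grep -q "^${user}:" "${subuid_file}"
}

if has_subuid_mapping "$(id -un)"; then
    echo "subuid mapping found, --fakeroot is more likely to work"
else
    echo "no subuid mapping found, --fakeroot may fail on this node"
fi
```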

@boegel
Contributor

boegel commented Aug 21, 2024

@Neves-P has manually removed hatchling/1.18.0-GCCcore-12.3.0 for A64FX, so rebuilding should work now. That lets us wrap this up and follow up on the --fakeroot issues later.

bot: build repo:eessi.io-2023.06-software arch:aarch64/a64fx

eessi-bot bot commented Aug 21, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • no jobs were submitted

eessi-build-deploy-bot-deucalion bot commented Aug 21, 2024

Updates by the bot instance boegel-bot-deucalion (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

eessi-bot bot commented Aug 21, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • no jobs were submitted

eessi-build-deploy-bot-deucalion bot commented Aug 21, 2024

New job on instance boegel-bot-deucalion for architecture aarch64-a64fx for repository eessi.io-2023.06-software in job dir /home/kehoste/project_dir/bot/jobs/2024.08/pr_669/72499

date job status comment
Aug 21 16:46:13 UTC 2024 submitted job id 72499 awaits release by job manager
Aug 21 16:47:16 UTC 2024 released job awaits launch by Slurm scheduler
Aug 21 16:48:19 UTC 2024 running job 72499 is running
Aug 21 17:00:57 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-72499.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-1724259587.tar.gz
size: 0 MiB (513152 bytes)
entries: 367
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
hatchling/1.18.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/aarch64/a64fx/software
hatchling/1.18.0-GCCcore-12.3.0
other under 2023.06/software/linux/aarch64/a64fx
no other files in tarball
Aug 21 17:00:57 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-72499.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
Aug 21 17:12:07 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-aarch64-a64fx-1724259587.tar.gz to S3 bucket succeeded

Contributor

@boegel boegel left a comment

lgtm

@boegel boegel added the bot:deploy Ask bot to deploy missing software installations to EESSI label Aug 21, 2024
Member Author

Neves-P commented Aug 22, 2024

Manually removed old tarballs and added new ones. Procedure:

#!/bin/bash

# Remove and replace hatchling 1.18.0 GCCcore-12.3.0 PR 669 - Pedro Santos Neves

# Remove existing installs (A64FX was removed previously in order to build)
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/haswell/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/generic/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/generic/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_n1/software/hatchling/1.18.0-GCCcore-12.3.0
rm -rf /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/hatchling/1.18.0-GCCcore-12.3.0

# Move to software dir
cd /cvmfs/software.eessi.io/versions

# Unpack tarballs
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-aarch64-a64fx-1724259587.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-x86_64-amd-zen4-1724152985.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-x86_64-intel-haswell-1723819673.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-x86_64-amd-zen3-1723819613.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-x86_64-amd-zen2-1723819605.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-x86_64-generic-1723819622.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-aarch64-neoverse_v1-1723819484.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-aarch64-generic-1723819504.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1723817335.tar.gz
tar xvzf /srv/tmp/tarballs/eessi-2023.06-software-linux-aarch64-neoverse_n1-1723805793.tar.gz
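
The same remove-then-unpack procedure can be sketched as two loops. This is an illustration only, not the script that was actually run: `EESSI_ROOT`, `DRY_RUN`, and the helper names `run`, `remove_installs`, and `unpack_tarballs` are hypothetical and do not appear in the original. The target list omits a64fx because, as noted above, that install had already been removed. Tarballs are passed as arguments rather than globbed, so nothing beyond the intended files gets unpacked.

```shell
#!/bin/bash
# Sketch only: loop-based variant of the manual procedure above.
# EESSI_ROOT and DRY_RUN are hypothetical knobs added for illustration.
set -euo pipefail

EESSI_ROOT="${EESSI_ROOT:-/cvmfs/software.eessi.io/versions}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them
PKG="hatchling/1.18.0-GCCcore-12.3.0"

# CPU targets whose existing install is removed (a64fx was removed earlier)
targets=(
  x86_64/amd/zen2 x86_64/amd/zen3 x86_64/amd/zen4
  x86_64/intel/haswell x86_64/intel/skylake_avx512 x86_64/generic
  aarch64/generic aarch64/neoverse_n1 aarch64/neoverse_v1
)

# Print the command in dry-run mode, otherwise execute it
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Remove the existing installation directory for every target
remove_installs() {
  local t
  for t in "${targets[@]}"; do
    run rm -rf "${EESSI_ROOT}/2023.06/software/linux/${t}/software/${PKG}"
  done
}

# Unpack each tarball given as an argument, relative to EESSI_ROOT
# (equivalent to the "cd ...; tar xvzf ..." steps above)
unpack_tarballs() {
  local tb
  for tb in "$@"; do
    run tar -xzf "$tb" -C "$EESSI_ROOT"
  done
}

remove_installs
unpack_tarballs "$@"
```

With the default `DRY_RUN=1` the script only echoes the `rm` and `tar` commands it would run, which allows the list to be reviewed before setting `DRY_RUN=0`.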

Looks as expected AFAICS.

If everything looks good, PR can be closed (or merged?)

@boegel boegel merged commit 97505aa into EESSI:2023.06-software.eessi.io Aug 22, 2024
33 checks passed
Contributor

boegel commented Aug 22, 2024

Verified, looks good:

$ ls -ld /cvmfs/software.eessi.io/versions/2023.06/software/linux/*/*/*/software/hatchling/*/lib/python3.11/site-packages/* | grep requirements_txt$
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:46 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen2/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:46 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 20 13:22 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:47 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/haswell/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:08 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/intel/skylake_avx512/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
$ ls -ld /cvmfs/software.eessi.io/versions/2023.06/software/linux/*/*/software/hatchling/*/lib/python3.11/site-packages/* | grep requirements_txt$
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 21 18:57 /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/a64fx/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:44 /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/generic/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 12:55 /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_n1/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:44 /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
dr-xr-xr-x  3 cvmfs cvmfs   60 Aug 16 16:46 /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/generic/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt
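
The two `ls` invocations above are needed because the CPU target paths have different depths (`x86_64/amd/zen2` versus `aarch64/a64fx`). A minimal sketch of a single-pass check follows, assuming the point of the verification is that `hatch_requirements_txt` is present in every target's installation. `STACK_ROOT` and `check_installs` are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Sketch only: check every CPU target for the hatch_requirements_txt directory.
# STACK_ROOT is a hypothetical override; the default is the path used above.
STACK_ROOT="${STACK_ROOT:-/cvmfs/software.eessi.io/versions/2023.06/software/linux}"

check_installs() {
  local missing=0 t dir
  for t in \
      aarch64/a64fx aarch64/generic aarch64/neoverse_n1 aarch64/neoverse_v1 \
      x86_64/generic x86_64/amd/zen2 x86_64/amd/zen3 x86_64/amd/zen4 \
      x86_64/intel/haswell x86_64/intel/skylake_avx512; do
    dir="${STACK_ROOT}/${t}/software/hatchling/1.18.0-GCCcore-12.3.0/lib/python3.11/site-packages/hatch_requirements_txt"
    if [ -d "$dir" ]; then
      echo "ok      $t"
    else
      echo "MISSING $t"
      missing=$((missing + 1))
    fi
  done
  # Succeed only if no target is missing the directory
  [ "$missing" -eq 0 ]
}
```

Calling `check_installs` prints one line per target and returns a non-zero status if any target lacks the directory, so it can be used as `check_installs && echo "all targets verified"`.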

@Neves-P Neves-P deleted the rebuild/hatchling branch August 23, 2024 07:44