
Remote-apis testing: pants is failing since 2.16 #20731

Open · jjardon opened this issue Mar 29, 2024 · 6 comments

Labels: backend: Python (Python backend-related issues), bug, remote

Comments

jjardon commented Mar 29, 2024

Describe the bug
Since 2.16 (commit 27fc9ee7761e61f3c5c9b502d612df5f1f13e29b in https://github.com/pantsbuild/example-python), pants appears to behave incorrectly when used for remote execution.

See https://gitlab.com/remote-apis-testing/remote-apis-testing/-/merge_requests/362

Pants version
The issue is present from 2.16 through the current release, 2.19 (commit f37c500e4f4e0c67e29aa9434b1b414f333bdd79 in https://github.com/pantsbuild/example-python).

OS
Linux

Additional info
We have upgraded to the latest release (2.19) at https://remote-apis-testing.gitlab.io/remote-apis-testing/ to show the current status for now. Please let us know if this is a known issue, and we will upgrade as soon as a new tag is available.

@jjardon jjardon added the bug label Mar 29, 2024
@huonw huonw added the remote label Apr 16, 2024
huonw (Contributor) commented Apr 16, 2024

Thanks for flagging this.

Is there a summary of how we can run the remote-apis-testing tests locally, to reproduce this? I see that https://gitlab.com/remote-apis-testing/remote-apis-testing/-/blob/master/CONTRIBUTING.md seems to be focused on augmenting the CI pipeline, although peeking at that pipeline does suggest that maybe docker-compose/run.sh is what we should be starting with?

sdclarke commented, quoting huonw's question above:

To run a test with pants locally, cd docker-compose and run ./run.sh -g -c pants.yml -s <server>.yml (the server options being buildbarn, buildfarm, and buildgrid). After the first run, the -g option is no longer necessary; it is only used to generate the Docker Compose YAML files. You can edit the generated files if needed and re-run the same command without that flag to test your changes.

I hope this helps!

seifertm commented May 3, 2024

The logs of one of the test runs report that the RE server fails to find the Python 3.9 executable in /root/.cache/nce/60b51…8fda/….

I see a similar error when testing locally with NativeLink as the underlying REAPI server. It looks like Pants submits an absolute path into the client's home directory to the remote execution server. The server cannot find that path locally, because it runs on a different machine. The log below shows the server output from my test: Pants is run as user michael, but there is no user michael in the executor container.

nativelink_executor-1   |   2024-05-03T04:41:00.194254Z ERROR nativelink_worker::local_worker: Error executing action, err: Error { code: NotFound, messages: ["No such file or directory (os error 2)", "Could not execute command [\"/home/michael/.cache/nce/fa6ec1ff473e58cf7dff9577ae94c2bde6bf1c7a837c75b928b414c0195eb80e/bindings/venvs/2.20.0/bin/python3.9\", \"./pex\", \"--tmpdir\", \".tmp\", \"--no-emit-warnings\", \"--pip-version\", \"23.1.2\", \"--python-path\", \"\", \"--output-file\", \"local_dists.pex\", \"--intransitive\", \"--interpreter-constraint\", \"CPython==3.11.*\", \"--sources-directory=source_files\", \"--no-pypi\", \"--index=https://pypi.org/simple/\", \"--manylinux\", \"manylinux2014\", \"--resolver-version\", \"pip-2020-resolver\", \"--layout\", \"zipapp\"]"] }

I assume something similar is happening in the remote-apis-testing repository, except that the mismatch in the contents of /root/.cache isn't apparent there, because both the local and remote environments presumably run as root.
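The core problem can be illustrated with a small, self-contained Python sketch (the helper function and command arguments below are hypothetical, not Pants code): any client that serializes sys.executable into a remotely executed command embeds its own local interpreter path, which generally does not exist on the executor machine.

```python
import sys

def build_remote_command(args):
    # Hypothetical helper: naively embed the client's interpreter path.
    # sys.executable is resolved on the CLIENT, e.g. something like
    # /home/michael/.cache/nce/.../bin/python3.9 -- a path that need
    # not exist on the remote executor.
    return [sys.executable, *args]

cmd = build_remote_command(["./pex", "--tmpdir", ".tmp", "--no-emit-warnings"])
print(cmd[0])  # the client-local interpreter path, not a remote one
```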

huonw (Contributor) commented May 5, 2024

Ah okay, that's a handy smoking gun. Thank you!

It looks like the process invocation is referencing the absolute path to the ~/.cache/nce/.../python3.9 binary that scie-pants provides (the "bootstrap python"). I thus suspect this might've been caused by #18433 (cherry-picked back to 2.16 in #18495), which switched us to running PEX with that Python interpreter rather than one managed more "normally".

@thejcannon (as author) @stuhood (as reviewer): do you have any insight into how we might fix this remote execution issue?

@huonw huonw added the backend: Python Python backend-related issues label May 5, 2024
thejcannon (Member) commented:

The get_python_for_scripts rule should be computing the path of the unpacked digest in the remote environment (and not the local path). Either that's wrong, or the remote execution code path is incorrectly returning the local path.

seifertm commented May 9, 2024

Thanks for the pointer! The rule you mentioned clearly distinguishes between remote and local environments:

@rule
async def get_python_for_scripts(env_tgt: EnvironmentTarget) -> PythonBuildStandaloneBinary:
    if env_tgt.val is None or isinstance(env_tgt.val, LocalEnvironmentTarget):
        return PythonBuildStandaloneBinary(sys.executable)

    result = await Get(_PythonBuildStandaloneBinary, _DownloadPythonBuildStandaloneBinaryRequest())
    return PythonBuildStandaloneBinary(result.path)

When get_python_for_scripts requests _PythonBuildStandaloneBinary in line 55, I expect the engine to run the download_python_binary rule:

@rule(desc="Downloading Python for scripts", level=LogLevel.TRACE)
async def download_python_binary(
    _: _DownloadPythonBuildStandaloneBinaryRequest,
    platform: Platform,
    tar_binary: TarBinary,
    bash_binary: BashBinary,
    python_bootstrap: PythonBootstrapSubsystem,
    system_binaries_environment: SystemBinariesSubsystem.EnvironmentAware,
) -> _PythonBuildStandaloneBinary:

The Python binary download should emit a log message, but running pants with -ltrace does not show it. Consequently, it looks as though the EnvironmentTarget passed to get_python_for_scripts, or the corresponding if clause, has an issue.
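A simplified model of that branch (a sketch using a stand-in dataclass, not Pants' actual types; SANDBOX_PYTHON is a hypothetical path) shows how an unset environment target funnels execution into the "local" branch, so the client's own sys.executable leaks into the process definition even when the process will run remotely:

```python
import sys
from dataclasses import dataclass
from typing import Optional

@dataclass
class EnvironmentTarget:
    # Stand-in for Pants' EnvironmentTarget: None means no explicit
    # environment target is configured (the default).
    val: Optional[str]

SANDBOX_PYTHON = "./python_build_standalone/bin/python3"  # hypothetical

def python_for_scripts(env_tgt: EnvironmentTarget) -> str:
    # Mirrors the if clause above: None (or a local environment)
    # returns the client's own interpreter path.
    if env_tgt.val is None or env_tgt.val == "local":
        return sys.executable
    # Otherwise a relocatable, downloaded interpreter path is used.
    return SANDBOX_PYTHON

# With remote_execution = true but no environment targets defined,
# env_tgt.val is still None, so the local path is returned:
print(python_for_scripts(EnvironmentTarget(val=None)) == sys.executable)  # True
```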

Here's a reproducer. Start an REAPI server in one terminal:

git clone https://github.com/TraceMachina/nativelink.git
cd nativelink/deployment-examples/docker-compose
docker compose up -d --build
docker compose logs -f

Start Pants in another terminal:

cat << EOF >pants.toml
[GLOBAL]
pants_version = "2.20.0"
backend_packages = [
    "pants.backend.python",
]
remote_execution = true
remote_store_address = "grpc://127.0.0.1:50051"
remote_execution_address = "grpc://127.0.0.1:50052"
remote_instance_name = "main"
process_execution_remote_parallelism = 1

[python]
interpreter_constraints = ["==3.11.*"]
EOF

cat << EOF >app.py
print("hello")
EOF

cat << EOF >BUILD
python_sources()
EOF

pants -ltrace --no-pantsd --no-local-cache run app.py

No branches or pull requests · 5 participants