BUG: "OSError: [Errno 7] Argument list too long" #45
Comments
Thoughts:
Looking into it further, the timeraw tests that use a "small" example (11 kB) DO actually run! This thread claims it's a system-level issue: the command-line input is larger than what the system supports. So how do we work around this?
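For context, here is a minimal sketch (illustrative, not from this issue; assumes a POSIX system) of how to check the limit and reproduce the error:

import os
import subprocess
import sys

# Maximum combined size of argv + environment that execve() will accept.
print("ARG_MAX:", os.sysconf("SC_ARG_MAX"))

# Pass a payload much larger than that as a single "-c" argument; exec of the
# child fails before Python even starts, surfacing as OSError in the parent.
huge_code = "x = 1\n" * 2_000_000  # ~12 MB of source on argv
try:
    subprocess.check_output([sys.executable, "-c", huge_code])
except OSError as exc:
    print("Failed as expected:", exc)  # [Errno 7] Argument list too long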
I don't know much about these kinds of things, but my intuition is that saving and reading a file would be slower than passing the info through STDIN. So my approach would be to pass the code to execute like this:

res = subprocess.check_output([sys.executable, "-c", code], input=stmt.encode("utf-8"))

We would also have to update the subprocess template accordingly.

Ah... There's a problem... We cannot pass only the main test code (the template also wraps it with the setup code and the number of iterations). So what we might want to do instead is to pass the whole formatted code, and then the code that would be called would just read STDIN and pass it to exec():

code = self.subprocess_tmpl.format(stmt=stmt, setup=setup, number=number)
evaler = """
import sys
code = sys.stdin.read()
exec(code)
"""
res = subprocess.check_output([sys.executable, "-c", evaler], input=code.encode("utf-8"))
return float(res.strip())
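To make the idea concrete, here is a self-contained sketch of the stdin-based approach (the helper name run_via_stdin and the timing snippet are illustrative, not asv_runner's API): the code to execute is piped to the child process instead of being placed on the command line, so its size is no longer limited by the kernel's argument-size limits.

import subprocess
import sys
import textwrap

def run_via_stdin(code: str) -> float:
    # The child reads its stdin and exec()s it; the payload never touches argv.
    evaler = textwrap.dedent(
        """
        import sys
        code = sys.stdin.read()
        exec(code)
        """
    )
    res = subprocess.check_output(
        [sys.executable, "-c", evaler],
        input=code.encode("utf-8"),
    )
    # The piped code is expected to print a single number, which we parse.
    return float(res.strip())

# Example payload: time a small statement and print the result, mirroring what
# the formatted subprocess_tmpl would print.
snippet = textwrap.dedent(
    """
    import timeit
    print(timeit.timeit("sum(range(1000))", number=1000))
    """
)
print(run_via_stdin(snippet))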
I tested the snippet below locally, and it works. I still have to test whether it works in the CI too.

code = self.subprocess_tmpl.format(stmt=stmt, setup=setup, number=number)
evaler = textwrap.dedent(
"""
import sys
code = sys.stdin.read()
exec(code)
"""
)
res = subprocess.check_output([sys.executable, "-c", evaler], input=code.encode("utf-8"), stderr=subprocess.STDOUT)
return float(res.strip())

One thing, though: a chatbot assistant suggested using subprocess.Popen instead:

proc = subprocess.Popen([sys.executable, "-c", evaler],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
stdout, stderr = proc.communicate(input=code.encode("utf-8"))
if proc.returncode != 0:
    raise RuntimeError(f"Subprocess failed: {stderr.decode()}")
return float(stdout.decode("utf-8").strip())

The upside of using Popen here is that stderr stays separate from stdout, so on failure it can be reported in the error message instead of ending up mixed into the output we parse with float().
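A toy illustration of that difference (an assumed example, not code from this project):

import subprocess
import sys

# A child that emits a warning on stderr but prints its result on stdout.
child = 'import sys; print("warning: noisy dependency", file=sys.stderr); print(0.123)'

# check_output with merged streams: the captured bytes contain both lines,
# so float(merged.strip()) would raise ValueError.
merged = subprocess.check_output([sys.executable, "-c", child], stderr=subprocess.STDOUT)
print(repr(merged))

# Popen with separate pipes: stdout holds only the number, stderr is kept aside.
proc = subprocess.Popen([sys.executable, "-c", child],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
out, err = proc.communicate()
print(float(out.strip()), "stderr was:", err.decode().strip())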
@HaoZeke What do you think of this fix? And where is
Looks like I managed to fix my CI. For anyone with the same issue as me, here's the fix you can use in the meantime, until a fix gets released in asv_runner:
# TODO: REMOVE ONCE FIXED UPSTREAM
# Fix for https://github.com/airspeed-velocity/asv_runner/issues/45
# Prepare virtual environment
# Currently, we have to monkeypatch the `timeit` function in the `timeraw` benchmark.
# The problem is that `asv` passes the code to execute via command line, and when the
# code is too big, it fails with `OSError: [Errno 7] Argument list too long`.
# So we have to tweak it to pass the code via STDIN, which doesn't have this limitation.
#
# 1. First create the virtual environment, so that asv generates the directories where
# the monkeypatch can be applied.
echo "Creating virtual environment..."
asv setup -v || true
echo "Virtual environment created."
# 2. Now let's apply the monkeypatch by appending it to the `timeraw.py` files.
# First find all `timeraw.py` files
echo "Applying monkeypatch..."
find .asv/env -type f -path "*/site-packages/asv_runner/benchmarks/timeraw.py" | while read -r file; do
# Add a newline and then append the monkeypatch contents
echo "" >> "$file"
cat "benchmarks/monkeypatch_asv_ci.txt" >> "$file"
done
echo "Monkeypatch applied."
# END OF MONKEYPATCH

The fix that gets appended to the original content of timeraw.py is the following; it is the content of benchmarks/monkeypatch_asv_ci.txt:

# ------------ FIX FOR #45 ------------
# See https://github.com/airspeed-velocity/asv_runner/issues/45
# This fix is applied in CI in the `benchmark.yml` file.
# This file is intentionally named `monkeypatch_asv_ci.txt` to avoid being
# loaded as a python file by `asv`.
# -------------------------------------
def timeit(self, number):
"""
Run the function's code `number` times in a separate Python process, and
return the execution time.
#### Parameters
**number** (`int`)
: The number of times to execute the function's code.
#### Returns
**time** (`float`)
: The time it took to execute the function's code `number` times.
#### Notes
The function's code is executed in a separate Python process to avoid
interference from the parent process. The function can return either a
single string of code to be executed, or a tuple of two strings: the
code to be executed and the setup code to be run before timing.
"""
stmt = self.func()
if isinstance(stmt, tuple):
stmt, setup = stmt
else:
setup = ""
stmt = textwrap.dedent(stmt)
setup = textwrap.dedent(setup)
stmt = stmt.replace(r'"""', r"\"\"\"")
setup = setup.replace(r'"""', r"\"\"\"")
# TODO
# -----------ORIGINAL CODE-----------
# code = self.subprocess_tmpl.format(stmt=stmt, setup=setup, number=number)
# res = subprocess.check_output([sys.executable, "-c", code])
# return float(res.strip())
# -----------NEW CODE-----------
code = self.subprocess_tmpl.format(stmt=stmt, setup=setup, number=number)
evaler = textwrap.dedent(
"""
import sys
code = sys.stdin.read()
exec(code)
"""
)
proc = subprocess.Popen([sys.executable, "-c", evaler],
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
stdout, stderr = proc.communicate(input=code.encode("utf-8"))
if proc.returncode != 0:
raise RuntimeError(f"Subprocess failed: {stderr.decode()}")
return float(stdout.decode("utf-8").strip())
_SeparateProcessTimer.timeit = timeit
# ------------ END FIX #45 ------------
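A quick way to convince yourself that the stdin-based version really lifts the limit (an illustrative check, not part of the monkeypatch file) is to push a payload through the same evaler that is far larger than any ARG_MAX would allow on the command line:

import subprocess
import sys
import textwrap

evaler = textwrap.dedent(
    """
    import sys
    code = sys.stdin.read()
    exec(code)
    """
)

# ~3 MB of generated source; passing this via "-c" would fail with Errno 7.
big_code = "x = 0\n" * 500_000 + "print(0.001)\n"

res = subprocess.check_output([sys.executable, "-c", evaler], input=big_code.encode("utf-8"))
print(float(res.strip()))  # 0.001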
And this is the final GitHub Actions workflow for running benchmarks on pull requests:

# Run benchmark report on pull requests to master.
# The report is added to the PR as a comment.
name: Benchmarks

on:
  pull_request:
    branches: [ master ]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Need full history for ASV

      - name: Fetch base branch
        run: |
          git remote add upstream https://github.com/${{ github.repository }}.git
          git fetch upstream master

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install asv

      - name: Run benchmarks
        run: |
          # TODO: REMOVE ONCE FIXED UPSTREAM
          # Fix for https://github.com/airspeed-velocity/asv_runner/issues/45
          # Prepare virtual environment
          # Currently, we have to monkeypatch the `timeit` function in the `timeraw` benchmark.
          # The problem is that `asv` passes the code to execute via command line, and when the
          # code is too big, it fails with `OSError: [Errno 7] Argument list too long`.
          # So we have to tweak it to pass the code via STDIN, which doesn't have this limitation.
          #
          # 1. First create the virtual environment, so that asv generates the directories where
          #    the monkeypatch can be applied.
          echo "Creating virtual environment..."
          asv setup -v || true
          echo "Virtual environment created."
          # 2. Now let's apply the monkeypatch by appending it to the `timeraw.py` files.
          #    First find all `timeraw.py` files.
          echo "Applying monkeypatch..."
          find .asv/env -type f -path "*/site-packages/asv_runner/benchmarks/timeraw.py" | while read -r file; do
            # Add a newline and then append the monkeypatch contents
            echo "" >> "$file"
            cat "benchmarks/monkeypatch_asv_ci.txt" >> "$file"
          done
          echo "Monkeypatch applied."
          # END OF MONKEYPATCH

          # Prepare the profile under which the benchmarks will be saved.
          # We assume that the CI machine has a name that is unique and stable.
          # See https://github.com/airspeed-velocity/asv/issues/796#issuecomment-1188431794
          asv machine --yes

          # Generate benchmark data
          # - `^` means that we mean the COMMIT of the branch, not the BRANCH itself.
          #   Without it, we would run benchmarks for the whole branch history.
          #   With it, we run benchmarks FROM the latest commit (incl) TO ...
          # - `!` means that we want to select range spanning a single commit.
          #   Without it, we would run benchmarks for all commits FROM the latest commit
          #   TO the start of the branch history.
          #   With it, we run benchmarks ONLY FOR the latest commit.
          asv run upstream/master^! -v
          asv run HEAD^! -v

          # Compare against master
          asv compare upstream/master HEAD --factor 1.1 --split > benchmark_results.md

      - name: Comment on PR
        uses: actions/github-script@v7
        with:
          github-token: ${{secrets.GITHUB_TOKEN}}
          script: |
            const fs = require('fs');
            const results = fs.readFileSync('benchmark_results.md', 'utf8');
            const body = `## Performance Benchmark Results\n\nComparing PR changes against master branch:\n\n${results}`;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: body
            });
UPDATE: The error is not what I thought it was at first glance, so I updated the name.
I have set up a CI workflow that, on pull request, uses asv continuous to benchmark the base branch and the PR branch and compare the two. The timeraw tests are failing in the CI (GitHub Actions with a Linux worker).
The workflow looks like this:
The error occurs when I get to running the benchmarks in:

asv run upstream/master^! -v
I am also running peakmem benchmarks as part of the same suite, and these run successfully. So I know that the overall setup works, and that the issue is only with the timeraw tests. But the timeraw tests work for me locally.
The actual error is:

OSError: [Errno 7] Argument list too long
It is raised from within subprocess.check_output in asv_runner/asv_runner/benchmarks/timeraw.py (line 78 at commit 1c88c49).
Traceback: