Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Invalid use of subproceess.Popen API with wrapped file objects #3101

Merged
merged 7 commits into from
Nov 15, 2024

Conversation

kyujin-cho
Copy link
Member

@kyujin-cho kyujin-cho commented Nov 15, 2024

Fix LZMAFile and GZipFile instances incorrectly fed to asyncio.create_subprocess_exec()'s stdin parameter.

The following code looks legitimately working and it actually executes:

with lzma.open(...) as f:
    await asyncio.create_subprocess_exec(..., stdin=f)

Under the hood, the subprocess.Popen constructor calls f.fileno() method that returns the file descriptor representing the raw file stream before decompression.

These confusing behavior made create_subprocess_exec() not to hand out the extracted tarball to the docker load command, but rather passing original uncompressed .xz (or .gz) file, as Popen tries to locate make use of the file descriptor (which returned by fileno() method) rather than using IOBase's high-level read() API where LZMAFile and GZipFile performs decompression.

We should have done either:

with lzma.open(...) as f:
    proc = await asyncio.create_subprocess_exec(..., stdin=asyncio.subprocess.PIPE)
    await proc.communicate(f.read())

(for files whose sizes are controlled by us), or:

with lzma.open(...) as f:
    proc = await asyncio.create_subprocess_exec(..., stdin=asyncio.subprocess.PIPE)
    while True:
        chunk = f.read(CHUNK_SIZE)
        if not chunk:
            break
        proc.stdin.write(chunk)
        await proc.stdin.drain()
    await proc.communicate()

(for files that may have arbitrarily large sizes).

Even though it is definitely not an intended situation, as docker load also accepts tarballs compressed with either XZ or GZip algorithm (ref) - which has effectively hidden this long-standing bug, at least for more than 5 years...

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version

@kyujin-cho kyujin-cho added the type:bug Reports about that are not working label Nov 15, 2024
@kyujin-cho kyujin-cho added this to the 23.09 milestone Nov 15, 2024
@kyujin-cho kyujin-cho requested a review from achimnol November 15, 2024 07:12
@kyujin-cho kyujin-cho self-assigned this Nov 15, 2024
@github-actions github-actions bot added comp:agent Related to Agent component size:S 10~30 LoC labels Nov 15, 2024
@achimnol achimnol changed the title fix: invalid usage of LZMAFile object fix: Invalid use of subproceess.Popen API with wrapped file objects Nov 15, 2024
@kyujin-cho kyujin-cho requested a review from achimnol November 15, 2024 07:27
@kyujin-cho kyujin-cho enabled auto-merge November 15, 2024 07:31
@kyujin-cho kyujin-cho added this pull request to the merge queue Nov 15, 2024
Merged via the queue into main with commit 5c97728 Nov 15, 2024
20 checks passed
@kyujin-cho kyujin-cho deleted the fix/lzma-decompression-bug branch November 15, 2024 18:54
lablup-octodog pushed a commit that referenced this pull request Nov 15, 2024
…#3101)

Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.12)
Backported-to: 24.09
Backport-of: 3101
lablup-octodog pushed a commit that referenced this pull request Nov 15, 2024
…#3101)

Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.12)
Backported-to: 23.09
Backport-of: 3101
lablup-octodog pushed a commit that referenced this pull request Nov 15, 2024
…#3101)

Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.12)
Backported-to: 24.03
Backport-of: 3101
github-merge-queue bot pushed a commit that referenced this pull request Nov 15, 2024
achimnol added a commit that referenced this pull request Nov 15, 2024
…#3101)

Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.12)
Backported-to: 24.09
Backport-of: 3101
achimnol added a commit that referenced this pull request Nov 15, 2024
…#3101)

Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.12)
Backported-to: 23.09
Backport-of: 3101
github-merge-queue bot pushed a commit that referenced this pull request Nov 15, 2024
github-merge-queue bot pushed a commit that referenced this pull request Nov 16, 2024
github-merge-queue bot pushed a commit that referenced this pull request Nov 16, 2024
lablup-octodog pushed a commit that referenced this pull request Nov 16, 2024
Backported-from: main (24.12)
Backported-to: 24.09
Backport-of: 3106
lablup-octodog pushed a commit that referenced this pull request Nov 16, 2024
Backported-from: main (24.12)
Backported-to: 23.09
Backport-of: 3106
lablup-octodog pushed a commit that referenced this pull request Nov 16, 2024
Backported-from: main (24.12)
Backported-to: 24.03
Backport-of: 3106
github-merge-queue bot pushed a commit that referenced this pull request Nov 16, 2024
github-merge-queue bot pushed a commit that referenced this pull request Nov 16, 2024
achimnol added a commit that referenced this pull request Nov 16, 2024
Yaminyam pushed a commit that referenced this pull request Feb 12, 2025
Yaminyam pushed a commit that referenced this pull request Feb 12, 2025
Yaminyam pushed a commit that referenced this pull request Feb 12, 2025
Yaminyam pushed a commit that referenced this pull request Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:agent Related to Agent component size:S 10~30 LoC type:bug Reports about that are not working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants