Intermittent "snappy: corrupt input" with plain CSVs #2157

Open
tmtmtmtm opened this issue Sep 21, 2024 · 15 comments

Comments

@tmtmtmtm
Contributor

From time to time (on the order of 1% of runs), with no pattern I can discern, I get an io error: snappy: corrupt input (expected stream header but got unexpected chunk type byte 112) error when reading plain CSV files. I can never replicate it: on a subsequent run of the same command everything works fine. It usually happens somewhere in the middle of a chain of piped commands, so I also can't tell which command is blowing up, or whether there's any pattern to it.

I have been holding off on reporting this in the hope that I could pin it down at least a little more, but I've been unable to do so. I have no .sz files anywhere, so I'm assuming that either qsv is running a check for snappy-ness somewhere (in which case perhaps there's a way I could explicitly turn that off?), or one of the sub-commands is producing a temporary snappy file. I don't understand what's going on well enough to say more, so hopefully there's enough info here for someone else to pick up a useful clue.

(I'm currently on qsv 0.134.0-mimalloc-apply;fetch;foreach;geocode;Luau 0.640;to;polars-0.42.0-fe04390;self_update-8-8;12.80 GiB-677.75 MiB-0 B-16.00 GiB (aarch64-apple-darwin compiled with Rust 1.81) prebuilt but it's been happening with other recent versions too)
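Since the failure shows up in the middle of a chain of piped commands, one way to narrow it down is to run the chain from a small wrapper that records each stage's exit code and stderr separately. The following is only a sketch: `run_pipeline` is a hypothetical helper, and the stages in the demo are stand-in Python one-liners rather than real qsv invocations, which would be substituted in as argument lists.

```python
import subprocess
import sys
import tempfile


def run_pipeline(stages):
    """Run argv lists as a pipe chain; return (final_stdout, per-stage report).

    Each report entry is (argv, returncode, stderr_text), so the stage that
    printed an error can be identified after the fact.
    """
    procs, logs = [], []
    prev_stdout = None
    for argv in stages:
        log = tempfile.TemporaryFile()  # per-stage stderr capture
        proc = subprocess.Popen(
            argv, stdin=prev_stdout, stdout=subprocess.PIPE, stderr=log
        )
        if prev_stdout is not None:
            # Drop the parent's copy so only the downstream stage holds the pipe.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
        logs.append(log)

    output = prev_stdout.read()
    prev_stdout.close()

    report = []
    for argv, proc, log in zip(stages, procs, logs):
        proc.wait()
        log.seek(0)
        report.append((argv, proc.returncode, log.read().decode(errors="replace")))
        log.close()
    return output, report


# Stand-in stages (hypothetical): the middle one fails after reading its input.
demo_stages = [
    [sys.executable, "-c", "print('a,b'); print('1,2')"],
    [sys.executable, "-c",
     "import sys; sys.stdin.read(); sys.stderr.write('boom\\n'); sys.exit(1)"],
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"],
]
demo_output, demo_report = run_pipeline(demo_stages)
for index, (argv, code, err) in enumerate(demo_report):
    print(f"stage {index}: exit={code} stderr={err.strip()!r}")
```

The stages still run concurrently and are connected by real pipes, so the timing should stay close to that of a shell pipeline, although a wrapper like this could still change the behaviour of an intermittent failure.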

@jqnatividad
Collaborator

That's interesting, @tmtmtmtm. Can you also run qsv --envlist to see if any applicable environment variables are set?

@tmtmtmtm
Contributor Author

> qsv --envlist
No qsv-relevant environment variables set.

@jqnatividad
Collaborator

Can you also set QSV_LOG_LEVEL=debug?

export QSV_LOG_LEVEL=debug

and check if it logs any snappy encoding/decoding ops in the qsv_rCURRENT.log file?
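A minimal sketch of that log check, assuming the qsv_rCURRENT.log file sits in the current working directory (the location is an assumption, and `find_snappy_lines` is a hypothetical helper name):

```python
from pathlib import Path


def find_snappy_lines(log_path):
    """Return (line_number, text) pairs for log lines that mention snappy."""
    hits = []
    with Path(log_path).open(encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if "snappy" in line.lower():  # case-insensitive match
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    log_file = Path("qsv_rCURRENT.log")  # assumed location: current directory
    if log_file.exists():
        for number, text in find_snappy_lines(log_file):
            print(f"{number}: {text}")
    else:
        print(f"{log_file} not found")
```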

@tmtmtmtm
Contributor Author

No mention of snappy encoding or decoding anywhere in the log, even on a run that fails. The only reference was the io error: snappy: corrupt input (expected stream header but got unexpected chunk type byte 105) message printed to normal output.
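For context on the two chunk type bytes reported so far (112 and 105): in the Snappy framing format, a stream starts with a stream identifier chunk, whose first byte is the chunk type 0xff, followed by a 3-byte length and the text "sNaPpY". The sketch below decodes the reported byte values and checks some sample data for that header. The helper names are hypothetical, and this is only a reading of the error message, not a diagnosis of where qsv invokes the decoder.

```python
# Stream identifier chunk from the Snappy framing format:
# chunk type 0xff, 3-byte little-endian length (6), then b"sNaPpY".
SNAPPY_STREAM_HEADER = b"\xff\x06\x00\x00sNaPpY"


def looks_like_snappy_stream(data: bytes) -> bool:
    """True if the data begins with the Snappy stream identifier chunk."""
    return data.startswith(SNAPPY_STREAM_HEADER)


def describe_chunk_type(byte_value: int) -> str:
    """Show a reported chunk type byte as decimal, hex, and a character."""
    ch = chr(byte_value)
    shown = ch if ch.isprintable() else "?"
    return f"byte {byte_value} = 0x{byte_value:02x} = {shown!r}"


for reported in (112, 105):
    print(describe_chunk_type(reported))
# byte 112 = 0x70 = 'p'
# byte 105 = 0x69 = 'i'

# Plain CSV text does not start with the stream identifier.
print(looks_like_snappy_stream(b"id,name\n1,a\n"))  # False
```

Both values are ordinary lowercase ASCII letters, which is consistent with a Snappy decoder being handed plain text rather than a compressed stream.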

@jqnatividad
Collaborator

Hi @tmtmtmtm, are you still getting the intermittent snappy error with the latest release?

@tmtmtmtm
Contributor Author

Hi @jqnatividad, I haven't had a chance to upgrade to the latest version yet (I've been burned before by updates renaming commands without leaving the old version in place for a deprecation cycle, or by subtle behaviour changes, so I need to set dedicated time aside for carefully making sure a new version is safe before doing a full install), but I'll let you know once I've been able to switch to it.

@tmtmtmtm
Contributor Author

Hi @jqnatividad I'm still having this issue with qsv 0.136.0-mimalloc-apply;fetch;foreach;geocode;Luau 0.640;to;polars-0.43.1-ee9bafb;self_update-8-8;12.80 GiB-450.50 MiB-0 B-16.00 GiB (aarch64-apple-darwin compiled with Rust 1.81) prebuilt

@jqnatividad
Collaborator

Can you confirm if this still happens with 0.137.0?

Also, can you share some HW/OS info by running this command in a terminal?

system_profiler SPSoftwareDataType SPHardwareDataType

@tmtmtmtm
Contributor Author

Still getting it with 0.137 :(

Software:

    System Software Overview:

      System Version: macOS 15.0 (24A335)
      Kernel Version: Darwin 24.0.0
      Boot Volume: Macintosh HD
      Boot Mode: Normal
      Secure Virtual Memory: Enabled
      System Integrity Protection: Enabled
      Time since boot: 8 days, 7 hours, 3 minutes

Hardware:

    Hardware Overview:

      Model Name: MacBook Air
      Model Identifier: MacBookAir10,1
      Chip: Apple M1
      Total Number of Cores: 8 (4 performance and 4 efficiency)
      Memory: 16 GB
      System Firmware Version: 11881.1.1
      OS Loader Version: 11881.1.1

@jqnatividad
Collaborator

Have you tried compiling from source and using the locally compiled binary in your pipeline?

@tmtmtmtm
Contributor Author

@jqnatividad I finally managed to compile from source, and the same error happens with that build too.

@jqnatividad
Collaborator

I found an error in the get_delim_by_extension helper and have fixed it.

Can you give it a try, compiling from source?

jqnatividad reopened this Oct 31, 2024

@tmtmtmtm
Contributor Author

tmtmtmtm commented Nov 1, 2024

@jqnatividad unfortunately that hasn't fixed it :(

io error: snappy: corrupt input (expected stream header but got unexpected chunk type byte 105)
> qsv --version
qsv 0.137.0-mimalloc-apply;fetch;foreach;geocode;Luau 0.640;prompt;python-3.13.0 (main, Oct  7 2024, 05:02:14) [Clang 16.0.0 (clang-1600.0.26.3)];to;polars-0.43.1-;self_update-8-8;12.80 GiB-1.10 GiB-111.47 MiB-16.00 GiB (Unknown_target compiled with Rust 1.82) installed

@jqnatividad
Collaborator

@tmtmtmtm what about qsv 0.138.0?

@tmtmtmtm
Contributor Author

tmtmtmtm commented Nov 8, 2024

@jqnatividad 'fraid not :(

io error: snappy: corrupt input (expected stream header but got unexpected chunk type byte 105)
> qsv --version
qsv 0.138.0-mimalloc-apply;fetch;foreach;geocode;Luau 0.650;prompt;python-3.13.0 (main, Oct  7 2024, 05:02:14) [Clang 16.0.0 (clang-1600.0.26.3)];to;polars-0.44.2-9c08893;self_update-8-8;12.80 GiB-566.69 MiB-0 B-16.00 GiB (aarch64-apple-darwin compiled with Rust 1.82) compiled
