Speed up Parquet utf8 validation #6667

Closed
Dandandan opened this issue Oct 31, 2024 · 1 comment · Fixed by #6668
Labels
enhancement (any new improvement worthy of an entry in the changelog) · parquet (changes to the parquet crate) · performance

Comments

@Dandandan
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
UTF-8 validation shows up in profiles when reading Parquet.

Describe the solution you'd like

We could use simdutf8 (https://docs.rs/simdutf8/latest/simdutf8/) to speed up UTF-8 validation.
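A minimal sketch of what such a swap could look like, assuming the validation happens in one place on a raw byte buffer. The helper name `check_utf8` and the Cargo feature name `simdutf8` are hypothetical and used only for illustration; they are not taken from the parquet crate. `simdutf8::basic::from_utf8` takes a `&[u8]` and returns a `Result<&str, _>`, like `std::str::from_utf8`, so the two can sit behind the same function:

```rust
/// Hypothetical helper: returns the buffer as `&str` if it is valid UTF-8.
/// Default build: falls back to the standard library validator.
#[cfg(not(feature = "simdutf8"))]
pub fn check_utf8(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Same helper when the (assumed) `simdutf8` feature is enabled and the
/// `simdutf8` crate is listed as a dependency. The `basic` API reports
/// only success or failure, without the position of the invalid byte.
#[cfg(feature = "simdutf8")]
pub fn check_utf8(bytes: &[u8]) -> Option<&str> {
    simdutf8::basic::from_utf8(bytes).ok()
}
```

Callers would use `check_utf8(buffer)` in both configurations. If the error position is needed for error messages, `simdutf8::compat::from_utf8` is the variant that reports it, at some cost in speed.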

Describe alternatives you've considered

Additional context

@alamb
Contributor

alamb commented Nov 7, 2024

Here is another idea: #6701

2 participants