[Feature] Compression via Nvcomp #23

Closed
thomcom opened this issue Mar 9, 2022 · 3 comments

Comments

@thomcom
Contributor

thomcom commented Mar 9, 2022

Hi! We've been talking about adding nvcomp to kvikio. I'm looking to add Python bindings for the Snappy, Cascaded, and LZ4 algorithms from nvcomp. To do so, we'll need to add the Cython bindings nvcomp.pyx and nvcomp.pxd to kvikio/python/kvikio/_lib, plus a wrapper for them, nvcomp.py. Once this is done I'll write tests.

A CMake option, USE_NVCOMP, will be added and disabled by default; passing -DUSE_NVCOMP=True will enable it.

We're planning to use the nvcomp headers that cudf installs, and cudf itself can be installed via conda, right?

I'm also looking into adding kvikio as another library option for https://github.com/trxcllnt/rapids-compose, which will make maintenance and development quite easy.

@thomcom
Contributor Author

thomcom commented Mar 9, 2022

Looking at how cuDF installs nvcomp, it only installs it locally as a cudf build dependency. I'm not sure how to proceed: either kvikio needs to know about cuDF's internal workings or it'll need more explicit dependencies, I think. I'm checking whether the conda install of cudf ships nvcomp.h etc., but I doubt it.
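
For anyone who wants to repeat the header check above, here is a minimal sketch in Python. It assumes the active conda environment is exposed through the CONDA_PREFIX environment variable and that headers live under its include/ directory; the helper name find_nvcomp_headers is made up for illustration and is not part of kvikio, cudf, or nvcomp.

```python
import os
from pathlib import Path


def find_nvcomp_headers(prefix=None):
    """Return sorted paths of nvcomp-looking headers under <prefix>/include.

    `prefix` defaults to $CONDA_PREFIX (assumption: an activated conda env).
    A header counts as a hit if its file name starts with "nvcomp" or if it
    sits somewhere below a directory named "nvcomp".
    """
    prefix = prefix or os.environ.get("CONDA_PREFIX")
    if not prefix:
        return []
    include_dir = Path(prefix) / "include"
    if not include_dir.is_dir():
        return []

    hits = []
    for path in include_dir.rglob("*"):
        if not path.is_file() or path.suffix not in {".h", ".hpp"}:
            continue
        rel = path.relative_to(include_dir)
        if rel.name.startswith("nvcomp") or "nvcomp" in rel.parts[:-1]:
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    found = find_nvcomp_headers()
    if found:
        for header in found:
            print(header)
    else:
        print("no nvcomp headers found in this environment")
```

An empty result only says the headers are not under that include/ directory; it does not rule out nvcomp being vendored somewhere else in the build tree.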

@thomcom
Contributor Author

thomcom commented Mar 10, 2022

@madsbk I did find the nvcomp.h headers in the cudf conda build. That came after burning most of the day trying to run side-by-side rapids-compose builds, one of them for kvikio development; I had to abandon that route with little to show for it.

@madsbk
Member

madsbk commented Mar 29, 2022

Implemented in #24

@madsbk madsbk closed this as completed Mar 29, 2022
vuule pushed a commit to vuule/kvikio that referenced this issue Nov 8, 2023
…i#23)

Fixes: rapidsai/cudf#14218

This PR disallows assigning NaT to non-datetime/timedelta columns. Pandas allows this by changing the column to object dtype, which we cannot support.