Query optimization #333

tomwhite · 2023-12-12T16:08:40Z

There are a couple of possibilities for doing query optimization that have come up recently.

Dask-expr will support arrays soon (dask/dask-expr#446). It would be interesting to see if the expression system can be used in Cubed, and if there are any changes we'd need to contribute back.

egglog "is a Python package that provides bindings to the Rust library egglog, allowing you to use e-graphs in Python for optimization". Interestingly, it has a prototype of the Array API, which might make it a good candidate for providing query optimization for Cubed. This tutorial has an example of using the Array API implementation to optimize a scikit-learn function. (@saulshanabrook told us about egglog at yesterday's Pangeo Distributed Computing Working Group.)
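As a rough illustration of what "optimization by rewriting" means here, the sketch below applies two algebraic rewrite rules to a tiny expression tree. It is plain Python and does not use the egglog or dask-expr APIs; the names `Var`, `Const`, `Add`, `Mul` and `simplify` are hypothetical and exist only for this example. Note that it rewrites destructively, whereas an e-graph keeps all equivalent forms side by side and extracts the cheapest one at the end.

```python
from dataclasses import dataclass


# Hypothetical toy expression nodes (not part of Cubed, egglog or dask-expr).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Mul:
    left: object
    right: object


def simplify(expr):
    """Rewrite bottom-up using two rules: x + 0 -> x and x * 1 -> x."""
    if not isinstance(expr, (Add, Mul)):
        return expr  # leaves (Var, Const) are already in simplest form

    # Simplify the children first so the rules see normalized operands.
    left, right = simplify(expr.left), simplify(expr.right)

    # The identity element is 0 for addition and 1 for multiplication.
    identity = Const(0) if isinstance(expr, Add) else Const(1)
    if right == identity:
        return left
    if left == identity:
        return right
    return type(expr)(left, right)


x = Var("x")
optimized = simplify(Add(Mul(x, Const(1)), Const(0)))  # (x * 1) + 0
```

With these two rules, `(x * 1) + 0` collapses to just `x`. Real array rewrites would also have to respect shapes, dtypes and broadcasting, which this toy ignores.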

dcherian · 2023-12-17T02:01:55Z

Ah wish I could've made the discussion. egglog looks cool!

saulshanabrook · 2023-12-17T19:27:43Z

I would be happy to do another call to get into more specifics of trying to implement the types of rewrites cubed needs with egglog. I did something similar a few weeks ago around the PyTensor project, using a concrete example to drive some exploration of how it could be implemented: https://egglog-python.readthedocs.io/latest/explanation/2023_11_17_pytensor.html

EDIT: Both the Python bindings and the upstream Rust library are in active development, so it's definitely useful to see if anything can be improved to support this kind of use case.

tomwhite · 2023-12-19T09:54:09Z

@saulshanabrook that would be awesome - thanks for the offer!

tomwhite added the help wanted, array api, and optimization labels Dec 12, 2023

tomwhite mentioned this issue Dec 19, 2023

Optimization tracking issue #339

Open

20 tasks
