Python API #395

Open
domoritz opened this issue May 23, 2024 · 11 comments
Labels
projects Project ideas for Mosaic

Comments

@domoritz
Member

domoritz commented May 23, 2024

Design and develop a Python API for generating Mosaic specifications. Similar in spirit to Vega-Altair, the Python API should enable programmatic construction of a Mosaic JSON specification and enable easy operation with data from Pandas, Polars, DuckDB, etc.

@domoritz domoritz added the projects Project ideas for Mosaic label May 23, 2024
@domoritz
Member Author

domoritz commented May 23, 2024

Since #358, Mosaic has a JSON schema for the spec format (e.g. https://github.com/uwdata/mosaic/blob/main/docs/public/schema/v0.10.0.json), so we should be able to use https://github.com/vega/altair/tree/main/tools/schemapi to generate the Python API. The idea is to build something similar to https://github.com/vega/altair/blob/main/tools/generate_schema_wrapper.py but Mosaic-specific. The Mosaic schema is pretty large, so I expect we will need to change some things in schemapi, but what exactly remains to be seen.
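A minimal sketch of the schema-to-Python idea, using only the standard library. This is not how schemapi works internally: the schema fragment, the `generate_class` helper, and the `Dot` definition are hypothetical stand-ins, and the toy generator ignores JSON Schema constructs such as `$ref` and `anyOf` that a real generator would have to handle.

```python
import json

# Hypothetical, heavily simplified schema fragment (an assumption for
# illustration; it is not copied from the Mosaic schema).
SCHEMA = json.loads("""
{
  "definitions": {
    "Dot": {
      "type": "object",
      "properties": {
        "mark": {"const": "dot"},
        "x": {"type": "string"},
        "r": {"type": "number"}
      },
      "required": ["mark", "x"]
    }
  }
}
""")

# Mapping from JSON Schema primitive types to Python annotations.
PY_TYPES = {"string": "str", "number": "float", "integer": "int", "boolean": "bool"}


def generate_class(name, definition):
    """Return Python source for a dataclass mirroring one object definition."""
    required = set(definition.get("required", []))
    needs_value, has_default = [], []
    for prop, sub in definition.get("properties", {}).items():
        if "const" in sub:
            # A constant becomes a field with a fixed default value.
            const = sub["const"]
            has_default.append(f"    {prop}: {type(const).__name__} = {const!r}")
            continue
        py_type = PY_TYPES.get(sub.get("type"), "Any")
        if prop in required:
            needs_value.append(f"    {prop}: {py_type}")
        else:
            has_default.append(f"    {prop}: Optional[{py_type}] = None")
    # Fields without defaults must come first in a dataclass.
    body = needs_value + has_default or ["    pass"]
    return "\n".join(["@dataclass", f"class {name}:", *body])


source = "from dataclasses import dataclass\nfrom typing import Any, Optional\n\n"
source += "\n\n".join(
    generate_class(name, definition)
    for name, definition in SCHEMA["definitions"].items()
)

# Execute the generated source to check that it is valid Python.
namespace = {}
exec(source, namespace)
Dot = namespace["Dot"]
dot = Dot(x="weight", r=3.0)
```

In practice the generated source would be written to a module and committed or built, rather than executed on the fly.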

@aeroaks

aeroaks commented Aug 26, 2024

Hi,
Really amazed by the performance of Mosaic. I can't wait to use it in my daily work to visualise those heavy datasets.

I looked through the documentation and GitHub, and it looks like the best way to use Mosaic from Python is currently the Jupyter widget. Is that correct?

@domoritz
Member Author

Yes. I'm planning to work on a Python API for vgplot this semester.

@aeroaks

aeroaks commented Aug 26, 2024

Great, count me in if you need testing support. :)

@domoritz
Member Author

domoritz commented Oct 2, 2024

https://pypi.org/project/gosling/ by @manzt looks neat as well and can be an inspiration. I like how he used Altair as a submodule to get schemapi (which probably should have another release). See https://github.com/gosling-lang/gos/tree/main/tools

@manzt
Collaborator

manzt commented Oct 2, 2024

Thanks for the ping.

If I were starting again, I'd seriously consider using msgspec from @jcrist to build the API layer upon. The separation of encoding/decoding from the base classes is really desirable and makes it so the API layer can be cleaner IMO. Both pydantic and schemapi do not separate the two.

However, in this case I assume you'd like to generate the Python API from some JSON schema. msgspec goes in the opposite direction, generating JSON Schema from Python types (https://jcristharif.com/msgspec/jsonschema.html), but I wonder whether, with the relatively simpler Mosaic JSON schema, it would be easier to generate the Python API from the JSON schema.

msgspec also supports other encoders, so you could reuse the base classes to generate YAML as well.
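To illustrate what separating encoding from the base classes can look like, here is a small standard-library sketch. It uses plain dataclasses rather than msgspec, and the `Mark` and `Spec` classes and the `to_builtins` and `encode_json` helpers are hypothetical names made up for this example. The types carry no serialization logic; encoders are ordinary functions that take them as input.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


# Pure data definitions: no to_json/to_dict methods on the classes.
@dataclass
class Mark:
    mark: str
    x: Optional[str] = None
    y: Optional[str] = None


@dataclass
class Spec:
    plot: List[Mark] = field(default_factory=list)
    width: Optional[int] = None


def to_builtins(obj):
    """Convert a dataclass tree to plain dicts and lists, dropping None fields."""

    def strip(value):
        if isinstance(value, dict):
            return {k: strip(v) for k, v in value.items() if v is not None}
        if isinstance(value, list):
            return [strip(v) for v in value]
        return value

    return strip(asdict(obj))


def encode_json(obj):
    """One encoder among possibly many; all of them share to_builtins."""
    return json.dumps(to_builtins(obj), sort_keys=True)


spec = Spec(plot=[Mark(mark="lineY", x="Date", y="Close")], width=680)
encoded = encode_json(spec)
```

A YAML encoder could be added as another function over the output of `to_builtins` without touching `Mark` or `Spec`. msgspec provides the same kind of split through functions such as `msgspec.json.encode` and `msgspec.yaml.encode`.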

@manzt
Collaborator

manzt commented Oct 2, 2024

Full disclosure: I've been wanting to move toward msgspec for how widget developers define widgets as well. (It will be some time before I can really experiment with those changes, as I've been busy wrapping up the PhD.)

@domoritz
Member Author

domoritz commented Oct 2, 2024

Thanks for the pointer! We should explore that.

> but I wonder whether, with the relatively simpler Mosaic JSON schema, it would be easier to generate the Python API from the JSON schema

What makes the Mosaic schema simpler? I would like to avoid having to maintain both a Python and a TypeScript version of the spec schema. If we can have it in one place and convert to either, that would be best.

@manzt
Collaborator

manzt commented Oct 2, 2024

> What makes the Mosaic schema simpler?

Sorry, I haven't taken a close look at the spec package... but the thing that makes spec generation "hard" IMO in Altair/Gosling is that the TS source makes use of sophisticated TypeScript features like generics and extending base types.

From a high-level look at Mosaic, the types seem to have less inheritance and don't use these features. I believe that would make them easier to map to JSON (and simple language bindings). Could be wrong!

That said, since the TS types aren't really used in the Mosaic core code base (they appear to be mostly for generating the JSON spec), maybe it would make sense to author the types in Python for end users on the Python side, and then generate the JSON specs that way.

Totally agree that maintaining both Python and TypeScript types is a huge pain.

@domoritz
Member Author

domoritz commented Oct 2, 2024

I don't care too much what the source of truth is, but I'd like both TS and Python types. The TS types are being used.

@manzt
Collaborator

manzt commented Oct 2, 2024

> The TS types are being used.

Ah sorry, I wasn't aware. I'd need to explore more to find what's out there. Ideally you could generate typed Python dataclass-like things (e.g., msgspec.Struct) from something like TS or the JSON schema.

Then the API could be a set of functions that operate on those Python types.
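A rough sketch of that shape, under stated assumptions: frozen dataclasses stand in for generated types such as `msgspec.Struct`, and the names `LineY`, `Plot`, `line_y`, `plot`, `resize`, and `to_spec` are all hypothetical. The output layout (`plot`, `mark`, `data.from`, `width`, `height`) is an assumption about the spec format and would need to be checked against the published JSON schema.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


# Stand-ins for types that would be generated from the schema.
@dataclass(frozen=True)
class LineY:
    table: str
    x: str
    y: str
    stroke: Optional[str] = None


@dataclass(frozen=True)
class Plot:
    marks: Tuple[LineY, ...] = ()
    width: Optional[int] = None
    height: Optional[int] = None


# The user-facing API is a set of plain functions over those types.
def line_y(table, x, y, stroke=None):
    return LineY(table=table, x=x, y=y, stroke=stroke)


def plot(*marks, width=None, height=None):
    return Plot(marks=tuple(marks), width=width, height=height)


def resize(p, width, height):
    """Return a new Plot; the input is immutable and left unchanged."""
    return replace(p, width=width, height=height)


def to_spec(p):
    """Build a JSON-ready dict (field layout is an assumption, see above)."""
    spec = {"plot": []}
    for m in p.marks:
        entry = {"mark": "lineY", "data": {"from": m.table}, "x": m.x, "y": m.y}
        if m.stroke is not None:
            entry["stroke"] = m.stroke
        spec["plot"].append(entry)
    if p.width is not None:
        spec["width"] = p.width
    if p.height is not None:
        spec["height"] = p.height
    return spec


base = plot(line_y("aapl", "Date", "Close"))
chart = resize(base, 680, 200)
spec = to_spec(chart)
```

Keeping the types immutable means every function returns a new value, which makes chained transformations easy to reason about and test.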
