Add placeholders to define Tables interface #11
Comments
That would indeed be useful for JuliaStats/StatsBase.jl#527.
Hmmm... Since JuliaData/Tables.jl#82 was closed, Tables.jl no longer uses Requires.jl itself, which has drastically improved Tables.jl load times (consistently around 0.06s on my laptop, with ~0.01s coming from dependency loading and the remaining ~0.05s from the Tables.jl definitions themselves). Is that really too heavy?

I can certainly understand the "separation of concerns" argument, for which I believe the right answer is waiting on JuliaLang/Pkg.jl#1285, which would allow the proper separation of "glue" code into a separate glue module. I do agree with the sink concern: namely that separating the API makes it easier for packages to be sources, but not sinks (since the

We also already have the property that an object can be a "table" without explicitly depending on Tables.jl, via iterating property-accessible objects; but this isn't supported for columns.

All in all, I think packages should just decide whether they want to take the Tables.jl dependency or not, or wait for proper glue package support.
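To illustrate the "iterating property-accessible objects" point, here is a minimal sketch in plain Julia with no Tables.jl dependency. The `Measurement` type and the `column` helper are hypothetical names made up for this example; they are not part of Tables.jl or any package mentioned in this thread.

```julia
# Hypothetical row type owned by a package that does not depend on Tables.jl.
struct Measurement
    id::Int
    value::Float64
end

# Any iterator whose elements support `getproperty` can be read row by row.
rows = [Measurement(1, 0.5), Measurement(2, 1.5)]

# Hypothetical helper (an assumption, not a Tables.jl function):
# gather a single column by reading one property from every row.
column(rows, name::Symbol) = [getproperty(row, name) for row in rows]

ids = column(rows, :id)        # reads the `id` field of each row
vals = column(rows, :value)    # reads the `value` field of each row
```

The same helper works on a vector of `NamedTuple`s, since those are property-accessible too. Nothing comparable is shown here for column-oriented objects, which matches the limitation mentioned above.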
Could query operators be defined here as well and Tables just implement their efficient methods while having a fallback? For example,
The problem is that packages don't really agree on the API of most of these functions. Though it would probably be useful to file an issue for each to discuss that.
Each package would be free to define its own API, but they could also implement a common one. I think it should cover those in the Query.jl style (Julia / LINQ style, Tidyverse, maditr) like... Should we first define which operations we want to support and then the API design?
At least we would need packages to agree on a common API to ensure no ambiguities or incompatibilities happen.
We would need to agree that at least one positional argument, at a specified position, is restricted to a type defined in the package that adds the method.
Why would that be the case?
DataFrames would then add,
you should define bare The problem with your definition is that some packages might accept something else than In one package:
In the other package:
and you are toasted. That is what, if I understand this correctly, @nalimilan meant by common API. A minimal requirement is to be specific on a single positional argument with a fixed position. The problem in your case is that using
Aye. The API design would have to be drafted similarly to, for example, the Abstraction for Statistical Models in StatsBase.jl.
The benefit of defining the API, rather than just reserving names, is that users get a universal front-end for querying tables while each tabular representation keeps using its efficient, struct-specific internals. Packages can still provide their own flavor for interacting with their structs.
Yes, that's not type piracy as long as packages only add methods with the first argument being of a type they own. But for that function to be generically usable, we should define at least some common signatures that are expected to work (e.g. symbol varargs).
This is exactly what I have postulated.
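The agreement reached in the last few comments can be sketched roughly as follows. Everything here is hypothetical: `select`, `FrameA`, and `FrameB` are illustrative names, and the symbol-varargs signature is just the example mentioned above, not an API that any package has committed to.

```julia
# In a hypothetical shared interface package: the function exists but has no methods.
function select end

# In hypothetical package A: the first positional argument is a type the package owns.
struct FrameA
    cols::Dict{Symbol,Vector{Int}}
end
select(t::FrameA, names::Symbol...) =
    FrameA(Dict(n => t.cols[n] for n in names))

# In hypothetical package B: same common signature, different owned type.
struct FrameB
    cols::NamedTuple
end
select(t::FrameB, names::Symbol...) =
    FrameB(NamedTuple{names}(t.cols))   # keep only the requested fields

a = select(FrameA(Dict(:x => [1, 2], :y => [3, 4])), :x)
b = select(FrameB((x = [1, 2], y = [3, 4])), :y)
```

Because each method is restricted to a type owned by the package that defines it, the two definitions cannot collide. If both packages had instead defined `select(t, names::Symbol...)` with an untyped first argument, the second definition would replace the first, which is the failure mode described earlier in the thread.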
Some potential features,
Just as a comment from DataFrames.jl:
This will be handled as a part of
Aye. SQL-ish and parsimonious.
I was wondering whether we could consider adding the minimal placeholders to implement the Tables interface here. In particular:
The idea is that this way it is possible to be a Table "source" without depending on Tables. For example, I would like StructArrays to keep being a Tables source but there were concerns both on having it depend on Tables and on using Requires, which is the current solution.
This still makes it harder to be a Table sink without depending on Tables but that would be weird: one should not have a default fallback that depends on a package that maybe was not loaded.
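As a rough sketch of what such placeholders might look like: the function names below mirror functions that exist in Tables.jl (`istable`, `rows`, `columns`), but which functions the proposal actually listed is an assumption, and `ColumnStore` is a hypothetical source type invented for this example.

```julia
# In the hypothetical lightweight interface package: empty generic functions only.
function istable end
function rows end
function columns end

# In a hypothetical source package that depends only on the placeholders.
struct ColumnStore
    cols::NamedTuple
end

# The source opts in by adding methods for the type it owns.
istable(::Type{ColumnStore}) = true
columns(s::ColumnStore) = s.cols

store = ColumnStore((a = [1, 2], b = ["x", "y"]))
```

A sink would still need fallback behavior for types that implement none of these methods, which is the part that remains awkward without depending on Tables.jl itself.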