-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document table function support #12338
Conversation
public Optional<TableFunctionApplicationResult<ConnectorTableHandle>> applyTableFunction(ConnectorSession session, ConnectorTableFunctionHandle handle) | ||
{ | ||
// get the ConnectorTableHandle and a list of ColumnHandles representing the table function result | ||
|
||
return Optional.of(new TableFunctionApplicationResult<>(tableHandle, columnHandles)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not copy code into docs to avoid going out of sync. Pointer to a method (interface method name to implement) should suffice and then that method javadoc should describe the technical details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, I can indeed remove the pasted code, as it doesn't add value to the docs.
However, I'd like to leave the other code samples (the constructor and analyze()), even at the risk of going out of sync, because they provide a convenient way of explaining things. Anyway, I hope I won't miss the docs update if the SPI changes in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass..
@mosabua thanks for your feedback. I split the document in two, one from the user's perspective, and one from the developer's perspective.
The changes are in a separate commit. |
=============== | ||
Table functions | ||
=============== | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need an intro sentence about table functions .. what is it? And why would I use it..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could just link to the user docs actually
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a bunch of clarifications to allow users to actually use this
=============== | ||
Table functions | ||
=============== | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could just link to the user docs actually
Table function declaration | ||
-------------------------- | ||
|
||
Trino supports adding custom table functions. They are declared by connectors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So ... thats kinda weird .. so users have to maintain a fork of the connector? It cant be a separate plugin that is somehow associated with a specific connector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't require a fork or a separate plugin. Any existing connector can expose table functions. Soon we'll have the query pass-through feature, which is a good examples of (existing) connectors registering polymorphic table functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I believe that we will be adding a lot of custom table functions to Trino. Of course, we will have to judge if the function is generally useful. If someone need a function that we find "controversial", they might have to fork.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay ... maybe we should say something like.. want another table function added to trino .. file a ticket or PR and contact us ... although thats kinda odd for the docs.. not sure at this stage how we should proceed .. if there are not currently usable table functions maybe we should merge..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we don't write things like "want another connector...file a ticket..." in the docs.
At the moment, we have a bunch of table functions in review. They should be in soon. When we merge them, we'll document them, and they will serve as a good example of what people have to do if they want a new table function in Trino.
Also, we will likely have some publications around table functions.
Trino supports adding custom table functions. They are declared by connectors | ||
through implementing dedicated interfaces. | ||
|
||
To declare a table function, you need to subclass ``ConnectorTableFunction``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably need to add a whole bunch more info about the mechanics .. like separate module or fork of specific connector, pom.xml, dependencies?, what about package structure and so on..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another sentence about passing the custom implementation through a Provider. It is all that the implementor must do. Also, this is a pretty common pattern of exposing things from connectors in Trino.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the user has to do the development .. and that is not the case it seems.. so it just goes into existing connectors .. we should mention that somehow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When thinking of it, I can identify at least four categories of table functions, and they will be added in different ways:
- functions that we will add to the existing connectors. The query pass-through capability is a good example, because it is specific to the connector: you pass different syntax to Elasticsearch than to Teradata.
- functions that users add to their own connectors (forked). Generally, these are functions too controversial or too specific to be included in Trino
- "global" functions, which will be built-in in Trino, but they don't belong to any specific connector. Example: the pivot capability. We will support them only when we have the dedicated Operator
- I think that having table function support might result in creating a whole new kind of connectors. Currently, we have connectors over different data sources. Now that we support adding table functions, it makes sense to connect to an outer system for its processing capabilities and not necessarily for the data. Such connector could expose table functions that give access to the outer system's abilities or resources, so that they can be used for processing data in Trino.
@mosabua could you help me with the wording? The problem is, the table functions can go both to the existing and new connectors. I can't predict what kind of use case will be dominant.
- returned row type | ||
|
||
In the example, the returned row type is ``GENERIC_TABLE``, which means that | ||
the row type is not known statically, and it will be determined dynamically |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and it is determined
|
||
In order to provide all the necessary information to the Trino engine, the | ||
class must implement the ``analyze()`` method. This method is called by the | ||
engine during the Analysis phase of query processing. The ``analyze()`` method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the analysis phase
|
||
The access control for table functions can be provided both on system and | ||
connector level. It is based on the fully qualified table function name, | ||
which consists of the catalog name, the schema name, and the function name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the syntax of ?
catalog.schema.function
I assume.. but spell that out..
the catalog. You can qualify the function name with a schema name, or with | ||
catalog and schema names:: | ||
|
||
SELECT * FROM schema_name.my_function(1, 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really.. I assume that only works if you specified the current catalog with USE
does that also work for catalog.schema ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we now support path resolution (#11476).
|
||
Otherwise, the standard Trino name resolution is applied. The connection | ||
between the function and the catalog must be identified, because the function | ||
is executed by the corresponding connector. If the function is not registered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is not registered ... that begs the question .. is there anything I have to do as a user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... that probably means the user misspelled something or they don't have the right catalog on the path. How could I word it?
within the SQL query. They can be used for working with external systems as | ||
well as for enhancing Trino with capabilities going beyond the SQL standard. | ||
|
||
Trino supports adding custom table functions. They are declared by connectors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to add info about what table functions are there for each connector .. even if that is none
and then be explicit about linking to the dev guide to figure out how to add any
|
||
There are two conventions of passing arguments to a table function: | ||
|
||
- arguments passed by name:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should start with uppercase .. and maybe bold it .. whatever you think looks good in the output
@mosabua I applied your comments as another fixup commit. Please take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
This is documentation for Table Functions #11336.
The feature is already in.