Guidelines to handle schemes and views #24
Comments
I think everything can be tested in a generic fashion in the DBItest package |
Need schema parameter in |
dbBuildIdentifier() or dbQuoteIdentifier()? Double-check if anyone uses it, probably not (at least not dplyr) --> then it's dbQuoteIdentifier() with character vector |
On the one hand I fully support your request, but on the other this will be a backward-incompatible change in the API. That is against CRAN guidelines, and since DBI is quite widely used it is a really serious thing. Although I can imagine solutions that would not break backward compatibility (e.g. passing the schemas vector as an attribute of the returned vector), they would be rather inconvenient for the user. Taking this into account, maybe it would be better to introduce a new function and mark dbListTables() as deprecated... |
@zozlak how will it be a backward incompatible change? (And backward incompatible changes aren't against CRAN policy as long as you give everyone adequate notice about the change) |
From the policies:
|
If I want to keep the current return type (a character vector, one element per table), then I should either join the schema name and table name (probably properly escaped) or do some strange things (like assigning schema names to an attribute).
And anyway, if we agree that other functions (dbReadTable(), etc.) should take the schema and table name separately (e.g. as a two-element character vector) so that they can be escaped properly, then I would expect dbListTables() to return the schema and table name separately too. Returning them separately in a natural way would mean returning a character matrix (or data frame) with the schema and table name in separate columns, but this would change the return type of dbListTables() (and so would be backward incompatible). Or am I missing something? |
I think dbListTables() should keep returning a plain character vector with unquoted names. The schema should be specified as input parameter. |
@kiril I like your idea. It will make the change totally transparent for those who don't care about schemes and at the same time allow those who need schemes to use them. Then we would also need dbListSchemes()
|
What should |
Currently, the design for schema support looks like this:
|
Another option: |
Next iteration of the design:
Example: dbQuoteIdentifier(con, "name", namespace = c("database", "schema"))
## <SQL> "database"."schema"."name"
dbReadTable(con, name, namespace = c("database", "schema"))
## ...
dbListNamespaces(con)$namespaces
[[1]]
list()
[[2]]
list(namespace = c("database", "schema"))
...
dbListTables(con, name, namespace = c("database", "schema")) |
OK. These would all work for my wish list. tbl(pg, dbId("some_schema", "some_table"))
dbListTables(dbId("some_schema"))
compute(dbId("some_schema", "some_table")) (Actually, |
Thanks for your input. I'm still confident that we can make do without a The approach I'm anticipating is described in #24 (comment), I'm repeating it below (slightly modified). We will not be able to decompose fully qualified table names into their components (cluster, database, schema, ...), but I'd argue we don't need to. I think we need two components:
For the first, we already have the For the second, I'd like to define a generic |
Schema support is vital for working with big/enterprise databases. In terms of what to expose as metadata, it'd be nifty if you could support the range of concepts exposed by information_schema. information_schema is an ANSI-compliant metadata schema that can be leveraged to get consistent results across databases.
|
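One way such an information_schema lookup might look from R, as a minimal sketch rather than anything proposed in this thread: the helper name `tables_in_schema_sql()` and the schema name `"public"` are made up for illustration, and not every database provides `information_schema` (SQLite, for example, does not). The `ANSI()` dummy connection is used so the query can be built without a live database.

```r
library(DBI)

# Hypothetical helper (not part of DBI): builds a query that lists the
# tables and views of one schema via the information_schema.tables view.
tables_in_schema_sql <- function(con, schema) {
  sqlInterpolate(
    con,
    "SELECT table_schema, table_name, table_type
       FROM information_schema.tables
      WHERE table_schema = ?schema",
    schema = schema  # quoted as a string literal by sqlInterpolate()
  )
}

# Build the query against the ANSI() dummy connection and inspect it.
query <- tables_in_schema_sql(ANSI(), "public")
print(query)

# With a live connection `con`, the same query could be run with:
# dbGetQuery(con, tables_in_schema_sql(con, "public"))
```

Because views appear in `information_schema.tables` alongside base tables (distinguished by `table_type`), a lookup of this kind would cover both.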
The example provided by @stephlocke makes me think about the approach proposed by @krlmlr . By the way while reading discussion about the
While the first approach provides users with short and intuitive syntax only the latter one will work with the |
I just implemented a simple version of this for odbc, https://github.com/rstats-db/odbc/compare/SQLTable?expand=1#diff-ada2ba640dbb79b8d3a115369830f718. It would be nice to settle on something in DBI as well, as this generic should really be defined there. I think reversing the order of the arguments like I did is pretty important; i.e. the highest level argument should be the furthest right. This approach would also allow the generic to skip names for Also I think you need to pass the connection object to |
- Schema support: Export `Id()`, new generics `dbListObjects()` and `dbUnquoteIdentifier()`, methods for `Id` that call `dbQuoteIdentifier()` and then forward (#220).
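As a rough illustration of the `Id()` approach named in the entry above, the following sketch quotes and then unquotes a schema-qualified name. It assumes a DBI version that exports `Id()` and `dbUnquoteIdentifier()`, and it uses the `ANSI()` dummy connection so no database is needed. The schema and table names are placeholders, and a real backend may quote differently.

```r
library(DBI)

# ANSI() is a dummy connection that applies ANSI SQL quoting rules;
# a real backend (e.g. one talking to SQL Server) may quote differently.
con <- ANSI()

# Placeholder names: schema "dbo", table "mytable".
id <- Id(schema = "dbo", table = "mytable")

# Each component is quoted separately and the parts are joined with a dot.
quoted <- dbQuoteIdentifier(con, id)
print(quoted)

# dbUnquoteIdentifier() goes the other way and returns a list of Id objects.
ids <- dbUnquoteIdentifier(con, quoted)
print(ids[[1]])
```

With a live connection, the same `Id` object could be passed as the `name` argument of functions such as `dbWriteTable()`, provided the backend implements the corresponding methods.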
Hello, thank you for the recent addition. Support for schema is great. Using DBI 0.7.15 from github and trying to send to dbo.mytable using:
Is this the correct usage of thank you |
That looks like the right usage to me. I'm also having this same problem with
returns
|
Commenting on a closed issue is not a good way to get help. You probably need to file an issue with whatever package you are using to connect to SQL server. |
Thanks. |
Oh, I'm confused, it's |
Thanks, I'll move this to |
I raise my issue here because I think it would be valuable to decide such things at the "top level", so that implementations of the DBI API can be compliant with one another and provide users with a really unified API.
A wide variety of modern database systems support:
but there are no official guidelines how to handle that in packages implementing the DBI API.
There is a notice in the dbGetTables() description that "This should, where possible, include temporary tables.". Is it possible to add a similar notice regarding views? And maybe also in the dbReadTable() description?
I am asking about it because views play a vital role for both:
and omitting them in DBI API implementations dramatically reduces their functionality.
This is even more painful as dplyr uses DBI database connectivity backends.
The schemes issue is more complicated. DBI wanted a table to be identified simply by a name. For database systems which support multiple schemes per database, a clarification is needed:
I checked RPostgreSQL, RMySQL and ROracle and they deal with schemes in different ways:
And checking other DBI API implementations would probably make this list longer.
This inconsistency limits the usage of the DBI API when it comes to handling schemes.