Implement easy way to see remote data type for each tibble field #2510

Closed
richierocks opened this issue Mar 8, 2017 · 2 comments

@richierocks

Inspired by sparklyr/sparklyr#537, I think it would be useful to have a dplyr function that retrieves the type of each column.

For local tibbles, this is not so important, since there are already easy ways of doing that. However, for remote data sources, it becomes tricky. For example, with a tibble whose data is in a database table, you need to write something like this:

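# Pull a single row into local memory so that there is something to inspect
first_row <- a_db_tibble %>% head(1) %>% collect()
# Ask the backend which SQL type it would use for each collected column
db_data_type(dbconn, first_row)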

It would be useful to have a function (tbl_schema?) that returns the remote database/Spark data type of each field. To make it easy to work with programmatically, the return value should be a tibble with two columns: name and type.
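
A minimal sketch of what such a helper could look like, assuming a DBI-backed tibble. The name tbl_schema is hypothetical and is not part of dplyr or dbplyr; the sketch relies on dbplyr::remote_con(), dbplyr::sql_render(), and DBI::dbColumnInfo(), and uses an in-memory RSQLite database purely for illustration. What the type column actually contains is backend-specific, so it may not be the declared remote type on every driver.

```r
# Hypothetical helper (not part of dplyr/dbplyr): describe the columns of a
# remote tibble without collecting any rows.
tbl_schema <- function(x) {
  con <- dbplyr::remote_con(x)
  # Send the lazy query to the database; no rows are fetched here.
  res <- DBI::dbSendQuery(con, dbplyr::sql_render(x))
  on.exit(DBI::dbClearResult(res), add = TRUE)
  # dbColumnInfo() returns a data frame with at least `name` and `type`.
  info <- DBI::dbColumnInfo(res)
  tibble::tibble(name = info$name, type = as.character(info$type))
}

# Usage with an in-memory SQLite table (illustrative data only)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "cars", head(mtcars))
print(tbl_schema(dplyr::tbl(con, "cars")))
DBI::dbDisconnect(con)
```

Because the helper renders the lazy query rather than reading table metadata, it should also work on a tibble that has already been filtered or mutated, at the cost of the database having to prepare that query.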

@hadley
Member

hadley commented Mar 22, 2017

Since all remote backends are DBI-based, this is basically a duplicate of r-dbi/DBI#75.

@hadley hadley closed this as completed Mar 22, 2017
@krlmlr
Member

krlmlr commented Mar 22, 2017

Perhaps dplyr should just call dbColumnInfo() during collect(), and make this information available in an attribute?
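
As a rough illustration of that idea, here is a sketch of a collect-like wrapper that stores the dbColumnInfo() result in an attribute. The function name collect_with_types and the attribute name "column_info" are both made up for this example; this is not how dplyr's collect() is implemented, and the in-memory RSQLite database is only there to make the sketch runnable.

```r
# Hypothetical wrapper (not dplyr's collect()): fetch the rows and keep the
# backend's column metadata alongside them.
collect_with_types <- function(x, n = -1) {
  con <- dbplyr::remote_con(x)
  res <- DBI::dbSendQuery(con, dbplyr::sql_render(x))
  on.exit(DBI::dbClearResult(res), add = TRUE)
  # Read the column metadata from the open result, then fetch the rows.
  info <- DBI::dbColumnInfo(res)
  out <- tibble::as_tibble(DBI::dbFetch(res, n = n))
  # "column_info" is an attribute name invented for this sketch.
  attr(out, "column_info") <- info
  out
}

# Usage with an in-memory SQLite table (illustrative data only)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "cars", head(mtcars))
local_cars <- collect_with_types(dplyr::tbl(con, "cars"))
print(attr(local_cars, "column_info"))
DBI::dbDisconnect(con)
```

One caveat with the attribute approach: custom attributes are not guaranteed to survive later transformations of the collected tibble, so the metadata is best read right after collecting.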

@lock lock bot locked as resolved and limited conversation to collaborators Jun 8, 2018