Implement easy way to see remote data type for each tibble field #2510

richierocks · 2017-03-08T18:52:35Z

Inspired by sparklyr/sparklyr#537: it would be useful to have a dplyr function that retrieves the type of each column.

For local tibbles, this is not so important, since there are already easy ways of doing that. However, for remote data sources, it becomes tricky. For example, with a tibble whose data is in a database table, you need to write something like this:

first_row <- a_db_tibble %>% head(1) %>% collect()
db_data_type(dbconn, first_row)

It would be useful to have a function (tbl_schema?) that returns the remote database/Spark data type of each field. To make it easy to work with programmatically, the return value should be a tibble with two columns: name and type.
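As a rough illustration of the idea, here is a minimal sketch built directly on DBI. It assumes a DBI connection and a SQL query string rather than a dplyr tbl, and it uses RSQLite with an in-memory copy of mtcars purely as a stand-in backend. `tbl_schema` is only the hypothetical name suggested above, not an existing dplyr function, and what the `type` column contains depends on the backend's `dbColumnInfo()` method.

```r
library(DBI)
library(tibble)

# Hypothetical helper: `tbl_schema` does not exist in dplyr; the name is
# taken from the suggestion above. It takes a DBI connection and a SQL string.
tbl_schema <- function(con, sql) {
  # Send the query but do not fetch any rows; only result metadata is needed
  res <- dbSendQuery(con, sql)
  on.exit(dbClearResult(res), add = TRUE)

  # dbColumnInfo() returns a data frame with at least `name` and `type`;
  # the meaning of `type` is backend-specific
  info <- dbColumnInfo(res)
  tibble(name = info$name, type = as.character(info$type))
}

# Stand-in backend for the example (assumption: RSQLite is installed)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars, row.names = FALSE)

schema <- tbl_schema(con, "SELECT * FROM mtcars")
print(schema)

dbDisconnect(con)
```

A dplyr-level version would need to render the tbl's SQL first and pick up the connection from the tbl, which is left out here.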

hadley · 2017-03-22T13:06:02Z

Since all remote backends are DBI-based, this is basically a duplicate of r-dbi/DBI#75.

krlmlr · 2017-03-22T13:30:24Z

Perhaps dplyr should just call dbColumnInfo() during collect(), and make this information available in an attribute?

hadley closed this as completed Mar 22, 2017

lock bot locked as resolved and limited conversation to collaborators Jun 8, 2018
