Allow customisation of field types before sending to datapusher #175

Open
rossjones opened this issue Apr 22, 2016 · 2 comments

Comments

@rossjones
Contributor

Using something like https://github.com/timwis/csv-schema, an in-browser, JS-only tool for analysing and guessing the types of CSV columns, might help datapusher import data into datastore.
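As a rough illustration of the idea (not how csv-schema or datapusher actually work), here is a minimal Python sketch that guesses a type per CSV column and then lets the user override the guess before anything is sent on. The function names `guess_type` and `guess_schema`, the `overrides` parameter, and the three-type vocabulary are all assumptions made for this example.

```python
import csv
import io


def guess_type(values):
    """Return 'integer', 'number', or 'string' for a list of cell strings."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "string"
    # Try the narrowest type first; fall through to the next on failure.
    for name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def guess_schema(csv_text, overrides=None):
    """Guess a type per column, then apply user-supplied overrides."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    fields = []
    for i, name in enumerate(header):
        column = [row[i] for row in body if i < len(row)]
        fields.append({"name": name, "type": guess_type(column)})
    # Hypothetical customisation step: the user corrects wrong guesses.
    for field in fields:
        if overrides and field["name"] in overrides:
            field["type"] = overrides[field["name"]]
    return {"fields": fields}


sample = "id,postcode,price\n1,01234,9.50\n2,05678,12\n"
print(guess_schema(sample))
print(guess_schema(sample, overrides={"postcode": "string"}))
```

The `postcode` column shows why customisation matters: a naive guesser sees only digits and picks an integer type, which would drop the leading zeros on import unless the user can correct it first.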

@pwalsh
Member

pwalsh commented Apr 22, 2016

At Open Knowledge, we are working on lots of tooling around Data Package and JSON Table Schema as part of the Frictionless Data project.

We have libraries that do schema inference in Python and JavaScript, along with a bunch of other things related to schemas. We already have a few live apps that analyse and guess types in both Python and JavaScript, including DataPackagist and the new OpenSpending Packager.

We've started integrating this ecosystem of tools into CKAN with https://github.com/ckan/ckanext-datapackager as a first step.

We've also written some nice packages to leverage JSON Table Schema and make import and export flows with data storage backends and tabular data formats (CSV, Excel, JSON) seamless.

Interfaces for SQL and BigQuery are done, and more are planned (Mongo, etc.).
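To make the storage side concrete, here is a small sketch of how a JSON Table Schema style descriptor could drive table creation in a SQL backend. It uses Python's standard `sqlite3` module only; it is not the API of the packages mentioned above, and the `SQLITE_TYPES` mapping and `create_table_sql` helper are hypothetical names introduced for this example.

```python
import sqlite3

# Assumed mapping from schema field types to SQLite column types.
SQLITE_TYPES = {
    "integer": "INTEGER",
    "number": "REAL",
    "boolean": "INTEGER",
    "string": "TEXT",
}


def quote_ident(name):
    """Quote an identifier for SQLite, escaping embedded double quotes."""
    return '"{}"'.format(name.replace('"', '""'))


def create_table_sql(table, schema):
    """Build a CREATE TABLE statement from a {'fields': [...]} descriptor."""
    columns = ", ".join(
        "{} {}".format(
            quote_ident(field["name"]),
            # Unknown or missing types fall back to TEXT.
            SQLITE_TYPES.get(field.get("type", "string"), "TEXT"),
        )
        for field in schema["fields"]
    )
    return "CREATE TABLE {} ({})".format(quote_ident(table), columns)


schema = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "postcode", "type": "string"},
        {"name": "price", "type": "number"},
    ]
}

conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("resource", schema))
conn.executemany(
    'INSERT INTO "resource" VALUES (?, ?, ?)',
    [(1, "01234", 9.5), (2, "05678", 12.0)],
)
rows = conn.execute('SELECT postcode, price FROM "resource" ORDER BY id').fetchall()
print(rows)
```

Because the column types come from the schema rather than from a guess at load time, a field declared as `string` keeps values such as `01234` intact.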

The next logical step in terms of CKAN integration, from my perspective, is to use all of the above to greatly improve both the import/validation pipelines of data into CKAN, and, crucially, to radically improve the datastore.

CKAN integration is definitely part of the Frictionless Data roadmap, and it would be great to work with the wider community on CKAN/Frictionless Data integration, to solve issues like this in datapusher/datastore in a robust way.

@rufuspollock
Member

+1 to @pwalsh. I'd also flag my longish comments in this earlier issue about datapusher, where I suggested "Connect / Reuse Frictionless Data and Data Package": #150 (comment)
