Import .CSV files with custom cycler definition metadata file #97
Comments
If needed, I have Arbin .CSV files that can be used for testing.
On the parser side, this would involve implementing a new parser that uses the JSON file to map columns in the CSV to our standard columns. The (perhaps bigger) piece of work would be to allow users to supply this JSON file. Perhaps this would be a field in the harvester, @mjaquiery? We'd also need to determine the format of this JSON, but it seems like it would just be a dictionary that maps the CSV's column names to our standard column names. In that case, we'd have to tell users that they need to provide a CSV whose first row contains the header names.
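A minimal sketch of that mapping idea, assuming the JSON really is a flat dictionary from CSV header names to standard column names. The header names, the standard names, and the `read_mapped_rows` helper are all hypothetical placeholders, not Galv's actual schema or parser API.

```python
import csv
import io
import json

# Hypothetical mapping file content: CSV header name -> standard column name.
MAPPING_JSON = """
{
  "Test_Time(s)": "ElapsedTime_s",
  "Voltage(V)": "Voltage_V",
  "Current(A)": "Current_A"
}
"""


def read_mapped_rows(csv_text, mapping):
    """Read a CSV whose first row is a header and rename the mapped columns.

    Columns that do not appear in the mapping are dropped. A mapped column
    that is missing from the CSV header raises ValueError.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in mapping if name not in header]
    if missing:
        raise ValueError(f"CSV is missing mapped columns: {missing}")
    return [
        {standard: row[source] for source, standard in mapping.items()}
        for row in reader
    ]


# Example input with one extra column ("Aux") that the mapping ignores.
csv_text = (
    "Test_Time(s),Voltage(V),Current(A),Aux\n"
    "0.0,3.70,0.50,x\n"
    "1.0,3.71,0.50,y\n"
)
rows = read_mapped_rows(csv_text, json.loads(MAPPING_JSON))
```

Values stay as strings here; converting them to numbers or timestamps is a separate step that depends on how data types end up being declared.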
I'd suggest we have two header rows, one with column names and one with data type.
Hmm, can we infer the data type from the data itself? We could then compare the inferred type against an approved list (or a single allowed value) and throw an error on a mismatch. I think asking an average user to define data types is probably too much. The easiest route for end users is to provide CSV files with a corresponding cycler definition structure. This could be in JSON format (we could provide some examples for users to copy).
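As a rough sketch of inferring a type and checking it against an approved list: the code below tries `int`, then `float`, and falls back to `str`. The function names and the `expected` structure are assumptions for illustration only.

```python
def infer_column_type(values):
    """Return 'int', 'float', or 'str' for a column of raw CSV strings."""

    def all_parse(cast):
        # True if every value in the column can be converted with `cast`.
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    return "str"


def check_types(columns, expected):
    """Compare inferred types with the approved types for each column.

    `columns` maps column name -> list of raw strings.
    `expected` maps column name -> set of approved type names.
    Returns a list of human-readable error messages (empty if all match).
    """
    errors = []
    for name, allowed in expected.items():
        found = infer_column_type(columns[name])
        if found not in allowed:
            errors.append(
                f"{name}: inferred {found}, expected one of {sorted(allowed)}"
            )
    return errors


columns = {
    "Voltage_V": ["3.70", "3.71"],
    "Current_A": ["0.50", "NA"],  # a string marker breaks numeric inference
}
expected = {"Voltage_V": {"int", "float"}, "Current_A": {"int", "float"}}
errors = check_types(columns, expected)
```

With this input the check reports a mismatch for `Current_A` only, because the `"NA"` entry makes the whole column fall back to `str`.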
Perhaps a better alternative is to maintain the internal structures ourselves, and let users upload a .csv and select its source from a list? We can allow advanced users to create new mappings (in JSON or something similar) where they provide an example .csv file and add metadata for the columns. My concern is that data types aren't always simply parsable from the data: datetime strings are difficult to recognise automatically, for example, and sometimes datasets use string values (e.g. "NA") to represent missing numerical data. Maybe I'm not quite seeing where these data are coming from and what they look like in their original form. We might benefit from a real-time discussion about this.
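To illustrate why explicit column metadata helps with both problems, here is a small sketch in which the missing-value markers and the datetime format are declared up front rather than guessed. The helper names and default markers are hypothetical.

```python
from datetime import datetime


def parse_number(raw, missing_markers=("NA", "")):
    """Convert a raw CSV string to float, or None for a declared missing marker."""
    if raw.strip() in missing_markers:
        return None
    return float(raw)


def parse_timestamp(raw, fmt):
    """Parse a timestamp using an explicitly declared strptime format."""
    return datetime.strptime(raw, fmt)


# The same string is a valid date under two different formats, so the
# format cannot be recovered reliably from the data alone.
raw = "03/04/2021 10:15:00"
day_first = parse_timestamp(raw, "%d/%m/%Y %H:%M:%S")    # 3 April 2021
month_first = parse_timestamp(raw, "%m/%d/%Y %H:%M:%S")  # 4 March 2021

currents = [parse_number(value) for value in ["0.50", "NA", "0.48"]]
```

Declaring the format and the missing markers in the mapping file would let the parser handle both cases deterministically instead of inferring them.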
Is your feature request related to a problem? Please describe.
Currently, Galv cannot import .CSV files from unsupported cyclers.
Describe the solution you'd like
A method to import .CSV files with a corresponding JSON file that defines a custom cycler data standard. For example, a user with an Arbin exported .CSV file could provide a JSON file that defines the header names for import into Galv. This could also be used to import virtual data gained from predictive models, as long as the metadata file had the corresponding information.
Additional context
To integrate with #95: if the JSON exported (as per #95) contained the information required to reimport into Galv, users could share data between Galv instances. There might be a better method for this, though.