Improve import/export #110

Closed
aiida-bot opened this issue Oct 23, 2014 · 2 comments

Comments

@aiida-bot

Originally reported by: Nicolas Mounet (Bitbucket: mounet, GitHub: nmounet)


  • Make import and export more reliable.

Issues:

  • If UPF files are imported with a different UUID but the same MD5 as an existing node, problems arise: the new node will probably not be created, which in turn poses problems for provenance.
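
To make the failure mode above concrete, here is a minimal sketch of an importer that deduplicates purely by MD5. `ToyDatabase` and `import_node` are hypothetical names used only for illustration and do not reflect AiiDA's actual data model; the point is only that the incoming UUID is never stored, so anything in the imported archive that refers to it has nothing to attach to.

```python
import hashlib
import uuid


def md5_of(content: bytes) -> str:
    """Return the hex MD5 digest of the given file content."""
    return hashlib.md5(content).hexdigest()


class ToyDatabase:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self):
        self.by_uuid = {}  # uuid -> MD5 of the stored content
        self.by_md5 = {}   # MD5 -> uuid of the first node with that content

    def import_node(self, node_uuid: str, content: bytes) -> str:
        """Import a node, reusing an existing one if the MD5 matches.

        Returns the uuid of the node that ends up representing the content.
        """
        digest = md5_of(content)
        if digest in self.by_md5:
            # Same content already present: no new node is created and the
            # incoming uuid is silently dropped.
            return self.by_md5[digest]
        self.by_uuid[node_uuid] = digest
        self.by_md5[digest] = node_uuid
        return node_uuid


db = ToyDatabase()
local_uuid = str(uuid.uuid4())
remote_uuid = str(uuid.uuid4())
upf_content = b"placeholder pseudopotential content"

stored_first = db.import_node(local_uuid, upf_content)
stored_second = db.import_node(remote_uuid, upf_content)

# Links in the imported archive that point at remote_uuid now dangle.
dangling = remote_uuid not in db.by_uuid
```

Both calls resolve to the node created first, and `dangling` is true for the second UUID, which is the provenance gap described in the bullet.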

@aiida-bot
Author

Original comment by Tiziano Müller (Bitbucket: dev-zero, GitHub: dev-zero):


Besides, I think that the current import/export functions/interfaces are not really flexible enough.
The following things are currently not implementable as far as I can see:

  • Certain parsers used when importing data (for example, purely regex-based parsers) could work on memory-mapped (mmap) files instead of strings, which would reduce the memory requirement when importing from large files (like trajectories).
  • Sometimes you would like to apply a transformation before exporting or importing data. This could be done by allowing a transformation function (or object) to be passed to the export/import* functions.
  • Some export/import functions will be code-dependent; having them mixed into the same objects (StructureData, TrajectoryData) may clutter things up.
  • Some importers (and possibly exporters) may need or take additional arguments and/or multiple files to be able to generate consistent objects.
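
As a rough illustration of the first point, the sketch below runs a compiled regex directly over a memory-mapped file using only the standard library. The file layout, the `ENERGY_RE` pattern and the `parse_energies` helper are assumptions made up for this example, not part of any existing AiiDA parser. Note that the pattern has to be a bytes pattern, because `mmap` objects expose bytes.

```python
import mmap
import os
import re
import tempfile

# Bytes pattern: regexes applied to an mmap must operate on bytes.
ENERGY_RE = re.compile(rb"^energy\s*=\s*(-?\d+\.\d+)$", re.MULTILINE)


def parse_energies(path):
    """Collect all 'energy = <float>' values without reading the file into a string."""
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; the OS pages data in on demand.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return [
                float(match.group(1).decode("ascii"))
                for match in ENERGY_RE.finditer(mapped)
            ]


# Small demo file standing in for a large trajectory-like output.
with tempfile.NamedTemporaryFile("wb", suffix=".out", delete=False) as tmp:
    tmp.write(b"step 1\nenergy = -1.50\nstep 2\nenergy = -1.75\n")
    demo_path = tmp.name

try:
    energies = parse_energies(demo_path)
finally:
    os.unlink(demo_path)
```

The matches are only materialised as small byte slices, so peak memory is governed by the extracted values rather than by the file size.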

A complete example interface supporting the requirements above could look like the following (a simple brain-dump, no Proof of Concept written yet):

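#!python
# Proposed interface only: no proof of concept exists for these load() methods.
# "(value, option=...)" is shorthand for a source plus its per-source options;
# it is not valid Python tuple syntax.
s = StructureData.load(positions='mypos.xyz')
t = TrajectoryData.load(
  positions='mypos.xyz',
  velocities=(velocities_txt, format='xyz/string'),
  times=('my.ener', format='tab', transformation=lambda x: float(x[1])/10.),
  structure=(array([site.kind for site in s.sites], raw_data=True))
)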
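
The `(value, option=...)` notation in the example above cannot be written as a plain Python tuple. One way the same idea could be expressed in valid Python is a small wrapper object carrying the data, its format and an optional transformation. Everything here (`Source`, `PARSERS`, `read_tab`, `resolve`) is a hypothetical sketch, and for self-containment the tabular data is passed as a string rather than read from a file such as 'my.ener'.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Source:
    """Hypothetical wrapper for one input of a load() call."""
    data: Any
    format: Optional[str] = None
    transformation: Optional[Callable[[Any], Any]] = None


def read_tab(text):
    """Parse whitespace-separated rows into lists of strings."""
    return [line.split() for line in text.splitlines() if line.strip()]


# Hypothetical registry mapping format names to parser functions.
PARSERS = {"tab": read_tab}


def resolve(source):
    """Parse a source with its declared format, then apply the transformation per row."""
    rows = PARSERS[source.format](source.data)
    if source.transformation is None:
        return rows
    return [source.transformation(row) for row in rows]


# Three rows of made-up tabular data: step index, time, energy.
ener_text = "0 0.0 -10.0\n1 5.0 -10.2\n2 10.0 -10.1\n"

times = resolve(
    Source(ener_text, format="tab", transformation=lambda x: float(x[1]) / 10.0)
)
```

Keeping the parser registry separate from the data classes would also address the concern about code-dependent importers cluttering StructureData and TrajectoryData.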

@szoupanos
Contributor

Part of these comments will be considered when addressing issue #999

@sphuber sphuber modified the milestones: 1.0 release, v1.0.0 May 9, 2018