Make export data more robust, i.e. maybe allow for dirty exports and/or create helpers for cleaning data #4631
Comments
Addendum: I also have unsealed data nodes, so this does not concern processes only.
If I understand correctly, you have two use cases for exporting
Regarding backing up: I don't think we should ever allow unsealed nodes to be backed up, because, as you say, what do you do on import? The logic to adjudicate between "clashes" will be intractable. But most importantly, backing up should not be done through export archives. There is so much overhead that it is extremely inefficient. The real way to back up is to dump the database contents and copy the repository folder. The only reason we haven't made a

Then there is the use case of sharing the (intermediate) data with a colleague. Here we face the same problem. I would be very hesitant to allow unsealed nodes to be exported, because what do you do when these are imported? What if you export the same node later, after it has been sealed? If you import it now, there are two nodes with the same UUID but very different properties. You might say to just take the sealed one as the correct one, but I am not sure it is easy to foresee all the consequences.

Where I do agree is that we should limit the export problems that are caused by bugs in AiiDA itself. So I would be tempted to go in the direction of adding helpers, as you propose, to clean faulty nodes. This is easy in some cases, but not all.
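To make the UUID clash mentioned above concrete, here is a minimal sketch in plain Python. It is a toy model and does not use the AiiDA API: nodes are plain dictionaries keyed by UUID, and the function name `find_uuid_clashes`, the attribute names and the UUID string are all hypothetical.

```python
def find_uuid_clashes(existing, incoming):
    """Return {uuid: [differing attribute names]} for nodes present in both sets.

    Toy model (assumption): each argument maps a UUID string to a dict of
    attributes. This is not how AiiDA stores or imports nodes.
    """
    clashes = {}
    for uuid, new_attrs in incoming.items():
        old_attrs = existing.get(uuid)
        if old_attrs is None or old_attrs == new_attrs:
            continue  # new node or identical copy: nothing to adjudicate
        all_keys = set(old_attrs) | set(new_attrs)
        clashes[uuid] = sorted(
            key for key in all_keys if old_attrs.get(key) != new_attrs.get(key)
        )
    return clashes


# The same process was exported once while unsealed and once after sealing.
already_imported = {"uuid-1234": {"process_state": "running"}}
later_archive = {
    "uuid-1234": {"process_state": "finished", "sealed": True, "exit_status": 0}
}

clashes = find_uuid_clashes(already_imported, later_archive)
print(clashes)
```

Detecting the clash is the easy part. Deciding which copy wins, and what that means for links to other nodes, is the part that is hard to oversee.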
So ultimately, I am not really sure what to do here other than:
Allowing unsealed nodes to be exported (even with a force flag) will have huge (unforeseen) consequences.
Data nodes can never be sealed. The sealing concept only applies to process nodes.
Is your feature request related to a problem? Please describe
Exporting unsealed nodes is not allowed. (I understand why: they should not run in an imported database, and exporting a process node without outputs is not good either.)
Also, by default, a missing repository folder for a node results in a critical failure of the export.
While this is all well and good for keeping exports clean, it is very annoying, since the engine currently "loses" some processes very easily, and these never get sealed (every one of my databases has some of these; sometimes process kills are incomplete, etc.). You then have to seal them by hand, or explicitly exclude them along with all the provenance that would pick them up for the export. Also, if you seal them by hand (sometimes even this does not work, no idea why), you might end up with an error that some of them have no repository folder, and you have to create dummy folders for these.
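A rough sketch of the manual exclusion step described above, written as a toy model in plain Python rather than with the AiiDA API. It assumes that exporting a node also pulls in everything upstream of it, so every node downstream of an unsealed process has to be left out as well. The names `find_tainted` and `split_export_set` and the node labels are hypothetical.

```python
from collections import deque


def find_tainted(forward_links, unsealed):
    """Return the unsealed processes plus every node reachable downstream of them.

    forward_links maps a node label to the labels it links to
    (toy model, assumption: not AiiDA link types).
    """
    tainted = set(unsealed)
    queue = deque(unsealed)
    while queue:
        current = queue.popleft()
        for child in forward_links.get(current, ()):
            if child not in tainted:
                tainted.add(child)
                queue.append(child)
    return tainted


def split_export_set(requested, forward_links, unsealed):
    """Split the requested nodes into (clean, skipped), preserving order."""
    tainted = find_tainted(forward_links, unsealed)
    clean = [node for node in requested if node not in tainted]
    skipped = [node for node in requested if node in tainted]
    return clean, skipped


forward_links = {
    "data_in": ["calc_ok", "calc_zombie"],
    "calc_ok": ["data_ok"],
    "calc_zombie": ["data_partial"],
}
requested = ["data_in", "calc_ok", "data_ok", "calc_zombie", "data_partial"]

clean, skipped = split_export_set(requested, forward_links, unsealed={"calc_zombie"})
print("export:", clean)
print("skipped:", skipped)
```

In this example the shared input `data_in` stays exportable, while the zombie calculation and its partial output are skipped.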
For a publication, it is clear that I want to go through all this, clean the whole graph and publish clean data (so I export only the "good" and "relevant" stuff).
But there are situations where one does not care, e.g. if I just want to do a quick export for a backup, or give my database to a colleague.
Also consider the case where you want to create a backup while there are still unfinished processes, but the daemon is not running.
Describe the solution you'd like
Of course, the ideal solution would be for the engine never to lose any processes or fail to seal nodes along the way, but since I am not sure that this can be fixed, it might make more sense to work with it.
And as a user, I do not want to write hundreds of lines of code and make several attempts just so that AiiDA allows me to export some data.
(Also, non-expert users might not be able to do this at all.)
(This might be overall problematic and not wanted, so this should be carefully discussed)
In all of the last three cases, I would still want to be warned instead of getting the critical error message.
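The "warn instead of fail" behaviour could look roughly like the sketch below. It is plain Python and not the AiiDA export API: `NodeRecord`, `ExportValidationError`, `validate_for_export` and the `strict` flag are hypothetical names, and the two checks (unsealed process node, missing repository folder) are the two failure modes mentioned in this issue.

```python
import warnings
from dataclasses import dataclass


class ExportValidationError(Exception):
    """Raised in strict mode when a node is not fit for export (hypothetical)."""


@dataclass
class NodeRecord:
    """Toy stand-in for a node; not an AiiDA class."""
    uuid: str
    is_process: bool
    sealed: bool = False
    has_repo_folder: bool = True


def find_problems(node):
    """Return a list of human-readable problems for one node."""
    problems = []
    # Sealing only applies to process nodes, so data nodes are never flagged here.
    if node.is_process and not node.sealed:
        problems.append("process node is not sealed")
    if not node.has_repo_folder:
        problems.append("repository folder is missing")
    return problems


def validate_for_export(nodes, strict=True):
    """Raise on the first faulty node if strict, otherwise warn and keep it."""
    kept = []
    for node in nodes:
        problems = find_problems(node)
        if problems:
            message = f"node {node.uuid}: " + "; ".join(problems)
            if strict:
                raise ExportValidationError(message)
            warnings.warn(message)
        kept.append(node)
    return kept


nodes = [
    NodeRecord("a", is_process=True, sealed=True),
    NodeRecord("b", is_process=True, sealed=False),
    NodeRecord("c", is_process=False, has_repo_folder=False),
]
dirty = validate_for_export(nodes, strict=False)  # emits two warnings, keeps all three
```

Whether the lenient mode should keep faulty nodes, as here, or drop them is exactly the design question that would need to be discussed.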
What do you think? Maybe the aiida-fleur plugin generates these 'zombie' processes more easily and the general 'user experience' is a different one.