Hierarchical data structure in .eln or not #98
Comments
@NicolasCARPi @nicobrandt @FlorianRhiem Any opinion / preference?
So far, SampleDB doesn't follow either alternative, though it aligns roughly with A. The structure for a SampleDB export is roughly like this:
The key distinction to a deeply nested approach is that only the Dataset nodes which are parts of I think it's fine to have nodes which represent directories and aren't really importable objects/Datasets, and to have nodes that are deeply nested but are really importable objects/Datasets, as long as there's a well-defined way to find all the Dataset nodes that should be considered as "importable". Sure, they may use One way would be to add a custom attribute, or to create a @salexan2001 Are you aware of how this is handled by RO-Crates more generally?
The exported data structure in Kadi4Mat currently looks like the following:
Whether there are one or multiple "folders" depends on whether a single record (the basic data/metadata container in Kadi4Mat) or a collection of multiple records is exported. This is also why we have two examples in this repo, one for each resource type. The general structure is the same though. That being said, we also support collection hierarchies in Kadi4Mat. However, the export currently only goes one level deep. We haven't decided yet how to deal with this in the future. Basically, we could either keep the hierarchy flat (maybe with some additional metadata if someone really wants to recreate the hierarchy), or actually make use of "sub-folders". In the latter case, intermediate folders (collections) would not contain any files though. Regarding the import, we have so far focused on the flat structure shown above. Importing nested structures would in principle be possible for us, but probably not without limitations or some information loss, regardless of whether we flatten everything. All in all, I don't have strong opinions about this, as long as we can agree on something. In general, I suggest keeping our spec a bit stricter than the RO-Crate spec though, to make our life a bit easier. I also suggest discussing this in the next meeting, rather than only a couple of people deciding right now :P
I don't think we have to / should make a decision here.
I agree that we might want to discuss this during a meeting, along with #69.
For me that's the important bit: things that must be imported must be mentioned in The RO-Crate spec allows for nested Datasets, but as @nicobrandt said, having stricter rules will make our life easier. And having all the importable bits in the
So, to summarize, what are the issues that need to be discussed now (for me preferably in one of the next meetings)?
In my opinion, an important goal of the ELN file format should be to maximize compatibility/interoperability between the different ELNs, so I think having a strict and simple specification (while maintaining full compatibility with RO-Crate) is desirable. Btw.: Are there already efforts to create something like standardized ELN file format libraries for the different languages used by the ELNs? Or is there basically a new import/export implementation in each ELN?
(based on discussion around PR #95)
Should the .eln file allow for a hierarchical structure of folders and files, or should it be kept flat? It seems most ELNs use a rather flat structure and do not allow arbitrarily nested structures that can have directories or files at the top level.
Hence there seem to be two alternatives:
A) The .eln has a rather flat structure of two levels (./ and the items). ELNs that have a deep hierarchy flatten it on export to .eln and deepen (raise) it again on import from .eln (a possible algorithm for deepening is given below).
B) The .eln has an arbitrarily deep hierarchy. During import, ELNs that do not support deep hierarchies flatten the information in the .eln.
The deepening algorithm can be based on set operations. A = set of ids at './'. B = set of all ids of all hasParts. The top-level items are A - B. From there, use the given hasPart information to construct the individual trees. Items of B that are not in A are supplementary.
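To make the deepening idea concrete, here is a minimal Python sketch. It is only an illustration under a few assumptions, not an agreed implementation: the input is the `@graph` list of an `ro-crate-metadata.json`, the root entity has the id `./`, and B is collected from every entity except the root (otherwise A - B would always be empty). The names `deepen` and `part_ids` are hypothetical helpers made up for this example.

```python
def part_ids(entity):
    """Return the @id values listed in an entity's hasPart (possibly empty)."""
    parts = entity.get("hasPart", [])
    if isinstance(parts, dict):  # tolerate a single object instead of a list
        parts = [parts]
    return [p["@id"] for p in parts]


def deepen(graph, root_id="./"):
    """Rebuild trees from a flat crate (hypothetical helper).

    Returns (trees, supplementary): trees is a list of nested dicts
    {"id": ..., "children": [...]}, supplementary is the set B - A.
    """
    by_id = {e["@id"]: e for e in graph}
    root_parts = part_ids(by_id[root_id])

    a = set(root_parts)  # A: ids listed at './'
    b = set()            # B: ids listed in any non-root hasPart (assumption)
    for entity in graph:
        if entity["@id"] != root_id:
            b.update(part_ids(entity))

    # Top level is A - B; iterate root_parts to keep the original order.
    top_level = [i for i in root_parts if i not in b]
    supplementary = b - a

    def build(node_id, seen):
        # Guard against cycles in malformed crates.
        if node_id in seen:
            return {"id": node_id, "children": []}
        entity = by_id.get(node_id, {})
        children = [build(c, seen | {node_id}) for c in part_ids(entity)]
        return {"id": node_id, "children": children}

    trees = [build(i, frozenset()) for i in top_level]
    return trees, supplementary
```

Calling `deepen(metadata["@graph"])` on a flat export would give one tree per top-level item, plus the set of ids that only appear in nested hasPart lists and would have to be treated as supplementary.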