I feel the documentation is lacking a top-level "primer" of what aiida-core is/does (plus having this terminology permeate in the API code).
Something along the lines of:
AiiDA is a workflow engine framework, which provides five core capabilities:
Storage: AiiDA automates the storage of generated calculation/workflow inputs, outputs, and the provenance between them.
AiiDA also provides functionality for introspecting (querying) this data.
Communication: AiiDA provides built-in functionality to communicate with external compute services (such as HPCs and cloud), automating data transfer and job scheduling.
Processing: AiiDA provides an API for building and running complex workflows, composed of one or more computations, that can be run locally or externally.
AiiDA also provides process persistence (check-pointing), meaning that running workflows persist in the event of lost connections or system reboots.
Developer interface: AiiDA provides a plugin system, for developers to extend aiida-core with their own workflows, data types, HPC interfaces, etc.
User interface: AiiDA provides both command-line and web-based APIs for starting, monitoring and introspecting workflows.
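To make the check-pointing idea under "Processing" concrete, here is a minimal sketch in plain Python. It is not how aiida-core implements persistence: the function `run_with_checkpoints`, the JSON file format, and the state keys are all assumptions made for illustration. It only shows the general pattern of saving progress after each step so that a restarted run resumes where it left off.

```python
import json
import os
import tempfile


def run_with_checkpoints(steps, checkpoint_path):
    """Run callables in order, saving progress after each one.

    Hypothetical helper, not part of aiida-core. If a checkpoint file
    already exists, completed steps are skipped on the next call.
    """
    state = {"next_step": 0, "results": []}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            state = json.load(handle)

    for index in range(state["next_step"], len(steps)):
        # If this step raises, the last saved checkpoint stays valid.
        state["results"].append(steps[index]())
        state["next_step"] = index + 1
        with open(checkpoint_path, "w") as handle:
            json.dump(state, handle)

    return state["results"]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    print(run_with_checkpoints([lambda: 1, lambda: 2], path))
```

Calling the function again with the same checkpoint path after an interruption re-runs only the steps that had not yet completed.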
Storage
I have recently been streamlining the storage API (e.g. #5154, #5172, #5320, #5145, #5228).
It is of note that, although we currently split storage into the PostgreSQL database and the disk-objectstore repository, this should not be a primary concern for "standard" users.
I would explain the storage as something like the following:
A Profile is intended for a single project, configured for the processing and storage of a single provenance graph.
Entities are subsections of a profile's storage: user, authinfo, computer, group, node, log, comment
Fields are (string) keys on an entity that point towards a (JSONable) value (such as its unique identifier)
By default, fields are deemed immutable once they are stored
Node entities, together with the edges between them, are the primary components of a provenance graph
A node has a "special" field, attributes, which allows it to be extended to different data types
The attributes value is a dictionary that can contain keys specific to that data type
Process node attributes contain some special keys, which are deemed mutable until the process has completed (and is sealed). This includes the checkpoint key, which stores the process checkpoint in YAML serialized format.
A node also has a special field, extras, which allows users to store mutable data on any data type.
node entities can also store objects, which are (string) POSIX paths that point towards bytes data.
It is the data type's responsibility to interpret the encoding of an object's bytes
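The rules in the list above can be sketched as a toy model in plain Python. This is not the aiida-core API: `ToyNode`, `ImmutableFieldError`, `read_text_object`, and every method name here are hypothetical, and the model covers only attributes, extras, and objects as described in this proposal.

```python
class ImmutableFieldError(Exception):
    """Raised when a stored, immutable value is modified (hypothetical)."""


class ToyNode:
    """Toy stand-in for a node entity; not the aiida-core class."""

    # Attribute keys that stay mutable on a process node until it is sealed.
    MUTABLE_PROCESS_KEYS = frozenset({"checkpoint"})

    def __init__(self, node_type, attributes=None):
        self.node_type = node_type               # e.g. "data" or "process"
        self.attributes = dict(attributes or {})  # data-type specific keys
        self.extras = {}                          # always mutable
        self.objects = {}                         # POSIX path -> bytes
        self.is_stored = False
        self.is_sealed = False

    def store(self):
        self.is_stored = True
        return self

    def seal(self):
        self.is_sealed = True

    def set_attribute(self, key, value):
        still_mutable = (
            self.node_type == "process"
            and key in self.MUTABLE_PROCESS_KEYS
            and not self.is_sealed
        )
        if self.is_stored and not still_mutable:
            raise ImmutableFieldError(f"attribute {key!r} is immutable")
        self.attributes[key] = value

    def set_extra(self, key, value):
        # Extras may be changed at any time, stored or not.
        self.extras[key] = value

    def put_object(self, path, content):
        self.objects[path] = bytes(content)


def read_text_object(node, path, encoding="utf-8"):
    """A data type decides how to decode an object's bytes; here, as text."""
    return node.objects[path].decode(encoding)
```

For example, a stored data node rejects `set_attribute` but still accepts `set_extra`, while a stored process node accepts `set_attribute("checkpoint", ...)` only until `seal()` is called.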
Processing
The AiiDA daemon and the RabbitMQ broker both fall under this processing umbrella.
I feel the terminology around this (daemons, workers, etc.) is probably a bit confusing to "non-technical" users, and could be improved. Also, as per aiidateam/AEP#30, it is quite possible that we will replace RabbitMQ in the future, so there should not be any terminology specific to it (e.g. broker).
But more to come on this later...
Hi @soma2000-lang, sorry for the late reply, what did you have in mind?
I primarily wrote this down while working on #5330, to collate some of my thoughts and circle back to them later. But I am happy to collaborate.