Improving AiiDA terminology #5324

Open
chrisjsewell opened this issue Jan 24, 2022 · 2 comments

Comments

@chrisjsewell
Member

I feel the documentation is lacking a top-level "primer" of what aiida-core is/does (and this terminology should also permeate the API code).

Something along the lines of:

AiiDA is a workflow engine framework, which provides five core capabilities:

  1. Storage: AiiDA automates the storage of generated calculation/workflow inputs, outputs, and the provenance between them.
    AiiDA also provides functionality for introspecting (querying) this data.
  2. Communication: AiiDA provides built-in functionality to communicate with external compute services (such as HPC clusters and cloud services), automating data transfer and job scheduling.
  3. Processing: AiiDA provides an API for building and running complex workflows, composed of one or more computations, that can be run locally or externally.
    AiiDA also provides process persistence (check-pointing), meaning that running workflows persist in the event of lost connections or system reboots.
  4. Developer interface: AiiDA provides a plugin system, for developers to extend aiida-core with their own workflows, data types, HPC interfaces, etc.
  5. User interface: AiiDA provides both command-line and web-based APIs for starting, monitoring and introspecting workflows.
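
To make the check-pointing idea in point 3 concrete, here is a toy sketch. It is not how aiida-core implements persistence: the function `run_with_checkpoints`, its `fail_at` parameter and the JSON file layout are all hypothetical, chosen only to show a workflow resuming from its last saved step after an interruption.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(steps, checkpoint_path, fail_at=None):
    """Run `steps` in order, saving progress to disk after each one.

    Hypothetical helper for illustration only (not part of aiida-core).
    `fail_at` simulates an interruption just before the step at that index.
    """
    path = Path(checkpoint_path)
    # Resume from the saved state if a checkpoint exists, otherwise start fresh.
    if path.exists():
        state = json.loads(path.read_text())
    else:
        state = {"next_step": 0, "value": 0}

    for index in range(state["next_step"], len(steps)):
        if index == fail_at:
            raise RuntimeError("simulated interruption")
        state["value"] = steps[index](state["value"])
        state["next_step"] = index + 1
        # Persist the checkpoint so a later run can pick up from here.
        path.write_text(json.dumps(state))

    return state["value"]


steps = [lambda v: v + 1, lambda v: v * 10, lambda v: v - 3]

with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "checkpoint.json"
    try:
        # First run is interrupted before the last step.
        run_with_checkpoints(steps, checkpoint, fail_at=2)
    except RuntimeError:
        pass
    # Second run resumes from the checkpoint instead of starting over.
    result = run_with_checkpoints(steps, checkpoint)
```

The second call only executes the final step, because the first two were already recorded in the checkpoint file.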

Storage

I have recently been streamlining the storage API (e.g. #5154, #5172, #5320, #5145, #5228).
It is of note that, although we currently split storage into the PostgreSQL database and the disk-objectstore repository, this should not be a primary concern for "standard" users.
I would explain the storage as something like the following:

  • A Profile is intended for a single project, configured for the processing and storage of a single provenance graph.
  • Entities are subsections of a profile's storage: user, authinfo, computer, group, node, log, comment
  • Fields are (string) keys on an entity that point towards a (JSONable) value (such as its unique identifier)
    • By default, fields are deemed immutable once they are stored
  • Node entities, together with the edges between them, are the primary components of a provenance graph
    • A node has a "special" field, attributes, which allows it to be extended to different data types
      • The attributes value is a dictionary that can contain keys specific to that data type
      • Process node attributes contain some special keys, which are deemed mutable until the process has completed (and is sealed). This includes the checkpoint key, which stores the process checkpoint in YAML-serialized format.
    • A node also has a special field, extras, which allows users to store mutable data on any data type.
    • node entities can also store objects, which are (string) POSIX paths that point towards bytes data.
      • It is the data type's responsibility to interpret the encoding of an object's bytes
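
The mutability rules above could be sketched roughly as follows. This is a toy model, not the aiida-core API: the names `ToyNode`, `ToyProcessNode`, `ImmutableFieldError`, `put_object` and `seal` are hypothetical, and the only mutable process key modelled is the `checkpoint` key mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


class ImmutableFieldError(Exception):
    """Raised when an immutable part of a stored node is modified."""


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node entity (not the aiida-core class)."""

    uuid: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    extras: Dict[str, Any] = field(default_factory=dict)
    # Objects map (string) POSIX paths to raw bytes.
    objects: Dict[str, bytes] = field(default_factory=dict)
    stored: bool = False

    def set_attribute(self, key, value):
        # Attributes are immutable once the node is stored.
        if self.stored:
            raise ImmutableFieldError(f"cannot change attribute {key!r}")
        self.attributes[key] = value

    def set_extra(self, key, value):
        # Extras stay mutable, even after storing.
        self.extras[key] = value

    def put_object(self, path, content):
        if self.stored:
            raise ImmutableFieldError(f"cannot change object {path!r}")
        self.objects[path] = content

    def store(self):
        self.stored = True
        return self


@dataclass
class ToyProcessNode(ToyNode):
    """Process node: some attribute keys stay mutable until it is sealed."""

    sealed: bool = False
    MUTABLE_KEYS = ("checkpoint",)

    def set_attribute(self, key, value):
        if self.stored and not self.sealed and key in self.MUTABLE_KEYS:
            self.attributes[key] = value
            return
        super().set_attribute(key, value)

    def seal(self):
        self.sealed = True


node = ToyNode(uuid="node-1")
node.set_attribute("formula", "SiO2")
# The data type decides how the bytes are encoded; here, UTF-8 text.
node.put_object("files/input.txt", "hello".encode("utf-8"))
node.store()
node.set_extra("reviewed", True)

try:
    node.set_attribute("formula", "H2O")
    attribute_change_blocked = False
except ImmutableFieldError:
    attribute_change_blocked = True

process = ToyProcessNode(uuid="proc-1").store()
process.set_attribute("checkpoint", "state: running")
process.seal()
try:
    process.set_attribute("checkpoint", "state: finished")
    sealed_change_blocked = False
except ImmutableFieldError:
    sealed_change_blocked = True
```

In this sketch the stored node still accepts new extras, while attribute changes are rejected; the process node accepts updates to its checkpoint only until it is sealed.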

Processing

The AiiDA daemon and RabbitMQ broker both fall under this processing umbrella.
I feel the terminology around this (daemons, workers, etc.) is probably a bit confusing to "non-technical" users, and could be improved. Also, as per aiidateam/AEP#30, it is quite possible that we will replace RabbitMQ in the future, so there should not be any terminology specific to that (e.g. broker).
But more to come on this later...

@soma2000-lang

@chrisjsewell Please assign this to me. I would like to work on this.

@chrisjsewell
Member Author

Hi @soma2000-lang, sorry for the late reply, what did you have in mind?
I primarily wrote this down, whilst doing #5330, to collate some of my thoughts and have something to circle back to. But I'm happy to collaborate.

2 participants