Skip to content
This repository has been archived by the owner on Oct 15, 2024. It is now read-only.

decisions: new categories #4201

Merged
merged 34 commits into from
Jan 28, 2022
Merged

decisions: new categories #4201

merged 34 commits into from
Jan 28, 2022

Conversation

markus2330
Copy link
Contributor

Basics

These points need to be fulfilled for every PR:

  • Short descriptions of your changes are in the release notes
    (added as entry in doc/news/_preparation_next_release.md which
    contains _(my name)_)
    Please always add something to the release notes.
  • Details of what you changed are in commit messages
    (first line should have module: short statement syntax)
  • References to issues, e.g. close #X, are in the commit messages.
  • The buildservers are happy. If not, fix in this order:
    • add a line in doc/news/_preparation_next_release.md
    • reformat the code with scripts/dev/reformat-all
    • make all unit tests pass
    • fix all memleaks
  • The PR is rebased with current master.

If you have any troubles fulfilling these criteria, please write
about the trouble as comment in the PR. We will help you.
But we cannot accept PRs that do not fulfill the basics.

Checklist

Check relevant points but please do not remove entries.
For docu fixes, spell checking, and similar none of these points below
need to be checked.

  • I added unit tests for my code
  • I fully described what my PR does in the documentation
    (not in the PR description)
  • I fixed all affected documentation
  • I added code comments, logging, and assertions as appropriate (see Coding Guidelines)
  • I updated all meta data (e.g. README.md of plugins and METADATA.ini)
  • I mentioned every code not directly written by me in THIRD-PARTY-LICENSES

Review

Reviewers will usually check the following:

Labels

If you are already Elektra developer:

  • Add the "work in progress" label if you do not want the PR to be reviewed yet.
  • Add the "ready to merge" label if the basics are fulfilled and you also
    say that everything is ready to be merged.

@markus2330 markus2330 force-pushed the decisions branch 2 times, most recently from 16fe034 to 6522f71 Compare December 17, 2021 20:35
@markus2330 markus2330 force-pushed the decisions branch 2 times, most recently from 066dd85 to 6284a35 Compare December 23, 2021 09:32
Copy link
Member

@kodebach kodebach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked through these decisions and had a few thoughts. The big one is the escaped key names. I don't really like the "lazy value" solution, but I don't really have an alternative either.

Additionally, I have some concerns that didn't fit anywhere:

  1. Memory layout: In particular because of mmapstorage, but possibly elsewhere too, we basically require that implemenations use a certain memory layout. This limits alternative implementations (e.g. I think in Rust it would be much easier than in C to use a separate struct without a meta-keyset for meta-keys). It is also a bit weird that because of mmapstorage the memory layout is basically part of the API, and yet we use an opaque struct (I know it has other reasons like safe locking, but it is still weird).
  2. Key Names: I think it might actually make sense to create a separate struct KeyName. I still need to think about how this would work exactly, but there are a few reasons why I think this could be good:
    • Often a Key * argument is used when you just need a key name. This is because with a Key * we know the name is valid and we get an unescaped name. Using a Key * here makes the API a bit confusing.
    • There could be a richer API for manipulating key names without relying on the escaped name (e.g. concatenating two full key names). With the current situation, all these functions would need to be part of the API for a key. Adding such functions to the key API is certainly not minimal.
    • If the struct is not opaque, stack instances can be created. This means we can get rid of ksLookupByName. However, then we get problems with locking key names so maybe this part is not a good idea.
    • This might also make it possible to have a keyNew without using an escaped name and without using lots of macro tricks and array arguments to make the API somewhat usable.
    • It also shows (within the API) that a key name has some internal structure and is not just a plain string.

doc/DESIGN.md Outdated Show resolved Hide resolved
doc/DESIGN.md Outdated Show resolved Hide resolved
doc/METADATA.ini Outdated Show resolved Hide resolved
doc/decisions/binary.md Show resolved Hide resolved
doc/decisions/binary.md Show resolved Hide resolved
doc/decisions/simplify_api.md Outdated Show resolved Hide resolved
doc/decisions/simplify_api.md Outdated Show resolved Hide resolved
doc/decisions/simplify_api.md Outdated Show resolved Hide resolved
@@ -29,7 +29,8 @@ Only store one key name, suitable for comparing/searching/iterating over name.
## Implications

- rename `keyUnescapedName` to `keyName`
- To get the keyName you then would need something like `elektraEncodeKeyName(buffer, keyName(...
- rename `keyName` to `keyEscapedName`, which is lazy (to be done later)
- users should use compare functions, .. instead of `keyName`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using keyName (i.e. the old keyUnescapedName) is fine too. Actual compare functions are of course better, but the unescaped name is simple enough to handle manually if needed.

@@ -29,7 +29,8 @@ Only store one key name, suitable for comparing/searching/iterating over name.
## Implications

- rename `keyUnescapedName` to `keyName`
- To get the keyName you then would need something like `elektraEncodeKeyName(buffer, keyName(...
- rename `keyName` to `keyEscapedName`, which is lazy (to be done later)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is not a good solution for the public API. I actually think a function copying into a provided buffer, is a better solution for the public API.

/**
 * Calculate the escaped name of key and try to copy it into buffer.
 * Doesn't copy anything if the escaped name will be longer than the provided bufferSize or if buffer == NULL.
 * returns the size of the full escaped name of key
 */
size_t keyNameEscape (const Key * key, char * buffer, size_t bufferSize);

I don't think we should make the escaped name too easy to access publicly. This would just encourage improper and unnecessary use of the escaped name. However, I have not found a satisfying to the problem with the error macros yet. Unless we have a function that returns the escaped name as a const char * that needs no cleanup, using ELEKTRA_SET_*_ERRORF is very annoying.

@markus2330
Copy link
Contributor Author

@kodebach thank you very much for the review! I updated the decisions.

  1. Memory layout

I added memory_layout.md

  1. Key Names

I added key_name.md

Copy link
Member

@kodebach kodebach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The files for memory layout and key names both kinda miss the point, IMO. The suggestion for mmapstorage simply doesn't work and for key names you just defined that the problem is not a problem...

doc/decisions/key_name.md Outdated Show resolved Hide resolved
doc/decisions/key_name.md Outdated Show resolved Hide resolved
doc/decisions/key_name.md Outdated Show resolved Hide resolved
doc/decisions/memory_layout.md Outdated Show resolved Hide resolved
@markus2330
Copy link
Contributor Author

If you give mmapstorage a KeySet * that uses a different memory layout, it will happily try and interpret that as a struct _KeySet * using the C memory layout. With a lot of luck it might even succeed in storing something into the cache.

afaik this is not true, I hope I clarified it in 8d46196

If your concern is if mmapstorage will work with Rust: good question, let us talk about this today

@kodebach
Copy link
Member

kodebach commented Jan 5, 2022

afaik this is not true, I hope I clarified it in 8d46196

I know about the magic data stuff. However, this only guards against byte-order changes. It cannot detect general memory layout changes.

Let's suppose we have libelektra-core-A and libelektra-core-B, both implement the core API but they use different versions of struct _Key:

// libelektra-core-A
struct _Key
{
  void * data;
  KeySet * meta;
// [...]
};

// libelektra-core-B
struct _Key
{
  KeySet * meta;
  void * data;
// [...]
};

mmapstorage uses this struct definition directly. In our example, we'll assume mmapstorage uses the first definition, because it was built for libelektra-core-A. However, our application uses and links against libelektra-core-B. For simplicity, we also assume that the cache already exists on disk (e.g. from a kdb get) and it was created entirely using libelektra-core-A.

Now we run our application. mmapstorage uses the first struct. Checks the magic data, which works fine, because nothing has changed. So it goes on and takes the mmap data and creates a KeySet and all the Keys with this data. This also works fine. The problem, however, is that mmapstorage of course only knows about the first struct definition. Therefore, it creates the KeySet/Key according to this definition.

The problem happens, when mmapstorage returns and our application accesses a Key using libelektra-core-B. Because libelektra-core-B assumes a different struct definition, we suddenly try to interpret the key value as its metadata KeySet and all other kinds of weirdness.

In fact to understand this problem, you don't even need to think about mmap (mmap just makes it much harder to avoid the problem). The same kind of issue would arise, if struct _Key was not opaque and we would change its fields. Suddenly an application built for an old version would read the struct completely wrong. This is the reason why struct _Key is opaque in the first place.

The core issue here is, that mmapstorage does not use functions like keySetName or keySetValue. Instead it uses the struct definition directly, and therefore directly writes to a certain memory offset.

If your concern is if mmapstorage will work with Rust: good question, let us talk about this today

Rust doesn't really come into play here. You can have the same issue with two separate C implementations. AFAIK it should be possible to recreate any memory layout in Rust.

@markus2330 markus2330 mentioned this pull request Jan 6, 2022
doc/decisions/elektra_prefix.md Show resolved Hide resolved
doc/decisions/header_file_structure.md Show resolved Hide resolved
doc/decisions/header_file_structure.md Show resolved Hide resolved
doc/decisions/header_file_structure.md Show resolved Hide resolved
## Problem

Currently src/libs/elektra contains the source of several libraries. Each of the libraries should have its own folder.
Furthermore, we need to allow rust source to exist in parallel.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest this structure for the repo (formatted as YAML for highlighting):

doc:
examples:
scripts:
src:
 c:
  include: = /src/include (only headers that simply combine other headers)
  libs: = /src/libs (one folder per lib, contains headers and source, only libs written in C)
  plugins: = /src/plugins (one folder per plugin, only plugins written in C)
  tools: = /src/tools (one folder per tool, only tools written in C)
  tests: = /tests (only tests written in C)
 rust:
  binding: = /src/bindings/rust
  libs:
   core: = new Rust alternative of libelektra-core
  _: {other folders depend on Rust's needs, AFAIK tests are normally in the same file as source}
 cpp:
  binding: = /src/bindings/cpp (without the tests)
  libs: (one folder per lib, contains headers and source, only libs written in C++)
  plugins: (one folder per plugin, only plugins written in C++)
  tools: (one folder per tool, only tools written in C++)
  tests: (only tests written in C++)
 _: {folders for other languages, "binding" is always the folder for the binding to the C API, rest depends on the language's needs}

That way we can also confine the CMake checks for every language (compiler installed, etc.) to src/<lang>/CMakeLists.txt folder.

PS. IMO benchmarks should be moved into a separate repo. They are not really relevant to users of Elektra and there is no real technical reason to keep them in this repo. In fact, since benchmark results will only be relevant for a specific version of Elektra, we could use one branch per benchmark and use git submodule pointing to the relevant commit of libelektra.

doc/decisions/library_split.md Show resolved Hide resolved
doc/decisions/library_split.md Show resolved Hide resolved
@kodebach
Copy link
Member

kodebach commented Jan 6, 2022

I know there are already a lot of decisions here, but I though of one more thing...

It's basically #3598. The cascading type and the typedefs are a nice-to-have, but without the meta type we get a few problems. In fact Key also needs to behave differently for meta:/ names. Most obviously, adding metakeys to meta:/ keys shouldn't be not allowed. But there is also #3610.

The good news is we don't actually need a new flag anywhere. In Key we must sometimes check the current namespace and in KeySet we simply check the namespace of the first key. That way any Key can be inserted into an empty KeySet, but when inserting into non-empty KeySets the new Key must be compatible with the existing one(s).

However, the bad news is with the removal of keyGetMeta, you cannot check for the existence of metakeys without creating the metadata KeySet. One solution is the addition of a keyHasMeta function to the new libelektra-operations library. The function will access key->meta directly and therefore avoid the creation of the KeySet.

The problem is that you always need to call keyHasMeta first, or accept that the KeySet might be created. Another solution would of course be:

// can return NULL
KeySet * keyMeta (const Key * key) {
  return key->meta;
}

// consumes `meta`, don't call `ksDel (meta)` afterwards
// `keyMeta (key) == meta` after calling `keySetMeta (key, meta)`
void keySetMeta (Key * key, KeySet * meta) {
  if (key->meta != NULL) {
    ksDel (key->meta);
  }
  key->meta = meta;
}

// ALTERNATIVELY
// you must call `ksDel (meta)` later
// `keyMeta (key) != meta` after calling `keySetMeta (key, meta)`
void keySetMeta (Key * key, const KeySet * meta) {
  if (key->meta == NULL) {
    key->meta = ksDup (meta);
  } else {
    ksClear (key->meta);
    ksAppend (key->meta, meta);
  }
}

This wasn't really an option before, because we kept the old keySetMeta. This API would also match keyName/keySetName and keyValue/keySetValue.

@kodebach kodebach mentioned this pull request Jan 6, 2022
@markus2330
Copy link
Contributor Author

I know there are already a lot of decisions here, but I though of one more thing...

It is a very good thing to think about everything now, so that we have a complete set of decisions. We can anyway decide to postpone or to reject because of time constraints but it makes sense to document everything what we have thought about and discussed.

It would, however, help a lot if you can directly either push to this branch, create suggestions in this PR or also create a new PR. Copying/summarizing&reformatting comments&issues is also quite some work.

(didn't read the other text yet, I hope you bring it to the decision format)

@kodebach
Copy link
Member

I create #4223 with my suggestions.


The decision which transport mechanism, e.g. dbus or zeromq, must be
chosen per computer node or container. Within one Elektra setup,
they cannot be intermixed (unlike config file formats, where many
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within one Elektra setup, they cannot be intermixed

Is there a technical reason, or would sending notifications via multiple transports at the same time (e.g. both dbus and zeromq) work, i.e. all messages duplicated to all transports?

If multiple transports is technically possible, we shouldn't use a symlink and instead add config keys in system:/elektra.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It simply follows the simple rule to keep everything simple as long as nobody needs something more complicated. If someone really wants to use several transports on one system, the person can also use some hub that reads from one transport and relays to another. So we do not break any use case by reducing our notification system to one transport.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the symlink makes sense to indicate the default (like with resolver and storage). But the decision (at least to me) sounds like we are saying we will never support multiple transports. I still don't see a technical reason why we should restrict ourselves to a single notification transport forever.

I would suggest, we decide that

  • the symlink indicates the default notification transport
  • we currently only support one transport at a time, so the default is always used
  • in future, if the need arises, there may be additional options (e.g. config keys) to enable secondary transports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, of course! The decisions always only describe what we have now or want to implement for 1.0. Decisions can change, especially if assumptions are wrong. That is why it is important to write down assumptions.

So I think this decision describes what we both want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decisions always only describe what we have now or want to implement for 1.0. Decisions can change, especially if assumptions are wrong.

But we have to be careful, how we write things. Future developers (maybe we ourselves), might just read the decision and use it as the basis for their own decision. "must be chosen per computer node or container" sounds very much final and could result in APIs or implementation details (in the notification stuff, or elsewhere) that may be very hard to change without breaking things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, of course assumptions need to be chosen carefully. They might lead to unsuitability of Elektra in some areas. But as you often stated: we also cannot please everybody, so some assumptions are needed to make Elektra usable.

@markus2330 markus2330 merged commit d8b56fb into master Jan 28, 2022
@markus2330 markus2330 deleted the decisions branch January 28, 2022 09:54
@mpranj mpranj added this to the 0.9.9 milestone Mar 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants