[meta issue] hxlm #11
Comments
…ink allow even localized also on key) may be harder to do quick tests, since my preferred code editor has this microsoft/vscode#11770
Ok. I think I will give up the idea of trying to make the code generate a schema of what is on disk and do the opposite: let YAML describe what is on disk (or what the final state on disk should be). Turns out this resembles Ansible playbooks a lot! But instead of an entire group of servers, it is a group of datasets on the local disk! Even if over the next days each of these points in the YAML inventory, like hdatasets, hfiles, etc., is already mapped to action classes, there would still be missing the equivalent of "ad-hoc" Ansible tasks. The HXL equivalent of Ansible ad-hoc tasks would be recipes:
**Why YAML over JSON**

At this moment, I think the main difference over just using ad-hoc HXL-proxy recipes is the fact that we start to have an inventory. Ansible separates what is inventory from what is task (so tasks can be reused across several projects that are somewhat similar). But the main idea that led me to look at YAML was not even Ansible: I remembered that YAML is easier to deal with comments, while still being powerful to process via tools.

**Special attention to the concept of compliance (this is likely to take months)**

There are some building blocks to abstract, but one that deserves special attention, if we want to end up with a descriptive language that is easier to abstract, is the concept of compliance rules. So we're not only talking about having one common way to express concepts: the key terms (not only the values) need to be in the local language and need to support spaces, accents, etc. Compliance rules are, roughly, the idea of computing whether something is authorized or not, and what to do if it is not authorized. Compliance rules could also apply filters, or at least require the human to ask someone's permission to explain why some specific filter does not apply to a given case. But in a scenario where people trust a computer more than each other (or actually trust each other, but need some explanation to avoid breaking laws that would take weeks or months to get clearance for), if whoever approves feels safe about what is in a YAML compliance file, and there are people outside the organization who can attest to it, this at least reduces human error and eventually allows faster data exchange for more sensitive content.
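A minimal sketch of the "YAML describes what is on disk" inventory idea above, assuming nothing about the final HDP vocabulary — the key names (`hsilo`, `hdatasets`, `hfiles`) and file names here are illustrative only. The dict is what a YAML loader such as PyYAML would produce from a tiny hmeta.yml:

```python
# Hypothetical parsed hmeta.yml: the file declares what SHOULD exist on
# disk (like an Ansible inventory of datasets), and code walks it.
hmeta = {
    "hsilo": {"tag": ["demo"]},
    "hdatasets": [
        {"id": "population-br", "hfiles": ["population-br.csv"]},
        {"id": "population-mz", "hfiles": ["population-mz.csv"]},
    ],
}

def declared_files(inventory):
    """Return every file the inventory claims should exist on disk."""
    files = []
    for dataset in inventory.get("hdatasets", []):
        files.extend(dataset.get("hfiles", []))
    return files

print(declared_files(hmeta))
```

The point of the declarative direction is exactly this: the code only has to reconcile the declared state with the disk, instead of reverse-engineering a schema from whatever files happen to exist.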
…ble playbooks, but allowing several hmetas on same file
…ction at Model level to load the YAML
…ect; still need some extra abstraction
…fs already on core); HFile now starts to check is_available_locally
HFile is already able to reload files from remote sources (the first one that works will be downloaded, if there is not already a copy on disk). HRecipe already has a draft using the HXL-proxy recipes, but if we manage to make it work also with libhxl-python, the hmeta.yml project file can be used to play around with multiple JSON recipes. Before going to compliance rules, I think we need to abstract the JSON recipes. I'm not fully sure whether some features of HXL-proxy are HXL-proxy-only and not in libhxl. But one thing that is really necessary for compliance is some quick way to discover the headings of each file, since some compliance rules may need to allow/block (or at least require human review to force on the hmeta.yml that it is ok) based on the typical headings.
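The HFile behavior described above — check the local copy first, otherwise download from the first remote source that works — can be sketched with the standard library. This is a stand-in, not the real HFile class; the function names are assumptions (only `is_available_locally` is mentioned in the commits above):

```python
import pathlib
import urllib.request

def is_available_locally(path):
    """A local copy exists, so no network access is needed."""
    return pathlib.Path(path).is_file()

def fetch_first_working(path, sources):
    """Return the local path, downloading from the first reachable source
    only when there is no copy on disk yet."""
    if is_available_locally(path):
        return path
    for url in sources:
        try:
            urllib.request.urlretrieve(url, path)
            return path
        except OSError:
            continue  # this mirror failed; try the next one
    raise FileNotFoundError(f"No local copy and no reachable source for {path}")
```

Note that because the local check comes first, repeated runs never re-download, which also matters for the server-load concerns discussed later in this thread.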
Considering that the really complicated part will be the compliance rules (and compliance rules ideally should, by design, be strictly translatable even between languages, since some country/territory would take too much time to translate from the local language to a common one, and this would be unacceptable), I think the parts that matter should already be enforced. Declarative programming makes it easier for the humans who approve what is right and what is wrong, while offloading complexity to whoever actually implements the software. I will make some tests. 1: The (not implemented)
…source Identifier'/'Uniform_Resource_Name')
…)' actually seems the best term in English; maybe worth changing internal names
…t which crypto high-level library to use (and also about caring about developer usability
…se encrypted URNs (urnresolver #13) may require some generic adapter
…_init, HDP._safer_zone_hosts, HDP._safer_zone_list, HDP.export_schema_json(), HDP.export_yml()
**The new hdpcli as "offline-first" usage**

At this point there is no problem with this, but I think that by design it may be better to start "in offline mode" and either require an extra command from the user or interactively ask whether the user allows connecting to the host (at least if we detect an interactive session, not running as part of some script).

**Even with acceptable sandboxing-by-default (and an eventual way to grant that publicly shared HDP/URN files are signed for authenticity), there is still the privacy point**

For whoever reads this later: I'm not talking about the privacy of the humans referenced in the data managed by these tools, but of the humans who would use the command-line tools to automate tasks. As with anything that accesses the internet, if hdpcli/urnresolver is allowed to fetch data from the internet, whatever the host is, it can learn the requester's IP address.

**Allow offline (structured cache) also as a way to mitigate overloading remote servers with requests**

In particular, if the urnresolver becomes ok to use as a standalone CLI or as part of other libraries (and not just as an internal tool here), then depending on how CLI tools deal with caching, misbehaving tools could make a lot of requests just to learn the available URNs. (This is also why the URN index files are likely to allow simple text files, even if this means letting users encrypt just specific content and not caring that the file may be publicly accessible. This approach mitigates server load while still keeping some way to find content.) Things are even worse since YAML would be easy to use even on a local machine, which is often done with HXL-Proxy: people often work with large files and don't download them locally first. While these files may not be requested as often, they can be much larger than what HXL-Proxy allows by default (which is a lot! It can easily pass 500,000 rows of data).
So there are cases where a mix of allowing online and offline (or an organized local cache) is actually useful beyond the privacy part. In fact, this is the biggest reason.
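The offline-first policy above can be reduced to one decision function: go online only with an explicit opt-in, or with interactive consent; scripts and CI stay offline. Everything here (names, prompt text) is an illustrative assumption, not the real hdpcli interface:

```python
def may_go_online(allow_online=False, interactive=False, ask=None):
    """Offline by default: online only via explicit flag or interactive consent.

    allow_online -- the user passed an explicit opt-in (e.g. a CLI flag)
    interactive  -- we detected an interactive session (e.g. a TTY)
    ask          -- callable used to prompt the user; hypothetical hook
    """
    if allow_online:
        return True
    if interactive and ask is not None:
        answer = ask("Allow connecting to remote hosts? [y/N] ")
        return answer.strip().lower() == "y"
    return False  # non-interactive (script/automation): stay offline
```

Making the prompt a parameter rather than calling `input()` directly keeps the policy testable and lets library users replace it with their own consent flow.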
Just added the functionality of loading files by expected suffix per directory, and the HXL data processing specs (an array) already (as sooner or later I would expect) do not guarantee the same exact order as when everything is in a single file. Based on my experience with Ansible (medium to big projects, including creating Ansible Roles), I will try to optimize for medium to large projects rather than for average single-file usage. I do not have experience from the very early days of Ansible 1.0, so most of the mechanisms to run partial playbooks were very likely already there, but I think the idea of selecting a recipe by array index is so prone to go wrong that it should not even be in the documentation.

**Analogies to Ansible**

In some aspects the HDataset (and implicitly the potential result of each recipe) is similar to the Ansible inventory. The playbooks/tasks, I think, should be as abstracted as possible to avoid the user being disappointed by the order of execution; or at least we should try to delay the moment when the user has no option but to learn about the order of execution. The ideal would be that a user who is just consuming an already-working project is able to reproduce it, and things keep working even as the user merges more and more files.

**Tags to allow controlling selection (include/exclude by tag)?**

Ansible users rely on tags a lot, both to select what they want and, by tag, what they do not want (in fact, I personally overuse tags far beyond the average Ansible user, but ok). Here it may make sense to at bare minimum already have such tags (but I already know this is not sufficient, especially when reusing projects from others, and then hoping people keep the same tagging conventions; this alone is not good). URNs, if used, would allow exact selection, but this may reduce reusability of inner parts and could also force users to make decisions too soon.
But anyway, it could be a good idea, if the user does not explicitly create exact URNs, to implicitly create them based on context. Maybe implicitly use the 2-letter ISO country code for "localhost" if the user did not select anything or did not receive project files already well organized? One thing Ansible has for hosts is 'localhost', 'all' (which includes all hosts except localhost, because otherwise this could break things like installing/removing things on the user's own computer!) and 'ungrouped'. Maybe we could create another pseudo-concept that would be almost like tags, but instead of applying even to sub-items, would be required to be more "top level", like hsilo?
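The Ansible-style tag selection discussed above is simple to state precisely: keep an item if it carries at least one included tag and none of the excluded tags. The item shape (`{"id": ..., "tag": [...]}`) is an assumption for illustration, not the final HDP structure:

```python
def select_by_tags(items, include=None, exclude=None):
    """Filter inventory items by tag, Ansible-style.

    include -- if given, an item must carry at least one of these tags
    exclude -- if given, an item carrying any of these tags is dropped
    """
    chosen = []
    for item in items:
        tags = set(item.get("tag", []))
        if include and not tags & set(include):
            continue  # none of the wanted tags is present
        if exclude and tags & set(exclude):
            continue  # an unwanted tag is present
        chosen.append(item)
    return chosen
```

Exclusion winning over inclusion matches the thread's caution: when tagging conventions from different projects collide, it is safer to drop too much than to silently include an unwanted dataset.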
Another thing I'm considering is what to do if a user adds a tag directly in the scope of …

**About hsilos and how to "select by them" (Ansible uses the concept of host groups)**

I'm thinking about actually not forcing a single hsilo to have a unique name (like a unique ID), but tolerating (maybe strongly recommending, or at least making it easy enough that users tend toward it) that by labeling a silo with a group, that group makes every hsilo carrying it actually "part of the same silo". This would work somewhat like tags, but groups (with this approach) would only apply at the top level. Anyway, if the user really wanted an exact id for a file, they could simply create a very unique group name (or force a URN, which in this case could act like a prefix... but if we document using a unique URN as base, then this would break the concept of hsilos as a single silo spread over different files, hmmm...).

**End comments**

The JSON Schema (the file used to help validate YAML files in applications like VSCode) can actually help enforce what can or cannot be in each file. So whatever the implementation becomes, by using this helper the user can get feedback without waiting to run hdpcli and receive errors.
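The same kind of rule a JSON Schema would give the editor can also be enforced at load time, so hdpcli fails early with a useful message. This is a stdlib-only stand-in (a real implementation could validate against the actual schema with a package like `jsonschema`); the allowed key names are illustrative assumptions:

```python
# Hypothetical top-level vocabulary; the real one would come from the
# same JSON Schema file that VSCode uses for editor feedback.
ALLOWED_TOP_LEVEL = {"hsilo", "hdatasets", "hfiles", "hrecipes"}

def unknown_top_level_keys(document):
    """Return sorted top-level keys that the schema does not allow,
    so the CLI can report them instead of failing mysteriously later."""
    return sorted(set(document) - ALLOWED_TOP_LEVEL)
```

Keeping one schema as the single source of truth for both the editor and the CLI means the user sees the same complaint in both places, which is the feedback loop described above.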
…ady was ugly, it would be much worse with anything beyond ASCII; so let's force it by default for everyone
… explicitly use; something like 'urn:oo:hsilo:domain_base:container_base-container_item_index' where container_base often would be 'local' (localhost) and container_base often the filename itself
… --non-nomen, --non-tag, --non-urn, --verum-adm0, --verum-grupum, --verum-tag, --verum-urn
…f the YAML files, will mention the '[meta] HDP Declarative Programming (working draft) #16'
Just created the issue [meta] HDP Declarative Programming (working draft) #16. Considering what could be production-ready in the short to medium term, even if an abstraction using only YAML with this Domain Specific Language, HDP, is not as powerful as plugins written directly in Python, it may be more realistic than requiring people not only to start using HXL in these contexts, but also to allocate individuals able to implement it in Python without being scared that the data themselves are very sensitive.

**1. Part of the auditing functionality could be moved to filters instead of requiring custom Python code (needs testing)**

There are some extra points (they are still relevant when someone overrides a default behavior with some YAML files), but this would be even more essential with plugins in plain Python: the code would have to be even more strictly audited than if at least the most common features were already possible using a DSL-like language. If we manage to draft reusable HXL data processing specs in YAML that can be challenged with testing data (even if such tests would not need to become public), this could help spot common errors. "Errors" like a customized rule letting private data pass could be ignored if the human is aware of them and able to check that the authorization allows it. Note that this type of test is not applicable to all types of data sharing. But in cases where more explicit restrictions exist, it could be used.

**2. Considering the idea that files which mention datasets by default don't require them in the same folder**

Weeks ago one screenshot had a way to express datasets inside the current folder. If the urnresolver (Uniform Resource Names - URN Resolver #13), plus conventions on how to represent a URN in some base folder on local disk, becomes viable, HDP-like instructions would never store the data themselves where the HDP files are.
This approach could both solve the problem of storing files on separate disk partitions (or maybe on S3-like storage) while the metadata files are handled differently.

**3. Avoiding defining new keywords to define terms by... simply having translations for every language people care to use (or allowing someone trusted to provide a file that adds missing terms)**

One very hard decision I discovered when planning, for example, the best hashtags to use when sharing datasets with @HXL-CPLP is that it is sometimes hard to find something that is more universal. With HDP, the drafted idea is that in addition to the internal terms (which, if using Latin script, are... Latin), there is one canonical term to translate to when converting to a known language, and, when converting from such a language, some extra aliases can be understood. To simplify translation, most HDP keywords are single, somewhat primitive words. In special cases, for macrolanguages (both Arabic and Chinese), this means I already know some terms are impossible and the variants will eventually need to be implemented. But at least we already start with something that allows localization! As complicated as it may sound to tolerate such a level of localization, considering the time needed "to fix" things, this seems easier to fix permanently than the alternatives. Also, the fact that the terms can be in people's native language simplifies documentation a lot.

**End comments**

The three points above may give an idea of why the ideal of full "Declarative Programming" (here, an abstraction over how things are really done) can actually be harder to implement, but may be less hard than the alternatives. At bare minimum, it provides some level of sandboxing compared to allowing full Python. Also, the extra requirements/restrictions may help to implement early what is viable to use in production.
And, in the context of HXL, the idea of, for example, allowing localized terms to express commands is actually feasible for a programming interface (like Excel formulas, for example, which are often in the person's native language), but is not feasible when deciding good reusable hashtags for datasets. I mean: the number of possible keywords in a programming interface is controlled. That's it!
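The "canonical internal term plus per-language aliases" idea above can be sketched as a lookup table plus a key-rewriting pass. The Latin-script internal terms and the Portuguese aliases here are invented examples for illustration, not the real HDP vocabulary:

```python
# Hypothetical alias tables: localized key -> canonical internal term.
# A trusted party could ship an extra table to add a missing language.
ALIASES = {
    "por": {"conjunto-de-dados": "hdatasets", "arquivos": "hfiles"},
    "lat": {},  # internal terms are already the canonical Latin-script ones
}

def to_internal(document, language):
    """Rewrite top-level keys from a localized language into internal terms.

    Unknown keys pass through unchanged, so validation (not translation)
    is the layer that rejects them.
    """
    table = ALIASES.get(language, {})
    return {table.get(key, key): value for key, value in document.items()}
```

Because the set of keywords is small and controlled (the point made just above), maintaining these tables per language is tractable in a way that localizing free-form dataset hashtags is not.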
This issue will be used to reference commits from this repository and others.
TODO: add more context.
Update 1 (2021-03-01):
Ok. I liked the idea of YAML-like projects!!! But it may be easier to build the full thing than to explain it upfront. (I'm obviously biased because of Ansible, but ok; anyway, I know it is possible to even implement something like testinfra; but it would be easier to create an "Ansible for datasets + (automated) compliance" than to reuse Ansible.)
Also, YAML, unlike JSON, is much more human-friendly (for example: it allows comments!), so this can somewhat help.
Being practical, at this moment I think it will mostly be a wrapper over libraries and APIs that already exist (aka syntactic sugar, not really new features). But as soon as the building blocks are ready, the YAML projects themselves become powerful!