Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow "[]" to return to global scope (for adding sibling array after a table) #456

Closed
TedDriggs opened this issue Mar 22, 2017 · 52 comments
Closed

Comments

@TedDriggs
Copy link

I'm trying to model this Rust struct in TOML:

#[derive(Debug, Clone, PartialEq, Deserialize)]
pub struct WhitelistConfig<T: DeviceIdentifier> {
    appliance: Appliance,
    rules: Vec<Rule>,
    
    #[serde(default)]
    ignore: PhantomData<T>
}

When writing it out in TOML, I believe the natural way to read/write it would be:

  1. Provide information about the appliance this config file connects to
  2. Provide the list of rules (which could be long)

That would look something like...

[appliance]
host = "teddriggs-dev"
api_key = "MY-API-KEY"

# this isn't valid right now; it will consider the field to be part of `appliance`
rules = [
    { tag = "whitelist" },
    { group_id = 1 }
]

One approach I thought of that could address this would be allowing [] to appear once to return to global scope after entering a table. It would be invalid to use [] if the document started with key-value pairs, avoiding the splitting of a table definition.

This might be too narrow a case, but I think the pattern of 'metadata + a list' seems likely to occur somewhat frequently. This may also dovetail with #309; shorter syntax for arrays-of-tables would make me less concerned about teaching users the [[rules]] construct.

@lmna
Copy link

lmna commented Mar 23, 2017

  1. Empty brackets naturally correspond to an empty key. Special treatment of empty brackets increases overall language complexity, a tiny step from being obvious towards being cryptic.
  2. In general, random switching between tables and/or global scope makes configuration file less friendly to a human reader.
  3. This looks much less json-ish and much more ini-sh:
    [appliance]
    host = "teddriggs-dev"
    api_key = "MY-API-KEY"
    
    [[rules]]
    tag = "whitelist"
    
    [[rules]]
    group_id = 1

@guai
Copy link

guai commented Mar 30, 2017

Isn't empty quotes correspond to an empty key, and empty brackets disallowed?

@lmna
Copy link

lmna commented Mar 30, 2017

@guai, empty quotes correspond to an empty quoted key. Quoted things are allowed inside brackets. So i believe that this is valid today:

[""]
foo = "bar"

@guai
Copy link

guai commented Mar 30, 2017

I see

[]     # INVALID

in readme. So empty brackets already a special case, but it just disallowed and have no special meaning. I think @TedDriggs right, it is a good candidate for marking return to toplevel
We can define child map before parent unless parent already filled with values and kinda 'closed'. This allowed:

[a.b]
c = 1
[a]
d = 2

If we widen this rule on toplevel we'll get something like this:

[b]
c = 1
[]
d = 2

@guai
Copy link

guai commented Mar 30, 2017

I think this is quite intuitive and will allow to remove this special case handling in implementation code. Which is always a good thing

@lmna
Copy link

lmna commented Mar 30, 2017

I see

[]     # INVALID

in readme. So empty brackets already a special case

Not special, this case is exactly the same:

= "no key name"  # INVALID

@lmna
Copy link

lmna commented Mar 30, 2017

If you allow

[]

than what is your intuition for

[a..b]

and for

[a.]

?

@guai
Copy link

guai commented Mar 30, 2017

Empty strings maybe? I don't insist this to be allowed cause it much less usefull.
I'm writing my first toml doc and I already hit this strangeness that I can write child tables before parent's key-value pairs but not on top level. Why?

@lmna
Copy link

lmna commented Mar 30, 2017

Consider Tom`s comment in #419:

You can use a dot-separated key-path to unambiguously denote a value:

[a.b.c]
key = "val"
a.b.c.key #=> "val"

In general:

path = table-name + "." + key-name

In particular, for empty brackets:

[]
key = "val"
.key #=> "val"

To get "key" instead of ".key", you have to introduce a special logic.

I can write child tables before parent's key-value pairs but not on top level. Why?

Because there is no natural way to return to a global scope.

@lmna
Copy link

lmna commented Mar 30, 2017

Top level in toml is special in many ways. For instance, top level itself is doomed to be a table, while any value below is allowed to be an array, a table, or an immediate string/number/etc. In contrast, top-level "JSON value can be an object, array, number, string, true, false, or null".

So we have a complexity trade-off here. Introduce special treatment of empty brackets for sake of making top-level less special in one particular aspect: "I can write child tables before parent's key-value pairs but not on top level".

@guai
Copy link

guai commented Mar 30, 2017

.key for an immediate child of top level map is wrong
then for the first example we would have .a.b.c.key
there already is this logic somewhere
btw why should I even care about implementation? I concerned more about how easy it would it be to use toml

@guai
Copy link

guai commented Mar 31, 2017

And its not path = table-name + "." + key-name
its path = (parent.pathList + [key]).joinWithDot()
where for empty parent.pathList we'll get just [key]

@TheElectronWill
Copy link
Contributor

TheElectronWill commented Mar 31, 2017

@guai You should care about the parser's implementation because, from the spec: "TOML should be easy to parse into data structures in a wide variety of languages"

But in that case, making [] return to global scope should not be too difficult. I don't know how every Toml parser is made, but mine already treats [] as a very special case and throws an error, so it could instead treat it as something valid and merge the table's content under it with the root table 👍

I'm not for changing the meaning of things like [a..b] or [.] or .key : keep them invalid.

@guai
Copy link

guai commented Mar 31, 2017

@TheElectronWill, it also states, that it is obvious. For me, its obvious, that applying the same rule everywhere is simpler and better.
The rule is: brackets matched - change current path, then key-value pairs matched - add them at current path. Unless current map already closed, which happens when we switch from the map with at least one primitive inside to another path. Empty path means root
I'm not the fan of these guys [a..b] or [.] or .key too.

@TedDriggs
Copy link
Author

I'm not for changing the meaning of things like [a..b] or [.] or .key : keep them invalid.

100% agree; the original proposal deliberately was limited to the "return-to-global" behavior for that reason.

@fspm
Copy link

fspm commented Apr 7, 2017

I'm all for improving the representation for top-level arrays... the repeating [[arrayname]] syntax is definitely less ergonomic than most other elements of the spec. That being said, I don't think that this proposition is the way to go. I prefer toml files specifically because the language is very intuitive. I can throw a toml file at a newbie (or non) programmer, and they can immediately comprehend it.

I don't think that [ ] is expressive or intuitive enough given that human readability is the primary selling point of the toml spec (at least for me). Declarations like [tablename] and [table.sub-table] are immediately obvious because they simultaneously reflect structure and meaning, even to the uninitiated. I use toml files extensively, and I would never guess that [ ] meant "return to the top level namespace and treat the next key as an array instead".

@guai
Copy link

guai commented Apr 7, 2017

@fspm, the proposition is not "return to the top level namespace and treat the next key as an array instead" but "return to toplevel and treat the next key-value as immediate primitive in toplevel map". We can do it for maps at levels > 0. For levels > 0 we can first describe child maps/arrays and then immediate primitives, but not on top level, where we must declare primitives first right now.
Your comment lead me to the thought that we could use [[]] to describe toplevel array. Which makes toml more close to json/yaml

@BurntSushi
Copy link
Member

I'm not a fan of the "return to global" idea at all. A nice property of TOML is that it's very obvious which table a key belongs to based on where it's defined. If you want global keys, then you know to always look at the top of the file.

Things like [] or [[]] are completely opaque.

@guai
Copy link

guai commented Apr 7, 2017

you know to always look at the top of the file

@BurntSushi, but that prevent expressing some domain-specific ordering. Like important things on top, less important - lower. And most important things not required to be toplevel

@BurntSushi
Copy link
Member

BurntSushi commented Apr 7, 2017

@guai Then don't use the global namespace for non-table keys.

@guai
Copy link

guai commented Apr 7, 2017

@BurntSushi, its acceptable when I design the data structure from scratch, but not when I need to express somethins existing in toml

@BurntSushi
Copy link
Member

Yes, there are the trade offs. The downside of not being able to arbitrarily inject things into the global scope is that you can't order the non-table keys in the global scope in an arbitrary order. I think we can live with that downside. I think the cost of not being able to specify an arbitrary order is quite a bit smaller than the cost of making removing a very simple property of TOML: it's always obvious where a key is defined.

@guai
Copy link

guai commented Apr 7, 2017

@BurntSushi, it makes toml less usefull for me. I'm trying to migrate xml configuration files of an existing
large project to toml.
I do not want to change pathes all over the code and I do not want any reordering, cause people got used to current order. This order have some meaning itself.
Wasn't this supposed to be the main reason, why toml even exist - to have simple and nonambiguous format for configs?
The possibility of drop-in replacement would make it much easier to adapt

@guai
Copy link

guai commented Apr 10, 2017

@BurntSushi, what trade-off are you talking about? I see no gains here, only losses:
less symmetry, more exceptions, more complicated rules, less expressiveness.
The only kinda gain is "we should look both before and after child sections everywhere but toplevel, on toplevel we only should look on top". Not much of a gain imho

@BurntSushi
Copy link
Member

I don't understand. Please use examples.

@TheElectronWill
Copy link
Contributor

As [] isn't obvious, I think that if you're in a situation where you need to do this, you could use a table named "global" (or whatever makes sense), and put its content where you need it programmatically.

In the example given by @TedDriggs, a "parameters" table makes sense for me:

[appliance]
# ...

[parameters]
rules = ...

Even if it's less straightforward to map this to the data structure you mentioned, it shouldn't be impossible with a good library.

@guai
Copy link

guai commented Apr 11, 2017

@TheElectronWill but in that case we'll have to prefix every path all over the toml document

@guai
Copy link

guai commented Apr 11, 2017

BTW, why "[] isn't obvious"?
What another meaning it can have in this context? I see none. Don't treat users as complete idiots.

@guai
Copy link

guai commented Apr 11, 2017

@BurntSushi, when I was talking about less symmetry and more exceptions I meant this examples

[a.b]
c = 1
[a]
d = 2
[b]
c = 1
[]
d = 2

I can add prefix to nest some section into another one, and I can remove this prefix. Except when its on top level. In this case I need to reorder/reorganize/refactor, change code etc.
One can first define children and then primitives except toplevel.
We can have arrays except toplevel one.
And so on.
And to mention expressiveness: it turns out I cannot express any data stricture just as I want. I have no toplevel array. I can't use the order I want.
I need to take this all into account.
Whereas in json or yaml I can. Toml could be a better and more readable drop-in replacement for this heavy(yaml) and less readable(json) formats in case if we give the green light to this issue.
And then one can decide, if he need this feature in his own document

@guai
Copy link

guai commented Apr 11, 2017

And if you guys afraid that users won't get it, then we can use comments

[a.b]
c=1
[] # toplevel primitive start here
foo = "bar"

I think in most cases users don't write toml, they change existing one or copy-paste it and then just change values they know the meaning of

@lmna
Copy link

lmna commented Apr 11, 2017

And to mention expressiveness: it turns out I cannot express any data stricture just as I want. I have no toplevel array. I can't use the order I want.
Toml could be a better and more readable drop-in replacement for .... json ... in case if we give the green light to this issue.

Toml is less expressive than json in many more ways. Toml imposes restrictions on type of items in array. Toml has no null. Things like nested arrays of tables are hard to read.

@BurntSushi
Copy link
Member

Toml could be a better and more readable drop-in replacement for this heavy(yaml) and less readable(json) formats in case if we give the green light to this issue.

TOML isn't a replacement for JSON or YAML. It's a configuration file format.

@guai
Copy link

guai commented Apr 11, 2017

It's a configuration file format.

And I could not just convert my existing xml configuration files into toml without extra efford
Not a good property for a format, made espesially for this purpose.

@mojombo
Copy link
Member

mojombo commented Apr 11, 2017

I think the biggest problem here is that this proposal would violate the "write once" aspect of tables. The root level is a table. It can be written to at the beginning of each TOML document. If we also allow writing to it via [] then we are allowing the re-opening of an existing table. Unless we only allow writing to [] if there are no global entries at the top of the file, which seems fragile and complex.

@guai
Copy link

guai commented Apr 11, 2017

@mojombo, I propose to use the same rules on any level.
Child map can be introduced if map at current path have no primitives yet
We still must group primitives together, but the restriction that primitives on toplevel must appear before any child goes away.

@ChristianSi
Copy link
Contributor

ChristianSi commented Apr 12, 2017

@guai:

We still must group primitives together, but the restriction that primitives on toplevel must appear before any child goes away.

So, if a [] follows anywhere in a long file, I'm NOT allowed to write global keys at the top of the file (they must go below the []), but if it doesn't, I am? This is somewhat logical, but it would also be utterly confusing and counter-intuitive. People would get surprising parse errors ("Global table re-opened") that appear much farther down below in the file than the place they edited (the top) and would accordingly be hard to understand and resolve. No, that's not gonna fly.

It also contradicts the original proposal:

One approach I thought of that could address this would be allowing [] to appear once to return to global scope after entering a table. [emphasis added]

But the original proposal won't fly either, for the reason stated by @mojombo: It would make the top-level the only one whose entries could be spread over two places instead of being all grouped together.

This issue has been discussed long enough and it's clearly going nowhere, I think. Time to close it?

@guai
Copy link

guai commented Apr 12, 2017

@ChristianSi, then why is it ok to get "table re-opened" error on levels > 0?

@TheElectronWill
Copy link
Contributor

TheElectronWill commented Apr 12, 2017

As @guai pointed out, you already get errors when you reopen a table, which could be defined far away from the user edit.

The meaning of [] isn't obvious, but I realized that the syntax of the array of tables... isn't more obvious. At least we know that [] probably declares a table.

Allowing [] to open - but not to reopen - the global scope makes much more sense than the original proposal, and is quite simple for the parser. It allows flexibility, at the cost of introducing a new syntax (actually it's more a special case of the table syntax). Therefore I'm not against it.

@ChristianSi
Copy link
Contributor

@guai:

@ChristianSi, then why is it ok to get "table re-opened" error on levels > 0?

Well, if you add [tablename], disregarding the fact that [tablename] was already defined elsewhere, you shouldn't be totally surprised if you get an error. But if you just add a key-value pair (at the top), either not knowing that there is [] elsewhere or not knowing what it means – that's a quite different matter and shouldn't cause an error.

So defining a [] syntax with such a "spooky action-at-a-distance" effect (forbidding key-value pairs before the first explicitly named table) violates the principle of least surprise and hence shouldn't make it into the spec.

@guai
Copy link

guai commented Apr 13, 2017

@ChristianSi, this have nothing in common with principle of least surprise. Principle of least surprise means that if user can be already familiar with something with this syntax, we should not change its meaning.
What experience would be reused here?
None
Its impossible to avoid any learning for everyone.
Why even one assume, that he can create new pairs at the top if he know nothing about toml?
If user does not know what [] mean he will learn it from the first error message stating that key-value pairs should be grouped together
If someone does not know toml well then I think he will just edit existing examples and nothing more
And I think user will place new pair along with other pairs he knows the meaning of anyway

@sgarciac
Copy link
Contributor

sgarciac commented May 11, 2017

What experience would be reused here?
None

The experience of reading most documents ever written: you don't get to "re-open" a section after a new section has started. No one expects to read something like this:


An article about configuration files

Introduction

blah blah blah...

Methods and Data

blah blah blah ..

Introduction

eh?
....


@guai
Copy link

guai commented May 11, 2017

@sgarciac, I never suggested to make sections reopenable, I suggested to make toplevel movable.
Where did the assumption that toplevel is always "the most useful information" came from? In my case toplevel is uncategorized crap which I want to be at the bottom.
And since I want to adapt toml into existing large project, I do not want to add prefix to pathes all over the code. But I do want toml to be an easy drop-in replacement for any configuration library.

@ChristianSi
Copy link
Contributor

ChristianSi commented May 12, 2017

@guai:

In my case toplevel is uncategorized crap which I want to be at the bottom.
And since I want to adapt toml into existing large project, I do not want to add prefix to pathes all over the code.

You don't have to add a prefix to every path. Just put your "uncategorized crap" into a table called "General", "Unsorted", "Crap", or whatever, and leave everything else unchanged. You don't even have to change any of your data structures to do so, it's just a tiny change in the way they are mapped to and from TOML.

And should you answer "I cannot do that", I will reply with: "Of course you can!"

@guai
Copy link

guai commented May 12, 2017

@ChristianSi, good idea, but it is still a workaround
You must first find out somewhere what substitution path was hardcoded as a replacement for toplevel for that exact file.
Where will you find it out?
People say that [] is not obvious, but at least it can easily be taught from the manual or from error message.
Not quite obvious feature vs complete lack of information.

@a-teammate
Copy link

To summarize I would say the different standpoints are that:

  • Developers using the format should have it easy to do that
    vs
  • Users using the format should have it easy to do that

I understand that people would like to do as little changes as possible to their codebase to support sth.
This particular change unfortunately seems to have no compromise to not strictly violate the user-orientated nature of TOML.

So why is this still open?

@a-teammate a-teammate mentioned this issue Oct 14, 2017
26 tasks
@pradyunsg
Copy link
Member

Indeed. This can be closed.

@guai
Copy link

guai commented Nov 14, 2017

Indeed. This can be closed.

Just another task about this to appear. This is fourth one already, I believe.

@pradyunsg
Copy link
Member

@guai Sorry but I can't understand what you have typed.

@guai
Copy link

guai commented Nov 14, 2017

@pradyunsg, I mean once this task will be closed another one will apper about that same thing. Because that is what some people (me included) see as most obvious.

@a-teammate
Copy link

a-teammate commented Nov 17, 2017

If we had a wiki here, we could add a FAQ section.

But actually that is a general problem with suggestions, they tend to be redundant.
Leaving it open to make it more obvious to people is not the answer.
(A real fix should probably done on GitHub side)

@pradyunsg
Copy link
Member

@guai Oh, okay. That makes sense.

I guess it would be a good idea to link to the existing ones from here?

@mojombo
Copy link
Member

mojombo commented Nov 23, 2017

@TedDriggs @guai Thanks for your impassioned thoughts on this matter, these discussions are always helpful in thinking through these edge cases. For closure on this, I think the proposal is too non-obvious to be included. In TOML, root keys come first, and thus carry some significance because of that. In addition, TOML need not be a perfect replacement for existing config formats. I know it's inconvenient to have to rework some code to incorporate a switch to TOML, but maybe it could be a good opportunity to rethink the config layout. Is it the best idea to have a bunch of "crap" settings uncategorized? Could they fit elsewhere? At the very least, you can communicate their non-importance by bundling them into a [misc] table.

Regarding questions that keep coming up, I'm 100% ok with that. If people constantly bring things up, it means there is an opportunity to clarify the spec or reconsider past decisions based on new evidence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests