Clarify well-structured control flow #475

kripken · 2015-11-23T04:05:22Z

I think it's helpful to add some clarification to what we mean by "well-structured", since that term - and similarly "irreducible control flow" and so forth - are not universally familiar, and actually have some different definitions (e.g. they can be relative to specific constructs).

In addition, this added text provides an intuition to help people more familiar with high-level languages understand our control flow. With this addition, I hope the text will be more accessible to a wider range of compiler hackers. In particular, it could help the large community of hackers currently compiling down to JavaScript, whom we would love to see compiling to WebAssembly eventually.

I believe this would also address most of my concerns on the break/branch topic (#445).

titzer · 2015-11-23T09:42:57Z

AstSemantics.md

+from outside it. This restriction ensures all control flow graphs are well-structured
+in the exact sense as in high-level languages like Java and JavaScript. To
+further see the parallel, note that a `br` to a `block`'s label is functionally
+equivalent to a labeled `break` in high-level languages, that is, a `br` on a


"in high-level languages, thus a br simply breaks out of a block"

wdyt?

I'm not quite sure which part of the sentence you mean to change to that; I can see more than one way to do it. I pushed an update which rewords part of it (which I agree makes it shorter and clearer). How is it now?

titzer · 2015-11-23T09:43:11Z

lgtm other than minor wordsmithing on the last sentence.

titzer · 2015-11-23T20:07:27Z

AstSemantics.md

+from outside it. This restriction ensures all control flow graphs are well-structured
+in the exact sense as in high-level languages like Java and JavaScript. To
+further see the parallel, note that a `br` to a `block`'s label is functionally
+equivalent to a labeled `break` in high-level languages, that is, a `br` simply


This might read a little smoother if we change ", that is, " to "in that". That avoids the comma pauses. Otherwise lgtm.

Cool, done.
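
To make the parallel in the quoted text concrete, here is a minimal JavaScript sketch of a labeled `break` leaving a labeled block, which is the behavior the text likens to a `br` targeting a `block`'s label. The function name `run` and the `trace` array are hypothetical, chosen only for illustration; this is not wasm syntax and not taken from AstSemantics.md.

```javascript
// Hypothetical example: `outer` labels a plain block, not a loop.
function run(skip) {
  const trace = [];
  outer: {
    trace.push("enter");
    if (skip) break outer; // analogous to a `br` to the block's label: jump to the block's end
    trace.push("body");    // skipped when the break is taken
  }
  trace.push("after");     // execution always resumes here, just past the block
  return trace;
}

console.log(run(true));  // [ 'enter', 'after' ]
console.log(run(false)); // [ 'enter', 'body', 'after' ]
```

Note that the break can only leave a block it is nested inside; nothing here can jump into `outer` from elsewhere.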

jfbastien · 2015-11-23T22:46:43Z

lgtm

While we're clarifying, would you want to add examples of what's not possible? Or do you think that's too much? Maybe such a deep dive should be in Rationale.md?

kripken · 2015-11-23T23:05:15Z

Rationale might be a good place to go into more detail, yeah.

But it's actually not easy! :) Since what's not possible is jumping into a loop, but really, even that is possible if you use a control flow threading variable, so what we really mean is "jump into a loop without additional overhead" - but even that isn't strictly true, since the threading variable would likely be optimized out, as was mentioned back when we discussed this in more detail. So we kind of mean "not possible to jump into a loop using just br, loop and block", which is almost true, because it is actually possible to jump into the middle of the loop using those constructs, only you would be jumping over code at the beginning of the loop that is never reachable anyhow (unconditional branch at the top into the middle). Which again brings us back to "without additional overhead", but really, the entire argument here is that we can add blocks and brs to achieve certain control flow (creating stacks in some cases, and potential overhead), so we are actually introducing overhead to achieve this anyhow. So perhaps we can say "you can't jump into the middle of a loop using just br, loop and block, where 'middle of a loop' ignores dead code", but how about if we split the loop, in which case (by adding nodes) we can actually logically enter the middle...

Sorry for the lengthy paragraph, I've actually been trying to find a good way to say this - just like you, re-reading this section made me want to add something. But I can't find a way that is not too detailed while remaining correct.

I think it might actually be best to stay at the more intuitive level, which is what we'll have after this pull: (1) we mention that entering the middle of a loop is not possible, which indeed in some sense it isn't, and (2) we mention the connection to high-level languages, since our limitation is completely identical to theirs, so all readers should understand what we mean easily.
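
The "control flow threading variable" workaround mentioned above can be sketched in JavaScript roughly as follows. This is an assumption-laden illustration, not the lowering any particular compiler uses: the two-node cycle A -> B -> A, the function name `runLoop`, and the `limit` parameter are all hypothetical.

```javascript
// Hypothetical irreducible graph: a cycle A -> B -> A that may be entered at A or at B.
// Structured code cannot jump straight to B, so a helper variable selects the next node.
function runLoop(entry, limit) {
  const trace = [];
  let state = entry; // the threading variable: which node to run next
  let visitsToB = 0;
  while (true) {
    if (state === "A") {
      trace.push("A");
      state = "B";                     // edge A -> B
    } else {
      trace.push("B");
      visitsToB++;
      if (visitsToB >= limit) break;   // exit edge out of the cycle
      state = "A";                     // edge B -> A
    }
  }
  return trace;
}

console.log(runLoop("A", 2)); // [ 'A', 'B', 'A', 'B' ]
console.log(runLoop("B", 2)); // [ 'B', 'A', 'B' ]
```

The loop itself is always entered at the top; the second entry point exists only logically, through the initial value of `state`. That dispatch is the "additional overhead" the paragraph above refers to.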

jfbastien · 2015-11-23T23:18:42Z

I'd be fine if tricky things like this were succinct in the main design, with a link to Rationale.md for more details. It sounds like what you're suggesting?

p.s.: I'm amused at jumping into the middle of Rationale.md from another part of the design. It's meta.

kripken · 2015-11-23T23:21:15Z

Basically I'm saying I don't know how to write this in the Rationale without going into that entire huge paragraph :)

But if you think details like that make sense for Rationale, happy to add them, plus a link from here.

jfbastien · 2015-11-23T23:53:35Z

Yes, I think it would be helpful: the current document isn't clear on how we got to the well-structured control flow design we have. The few of us involved in getting there have the context, but external folks don't.

kripken · 2015-11-23T23:54:19Z

Cool, will start a followup pull request for that now.

sunfishcode · 2015-11-23T23:57:57Z

I propose AstSemantics.md just say "can't jump into the middle of a loop" and mean it in the literal sense, rather than trying to mean it in the sense which includes all things which are semantically equivalent to it :-). If we want to write a separate compiler-writers guide, that'd be a good place to explain the various options for lowering a loop with multiple entries.

Also, ironically, the original text here was intended to try to ease the fears of compiler writers who aren't aware of the full power of labeled break, for whom emphasizing that "it's just like JS" is actually more confusing than enlightening.

kripken · 2015-11-24T00:20:23Z

Created followup pull in #479.

@sunfishcode: Is your proposed change in the last comment for this pull, or for the followup?

sunfishcode · 2015-11-24T01:06:31Z

I was addressing this patch. I was mainly agreeing that the larger paragraph above should go somewhere else besides AstSemantics.md :-).

And concerning my other comment, from my perspective, drawing parallels to high-level language constructs in AstSemantics.md is a distraction, but I realize that perspectives will vary.

kripken · 2015-11-24T01:23:04Z

Cool, yes, the content in that big paragraph is intended for Rationale.md, as suggested by @jfbastien. It's in #479 if you want to take a look. I think I got it better than that rambling massive paragraph here ;) but it's still not easy to summarize this stuff.

I get the point that more intuitions might be a distraction for some readers. But I think those readers likely already would understand the topic, from the rest of the spec? Whereas the parallel to high-level languages would help a large class of other compiler hackers.

kripken · 2015-11-25T01:48:21Z

Waiting for feedback from @sunfishcode.

sunfishcode · 2015-11-29T23:08:55Z

My feedback is that I personally think it's more confusing than enlightening. My impression talking to even some people who know JS well is that it's not widely known just how theoretically powerful labeled break is, because its full power is almost never used in hand-written code (for good reason, to be sure), so drawing a parallel to JS doesn't seem to convey the right idea.

kripken · 2015-11-30T02:18:54Z

Yes, I agree not all JavaScript devs know about labeled break. Certainly many casual devs might not. But still, a significant number of JavaScript (and Java) developers do; in particular, the ones writing compilers to and from JavaScript (and Java) are very likely to. And as discussed earlier, I think that's a very important audience for us.

ghost · 2015-11-30T02:54:01Z

Perhaps some of the rationale could be explained by the goal to optimize parsing and analysis and even runtime performance for simpler consumers?

For example: 'Control structures that lead to more efficient parsing and control flow analysis are clearly identified and separated from those needed to support general control flow. While the general control flow operators could be used to specify all control flow, implementations would be expected to parse them more slowly and might not optimize them well, so runtime performance may also be slower.'

sunfishcode · 2015-11-30T03:32:05Z

@kripken Many people know how to exit from an inner loop of a nest using labeled break. However, many people I've talked to recently were not aware that labeled breaks can build arbitrary control-flow DAGs (if we treat `if (x) break L` as a single operation, etc.).
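
As a small illustration of that point, the following JavaScript sketch encodes one DAG that is not a plain if/else diamond, using only nested labeled blocks and conditional breaks, with no helper variable. It shows a single hand-picked graph rather than a general construction; the node names, the labels `toC` and `toD`, and the function `runDag` are hypothetical.

```javascript
// Hypothetical DAG with edges: A -> B, A -> C, B -> C, B -> D, C -> D.
function runDag(aToC, bToD) {
  const trace = [];
  toD: {
    toC: {
      trace.push("A");
      if (aToC) break toC; // edge A -> C (skips B)
      trace.push("B");
      if (bToD) break toD; // edge B -> D (skips C)
      // falling out of the inner block is edge B -> C
    }
    trace.push("C");
    // falling out of the outer block is edge C -> D
  }
  trace.push("D");
  return trace;
}

console.log(runDag(true, false));  // [ 'A', 'C', 'D' ]
console.log(runDag(false, true));  // [ 'A', 'B', 'D' ]
console.log(runDag(false, false)); // [ 'A', 'B', 'C', 'D' ]
```

Each forward edge is either a fall-through or a `break` to the end of an enclosing block, so every node's code appears exactly once.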

kripken · 2015-11-30T04:21:48Z

Yes, it sounds like those people are in an intermediary stage between not knowing about labeled break, and fully grokking all the theoretical implications of what it implies. But the second part of the sentence confuses me?

We aren't talking about the ability to create any DAG, we are talking about what is possible to do without a helper variable, i.e., what is directly possible with labeled break - precisely what someone writing a compiler into JS would likely be aware of.

The pull mentions an exact functional equivalence between labeled break and something else. This is therefore helpful intuition to anyone that knows how labeled break works, period, even if they don't have a deep understanding of the theoretical implications of that. (Of course, such a deep understanding could help them even more.)

I am quite curious to hear more about those people and their background, and what would help them better understand things. But this is starting to sound somewhat philosophical, and this pull request is just a small no-semantic-changes text clarification with several lgtms and no other objections - is it ok if I merge it (with your reservations as already noted, of course), and we'll continue the discussion separately?

sunfishcode · 2015-12-01T15:31:54Z

The original text here is talking about the ability to create any DAG, because that's one of the things that is possible to do without a helper variable. That's one of the things the original text is trying to point out, specifically to (briefly) assuage fears that wasm's structured control flow is too restricted by high-level language sensibilities. Immediately following this with a sentence likening wasm to high-level languages weakens what the original text is trying to convey.

Another concern is that WebAssembly is a low-level language in general, but its AST structure has already led some people to think of it in high-level language terms in other areas, and it isn't a very good high-level language. The more we encourage thinking about WebAssembly literally in terms of JS or "high level languages" in the spec, the more we risk diluting wasm with conflicting purposes, potentially representing no purpose well.

kripken · 2015-12-01T20:54:44Z

I certainly don't want to weaken what the original text is trying to convey. But I don't see how it does - looks the opposite to me - so I don't have any idea how to fix it. Do you have a concrete suggestion for how I can improve this pull? Happy to iterate on that with you.

Regarding the second point, the addition here uses a specific analogy (and a 100% precise one) to explain a specific feature. It's not saying "wasm is high-level". But, if you feel we should clarify that wasm is low-level (I don't think we need to, but also I don't see the harm) then perhaps draft a separate pull request with that addition (for the FAQ maybe?) instead of opposing this one on a side issue? It also sounds like that side issue is a very big deal for you, so let's address it seriously and with the proper focus, on its own?

rossberg · 2015-12-02T14:20:30Z

I, for one, do think this PR provides a useful clarification.

sunfishcode · 2015-12-02T15:19:24Z

I have seen this list mentioned a few times around this project as something we should focus on. I think this says a lot about perspectives. I propose that a better list to think about is this list.

AstSemantics does not define or explain itself in terms of JS, and I think this is an important invariant, to encourage us to think of WebAssembly as a new language.

sunfishcode · 2015-12-03T21:47:43Z

I have been convinced to stop opposing this PR.

One comment I would add then is that labeled break is also a feature of somewhat less high-level languages such as Rust.

kripken · 2015-12-03T22:37:07Z

Sounds good, I added Rust and Go, which you found have labeled break as well.

Any other languages worth mentioning?

jfbastien · 2015-12-04T00:01:36Z

The more we add, the more this should be in Rationale.md. AstSemantics.md is for "this is what things are", whereas Rationale.md is for "and here's why it's this way and what that implies". Maybe move all of this text (and the preceding paragraph) to Rationale.md?

ghost · 2015-12-04T00:27:22Z

@jfbastien Personally I like to see rationale and 'implementation notes' in an annotated specification near where the issue is specified, but don't hold anything up on this account.

@kripken Common Lisp has `block` and `return-from`. E.g. `(block outer (block inner (return-from outer 1)) 2)` => `1`. The specification describes it as 'a structured, lexical, non-local exit facility'.

kripken · 2015-12-04T01:10:27Z

@jfbastien: not opposed, but I literally added two words and a comma :) I was hoping not to open a new discussion in this already-too-long-issue...

kripken · 2015-12-04T19:48:45Z

@jfbastien: How about if I merge this and start a followup to move parts into Rationale+links to them?

Or do you prefer I do that in this pull?

jfbastien · 2015-12-05T20:49:30Z

sgtm

kripken · 2015-12-08T23:08:29Z

Ok, merging this now, and will start on followup.

titzer reviewed Nov 23, 2015

kripken force-pushed the elaborate-structured-control-flow branch from 3cf3436 to abb1961 Compare November 23, 2015 18:56

titzer reviewed Nov 23, 2015

kripken force-pushed the elaborate-structured-control-flow branch from abb1961 to bdcc63a Compare November 23, 2015 20:56

kripken mentioned this pull request Nov 24, 2015

Well structured rationale #479

Closed

elaborate on what well-structured control flow means

a181137

kripken force-pushed the elaborate-structured-control-flow branch from bdcc63a to a181137 Compare December 3, 2015 22:35

kripken added a commit that referenced this pull request Dec 8, 2015

Merge pull request #475 from WebAssembly/elaborate-structured-control…

33e6a23

…-flow Clarify well-structured control flow

kripken merged commit 33e6a23 into master Dec 8, 2015

jfbastien deleted the elaborate-structured-control-flow branch December 9, 2015 11:25
