
Experiment: streaming parse and eval #225

Closed
daveyarwood opened this issue May 2, 2016 · 3 comments
daveyarwood commented May 2, 2016

I had an interesting thought today about a possible way to optimize the process of parsing and evaluating Alda code. If it works, this might help with #208.

Currently, parsing, evaluating, and playing a score are three separate steps, and each step must be completed fully before proceeding to the next. The bottleneck is parsing -- the largest example score we have (Bach cello suite no. 1) takes about 1 second to parse on my MacBook. Evaluating is pretty fast -- so far it's under 200 ms for that same score, at least on my housekeeping branch where I've added benchmarks for evaluation, and the smaller scores take under 10 ms to evaluate.

To speed things up, I wonder if it might be feasible to do one or both of these things:

  • Start evaluating events as they are parsed. <--- this would help the most
  • Start playing events as they are evaluated / added to the score.

I'm imagining a kind of streaming setup where bits of Alda syntax like `piano:`, `c4`, and `>` are streamed to the next step of the process as they are parsed. Perhaps this could be implemented with core.async channels: as we're parsing, we could put parsed events on a `parsed-events` channel, to be consumed by the evaluator process, which in turn puts note events (with duration, offset, pitch, etc. determined by the context of the score) on another channel for the sound engine to consume and schedule.
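Here's a rough sketch of what the channel wiring might look like. Note that `eval-event` and `schedule-note!` below are hypothetical stand-ins for the real evaluation and scheduling logic, not existing Alda functions:

```clojure
(require '[clojure.core.async :as async :refer [chan go-loop <! >! close!]])

;; Hypothetical stand-ins, just so the sketch runs:
(defn eval-event
  "Given the score so far and a parsed event, return [updated-score notes]."
  [score event]
  [(update score :events (fnil conj []) event) []])

(defn schedule-note! [note]
  (println "scheduling" note))

(defn start-pipeline!
  "Wire up parser -> evaluator -> sound engine with channels. Returns the
   channel the parser should put parsed events onto."
  []
  (let [parsed-events (chan 100)   ; events as the parser emits them
        note-events   (chan 100)]  ; notes with pitch/offset/duration resolved
    ;; Evaluator: consume parsed events, thread the score state through,
    ;; and emit concrete note events for the sound engine.
    (go-loop [score {}]
      (if-let [event (<! parsed-events)]
        (let [[score' notes] (eval-event score event)]
          (doseq [note notes]
            (>! note-events note))
          (recur score'))
        (close! note-events)))
    ;; Sound engine: consume and schedule notes as they arrive.
    (go-loop []
      (when-let [note (<! note-events)]
        (schedule-note! note)
        (recur)))
    parsed-events))
```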

The part I'm not really sure about is how to parse a string of Alda code in a streaming fashion. It looks like this isn't possible (yet) with Instaparse, and after some quick googling I wasn't able to find any Clojure/Java libraries for stream-based parsing. It seems like doing what I'm describing would involve writing our own parser at a lower level. This might not be so bad if it gives us the ability to stream parsed tokens directly into the next phase.


Luckily, Alda syntax is pretty simple and evaluates sequentially to build up a score, so I think it should be doable to parse sequentially and stream the results. I'm thinking of a rule-based approach where, if we're parsing a stream of input like `bassoon: c8 d e f g`:

  • we read `b` -- this might be a note, a new part name, a variable name, etc., so we keep reading...
  • now we've read `ba` -- so it's not a note; it has to be either a part name or a variable name. We keep reading...
  • ...until we've read `bassoon:`. At this point, we know it's a part name, so we can put a "use part 'bassoon'" event on the stream and keep reading.

To make this code easy to read and work with, we could make this a reducing operation over a "parse context" map and each new character read from the input stream/string, as sketched below. The parse context could include the current token we're building up, and tell us things like whether we're building up an event sequence (i.e. if we've read `[` and now we're reading events in the sequence until we read `]`) and what's in the sequence. It could also help us rule out possible tokens, like "notes" after the first step in the example above.
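As a rough illustration, one reducing step over the parse context might look like this. The token rules below are simplified guesses for the sake of the example, not Alda's real grammar:

```clojure
(defn possible-types
  "Narrow down what a partial token could be as characters are read."
  [token]
  (cond
    (re-matches #"[a-g]" token)        #{:note :part-name :variable-name}
    (re-matches #"[a-g][0-9.]+" token) #{:note}
    (re-matches #"[a-z]+:" token)      #{:part-name}
    (re-matches #"[a-z]{2,}" token)    #{:part-name :variable-name}
    :else                              #{:unknown}))

(defn parse-char
  "One reducing step: fold the next character into the parse context,
   adding a parsed event to :events whenever a token is completed."
  [ctx c]
  (if (Character/isWhitespace c)
    (if (empty? (:token ctx))
      ctx
      (-> ctx
          (update :events conj {:token (:token ctx)
                                :types (possible-types (:token ctx))})
          (assoc :token "")))
    (update ctx :token str c)))

;; The trailing space flushes the final token:
(reduce parse-char {:token "" :events []} "bassoon: c8 d e f g ")
;; => {:token ""
;;     :events [{:token "bassoon:" :types #{:part-name}}
;;              {:token "c8"       :types #{:note}}
;;              ...]}
```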


Streaming the next step (evaluator -> sound engine) will probably depend on #26. We'll at least need a way to add additional notes to a score that's already being played. Then, we'll need a way for the sound-generating process to be aware of roughly how many notes will need to be played, and to wait until some percentage of them are ready to schedule before it starts scheduling anything. Otherwise, we could run into issues where the sound engine is too eager: it plays through its buffered notes faster than new ones arrive, resulting in short pauses while it waits for more notes to schedule, which would just sound like bad timing / a bad performance. The sound engine should be able to ensure that it has at least 5-10 seconds(?) of ready-to-schedule notes in the buffer before it starts to schedule anything (triggering playback).

Since the notes coming in on the evaluation process' stream are not guaranteed to be in order (since changing voices and parts causes the current offset to jump around), we should probably wait until the evaluation stream has run dry and we have all of the events ready to schedule. Then we can sort by offset and determine how to spread out / buffer the scheduling.
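A sketch of that approach, reusing the hypothetical `schedule-note!` from the earlier sketch: drain the channel until the evaluator closes it, then sort by offset and schedule everything in order.

```clojure
(require '[clojure.core.async :as async :refer [<!!]])

(defn drain!
  "Block until the channel closes, collecting everything it produced."
  [ch]
  (loop [notes []]
    (if-let [note (<!! ch)]
      (recur (conj notes note))
      notes)))

(defn play-when-ready!
  "Wait for the evaluation stream to run dry, then schedule in offset order."
  [note-events]
  (->> (drain! note-events)
       (sort-by :offset)
       (run! schedule-note!)))
```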


This is obviously a big undertaking, but if it works, I think we can speed things up significantly, and generally make Alda a lot more robust.


daveyarwood commented May 2, 2016

Thinking about this some more, I don't think this will help much unless we can get both the parse -> eval streaming piece and the eval -> play streaming piece in place.

If we can get Alda to evaluate events as it parses them, but we still have to wait until all of the events are evaluated before we start playing them, then parsing is still the bottleneck, because parsing must finish before evaluation can finish. The parse -> eval streaming piece alone would only shave 0-200 ms off of the overall parse/eval time (however long it takes to eval the score, which is often under 10 ms); parsing will remain the bottleneck until we can also stream playback as events are evaluated.

Once we're fully streaming, though, there are some caveats -- though I'm not sure they're necessarily bad; this could be interesting. If we have everything fully streaming (parse -> eval -> play events as the score is being parsed and eval'd), then the user will be able to start hearing their score played back even if there are syntax errors later on in the file. This is the opposite of what one would expect based on experience with any(?) other programming language: usually syntax errors are caught first, so evaluation doesn't start unless you have a syntactically valid file. This may or may not be a problem for Alda... I'm curious to hear other people's thoughts on this.

A related issue: if there are syntax errors, you won't be able to get a stack trace, because the client/server transaction will have ended once Alda starts parsing/eval'ing/playing. This could be kind of annoying, and might not be the behavior we want.

One way around it would be to return a randomly generated playback ID and offer an `alda check` command, so the user can see a parse/eval/playback log for that playback, including any errors that occurred, with stack traces. I'm not sure I like this, though... it could be a deal-breaker for this whole issue. We could mitigate it somewhat by having a timeout (say, 2 or 3 seconds?) and returning any error message right away if one happens within that window. If it takes longer than that, Alda could keep working in the background and give the user a randomly generated ID that they can use to check the playback logs if anything sounds off. This would only happen when parsing + evaluating the score takes longer than the timeout, which should be rare if the timeout is more than a few seconds.
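A minimal sketch of the timeout idea, assuming a hypothetical `errors` channel that the parse/eval/play pipeline writes to when something goes wrong:

```clojure
(require '[clojure.core.async :as async :refer [alts!! timeout]])

(defn submit-score-response
  "Wait up to timeout-ms for an error from the pipeline. If one arrives,
   report it right away; otherwise return a playback ID the user can look
   up later with `alda check`."
  [errors timeout-ms]
  (let [[err _] (alts!! [errors (timeout timeout-ms)])]
    (if err
      {:status :error, :error err}
      {:status :playing, :playback-id (str (java.util.UUID/randomUUID))})))
```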


daveyarwood commented May 2, 2016

Re: that first caveat (playback starts before syntax errors later in the file are caught), perhaps we could make the streaming pipeline not the default behavior, but rather something you can enable as an optimization setting. That way, the default/current behavior is preserved, which is nice because I think it's what most people would expect (i.e. my score shouldn't start playing if there are syntax errors), but people who understand the trade-off can enable streaming if they want faster playback and don't mind syntax errors surfacing on the fly instead of being caught up front.

daveyarwood commented

Moved to alda-lang/alda-core#20.
