Discussion: A better aural interface for coding might resemble an abstract syntax tree #1

Open
jeremygiberson opened this issue May 28, 2019 · 0 comments

Hey Seth,
I followed a link from HN to this repo and wanted to make a suggestion about how code is written and read in codespeak.

A few issues around speech-to-text coding include:

  • One of the comments on HN mentioned something like: natural language to code is prone to ambiguity issues. For example, the phrase "from i equals 1 to 10, print i, then print 'done'" could be interpreted in code as for(i = 1; i <= 10; i++) { print(i) } print("done") or as for(i = 1; i <= 10; i++) { print(i); print("done") }.
  • There are also the difficulties of speech to text in general--it's still a technology that is quite error-prone and requires lots of user correction.
  • Supporting multiple languages raises many considerations, from different keywords and grammars to special characters.

I think all of these issues can be addressed if we tweak how we read/listen and write/speak. Rather than work with raw code we could work directly with Abstract Syntax Trees.
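To make the ambiguity point concrete, here is a rough sketch of the two readings of the spoken loop as two different trees. The node classes (`Call`, `ForRange`) and the `count_calls` helper are made up for illustration and are not part of codespeak.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Call:
    name: str
    arg: str


@dataclass
class ForRange:
    var: str
    start: int
    stop: int  # inclusive upper bound, matching "from 1 to 10"
    body: List["Node"] = field(default_factory=list)


Node = Union[Call, ForRange]

# Reading 1: 'done' is printed once, after the loop.
after_loop = [
    ForRange("i", 1, 10, body=[Call("print", "i")]),
    Call("print", "'done'"),
]

# Reading 2: 'done' is printed on every iteration, inside the loop.
inside_loop = [
    ForRange("i", 1, 10, body=[Call("print", "i"), Call("print", "'done'")]),
]


def count_calls(nodes: List[Node]) -> int:
    """Count how many calls would execute when the program runs."""
    total = 0
    for node in nodes:
        if isinstance(node, ForRange):
            iterations = node.stop - node.start + 1
            total += iterations * count_calls(node.body)
        else:
            total += 1
    return total
```

The spoken sentence is the same for both, but the trees differ in where the second call hangs, so an editor that builds the tree directly never has to guess.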

I'm not sure if you're familiar with some of the visual editors that have been developed recently (since you're legally blind, I'm not sure whether you're able to "see" them in action). In those editors, rather than writing code by typing it, you drag puzzle pieces from a library onto the screen and fit them together where it makes sense. This is a mechanic that is familiar to children, purposely chosen because a lot of these editors are targeted at children's coding education. But under the hood, this mechanic is really about assembling an abstract syntax tree -- and the puzzle shapes are the constraints that make sure you are building a valid tree.

Proper ASTs can get convoluted when you start getting to the leaf nodes of the tree and would not be fun to work with when trying to read or write a program. However, we can make some strategic decisions about when we choose to represent the AST faithfully and when we choose to substitute simplifications to make something consumable.
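As a small sketch of what such a simplification might look like: a flat node with two fields stands in for the nested assignment subtree, and can be expanded into the faithful form when needed. All class names here (`LoopInitializer`, `Assign`, `Name`, `Number`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Name:
    id: str


@dataclass
class Number:
    value: int


@dataclass
class Assign:
    """Faithful form: an assignment with a target node and a value node."""
    target: Name
    value: Number


@dataclass
class LoopInitializer:
    """Simplified form: two flat fields the user can set by voice."""
    name: str
    value: int

    def expand(self) -> Assign:
        # Produce the full subtree that the simplification stands in for.
        return Assign(Name(self.name), Number(self.value))
```

The user only ever hears "name" and "value"; the nested form exists for the renderer.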

An additional benefit of an AST is that you can easily transform it (for example, rendering it). Transpilers use ASTs to transform code to a different format or to a different language altogether. If the codespeak editor had the user "write" the AST directly, it could be translated to various languages by swapping out AST renderers.
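A minimal sketch of the renderer-swapping idea, assuming a tiny made-up tree (`ForLoop`, `Print`) and two hypothetical renderers. A real editor would need a far richer node set, but the shape is the same: one tree, several output functions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Print:
    expr: str


@dataclass
class ForLoop:
    var: str
    start: int
    stop: int  # exclusive upper bound
    body: List[Print] = field(default_factory=list)


def render_python(loop: ForLoop) -> str:
    """Render the loop as Python source."""
    lines = [f"for {loop.var} in range({loop.start}, {loop.stop}):"]
    lines += [f"    print({stmt.expr})" for stmt in loop.body]
    return "\n".join(lines)


def render_javascript(loop: ForLoop) -> str:
    """Render the same loop as JavaScript source."""
    v = loop.var
    lines = [f"for (let {v} = {loop.start}; {v} < {loop.stop}; {v}++) {{"]
    lines += [f"  console.log({stmt.expr});" for stmt in loop.body]
    lines.append("}")
    return "\n".join(lines)


# One tree, built once by the user.
loop = ForLoop("i", 1, 11, body=[Print("i")])
```

Calling `render_python(loop)` and `render_javascript(loop)` yields source in each language from the same tree.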

Finally, ASTs are well structured and have lots of context. The structure and context lend themselves nicely to speech-to-text solutions. With structure and context you constrain the possible voice inputs to a finite list of expected values -- which means you can improve the voice interface drastically. Generally we'd want to break voice input into known commands that create specific elements of the AST and prompts for free-form voice entry used for naming things.
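One way to picture the "finite list of expected values" idea is to snap whatever the recognizer heard onto the closest expected value. The sketch below uses Python's standard `difflib` as a stand-in for a real speech engine's grammar constraints; `match_expected` and the 0.6 cutoff are assumptions for illustration.

```python
import difflib
from typing import List, Optional


def match_expected(heard: str, expected: List[str]) -> Optional[str]:
    """Snap a noisy transcription onto the closest expected value, if any."""
    cleaned = heard.lower().strip()
    matches = difflib.get_close_matches(cleaned, expected, n=1, cutoff=0.6)
    return matches[0] if matches else None


# Expected answers for the "loop type?" prompt.
LOOP_TYPES = ["do-while", "while", "for-in", "for"]
```

With this, a slightly misheard "wile" still resolves to "while", while something unrelated returns `None` and the editor can re-prompt.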

As an example, let's visit the looped print statement as we might enter its code into our AST based editor.

Legend:

  • Command: User uttered statement
  • Prompt: Program text to speech asks user for input
  • Expected: Possible inputs from the user, tailors speech to text recognition
  • Answer: User uttered answer to prompt
  • Info: Program text to speech informs user of app state

Command: insert loop
Prompt: loop type?
Expected: do-while, while, for-in, for
Answer: for
Info: focused on loop initializer 

The app's current context is the loop initializer. Let's say loop initializer is one of the AST simplifications I mentioned earlier. In this context the following commands would be available:

- help: get list of commands
- set name <name|string>: sets the loop variable name
- set value <value|number>: sets the loop variable's initial value
- replace with expression: lets user replace the simplification with a full-on AST expression (a context w/ its own set of available commands)
- step-next: focus on the next AST sibling node
- step-out: focus on the parent AST node
- step-in: focus on the first child of the current AST node

Command: step-next
Info: focused on loop condition
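The per-context command lists in this walkthrough could be kept in a simple table keyed by the focused context. This is only a sketch: `CONTEXT_COMMANDS` and `parse_command` are hypothetical names, and the entries mirror the command lists given for the loop initializer and loop condition.

```python
from typing import Dict, List, Tuple

# Commands available in each focus context, mirroring the lists in this walkthrough.
CONTEXT_COMMANDS: Dict[str, List[str]] = {
    "loop initializer": [
        "help", "set name", "set value", "replace with expression",
        "step-next", "step-out", "step-in",
    ],
    "loop condition": [
        "help", "set operator", "set value", "replace with expression",
        "step-next", "step-out",
    ],
}


def parse_command(context: str, utterance: str) -> Tuple[str, str]:
    """Split an utterance into (command, argument) using the context's command list."""
    text = utterance.strip().lower()
    # Try longer command names first so a short name cannot shadow a longer one.
    for command in sorted(CONTEXT_COMMANDS[context], key=len, reverse=True):
        if text == command or text.startswith(command + " "):
            return command, text[len(command):].strip()
    raise ValueError(f"{utterance!r} is not available in context {context!r}")
```

For example, `parse_command("loop initializer", "set name i")` gives `("set name", "i")`, while the same utterance in the loop condition context is rejected, which is exactly the narrowing that helps recognition.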

Loop condition is another simplification that has its own available commands:

  • help
  • set operator <less than, less than or equal, greater than or equal, etc>
  • set value <value|number>
  • replace with expression
  • step-next, step-out

Command: step-next
Info: focused on loop increment
Command: step-next
Info: focused on body
Command: insert call
Prompt: call what?
Expected: scoped callable, instance callable, static callable
Answer: scoped callable
Prompt: named?
Expected: <named callable in scope|any> (the AST can be used to constrain possible values, though this is complex to implement)
Answer: print
Info: focused on callable param 1

Fast-forwarding a bit, we're now in the body of the loop and adding the print function call. The callable could be another simplification with special commands like add param, set variable, set value, and set callable, which make it easy to provide the AST leaf nodes for the callable w/out having to go the nitty-gritty expression-context route (where the expression context has a bit more freedom [and verbosity]).
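For the "named?" prompt, the expected values could come from the tree itself. As a rough illustration, the sketch below uses Python's own `ast` module as a stand-in for the editor's tree and collects module-level function names plus lowercase callable builtins. `callables_in_scope` is a hypothetical helper, and real scoping rules (nested scopes, imports, methods) are deliberately ignored.

```python
import ast
import builtins
from typing import List


def callables_in_scope(source: str) -> List[str]:
    """Names the 'named?' prompt could accept for a scoped callable."""
    tree = ast.parse(source)
    # Functions defined at module level in the program being edited.
    defined = [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]
    # Lowercase, public, callable builtins such as print or len.
    builtin_names = [
        name for name in dir(builtins)
        if callable(getattr(builtins, name))
        and name.islower()
        and not name.startswith("_")
    ]
    return sorted(set(defined + builtin_names))


names = callables_in_scope("def greet(name):\n    print(name)\n")
```

The resulting list is what would be handed to the recognizer as the expected answers.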

In the example I introduced a "focus" concept and commands to move the focus around the tree. The user will need ways both to know where they are in the code and to jump around quickly and intuitively. The AST can provide a more insightful location than line or column by bubbling up through the tree's parent nodes to find the named scope. Something like "condition in function named PrintHelloInLoop".
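Here is a minimal sketch of the focus idea, assuming a generic tree node with a parent pointer. The `Node` class and the `step_in`, `step_out`, `step_next`, and `describe` functions are illustrative names that follow the commands used in the walkthrough.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    kind: str
    name: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def step_in(node: Node) -> Node:
    """Focus the first child, or stay put on a leaf."""
    return node.children[0] if node.children else node


def step_out(node: Node) -> Node:
    """Focus the parent, or stay put at the root."""
    return node.parent or node


def step_next(node: Node) -> Node:
    """Focus the next sibling, or stay put on the last one."""
    if node.parent is None:
        return node
    siblings = node.parent.children
    index = siblings.index(node)
    return siblings[index + 1] if index + 1 < len(siblings) else node


def describe(node: Node) -> str:
    """Bubble up through the parents until a named scope is found."""
    scope = node.parent
    while scope is not None and scope.name is None:
        scope = scope.parent
    if scope is None:
        return node.kind
    return f"{node.kind} in {scope.kind} named {scope.name}"


func = Node("function", "PrintHelloInLoop")
loop = func.add(Node("loop"))
init = loop.add(Node("initializer"))
cond = loop.add(Node("condition"))
incr = loop.add(Node("increment"))
body = loop.add(Node("body"))
```

Starting from `loop`, `step_in` lands on the initializer and `step_next` moves to the condition, where `describe` produces the phrase from the example.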

I hope I've managed to convey the idea clearly and make a good argument for it.

I'm sure there are still a lot of rough edges to be worked out. But the important thing I wanted to convey is that a speech-based code editor may have a better in/out interface than raw code. You might want to explore interacting with the code in an abstracted manner to provide a better user experience!
