Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: CodeGenerator per-block-type generator function dictionary #7084

Closed
cpcallen opened this issue May 12, 2023 · 4 comments · Fixed by #7150
Closed

Proposal: CodeGenerator per-block-type generator function dictionary #7084

cpcallen opened this issue May 12, 2023 · 4 comments · Fixed by #7150
Labels
issue: feature request Describes a new feature and why it should be added

Comments

@cpcallen
Copy link
Contributor

cpcallen commented May 12, 2023

This is the first of three issues created to discuss proposed changes to generators (and perhaps track implementation of those proposals, if agreed). This issue concerns proposed breaking changes to the CodeGenerator class in core/generator.ts.

In code examples in these bugs, the word "language" / "Language" / "LANGUAGE" should be read as a metasyntactic variable standing in for some particular language like JavaScript.

Background

The way code generators are currently defined involves creating a CodeGenerator instance for the particular language to generate code for, and then adding a large number of properties to that object. These can broadly be divided into three categories:

  1. Constants and similarly static (but not static) properties.
  2. Helper methods and overrides of methods declared on CodeGenerator.
  3. Per-block-type generator functions.

(See detailed example in background section of the second proposal, issue #7085.)

This is a perfectly good approach in JavaScript, but migrating this style of code to TypeScript poses some challenges because tsc generally complains about adding arbitrary properties to an object unless that object is declared as a dictionary.

The main proposal below addresses category 3, the per-block-type generator functions; the other two categories will be dealt with in the second proposal, #7085.

There are two supplementary proposals concerning other shortcomings in CodeGenerator which would require breaking changes to correct and which we may wish to address in the same release.

Main Proposal: Move generator functions to a dictionary object

At the moment CodeGenerator.prototype.blockToCode uses the CodeGenerator object itself as a dictionary to look up the generator function based on the block type:

class CodeGenerator {
  // ...
  blockToCode(block: Block|null, /* ... */): string {
    // ...
    const func = (this as any)[block.type];

Instead, have the CodeGenerator constructor create a dictionary object (perhaps named .blockGenerators) on each instance, and have blockToCode obtain the generator function there:

type BlockGenerator =(block: Block) => [string, number] | string | null;

class CodeGenerator {
  // ...
  blockGenerators: {[type: string]: BlockGenerator} = Object.create(null);
  blockToCode(block: Block|null, /* ... */): string {
    // ...
    const func = this.blockGenerators[block.type];

Breaking Changes

This will entail all code that creates generator objects (our own and external developers) to replace assignments of the form:

myGenerator['block_type'] = function(block) { /* ... */ };

with the trivially different:

myGenerator.blockGenerators['block_type'] = function(block) { /* ... */ };

This will also apply cases where generator functions are reused; i.e. JavaScript['controls_ifelse'] = JavaScript['controls_if'] will need to be updated to JavaScript.blockGenerators['controls_ifelse'] = JavaScript.blockGenerators['controls_if'].

Alternatives Considered

It seems like it should be possible to use declaration merging or intersection types to create a type that is both a CodeGenerator and dictionary of block generator functions (i.e., Record<string, BlockGenerator> or equivalently {[type: string]: BlockGenerator}). Unfortunately several attempts to create a proof-of-concept of this idea in the TypeScript Playground all fell foul of errors complaining that the various properties and methods of CodeGenerator did not conform to the BlockGenerator type. It appears that it would be necessary to declare a type for the keys of the generator function dictionary that either included every possible block type, or all strings except the names of methods declared on CodeGenerator and any of its subclasses—but in either case this depends on creating a comprehensive list of names most of which are chosen by external developers, clearly an impossible task.

This points to another weakness of both this alternative and the status quo: that there is a danger of clashes between block types and the names of non-block-generator-function methods on CodeGenerator and its subclasses—a danger entirely obviated by the proposal of creating a separate dictionary for the latter.

Additional Info

The names BlockGenerator and blockGenerators have been chosen somewhat arbitrarily and better suggestions are welcome. The latter in particular will be used frequently when creating generators, so something shorter would be advantageous.

Supplementary Proposal 1: Consider renaming members to conform to style guide

The following properties and methods on CodeGenerator have names which do not conform to our current styleguide. If we are making breaking changes, we could consider fixing these at the same time:

  • name_
  • FUNCTION_NAME_PLACEHOLDER_
  • FUNCTION_NAME_PLACEHOLDER_REGEXP_
  • INFINITE_LOOP_TRAP
  • STATEMENT_PREFIX
  • STATEMENT_SUFFIX
  • INDENT
  • COMMENT_WRAP
  • ORDER_OVERRIDES
  • RESERVED_WORDS_
  • definitions_
  • functionNames_
  • protected nameDB_

Breaking Changes

This will entail updating all code that references these names, which is essentially all generator implementations. Since these changes are less trivial than the changes entailed by proposal 1 it may be better not to impose such costs for purely stylistic reasons. Nevertheless, if we did want to update these names it makes sense to do so at the same time as we make other breaking changes.

Supplementary Proposal 2: Consider changes to parenthesisation mechanism in valueToCode

The existence of the ORDER_OVERRIDES property seems like a code smell. It appears to be needed because generators do not provide information about the (left- or right-)associativity of the operators they generate, only a numerical 'order' value.

There are various possible ways this could be improved. At the simplest level, generator functions could return [code, order, associativity] instead of just [code, order]. A more satisfactory approach might be to provide a standardised way to declare an operator precedence table that includes associativity information, and have generator functions return [code, rootOperator] tuples.

This proposal would require further development.

@cpcallen
Copy link
Contributor Author

I recommend we adopt the main proposal.

I am neutral to negative on the supplementary proposals; I just want to ensure we discuss whether we wish to make any other breaking changes at the same time.

@BeksOmega
Copy link
Collaborator

Wrt the main proposal:

So the benefits here are that we enable better type checking, and that we make it possible to also support importing different block-code generators individually? I'm not sure if that is worth forcing everyone to migrate all of their code generators for.

Would it be possible to write a migration script that automatically renames LanguageGenerator['block_type'] assignemnts to LanguageGenerator.blockGenerators['block_type'] assignments to help mitigate this?

Wrt supplementary proposal 1: It seems like the only benefit we get to this is more conformity with the style guide, and I don't think that's a big enough benefit when compared to the breaking change.

Wrt supplementary proposal 2: In general I like really like this idea! And it seems if we just want to return an extra associativity from the block generator function, we could make that optional and do it backwards compatibly. So my feeling is, this doesn't need to be completed as part of this project but would be very cool to add! That is assuming it can be done backwards compatibly though.

@cpcallen
Copy link
Contributor Author

So the benefits here are that we enable better type checking, and that we make it possible to also support importing different block-code generators individually? I'm not sure if that is worth forcing everyone to migrate all of their code generators for.

The main goal is better type checking, and in particular to avoid TypeScript users having to cast (languageGenerator as any)['block_type'] = … (or as undefined as Record<string, BlockGenerator if you want to be pedantic) every time we install a generator function.

A secondary goal is to avoid any possibility of name clash between a block type and non-generator methods/properties on a generator object.

Would it be possible to write a migration script that automatically renames LanguageGenerator['block_type'] assignemnts to LanguageGenerator.blockGenerators['block_type'] assignments to help mitigate this?

Potentially. Correctly identifying the name of the generator (is is Blockly.JavaScript or 'javascriptGenerator' or something else entirely?) could be challenging, but a command line option could suffice. But this change by itself is a pretty simple search-and-replace, so I'm not sure it's worth it unless we're also helping with other changes at the same time.

Wrt supplementary proposal 1: It seems like the only benefit we get to this is more conformity with the style guide, and I don't think that's a big enough benefit when compared to the breaking change.

That is my view as well.

Wrt supplementary proposal 2: In general I like really like this idea! And it seems if we just want to return an extra associativity from the block generator function, we could make that optional and do it backwards compatibly. So my feeling is, this doesn't need to be completed as part of this project but would be very cool to add! That is assuming it can be done backwards compatibly though.

I agree that it merits further thought for a potential future change, especially if it can be done compatibly.

@cpcallen
Copy link
Contributor Author

cpcallen commented Jun 6, 2023

At our weekly team meeting on 2023-05-16, we decided:

  • Main proposal, to create a per-block-type generator function dictionary: accepted.
  • SP1, to make some breaking name changes for stylistic reasons: rejected.
  • SP2, to redesign the parenthesisation mechanism: to be done later (if at all).

An informal poll in team chat was conducted to consider options for the name of the dictionary property; options discussed in approximate order from most to least popular:

  • .blockGenerators, as seen in proposal: descriptive but long, and "generators" arguably redundant.
  • .forBlock, as seen in summit slides: succinct, but some found thought this sounded like a method name.
  • .blocks: more succinct but possibly misleading, as dictionary does not contain Block or BlockDefinition instances.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
issue: feature request Describes a new feature and why it should be added
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants