Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove IdMap from source file. #9257

Closed
1 of 2 tasks
farmaazon opened this issue Mar 4, 2024 · 10 comments · Fixed by #10347
Closed
1 of 2 tasks

Remove IdMap from source file. #9257

farmaazon opened this issue Mar 4, 2024 · 10 comments · Fixed by #10347
Assignees
Labels
-gui -language-server d-hard Difficulty: significant prior knowledge required p-medium Should be completed in the next few sprints x-design

Comments

@farmaazon
Copy link
Contributor

farmaazon commented Mar 4, 2024

The id-map grows with the code rapidly, and even reducing its size won't help much; still, the id-map tends to take 100x more space than the code it describes. But perhaps we don't need to store them in file at all.

Main idea

Instead of storing id-map in source file, we could exchange it using Language Server API.

The id-map purpose is to make an identification of particular AST nodes, what in turn is needed to:

  1. Assign node metadata (like position, color, etc.) and - in the future - widget metadata (like which widget was directly picked by the user).
  2. Remove expression UUIDs from metadata section of a source file #10182 identify expressions in messages between GUI and Engine (where visualization is attached, what is the type of expression, etc.) Because of execution contexts and visualizations, these identifications should be persistent.

The second point do not require AST IDs persistence, while the first point may be handled by storing direct spans instead of IDs in the source file.

Further ideas and optimizations

  • To avoid initial exchange and reduce id-map size, we could assume some deterministic way of assigning IDs to nodes without id-map entry, for example: for every node with assigned ID from map, we assign increments of this ID to its children in DFS order (of course, if we meet another node with already assigned ID, we skip the entire subtree).
  • or, we could resign from exchanging IDs, and just add list of span pairs (A, B) meaning, that old code's span A's identity should be passed to new code's span B. Both parties then should manage updating their information about visualizations, execution stacks etc.

Gui Tasks

Preview Give feedback

Engine Tasks

Preview Give feedback
  1. 0 of 5
    -gui -language-server
    4e6
@farmaazon farmaazon added d-hard Difficulty: significant prior knowledge required p-medium Should be completed in the next few sprints x-design -language-server -gui labels Mar 4, 2024
@github-project-automation github-project-automation bot moved this to ❓New in Issues Board Mar 4, 2024
@JaroslavTulach
Copy link
Member

JaroslavTulach commented Mar 5, 2024

The Truffle instrumentation is built around SourceSectionFilter, not UUIDs. The filter uses ranges (offset, line, column) and tags. It would be great if we adhered more closely to the capabilities offered by Truffle when eliminating the IdMap usage.

Then the AST is hosted in the same process as the engine and its instrumentation. Assuming y.js is efficient in synchronizing AST modifications, we don't need UUIDs to keep element identities. The AST elements themselves represent identity. Visualizations shall become part of the AST - attaching/removing them modifies some attribute of clearly identified AST element. When there is an AST change that modifies the source code we need to traverse the AST and recompute ranges (offset, line, column) of elements that require instrumentation. A change in the source code always comes with new ranges. All of that is done inside of a single process without any need for network communication via a special protocol.

Opportunities with visualizations: Y.js system offers concept of subdocuments - lazily loaded entities used for on demand synchronization. We could use them for distributing visualizations across the peers. IDEs would annotate an AST element as eligible for obtaining visualization. Engine would compute the visualization and attach it as a subdocument. IDEs interested in that visualization would load it and observe its changes.

Longer term future vision: with the AST being in the same process as Truffle we should make a connection between the AST elements and Truffle nodes and avoid reparsing the whole .enso file when only a subtree is changed.

@JaroslavTulach
Copy link
Member

I have rephrased the strategic vision in a separate document. An essential part of it still remains the sharing of ASTs among peers via y.js server.

@4e6
Copy link
Contributor

4e6 commented Jun 11, 2024

#10182 will add an optional IdMap parameter in the text/applyEdit request.

I propose to simplify the serialized IdMap elements format from the current

[[{"index":{"value":0},"size":{"value":4}},"bfae92f7-07d0-4886-b37b-a26e17ee943f"],...]

to an array of 3 elements: index, size and uuid

[[0,4,"bfae92f7-07d0-4886-b37b-a26e17ee943f"],...]

@farmaazon
Copy link
Contributor Author

farmaazon commented Jun 11, 2024

#10182 will add an optional IdMap parameter in the text/applyEdit request.

I propose to simplify the serialized IdMap elements format from the current

[[{"index":{"value":0},"size":{"value":4}},"bfae92f7-07d0-4886-b37b-a26e17ee943f"],...]

to an array of 3 elements: index, size and uuid

[[0,4,"bfae92f7-07d0-4886-b37b-a26e17ee943f"],...]

I'm ok, but perhaps we should support reading both formats for some time.

@JaroslavTulach
Copy link
Member

I propose to simplify the serialized IdMap elements format from the current
to an array of 3 elements: index, size and uuid

[[0,4,"bfae92f7-07d0-4886-b37b-a26e17ee943f"],...]

I'm ok, but perhaps we should support reading both formats for some time.

Yes, being able to read previous node positions in new IDE is essential.

@4e6
Copy link
Contributor

4e6 commented Jun 11, 2024

I'm not going to change the parser. I was talking about the JSON-RPC text/applyEdit parameter

@4e6 4e6 moved this from 🔧 Implementation to 👁️ Code review in Issues Board Jun 14, 2024
@4e6 4e6 moved this from 👁️ Code review to ⚙️ Design in Issues Board Jun 15, 2024
@4e6
Copy link
Contributor

4e6 commented Jun 17, 2024

#10283 adds an optional idMap parameter to text/applyEdits that overrides/complements the IdMap in the file

interface TextApplyEditParameters {
  /** The file edit. */
  edit: FileEdit;

  /**
   * A flag indicating whether we should re-execute the program after applying
   * the edit. Default value is `true`, indicating the program should be
   * re-executed.
   */
  execute?: boolean;

  /**
   * An identifiers map associated with this file as an array of
   * index, length, uuid triples. The old id map format that was used in the
   * source file is also supported.
   */
  idMap?: [number, number, UUID][];
}

@farmaazon the next step could be to keep in the file only the IDs that are used in the second metadata line (with node positions, etc.) and send the rest of the IDs with the text/applyEdit request. This should reduce the size of the IdMap section in the file metadata. If you're ok with the idea, I'll implement it as a followup PR.

@farmaazon
Copy link
Contributor Author

@farmaazon the next step could be to keep in the file only the IDs that are used in the second metadata line (with node positions, etc.) and send the rest of the IDs with the text/applyEdit request. This should reduce the size of the IdMap section in the file metadata. If you're ok with the idea, I'll implement it as a followup PR.

Looks good, I'm ready to help with ydoc-server code if needed.

@4e6 4e6 mentioned this issue Jun 23, 2024
3 tasks
@4e6 4e6 moved this from ⚙️ Design to 🔧 Implementation in Issues Board Jun 24, 2024
@enso-bot
Copy link

enso-bot bot commented Jun 24, 2024

Dmitry Bushev reports a new STANDUP for today (2024-06-24):

Progress: Started working on storing the reduced IdMap in the file. Implemented the IdMap diff. Implemented sending IdMap in the applyEdit request. It should be finished by 2024-06-29.

Next Day: Next day I will be working on the #9257 task. Continue working on the task

@enso-bot
Copy link

enso-bot bot commented Jun 25, 2024

Dmitry Bushev reports a new STANDUP for today (2024-06-25):

Progress: Playing with different usage cases. Debugged and fixed the issue with the metadata recovery. Fixed the issue with the broken IdMap between the project restarts. Undrafted the PR It should be finished by 2024-06-29.

Next Day: Next day I will be working on the #9257 task. Continue working on the task

@4e6 4e6 moved this from 🔧 Implementation to 👁️ Code review in Issues Board Jun 26, 2024
@mergify mergify bot closed this as completed in #10347 Jul 8, 2024
@mergify mergify bot closed this as completed in b2c4559 Jul 8, 2024
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Jul 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-gui -language-server d-hard Difficulty: significant prior knowledge required p-medium Should be completed in the next few sprints x-design
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants