What would you like to be added:
Present a roadmap focused on Style operations for the Yorkie Text data structure.
Why is this needed:
Present an overview of the Text data structure, its current state, and background.
Present current issues and the direction of changes needed.
Provide a roadmap for future contributors to understand the context and continue the work.
The table of contents is as follows:
Introduction to Yorkie's Text Data Structure
Current Issues
Roadmap for Resolving Issues
Introduction to Yorkie's Text Data Structure
Definition
Text is one of the JSON-like data structures used to represent content in JSON-like documents. Text represents text with style attributes in rich text editors such as Quill. Users can express styles such as bold, italic, and underline in text content. It can also represent plain text in text-based editors such as CodeMirror. It supports collaborative editing, allowing multiple users to modify parts of the content without conflicts.
Edit & Style Operation
Text has two core operations: Edit and Style. Edit replaces the given range with the given content and attributes, while Style applies attributes to the given range.
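The two operations can be illustrated with a minimal sketch. `ToyText` below is a hypothetical in-memory model written only for this example, with one node per character; it is not Yorkie's implementation and ignores concurrency entirely. The method names `edit` and `setStyle` mirror the operations described above.

```typescript
// Hypothetical toy model for illustration: one node per character, no CRDT logic.
type Attrs = Record<string, string>;
type CharNode = { ch: string; attrs: Attrs };

class ToyText {
  private nodes: CharNode[] = [];

  // Edit: replace the range [fromIdx, toIdx) with content carrying optional attributes.
  edit(fromIdx: number, toIdx: number, content: string, attrs: Attrs = {}): void {
    const inserted = Array.from(content, (ch) => ({ ch, attrs: { ...attrs } }));
    this.nodes.splice(fromIdx, toIdx - fromIdx, ...inserted);
  }

  // Style: merge the given attributes into every node in [fromIdx, toIdx).
  setStyle(fromIdx: number, toIdx: number, attrs: Attrs): void {
    for (const node of this.nodes.slice(fromIdx, toIdx)) {
      Object.assign(node.attrs, attrs);
    }
  }

  toString(): string {
    return this.nodes.map((n) => n.ch).join('');
  }

  // Returns a copy of the attributes stored on the node at idx.
  attrsAt(idx: number): Attrs {
    return { ...this.nodes[idx].attrs };
  }
}

const text = new ToyText();
text.edit(0, 0, 'Hello world');                 // insert plain content
text.edit(6, 11, 'Yorkie', { italic: 'true' }); // replace "world" with styled content
text.setStyle(0, 5, { bold: 'true' });          // style "Hello" without changing content
```

Edit changes the content (and may attach attributes to what it inserts), while Style leaves the content untouched and only changes attributes.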
External Editors
Yorkie uses external editors for testing and managing the specifications of its data structures. The Text data structure is tested with the Quill and CodeMirror editors, but this does not mean that Yorkie's Text depends on them. Other editors can integrate with Yorkie at any time.
Current Issues
1. Style operation does not support concurrent formatting
When two clients work simultaneously, if one applies a Text.Style operation for formatting while the other performs a Text.Edit operation to insert content, the results of the two operations do not converge.
For more details: issue#638
2. Style operations do not preserve user intent
There are cases where the current Style logic does not adequately preserve the user's intent when editing and formatting text. While the concept of 'intent' is relatively subjective and not clearly defined, the Peritext CRDT algorithm, designed for concurrent rich-text editing, presents nine examples of such 'user intent'. Yorkie has faced issues in three of these cases, which are summarized in issue#607.
3. All nodes store attribute information
Currently, Yorkie stores attribute information for every node, even if it applies to a continuous range. For example, in the case of applying bold type to the content "Hello", each character stores the information {"bold": "true"}, which is inefficient.
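The cost difference can be sketched as follows. Both shapes are hypothetical simplifications for illustration: `perNodeAttrs` mimics the per-character storage described above, and `rangeMark` is an assumed range-based alternative, not an existing Yorkie structure.

```typescript
type Attrs = Record<string, string>;

// Per-node storage: every character carries its own copy of the attributes.
function perNodeAttrs(content: string, attrs: Attrs): Attrs[] {
  return Array.from(content, () => ({ ...attrs }));
}

// Hypothetical range-based storage: one record describes the whole span.
type RangeMark = { from: number; to: number; attrs: Attrs };

function rangeMark(content: string, attrs: Attrs): RangeMark[] {
  return [{ from: 0, to: content.length, attrs: { ...attrs } }];
}

const perNode = perNodeAttrs('Hello', { bold: 'true' }); // five attribute objects
const ranged = rangeMark('Hello', { bold: 'true' });     // one range record
```

Per-node storage grows with the length of the styled text, while the range form grows with the number of style operations.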
4. Line formatting is not considered
Currently, Text does not differentiate between inline text and entire lines. In contrast, the Quill editor uses the newline character (\n) to delimit lines and handles line-level attributes, such as header or list, separately from attributes for inline text. In an example tested in conjunction with Quill, header attributes were applied only to newline characters. Yorkie does not distinguish such editing cases.
For more details: Quill - Line Formatting
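As a rough illustration of the Quill behavior, the sketch below models Delta operations as plain objects and groups them into lines. The `Op` and `Line` types and the `toLines` helper are hypothetical names for this example; the sketch assumes only string inserts, treats attributes as line attributes only when an op consists solely of newlines, and drops any trailing text that is not terminated by a newline.

```typescript
type Op = { insert: string; attributes?: Record<string, unknown> };
type Line = { text: string; lineAttributes: Record<string, unknown> };

// Group ops into lines; the attributes of a newline-only op describe the line it ends.
function toLines(ops: Op[]): Line[] {
  const lines: Line[] = [];
  let current = '';
  for (const op of ops) {
    const isNewlineOnly = /^\n+$/.test(op.insert);
    const parts = op.insert.split('\n');
    parts.forEach((part, i) => {
      if (i > 0) {
        // A newline was crossed, so the current line is complete.
        lines.push({
          text: current,
          lineAttributes: isNewlineOnly ? (op.attributes ?? {}) : {},
        });
        current = '';
      }
      current += part;
    });
  }
  return lines;
}

const lines = toLines([
  { insert: 'Title' },
  { insert: '\n', attributes: { header: 1 } }, // header lives on the newline
  { insert: 'Body text\n' },
]);
```

Here the header attribute ends up on the line "Title" even though no character of "Title" carries it, which is the distinction Text currently cannot express.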
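Roadmap for Resolving Issues
1. Ensure that two clients in a concurrent editing situation converge to the same result (for issue 1)
Introduce MaxCreatedAtMapByActor
Currently, we are unable to check for concurrent cases when applying the Text.Style operation. By introducing a map called latestCreatedAtMapByActor, which is used in Text.Edit or Tree.Edit, we can determine the causality between the operations of the two clients.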
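A minimal sketch of how such a map could be used is shown below. Everything here is an assumption made for illustration: the `CharNode` and `StyleOp` shapes, the use of a plain Lamport counter as the creation time, and in particular the policy that a node the author had not seen is left unstyled. The source only states that the map lets us determine causality.

```typescript
type ActorID = string;

// Hypothetical node shape: who created it and at which Lamport time.
type CharNode = {
  ch: string;
  createdBy: ActorID;
  createdAtLamport: number;
  attrs: Record<string, string>;
};

// Hypothetical style op: for each actor, the latest creation time the
// op's author had observed when the op was issued.
type StyleOp = {
  attrs: Record<string, string>;
  latestCreatedAtMapByActor: Map<ActorID, number>;
};

// A node is causally before the op if the author had already seen it.
function wasKnownToAuthor(node: CharNode, op: StyleOp): boolean {
  const latest = op.latestCreatedAtMapByActor.get(node.createdBy);
  return latest !== undefined && node.createdAtLamport <= latest;
}

// Apply a remote style op to the nodes inside its (already resolved) range,
// skipping nodes that were inserted concurrently. Skipping is an assumed policy.
function applyRemoteStyle(nodesInRange: CharNode[], op: StyleOp): void {
  for (const node of nodesInRange) {
    if (wasKnownToAuthor(node, op)) Object.assign(node.attrs, op.attrs);
  }
}

// Actor "a" wrote "AB" and styled it; actor "b" concurrently inserted "X".
const replica: CharNode[] = [
  { ch: 'A', createdBy: 'a', createdAtLamport: 1, attrs: {} },
  { ch: 'X', createdBy: 'b', createdAtLamport: 3, attrs: {} },
  { ch: 'B', createdBy: 'a', createdAtLamport: 2, attrs: {} },
];
const boldOp: StyleOp = {
  attrs: { bold: 'true' },
  latestCreatedAtMapByActor: new Map([['a', 2]]),
};
applyRemoteStyle(replica, boldOp);
```

Because every replica evaluates the same check against the same map, they all reach the same decision for "X" regardless of the order in which the two operations arrive.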
2. Introduce Peritext's Mark operation to Text.setStyle (for issue 2)
This issue is currently being implemented in PR#647.
Introduce the mark operation for the bold type in the existing setStyle operation
Peritext categorizes mark types based on the attributes they have.
Currently, we have implemented and verified the feasibility of cases related to the bold type in issue#607. To do this, we defined a markSpec for bold inside Text and implemented logic that executes mark operations if the input attribute is bold, and the existing logic otherwise. This resolved examples 2 and 8.
Add removeStyle operation for the bold type and test toggling
We have confirmed that adding a mark for the bold type works well. In Peritext, to remove a type, an operation is 'added', meaning that existing operations do not disappear. Therefore, we will add a removeStyle operation for the bold type and test whether the transition between setStyle and removeStyle works well when applying bold type multiple times.
```typescript
// draft interface of removeStyle
removeStyle(fromIdx: number, toIdx: number, attributes: A)

removeStyle(0, 5, { bold: false })
```
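The idea that removal is itself an added operation can be sketched with an append-only log. This is a simplified illustration under stated assumptions, not Peritext's or Yorkie's algorithm: `MarkOp` and `isMarked` are hypothetical names, ranges are plain indexes rather than anchored positions, and the most recent operation covering a character wins by Lamport time, with ties between actors left unhandled.

```typescript
// Hypothetical log entry: adding and removing a mark are both appended, never deleted.
type MarkOp = {
  action: 'add' | 'remove';
  markType: string;
  from: number; // inclusive
  to: number;   // exclusive
  lamport: number;
};

// A character has the mark if the newest covering op for that mark type is an 'add'.
function isMarked(ops: MarkOp[], markType: string, idx: number): boolean {
  let winner: MarkOp | undefined;
  for (const op of ops) {
    if (op.markType !== markType || idx < op.from || idx >= op.to) continue;
    if (!winner || op.lamport > winner.lamport) winner = op;
  }
  return winner !== undefined && winner.action === 'add';
}

// Toggle bold several times over "Hello".
const log: MarkOp[] = [
  { action: 'add', markType: 'bold', from: 0, to: 5, lamport: 1 },
  { action: 'remove', markType: 'bold', from: 2, to: 5, lamport: 2 },
  { action: 'add', markType: 'bold', from: 3, to: 4, lamport: 3 },
];
const boldFlags = [0, 1, 2, 3, 4].map((i) => isMarked(log, 'bold', i));
```

The log keeps all three entries after toggling; only the derived formatting changes, which is the property the setStyle/removeStyle toggling tests would need to exercise.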
Add link/comment types to MarkSpec
We will apply mark operations for link and comment types and ensure that they work well when expand and allowMultiple properties are different.
Provide a custom interface for users to define MarkSpec
Ultimately, rather than predefining MarkSpecs in Yorkie, we can define a custom interface that allows users to define MarkSpecs and call operations.
```typescript
// draft interface for custom mark type
export type MarkName = string;

export type AttributeSpec = {
  default?: any;
  required?: boolean;
};

export type MarkSpec = {
  expand: 'before' | 'after' | 'both' | 'none';
  excludes?: string[];
  allowMultiple?: boolean;
  attributes?: { [key: string]: AttributeSpec };
};

export type MarkTypes = Map<MarkName, MarkSpec>;

// method on the Text class
public addMarkType(markTypes: { [markName: MarkName]: MarkSpec }): MarkTypes {
  if (!this.markTypes) {
    this.markTypes = new Map();
  }

  for (const [k, v] of Object.entries(markTypes)) {
    this.markTypes.set(k, v);
  }

  return this.markTypes;
}
```
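A usage sketch of the draft interface might look like the following. The `MarkRegistry` class is a hypothetical stand-in for the part of Text that would hold mark types, and the concrete `expand` and `allowMultiple` values chosen for bold, link, and comment are assumptions loosely based on how Peritext describes those marks, not settings confirmed by Yorkie.

```typescript
type MarkName = string;
type AttributeSpec = { default?: unknown; required?: boolean };
type MarkSpec = {
  expand: 'before' | 'after' | 'both' | 'none';
  excludes?: string[];
  allowMultiple?: boolean;
  attributes?: { [key: string]: AttributeSpec };
};
type MarkTypes = Map<MarkName, MarkSpec>;

// Hypothetical holder for user-defined mark types.
class MarkRegistry {
  private markTypes: MarkTypes = new Map();

  // Register (or overwrite) the given mark types and return the full registry.
  addMarkType(markTypes: { [markName: string]: MarkSpec }): MarkTypes {
    for (const name of Object.keys(markTypes)) {
      this.markTypes.set(name, markTypes[name]);
    }
    return this.markTypes;
  }
}

const registry = new MarkRegistry();
const registered = registry.addMarkType({
  // Assumed: bold grows when text is typed at its end.
  bold: { expand: 'after' },
  // Assumed: a link does not grow and needs a url attribute.
  link: { expand: 'none', attributes: { url: { required: true } } },
  // Assumed: several comments may overlap on the same range.
  comment: { expand: 'none', allowMultiple: true, attributes: { id: { required: true } } },
});
```

Keeping the specs as data lets link and comment be tested with different `expand` and `allowMultiple` combinations without changing the operation logic.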
(optional) Implement Attribute operations
Currently, we are working on replacing the existing setStyle with mark operations. This implementation stores attribute information only for nodes within the mark range. However, if users encounter cases where attributes should not be applied to nodes within the range, we can address this in two ways:
Add Attribute operations to set attributes for individual nodes.
Add timestamp-related options to mark operations, allowing users to specify whether attributes should be applied based on a specific timestamp.
3. Use mark information when creating Snapshots (for issue 3)
Pass styleOpsBefore and styleOpsAfter information from the JS SDK to create snapshot
When creating a snapshot, we can resolve the issue of nodes unnecessarily holding attribute information by passing only mark information instead of all attribute information.
Use mark operation information when creating snapshot in the Go SDK
This approach also results in the final document being returned to the user with attributes applied to every node. To address this, the idea of internally merging text nodes with the same attributes can be considered. However, this may not be the optimal choice due to potential performance issues.
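The merging idea mentioned above can be sketched as a single pass over the nodes. `Run`, `sameAttrs`, and `mergeRuns` are hypothetical names for this illustration, and the sketch compares flat string attributes only.

```typescript
type Attrs = Record<string, string>;
type Run = { text: string; attrs: Attrs };

// Two attribute sets are equal if they have the same keys with the same values.
function sameAttrs(a: Attrs, b: Attrs): boolean {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  return keysA.length === keysB.length && keysA.every((k) => a[k] === b[k]);
}

// Merge neighbouring nodes whose attributes are identical into one run.
function mergeRuns(nodes: Run[]): Run[] {
  const merged: Run[] = [];
  for (const node of nodes) {
    const last = merged[merged.length - 1];
    if (last && sameAttrs(last.attrs, node.attrs)) {
      last.text += node.text;
    } else {
      merged.push({ text: node.text, attrs: { ...node.attrs } });
    }
  }
  return merged;
}

const perCharacter: Run[] = [
  ...Array.from('Hello', (ch) => ({ text: ch, attrs: { bold: 'true' } })),
  { text: ' ', attrs: {} },
  { text: '!', attrs: {} },
];
const runs = mergeRuns(perCharacter);
```

The pass is linear in the number of nodes plus the attribute comparisons, so the performance concern is mainly about how often it would have to be repeated as the document changes.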
4. Introduce line formatting (for issue 4)
Define criteria for line formatting
Similar to Quill, we can define a character that separates lines and distinguish between attributes for lines and attributes for inline text when setting attributes for nodes. Note that Quill only allows \n as a newline character and does not accept \r or \r\n.
For more details: Quill - Canonical
Additional Notes:
Mark operations can be introduced to other attribute-setting operations such as Tree.Style. However, this requires changes to the specific implementation direction since the Tree does not currently use a linked list.
We can consider options that allow users to define the document schema, such as configuring the character that represents newlines or defining MarkSpec for the entire document.