Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: CRDT merge direction #2016

Conversation

fredcarle
Copy link
Collaborator

Relevant issue(s)

Resolves #1066

Description

This PR change the CRDT merge direction. Originally, when an update arrived over the P2P network, each block was added to the DAG store and merged into the datastore in the order that they were received (newest to oldest block).

With this update, we start my receiving all blocks up to the version that we currently have (or until we reach the root) and then we apply the merge from oldest to newest block. This will enable more CRDT types to be added.

Tasks

  • I made sure the code is well commented, particularly hard-to-understand areas.
  • I made sure the repository-held documentation is changed accordingly.
  • I made sure the pull request title adheres to the conventional commit style (the subset used in the project can be found in tools/configs/chglog/config.yml).
  • I made sure to discuss its limitations such as threats to validity, vulnerability to mistake and misuse, robustness to invalidation of assumptions, resource requirements, ...

How has this been tested?

make test

Specify the platform(s) on which this was tested:

  • MacOS

The removed code has no impact on the execution of the delete.
The returned links are never used in the code base and the nodeGetter param is never used within the function.
@fredcarle fredcarle added area/crdt Related to the (Merkle) CRDT system refactor This issue specific to or requires *notable* refactoring of existing codebases and components labels Oct 30, 2023
@fredcarle fredcarle added this to the DefraDB v0.7 milestone Oct 30, 2023
@fredcarle fredcarle requested a review from a team October 30, 2023 23:01
@fredcarle fredcarle self-assigned this Oct 30, 2023
@fredcarle fredcarle force-pushed the fredcarle/refactor/I1066-Incorrect-crdt-merge-direction branch from 87e9082 to 9a89494 Compare October 30, 2023 23:09
@codecov
Copy link

codecov bot commented Oct 30, 2023

Codecov Report

Attention: 33 lines in your changes are missing coverage. Please review.

Comparison is base (b6abbb6) 73.97% compared to head (cf810d6) 73.88%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2016      +/-   ##
===========================================
- Coverage    73.97%   73.88%   -0.09%     
===========================================
  Files          248      248              
  Lines        24822    24801      -21     
===========================================
- Hits         18361    18324      -37     
- Misses        5205     5218      +13     
- Partials      1256     1259       +3     
Flag Coverage Δ
all-tests 73.88% <73.60%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
core/crdt/composite.go 71.65% <ø> (+3.80%) ⬆️
db/fetcher/versioned.go 59.79% <100.00%> (ø)
net/server.go 72.21% <84.62%> (-4.06%) ⬇️
merkle/clock/clock.go 61.39% <37.50%> (-1.47%) ⬇️
net/dag.go 90.24% <61.90%> (-9.76%) ⬇️
net/process.go 81.88% <78.05%> (-6.35%) ⬇️

... and 2 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6abbb6...cf810d6. Read the comment docs.

@fredcarle fredcarle force-pushed the fredcarle/refactor/I1066-Incorrect-crdt-merge-direction branch from 9a89494 to e6d5da8 Compare October 30, 2023 23:22
Makefile Outdated Show resolved Hide resolved
Copy link
Contributor

@AndrewSisley AndrewSisley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a couple of localised todos :)

net/process.go Outdated
if err != nil {
return nil, errors.Wrap("failed to decode delta object", err)
if isComposite {
bp.composites.PushFront(nd)
Copy link
Contributor

@AndrewSisley AndrewSisley Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: This feels quite strange, as you are appending to a private property on an object that never reads from it (the only reading of it is done from server.go). Given that this is the only property on this object it makes me wonder why blockProcessor exists as a type at all.

Also strange is that the block in server.go that does read from this, only uses stuff (peer) that is already on this object.

I suggest either removing blockProcessor as a type entirely (and just pass the list around), or move the below block from server.go into this type (into a (bp *blockProcessor) ProcessBlocks() func or similar):

for e := bp.composites.Front(); e != nil; e = e.Next() {
			nd := e.Value.(format.Node)
			err := s.peer.processBlock(ctx, txn, col, dsKey, nd, "")
			if err != nil {
				log.ErrorE(
					ctx,
					"Failed to process block",
					err,
					logging.NewKV("DocKey", dsKey.DocKey),
					logging.NewKV("CID", nd.Cid()),
				)
			}
		}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I move the block to process.go and simplified things a little more. blockProcessor had more fiels initially but I was able make the change leaner and it got kept around. I added some relevant fields to help pass things around. I like having the blockProcessor type as it keeps the processing under a specific domaine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good, thanks Fred :)

if j.nodeGetter != nil && j.cid.Defined() {
cNode, err := j.nodeGetter.Get(p.ctx, j.cid)
if err != nil {
log.ErrorE(p.ctx, "Failed to get node", err, logging.NewKV("CID", j.cid))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: Please define these errors (and others in other files) in an errors.go file

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not returned errors. It's a log message. We don't predefine our log messages at all currently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our log messages are public, just like errors. Most of the time the error messages will be exposed via logs (e.g. CLI/HTTP), the reasons for housing them in a dedicated and standardised file are exactly the same.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although there is no value in defining them in the errors.go file. It's not an error that can be handled or compared. It's just a message in a log file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time for a log.go file :,)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with a solution like that. I'll leave it as is for now and create a separate PR to take care of log messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a message in a log file.

nitpick: It is an error message in a log file, and indistinguishable to our users from any other 'kind' of error.

I'll leave it as is for now and create a separate PR to take care of log messages

I'm fine with that, but I do think they should live in an errors.go file, because they are errors.

Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (just one todo)

@fredcarle fredcarle merged commit 597d8f3 into sourcenetwork:develop Nov 1, 2023
29 checks passed
shahzadlone pushed a commit to shahzadlone/defradb that referenced this pull request Feb 23, 2024
## Relevant issue(s)

Resolves sourcenetwork#1066 

## Description

This PR change the CRDT merge direction. Originally, when an update
arrived over the P2P network, each block was added to the DAG store and
merged into the datastore in the order that they were received (newest
to oldest block).

With this update, we start my receiving all blocks up to the version
that we currently have (or until we reach the root) and then we apply
the merge from oldest to newest block. This will enable more CRDT types
to be added.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/crdt Related to the (Merkle) CRDT system refactor This issue specific to or requires *notable* refactoring of existing codebases and components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect CRDT Merge direction
3 participants