-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove _predicate_ from Dgraph. #3262
Conversation
From now on, expand queries will rely solely on the type of the node. This change removes all mentions of the _predicate_ internal pred that was previously being used to serve this type of queries. There is also another small change to allow the value of a list variable to be used as a list of predicates in expand queries. Most of the changes due to removing or replacing tests. These changes are due to one of the following reasons. 1. Tests that were checking for the state of _predicate_ and are no longer valid. 2. Tests that were doing checks that should be done inside of the query package. They have been removed if there was an existing query that was doing the same or a new test has been added if no similar test existed. 3. Some of the expand tests inside the query package were removed because there are tests verifying the expand functionality using types. Since types are now the only way to make expand queries work, these tests became obsolete.
Ran some benchmarks comparing the performance of the live and bulk loaders when For the bulkloader (using the 21 million dataset), I observed an improvement but it was not drastic. The number of edges went down from 120 million to 99 million. We do not observe a 50% reduction in the number of edges because there are nodes that contain more than an edge for a given predicate. During the mapping phase, it seems that the cost of adding a new In average, the version without For the live loader I used the smaller 1 million dataset. I could not observe any performance differences between the two versions. My hypothesis is that the process is throttled by transactions (which are not involved when the bulk loader is used) so the reduction in edges does not result in a reduction in the total time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 31 files reviewed, 2 unresolved discussions (waiting on @golangcibot)
dgraph/cmd/bulk/schema.go, line 40 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not
goimports
-ed (fromgoimports
)
Done.
dgraph/cmd/zero/oracle.go, line 336 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
File is not
goimports
-ed (fromgoimports
)
Done.
Re did the bulk loader benchmark, this time without any indices in the schema. This time I observe the 50% reduction although the time savings are not 50% but are nonetheless noticeable (1m5s vs 42s). |
I ran the live loader benchmarks. Without |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 27 of 30 files at r1, 4 of 4 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @golangcibot)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a pretty big changeset, but I think I looked at everything.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @golangcibot)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wao! This PR is a dream come true.
This PR also has a real potential of breaking transactions and queries. So, before you merge this, please do these things:
- Run Jepsen bank tests overnight. That means, 20 times or more, each test running for half an hour.
- Try out as many expand all queries as you can, manually with the 21 million dataset. Do this irrespective of whether you think unit tests cover all the cases or not.
Make the outcome of this verification process part of your PR commit message (when you squash and merge).
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @golangcibot and @manishrjain)
…erscore-predicate
…erscore-predicate
…erscore-predicate
…erscore-predicate
…erscore-predicate
From now on, expand queries will rely solely on the type of the node. This change removes all mentions of the _predicate_ internal pred that was previously being used to serve this type of queries. There is also another small change to allow the value of a list variable to be used as a list of predicates in expand queries. Most of the changes due to removing or replacing tests. These changes are due to one of the following reasons. 1. Tests that were checking for the state of _predicate_ and are no longer valid. 2. Tests that were doing checks that should be done inside of the query package. They have been removed if there was an existing query that was doing the same or a new test has been added if no similar test existed. 3. Some of the expand tests inside the query package were removed because there are tests verifying the expand functionality using types. Since types are now the only way to make expand queries work, these tests became obsolete. Verified jepsen tests work and that the 21 million dataset works as expected by manually adding types to certain nodes. Types and type definitons will be added to the entire dataset later.
From now on, expand queries will rely solely on the type of the node.
This change removes all mentions of the
_predicate_
internal pred thatwas previously being used to serve this type of queries.
There is also another small change to allow the value of a list variable
to be used as a list of predicates in expand queries.
Also, some of the expand tests in the query package were simplified so that the
output of the test queries is not as long.
Most of the changes due to removing or replacing tests. These changes
are due to one of the following reasons.
_predicate_
and are nolonger valid.
package. They have been removed if there was an existing query that was
doing the same or a new test has been added if no similar test existed.
because there are tests verifying the expand functionality using types.
Since types are now the only way to make expand queries work, these
tests became obsolete.
This change is