Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add apoc.diff user function #760

Merged
merged 1 commit into from
Jun 21, 2018
Merged

Add apoc.diff user function #760

merged 1 commit into from
Jun 21, 2018

Conversation

binarycoded
Copy link
Contributor

The new procedure apoc.diff.nodes returns an overview of all node property differences of two nodes given by their nodeIds. The new functionality is documented in a new *.adoc-file (with usage and example Cypher) and covered by a unit test.

@jexp
Copy link
Member

jexp commented Mar 18, 2018

Hi Benjamin,

thanks a lot for your contribution.

  1. Please sign our CLA if you haven't yet:
  2. I think apoc.diff.nodes should be also available for maps in general and relationships
  3. I'm a bit torn about it being a procedure vs. function, perhaps we could add a boolean function that returns true / false
  4. do we really need the guava dependency? I mean after all it is just 2 x entrySet.removeAll for disjuction, and 1x entrySet.retainAll for intersection
  5. please change the test to run with embedded not spinning up a neo4j server otherwise it slows down the testing too much

Thanks
Michael

@jexp
Copy link
Member

jexp commented Mar 19, 2018

Sorry fogot the CLA: https://neo4j.com/developer/cla/

@binarycoded
Copy link
Contributor Author

Hi Michael,

thanks for your feedback!

Short question on point 3: Do you think we should add both, a procedure that returns detailed diff information and a function that only returns a boolean or just the function?

Thanks
Benjamin

@jexp
Copy link
Member

jexp commented Mar 21, 2018

Yes both would be good.

the equality would be the same as properties(n) = properties(m) AND labels(n) = labels(m) or?

We could also add sub.functions that provide the difference, intersection, disjunctions of maps

Michael

@binarycoded binarycoded changed the title Add apoc.diff.nodes procedure to return an overview of all node property differences Add apoc.diff procedures and user functions Mar 29, 2018
@binarycoded
Copy link
Contributor Author

Hi Michael,

I signed the CLA and reworked all the code. We now have different functions that return only boolean values and one procedure that returns all the detailed information (examples described in the adoc file).

The tests are using the embedded Neo4j now and the guava dependency is not needed anymore. 👍

Maybe I can add more functionality in an additional PR if this one will be merged.

Hope you like it! 😉
Benjamin

@binarycoded
Copy link
Contributor Author

@jexp I just rebased to the last commit with the Travis fix - it's all green now! 👍

Copy link
Member

@jexp jexp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have a look at my comments

public GraphDatabaseService db;

@UserFunction
@Description("apoc.diff.maps([leftMap],[rightMap])" )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add an explanation to all descriptions ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not really a diff, it's just an equals, which is the same as WHERE leftMap = rightMap in cypher, or?
I thought you'd have a function that outputs the differences between both, which then should go to apoc.map.*


@UserFunction
@Description("apoc.diff.properties([leftNodeId],[rightNodeId])")
public boolean properties(@Name("leftNodeId") Long leftNodeId, @Name("rightNodeId") Long rightNodeId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

support nodes and not node-ids, if someone has node-ids they can use cypher to look them up.
as you want to use this from cypher you most often have nodes, and not node ids.

@@ -0,0 +1,110 @@
package apoc.diff;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps rather call it apoc.equals.*

a diff is more detailed difference computation and output

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw. Java equality is not the same as Cypher equality

e.g. in Cypher all these are equal (int)1 == long)1 == (byte)1 == (float)1.0 == (double)1.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reworked


@Procedure(mode = Mode.READ)
@Description("apoc.diff.nodes([leftNodeId],[rightNodeId])")
public Stream<DiffResult> nodes(@Name("leftNodeId") Long leftNodeId, @Name("rightNodeId") Long rightNodeId) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could have this both as function and procedure, the function then would return a map with the fields of the DiffResult being keys.

private Map<String, Object> getPropertiesLeftOnly(Node leftNode, Node rightNode) {
Map<String, Object> leftOnly = new HashMap<>();
leftOnly.putAll(leftNode.getAllProperties());
leftOnly.putAll(rightNode.getAllProperties());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need it
also you should access the node's properties only once per node, it's an expensive operation
so all these private methods should take Maps instead.

keyPairs.putAll(leftNode.getAllProperties());
keyPairs.keySet().retainAll(rightNode.getAllProperties().keySet());

Iterator it = keyPairs.entrySet().iterator();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use foreach

Iterator it = keyPairs.entrySet().iterator();
while (it.hasNext()) {
Map.Entry entry = (Map.Entry)it.next();
if (!leftNode.getAllProperties().get(entry.getKey()).equals(rightNode.getAllProperties().get(entry.getKey()))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you already have it in keyPairs

while (it.hasNext()) {
Map.Entry entry = (Map.Entry)it.next();
if (!leftNode.getAllProperties().get(entry.getKey()).equals(rightNode.getAllProperties().get(entry.getKey()))) {
Map<String, Object> pairs = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably eaiser to use a 2 element array or list

Map<String, Object> pairs = new HashMap<>();
pairs.put("leftValue", leftNode.getAllProperties().get(entry.getKey()));
pairs.put("rightValue", rightNode.getAllProperties().get(entry.getKey()));
differing.put(entry.getKey().toString(), pairs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keys are always strings


Iterator it = keyPairs.entrySet().iterator();
while (it.hasNext()) {
Map.Entry entry = (Map.Entry)it.next();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing generics, they come with foreach

@jexp
Copy link
Member

jexp commented Apr 26, 2018

@binarycoded ping :)

@binarycoded
Copy link
Contributor Author

@jexp I'm still here! :) Will have a look at your comments asap.

@jexp
Copy link
Member

jexp commented May 9, 2018

ping again :)

@jexp
Copy link
Member

jexp commented Jun 8, 2018

I'd really love to merge this in :) Please have a look at my feedback. Thank you!!

@binarycoded
Copy link
Contributor Author

Hey @jexp, I hope I find some time in the next few days ...

@binarycoded binarycoded changed the base branch from 3.3 to 3.4 June 15, 2018 10:02
@binarycoded binarycoded changed the title Add apoc.diff procedures and user functions Add apoc.diff user functions Jun 15, 2018
@binarycoded binarycoded changed the title Add apoc.diff user functions Add apoc.diff user function Jun 15, 2018
@binarycoded
Copy link
Contributor Author

Hi @jexp
I tried to reduce this PR to the minimum. I was able to just create a user function for my diff - no procedure. All other additional functionality should be moved to a new PR I think. Would be great if we could finally close this one. ;)
Best regards

@jexp
Copy link
Member

jexp commented Jun 21, 2018

Looks great, thanks so much.

@jexp jexp merged commit bebfff1 into neo4j-contrib:3.4 Jun 21, 2018
@binarycoded binarycoded deleted the nodediff branch June 21, 2018 04:44
jexp pushed a commit that referenced this pull request Jul 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants