Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Import] Duplicate finding cleanup, including using supplied dedupe rule #23473

Closed
wants to merge 3 commits into from

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented May 17, 2022

Overview

Fix from @darrick darrick@9b1470a combined with cleanup of how duplicate contacts are handled in the import script

Before

Dedupe handling is 'lightly sprinkled' throughout the script

After

Baby steps on consolidation

Technical Details

@darrick if you get a chance to test this please do - I'm gonna try to do a bit more cleanup on top of this

Comments

A more ambitious version of this PR is #23476 - if that is merged then this will automatically closed or if this is merged first then I will rebase this out of that (it depends which a reviewer prefers)

@civibot
Copy link

civibot bot commented May 17, 2022

(Standard links)

@eileenmcnaughton
Copy link
Contributor Author

Merged via #23476

@eileenmcnaughton eileenmcnaughton deleted the import_dupe branch May 19, 2022 00:22
@darrick
Copy link
Contributor

darrick commented May 24, 2022

I have been spending some time going over the latest merges for the import mechanism. I'm really appreciating all the work you are putting into this. I have a branch here which includes a few tests for some edge cases: master...darrick:deprecated_contact_check_params

Test that import parser will not match the imported primary to an existing contact via the related contacts fields.

  • Currently fails because CRM_Dedupe_Finder::formatParams($input, $contactType);
  • called in getDuplicateContacts flattens the contact array adding the
  • related contacts values to the primary contact.

Test that import parser will not throw error if Related Contact is not found via passed in External ID.

  • Currently fails because validation assumes the Related contact will be found.
  • When it is later not found creating the contact via the API throws an
  • error for missing required fields.

Test that import parser will add contact with employee of relationship.

  • This test actual fails because DUPLICATE_UPDATE isn't set
  • in the UserJob metadata.

I ran across a few other things but how to handle them depend somewhat on how these issues are handled. I.e. I'm still hoping hook_civicrm_findDuplicates gets' called for the related contact before it throws a required field error because my hook is matching the passed in external_identifier to my entity which holds multiple identifiers. And then I'm hoping hook_import gets called on the related contact as well. But those are less bugs and more feature requests. I have this branch here where I was working on that stuff: master...darrick:import-match-related

@eileenmcnaughton
Copy link
Contributor Author

@darrick that's really exciting - both that you have been doing some testing and that you have written tests!

I'll have a go at some of those tests now - I was just looking at the (crazy) relationship code - so the test will help a lot there.

If you have a chance to review any of the open PRs that will help me get them merged quickly too

@darrick
Copy link
Contributor

darrick commented May 24, 2022

@eileenmcnaughton I'd like to but I'm doing this a bit in my free time (9:30p here). I am finding it's easier for me to explain things via tests then in words. I will try my best. I've been using CiviCRM for about 15 years so I'm happy to do what I can to give back to the community.

@eileenmcnaughton
Copy link
Contributor Author

@darrick I took a first stab at your first test #23560 - I commented on the PR that I'm not quite sure I've nailed it in the various modes -but I'm gonna look at the other test before I break my brain on that.

@eileenmcnaughton
Copy link
Contributor Author

OK so the 3rd one was easy as you have already highlighted the problem - #23560

@eileenmcnaughton
Copy link
Contributor Author

Regarding the hook - I agree that we should have better placed hooks! One to think about when stuff is a bit more settled though

@eileenmcnaughton
Copy link
Contributor Author

OK - last test is NOT failing for me - #23562 - let's see how jenkins gets on

I have on my radar that there is still no test cover, and a possible regression, for when more than one contact can be matched

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants