RelationshipCache - Add a high-level index to facilitate relationship queries (more fields) #17781

Merged
merged 8 commits into from
Jul 17, 2020

@totten totten commented Jul 9, 2020

Overview

In CiviCRM, "relationships" are user-configurable relations between two contacts. The data-structure which represents these configurable relationships is complicated to query. This PR adds a supplemental data-structure (the "relationship cache") which makes it more, eh, relatable.

(NOTE: This is a rebased and extended version of #17724 which mirrors additional relationship data to further simplify querying. The original description was a bit curt; I've rewritten to incorporate more of the original explanation from #17724.)

Before

Each end of a relationship may be given a name (e.g. Spouse of, Child of, Parent of). In the civicrm_relationship_type table, these names correspond to a few columns (partial listing):

  • civicrm_relationship_type.id
  • civicrm_relationship_type.name_a_b
  • civicrm_relationship_type.name_b_a

When a specific relationship is created between two contacts, Civi creates correlated values in the civicrm_relationship table (partial listing):

  • civicrm_relationship.relationship_type_id
  • civicrm_relationship.contact_id_a
  • civicrm_relationship.contact_id_b

The difficulty arises when building queries on this data. As a consumer of the data, I know the name of a relation that interests me (e.g. Parent of). I need to know which column (contact_id_a or contact_id_b) corresponds to each side of the relationship, and then orient the query structure to match. Depending on the metadata stored in civicrm_relationship_type, I may need to swap contact_id_a and contact_id_b in different places - or I may need to query both ways and UNION the results. This grows increasingly complicated if you wish to query on multiple relationship types.
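To make the problem concrete, here is a minimal sketch of the "query both ways and UNION" pattern. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL and models only the columns listed above. It assumes name_a_b describes contact A's relation to contact B. Relationship type 2 and contact #300 are hypothetical, added only so that both branches of the UNION match something.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; only the columns named above are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE civicrm_relationship_type (
  id INTEGER PRIMARY KEY, name_a_b TEXT, name_b_a TEXT);
CREATE TABLE civicrm_relationship (
  id INTEGER PRIMARY KEY, relationship_type_id INTEGER,
  contact_id_a INTEGER, contact_id_b INTEGER);

-- Type 1: contact A is the child. Type 2 (hypothetical): contact A is the parent.
INSERT INTO civicrm_relationship_type VALUES (1, 'Child of', 'Parent of');
INSERT INTO civicrm_relationship_type VALUES (2, 'Parent of', 'Child of');

-- Bob (#200) is Child of Alice (#100), stored with Bob on the A side.
INSERT INTO civicrm_relationship VALUES (1, 1, 200, 100);
-- Hypothetical contact #300 is also Alice's child, stored with Alice on the A side.
INSERT INTO civicrm_relationship VALUES (2, 2, 100, 300);
""")

# The caller has to check both orientations and merge the results.
LEGACY_CHILDREN_SQL = """
SELECT r.contact_id_a AS child_id
FROM civicrm_relationship r
JOIN civicrm_relationship_type t ON t.id = r.relationship_type_id
WHERE r.contact_id_b = :cid AND t.name_a_b = 'Child of'
UNION
SELECT r.contact_id_b AS child_id
FROM civicrm_relationship r
JOIN civicrm_relationship_type t ON t.id = r.relationship_type_id
WHERE r.contact_id_a = :cid AND t.name_b_a = 'Child of'
ORDER BY child_id
"""

children = [row[0] for row in conn.execute(LEGACY_CHILDREN_SQL, {"cid": 100})]
print(children)
```

Each additional relationship name of interest adds more conditions to both branches, which is where the complexity described above comes from.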

After

You may query civicrm_relationship_cache, which has these columns:

  • Identity columns: id, relationship_id, orientation
  • Perspective columns: near_contact_id, near_relation, far_contact_id, far_relation
  • Mirror columns: relationship_type_id, start_date, end_date, is_active

The most important columns are the perspective columns. As the query author, you should match your initial contact against near_contact_id; the counterparty will always be in far_contact_id. near_relation describes the relationship from the perspective of the near contact, and far_relation describes it from the perspective of the far contact.

For example, if Alice (#100) is the parent of Bob (#200), then the civicrm_relationship_cache would have these two records:

near_contact_id   near_relation   far_contact_id   far_relation
Alice (#100)      Parent of       Bob (#200)       Child of
Bob (#200)        Child of        Alice (#100)     Parent of
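As a rough illustration of how one relationship fans out into these two records, here is a small sketch in plain Python. The function name cache_rows, the dict shapes, and the orientation values 'a_b' and 'b_a' are assumptions made for this example; the mapping of name_a_b to contact A's perspective is also assumed rather than taken from the PR's code.

```python
from typing import Dict, List


def cache_rows(rel: Dict, rel_type: Dict) -> List[Dict]:
    """Build the two perspective rows for one relationship (illustrative only)."""
    return [
        {   # Perspective of contact A (orientation label is an assumption).
            "relationship_id": rel["id"],
            "orientation": "a_b",
            "near_contact_id": rel["contact_id_a"],
            "near_relation": rel_type["name_a_b"],
            "far_contact_id": rel["contact_id_b"],
            "far_relation": rel_type["name_b_a"],
        },
        {   # Perspective of contact B: same relationship, columns swapped.
            "relationship_id": rel["id"],
            "orientation": "b_a",
            "near_contact_id": rel["contact_id_b"],
            "near_relation": rel_type["name_b_a"],
            "far_contact_id": rel["contact_id_a"],
            "far_relation": rel_type["name_a_b"],
        },
    ]


# Bob (#200) is stored as contact A, Alice (#100) as contact B.
rows = cache_rows(
    {"id": 1, "contact_id_a": 200, "contact_id_b": 100},
    {"name_a_b": "Child of", "name_b_a": "Parent of"},
)
for row in rows:
    print(row["near_contact_id"], row["near_relation"],
          row["far_contact_id"], row["far_relation"])
```

Whichever contact you start from, there is exactly one row with that contact in near_contact_id, so the consumer never needs to know which side was stored as A or B.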

Examples

Suppose we have a contact Alice (#100) and want to see which contacts are her children. We can query:

SELECT far_contact_id
FROM civicrm_relationship_cache
WHERE near_contact_id = 100
AND far_relation = 'Child of'
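
To try this query without a CiviCRM site, the following sketch loads the two Alice/Bob records from the table above into an in-memory SQLite database (a stand-in for MySQL) and runs the same SELECT. Only the perspective columns are modelled; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in table: perspective columns only, types assumed.
conn.execute("""
CREATE TABLE civicrm_relationship_cache (
  near_contact_id INTEGER, near_relation TEXT,
  far_contact_id INTEGER, far_relation TEXT)
""")
conn.executemany(
    "INSERT INTO civicrm_relationship_cache VALUES (?, ?, ?, ?)",
    [
        (100, "Parent of", 200, "Child of"),   # Alice's perspective
        (200, "Child of", 100, "Parent of"),   # Bob's perspective
    ],
)

# Same query as above, with Alice's contact ID passed as a parameter.
alice_children = [
    row[0]
    for row in conn.execute(
        """
        SELECT far_contact_id
        FROM civicrm_relationship_cache
        WHERE near_contact_id = ?
        AND far_relation = 'Child of'
        """,
        (100,),
    )
]
print(alice_children)
```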

Suppose we want to look up the immediate family (parents, children, siblings, spouses) for everyone named "Bob" - and display their names:

<?php
$select = CRM_Utils_SQL_Select::from('civicrm_contact bob')
  ->where('bob.first_name LIKE "Bob"')
  ->join('relcache', 'INNER JOIN civicrm_relationship_cache relcache ON (bob.id = relcache.near_contact_id AND relcache.near_relation IN (@relations))', [
    '@relations' => ['Parent of', 'Child of', 'Spouse of', 'Sibling of'],
  ])
  ->join('the_relative', 'INNER JOIN civicrm_contact the_relative ON relcache.far_contact_id = the_relative.id')
  ->select('bob.id as bob_cid, bob.display_name as bob_full_name')
  ->select('the_relative.id as the_relative_cid, the_relative.display_name as the_relative_full_name')
  ->select('relcache.near_relation as near_relation')
;

echo $select->toSQL();
echo "\n\n";

foreach ($select->execute()->fetchAll() as $row) {
  printf("%s (#%d) is %s %s (#%d)\n", 
    $row['bob_full_name'], $row['bob_cid'], $row['near_relation'], $row['the_relative_full_name'], $row['the_relative_cid']);
}

Which produces output like this:

Mr. Bob Cooper III (#42) is Child of Rosario Cooper (#38)
Mr. Bob Cooper III (#42) is Child of Dr. Juliann Cooper (#196)
Mr. Bob Cooper III (#42) is Sibling of Ms. Merrie Cooper (#51)
Bob Bachman-Cooper (#151) is Child of Dr. Ashley Bachman Sr. (#129)
Bob Bachman-Cooper (#151) is Child of [email protected] (#89)
Bob Bachman-Cooper (#151) is Sibling of Mr. Allan Bachman-Cooper Jr. (#173)
Bob Jameson (#180) is Child of Dr. Toby Jameson III (#190)
Bob Jameson (#180) is Child of Dr. Magan Jameson (#128)
Bob Jameson (#180) is Sibling of [email protected] (#121)

The nice thing to note here is that we did not have to do any accounting for the orientation of the relationship -- there are no unions and there is no switching between contact_id_a and contact_id_b. We start from our initial contact ID (bob) and simply join on the relationship's near_contact_id. The relationship is effortlessly examined from Bob's perspective.

Technical Details

For every record in civicrm_relationship, there are two records in civicrm_relationship_cache. (One record from the perspective of contact_id_a; and the other record from the perspective of contact_id_b.) This mapping is enforced at runtime via trigger, and it is initialized during upgrade.
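
The one-to-two mapping can be sketched with a trigger as follows. This is not the PR's actual trigger: it is written in SQLite syntax (run through Python's sqlite3) rather than MySQL, it handles only INSERT, and the trigger name, column types, and orientation values 'a_b' and 'b_a' are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE civicrm_relationship_type (
  id INTEGER PRIMARY KEY, name_a_b TEXT, name_b_a TEXT);
CREATE TABLE civicrm_relationship (
  id INTEGER PRIMARY KEY, relationship_type_id INTEGER,
  contact_id_a INTEGER, contact_id_b INTEGER);
CREATE TABLE civicrm_relationship_cache (
  id INTEGER PRIMARY KEY, relationship_id INTEGER, orientation TEXT,
  near_contact_id INTEGER, near_relation TEXT,
  far_contact_id INTEGER, far_relation TEXT);

-- Hypothetical trigger: write one cache row per perspective on every insert.
CREATE TRIGGER relcache_after_insert AFTER INSERT ON civicrm_relationship
BEGIN
  INSERT INTO civicrm_relationship_cache
    (relationship_id, orientation, near_contact_id, near_relation,
     far_contact_id, far_relation)
  SELECT NEW.id, 'a_b', NEW.contact_id_a, t.name_a_b,
         NEW.contact_id_b, t.name_b_a
  FROM civicrm_relationship_type t
  WHERE t.id = NEW.relationship_type_id;

  INSERT INTO civicrm_relationship_cache
    (relationship_id, orientation, near_contact_id, near_relation,
     far_contact_id, far_relation)
  SELECT NEW.id, 'b_a', NEW.contact_id_b, t.name_b_a,
         NEW.contact_id_a, t.name_a_b
  FROM civicrm_relationship_type t
  WHERE t.id = NEW.relationship_type_id;
END;

INSERT INTO civicrm_relationship_type VALUES (1, 'Child of', 'Parent of');
-- Bob (#200) is Child of Alice (#100); the trigger fills the cache.
INSERT INTO civicrm_relationship VALUES (1, 1, 200, 100);
""")

cache = conn.execute("""
SELECT orientation, near_contact_id, near_relation, far_contact_id, far_relation
FROM civicrm_relationship_cache
ORDER BY orientation
""").fetchall()
for row in cache:
    print(row)
```

A complete version would also need to keep the cache in step with updates and deletes on civicrm_relationship, and would need a one-time fill for existing rows, which the PR handles during upgrade.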

This PR only aims to provide the data-structure. There is a follow-up need to enhance APIv4's Relationship support to take advantage of this mechanism.

@colemanw
Member

We're almost there. I would like to request 2 changes to this before merging:

  1. Rename _vtx to the more conventional _cache which should be easier to grok and also IIRC gets automatic special treatment like exclusion from logging triggers.
  2. All of the mirror fields are fine except I would leave out case_id as it belongs to a component that's being slowly decoupled from core as an extension and will probably take its data to other tables at some point.

@totten totten force-pushed the master-vortex-max branch from b28a1e6 to af5eee8 Compare July 16, 2020 08:10
@totten totten changed the title from "RelationshipVortex - Add a high-level index to facilitate relationship queries (more fields)" to "RelationshipCache - Add a high-level index to facilitate relationship queries (more fields)" Jul 16, 2020
@totten
Member Author

totten commented Jul 16, 2020

@colemanw Alrighty, updates:

  • Rebased
  • Did another round of squashes
  • Renamed from "vortex" to "cache"
  • Removed case_id
  • Regenerated civicrm_generated.mysql
  • Updated description to be more complete (copy/paste/edit intro material from the older PR)
  • Did a round of re-testing - verified that:
    • Some example queries work
    • The upgrader creates and fills the table (tested via cv upgrade:db -vvv)
    • The trigger works for both web-based updates and SQL-CLI updates (on both clean installs and upgraded installs)

@colemanw
Member

@civicrm-builder retest this please

@colemanw
Member

colemanw commented Jul 16, 2020

@totten Jenkins is falling over trying to apply the patch from this PR because it attempts to simultaneously add and delete the same files. It's because you renamed them.

@totten totten force-pushed the master-vortex-max branch from af5eee8 to 9cbdbf0 Compare July 16, 2020 22:21
@totten
Member Author

totten commented Jul 16, 2020

@civicrm-builder retest this please

@colemanw
Member

I've done r-run and this seems to work great once I add a simple API entity for it. Will PR that once this is merged.

@seamuslee001 seamuslee001 merged commit 69b599e into civicrm:master Jul 17, 2020
@totten totten deleted the master-vortex-max branch July 23, 2020 03:07