Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix contribution detail report to work with FULL GROUP BY mode #11954

Merged
merged 1 commit into from
Apr 12, 2018

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented Apr 6, 2018

Overview

Fix the contribution detail report so that it works in full group by mode

Before

Works when full group by is not enabled, fatal errors when it is.

After

Works in both full group by and non full group by mode

Technical Details

FULL GROUP BY mode is a mysql configuration that is enabled out of the box for mysql 5.7. It enforces a level of query logic that prevents queries from returning unreliable results.

Our test suite shows that some of our tests fail with it (https://test.civicrm.org/job/CiviCRM-Core-Matrix/CIVIVER=5.1,label=ubuntu1604/lastCompletedBuild/testReport/). In some cases these fails are for good reason - eg disabling 'search by primary only' is not reliable with full group by mode enabled and our tests demonstrate that. Having turned full group by mode on locally I have hit other fatal errors & think we should be very clear that it is NOT supported.

However, it does make sense to work towards supporting it and this patch takes us in that direction. The approach is one that was developed in the extended reports extension, less because of the full group by issue & more because of the unreliable results that can be returned if you don't write queries to that standard - ie. the first value for a field might be returned only.

The change is intended to be extendable to other reports - but they have to opt-in (by setting $this->groupConcatTested to TRUE). What it does is the select clause it will add wrap any fields in GROUP_CONCAT that are not already wrapped in an aggregate function or in the groupBy selection, if there is a group by clause.

The other change is that the parent class added the order by fields to the select clause if they were section headers & the child class did the same for all order by fields. This makes the parent do it for all order by fields. I feel fairly confident about that but I think a merge straight after the rc has been cut would be good timing.

Comments

I also added handling for the metadata to declare a field a 'pseudofield' & have select function ignore it (also from extended reports)

@monishdeb @seamuslee001 ping - I keep seeing attempts at this so would be good to get this approach tested & agreed

Test cover exists in the form of a unit test that currently fails in full-group-by

@seamuslee001
Copy link
Contributor

@eileenmcnaughton is your thinking here that we need to move to actually correcting the queries on all instances rather than as we have been doing if MySQL supports it do x y z

@eileenmcnaughton
Copy link
Contributor Author

Right - either the query is correct or it is wrong - regardless of the mysql standard.

@eileenmcnaughton
Copy link
Contributor Author

BTW - it's not whether mysql supports full group by - its whether it ENFORCES it - they all support it

@seamuslee001
Copy link
Contributor

@eileenmcnaughton tho the enforcement if you were to enable it is different between Maria v MySQL and MySQL 5.5 v MySQL 5.7

@eileenmcnaughton
Copy link
Contributor Author

@seamuslee001 yeah - I'm not too sure on all the differences - would be good to get clarity. I know that ANY_VALUE is not supported on maria DB - but that is just a hack to avoid the enforcement of full group by queries & we shouldn't use it even if permitted

@seamuslee001
Copy link
Contributor

@eileenmcnaughton i remember i managed to bork the test box by making it add ONLY_FULL_GROUP_BY and it really borked jenkins on 5.5 but on 5.7 its much happier with that SQLMode

@eileenmcnaughton
Copy link
Contributor Author

Yeah - I have enabled full group by on 5.6 & I can't create smart groups - but the main difference I'm aware of is mysql 5.7 supports the ANY_VALUE hack

@eileenmcnaughton
Copy link
Contributor Author

@seamuslee001 I think perhaps the reason the FULL GROUP BY doesn't work with 5.6 for me locally is that we have used the ANY_VALUE hack instead of fixing the queries & it's not applying to 5.6.

For example it's complaining about this very complaint worthy query

INSERT INTO civicrm_mailing_recipients (mailing_id, contact_id, email_id)
SELECT 1, i.contact_id, i.email_id
FROM civicrm_contact contact_a
 INNER JOIN included_recipients_tempbc71 i ON contact_a.id = i.contact_id 
 INNER JOIN civicrm_email e ON e.id = i.email_id 
WHERE (contact_a.is_deleted = 0)
GROUP BY e.email
ORDER BY MIN(i.contact_id), MIN(i.email_id)

I can get past it locally with this patch

/**
diff --git a/CRM/Mailing/BAO/Mailing.php b/CRM/Mailing/BAO/Mailing.php
index 3a0f080..9bc77c2 100644
--- a/CRM/Mailing/BAO/Mailing.php
+++ b/CRM/Mailing/BAO/Mailing.php
@@ -326,11 +326,9 @@ class CRM_Mailing_BAO_Mailing extends CRM_Mailing_DAO_Mailing {

 $query = CRM_Utils_SQL_Select::from('civicrm_contact contact_a')->join('i', " INNER JOIN {$includedTempTablename} i ON contact_a.id = i.contact_id ");
 if (!$isSMSmode && $mailingObj->dedupe_email) {
  •  $orderBy = array("MIN(i.contact_id)", "MIN(i.$entityColumn)");
    
  •  $selectClause = array('#mailingID', 'MIN(i.contact_id) contact_id', "MIN(i.$entityColumn) $entityColumn", "e.email");
    
  •  $orderBy = array("contact_id", "$entityColumn");
     $query = $query->join('e', " INNER JOIN civicrm_email e ON e.id = i.email_id ")->groupBy("e.email");
    
  •  if (CRM_Utils_SQL::supportsFullGroupBy()) {
    
  •    $selectClause = array('#mailingID', 'ANY_VALUE(i.contact_id) contact_id', "ANY_VALUE(i.$entityColumn) $entityColumn", "e.email");
    
  •  }
    
    }

BUT, I'm not keen on ad hoc patching - I think we need to

  1. fix these reports which are failing on our test matrix (this is 1/3 & the other 2 are easy once this is done)
  2. change the CI test instance to use full group by, disabling the export test which just proves that there is a bug when trying to export with not only primary search enabled and full group by - leave that out of scope for now because it's not our priority to fix bugs in newly added features
  3. enable mysql 5.7 full group by o default for CI
  4. require that ALL full group by fixes have unit tests AND do not accept any further patches that only apply changes if full group by is enabled
  5. recommend that full group by is disabled & make it clear full group by is only partially supported

@seamuslee001
Copy link
Contributor

@eileenmcnaughton re 3 do you mean for general PR tests or just the matrix ones?

@eileenmcnaughton
Copy link
Contributor Author

@seamuslee001 I think we should enable full group by for general CI - even at the expense of losing a test that doesn't work with it - it would mean going forwards things work with full group by

@seamuslee001
Copy link
Contributor

@eileenmcnaughton i'm not opposed to that, for 5.7 that would mean upgradeding test-ubu1305-5 i think it is to mysql 5.7 as its still on mysql5.5 ping @totten i would like to keep test-ubu1305-1 on 5.5 tho just for compatibility checks

@eileenmcnaughton
Copy link
Contributor Author

Can't we just use the 'other' box from the matrix CI - it has 13 fails that the main one doesn't - 9 are fixable by these 3 tests.

The other 4 genuinely don't work - either because of full group by issues (2) or because of the way strings are handled (2) - but all 4 of those were written pretty much to verify if various bugs exist - I think commenting the 2 string handling variants & marking the 2 other tests as incomplete gets us to a baseline point that we can work from

@eileenmcnaughton
Copy link
Contributor Author

ie looks like 1604 is the strictest of our machines so if we get that passing 'one way or another' then we can make sure that one doesn't regress

@seamuslee001
Copy link
Contributor

@eileenmcnaughton i'm not sure about the jenkins config but that is certainly a possibly.

My Understanding is that the test boxes were originally set up to be as close to bog standard ubuntu so 1604 was by default MySQL5.7 and 1304 or something like that was bog standard MySQL5.5. we used to manually for unit tests override the sql_modes that were set. However we removed that a while back which started those errors showing on the 1604 box (we now just check that strict_trans_tables is there if not add it to the sqlModes list). So we either switch to 1604 as PR box or we upgrade the PR box to be MySQL5.7 (noting we have already diverged from the bog standard config by using php5.5 I think it might be now a bog standard 1404 or 1504 box maybe)

@eileenmcnaughton
Copy link
Contributor Author

@seamuslee001 ok - we can wait for @totten to weigh in on the detail - the theory anyway is that the CI tests should run on the strictest level they will all pass on (even if we have to comment out a small handful) & with this PR & a couple of small follow ups we can get the 'comment out list' down very low.

I don't really want to start on the other ones in the 'comment out' section because I think we should prioritise dealing with the PRs that are already open but by resolving the reports ones & resolving an approach I think we move forwards on the Q too

@eileenmcnaughton
Copy link
Contributor Author

Speaking to the bigger picture - one thing that fails with 5.6 full group by is the contribution summary display. It does seem like a fundamentally broken query - ie. we do a count by COUNT(civicrm_contribution.total_amount) & then group by contribution.id & then wrap in an outer. The count & group by do not belong in the inner & this would be broken anywhere - I can't see any evidence it's NOT broken for 5.7

SELECT COUNT( conts.total_amount ) as total_count,
       SUM(   conts.total_amount ) as total_amount,
       AVG(   conts.total_amount ) as total_avg,
       conts.currency              as currency FROM (
      SELECT civicrm_contribution.total_amount, COUNT(civicrm_contribution.total_amount) as civicrm_contribution_total_amount_count,
      civicrm_contribution.currency  FROM civicrm_contact contact_a LEFT JOIN civicrm_contribution ON civicrm_contribution.contact_id = contact_a.id  WHERE  ( contact_a.id = '112' )  AND (contact_a.is_deleted = 0) AND (contact_a.is_deleted = 0) AND civicrm_contribution.contribution_status_id = 1  GROUP BY civicrm_contribution.id
    ) as conts
    GROUP BY currency

Note I'm NOT pitching that we start fixing things like this until we resolve the test situation to be a rising lid & resolve all open PRs in the queue relating to this query and / or full group by (I would also note this query has performance issues so we should look at it in that context too).

@seamuslee001
Copy link
Contributor

(CiviCRM Review Template WORD-1.1)

  • (r-explain) Pass
  • (r-test) Pass - better handling of group by code
  • (r-code) Pass
  • (r-doc) Pass
  • (r-maint) Pass
  • (r-run) Pass was able to run the report using various filters on both a php5.6 / MySQL 5.7 and PHP 7.1 / MariaDB 10.2 configurations with no errors
  • (r-user) Pass
  • (r-tech) Pass

@seamuslee001
Copy link
Contributor

Merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants