-
-
Notifications
You must be signed in to change notification settings - Fork 824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CRM-20958 - Track creation+modification times for activities+cases #10754
Conversation
322f13c
to
71aebd9
Compare
|
||
$title = sprintf('CRM-20958 - Compute civicrm_activity.created_date from civicrm_log (%d => %d)', $startId, $endId); | ||
$sql = 'UPDATE civicrm_activity | ||
SET created_date = (SELECT MIN(l.modified_date) FROM civicrm_log l WHERE l.entity_table ="civicrm_activity" AND civicrm_activity.id = l.entity_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'm not certain about... this code initializes civicrm_activity.created_date
(etal) using civicrm_log.modified_date
, but they have different data-types (TIMESTAMP
vs DATETIME
, respectively).
@eileenmcnaughton @seamuslee001 To my recollection, DATETIME
and TIMESTAMP
handle timezones differently, so wondering if there's some kind of filtering I should do here to adjust for timezone.
However, my gut says that storing civicrm_log.modified_date
as DATETIME
is actually a subtle dirtiness/corruption. Ex: This suggests that civicrm_log.modified_date
would be initialized based on date('YmdHis')
(ie the user's perceived timestamp based on active PHP timezone setting) rather than a canonical timestamp. But different civicrm_log
records can be created by different people/sessions/timezones, so the raw data in civicrm_log
wouldn't be reliably articulated in any one timezone.
This wouldn't really the fault of my issue/patch -- if the theory is true, you'd expect to see idiosyncrasies in the display of this data. For migration purposes, somewhere you'd have to bite the bullet and accept that timestamps would be wrong +/- a few hours?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed the data is dirty:
- Open a new browser. Login to
d46
asadmin
. Set your timezone toAmerica/New_York
. Edit contact 139. Observe "Change Log" has new entry with timestampJuly 26th, 2017 7:29 PM
. - Open a new browser. Login to
d46
asdemo
. Set your timezone toAmerica/Los_Angeles
. Edit contact 139. Observe "Change Log" has new entry with timestampJuly 26th, 2017 4:30 PM
. (In reality, this is 1 minute later.)
But the changelog (as viewed by either user) shows weird data: admin
's change actually happened 1 minute before demo
's change, but it looks like it happened 179 minutes after.
Yep so timestamp stores everything in UTC and does conversion to UTC if connection is in another Timezone. Date time is just what ever time is sent to it from the client |
71aebd9
to
b1abc8e
Compare
CRM-9683 seems relevant to all this - same basic problem, over in CiviMail? |
I thought that CiviMail fields had been converted to timestamp already |
Maybe that was only on the Fuzion branch of things |
@seamuslee001 we have converted some fields on new installs but not on existing due to nervousness about the - the fundamental thing I believe it when you convert you should be in the most accurate tz possible |
@eileenmcnaughton for some reason i thought the CiviMail was already converted, maybe AUG / fuzion thing |
@seamuslee001 yes, converted on all Fuzion customers & any site that wishes to convert can do so with a simple ALTER TABLE mysql statement. Also converted for new installs - but we have not figured out what to do with existing ones (mostly because the timestamp will be relative to the tz used when converting & there is some room for debate there) |
(there is a case to be made for providing a change over routine that people can run through UI or cli depending which is more appropriate) |
b1abc8e
to
93891bf
Compare
Yeah, this is my main question at this point. On one hand... this PR seems to be working for me. The schema is created; the data migrates according to the evil scenario described in the commit-notes; the API seems to return and sort the fields; the fields for cases+activities are updated when using "Manage Case" UI; there's new test coverage; and the existing tests are now passing. On the other hand... it might be a safer phase-in process if we only add the new columns -- and put the other stuff (the triggers and migration) behind some kind of opt-in. |
I think the triggers should be uncontroversial - since they are also in the contact schema. The question is more the migration. On balance I probably would, since that was done for the civicrm_contact table. But if you were going to leave something off I would limit it to the migration |
93891bf
to
1ee5377
Compare
As discussed in [civicrm#10754](civicrm#10754), the source material for populating `created_date` and `modified_date` has some quality issues (vis-a-vis timezones). If we attempt to immediately initialize `created_date` and `modified_date` from pre-existing data, then it gets a lot harder to clean up the pre-existing data. Fortunately, most systems don't actually *need* this data immediately. So we can defer. I've moved these migration rules over to an extension https://github.com/civicrm/org.civicrm.doctorwhen
@eileenmcnaughton @xurizaemon @seamuslee001 I've moved the migration logic to an extension https://github.com/civicrm/org.civicrm.doctorwhen and put a status check in I've removed the "WIP" flag. |
cool - I wonder how people would find out about that if they wanted to switch |
CC @demeritcowboy - you might find this PR interesting. |
makes sense to me |
Thanks for the notice. Hmm, DoctorWhen - Timezones and Relative Dates in Civi? Don't think it makes a difference here but noting the original use of civicrm_log wasn't entirely redundant - for example to see an activity's original creation date you still needed to look in civicrm_log. It also probably doesn't make a difference but since the opening paragraph mentions it as the rationale, noting that at least in my experience Activity Date (activity_date_time) is usually the field used for sorting/reporting, and it allows users to have some control over how things sort without messing up the "system" dates like creation date. The system dates are usually just used during investigation. Cases and some of their activities are often allocated to certain reporting periods via the activity date, similar to financial reporting. But none of that prevents making changes for create/modify. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @totten, left some inline comments
/** | ||
* Check that the timestamp columns are populated. (CRM-20958) | ||
* | ||
* @return array<CRM_Utils_Check_Message> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember seeing this syntax before and I'm not sure of what it means. If it returns an array or a CRM_Utils_Check_Message instance, shouldn't it be array|CRM_Utils_Check_Message
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array<CRM_Utils_Check_Message>
means "an array in which the elements are instances of class CRM_Utils_Check_Message
".
Personally, I go a bit bonkers reading @return array
because it doesn't really say anything.
AFAIK, php.net
hasn't blessed a notation for array typing, but there a few references for comparison:
- It's similar to Java's notation for generics (eg
Vector<Message>
,HashMap<String,Message>
). - It matches Hack's array notation. (Hack is PHP derivative with stronger typing.)
- PHP-FIG has a draft spec for docblocks. It mentions both C-style arrays (
Message[]
) as well Java-style generics (Collection<Message>
). - In my local copy of PHPStorm, it provides drilldown for C-style arrays (
Message[]
) but not Java-style generics. - I'm not particularly consistent. Sometimes it seems meaningful to convey even more information, so I use really verbose descriptor like:
@return array
Array(string $msgId => CRM_Utils_Check_Message $msg).
I guess standardizing/documenting/cleaning-up would be good, but that should be probably be a separate project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that most projects use the notation defined by http://phpdoc.org, which uses the CRM_Utils_Check_Message[]
. Given that the PHP-FIG draft is a derivative of that phpdoc and that most projects already use the C-style arrays, I think safe enough to assume this is the "right" way to do it.
Sometimes I prefer a more verbose description too
*/ | ||
public function checkSchema() { | ||
$problems = array(); | ||
foreach (self::getConvertedTimestamps() as $tgt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does tgt
mean? Perhaps, for the sake of readability, it would be better to not use an abbreviation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll rename it to $target
.
* @package CRM | ||
* @copyright CiviCRM LLC (c) 2004-2017 | ||
*/ | ||
class StaticTriggers { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually, we use singular for class names. Any reason for using the plural here? The same applies for the namespace, which, in a way, is part of the class name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: SqlTriggers
namespace -- Agree that should be singular. Pretty much all namespaces can be equally described as singular or plural, and it's nice to have consistent style.
Re: StaticTriggers
class -- I don't think the singular/plural issue as a matter of style here -- it's really a semantic difference. SqlTrigger
would just be one object managing one trigger. But if one object manages multiple triggers, then you'd say SqlTriggers
or SqlTriggerSet
or SqlTriggerList
or SqlTriggerArray
or SqlTriggerGroup
. Personally, I prefer for the simplest construction that's accurate.
In this case, the instance of StaticTriggers
contains three triggers that serve a common goal. The goal is to update civicrm_case.modified_date
whenever a related activity changes -- but that decomposes to three triggers for INSERT, UPDATE, and DELETE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say that I agree, but I don't have a better name to suggest. Since you think it's good enough, I won't argue :)
* @see \CRM_Core_DAO::triggerRebuild | ||
* @see http://issues.civicrm.org/jira/browse/CRM-10554 | ||
* | ||
* @param $info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the type of this param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good question, but the answer is to drill-down on @see \CRM_Utils_Hook::triggerInfo
. I don't want to reproduce the hook spec here. Will put another comment directing folks there.
* @see http://issues.civicrm.org/jira/browse/CRM-10554 | ||
* | ||
* @param $info | ||
* @param null $tableFilter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of value is expected here, when it isn't null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good question, but the answer is to drill-down on @see \CRM_Utils_Hook::triggerInfo
. I don't want to reproduce the hook spec here. Will put another comment directing folks there.
tests/phpunit/api/v3/CaseTest.php
Outdated
)); | ||
$this->assertRegExp(';^\d\d\d\d-\d\d-\d\d \d\d:\d\d;', $case_1['created_date']); | ||
$this->assertRegExp(';^\d\d\d\d-\d\d-\d\d \d\d:\d\d;', $case_1['modified_date']); | ||
$this->assertApproxEquals(strtotime($case_1['created_date']), strtotime($case_1['created_date']), 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're comparing the created_date with itself here. Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll fix that.
tests/phpunit/api/v3/CaseTest.php
Outdated
)); | ||
$this->assertRegExp(';^\d\d\d\d-\d\d-\d\d \d\d:\d\d;', $activity_1['created_date']); | ||
$this->assertRegExp(';^\d\d\d\d-\d\d-\d\d \d\d:\d\d;', $activity_1['modified_date']); | ||
$this->assertApproxEquals(strtotime($activity_1['created_date']), strtotime($activity_1['created_date']), 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too
tests/phpunit/api/v3/CaseTest.php
Outdated
|
||
$this->assertEquals($activity_1['created_date'], $activity_2['created_date']); | ||
$this->assertNotEquals($activity_1['modified_date'], $activity_2['modified_date']); | ||
$this->assertTrue(strtotime($activity_1['modified_date']) < strtotime($activity_2['modified_date']), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about using assertLessThan()
instead of assertTrue()
?
$this->assertRegExp(';^\d\d\d\d-\d\d-\d\d \d\d:\d\d;', $case_2['modified_date']); | ||
$this->assertEquals($case_1['created_date'], $case_2['created_date']); | ||
$this->assertNotEquals($case_2['created_date'], $case_2['modified_date']); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must confess it was a bit hard for me to read this test and understand what was going on. I feel it would have been easier if it was split into smaller tests, each one having a descriptive name of what is being tested. Examples:
testCreatedDateShouldBeTheSameAsModifiedDateWhenACaseIsCreated
testCreatedDatedIsNotUpdatedDuringWhenACaseIsUpdated
testModifiedDateShouldBeUpdatedWhenACaseIsUpdated
testCreatedDateAndModifiedDateCannotBeManuallyChanged
testDateFormatOfCreatedDateAndModifiedDateShouldBeCorrect
Note that the examples have some scenarios that were not even covered by tests. In addition to that, do you think it would make sense to also have tests for the BAO (that is, check that the same rules will work when you do something like Case::create()
?
* | ||
* @see CRM_Core_DAO::executeQuery | ||
*/ | ||
public static function task_executeQuery($ctx, $sql, $vars) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I missed it, but it doesn't look like you're using this anywhere. Are you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid point. It was a helper to support the migration task, but then the "migration task" migrated itself off to doctorwhen
, so we don't really need this straggler.
OTOH, it is a handy straggler...
eeb3739
to
dfe984e
Compare
As discussed in [civicrm#10754](civicrm#10754), the source material for populating `created_date` and `modified_date` has some quality issues (vis-a-vis timezones). If we attempt to immediately initialize `created_date` and `modified_date` from pre-existing data, then it gets a lot harder to clean up the pre-existing data. Fortunately, most systems don't actually *need* this data immediately. So we can defer. I've moved these migration rules over to an extension https://github.com/civicrm/org.civicrm.doctorwhen
tests/phpunit/api/v3/CaseTest.php
Outdated
$result = $this->callAPISuccessGetSingle('Case', array('case_id' => $id)); | ||
// Modification dates are likely to differ by 0-2 sec. Check manually. | ||
$this->assertTrue($result['modified_date'] >= $case['modified_date']); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw you accepted my suggestion to use assertLessThan()
so maybe can you use assertGreaterThanOrEqual
here, especially because there's no custom message for this assertion
40399e1
to
609ecd0
Compare
As discussed in [civicrm#10754](civicrm#10754), the source material for populating `created_date` and `modified_date` has some quality issues (vis-a-vis timezones). If we attempt to immediately initialize `created_date` and `modified_date` from pre-existing data, then it gets a lot harder to clean up the pre-existing data. Fortunately, most systems don't actually *need* this data immediately. So we can defer. I've moved these migration rules over to an extension https://github.com/civicrm/org.civicrm.doctorwhen
@agh1 @MegaphoneJon @Stoob - I think you guys are pretty thoughtful about messaging. Could you comment on how well these screens present the matter of transitioning/cleaning-up dates/times? See the screenshots and hyperlink from the PR description. In years past, we might have tried to address the change/cleanup of time data with an automated upgrade script, and then there'd be fall-out when the script doesn't anticipate some issue... which is fairly likely because no one has access to the kind of context/tools/datasets needed to ensure a change works everywhere. In theory, the RC period helps, but I'm feeling a little untrusting there. So this PR tries to break it into smaller steps -- eg make the smallest change possible, show a notification, say "Don't panic", and then link to experimental docs/tools for the rest. The in-app messages will probably set the tone+reactions. Most of the folks on this ticket have already formed some perspective (due historical issues) -- so I wanted to ping you guys for a fresh perspective on the messaging/transition-path. |
a18077f
to
9443b00
Compare
As discussed in [civicrm#10754](civicrm#10754), the source material for populating `created_date` and `modified_date` has some quality issues (vis-a-vis timezones). If we attempt to immediately initialize `created_date` and `modified_date` from pre-existing data, then it gets a lot harder to clean up the pre-existing data. Fortunately, most systems don't actually *need* this data immediately. So we can defer. I've moved these migration rules over to an extension https://github.com/civicrm/org.civicrm.doctorwhen
9443b00
to
d0f064f
Compare
== Before == * SQL triggers to populate `civicrm_contact.created_date` and `civicrm_contact.modified_date` are generate via `CRM_Contact_BAO_Contact::triggerInfo($info, $tableName)` == After == * `CRM_Contact_BAO_Contact::triggerInfo` calls a helper `TimestampTriggers` * The helper `TimestampTriggers` accepts arguments describing the names of the tables/columns which needed for the timestamp triggers. == Comments == To test, I used this command to update and dump the schema: ``` cv api system.flush triggers=1 && mysqldump --triggers ... ``` The schema was identical before and after. (Notably, by alternately hacking the code, I was able to validate the test was capable of revealing discrepencies.)
…reation/modification date. The technique of using hard-coded example record doesn't work with this data.
There appears to be some application logic which follows a process like this: 1. Read the case 2. Tweak the data 3. Save updated case The problem is comes if step civicrm#4 resaves a timestamp loaded in step #1, which is fairly likely to happen if you read+save the same record. This was specifically observed on the "Manage Case" screen when editing activities, but the data-flow is pretty common, so make a general fix to the BAO.
…link to DoctorWhen
…ation date. Checking the `modified_date` is a bit racy -- depending on sub-second performance/alignment, the original `Case` creation and the subsequent `Case` update may have the same `modified_date` or may have different `modified_date`.
d0f064f
to
ed30b24
Compare
Following up on my questions about messaging... I escalated to Also, just rebased to handle a merge-conflict. |
Overview
In developing workflows, reports, and UIs for activities and cases, it is useful to sort and filter based based on when a record was created or when a record was last modified.
Before
civicrm_activity
andcivicrm_case
do not have the timestamp columnscreated_date
andmodified_date
.civicrm_log
includes some records to indicate creation/modification time:civicrm_log
would have beenredundantlow-priority.)civicrm_log.modified_date
are not entirely reliable -- they are stored asDATETIME
without any adjustments for timezone. This is a pre-existing and systemic problem with fields being flagged asDATETIME
instead ofTIMESTAMP
.After
civicrm_activity
andcivicrm_case
do have the timestamp columnscreated_date
andmodified_date
.Activity
via APIv3/BAO/DAO/SQL.Case
via APIv3/BAO/DAO/SQL.Activity
via "Contacts => New Activity", "View Contact => Actions", or "View Contact => Activities". (This stems from SQL triggers, but it's been tested separately.)Case
via "Cases => New Case" or "Manage Cases" UI. (This stems from SQL triggers, but it's been tested separately.)created_date
ormodified_date
.DATETIME
instead of a more sensibleTIMESTAMP
. (This addresses bothcivicrm_log.modified_date
and the pre-existing discrepancies left over from CRM-9683).If the admin installs
doctorwhen
, then a new item appears in "Administer => Doctor When":Technical Details
{civicrm_activity,civicrm_case}.{created_date,modified_date}
, there are several SQL triggers. The configuration closely parallelscivicrm_contact
:modified_date
specifiesON UPDATE CURRENT_TIMESTAMP
). (Note: MySQL only allows one column to be flagged with a default value ofCURRENT_TIMESTAMP
.)civi.activity.triggers
andcivi.case.triggers
install SQL triggers which initialize thecreated_date
.civi.activity.triggers
andcivi.case.triggers
install SQL triggers which updatemodified_date
whenever custom-data changes.civi.case.staticTriggers
installs SQL triggers which update thecivicrm_case.modified_date
whenever a related activity is created, modified, or deleted.CRM_Utils_Check_Component_Timestamps
.Comments