-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MergeDataTables does not correctly merge if table2 has more data than table1 #15294
Changes from all commits
accab87
d4d3795
ed50c08
781ca55
d20e141
7615923
1c83ed2
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,9 +13,8 @@ | |
|
||
class MergeDataTables | ||
{ | ||
|
||
/** | ||
* Merge the columns of two data tables. | ||
* Merge the columns of two data tables. Only takes into consideration the first row of each table. | ||
* Manipulates the first table. | ||
* | ||
* @param DataTable|DataTable\Map $table1 The table to eventually filter. | ||
|
@@ -25,13 +24,14 @@ public function mergeDataTables($table1, $table2) | |
{ | ||
// handle table arrays | ||
if ($table1 instanceof DataTable\Map && $table2 instanceof DataTable\Map) { | ||
$subTables2 = $table2->getDataTables(); | ||
foreach ($table1->getDataTables() as $index => $subTable1) { | ||
if (!array_key_exists($index, $subTables2)) { | ||
// occurs when archiving starts on dayN and continues into dayN+1, see https://github.com/piwik/piwik/issues/5168#issuecomment-50959925 | ||
continue; | ||
$subTables1 = $table1->getDataTables(); | ||
foreach ($table2->getDataTables() as $index => $subTable2) { | ||
if (!array_key_exists($index, $subTables1)) { | ||
$subTable1 = $this->makeNewDataTable($subTable2); | ||
$table1->addTable($subTable1, $index); | ||
} else { | ||
$subTable1 = $subTables1[$index]; | ||
} | ||
$subTable2 = $subTables2[$index]; | ||
$this->mergeDataTables($subTable1, $subTable2); | ||
} | ||
return; | ||
|
@@ -52,4 +52,23 @@ public function mergeDataTables($table1, $table2) | |
} | ||
} | ||
|
||
private function makeNewDataTable(DataTable\DataTableInterface $subTable2) | ||
{ | ||
if ($subTable2 instanceof DataTable\Map) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @diosmosis wondering if we would need to create a clone and keep any filters? Not sure though There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess we could. Might be an issue if filters hold references to the old table or to some other object that isn't relevant anymore (ie, like ones that calculate percent over a total or specific column value in another table or something like that). Actually, it's probably better to require this class be used before filters are run, since if metrics are formatted they won't be able to be added properly... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It shouldn't hold a reference to the table but possibly to some other parameters. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Some filters do hold references to other tables, eg, CalculateEvolutionFilter. It is conceivable that they could hold onto a lot of data. Ideally it would be great to deprecate queued filters and instead move that logic to a method in the Report class, like, |
||
$result = new DataTable\Map(); | ||
$result->setKeyName($subTable2->getKeyName()); | ||
return $result; | ||
} else if ($subTable2 instanceof DataTable\Simple) { | ||
$result = new DataTable\Simple(); | ||
$result->setAllTableMetadata($subTable2->getAllTableMetadata()); | ||
return $result; | ||
} else if ($subTable2 instanceof DataTable) { | ||
$result = new DataTable(); | ||
$result->setAllTableMetadata($subTable2->getAllTableMetadata()); | ||
return $result; | ||
} else { | ||
throw new \Exception("Unknown datatable type: " . get_class($subTable2)); | ||
} | ||
} | ||
|
||
} |
+6 −0 | tests/System/expected/test___API.get_year.xml |
+5 −2 | Controller.php | |
+16 −0 | tests/System/expected/test___API.getReportMetadata_day.xml |
+2 −2 | VisitorDetails.php | |
+8 −0 | tests/System/expected/test___API.get_day.xml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,22 @@ | ||
<?xml version="1.0" encoding="utf-8" ?> | ||
<result /> | ||
<results> | ||
<result period="2012-03-03" /> | ||
<result period="2012-03-04" /> | ||
<result period="2012-03-05" /> | ||
<result period="2012-03-06"> | ||
<nb_uniq_visitors_returning>1</nb_uniq_visitors_returning> | ||
<nb_users_returning>0</nb_users_returning> | ||
<nb_visits_returning>1</nb_visits_returning> | ||
<nb_actions_returning>1</nb_actions_returning> | ||
<nb_visits_converted_returning>1</nb_visits_converted_returning> | ||
<bounce_count_returning>1</bounce_count_returning> | ||
<sum_visit_length_returning>0</sum_visit_length_returning> | ||
<max_actions_returning>1</max_actions_returning> | ||
<bounce_rate_returning>100%</bounce_rate_returning> | ||
<nb_actions_per_visit_returning>1</nb_actions_per_visit_returning> | ||
<avg_time_on_site_returning>0</avg_time_on_site_returning> | ||
</result> | ||
<result period="2012-03-07" /> | ||
<result period="2012-03-08" /> | ||
<result period="2012-03-09" /> | ||
</results> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,25 @@ | ||
<?xml version="1.0" encoding="utf-8" ?> | ||
<result /> | ||
<results> | ||
<result period="2012-03"> | ||
<nb_uniq_visitors_returning>0</nb_uniq_visitors_returning> | ||
<nb_users_returning>0</nb_users_returning> | ||
<nb_visits_returning>1</nb_visits_returning> | ||
<nb_actions_returning>1</nb_actions_returning> | ||
<nb_visits_converted_returning>1</nb_visits_converted_returning> | ||
<bounce_count_returning>1</bounce_count_returning> | ||
<sum_visit_length_returning>0</sum_visit_length_returning> | ||
<max_actions_returning>1</max_actions_returning> | ||
<bounce_rate_returning>100%</bounce_rate_returning> | ||
<nb_actions_per_visit_returning>1</nb_actions_per_visit_returning> | ||
<avg_time_on_site_returning>0</avg_time_on_site_returning> | ||
</result> | ||
<result period="2012-04" /> | ||
<result period="2012-05" /> | ||
<result period="2012-06" /> | ||
<result period="2012-07" /> | ||
<result period="2012-08" /> | ||
<result period="2012-09" /> | ||
<result period="2012-10" /> | ||
<result period="2012-11" /> | ||
<result period="2012-12" /> | ||
</results> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@diosmosis reading the previous comment in #5168 (comment) I wonder if this could cause any issues with potentially wrong data when archiving overlaps another day basically? Not sure I fully understand #5168 . I suppose if it was a problem, it would need to be handled somewhere else maybe and not in this class as when merging you expect both tables to be merged as it is implemented in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So:
then some bug happens? I'm not sure I understand either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe @mattab knows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as you said, "if it was a problem, it would need to be handled somewhere else maybe and not in this class as when merging you expect both tables to be merged as it is implemented in this PR" -> so sounds good to proceed with this proper fix and we can see if the problem in #5168 starts appearing again