
pkp/pkp#8933 Restore original file after cancelling file upload wizard #8941

Merged: 22 commits into pkp:main from i8933_restore_original, Jun 2, 2023

Conversation

Vitaliy-1 (Collaborator)

No description provided.

Review threads (outdated, resolved):
  • controllers/api/file/PKPManageFileApiHandler.php (×4)
  • controllers/wizard/fileUpload/FileUploadWizardHandler.php
@Vitaliy-1 force-pushed the i8933_restore_original branch 4 times, most recently from 5383bf9 to 5ff9333 (May 20, 2023 09:13)
@NateWr (Contributor) left a comment:

Just one longer comment about modifying the schema at run-time like this. I don't know if you'll have time to rethink that approach, but something to think about anyway. Otherwise, it's all small comments. 👍

Review threads:
  • api/v1/submissions/PKPSubmissionHandler.php (outdated, resolved)
  • classes/security/Validation.php (two threads: one outdated, both resolved)
  • classes/facades/Repo.php (resolved)
  • classes/log/event/Collector.php (resolved)
  • controllers/wizard/fileUpload/FileUploadWizardHandler.php (outdated, resolved)
  • schemas/eventLog.json (×2, outdated, resolved)
  • classes/log/event/DAO.php (outdated, resolved)
@Vitaliy-1 force-pushed the i8933_restore_original branch from 379e695 to 9b2f174 (May 28, 2023 20:29)
// Drop the existing foreign key and index so the column can be modified
$table->dropForeign('event_log_user_id_foreign');
$table->dropIndex('event_log_user_id');
// Make user_id nullable (e.g. for system-generated events)
$table->bigInteger('user_id')->nullable()->change();
// Restore the foreign key, removing log entries when the user is deleted
$table->foreign('user_id')->references('user_id')->on('users')->onDelete('cascade');
Member:

Are you sure you want an ON DELETE CASCADE here? I would think ON DELETE SET NULL would be more appropriate -- the point is to be able to continue to log events when a user account doesn't exist, right?
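For contrast, a minimal sketch of the suggested alternative using the same schema-builder call as in the snippet above (not code from this PR):

// Hypothetical: keep the log entry and null out the user reference instead
$table->foreign('user_id')->references('user_id')->on('users')->onDelete('set null');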

Collaborator (author):

Correct me if I'm wrong: the only appropriate way to remove a user account is to merge it with another one, in which case we also combine their log entries. Thus, ON DELETE CASCADE here would prevent data corruption in unexpected cases. If not, I can use ON DELETE SET NULL.

Forgot to mention: the idea behind making user_id nullable is to allow logging events from the system, automated events, or other cases where a specific user isn't involved.

Member:

That makes sense to me, thanks, please go ahead. But please do add a ->comment to the user_id column in LogMigration to explain what a null value means (we'll add more of these in future releases)!
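A minimal sketch of what that could look like in LogMigration, assuming Laravel's schema builder as used above (the comment wording is illustrative, not the PR's actual code):

use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::table('event_log', function (Blueprint $table) {
    // Document what a NULL value means for future readers of the schema
    $table->bigInteger('user_id')->nullable()
        ->comment('NULL if the event was triggered by the system, not a user')
        ->change();
});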

$mapLocaleWithSettingIds = [];
foreach ($logChunks as $row) {
// Get locale based on a submission file ID log entry
$locale = $this->getContextPrimaryLocale($row, $sitePrimaryLocale);
@asmecher (Member), May 30, 2023:

I think you can do this all in SQL for a significant performance boost. Do a LEFT JOIN to the appropriate table for each `assoc_type`, then join on journals (contexts) using COALESCE() around the candidate IDs; that'll give you a single `journals` join regardless of what kind of `assoc_type` was present.
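A rough sketch of that shape (the 1048585 constant appears later in this thread; the submission-file assoc_type value and the exact join targets are assumptions):

-- Hypothetical: LEFT JOIN one candidate table per assoc_type, then resolve
-- the journal via COALESCE() over the candidate context IDs.
SELECT e.log_id, j.journal_id, j.primary_locale
FROM event_log AS e
LEFT JOIN submissions AS s
       ON e.assoc_type = 1048585 AND s.submission_id = e.assoc_id   -- ASSOC_TYPE_SUBMISSION
LEFT JOIN submission_files AS sf
       ON e.assoc_type = 515 AND sf.submission_file_id = e.assoc_id -- ASSOC_TYPE_SUBMISSION_FILE (assumed)
LEFT JOIN submissions AS sfs
       ON sfs.submission_id = sf.submission_id
JOIN journals AS j
       ON j.journal_id = COALESCE(s.context_id, sfs.context_id)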

Member:

Thinking further about performance: would we be able to turn this around to optimize it? Currently it's...

  • for each file upload, edit, or revise event,
    • get its context ID's primary locale, and
    • update the setting's locale column to match.

If I understand it right, it would probably be a lot faster to go the other way around (a sketch follows the list below):

  • For each context,
    • Get its primary locale, and
    • Update any file upload, edit, or revise event log entry settings to match.
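A hedged sketch of that inversion as a single MySQL UPDATE ... JOIN, with both placeholders bound per context (table and column names follow the snippets elsewhere in this thread):

-- Hypothetical: one bulk UPDATE per context instead of one lookup per log entry
UPDATE event_log_settings AS es
JOIN event_log AS e ON e.log_id = es.log_id
JOIN submissions AS s ON s.submission_id = e.assoc_id AND e.assoc_type = 1048585
SET es.locale = ?            -- this context's primary locale
WHERE es.setting_name = 'filename'
  AND s.context_id = ?       -- this context's ID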

Collaborator (author):

Thanks, that was also on my mind as a belated idea, but I wasn't sure the performance boost would be worth re-implementing it.

Member:

Just for reference, there are 3M entries in SciELO's event log table; 567k are in (SUBMISSION_LOG_FILE_UPLOAD, SUBMISSION_LOG_FILE_EDIT, SUBMISSION_LOG_FILE_REVISION_UPLOAD). So the cost of going through these individually would be very high.

Collaborator (author):

I'm deriving the context primary locale from only a submission or a submission file. The event log doesn't have a context_id column, so I need to iterate over all those entries. I meant to move the statement that selects the context locale out of the loop.

I've optimised the migration where possible. The major change is setting the chunk size to 10000, which means the update statement runs once per 10k entries. On my laptop, 500k records are updated within 15 minutes. If that's not enough, I can try to create a wider join statement, a series of something like:

SELECT * FROM event_log AS e
JOIN event_log_settings AS es ON e.log_id = es.log_id
JOIN submissions AS s ON e.assoc_id = s.submission_id
WHERE e.assoc_type = 1048585
AND es.setting_name = 'filename'
AND s.context_id = ?

Then I'd update the log entries all together, knowing the associated context and its primary locale. It might be faster, but more prone to corrupted data.
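A hedged sketch of how that could look with Laravel's query builder, combining the wider join with the 10k chunking described above (every name here is illustrative, not the migration's actual code):

use Illuminate\Support\Facades\DB;

// Hypothetical: cache each context's primary locale once, then update
// event_log_settings in 10k-row chunks, one UPDATE per context per chunk.
$localeByContext = DB::table('journals')->pluck('primary_locale', 'journal_id');

DB::table('event_log as e')
    ->join('submissions as s', 's.submission_id', '=', 'e.assoc_id')
    ->where('e.assoc_type', 1048585)
    ->select('e.log_id', 's.context_id')
    ->chunkById(10000, function ($rows) use ($localeByContext) {
        foreach ($rows->groupBy('context_id') as $contextId => $group) {
            DB::table('event_log_settings')
                ->whereIn('log_id', $group->pluck('log_id'))
                ->where('setting_name', 'filename')
                ->update(['locale' => $localeByContext[$contextId]]);
        }
    }, 'e.log_id', 'log_id');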

Member:

That sounds good, thanks -- please go ahead, and I'll do a sanity check with a large dataset once all upgrade-related merges are complete in order to make sure we don't regress on performance.

@Vitaliy-1 force-pushed the i8933_restore_original branch 3 times, most recently from 87f2613 to f18aae2 (June 2, 2023 08:24)
@Vitaliy-1 force-pushed the i8933_restore_original branch from f18aae2 to 25123ae (June 2, 2023 14:19)
@Vitaliy-1 merged commit a8af7b3 into pkp:main on Jun 2, 2023