Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(long) migration from old dmponline_v4 to roadmap 3.0.1 delivers customized templates with wrong family_id #2813

Closed
nicolasfranck opened this issue Feb 11, 2021 · 11 comments

Comments

@nicolasfranck
Copy link
Contributor

I've completed the full migration flow from dmponline_v4 to roadmap 3.0.1.
One of the strange things I've seen is that there are now templates with attribute
customization_of set to a value that has no correspondence with a template
of that family_id.

So this returns no original template:

customized_template = Template.where("customization_of IS NOT NULL").first
original_template   = Template.where(family_id: customized_template.customization_of).first

This is only true for a few customized templates though (13?) in our collection,
but it makes some controllers crash where the method Template#base_org is used.
That method looks up the organisation of the original template..

The only place where these attributes are created/updated is in migration 20161122152339_new_plan_template_structure.rb, but I have no idea
why templates with same family_id seem to have dissappeared..

Any idea?

@briri
Copy link
Contributor

briri commented Feb 11, 2021

Interesting @nicolasfranck. I would suspect that perhaps the original data may have had an incorrect org_id on the project and template. See line

template = initTemplate(dmptemplate, modifiable, project.organisation_id) # needs to select next version of temp based on old_temp_id
. The modifiable variable determines whether or not the customization_of is null or contains a value, so maybe it had the wrong org_id and thus pointed to the incorrect template.

I did not personally have to run through the upgrade tasks since my DMPTool came from a different codebase originally. I do recall our former developer, @xsrust, mentioning that the original DMPonline data had some issues like orphaned records or incorrect associations that he needed to deal with prior to the migration. Perhaps its related?

Congratulations for getting your system fully migrated to v3! That is a great accomplishment and a lot of work!

@nicolasfranck
Copy link
Contributor Author

I did some investigation. If you run this query:

select id,title,family_id,customization_of from templates where customization_of is not null and customization_of not in (select family_id from templates)

then every customization_of corresponds to the id of the old dmptemplates (I checked their titles).

I've learned that family_id originates from these old dmptemplates.id,
and that customization_of refers to a family_id.

Possible causes?

  • migration mentioned above does the right thing, but a later migration updates family_id without updating customization_of
  • migration mentioned above generates a new template with customization_of set, but that referred template is never created.

@nicolasfranck
Copy link
Contributor Author

@briri euhm the version of that migration you refer to is from the master branch, not the one from branch dmponline4_upgrade_step2 (why is that even different?). This version has two references to the function initTemplate:

https://github.com/DMPRoadmap/roadmap/blob/dmponline4_upgrade_step2/db/migrate/20161122152339_new_plan_template_structure.rb

@briri
Copy link
Contributor

briri commented Feb 16, 2021

If I remember correctly, those dmponline4_upgrade_stepX branches diverged off of the master branch because they required the creation of temporary tables/fields for use only with the migrations from that other dmponline4 codebase.

Looking at the lib/tasks/upgrade.rake file which contains all of the data manipulations needed through v1 and v2, I see that your suspicions may be correct. There a few upgrade tasks that clean up issues with templates and customizations that may have caused the issue.
It looks like the customization_of and family_id are not managed through ActiveRecord's belongs_to but are instead unfortunately managed through functions. It is possible that one of the other tasks updated those fields in an attempt to clean something up

@nicolasfranck
Copy link
Contributor Author

I did the test: that migration is causing this issue:

select id,title,customization_of from templates where customization_of is not null and customization_of not in (select dmptemplate_id from templates)

Apparently there are two main loops that call the method initTemplate,
first from the perspective of the Project, and then from the perspective
of the Dmptemplate.

About the first loop:

  • If project.organisation is set and not equal to project.dmptemplate.organisation, then it is marked as non-modifiable, so marked as a customization. That last thing happens implicitly, as it is actually done by initTemplate when modifiable=true.
  • when customization_of is set, it expects (implicitly) the second loop to create that new template with that dmptemplate_id/family_id

About the second loop:

  • here initTemplate is called with all arguments set. A new dmptemplate_id is generated, and given as argument.
  • if dmptemplate_id is given as an argument to initTemplate, then it is used in favour of the old dmptemplate.id. This way no new template is generated with that dmptemplate_id/family_id

@briri
Copy link
Contributor

briri commented Feb 17, 2021

Ugh! That's a complicated mess. Thanks for doing the research into this issue @nicolasfranck

We don't have the bandwidth to patch it at the moment. Would you be willing to submit a patch to that specific migration branch? Or, as an alternative, since it seems to have only impacted a few records in your migration (and would likely be similar for others trying to migrate), we could make a note on the upgrade instructions wiki page outlining the bug and its severity (with a link to this issue) and a suggested approach for a manual update of those template records to fix them.

@nicolasfranck
Copy link
Contributor Author

@briri : by setting this line ..

https://github.com/DMPRoadmap/roadmap/blob/dmponline4_upgrade_step2/db/migrate/20161122152339_new_plan_template_structure.rb#L401

to

dmptemplate_id = dtemp.id

I could fix this issue. But I guess that line was there for a reason? Something like

https://github.com/DMPRoadmap/roadmap/blob/dmponline4_upgrade_step2/db/migrate/20161122152339_new_plan_template_structure.rb#L399

was is meant here by "confusing versioning numbers"?

@briri
Copy link
Contributor

briri commented Feb 18, 2021

I'm not sure unfortunately. I migrated from a different codebase so I am unfamiliar with the old dmponline4 table structures and content. Its may have been to deal with an issue/scenario that was unique to the old dmponline data.

I added a note to Step 2 on the wiki instructions to let people know about this issue and to offer a manual solution. Can you take a look at how I described it and let me know if there are any ways to make it more clear/helpful to others encountering the same problem?

Thanks again @nicolasfranck!

@nicolasfranck
Copy link
Contributor Author

nicolasfranck commented Feb 23, 2021

Another thing that I've found in that branch: it generates entries in table answers_question_options that refer to the wrong option, i.e. it select an option that is NOT part of its question's options. This means that after the (full) migration the right option is not selected..

Although you cannot see this easy in the console:

answer = question.answers.first
# this returns a record ..
question_option = answer.question_options.first
# but this returns false 
answer.option_selected?(question_option.id)

The entries from answers_question_options belong to the right answer,
but not to right question_option_id.

What is probably causing this: https://github.com/DMPRoadmap/roadmap/blob/dmponline4_upgrade_step2/db/migrate/20161122152339_new_plan_template_structure.rb#L637

It is very subtile:

  • in the first loop a template is either selected or created and attached the Plan. For each subtile difference a new template is created.
  • during template creation a new QuestionOption is created with foreign key option_id. That is the id of an Option that belongs somewhere in the original template.
  • initAnswer copies its selected options by searching on QuestionOption.find_by(option_id: option.id). But each duplicated template has question_options with that same option id, so it incorrectly selects the first hit. Maybe add an extra selector and so use QuestionOption.find_by(option_id: option.id, new_question_id: new_answer.new_question_id)

@briri
Copy link
Contributor

briri commented Feb 23, 2021

yes, that is very subtle. I wonder if adding an additional query parameter to that would help: QuestionOption.find_by(option_id: option.id, new_question_id: new_question.id)
I think that would properly select the cloned/new question option.

@briri
Copy link
Contributor

briri commented Jul 27, 2021

closing this out since you've migrated to the latest version @nicolasfranck

@briri briri closed this as completed Jul 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants