Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export file migration: v0.3 to v0.4 to v0.5 #2478

Merged
merged 7 commits into from
Jun 19, 2019
Merged

Export file migration: v0.3 to v0.4 to v0.5 #2478

merged 7 commits into from
Jun 19, 2019

Conversation

yakutovicha
Copy link
Contributor

@yakutovicha yakutovicha commented Feb 14, 2019

Fixes #2426.
Fixes #3010.

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 14, 2019

Note:
The DB migration 0009 does 'data.base.Bool.' -> 'data.bool.Bool.' (without 'node.' prefix)
probably the node prefix was added in some other migration

In this commit I've implemented migration WITH 'node.' prefix

UPDATE:
to be consistent with the order of migrations I decided to remove 'node.' prefix and add it in the migration '0025'

@sphuber
Copy link
Contributor

sphuber commented Feb 14, 2019

The DB migration 0009 does 'data.base.Bool.' -> 'data.bool.Bool.' (without 'node.' prefix)
probably the node prefix was added in some other migration

Yes, this was added later in another migration

@yakutovicha
Copy link
Contributor Author

@CasperWA and @szoupanos I added both of you to my aiida_core fork, so you can also contribute.

I will sequentially go through all the migrations.

@yakutovicha
Copy link
Contributor Author

The migration 0010 does not need to be implemented, since the 'process_type' was never used before the migration 0020. In the migration 0020, however, the 'process_type' string will contain some information that was previously stored in 'type' (plugin entry point, I assume) for the calculation nodes.

Therefore, the type -> process_type conversion will be implemented together with the migration 0020.

@yakutovicha
Copy link
Contributor Author

The migration 0011 does not need to be implemented, as all the affected tables (kombu_message, kombu_queue, db_dbsetting) are not exported.

@CasperWA
Copy link
Contributor

The migration 0010 does not need to be implemented, since the 'process_type' was never used before the migration 0020. In the migration 0020, however, the 'process_type' string will contain some information that was previously stored in 'type' (plugin entry point, I assume) for the calculation nodes.

Therefore, the type -> process_type conversion will be implemented together with the migration 0020.

For the conversion of the content in type to process_type see the infer_calculation_entry_point-function

@yakutovicha
Copy link
Contributor Author

The migration 0012 deletes db_dblock table from the database that was not exported anyways. So skipping this migration as well.

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 14, 2019

The migration 0013 changes content type in two columns of db_dbuser table: last_login (to DateTimeField) and email (to EmailField).

  • `last_login' column is not exported anyways, so will be skipped.
  • email column is exported as a string for before and after migration database. So no changes are required also.

This changes do not influence the migration therefore

@yakutovicha
Copy link
Contributor Author

The migration 0014 checks the uniqueness of the nodes' uuids. Based on the import file it is impossible to know what exactly should be done with those nodes, therefore the migration process will be aborted.

@yakutovicha
Copy link
Contributor Author

The migration 0015 invalidates node hashes stored in extras. Since extras are not exported in v0.3 - we skip this migration step.

@yakutovicha yakutovicha reopened this Feb 14, 2019
@yakutovicha
Copy link
Contributor Author

The migration 0016 acts on the Code class used to be just a sub class of Node but was changed to act like a Data node.
Basically, the only change was to replace type 'code.Code.' with 'data.code.Code.'

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 15, 2019

The migration 0017 drops the DbCalcState table which is not exported 'as is' but as a state item of a node_attributes dictionary.

I assume (@sphuber correct me if I am wrong) that state became a node attribute, and therefore it is exported in exactly the same way as before the migration 0017.

So this migration does not need to be implemented

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 15, 2019

The migration 0018 imposes uuid uniqueness constraints to all tables that contain uuid column.

@giovannipizzi, this is the list of such a tables in AiiDA 0.12.2:

=> select table_name from information_schema.columns where column_name = 'uuid';
  table_name   
---------------
 db_dbcomment
 db_dbcomputer
 db_dbgroup
 db_dbnode
 db_dbworkflow
(5 rows)

Among those columns only three (db_dbcomputer, db_dbgroup, db_dbnode) were exported in AiiDA 0.12.2. So I added the uniqueness check only to the dictionaries that correspond to those Db tables.

@giovannipizzi
Copy link
Member

giovannipizzi commented Feb 15, 2019

Perfect (re your comment on the UUID tables and the check)

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 15, 2019

Migration 0019:

  • remove namespace ‘simpleplugins’ from ‘templatereplacer’ and ‘addcalculation’ in ‘db_dbnode’ table’s ‘process_type’ and ‘type’ fields.

  • remove 'simpleplugins' prefix from the 'input_plugin' attribute of 'arithmetic.add' and 'templatereplacer' codes

@CasperWA
Copy link
Contributor

Try, going forward, to rebase on provenance_redesign instead of merging

@yakutovicha
Copy link
Contributor Author

As the migration 0020 is quite large, I won't describe here what it is doing. One could just have a look inside the corresponding file to get more information.

The thing that I did NOT implement is detection of "workflow nodes with isolated CREATE links" (third check here). All the rest should be working.

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 21, 2019

Try, going forward, to rebase on provenance_redesign instead of merging

I would do rebasing at the end, before requesting a review from Sebastiaan.

@sphuber
Copy link
Contributor

sphuber commented Feb 21, 2019

Yes, but for PRs that involve multiple authors, rebasing at the end becomes extremely painful if there have been intermediate merges, because in that case I cannot squash everything, because that would lose the attributions for the authors involved. So rule-of-thumb: for PRs with multiple authors that will need to keep individual attributions, only use rebasing during development to get branch up to date.

@yakutovicha
Copy link
Contributor Author

Yes, but for PRs that involve multiple authors, rebasing at the end becomes extremely painful if there have been intermediate merges, because in that case I cannot squash everything, because that would lose the attributions for the authors involved. So rule-of-thumb: for PRs with multiple authors that will need to keep individual attributions, only use rebasing during development to get branch up to date.

ok

@yakutovicha
Copy link
Contributor Author

Migration 0021 renames the following columns of group table:
name -> label
type -> type_string

@yakutovicha
Copy link
Contributor Author

Migration 0022 modifies the group type_strings as follows:

'' -> 'user'
'data.upf.family' -> 'data.upf'
'aiida.import' -> 'auto.import'
'autogroup.run' -> 'auto.run'

@yakutovicha
Copy link
Contributor Author

@sphuber , could you please have a look at my last commit where I implemented the migration 0023? I am not sure I understood what exactly should be done here.

@CasperWA , @szoupanos I think I stop here. Please continue further. The biggest migration among the remaining once is done by @szoupanos (0024) so I assume it would be much easier for him to work on it.

Let me know if I should already rebase the existing commits to simplify your life.

@yakutovicha
Copy link
Contributor Author

In the meanwhile I realized that the migration 0024 can be skipped entirely as it focuses on the migration of logs.

@yakutovicha
Copy link
Contributor Author

yakutovicha commented Feb 22, 2019

Migration 0025 changes the type string for Data nodes from data.* to node.data.*

@CasperWA
Copy link
Contributor

Updated for the rework of verdi import. Should maybe not have done this and instead waited for later. Sorry. Will wait for reviewers to ping me to update it or similar.

@CasperWA
Copy link
Contributor

I am currently working on adding the new migrations that will be introduced, one has already been merged in develop: #2937.
The changes will be pushed after they have all been merged into develop.

@CasperWA
Copy link
Contributor

CasperWA commented May 28, 2019

Changes that will have to be included:

Changes that will not have to be included:

@CasperWA
Copy link
Contributor

CasperWA commented May 29, 2019

Current state:
Tests must fail, since the database migrations have still not all been merged, nor has the export version been upped to v0.5. However, the export archive migrations should now work for v0.5, as is intended at this point in time.

What is missing:
Update testing-repository aiida-export-migration-tests with tests for v0.5, a new "advanced" archive, and release an upped version (v0.5.0).
Edit: A PR has been made for the testing-repository. A single test is missing.
Edit no. 2: The "missing" test has been relegated to nothingness, since no new export file was created from a workflow - instead it is a migration of export file v0.4, since the workflow outcome would not change, and the changes from v0.5 are only minor.

@CasperWA CasperWA changed the title Export file migration: v0.3 to v0.4 Export file migration: v0.3 to v0.4 to v0.5 Jun 17, 2019
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some problems in the migrations. Some just unnecessary checks, some actual mistakes that could lead to wrong migrations.

Additionally, as a note, when we refactor importexport to its own module in tools, I think we should move the aiida.cmdline.utils.migration also there, because it is not really a cmdline thing. Moreover, the name migration now does not necessarily make it clear that it deals with export archive migrations and not for example database migration utils. Anyway, fine to leave for now, but good to remember when you get around refactoring the import export code into its own module.

aiida/cmdline/utils/migration/v03_to_v04.py Outdated Show resolved Hide resolved
aiida/cmdline/utils/migration/v03_to_v04.py Outdated Show resolved Hide resolved
aiida/cmdline/utils/migration/v03_to_v04.py Outdated Show resolved Hide resolved
aiida/cmdline/utils/migration/v03_to_v04.py Outdated Show resolved Hide resolved
aiida/cmdline/utils/migration/v03_to_v04.py Outdated Show resolved Hide resolved
metadata['all_fields_info'].update(new_entities)
metadata['unique_identifiers'].update({"Log": "uuid", "Comment": "uuid"})

# Update data.json with the new Extras
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably put this in a separate function as well, even though it does not correspond to an actual database migration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Log and Comment entries as well? (The lines above this comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only moved the introduction of the extras dicts for now

aiida/cmdline/utils/migration/v03_to_v04.py Outdated Show resolved Hide resolved
yakutovicha and others added 7 commits June 19, 2019 13:21
The following database migrations have been translated into a migration
for the contents of existing export archives with version 0.3:

 9, 14, 16, 18, 19, 20, 21, 22, 23, 25, 26, 27, 28

The ones that have been skipped were not applicable for export archives
- Rev. 0029
- Rev. 0030
- Rev. 0031
- Rev. 0032
- Rev. 0033

Introduce changes to `metadata.json`:

- Add `Log` to `all_fields_info` and `unique_identifiers`
- Add `Comment` to `all_fields_info` and `unique_identifiers`
- Changes to key values of `all_fields_info`

Add extra dicts to `data.json`:

- `node_extras`
- `node_extras_conversion`

Both will be empty dicts.
If `Node`s are present in the exported file, key-value-pairs will be added,
where the key is the `Node` pk/id and the value is an empty dict.

Renamed new test file by prepending `test_`

Updated `test_v3_to_v4` in new test file to migrate `export_v0.3_no_UPF.aiida`
instead of `arithmetic.add.aiida`.

Removed UPF node attributes information from `data.json` files for:

- `export_v0.1.aiida`
- `export_v0.2.aiida`
- `export_v0.3.aiida`

Renamed them by appending `_no_UPF`.

Migrated `export_v0.3_no_UPF.aiida` to export version 0.4.

Migrated other existing export files under `fixtures` to export version 0.4.
But only for those that are expected to be the newest version.

Updated `test_calcjob.py` and found an issue for `inputcat` and `outputcat`.
This has been reported in issue #2611.

Updated all tests that use exported files from `fixtures`:

- `test_export_and_import.py`
- `test_calcjob.py`
- `test_export.py`
- `test_import.py`
- `test_archive.py`

Added test `test_no_node_export` to check migration of export file with no Nodes succeed.
Added `export_v0.3_no_Nodes.aiida` export file for this test.

Updated test utils file `fixtures.py` to incorporate the  `get_archive_file` function
from `test_export.py`.

All instances of reimplementations of this function have been updated
with the new `get_archive_file` util function.

Inserted key checks, if migration of an export file should have been partially performed,
for one reason or another.

Update arithmetic.add.aiida
….cmdline.utils.migration`.

Following files have been created:
- `__init__.py`: Contains `migrate_recursive` and allows for import from `aiida.cmdline.utils.migration`
- `utils.py`: Contains `verify_metadata_version` and `update_metadata`
- `v01_to_v02.py`: Contains function to migrate from export version v0.1 to v0.2
- `v02_to_v03.py`: Contains function to migrate from export version v0.2 to v0.3
- `v03_to_v04.py`: Contains functions to migrate from export version v0.3 to v0.4

Moving and renaming new test-file `test_export_file_migrations.py` from `aiida.backends.tests`
to `aiida.backends.tests.cmdline.utils.migrations`, renaming it to `test_v03_to_v04.py`.

See previous commits for actual authors of the code.
…port-migration-tests".

Tests for the following files in `cmdline.utils.migration`:
- `__init__.py` in `test_migration.py`
- `v01_to_v02.py` in `test_v01_to_v02.py`
- `v02_to_v03.py` in `test_v02_to_v03.py`
- `v03_to_v04.py` in `test_v03_to_v04.py`

Found and fixed bug for migration of TrajectoryData in Rev. 1.0.26 and 1.0.27 for v0.3 to v0.4:
"symbols" was only updated for "node_attributes" not "node_attributes_conversion".

Added more representative export files for the different export versions.
They have been created using the same workflow (by @yakutovicha) in specialized QuantumMobile VMs.
AiiDA version used for export versions:
Export: 0.2, AiiDA: 0.9.1
Export: 0.3, AiiDA: 0.12.3
Export: 0.4, AiiDA: 1.0.0b2

Export version 0.1 was set to AiiDA version 0.6.0,
but AiiDA was never released on PyPi until version 0.8.0,
and the changes from export version 0.1 to 0.2 are very minor.
So a more representative export file for export version 0.1 is left out.
The export files are in the separate repo.

Added dependency for separate repo if "testing" is chosen when installing via pip.

To be pythonic and have more transparent code,
the migration steps were divided into separate functions.
This also results in the fact that a lot of migrations can now be skipped,
if no Nodes are present in the export file.
Add migration from v0.4 to v0.5.
This includes migration 0034 and 0036 (0035 has no effect).

Create new simple export file for v0.5 and update tests to use it.
Migrate other export files.

Up version dependancy of `aiida-export-migration-tests` to 0.5.0.
@sphuber sphuber merged commit 32496e3 into aiidateam:develop Jun 19, 2019
@sphuber
Copy link
Contributor

sphuber commented Jun 19, 2019

Thanks a lot to @yakutovicha, @CasperWA and @szoupanos

@CasperWA
Copy link
Contributor

Thanks a lot to @yakutovicha, @CasperWA and @szoupanos

woooot 😎 so cool - thanks @sphuber !

@CasperWA CasperWA deleted the export_migration branch June 19, 2019 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants