Archive export refactor (2) #4534

Merged on Nov 12, 2020 (34 commits)
Changes from 13 commits
681803b
add intital implementation
chrisjsewell Oct 30, 2020
187725f
minor updates
chrisjsewell Nov 3, 2020
6d3311a
implement zippath
chrisjsewell Nov 3, 2020
f9e265b
Update writers.py
chrisjsewell Nov 3, 2020
e477280
full implmentation
chrisjsewell Nov 4, 2020
f9feb27
Merge branch 'develop' into archive/export-refactor2
chrisjsewell Nov 4, 2020
fca7620
fix pre-commit
chrisjsewell Nov 4, 2020
95b55a0
fix pre-commit (2)
chrisjsewell Nov 4, 2020
b3227ac
pre-commit fix (3!)
chrisjsewell Nov 4, 2020
4f1cef7
fix `__all__`
chrisjsewell Nov 4, 2020
6097589
update ZipPath
chrisjsewell Nov 5, 2020
b3fd6bb
improve `verdi import` stdout
chrisjsewell Nov 5, 2020
cd7a0ef
fix pre-commit
chrisjsewell Nov 5, 2020
b3c0ea2
Add null writer
chrisjsewell Nov 5, 2020
1294692
convert `test_simple` to pytest
chrisjsewell Nov 5, 2020
d4e9f3e
Apply suggestions from code review
chrisjsewell Nov 5, 2020
68f4944
Merge branch 'develop' into archive/export-refactor2
chrisjsewell Nov 5, 2020
3d22a7b
commented out batch size --batch-size option
chrisjsewell Nov 5, 2020
73c99aa
cache at set
chrisjsewell Nov 6, 2020
5b4c306
Improve zip/tar read efficiency
chrisjsewell Nov 7, 2020
0f1cb3c
move compression code to archive-path module
chrisjsewell Nov 8, 2020
78eb161
minor logging improvement
chrisjsewell Nov 9, 2020
ae32cab
minor update
chrisjsewell Nov 9, 2020
7681f79
fix error
chrisjsewell Nov 9, 2020
4ca4bcb
Remove `safe_extract_tar` and `safe_extract_zip`
chrisjsewell Nov 9, 2020
3d76c15
Add `zip-lowmemory` archive format
chrisjsewell Nov 9, 2020
88d75bb
change `_zipinfo_cache` default to None
chrisjsewell Nov 10, 2020
f509685
Merge branch 'develop' into archive/export-refactor2
chrisjsewell Nov 10, 2020
d7021c8
some extra typing fixes
chrisjsewell Nov 10, 2020
2b96a3e
Merge branch 'develop' into archive/export-refactor2
chrisjsewell Nov 11, 2020
98f0b9d
apple review comments
chrisjsewell Nov 11, 2020
37d2a4f
Merge branch 'develop' into archive/export-refactor2
chrisjsewell Nov 11, 2020
a2720fe
Update aiida/tools/importexport/dbexport/__init__.py
chrisjsewell Nov 12, 2020
86a81ab
fix pre-commit
chrisjsewell Nov 12, 2020
13 changes: 11 additions & 2 deletions aiida/cmdline/commands/cmd_export.py
@@ -101,10 +101,18 @@ def inspect(archive, version, data, meta_data):
     show_default=True,
     help='Include or exclude comments for node(s) in export. (Will also export extra users who commented).'
 )
+@click.option(
Member

@ltalirz ltalirz Nov 3, 2020

See comment #4534 (comment)

Member Author

This allows for performance (CPU/memory) testing from the CLI, as we have done for other PRs (see below). I would just comment it out before merging rather than remove it, because it is definitely very useful and will be beneficial, if not now, then for the new format.

Member Author

commented out

+    '-b',
+    '--batch-size',
+    default=1000,
+    type=int,
+    help='Batch database query results in sub-collections to reduce memory usage.'
+)
 @decorators.with_dbenv()
 def create(
     output_file, codes, computers, groups, nodes, archive_format, force, input_calc_forward, input_work_forward,
-    create_backward, return_backward, call_calc_backward, call_work_backward, include_comments, include_logs, verbosity
+    create_backward, return_backward, call_calc_backward, call_work_backward, include_comments, include_logs, verbosity,
+    batch_size
 ):
     """
     Export subsets of the provenance graph to file for sharing.
@@ -143,7 +151,8 @@ def create(
         'call_work_backward': call_work_backward,
         'include_comments': include_comments,
         'include_logs': include_logs,
-        'overwrite': force
+        'overwrite': force,
+        'batch_size': batch_size,
     }

     if archive_format == 'zip':
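
The help text of the `--batch-size` option in `cmd_export.py` describes batching query results into sub-collections to reduce memory usage. The following is only a minimal sketch of that general idea, not the code in this PR: `iter_batches` is a hypothetical helper name, and the generator of strings stands in for a lazily evaluated database query.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar('T')


def iter_batches(items: Iterable[T], batch_size: int = 1000) -> Iterator[List[T]]:
    """Yield successive lists of at most ``batch_size`` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError('batch_size must be a positive integer')
    iterator = iter(items)
    while True:
        # Only one batch is materialised in memory at any time.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a lazy query result (an assumption made for illustration only).
rows = (f'uuid-{index}' for index in range(2500))
batch_lengths = [len(batch) for batch in iter_batches(rows, batch_size=1000)]
```

With 2500 rows and the default of 1000 used by the CLI option, the final batch is simply shorter than the others; the caller never holds more than `batch_size` rows at once.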
3 changes: 2 additions & 1 deletion aiida/cmdline/commands/cmd_import.py
@@ -207,7 +207,7 @@ def _import_archive(archive: str, web_based: bool, import_kwargs: dict, try_migr
         archive_path = archive

         if web_based:
-            echo.echo_info(f'downloading archive {archive}')
+            echo.echo_info(f'downloading archive: {archive}')
             try:
                 response = urllib.request.urlopen(archive)
             except Exception as exception:
@@ -216,6 +216,7 @@ def _import_archive(archive: str, web_based: bool, import_kwargs: dict, try_migr
             archive_path = temp_folder.get_abs_path('downloaded_archive.zip')
             echo.echo_success('archive downloaded, proceeding with import')

+        echo.echo_info(f'starting import: {archive}')
         try:
             import_data(archive_path, **import_kwargs)
         except IncompatibleArchiveVersionError as exception:
3 changes: 2 additions & 1 deletion aiida/tools/importexport/archive/__init__.py
@@ -15,5 +15,6 @@
 from .migrators import *
 from .readers import *
 from .writers import *
+from .zip_path import *

-__all__ = (migrators.__all__ + readers.__all__ + writers.__all__ + common.__all__)
+__all__ = (migrators.__all__ + readers.__all__ + writers.__all__ + common.__all__ + zip_path.__all__)
4 changes: 2 additions & 2 deletions aiida/tools/importexport/archive/common.py
@@ -15,7 +15,7 @@
 from pathlib import Path
 import tarfile
 from types import TracebackType
-from typing import Any, Callable, Dict, Iterable, List, Optional, Set, Tuple, Type, Union
+from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple, Type, Union
 import zipfile

 from aiida.common import json  # handles byte dumps
@@ -47,7 +47,7 @@ class ArchiveMetadata:
     # optional data
     graph_traversal_rules: Optional[Dict[str, bool]] = dataclasses.field(default=None)
     # Entity type -> UUID list
-    entities_starting_set: Optional[Dict[str, Set[str]]] = dataclasses.field(default=None)
+    entities_starting_set: Optional[Dict[str, List[str]]] = dataclasses.field(default=None)
Member

I guess there is a good reason for this change; just mentioning that set is also in the name of the attribute...

Member Author

Yeah, I went back and forth on this: it has to be converted to a list before being stored as JSON, so it was a little easier to convert it before passing it to the writer.
But you're right that the naming is now a little off.

     include_comments: Optional[bool] = dataclasses.field(default=None)
     include_logs: Optional[bool] = dataclasses.field(default=None)
     # list of migration event notifications
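
The review thread on `entities_starting_set` turns on the point that a set has to be converted to a list before it can be stored as JSON. A small sketch of why, using the standard-library `json` module rather than the `aiida.common.json` wrapper imported in `common.py` (whose behaviour is not shown here); the entity type and UUID values are made up, and sorting the list is an assumption added only to make the output deterministic.

```python
import json
from typing import Dict, List, Set

# Hypothetical starting set, keyed by entity type as in the dataclass comment.
starting_set: Dict[str, Set[str]] = {'Node': {'uuid-b', 'uuid-a'}}

try:
    json.dumps(starting_set)
    set_is_serializable = True
except TypeError:
    # The standard-library JSON encoder has no representation for ``set``.
    set_is_serializable = False

# Convert each set to a list before handing the metadata to a writer.
as_lists: Dict[str, List[str]] = {key: sorted(values) for key, values in starting_set.items()}
encoded = json.dumps(as_lists)
```

Converting once, before the metadata object is built, keeps the writer free of type-specific handling, which matches the rationale given in the thread for typing the field as `List[str]`.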