Duplicate CREATE links when importing old export file #4125
Comments
Pinging @CasperWA for comments, as he's been working a lot with the migrations of export files. @greschd, could you please quickly check all the outgoing links (type and label pairs) and report them here? |
From the
I think the code producing these issues is here, although I'm not 100% sure this is the right commit version: https://github.com/greschd/aiida-tbextraction/blob/034f8a6ceaa300118064f1eb287037e310f3b45f/aiida_tbextraction/fp_run/_helpers/_inline_calcs.py The labels of the offending links are Also, I was using a sketchy development version of AiiDA at the time - it's entirely possible this is some edge case related to that. |
We've tried to tackle the most general patches of these, but to get it just right for a particular graph it usually takes a bit of manual labour, since the desired outcome may vary from owner to owner, e.g., you may not care about the labels, but others might. Since this problem is most likely due to migrating from v0.3 to v0.4 of the export version, we can try to do this the "hard" way, by migrating to only this version first, then going over the export There might also be a log file somewhere with the list of links affected (perhaps in the folder from where you ran the migrate/export command?). |
Yeah, the problem here is that the links aren't affected by the migration at all - so they also don't appear in the It appears to me the "inline calculation" produced duplicates of the output nodes, which are both linked with the same label. I'll give it a quick try at reproducing this with a clean |
Found the culprit... In If this is the only place where this can occur I'd be tempted to mark this However, this made me remember how you could have duplicate CREATE links in previous versions: Outputs did not have to be unique, but then you needed to access them with
In [17]: inline_calc.get_outputs_dict()
Out[17]:
{u'res': <Data: uuid: 52bed73b-cabc-49ce-aae1-588479454123 (pk: 15)>,
u'res_15': <Data: uuid: 52bed73b-cabc-49ce-aae1-588479454123 (pk: 15)>,
u'res_16': <Data: uuid: c4005890-1bb5-47f4-b1fb-eeb85cfe0144 (pk: 16)>}
In [18]: inline_calc.out.res
Out[18]: <Data: uuid: 52bed73b-cabc-49ce-aae1-588479454123 (pk: 15)>
In [19]: inline_calc.out.res_15
Out[19]: <Data: uuid: 52bed73b-cabc-49ce-aae1-588479454123 (pk: 15)>
In [20]: inline_calc.out.res_16
Out[20]: <Data: uuid: c4005890-1bb5-47f4-b1fb-eeb85cfe0144 (pk: 16)>
The link labels don't contain the
In [28]: for link in inline_calc.dbnode.output_links.all():
...: print(link, link.label)
...:
(<DbLink: InlineCalculation (14) --> Float (16)>, u'res')
(<DbLink: InlineCalculation (14) --> Float (15)>, u'res')
Here's the link to the relevant documentation in If memory serves me right, this could happen in any calculation / workchain / whatever, and the caching issue isn't actually creating an invalid link state as far as the old AiiDA version is concerned. Maybe a way to fix this would be explicitly adding the |
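The old lookup behaviour shown in the session above can be sketched in plain Python. This is a minimal illustration and not AiiDA's actual code: `legacy_outputs_dict` is a hypothetical helper, nodes are represented by their pk only, and the rule that the bare label resolves to the lowest pk is an assumption that merely matches the output shown above.

```python
from collections import defaultdict


def legacy_outputs_dict(links):
    """Build an old-style outputs dict from (label, pk) pairs.

    Hypothetical helper: every link is reachable as ``<label>_<pk>``,
    and the bare label points at one of the duplicates.
    """
    by_label = defaultdict(list)
    for label, pk in links:
        by_label[label].append(pk)

    outputs = {}
    for label, pks in by_label.items():
        # Assumption: the bare label resolves to the lowest pk.
        outputs[label] = min(pks)
        # Each link also gets an unambiguous key with the pk appended.
        for pk in pks:
            outputs[f"{label}_{pk}"] = pk
    return outputs


# The two links from the session above, both labelled 'res'.
links = [("res", 16), ("res", 15)]
outputs = legacy_outputs_dict(links)
print(outputs)
```

Under this scheme the stored link labels stay identical (`res` twice); only the dictionary keys are disambiguated, which is why the duplicates survive into the export file.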
Hi! Anyway - I agree. Let's set this to |
Sorry, I wasn't being totally clear: Since the caching / inline calcs are in fact not the only place where this can occur I would suggest not to mark this as The uniqueness was not required in the previous version, but now you can only have one P.S.: Even if we decide to mark it |
Hmm, I think I misunderstood something: Duplicate outgoing links with the same label still appear to be valid:
In [10]: from aiida.orm import CalcJobNode
In [11]: from aiida.orm import Data
In [12]: calc = CalcJobNode()
In [13]: d1 = Data()
In [14]: d2 = Data()
In [15]: d1.add_incoming(calc, link_type=LinkType.CREATE, link_label='test')
In [16]: d2.add_incoming(calc, link_type=LinkType.CREATE, link_label='test')
So.. is this just an extra validation in the import that actually shouldn't be there? |
Ha, just figured out how I tricked the link validation in the example above: Links only show up in
In [16]: calc = CalcJobNode()
In [17]: calc.store()
Out[17]: <CalcJobNode: uuid: 6e3b970f-cc00-465e-a6c5-67507a3a7740 (pk: 49831)>
In [18]: d1 = Data()
In [19]: d1.add_incoming(calc, link_type=LinkType.CREATE, link_label='test')
In [20]: calc.get_outgoing(link_type=LinkType.CREATE, link_label_filter='test', only_uuid=True).all()
Out[20]: []
In [21]: d1.store()
Out[21]: <Data: uuid: a776a906-dbfc-48d8-8b56-ac2627b07a28 (pk: 49832)>
In [22]: calc.get_outgoing(link_type=LinkType.CREATE, link_label_filter='test', only_uuid=True).all()
Out[22]: [LinkTriple(node='a776a906-dbfc-48d8-8b56-ac2627b07a28', link_type=<LinkType.CREATE: 'create'>, link_label='test')]
This is relevant here because I'm not sure if
Tagging @sphuber for comment. |
I am confused. I thought this behavior was not allowed and therefore your example should be a bug. If you take your example and then store the nodes, there will be no exception and we have two links that violate the uniqueness constraints:
In [4]: from aiida.orm import CalcJobNode
...: from aiida.orm import Data
...: calc = CalcJobNode()
...: d1 = Data()
...: d2 = Data()
...: d1.add_incoming(calc, link_type=LinkType.CREATE, link_label='test')
...: d2.add_incoming(calc, link_type=LinkType.CREATE, link_label='test')
...: calc.store()
Out[4]: <CalcJobNode: uuid: 94def715-33ab-4065-b750-aca73266d0f0 (pk: 11081)>
In [5]: calc.get_outgoing().all()
Out[5]: []
In [6]: d1.store()
Out[6]: <Data: uuid: 023c9af7-328d-43ca-8ef5-05f4f0bb773c (pk: 11082)>
In [7]: calc.get_outgoing().all()
Out[7]: [LinkTriple(node=<Data: uuid: 023c9af7-328d-43ca-8ef5-05f4f0bb773c (pk: 11082)>, link_type=<LinkType.CREATE: 'create'>, link_label='test')]
In [8]: d2.store()
Out[8]: <Data: uuid: e24e1f7f-6337-469b-a624-c1d13aa1c8eb (pk: 11083)>
In [9]: calc.get_outgoing().all()
Out[9]:
[LinkTriple(node=<Data: uuid: 023c9af7-328d-43ca-8ef5-05f4f0bb773c (pk: 11082)>, link_type=<LinkType.CREATE: 'create'>, link_label='test'),
 LinkTriple(node=<Data: uuid: e24e1f7f-6337-469b-a624-c1d13aa1c8eb (pk: 11083)>, link_type=<LinkType.CREATE: 'create'>, link_label='test')]
Both links are stored but that should not have been allowed. That can be shown by calling
In [10]: calc.outputs.test
---------------------------------------------------------------------------
MultipleObjectsError Traceback (most recent call last)
<ipython-input-10-6f9a623c1bc3> in <module>
----> 1 calc.outputs.test
~/code/aiida/env/dev/aiida-core/aiida/orm/utils/managers.py in __getattr__(self, name)
81 """
82 try:
---> 83 return self._get_node_by_link_label(label=name)
84 except NotExistent:
85 # Note: in order for TAB-completion to work, we need to raise an
~/code/aiida/env/dev/aiida-core/aiida/orm/utils/managers.py in _get_node_by_link_label(self, label)
62 if self._incoming:
63 return self._node.get_incoming(link_type=self._link_type).get_node_by_label(label)
---> 64 return self._node.get_outgoing(link_type=self._link_type).get_node_by_label(label)
65
66 def __dir__(self):
~/code/aiida/env/dev/aiida-core/aiida/orm/utils/links.py in get_node_by_label(self, label)
296 else:
297 raise exceptions.MultipleObjectsError(
--> 298 'more than one neighbor with the label {} found'.format(label)
299 )
300
MultipleObjectsError: more than one neighbor with the label test found
Weirdly, when looking at the tests in
def test_node_outdegree_unique_triple(self):
    """Test that the validation of links with outdegree `unique_triple` works correctly

    The example here is a `CalculationNode` that has two outgoing CREATE links with the same label, but to different
    target nodes. This is legal and should pass validation.
    """
    creator = CalculationNode().store()
    data_one = Data()
    data_two = Data()
    # Verify that adding two create links with the same link label but to different target is allowed from the
    # perspective of the source node (the CalculationNode in this case)
    data_one.add_incoming(creator, link_type=LinkType.CREATE, link_label='create')
    data_two.add_incoming(creator, link_type=LinkType.CREATE, link_label='create')
    data_one.store()
    data_two.store()
    uuids_outgoing = set(node.uuid for node in creator.get_outgoing().all_nodes())
    uuids_expected = set([data_one.uuid, data_two.uuid])
    self.assertEqual(uuids_outgoing, uuids_expected)
which tests exactly the example here and claims this should be fine. However, the link validation code in
link_mapping = {
    LinkType.CALL_CALC: (WorkflowNode, CalculationNode, 'unique_triple', 'unique'),
    LinkType.CALL_WORK: (WorkflowNode, WorkflowNode, 'unique_triple', 'unique'),
    LinkType.CREATE: (CalculationNode, Data, 'unique_pair', 'unique'),
    LinkType.INPUT_CALC: (Data, CalculationNode, 'unique_triple', 'unique_pair'),
    LinkType.INPUT_WORK: (Data, WorkflowNode, 'unique_triple', 'unique_pair'),
    LinkType.RETURN: (WorkflowNode, Data, 'unique_pair', 'unique_triple'),
}
clearly stating that |
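To make the difference between the three outdegree rules in `link_mapping` concrete, here is a small plain-Python sketch. It is not the validation code from aiida-core: `violates_outdegree` is a hypothetical function, links are plain `(link_type, label, target)` tuples, and the meaning given to each rule is an assumption based on the rule names and the test docstring above.

```python
def violates_outdegree(existing, new_link, rule):
    """Return True if adding ``new_link`` to one source breaks ``rule``.

    ``existing`` holds the (link_type, label, target) triples that already
    leave the source node. Hypothetical helper for illustration only.
    """
    link_type, label, _target = new_link
    same_type = [triple for triple in existing if triple[0] == link_type]

    if rule == "unique":
        # At most one outgoing link of this type at all.
        return bool(same_type)
    if rule == "unique_pair":
        # The (type, label) pair must be unique, whatever the target is.
        return any(triple[1] == label for triple in same_type)
    if rule == "unique_triple":
        # Only an exact repeat of (type, label, target) is rejected.
        return new_link in same_type
    raise ValueError(f"unknown rule: {rule}")


existing = {("create", "test", "data-1")}
new_link = ("create", "test", "data-2")

pair_violation = violates_outdegree(existing, new_link, "unique_pair")
triple_violation = violates_outdegree(existing, new_link, "unique_triple")
print(pair_violation, triple_violation)
```

Read this way, two CREATE links with the same label to different `Data` nodes break `unique_pair` (the rule listed for `LinkType.CREATE`) but would pass `unique_triple`, which is the mismatch between the test and the mapping discussed here.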
The validation relies on the outgoing node being stored because it uses |
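The loophole described in this thread (validation only sees links whose target is already stored) can be reproduced with a toy model. Everything here is hypothetical: `ToyNode` and `ToyGraph` are stand-ins and not AiiDA classes, and the sketch assumes that links are not validated again when a node is stored, which is what the sessions above suggest.

```python
class ToyNode:
    """Stand-in for a node; incoming links are cached until it is stored."""

    def __init__(self, name):
        self.name = name
        self.pending_incoming = []  # (source, label) pairs not yet persisted


class ToyGraph:
    """Stand-in for the database: only stored links are visible to queries."""

    def __init__(self):
        self.stored_links = []  # (source, label, target) triples

    def outgoing_labels(self, source):
        return [label for src, label, _ in self.stored_links if src is source]

    def add_incoming(self, target, source, label):
        # Validation queries stored links only, so cached links are invisible.
        if label in self.outgoing_labels(source):
            raise ValueError(f"label {label!r} already used by {source.name}")
        target.pending_incoming.append((source, label))

    def store(self, node):
        # Assumption: cached links are persisted without being re-validated.
        for source, label in node.pending_incoming:
            self.stored_links.append((source, label, node))
        node.pending_incoming = []


graph = ToyGraph()
calc, d1, d2, d3 = (ToyNode(name) for name in ("calc", "d1", "d2", "d3"))

# Both links pass validation because neither target is stored yet.
graph.add_incoming(d1, calc, "test")
graph.add_incoming(d2, calc, "test")
graph.store(d1)
graph.store(d2)
duplicates = graph.outgoing_labels(calc)

# Once a link with that label is stored, a further one is rejected.
try:
    graph.add_incoming(d3, calc, "test")
    third_rejected = False
except ValueError:
    third_rejected = True

print(duplicates, third_rejected)
```

In this model the duplicate only slips through when both links are added before either target is stored, which is consistent with the observation that ordinary workchain or calcjob code rarely ends up in this state.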
I agree that the test is probably just wrong, since it's also pretty impossible to get into this situation with "normal" user code (workchains, calcjobs, etc.). |
While importing an old export file (export version 0.3), the migrations run through without error, but the actual import raises:
This is due to the "unique pair" check, here: https://github.com/aiidateam/aiida-core/blob/develop/aiida/tools/importexport/dbimport/backends/django/__init__.py#L656
I can't quite remember, was it allowed in a previous version of AiiDA to have multiple outgoing link labels of the same name? If so, what would be a good way to resolve this issue?
In this particular case I can just patch the code to rename the links (since I don't care too much about their names), but in general that might be dangerous.
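As a rough sketch of the renaming workaround mentioned above, the following plain-Python function makes duplicate labels unique by appending part of the target UUID. It makes several assumptions that are not confirmed in this thread: links are represented as dicts with the keys `input`, `output`, `label` and `type`, the CREATE type string is `'createlink'`, and `rename_duplicate_labels` is a hypothetical helper rather than part of AiiDA.

```python
from collections import Counter


def rename_duplicate_labels(links, link_type="createlink"):
    """Return a copy of ``links`` where duplicate labels are made unique.

    Assumed record layout (hypothetical): each link is a dict with the
    keys 'input' (source UUID), 'output' (target UUID), 'label', 'type'.
    """
    counts = Counter(
        (link["input"], link["label"])
        for link in links
        if link["type"] == link_type
    )

    renamed = []
    for link in links:
        new_link = dict(link)
        key = (link["input"], link["label"])
        if link["type"] == link_type and counts[key] > 1:
            # Append the first 8 characters of the target UUID to the label.
            new_link["label"] = f"{link['label']}_{link['output'][:8]}"
        renamed.append(new_link)
    return renamed


example_links = [
    {"input": "calc-uuid", "output": "52bed73b-aaaa", "label": "res", "type": "createlink"},
    {"input": "calc-uuid", "output": "c4005890-bbbb", "label": "res", "type": "createlink"},
    {"input": "calc-uuid", "output": "0f0f0f0f-cccc", "label": "other", "type": "createlink"},
]
fixed_links = rename_duplicate_labels(example_links)
print([link["label"] for link in fixed_links])
```

The danger noted above still applies: any code that later looks up these outputs by their original label would no longer find them, so this only makes sense when the labels carry no meaning for the owner of the data.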
The export I'm trying to import here can be found at https://polybox.ethz.ch/index.php/s/I1se1WGAiP2iLYX, but it's large and also suffers from #3450 (which I fixed with a `sed` on the `data.json`) - so it's not great for testing on.