Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rdf serializer #308

Merged
merged 8 commits into from
Nov 6, 2024
Merged

Conversation

JaFeKl
Copy link

@JaFeKl JaFeKl commented Oct 3, 2024

Just finished on the serialization part for RDF. Wanted to put it in here already before getting into the deserializer part.

Things to consider:

  • this serialization introduces two new dependencies: "rdflib" and "pyshacl". Both only support Python>3.8.1 which is currently not in line with the 3.8.0 minimum requirement for the sdk. Pyshacl is used during test to validate against the official shacl schema files.

  • With regards to testing: There are some open issues on the SHACL file generation. The currently generated SHACL file has some flaws that make it unsuable for correct validation right now, Pull Request #526. I validated the serializer with the file provided by @mhrimaz in the above pull request.

@JaFeKl JaFeKl changed the title Rdf mapper Rdf serializer Oct 3, 2024
Copy link
Contributor

@s-heppner s-heppner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your contribution. Let's work together to get it ready for merging.

Can you please fill out the Eclipse Contributor Agreement with the E-Mail address you used to commit? This is a necessary requirement for contributing to Eclipse projects.

To do that, you need to create an Eclipse account with that E-Mail address and then click that you agree to the ECA: https://accounts.eclipse.org/user/register

basyx/aas/adapter/rdf/__init__.py Outdated Show resolved Hide resolved
basyx/aas/adapter/rdf/rdf_serialization.py Show resolved Hide resolved
@JaFeKl
Copy link
Author

JaFeKl commented Oct 5, 2024

I created an Eclipse account and needed to change the author informations of my previous commits, hope this works...

@JaFeKl
Copy link
Author

JaFeKl commented Oct 5, 2024

The testing should currently be skipped since no schemas will be loaded. I would suggest to add it to the pipeline as soon as the shacl file available is valid. Then its desgined to be the same way as for the XML and the JSON schema which are both loaded into a schema folder which is created during the automated testing.

@mhrimaz
Copy link
Contributor

mhrimaz commented Oct 5, 2024

BTW, there are flaws in the RDF representation that makes it unusable. So at the end of the day, IDTA needs to fix it.

@s-heppner
Copy link
Contributor

Thanks again for your work!
The last problems I see seem to have to do with typing, as mypy complains in the CI, though it seems to be mostly the same issue all over:

basyx/aas/adapter/rdf/rdf_serialization.py:190: error: Argument 2 to "_abstract_classes_to_rdf" of "AASToRDFEncoder" has incompatible type "BNode"; expected "URIRef"  [arg-type]

Could you please check that?

Also, mypy complains about pyshacl:

test/adapter/rdf/test_rdf_serialization.py:12: error: Skipping analyzing "pyshacl": module is installed, but missing library stubs or py.typed marker  [import-untyped]

That's weird, since, the project seems to have a py.typed file indicating that they actually use static types here. My suspicion is that it's located in the wrong spot, but I could of course be wrong, so could you please have a look as well?
I've created an issue here: RDFLib/pySHACL#267.

@s-heppner
Copy link
Contributor

BTW, there are flaws in the RDF representation that makes it unusable. So at the end of the day, IDTA needs to fix it.

@mhrimaz Did you already create issues regarding that in aas-core or aas-specs? I'd be interested in knowing about them, before we go and add a feature that doesn't work to our SDK 😅

@mhrimaz
Copy link
Contributor

mhrimaz commented Oct 10, 2024

@s-heppner @JaFeKl This is a longstanding issue. The RDF representation has several problems (admin-shell-io/aas-specs#384), but the most critical one is that it doesn't preserve the order of elements (admin-shell-io/aas-specs#45). This is crucial for References, where we have a chain of keys, as well as for other elements where order matters.

The challenge is that there is no single correct solution. We are aware of the potential fixes. We've also discussed adding this to the aas-core (aas-core-works/aas-core-codegen#427) and AAS4j (https://www.eclipse.org/lists/aas4j-dev/msg00050.html), but the discussions always stall because we are waiting for the new representation, which I believe the Ontology Workgroup is currently working on.

I hope more people push AAS/RDF and to make this a priority for IDTA. Having a known issue for three years without addressing it is not good for publicity.

@s-heppner
Copy link
Contributor

I understand. Thank you for sharing!
@mhrimaz, @JaFeKl could you write me an e-mail to make a short meeting in which we discuss what is actually now implemented, what works and what the implications of the problems with the specification are?

I'm fully on board to add RDF to BaSyx Python, but I think potential users should be well aware of what they get.

@JaFeKl
Copy link
Author

JaFeKl commented Oct 17, 2024

Sorry for my late response ...
@mhrimaz Thanks you for the summary. At least nothing that I was not aware of.
As you said, when you stay stricktly with the implemented specification, serialization and deserialization won't work due to the missing order of keys.

Nevertheless, since I needed a serialization to push some AAS to a graph database I just thought why not contribute it instead of creating it solely.

@s-heppner In the end, you should decide if you want it in the python-sdk even though these limitations exist, or you just wait until something on the IDTA side decides to tackle these problems.

@mhrimaz
Copy link
Contributor

mhrimaz commented Oct 18, 2024

@JaFeKl, is the email address written here: https://www.ifl.kit.edu/mitarbeiter_jan_felix_klein.php correct? We tried to reach you but didn’t get a response.
@JaFeKl, I’m really glad that you made this contribution. Do you have any plans to add a deserialization function?

By the way, yesterday (18.10.2024), there was a meeting with the IDTA Ontology workgroup, and they were interested in testing some of the available tools. I mentioned your contribution here. You can also provide your inputs and feedbacks, especially since you have experience with the implementation. Maybe you can reach out to Nico Wilhelm, he is the workgroup lead.

@s-heppner
Copy link
Contributor

After some discussion with the team, we propose the following:

Let's get this PR ready to be merged, but merge it into a new RDF branch (naming in progress).
That way, we have the functionality officially inside the SDK repository and people can use it if they need to (e.g by pip installing straight from GitHub and specifying the branch), but we will not automatically include something that fundamentally cannot work due to problems with the specifications in our next release.

As a side effect, we can also maybe shine a spotlight towards the issues and once they are resolved, we adapt the code in the branch accordingly and can merge it to main.

Would that be okay with you @JaFeKl?

@JaFeKl
Copy link
Author

JaFeKl commented Oct 24, 2024

@s-heppner This actually sounds like a really good solution to me for now. This allows us also to set it as a dependency with the included RDF serialization in our software. Any more changes needed on the pull request for this?

@mhrimaz yes that is my current e-mail, but I got too many things going on right now so it kinda got lost in between. Thanks for mentioning the working group lead, so I may contact him.

Copy link
Contributor

@s-heppner s-heppner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! I think the last step would be to update your branch to the current state of main and moving your dependencies to the new pyproject.toml. After that, we should be able to merge this PR.

requirements.txt Outdated Show resolved Hide resolved
@s-heppner
Copy link
Contributor

Great, thank you very much. We should be on the home stretch. Can you have a look at this mypy error:

basyx/aas/adapter/rdf/rdf_serialization.py:257: error: Argument 2 to "_value_to_rdf" of "AASToRDFEncoder" has incompatible type "type[relativedelta] | type[datetime] | type[Date] | type[time] | type[GYearMonth] | <10 more items> | None"; expected "type[relativedelta] | type[datetime] | type[Date] | type[time] | type[GYearMonth] | <26 more items>"  [arg-type]

It seems like the provided argument can logically actually only have less types than the function expects. Can you make sure these two align? I think everything else looks good to me.

@JaFeKl
Copy link
Author

JaFeKl commented Oct 28, 2024

I just reviewed this and I am unable to comprehend why mypy throws the error for this line. The _value_to_rdf method is used multiple times in exactly the same fashion for other aas types whithout any issues and only fails when serializing the extension object. I am actually really confused.
Its also used similar in the xml serializer without any errors...

@s-heppner s-heppner changed the base branch from main to Experimental/Adapter/RDF November 2, 2024 08:45
@s-heppner
Copy link
Contributor

I've created the new branch, so as soon as the # type: ignore is done we can merge it.
Thanks for your patience!

@s-heppner s-heppner merged commit 323835c into eclipse-basyx:Experimental/Adapter/RDF Nov 6, 2024
7 of 8 checks passed
@s-heppner
Copy link
Contributor

I've merged this PR in order to be able to modify it myself. Could you have a look at #320? Does this work for you?

@JaFeKl
Copy link
Author

JaFeKl commented Nov 7, 2024

Yes, all fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants