serialisers output diverge given a base #5

Open
bblfish opened this issue Feb 27, 2023 · 1 comment

bblfish commented Feb 27, 2023

Different serialisers produce very different output when given a base.
This shows up as inconsistent results in the SerialisationTestSuite "graphs with relative URIs" tests.

The reference graph consists of the three triples

<http://www.w3.org/2001/sw/RDFCore/ntriples/> 
     <http://purl.org/dc/elements/1.1/creator> "Dave Beckett", "Art Barstow" ;
     <http://purl.org/dc/elements/1.1/publisher> <http://www.w3.org/> .

Call the graph consisting of those triples referenceGraph.
Writing it out to a string with the following base

writer.asString(referenceGraph, Url("http://www.w3.org/2001/sw/RDFCore/"))

gives very different results depending on the serialiser.

The Jena Turtle serialiser refers to the publisher using the root-relative URL /

<ntriples/>  <http://purl.org/dc/elements/1.1/creator>
                "Art Barstow" , "Dave Beckett" ;
        <http://purl.org/dc/elements/1.1/publisher>
                </> .

The Jena JSON-LD serialiser (Titanium) refers to the publisher using the
path-relative URL ../../../

{
    "@id": "ntriples/",
    "http://purl.org/dc/elements/1.1/publisher": {
        "@id": "../../../"
    },
    "http://purl.org/dc/elements/1.1/creator": [
        "Art Barstow",
        "Dave Beckett"
    ]
}

The Jena RDF/XML serialiser refers to the publisher by its absolute URL.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="ntriples/">
    <dc:publisher rdf:resource="http://www.w3.org/"/>
    <dc:creator>Art Barstow</dc:creator>
    <dc:creator>Dave Beckett</dc:creator>
  </rdf:Description>
</rdf:RDF>

These are just the differences between the Jena serialisers. (RDF4J is currently on hold because of a Dotty issue, scala/scala3#16408, but it is unlikely to give us more uniformity.)

There may be ways to tune this behaviour, but it is unclear how best to do so.

The problem is that our tests deserialise the resulting relative graph strings with a different base. Given the above, this produces very different graphs. The idea was to test what happens when a graph with relative URLs is moved to a different base. But the publisher URL stays the same in the RDF/XML case, whereas it changes in the other two serialisations. Furthermore, in the JSON-LD case the new base would have to be at least three levels deep for the reference ../../../ to refer to something. The Turtle root reference / has no such requirement.
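
To make the divergence concrete, here is a minimal sketch that resolves the references each serialiser emitted against a new base. It uses Python's standard `urllib.parse.urljoin` purely to illustrate RFC 3986 resolution; it is not the project's API, and `NEW_BASE` is a made-up example value.

```python
from urllib.parse import urljoin

# References emitted by the three Jena serialisers shown above.
SERIALISED = {
    "turtle":  {"subject": "ntriples/", "publisher": "/"},
    "json-ld": {"subject": "ntriples/", "publisher": "../../../"},
    "rdf/xml": {"subject": "ntriples/", "publisher": "http://www.w3.org/"},
}

# Hypothetical new base (an assumption for illustration), four segments deep.
NEW_BASE = "http://example.org/a/b/c/d/"


def resolve_all(base):
    """Resolve every emitted reference against `base`, per serialisation."""
    return {
        fmt: {role: urljoin(base, ref) for role, ref in refs.items()}
        for fmt, refs in SERIALISED.items()
    }


moved = resolve_all(NEW_BASE)
for fmt, iris in moved.items():
    print(fmt, iris["subject"], iris["publisher"])
```

Under this base the subject moves identically in all three cases, but the publisher becomes http://example.org/ for Turtle, http://example.org/a/ for JSON-LD, and stays http://www.w3.org/ for RDF/XML. Note that `urljoin` discards excess `..` segments when the base is shallower than three levels, so the reference then lands on the root; other resolvers may treat that case differently.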

bblfish changed the title from "parsers/serialisers behave differently when given a base" to "serious divergence in output between serialisers given a base" on Feb 27, 2023
bblfish changed the title from "serious divergence in output between serialisers given a base" to "serialisers output diverge given a base" on Feb 27, 2023

bblfish commented Feb 27, 2023

The simplest way to fix these failing tests is to deserialise the serialised content with the same base that was used to serialise it.
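
As a quick check of why the same base works, the sketch below resolves the three publisher references from the outputs above against the original base. It again relies on Python's `urllib.parse.urljoin` as a stand-in for an RFC 3986 resolver, not on any Jena or project API.

```python
from urllib.parse import urljoin

# The base that was passed to the serialisers.
BASE = "http://www.w3.org/2001/sw/RDFCore/"

# How each serialiser wrote the publisher IRI.
publisher_refs = {
    "turtle": "/",
    "json-ld": "../../../",
    "rdf/xml": "http://www.w3.org/",
}

# Resolving against the original base recovers one and the same IRI.
resolved = {fmt: urljoin(BASE, ref) for fmt, ref in publisher_refs.items()}
subject = urljoin(BASE, "ntriples/")

print(resolved)
print(subject)
```

All three forms resolve to http://www.w3.org/, and the subject resolves back to the original ntriples/ document URL, so the round-tripped graphs are equal regardless of which relative form a serialiser chose.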

The above shows that doing anything else is far more complicated than initially thought. Anyone who needs such advanced behaviour could write a function that changes each URL in a graph to a relative URL according to their needs, and then serialise the resulting relative graph.
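
A rough sketch of such a function follows, in Python for illustration only. The `IRI` wrapper, `relativise`, and `map_iris` are hypothetical names, the graph is modelled as a plain set of tuples rather than with any RDF library, and the policy (relativise only URLs that lie under the base) is just one of the choices a user might make.

```python
from typing import NamedTuple
from urllib.parse import urljoin


class IRI(NamedTuple):
    """Hypothetical wrapper marking a term as an IRI rather than a literal."""
    value: str


def relativise(url: str, base: str) -> str:
    """Return `url` relative to `base` if it lies under it, else unchanged."""
    if not base.endswith("/"):
        raise ValueError("this sketch expects a base ending in '/'")
    rest = url[len(base):] if url.startswith(base) else ""
    # Stay absolute when the URL is outside the base or equal to it, when the
    # remainder starts with '/', or when its first segment contains ':'
    # (which would be read as a scheme on re-parsing).
    if not rest or rest.startswith("/") or ":" in rest.split("/", 1)[0]:
        return url
    return rest


def map_iris(triples, fn):
    """Apply `fn` to every IRI in the graph, leaving literals untouched."""
    return {
        tuple(IRI(fn(t.value)) if isinstance(t, IRI) else t for t in triple)
        for triple in triples
    }


BASE = "http://www.w3.org/2001/sw/RDFCore/"
DC = "http://purl.org/dc/elements/1.1/"
doc = IRI(BASE + "ntriples/")
reference_graph = {
    (doc, IRI(DC + "creator"), "Dave Beckett"),
    (doc, IRI(DC + "creator"), "Art Barstow"),
    (doc, IRI(DC + "publisher"), IRI("http://www.w3.org/")),
}

relative = map_iris(reference_graph, lambda u: relativise(u, BASE))
restored = map_iris(relative, lambda u: urljoin(BASE, u))
```

With this policy the subject becomes ntriples/ while the publisher and the predicates stay absolute, and resolving against the same base restores the reference graph.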
