Drop http://www.w3.org/1999/02/22-rdf-syntax-ns#namedNode ? #1

Closed
RubenVerborgh opened this issue May 14, 2020 · 12 comments

Comments

@RubenVerborgh

The URI does not resolve, and it's a long string comparison to be performed. Suggestion: just leave empty.

And for blank, use a single char like _ or so.

@rescribet
Contributor

Yeah, this basically is a published version of an experimental internal model. I found it disappointing that I couldn't find them on LOV.

We want to have some kind of IRI though (which should indeed resolve), since that lets the parser simply switch on the datatype unconditionally. AFAIK (most) JS engines use interned strings, so the comparison should come down to a simple reference comparison.

@RubenVerborgh
Author

RubenVerborgh commented May 14, 2020

Well, an identical series of characters parsed n times will still occupy n times the space. Interning doesn’t reconcile those. So if it must be a URI (and I don’t see why that would provide an advantage, certainly not memory- or speed-wise), perhaps make it a short URN.

You can still switch unconditionally, just much faster.

@joepio
Member

joepio commented May 26, 2020

I agree that the current URL should be dropped, as it does not resolve.

In most RDF serialization formats, the default datatype for literals is String. If HexTuples were to use the NamedNode (or the URI) as the default datatype, it should always serialize strings with xsd:string. That way, NamedNode Tuple statements do not need an IRI in the datatype.

If it really needs an IRI, I suggest linking to a NamedNode concept in this very spec - it would be a sensible place to resolve to. Or maybe the URI spec?

joepio added a commit that referenced this issue Jun 4, 2020
- Support blank nodes in graph #2
- Default to URI datatype and drop the unresolving named node URL #1
@joepio joepio mentioned this issue Jun 4, 2020
@rescribet
Contributor

xsd:anyURI is a good candidate replacement for rdf:namedNode

@RubenVerborgh
Author

No, that’s a URI, not the node named by that URI.

@rescribet
Contributor

rescribet commented Jun 19, 2020

Perhaps I'm misunderstanding, but

[s, p, v, dt, l, g]

> that’s a URI

rdf:namedNode is currently in the dt position, which indicates the datatype of v

> not the node named by that URI

No, that'd be the v position

So together

[s, p, "schema.org/name", "http://www.w3.org/2001/XMLSchema#anyURI", l, g]

Though rereading that part of the spec more closely, xsd:anyURI values are allowed to be relative, which might pose a problem

@RubenVerborgh
Author

RubenVerborgh commented Jun 19, 2020

There's a difference between

  • <http://example.org>
  • "http://example.org"^^<http://www.w3.org/2001/XMLSchema#anyURI>

Both exist and they are not the same.

Even "foo"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#namedNode> exists (despite the URI not resolving).

So I relaunch my suggestion of "" and "_", which will make a tremendous performance difference. Bonus: you can then even switch on dt.length and not on its contents, because no valid URIs of length 0 and 1 exist.
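
A minimal sketch of the length-based dispatch suggested above, in TypeScript. The tuple layout `[s, p, v, dt, l, g]` comes from this thread; the type name `HexTuple`, the function `classifyObject`, and the returned labels are hypothetical and not taken from any published parser.

```typescript
// Layout used in this thread: [subject, predicate, value, datatype, language, graph].
type HexTuple = [string, string, string, string, string, string];

type ObjectKind = "namedNode" | "blankNode" | "literal";

// Assumption: "" in the dt slot marks a named node and a single character
// such as "_" marks a blank node, as proposed above. Any longer dt is
// treated as the datatype IRI of a literal.
function classifyObject(tuple: HexTuple): ObjectKind {
  const dt = tuple[3];
  // Only the length is inspected, never the characters themselves.
  switch (dt.length) {
    case 0:
      return "namedNode";
    case 1:
      return "blankNode";
    default:
      return "literal";
  }
}
```

Note that this sketch accepts any one-character marker as a blank node; a stricter parser might also check that the character is `_`.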

@rescribet
Contributor

> There's a difference between
>
>   • <http://example.org>
>   • "http://example.org"^^<http://www.w3.org/2001/XMLSchema#anyURI>

Hmm, seems bizarre to me. I intuitively figured that, since 'everything is a resource', literals are just a convenient way to point to certain resources which lack a uri space / are too complex for the uri spec. Thinking of the datatype iri as the 'scheme' and the value as the 'path'.

I'll go and rethink some things ;)

@RubenVerborgh
Author

Here's a quick performance test: https://gist.github.com/RubenVerborgh/1b70a456230027468a715b54afb59242

On my machine:

  • 2.5M triples with full URIs for dt: 4.6s
  • 2.5M triples with single chars for dt: 2.8s
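
For readers who want to try a comparison of this kind themselves, here is a rough TypeScript sketch. It is not a reproduction of the linked gist, whose contents are not shown in this thread: the helper names `buildLines` and `parseAndCount`, the example IRIs, and the choice to time `JSON.parse` over one-tuple-per-line input are all assumptions.

```typescript
const NAMED_NODE_IRI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#namedNode";

// Builds n JSON-encoded tuples, one per array entry, with `dt` in the datatype slot.
function buildLines(n: number, dt: string): string[] {
  const lines: string[] = [];
  for (let i = 0; i < n; i++) {
    lines.push(
      JSON.stringify([
        "http://example.org/s" + i,
        "http://example.org/p",
        "http://example.org/o" + i,
        dt,
        "",
        "",
      ])
    );
  }
  return lines;
}

// Parses every line, counts tuples whose datatype equals `dt`,
// and reports the elapsed wall-clock time in milliseconds.
function parseAndCount(lines: string[], dt: string): { matches: number; ms: number } {
  const start = Date.now();
  let matches = 0;
  for (const line of lines) {
    const tuple = JSON.parse(line) as string[];
    if (tuple[3] === dt) matches++;
  }
  return { matches, ms: Date.now() - start };
}
```

Running `parseAndCount(buildLines(n, NAMED_NODE_IRI), NAMED_NODE_IRI)` and `parseAndCount(buildLines(n, ""), "")` with the same `n` gives two timings to compare. The numbers depend on the engine and the machine, so they should not be expected to match the figures above.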

@rescribet
Contributor

rescribet commented Jun 19, 2020

> literals are just a convenient way to point to certain resources which lack a uri space / are too complex for the uri spec

*Or vice versa, that URIs are used to determine points in irregularly defined spaces

@RubenVerborgh
Author

RubenVerborgh commented Jun 19, 2020

> Hmm, seems bizarre to me.

See "http://example.org"^^<http://www.w3.org/2001/XMLSchema#anyURI> as a shortcut for

_:x a Literal.
_:x _:value "http://example.org".
_:x _:dataType <http://www.w3.org/2001/XMLSchema#anyURI>.

The syntax in fact hints at this interpretation, with ^ representing reverse path traversal in N3.

Full answer in https://www.w3.org/TR/rdf11-mt/
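
The distinction can also be sketched as a small data model in TypeScript. This mirrors the three-triple reading above: a literal carries a value and a datatype, while a named node is only an IRI. The types `NamedNode`, `Literal`, `Term` and the helper `sameTerm` are hypothetical names for illustration, and language tags are left out for brevity.

```typescript
// Hypothetical term model, for illustration only.
type NamedNode = { kind: "namedNode"; iri: string };
type Literal = { kind: "literal"; value: string; datatype: NamedNode };
type Term = NamedNode | Literal;

const XSD_ANY_URI = "http://www.w3.org/2001/XMLSchema#anyURI";

function namedNode(iri: string): NamedNode {
  return { kind: "namedNode", iri };
}

function literal(value: string, datatype: string): Literal {
  return { kind: "literal", value, datatype: namedNode(datatype) };
}

// Two terms are the same only if they are the same kind of term
// and all of their components match.
function sameTerm(a: Term, b: Term): boolean {
  if (a.kind === "namedNode" && b.kind === "namedNode") {
    return a.iri === b.iri;
  }
  if (a.kind === "literal" && b.kind === "literal") {
    return a.value === b.value && a.datatype.iri === b.datatype.iri;
  }
  return false;
}
```

Under this model, `namedNode("http://example.org")` and `literal("http://example.org", XSD_ANY_URI)` contain the same characters but are never the same term.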

joepio added a commit that referenced this issue Dec 6, 2021
Resolve RDF namespace collisions

#1
@rescribet
Contributor

Closed in https://github.com/ontola/hextuples-parser/releases/tag/v2.0.0

Has been replaced with globalId and localId respectively
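
As a rough illustration of what the replacement might look like to a consumer, here is a TypeScript sketch. It assumes `globalId` and `localId` appear as plain strings in the `dt` position, where the old IRI used to sit; the type name `HexTupleV2`, the function `objectKind`, and the example IRIs are hypothetical and not taken from the v2.0.0 source.

```typescript
// Same six-slot layout as before: [s, p, v, dt, l, g].
type HexTupleV2 = [string, string, string, string, string, string];

// Assumption: "globalId" marks a named node and "localId" a blank node;
// anything else in the dt slot is read as a literal's datatype IRI.
function objectKind(tuple: HexTupleV2): "namedNode" | "blankNode" | "literal" {
  switch (tuple[3]) {
    case "globalId":
      return "namedNode";
    case "localId":
      return "blankNode";
    default:
      return "literal";
  }
}

// Placeholder data showing the three cases.
const examples: HexTupleV2[] = [
  ["http://example.org/s", "http://example.org/p", "http://example.org/o", "globalId", "", ""],
  ["http://example.org/s", "http://example.org/p", "_:b0", "localId", "", ""],
  ["http://example.org/s", "http://example.org/p", "hello", "http://www.w3.org/2001/XMLSchema#string", "", ""],
];
```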


3 participants