-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use canonical urls for entity references when serializing jsonld #887
Comments
👍
… On Jul 30, 2018, at 10:55 AM, dannylamb ***@***.***> wrote:
Right now, entity references have URLs pointing to their jsonld counterparts, but we should probably be writing out the canonical URL. The jsonld is an alternate representation of the node/media/term, but the triples are describing the canonical resource, independent of any other serialization.
In other words, with triples edited for brevity,
<http://localhost:8080/fcrepo/rest/4f/eb/ac/c6/4febacc6-20c9-4e50-8077-30bf0e8b6da5>
...
schema:author
<http://localhost:8000/user/1?_format=jsonld>
;
schema:additionalType
<http://localhost:8000/taxonomy/term/8?_format=jsonld>
;
schema:additionalType
<http://localhost:8000/taxonomy/term/1?_format=jsonld>
;
...
should become
<http://localhost:8080/fcrepo/rest/4f/eb/ac/c6/4febacc6-20c9-4e50-8077-30bf0e8b6da5>
...
schema:author
<http://localhost:8000/user/1>
;
schema:additionalType
<http://localhost:8000/taxonomy/term/8>
;
schema:additionalType
<http://localhost:8000/taxonomy/term/1>
;
...
The change is easy enough, just delete the setRouteParameter bit in https://github.com/Islandora-CLAW/jsonld/blob/c23337950a0d310f6ae032c27557e04c171a8daf/src/Normalizer/ContentEntityNormalizer.php#L258. I'm proposing we do it there because I think this should apply to everything, and is really independent of the whole "Islandora flushing to Fedora" process.
There are downstream ramifications, though. We'll need to update Milliner to be aware of the difference. It looks for the _format=jsonld url in a handful of places, and it also uses the _format=jsonld url when indexing the resource in Gemini.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
@dannylamb maybe missing a piece there so have a few questions: So currently The following line, the one your propose to remove, only adds the If the difference is just the argument after the base URL, also, would it not easier, if you really need to remove it, to do that as a post process when parsing it on milliner? |
@DiegoPino At a practical level, I don't think the data should be indexed in Gemini, the triplestore, and Fedora with But for me, semantically, my thinking is that the jsonld representation contains triples about the web resource, not triples about the jsonld representation of the web resource. Am I range 14'ing it here a bit? |
@dannylamb not sure what you mean with at a practical level, but interpret this/ guess the GET argument interferes with some preconceptions of what a valid URL should look like? Or with some processing needs? We can discuss it the URL is opaque or even semantically significant as an URL (true!) but at a w3c, rdf specs level with or without The issue i see with removing the ?_format=jsonld piece is that, not only web-semantically but also human-semantically, a published web resource, a.k.a an HTML rendered node can contain way more stuff that the entity/node described by the original json-ld. A Node can contain embedded (on display) the related media,viewers, tables of contents of other stuff, forms, comments, images, some blocks, etc. So the real issue here is the lack of content negotiation that does not allow the same URL to output different serialization v/s display eyecandy depending on how you request the URL without having to use the argument. And that leads for me to a community question that we need to resolve: WE need a canonical web resource URL, not the one drupal provides but our own, unique, stable,consistent one. Happens that incremental node ids (node/1) are an issue for any repository. Drupal 8 right now does not expose existing entity UUIDs as a autoload argument on its routing system (core), but i remember we got this working in CLAW on one of the first iterations some 2 years ago, so it is possible! Since content negotiation is not build into by default but any EventSubscriberInterface derived class that is subscribed to Symfony HTTP Kernel events (like onRequest) can have access and set to the So if we provide a real canonical in terms of reliable, consistent URI for our islandora resources (if there is such thing actually) then we could use content negotiation to redirect internally and generated the correct output solving this issue. Can we open this discussion to other folks? I kinda find it fundamental to define what our canonical will be. Thanks! |
The way I conceive it is that the node URL is the 'thing' and should be mapped in Gemini and the triplestore; it 'happens' to return a default serialization of HTML (cause that is what people/browsers expects). Yeah, the _format query parameter is a poor substitution for content negotiation (side-long glance at Drupal and Symphony). In addition to @DiegoPino's suggestion of generating a RedirectResponse, supposedly we could use an Apache rewrite condition to translate accept headers into the _format query parameter to plug this hole although I haven't tested it. There is a small chance Drupal will eventually support content negotiation again. Since there are a number of options that can be built on top of the base URL I wouldn't want to rely on the _format parameter for our URLs. |
Jumping in here to agree with both sides of the discussion, but to note that Drupal's use of But Drupal != semantic datastore so perhaps we need to accept this is a lighter-weight repository and if you want a full (LDP|graph|whatever) implementation you could expose yours to the web? |
FWIW, our REST api allows an informed client to at least find the
And I'll throw another option onto the table. We can embed the JSONLD in the html, so that they are one in the same and don't need separate urls. Then Google would actually be reading your metadata at least. |
JSON-LD in the html seems like and overkill. RDF is embeded in the html as
rdfa already so solved for seo/discoverability purpouses i guess. So to be
honest if you all really need to get rid of the format argument for
whatever purpose that is needed i’m ok with it. Would just love if that
would be done as an alter of the jsonld serialization from within
islandora(thay thing you wrote) and not at the serializer level, at least
until drupal has content negotiation
El El mar, 7 de ago. de 2018 a las 11:07, dannylamb <
[email protected]> escribió:
FWIW, our REST api allows an informed client to at least find the _format
routes from the canonical route. I'm not suggesting it as a general
replacement for, or touting that it's better than, *actual* conneg, but
all the available formats are advertised as "alternate" link headers on all
GET and HEAD responses.
***@***.***:~$ curl -I http://localhost:8000/node/5
HTTP/1.1 <http://localhost:8000/node/5HTTP/1.1> 200 OK
Date: Tue, 07 Aug 2018 14:47:22 GMT
Server: Apache/2.4.18 (Ubuntu)
X-Powered-By: PHP/7.0.30-0ubuntu0.16.04.1
Cache-Control: must-revalidate, no-cache, private
Link: <http://purl.org/coar/resource_type/c_c513>; rel="tag"; title="Image"
Link: <http://localhost:8000/media/13>; rel="related"; title="Preservation Master"
Link: <http://localhost:8000/media/14>; rel="related"; title="Service File"
Link: <http://localhost:8000/media/15>; rel="related"; title="Thumbnail"
Link: <http://localhost:8000/node/5?_format=jsonld>; rel="alternate"; type="application/ld+json"
Link: <http://localhost:8000/node/5?_format=json>; rel="alternate"; type="application/json"
Link: <http://localhost:8000/node/5>; rel="alternate"; hreflang="en"
Link: </node/5>; rel="canonical"
Link: </node/5>; rel="shortlink"
Link: </node/5>; rel="revision"
Link: </node?node=5>; rel="create"
X-Drupal-Dynamic-Cache: MISS
X-UA-Compatible: IE=edge
Content-language: en
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Vary:
X-Generator: Drupal 8 (https://www.drupal.org)
X-Drupal-Cache: MISS
Content-Type: text/html; charset=UTF-8
And I'll throw another option onto the table. We can embed the JSONLD in
the html, so that they are one in the same and don't need separate urls.
Then Google would actually be reading your metadata at least.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#887 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AGn8536NI88MiM_UkzcCd0Ypn9MTcVG4ks5uOa0XgaJpZM4VmkRu>
.
--
Diego Pino Navarro
Digital Repositories Developer
Metropolitan New York Library Council (METRO)
|
A reminder that this will need to be reflected in documentation of the object model/data flow. |
Testing InstructionsSetup
Confirm Prior Behaviour
Configure new behaviour
Confirm new behaviour
|
We'll follow all of this up with a claw playbook PR to make this configuration default. That way, folks can just get the new code and not be affected and configure it if they want to. But for new users, claw-playbook is just going to do this for them. But these PRs can get merged in without the claw-playbook PR after we confirm both before and after behaviour. Documentation PR pending as well... |
@rosiel doc PR is up, too. Just waiting on a successful vagrant build to make the claw-playbook PR... |
Islandora-Devops/islandora-playbook#106 is up and ready to test. I had to move a bunch of branches over to the foundation repos so that I could get at them with ansible. I've also moved all the PRs over to point at those branches so we're merging what we're testing in case I push up more changes. New Testing Instructions (only confirms new behaviour, not old)After installing from Islandora-Devops/islandora-playbook#106:
|
OK @Islandora-CLAW/committers Islandora-Devops/islandora-playbook#106 is ready to go. I've successfully installed using it and have a box with no more |
I have reviewed the Crayfish, Islandora, claw-playbook and ansible-role-crayfish PRs. Someone else needs to review the JsonLD one. |
I just approved the JSON-LD one. Are these getting merged in any particular order? |
@seth-shaw-unlv JSON-LD needs to happen first, then a version of it should be tagged, then the Islandora PR needs to be updated to reference that new version of JSON-LD. Crayfish is in, so that PR needs to be updated to reference the new version I tagged. ( |
Alrighty then, I give you JSON-LD 0.2.0. |
@seth-shaw-unlv @whikloj I'll update the islandora pull to reference it. |
well... and all the other stuff too. |
islandora-deprecated/ansible-role-crayfish#23 and #134 have been updated to use 0.2.0 for both jsonld and crayfish. When those are in, I'll update Islandora-Devops/islandora-playbook#106 |
Resolved via Islandora-Devops/islandora-playbook@0036464 |
Right now, entity references have URLs pointing to their jsonld counterparts, but we should probably be writing out the canonical URL. The jsonld is an alternate representation of the node/media/term, but the triples are describing the canonical resource, independent of any other serialization.
In other words, with triples edited for brevity,
should become
The change is easy enough, just delete the
setRouteParameter
bit in https://github.com/Islandora-CLAW/jsonld/blob/c23337950a0d310f6ae032c27557e04c171a8daf/src/Normalizer/ContentEntityNormalizer.php#L258. I'm proposing we do it there because I think this should apply to everything, and is really independent of the whole "Islandora flushing to Fedora" process.There are downstream ramifications, though. We'll need to update Milliner to be aware of the difference. It looks for the
_format=jsonld
url in a handful of places, and it also uses the_format=jsonld
url when indexing the resource in Gemini.The text was updated successfully, but these errors were encountered: