Request: make available formats explicit #225

Closed
geoffroy-noel-ddh opened this issue Oct 8, 2022 · 6 comments
Labels
Collection Endpoint, Document Endpoint

Comments

@geoffroy-noel-ddh

geoffroy-noel-ddh commented Oct 8, 2022

Hi,

Tell me if I'm wrong but I don't see where in the specification an implementation can expose the formats it supports. Is that some implicit knowledge an API client is currently assumed to have about the services it calls (i.e. a text viewer must somehow know that a specific DTS implementation offers HTML or plain text as alternative formats)?

If that feature is ever considered, for maximum granularity/flexibility, it might be preferable to declare acceptable formats at the document level (e.g. in the members returned by the Collection response).

@awagner-mainz
Contributor

A defensive strategy could rely on the idea that a "REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API)." (https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven) But I agree that this would still leave plaintext, xml, html, ebook formats, pdf, or even image formats for the corresponding scan images as potential representations. I see two (not mutually exclusive) standard approaches:

  1. Content negotiation. This relies on the client specifying what formats it would prefer, and the server responding with the best format it can provide.

  2. HTTP Link headers. A resource URI (document endpoint) would list all available formats as multiple HTTP Link headers: each link's URI would contain a format parameter for a media type, and its type attribute would specify the media type available behind that URI.

Both methods work without adding fields and information to the response body. IMHO, where standards deal with things on a higher level of the API stack, they should be solved there (i.e. HTTP or HATEOAS/RESTful behaviour, maybe Hydra also has something to say about it?). However, I am not sure if this means that the spec should not mention/discuss this at all.
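To make the two approaches concrete, here is a minimal Python sketch of how a server might handle them. It is only an illustration under stated assumptions: the function names, the example URL, and the list of available media types are invented, and the `Accept` parsing is simplified (it ranks by q-value only and ignores the specificity rules of full HTTP content negotiation).

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((fields[0].lower(), q))
    # sorted() is stable, so equal q-values keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept: str, available: List[str]) -> Optional[str]:
    """Approach 1: pick the best available media type, or None (HTTP 406)."""
    for wanted, q in parse_accept(accept):
        if q <= 0:
            continue
        for offered in available:
            candidate = offered.lower()
            if wanted == candidate or wanted == "*/*":
                return offered
            # Type wildcard such as "text/*" matches any "text/..." subtype
            if wanted.endswith("/*") and candidate.startswith(wanted[:-1]):
                return offered
    return None


def link_headers(document_url: str, available: List[str]) -> List[str]:
    """Approach 2: one Link header value per alternative representation.

    The "format" query parameter name follows the comment above; the URL
    layout is a hypothetical example.
    """
    sep = "&" if "?" in document_url else "?"
    values = []
    for media_type in available:
        query = urlencode({"format": media_type})
        values.append(
            f'<{document_url}{sep}{query}>; rel="alternate"; type="{media_type}"'
        )
    return values
```

With these definitions, `negotiate("text/html;q=0.9, application/tei+xml", ["text/plain", "text/html"])` returns `"text/html"`, because the preferred TEI representation is not on offer. A client that reads the `Link` values can discover formats it did not know to ask for, which content negotiation alone does not allow.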

@PonteIneptique
Member

Hi,
I think we already discussed this internally, but I won't speak for everyone here.

I believe Content negotiation is the go-to for this, but HTTP Link headers could be a neat way to supplement it.

PonteIneptique added the Document Endpoint and Collection Endpoint labels Oct 13, 2022
@geoffroy-noel-ddh
Author

geoffroy-noel-ddh commented Oct 13, 2022

Relying on underlying protocol layers where possible would be nice, although the requested format is already part of the DTS layer. One disadvantage of the content negotiation approach (if I understand correctly!) is that the client would have to know in advance the range of formats it can ask for, which may exclude discovery of new or custom formats. It also doesn't allow the client to filter a collection by format.

Example use case: a web-based text viewer which can only show HTML texts from a list. With the two options suggested above I think the viewer would have to probe each document in a collection individually in order to show the user a shortlist. If the collection is long, the process will be slow and won't scale well (e.g. EDH has a collection with 80,000 docs). Alternatively, showing the complete list of documents in the collection and letting the user probe them one by one leads to a frustrating experience.
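The scaling concern can be sketched in a few lines of Python. This is a simulation, not client code for any real service: `probe` stands in for one HTTP round trip per document (for example, reading its `Link` headers), and the catalogue contents are invented.

```python
from typing import Callable, Iterable, List, Tuple


def shortlist_by_probing(
    doc_ids: Iterable[str],
    probe: Callable[[str], List[str]],
    wanted: str = "text/html",
) -> Tuple[List[str], int]:
    """Build a shortlist by asking each document which formats it offers.

    Returns the matching ids and the number of simulated requests made.
    """
    requests_made = 0
    shortlist = []
    for doc_id in doc_ids:
        requests_made += 1  # one simulated round trip per document
        if wanted in probe(doc_id):
            shortlist.append(doc_id)
    return shortlist, requests_made


# Invented catalogue: every fourth document also offers HTML.
FAKE_CATALOGUE = {
    f"doc{i}": ["application/tei+xml"] + (["text/html"] if i % 4 == 0 else [])
    for i in range(8)
}

shortlist, cost = shortlist_by_probing(FAKE_CATALOGUE, FAKE_CATALOGUE.__getitem__)
print(shortlist, cost)  # ['doc0', 'doc4'] 8
```

The request count always equals the collection size, so a collection of 80,000 documents would need 80,000 round trips before the viewer could show anything.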

PonteIneptique added a commit to monotasker/specifications that referenced this issue Feb 9, 2024
- Renamed `?id` query parameter identifying a `Resource` to `?resource`
- Renamed the `?format` query parameter for content negotiation to `?mediaType` (Implements parts of distributed-text-services#225)
- Renamed the `<dts:fragment>` XML node for XML/TEI responses to `<dts:wrapper>`.
- Added attributes to the `<dts:wrapper>` element to allow for identifying specific nodes within the wrapped TEI (Fixes distributed-text-services#133)
- Removed the requirement for `Link` and `Media-Type` HTTP Response Headers
  - Implementations still **should** provide such capacity.
- Clarified error codes and the conditions under which errors are generated.
- Removed URI templates as per Hydra definition.
- Added implementation of multiple trees through the `?tree` parameter (fixes distributed-text-services#142, distributed-text-services#223, distributed-text-services#202)
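
As a rough illustration of the renamed query parameters, a client might build a Document endpoint request as follows. The base URL, the resource identifier, and the helper name are hypothetical; only the parameter names `resource` and `mediaType` come from the commit message above.

```python
from typing import Optional
from urllib.parse import urlencode


def document_url(base: str, resource: str, media_type: Optional[str] = None) -> str:
    """Build a request URL using the renamed `resource` and `mediaType` parameters."""
    params = {"resource": resource}
    if media_type:
        # Only ask for a specific representation when the caller wants one
        params["mediaType"] = media_type
    return f"{base}?{urlencode(params)}"


# Hypothetical endpoint and identifier, for illustration only.
print(document_url("https://example.org/dts/document", "urn:example:doc1", "text/html"))
# https://example.org/dts/document?resource=urn%3Aexample%3Adoc1&mediaType=text%2Fhtml
```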
@PonteIneptique
Member

During the RC Workshop in Durham, it was decided that a supportedMediaTypes property would be part of the Resource object in the Navigation and Collection endpoints.

Implementation in the specs will be visible in #238

@monotasker
Collaborator

The property mediaTypes (not supportedMediaTypes) has now been added to the root return object from the Collection and Navigation endpoints in release 1-alpha1
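
A client could then read the property straight from the parsed response instead of probing. The sketch below assumes the response is JSON and that `mediaTypes` holds a list of media type strings; the sample object and the helper names are invented for illustration and are not copied from the specification.

```python
import json
from typing import Any, Dict, List, Optional


def supported_media_types(resource: Dict[str, Any]) -> List[str]:
    """Return the declared media types, or an empty list if none are declared."""
    value = resource.get("mediaTypes", [])
    # Tolerate a single string as well as a list (an assumption, for robustness)
    return list(value) if isinstance(value, list) else [value]


def first_displayable(resource: Dict[str, Any], viewer_accepts: List[str]) -> Optional[str]:
    """Return the viewer's most preferred media type that the resource declares."""
    declared = supported_media_types(resource)
    for media_type in viewer_accepts:
        if media_type in declared:
            return media_type
    return None


# Invented sample response body, not taken from a real DTS service.
SAMPLE = json.loads(
    '{"@id": "doc1", "@type": "Resource",'
    ' "mediaTypes": ["application/tei+xml", "text/html"]}'
)

print(first_displayable(SAMPLE, ["text/html", "text/plain"]))  # text/html
```

A viewer that only renders HTML can use a check like this to decide whether to list a resource, without issuing one request per document.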

@monotasker
Collaborator

We published the resolution of this issue during the tech committee meeting on 2024-03-08
commit a0db8ca
release https://github.com/distributed-text-services/specifications/releases/tag/1-alpha1

This is an alpha release and we are looking for feedback!

Projects
Status: Accepted