What's the citation model for DTS Resource ? #101
Comments
I'd actually add that I would recommend moving the namespace to "http://www.tei-c.org/ns/1.0#" instead of "http://www.tei-c.org/ns/1.0"; otherwise prefix expansion produces "http://www.tei-c.org/ns/1.0matchPattern", and that's awful. Or "http://www.tei-c.org/ns/1.0/", by the way. Both are fine with me. |
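To make the namespace point concrete, here is a minimal sketch of prefix expansion as plain string concatenation, which is how prefixed terms are generally resolved. The helper name `expand` is hypothetical and only there for illustration.

```python
TEI_NS = "http://www.tei-c.org/ns/1.0"


def expand(prefix_iri: str, local_name: str) -> str:
    """Expand a prefixed term by concatenating the prefix IRI and the local name."""
    return prefix_iri + local_name


# Without a trailing separator the two parts run together.
print(expand(TEI_NS, "matchPattern"))
# With "#" or "/" appended to the namespace, the term stays readable.
print(expand(TEI_NS + "#", "matchPattern"))
print(expand(TEI_NS + "/", "matchPattern"))
```

Either separator avoids the fused "1.0matchPattern" form; which one to pick is a convention choice.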
Opened an issue regarding this in distributed-text-services/specifications#101
|
I updated number 4 so that it's clearer. My answer here is mostly targeted at your 1: we actually can't infer it, because people may have a complex citation tree rather than a citation "line". Most CTS texts have a wonderful "book->poem->line" structure, but what about
Here, the CTS model would most probably fail. You can have different match patterns (let's say poems are numbered while paragraphs are |
My 2 cents:
|
Quick answers :
|
I'm confused about what this is meant to achieve (possibly I just haven't had enough coffee yet). Canonical References in TEI allow you to construct a custom URI referencing system, which is fine and good. But I'm missing the point of them here. Shouldn't the Reference API just tell you what sorts of references you can have? Why should the Collection API bother telling you how they're constructed? |
And @hcayless's comment makes me realize I misunderstood the point of this issue. I thought we were talking about the TEI refsDecl structure ... I clearly had either not enough or too much coffee myself at that point :-) To respond to Hugh's point, I could see the DTS API making this information available being useful for purposes of a chain of provenance or reproducibility. To reframe my answers to the above in the correct context:
|
The issue with the Reference API is that it throws references at you. But, for example, one of the very common things I do with CTS APIs is: Retrieve Text Metadata -> Retrieve all References at the Deepest Level (thanks to the Text Metadata) -> Retrieve passages based on the last results. Right now, our system cannot provide this kind of workflow because we have no place to state how the references of the text are structured. |
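The three-step workflow described above can be sketched as follows. Everything here is a hypothetical in-memory stand-in (`TEXT`, `METADATA`, `get_metadata`, `get_references`, `get_passage`); none of it is a real CTS or DTS client, and the "." separator in references is an assumption borrowed from CTS-style citations.

```python
# Hypothetical in-memory stand-in for a text service; not a real DTS or CTS client.
TEXT = {
    "1.1": "first line of poem 1",
    "1.2": "second line of poem 1",
    "2.1": "first line of poem 2",
}
# Assumed metadata shape: the ordered list of citation levels for the text.
METADATA = {"citation_levels": ["poem", "line"]}


def get_metadata():
    """Step 1: retrieve the text metadata."""
    return METADATA


def get_references(level):
    """Step 2: list the distinct references available at the given level."""
    refs = []
    for ref in TEXT:
        truncated = ".".join(ref.split(".")[:level])
        if truncated not in refs:
            refs.append(truncated)
    return refs


def get_passage(ref):
    """Step 3: retrieve the passage for one reference."""
    return TEXT[ref]


def fetch_deepest_passages():
    """Chain the three steps; the deepest level comes from the metadata."""
    deepest = len(get_metadata()["citation_levels"])
    return {ref: get_passage(ref) for ref in get_references(deepest)}


print(fetch_deepest_passages())
```

The point of the sketch is step 2: without some statement of the citation levels in the metadata, the client has no way to know what "deepest" means.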
Ok. I see the point of that use case, but I don't see yet how having the Collection API give you TEI cRefPatterns helps. Maybe I'm being dense. I see the problem, but I don't see how this is a solution. Wouldn't it be better to come up with some declarative representation of the available levels and how citations to them are constructed? Put another way, I see the point of the
What about something like:
{
"@id" : "urn:cts:latinLit:phi1103.phi001.lascivaroma-lat1",
"@type": "Resource",
"...": "...",
"dts:citeStructure": [
{
"dts:citePattern": "(\\w+)",
"dts:level": 1,
"label": "poem"
},
{
"dts:citePattern": "(\\w+)\\.(\\w+)",
"dts:level": 2,
"label": "line"
}
]
}
Seems like IRI templates might be better for this than regex patterns though... |
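As a rough illustration of what a client could do with the proposed `dts:citePattern` values, the sketch below matches a reference string against each pattern to find its level and label. The helper `classify` is hypothetical, and using a full-string match is an assumption, since the proposal does not say how patterns are anchored.

```python
import re

# The patterns from the proposal, after JSON unescaping ("\\w" becomes "\w").
cite_structure = [
    {"dts:citePattern": r"(\w+)", "dts:level": 1, "label": "poem"},
    {"dts:citePattern": r"(\w+)\.(\w+)", "dts:level": 2, "label": "line"},
]


def classify(reference, structure):
    """Return (level, label) for the first pattern matching the whole reference."""
    for entry in structure:
        if re.fullmatch(entry["dts:citePattern"], reference):
            return entry["dts:level"], entry["label"]
    return None


print(classify("1", cite_structure))      # poem-level reference
print(classify("1.2", cite_structure))    # line-level reference
print(classify("1.2.3", cite_structure))  # deeper than anything declared
```

With unanchored matching, the level 1 pattern would also match inside "1.2", which is one reason templates may be less error-prone than regexes here.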
I'd be totally for it. It's just that we talked about it being based on
cRefPattern, but your proposed structure is good to me.
|
I probably failed to properly think through the implications when we talked about it, but now I think it's better to just tell the client how citations are constructed than to give it implementation details it can't really use. |
I am not completely certain about the use of match patterns and replacement patterns (whatever the namespace or implementation is). On the other hand, having information about the "citation graph" structure, and metadata about it, seems important to me as well :) I think we have an agreement here, right? |
How about a URI template along these lines:
|
Recursivity and graph description of citation scheme:
{
"@id" : "urn:cts:latinLit:phi1103.phi001.lascivaroma-lat1",
"@type": "Resource",
"...": "...",
"dts:citeStructure": [
{
"label": "poem",
"dts:citeStructure": [
{
"label": "line"
}
]
}
]
} |
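A client could walk a nested structure like this one recursively. The sketch below is only an illustration; `label_paths` and `max_depth` are hypothetical helpers, not part of any DTS draft.

```python
resource = {
    "@id": "urn:cts:latinLit:phi1103.phi001.lascivaroma-lat1",
    "@type": "Resource",
    "dts:citeStructure": [
        {"label": "poem", "dts:citeStructure": [{"label": "line"}]}
    ],
}


def label_paths(structure, prefix=()):
    """List every citation path in the tree as a tuple of labels, parents first."""
    paths = []
    for node in structure:
        path = prefix + (node["label"],)
        paths.append(path)
        paths.extend(label_paths(node.get("dts:citeStructure", []), path))
    return paths


def max_depth(structure):
    """Depth of the deepest citation level (0 for an empty structure)."""
    return max(
        (1 + max_depth(node.get("dts:citeStructure", [])) for node in structure),
        default=0,
    )


print(label_paths(resource["dts:citeStructure"]))
print(max_depth(resource["dts:citeStructure"]))
```

Because each node carries its own children, a tree with sibling branches (say, poems and prose sections at the same level) is handled by the same code.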
I need to understand the requirements and use case better. If I am in a client, what is the sequence of steps I take when I encounter this data, and what do I want to do with it? I assume we have to be able to handle any kind of reference the same way, supporting CTS and other references that may be quite different. Are you looking for a way to describe the citation structure for a given resource? What do you want the client to do with it? A set of use cases written down in this issue would be helpful. |
"dts:citeStructure": [
{
"dts:level": 1,
"label": ["poem", "section"]
},
{
"dts:level": 2,
"label": ["line", "paragraph"]
}
] |
Three simple use cases :
|
I'd like to add a further use case, from a citation-matching perspective. It derives directly from what I'm doing with the CTS API via the Capitains resolvers to build HuCit, a knowledge base of classical texts and citable text units.
I give a concrete example of this use case at p. 108 of my PhD dissertation: |
I still have some misgivings about this. The example I mentioned in our last meeting was Ovid's Tristia, where you have a general structure of book, poem, line, but Book 2 is a single, almost 600-line poem. You'll note if you go to Book 2 in Perseus that it doesn't bother to chunk it the way it does (e.g.) Book 1 of the Aeneid, despite their similar length. I understand wanting to tell a client what the levels are, but I'd want to be able to do that in a useful way. As an API client, if I were deciding how to chunk things, I could certainly do it on Book / poem for most of the Tristia, but I'd want to (maybe) do it on Book / 20-30 lines for Book 2. |
This becomes more and more complicated, right? :)
"dts:citeStructure": [
{"@value": ["book", "poem", "line"]},
{"@value": ["book", "line"]}
]
But it would definitely start to make things complicated if you have more than, say, 3 or 4 different schemes. Again, if we want to have full details, maybe this would be up to the Navigation endpoint? |
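To see where the complication comes from, here is a small sketch that tries to label the parts of a reference against each declared scheme. `candidate_readings` is a hypothetical helper, and splitting references on "." is an assumption borrowed from CTS-style citations.

```python
schemes = [
    {"@value": ["book", "poem", "line"]},
    {"@value": ["book", "line"]},
]


def candidate_readings(reference, schemes):
    """Pair each part of the reference with the labels of every scheme deep enough to hold it."""
    parts = reference.split(".")
    readings = []
    for scheme in schemes:
        labels = scheme["@value"]
        if len(parts) <= len(labels):
            readings.append(list(zip(labels, parts)))
    return readings


# Three parts fit only the book/poem/line scheme.
print(candidate_readings("1.1.5", schemes))
# Two parts fit both schemes: is "15" a poem or a line?
print(candidate_readings("2.15", schemes))
```

A flat list of schemes tells the client which shapes exist, but not which shape applies to a given part of the text, so a two-part reference stays ambiguous without more detail from somewhere else.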
Option I back for next week is #101 (comment) |
Action item : do a pull request with comment on top with |
Fixed in #104 |
In the discussion we had, we discussed having tei:refsDecl in the metadata of Resource in the Collection API. Basically, my examples covered that in this way:
Which would result in having the object
I open this issue because we skipped really quickly over it in talks, agreed upon it only generally.
I think I'd have the following question :
2 (lines within poem). This is something that we cannot actually capture by the simple expression of a match pattern. In the CapiTainS draft guidelines, I used the attribute tei:corresp for that, but maybe we should use something like dts:depth?
@type to tei:type in the examples.
Note that while some of that might seem to be too much, they are all partial responses to real problems a parser has in understanding the structure of the text...