\documentclass[11pt,a4paper]{ivoa}
\input tthdefs
\title{IVOA DataLink}
% see ivoatexDoc for what group names to use here
\ivoagroup{DAL}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/PatrickDowler]
{Patrick Dowler}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/FrancoisBonnarel]
{Fran\c{c}ois Bonnarel}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/LaurentMichel]
{Laurent Michel}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/MarkusDemleitner]
{Markus Demleitner}
\author[http://www.ivoa.net/twiki/bin/view/IVOA/MarkTaylor]
{Mark Taylor}
\editor[http://www.ivoa.net/twiki/bin/view/IVOA/FrancoisBonnarel]
{Fran\c{c}ois Bonnarel}
\editor[http://www.ivoa.net/twiki/bin/view/IVOA/PatrickDowler]
{Patrick Dowler}
% \previousversion[????URL????]{????Concise Document Label????i}
\previousversion[https://www.ivoa.net/documents/DataLink/20230413/]{PR-DataLink-1.1-20230413}
\previousversion[https://www.ivoa.net/documents/DataLink/20211115/]{WD-DataLink-1.1-20211115}
\previousversion[https://www.ivoa.net/documents/DataLink/20150617/]{DataLink-1.0}
\newcommand{\blinks}{\{links\}}
\newcommand{\attval}[2]{#1={\allowbreak}{"}#2{"}}
\newcommand{\rfcmust}{\textbf{must}}
\newcommand{\rfcshould}{\textbf{should}}
\newcommand{\rfcmay}{\textbf{may}}
\newcommand{\rfcrecommended}{\textbf{recommended}}
\newcommand{\rfcoptional}{\textbf{optional}}
\begin{document}
\begin{abstract}
This document describes how to link data discovery metadata
to the data itself, to further detailed metadata, to related
resources, and to services that perform operations on the data. The web
service capability supports a drill-down into the details of a specific
dataset and provides a set of links to the dataset file(s) and related
resources. This specification also includes a VOTable-specific method
of providing descriptions of one or more services and their input(s),
usually using parameter values from elsewhere in the VOTable document.
Providers are able to describe services that are relevant to the records
(usually datasets with identifiers) by including service descriptors in
a result document.
\end{abstract}
\section*{Acknowledgments}
The authors would like to thank all the participants in DAL-WG discussions
for their ideas, critical reviews, and contributions to this document.
\section*{Conformance-related definitions}
The words ``\rfcmust'', ``\rfcshould'', ``\rfcmay'', ``\rfcrecommended'', and
``\rfcoptional'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.
This document uses curly braces (e.g. \{name\}) to refer to a named concept
such as a web service endpoint where the text requires a logical name but
the actual name in a service implementing the standard is not restricted.
The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{http://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.
\section{Introduction}
This specification defines mechanisms for connecting data items
discovered via one service to related data products and web services.
The {\em links\/} capability is a web service capability
for drilling
down from a discovered data item such as an identifier,
a source in a catalog, or any other data item. In the first case
(typically an IVOA publisher dataset identifier) it allows
clients to find ancillary resources like progenitors, derived data
products, or alternate representations of the data, and
services that can act upon the data (usually without having to download
the entire dataset). The expected usage is for DAL (Data Access Layer)
data discovery services (e.g.\ a TAP service \citep{2010ivoa.spec.0327D}
with the ObsCore \citep{2017ivoa.spec.0509L} data
model or one of the simple DAL services) to provide an identifier that
can be used to query the associated DataLink capability. The DataLink
capability will respond with a list of links that can be used to access
the data. Here we specify the calling interface for the capability and
the response, which lists the links and provides both concrete metadata
and a semantic vocabulary so clients can decide which links to use.
The {\em service descriptor resource\/}
uses the metadata features of VOTable to
embed service metadata along with tabular data, such as would be obtained
by querying a simple DAL data discovery service or a TAP service. This
service metadata tells the client how to invoke a service and, for those
registered in an IVOA registry, how to look up additional information
about the service. The service provider can use this mechanism to tell
clients about services that can be invoked to access the discovered
data item in some way: get additional metadata, download the data, or
invoke services that act upon the data files. These services may be
IVOA standard services or custom services from the data providers.
We expect that the {\em service descriptor resource\/}
mechanism will be the primary way that clients will find and
use the {\em links\/} capability from data discovery
responses.
\subsection{The Role in the IVOA Architecture}
DataLink is a data access protocol in the IVOA architecture whose purpose
is to provide a mechanism to link resources found via one service to
resources provided by other services.
\begin{figure}[ht]
\centering
\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
\caption{Architecture diagram for this document}
\label{fig:archdiag}
\end{figure}
Although not shown in Figure \ref{fig:archdiag},
any implementation of an access protocol could
make use of DataLink to expose resources. DataLink services conform to
the Data Access Layer Interface specification
(DALI, \citet{2017ivoa.spec.0517D}),
including the
Virtual Observatory Support Interfaces resources
(VOSI, \citet{2017ivoa.spec.0524G}).
DataLink services use VOTable \citep{2019ivoa.spec.1021O}
as the default output format both for successful
output and to return error documents.
DataLink specifies a standardID for itself which, as defined in VOResource
\citep{2018ivoa.spec.0625P}, is used to identify compliant service
capabilities in Registry and VOSI metadata.
It also specifies how to
include standardID values in the response to describe links to services.
DataLink includes a description of how data discovery services can include
the link to the associated DataLink service in VOTable. VOTable is
also the default output format for the DataLink web service capability.
\subsection{Motivating Use Cases}
Below are some of the more common use cases that have motivated the
development of the DataLink specification. While this list is not complete,
it helps in understanding the problem area covered by this specification.
\subsubsection{Multiple Files per Dataset}
\label{sec:useMultiFile}
It is very common for a single dataset to be physically manifest as
multiple files of various types. With a DataLink web service, the client
can drill down using a discovered dataset identifier and obtain links to
download one or more data files. For static data files, the DataLink
service will be able to provide a URL as well as the content-type and
content-length (file size) for each download.
\subsubsection{Progenitor Dataset}
In some cases, the data provider may wish to provide one or more links to
progenitor (input) datasets; this would enable the users to drill down
to input data in order to better understand the content of the product
dataset, possibly reproduce the product to evaluate the processing,
or reprocess it with different parameters or software.
\subsubsection{Alternate Representations}
For some datasets (large ones) it is useful to be able to access
preview data (either precomputed or generated on-the-fly) and use it
to determine if the entire dataset should be downloaded (e.g.\ in an
interactive session). A DataLink service can provide links to previews
as a URL with a specific relationship to the dataset and include other
metadata like content-type (e.g.\ image/png) and content-length to assist
the client in selecting a preview; multiple previews with different sizes
(content-length) could be returned in the list of links. Plots derived
from the dataset could also be linked as previews. Some previews may be
of the same content-type as the complete dataset, but with the content reduced
in some fashion (e.g.\ a representative image or spectrum derived from
a large data cube).
Links to alternate representations may be to pre-generated resources
or may be computed on the fly, using either an opaque URL or a custom
parameterised service (see \ref{sec:useCustom} below).
Other alternate representations that are not previews could also
be included in the list of links. For example, one could provide an
alternate download format for a data file with different content-type
(e.g.\ FITS and HDF).
\subsubsection{Standard Services}
\label{sec:useStandard}
Data providers often implement services that can access a dataset
or its files using standard service interfaces or provide alternate
representations of the dataset. For example, the links for a dataset
discovered via a TAP service could be to an SSA service, allowing
the caller to get an SSA query response that describes the same dataset
with metadata specific to the SSA service.
Providers should be able to link to current and future data
access services that perform filtering and transformations as these
services are defined and implemented (without requiring a new DataLink
specification). For IVOA standard services, the DataLink response would
use the VODataService standardID as the service type to tell the client which
standard (and version) the linked service complies with. The client can
select services they understand and use the link to invoke the service
(with additional service parameters added by the client).
\subsubsection{Free or Custom Services}
\label{sec:useCustom}
Data providers often implement custom services that can access a dataset
or its files or provide alternate representations of the dataset. The
availability of such services should be conveyed to clients/users in
the same fashion as for standard services. This allows services defined
within the VO to be used in conjunction with services defined outside
the VO to deliver features to users.
\subsubsection{Access Data Services}
In many access scenarios, server-side processing of data is
highly desirable, typically to reduce the amount of data to be
transferred. Examples for such operations are cutouts, slicing of
cubes, and re-binning to a coarser grid. Other examples for server-side
operations include on-the-fly format conversion or recalibration. For
the purpose of this specification, we call such services
{\em access data services}.
DataLink should let providers declare such access data services
in a way that a generic client can discover what operations are supported,
their semantics, and the domains of the operations' parameters. This lets
clients operate multiple independent access services behind a common user
interface, allowing scenarios like ``give me all voxels around positions
X in wavelength range Y of all spectral cubes from services Z\_1, Z\_2,
and Z\_9''.
Access data services may be custom services with peculiar functionalities
or IVOA standard services. The IVOA access data service standard is
SODA \citep{2017ivoa.spec.0517B}.
SODA services should be described in the same
way as custom access data services.
\subsubsection{Recursive DataLink}
In some cases, a dataset may contain many files
(as in \ref{sec:useMultiFile} above)
and the provider may wish to make some files directly accessible and
other (less important) files only accessible via additional calls. Such
organisation of links could be accomplished by including a link to
another DataLink service in the initial DataLink response (i.e.\ recursive
DataLink). This service link would be described with both a service type
(as in \ref{sec:useStandard}) and content type.
\subsubsection{Datasets linked to an astronomical source}
\label{sec:useSource}
Many catalogs of astronomical sources are made available
using VO standards such as ConeSearch \citep{2008ivoa.specQ0222P} or
TAP. For some catalogs, ``associated data'' are available. These
data include the images from which the sources were extracted (or images of the
object itself, in the case of extended objects), as well as additional observations
such as spectra or time series of the source, and even spectral cubes
and time series of images for extended or varying objects. The \blinks\
response obtained for the source identifier allows retrieval of all
these associated data in a single request.
\subsubsection{Metadata and data related to provenance entities}
\label{sec:useProvenance}
The IVOA Provenance datamodel \citep{pr:provdm} represents metadata
tracing the history of the data. This information can be stored and retrieved
in several ways including in DAL services.
The Entity instances represent the state of the data items between
various steps of the data processing flow. ``Entities'' can be hooked
to the more complete data they represent using the \blinks\ endpoint.
Conversely, full provenance records can be linked to standard discovery
service rows using the same endpoint.
\section{The \blinks\ endpoint}
\label{sec:linksEndpoint}
Most commonly, DataLink link lists are retrieved from \blinks\ endpoints.
These are DALI-sync endpoints with implementor-defined names.
As specified by DALI-sync, the parameters for a request are submitted
using an HTTP GET (query string) or POST action. Any service may offer
zero or more datalink endpoints.
\subsection{Parameters on \blinks\ endpoints}
On \blinks\ endpoints, the ID and RESPONSEFORMAT parameters as defined
below are mandatory.
\subsubsection{ID}
\label{sec:resourceId}
The ID parameter is used by the client to specify one or more
identifiers. The service will return at least one link for each of the
specified values. The ID values are found in data discovery services
and \rfcmay\ be readable URIs or opaque strings. Submitting ID values in batches
may be more efficient if the client is planning to submit many such values;
clients can control the size of the output by limiting the number of ID values
they submit in each request.
Services \rfcmay\ place a limit on the number of ID values they will process in one
request. If the client submits more ID values than a service is prepared to
process, the service \rfcshould\ process ID values up to the limit and
\rfcmust\ include an overflow indicator in the output to denote that
the result is truncated as described in DALI.
The service \rfcmust\ \textbf{not} truncate the output within the set of rows
(links) for a single ID value.
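
As an illustration of the overflow rule, the following sketch shows where the
overflow indicator described in DALI would appear in a truncated VOTable
response. The table content is replaced by a comment, and the surrounding
document structure is simplified for this example.
\begin{verbatim}
<RESOURCE type="results">
  <TABLE>
    <!-- FIELD declarations, followed by the complete sets of
         rows for the ID values processed before the limit -->
  </TABLE>
  <INFO name="QUERY_STATUS" value="OVERFLOW"/>
</RESOURCE>
\end{verbatim}
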
If the client submits no ID values, the service \rfcmust\ respond with a
normal response (e.g.\ an empty results table for VOTable output).
The service may include service descriptors
(see \ref{sec:serviceDescriptors})
for related services and a service descriptor describing itself
(see \ref{sec:selfDescribing}).
\subsubsection{RESPONSEFORMAT}
\label{sec:responseformat}
The RESPONSEFORMAT parameter is described in DALI;
support for RESPONSEFORMAT is mandatory.
The only output format required by this specification is VOTable with
TABLEDATA serialization; services \rfcmust\ support this format. Clients
that want to get the standard (VOTable) output format can simply
omit this parameter.
To comply with this standard, a \blinks\ endpoint only needs to strip
off MIME type parameters and understand the following:
\begin{itemize}
\item no RESPONSEFORMAT
\item RESPONSEFORMAT=votable
\item RESPONSEFORMAT=application/x-votable+xml
\end{itemize}
All of these result in the standard output format.
Service implementers \rfcmay\ support additional output formats but \rfcmust\ follow
the DALI specification if they choose any formats described there.
\subsection{Registering \blinks\ endpoints}
\label{sec:capability}
Since normal datalink operations do not involve the Registry, this
specification imposes no requirement to register \blinks\ endpoints.
Datalink clients also generally have no reason to inspect VOSI
capabilities endpoints, and hence there are no requirements on
mentioning \blinks\ endpoints in any VOSI capability documents.
Operators still wishing to declare \blinks\ endpoints can do this by
giving a capability with a standardID of
\begin{verbatim}
ivo://ivoa.net/std/DataLink#links-1.1
\end{verbatim}
This specification does not constrain the capability type used in such
declarations. The access URL of the \blinks\ endpoint \rfcmust\ be given in a
\xmlel{vs:ParamHTTP}-typed interface element.
Hence, a single datalink capability could be declared as follows within
either a VOResource record or a VOSI capabilities element:
\begin{verbatim}
<capability standardID="ivo://ivoa.net/std/DataLink#links-1.1"
    xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.1">
  <interface xsi:type="vs:ParamHTTP" role="std">
    <accessURL use="base">
      http://example.com/datalink/mylinks
    </accessURL>
    <queryType>GET</queryType>
    <queryType>POST</queryType>
    <resultType>
      application/x-votable+xml;content=datalink
    </resultType>
    <param std="true" use="required">
      <name>ID</name>
      <description>publisher dataset identifier</description>
      <ucd>meta.id;meta.main</ucd>
      <dataType>string</dataType>
    </param>
    <param std="true" use="optional">
      <name>RESPONSEFORMAT</name>
      <description>Return the links in this tabular format (defaults
        to VOTable).</description>
    </param>
  </interface>
</capability>
\end{verbatim}
\subsection{VOSI}
\label{sec:vosi}
Since DataLink services are not usually registered, the VOSI-capabilities endpoint
is not required; the VOSI-availability endpoint is \rfcoptional.
\section{\blinks\ Response}
All responses from the \blinks\ endpoint follow the rules for DALI-sync
resources, except that the \blinks\ response allows for error
messages for individual input identifier values.
\subsection{DataLink MIME Type}
\label{sec:mime}
In some data discovery responses (e.g.\ ObsCore, \citet{2017ivoa.spec.0509L}),
there are columns
with a URL (access\_url in ObsCore) and a content type (access\_format in
ObsCore). If the implementation uses a DataLink service to implement this
data access, it should include a complete (including the ID parameter)
DataLink URL and a parameterised VOTable MIME type:
\begin{verbatim}
application/x-votable+xml;content=datalink
\end{verbatim}
to denote that the response from that URL is a DataLink response.
This is also the preferred MIME type for the \blinks\ response
(see \ref{sec:successfulRequests})
unless the caller has explicitly requested a specific value
via the RESPONSEFORMAT parameter (see \ref{sec:responseformat}).
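
As a sketch of this convention, assume a hypothetical \blinks\ endpoint at
\verb|http://example.org/links| and a hypothetical dataset identifier
\verb|ivo://example.org/d?ds1|. The access\_url and access\_format cells of
an ObsCore result row could then read as follows (all other cells are left out,
and the identifier is URL-encoded in the query string):
\begin{verbatim}
<!-- access_url -->
<TD>http://example.org/links?ID=ivo%3A%2F%2Fexample.org%2Fd%3Fds1</TD>
<!-- access_format -->
<TD>application/x-votable+xml;content=datalink</TD>
\end{verbatim}
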
In various other IVOA service responses, the way to recognise a \blinks\
endpoint URL is not explicitly standardised by the respective service
specification. In these cases, it is worth consulting the DataLink recognition
implementation note\footnote{at the time of writing:
http://github.com/ivoa/DataLinkRecImplNote}.
\subsection{List of Links}
\label{sec:listOfLinks}
The list of links that is returned by the \blinks\ endpoint can be
represented as a table with the columns listed in Table \ref{fig:linkFields}.
\begin{table}[h]
\begin{center}
\begin{tabular}{|l|p{0.29\textwidth}|p{0.12\textwidth}|p{0.12\textwidth}|l|}
\hline
{\bf name} & {\bf description} & {\bf field \newline required}
& {\bf value \newline required} & {\bf UCD} \\
\hline
ID & Input identifier & yes & yes & meta.id;meta.main \\
\hline
access\_url & link to data or service
& yes & & meta.ref.url \\
\cline{1-3} \cline{5-5}
service\_def & reference to a service descriptor resource
& yes & one only & meta.ref \\
\cline{1-3} \cline{5-5}
error\_message & error if an access\_url cannot be created
& yes & & meta.code.error \\
\hline
description & human-readable text describing this link
& yes & no & meta.note \\
\hline
semantics & Term from a controlled vocabulary describing the link
& yes & yes & meta.code \\
\hline
content\_type & mime-type of the content the link returns
& yes & no & meta.code.mime \\
\hline
content\_length & size of the download the link returns
& yes & no & phys.size;meta.file \\
\hline
content\_qualifier & nature of the content the link returns
& no & no & \\
\hline
local\_semantics & An identifier that allows clients to associate rows from
different datalink documents on the same service with each other.
& no & no & meta.id.assoc \\
\hline
link\_auth & use of the link requires authentication
& no & no & meta.code \\
\hline
link\_authorized & caller is authorized to use the link
& no & no & meta.code \\
\hline
\end{tabular}
\end{center}
\caption{Fields for Links Output}
\label{fig:linkFields}
\end{table}
Fields \rfcmust\ be present and values provided
(or null) as described in Table \ref{fig:linkFields}. Each row in the table
represents one link and \rfcmust\ have exactly one of:
\begin{itemize}
\item an access\_url
\item a service\_def
\item an error\_message
\end{itemize}
To facilitate consumption of large datalink results in streaming mode, all links
for a single ID value \rfcmust\ be served in consecutive rows in the output.
If an error occurs while processing an ID value, there \rfcshould\ be at least
one row for that ID value containing an error\_message. For example, if an input
ID value is not recognised or found, one row with an error\_message
to that effect is sufficient.
If some links can be created (e.g.\ download links)
but others cannot due to some temporary failure (e.g.\ service outage),
then one could have one or more rows with the same ID and different
error\_message(s).
Services \rfcmay\ include additional columns; this can be used to include
values that can be referenced from service descriptor input parameters
(see \ref{sec:serviceResources}).
Unless specified otherwise below, all fields are text values (\attval{datatype}{char}
in the VOTable FIELD).
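
The following fragment is a minimal sketch of such a table in VOTable,
restricted to the required fields of Table \ref{fig:linkFields}. The
identifier, URLs, descriptions, and sizes are invented for illustration, and
the column order simply follows the table. Both rows belong to the same ID
value and are therefore consecutive.
\begin{verbatim}
<RESOURCE type="results">
  <TABLE>
    <FIELD name="ID" datatype="char" arraysize="*"
           ucd="meta.id;meta.main"/>
    <FIELD name="access_url" datatype="char" arraysize="*"
           ucd="meta.ref.url"/>
    <FIELD name="service_def" datatype="char" arraysize="*"
           ucd="meta.ref"/>
    <FIELD name="error_message" datatype="char" arraysize="*"
           ucd="meta.code.error"/>
    <FIELD name="description" datatype="char" arraysize="*"
           ucd="meta.note"/>
    <FIELD name="semantics" datatype="char" arraysize="*"
           ucd="meta.code"/>
    <FIELD name="content_type" datatype="char" arraysize="*"
           ucd="meta.code.mime"/>
    <FIELD name="content_length" datatype="long" unit="byte"
           ucd="phys.size;meta.file"/>
    <DATA>
      <TABLEDATA>
        <TR>
          <TD>ivo://example.org/d?ds1</TD>
          <TD>http://example.org/files/ds1.fits</TD>
          <TD/>
          <TD/>
          <TD>The calibrated image</TD>
          <TD>#this</TD>
          <TD>application/fits</TD>
          <TD>2880000</TD>
        </TR>
        <TR>
          <TD>ivo://example.org/d?ds1</TD>
          <TD>http://example.org/previews/ds1.png</TD>
          <TD/>
          <TD/>
          <TD>A small preview image</TD>
          <TD>#preview</TD>
          <TD>image/png</TD>
          <TD>40960</TD>
        </TR>
      </TABLEDATA>
    </DATA>
  </TABLE>
</RESOURCE>
\end{verbatim}
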
\subsubsection{ID}
The ID column contains the input identifier value.
\subsubsection{access\_url}
The access\_url column contains a URL to download a single resource.
This URL can be a static link or a link to a dynamic resource (e.g.\ preview generation).
Access URLs may have fragment parts, which could,
for instance, refer to id-ed elements within XML documents or extensions
within FITS files. As in URIs in general, the interpretation of a fragment
identifier depends on the media type. Apart from that, no further client handling
is expected.
\subsubsection{service\_def}
The service\_def column contains a reference from the result row to
a separate resource. This resource describes a service as specified
in section \ref{sec:serviceResources}.
For example, if the response document includes this resource
to describe a service:
\begin{verbatim}
<RESOURCE type="meta" utype="adhoc:service" ID="srv1">
...
</RESOURCE>
\end{verbatim}
then the service\_def column would contain {\em srv1\/} to indicate that
a resource with XML ID srv1 in the same document describes the service.
Note that service descriptors do not always require an XML ID value;
it is only the reference from service\_def that warrants adding
an ID to the descriptor.
\subsubsection{error\_message}
\label{sec:errorMessage}
The error\_message column is used when no access\_url or service\_def can be generated for
an input identifier. If an error\_message is included in the output, the
ID and semantics values \rfcmust\ be provided as usual; in particular,
the value in the semantics column should reflect the semantics of the
link that could not be produced.
From version 1.1 of this standard,
services \rfcmay\ provide values in other fields or leave them null (as was required in 1.0).
For example, if an ID value is unrecognized by the service, it would normally provide the
minimum output: the input value for the ID, \verb|#this| for semantics, and an error
message. If a service did recognise the input ID and would normally create a download link,
but generating the access\_url failed, the service could include the usual content\_type,
content\_length, and description along with the ID, semantics, and error\_message.
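
Assuming a table that contains only the required fields, in the order of
Table \ref{fig:linkFields}, the minimal row for an unrecognised ID could be
sketched as follows. The identifier and the message text are illustrative only.
\begin{verbatim}
<TR>
  <TD>ivo://example.org/d?no-such-dataset</TD>  <!-- ID -->
  <TD/>                                         <!-- access_url -->
  <TD/>                                         <!-- service_def -->
  <TD>NotFoundFault: ivo://example.org/d?no-such-dataset</TD>
  <TD/>                                         <!-- description -->
  <TD>#this</TD>                                <!-- semantics -->
  <TD/>                                         <!-- content_type -->
  <TD/>                                         <!-- content_length -->
</TR>
\end{verbatim}
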
\subsubsection{description}
The description column \rfcshould\ contain a human-readable description of
the link; it is intended for display by interactive applications and is very
important in helping users distinguish links with the same semantics (see below).
\subsubsection{semantics}
\label{sect:semantics}
The semantics column contains a URI for a concept
that describes the meaning of the linked item relative
to what ID references. The semantics column is intended to be
machine-readable and to assist automated link selection, presentation, and
usage.
The value is always interpreted as a URI; relative URIs
\citep{std:RFC3986} are completed using the base URI of the
core DataLink vocabulary,
\url{http://www.ivoa.net/rdf/datalink/core}. Terms from this
vocabulary \rfcmust\ always be written as relative URIs. This means that for
concepts from the core vocabulary, the value in the semantics column
always starts with a hash.
For example, if the \blinks\ table contains a
link to a preview of a dataset, the ID column will contain the dataset
identifier, the access\_url column will contain the URL of the preview,
and the semantics column will be \verb|#preview|.
The core DataLink vocabulary defines a special term for
the concept of {\em this\/};
this term is used to describe links available for the retrieval of the
file(s) making up what ID references.
For concepts outside the core DataLink vocabulary, the full concept URI
\rfcmust\ be given. It \rfcshould\ resolve to a human-readable document
describing what the concept means and what clients are expected to do
with links annotated with it.
As per Vocabularies in the VO 2 \citep{2021ivoa.spec.0525D}, at
\url{http://www.ivoa.net/rdf/datalink/core} the datalink core vocabulary
can be retrieved in various formats including HTML (in a way that the
concept URI is usable in a web browser), various RDF serialisations, and
the VO-specific Desise format optimised for simple machine consumption; this
should be used by clients to present the user with labels (and perhaps
definitions) rather than the URI parts given in the semantics column.
In RDF terms, the concepts in datalink core are properties. A datalink
row can be interpreted as an RDF triple
$$(
\langle\textit{access\_url\/}\rangle,
\textit{is-a-}\langle\textit{semantics\/}\rangle\textit{-for},
\langle\textrm{ID}\rangle
).$$
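
As a worked instance, take a hypothetical row with the ID
\verb|ivo://example.org/d?ds1|, the semantics \verb|#preview|, and the
access\_url \verb|http://example.org/p/ds1.png|. Read as a triple, this row
states
$$(
\langle\texttt{http://example.org/p/ds1.png}\rangle,
\textit{is-a-}\langle\texttt{\#preview}\rangle\textit{-for},
\langle\texttt{ivo://example.org/d?ds1}\rangle
),$$
where the relative URI \verb|#preview| is completed against the base URI of
the core vocabulary to give
\verb|http://www.ivoa.net/rdf/datalink/core#preview|.
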
\subsubsection{content\_type}
The content\_type column tells the client the general file format
(mime-type) they will receive if they use the link
(access\_url or invoking a service).
For recursive DataLink links, the content\_type value \rfcshould\
be as specified in section \ref{sec:mime}.
This field \rfcmay\ be null (blank) if the value is unknown.
\subsubsection{content\_length}
The content\_length column tells the client the size of the download
if they use the link, in bytes. For VOTable, the FIELD \rfcmust\ be
\attval{datatype}{long} with \attval{unit}{byte}.
The value \rfcmay\ be null (blank)
if unknown and will typically be null for links to services.
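
A FIELD declaration satisfying these requirements could look like the
following sketch; the UCD is the one given in Table \ref{fig:linkFields}.
\begin{verbatim}
<FIELD name="content_length" datatype="long" unit="byte"
       ucd="phys.size;meta.file"/>
\end{verbatim}
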
\subsubsection{content\_qualifier}
\label{sec:contentQualifier}
The content\_qualifier column is \rfcoptional. If it is present, it tells
the client the nature of the thing or service they will receive or access
if they use the link.
If the access\_url references a data product, the content\_qualifier
field \rfcshould\ define its product type. In that case, the considerations
for the semantics column (Sect.~\ref{sect:semantics}) apply, except that
the basic vocabulary is \url{http://www.ivoa.net/rdf/product-type}, and
the interpretation as an RDF triple would be $$(
\langle\textit{access\_url}\rangle, \textit{is-a},
\langle\textit{content\_qualifier}\rangle)$$
For rows not linking to data products, content\_qualifier's
interpretation will be different, and the default vocabulary will be
inappropriate. Full concept URIs will have to be used in this case, and
their translation to RDF triples is not covered by this version of
DataLink.
\subsubsection{local\_semantics}
\label{sec:localSemantics}
The local\_semantics column allows for identification of corresponding rows for
different IDs in the same DataLink service where the combination of semantics,
content\_type and content\_qualifier is not sufficient to identify them.
It contains a term from a service-specific vocabulary and helps clients present
the same sort of link to the user as they go from one dataset to another within a service.
For instance, suppose a service serves both continuum and line cubes.
Using local\_semantics, users can configure their clients such that, as they change
to a new dataset, they always see the line cube even when the semantics,
content\_qualifier and content\_type columns agree for both types of data.
The vocabulary can be a simple list of terms defined for the service
(e.g.\ local\_semantics="line-cube") or can be described in an ad hoc external resource
accessible via a URI
(e.g.\ local\_semantics="http://our-service-adhoc-vocab/terms\#continuum-cube").
\subsubsection{link\_auth}
\label{sec:linkAuth}
The link\_auth column tells the client whether or not authentication is required
to use the link. Valid values are:
\begin{itemize}
\item \verb|false| : the link allows anonymous access only
\item \verb|optional| : the link supports both anonymous and authenticated access
\item \verb|true| : authentication is required
\end{itemize}
This field \rfcmay\ be null (blank) if the value is unknown.
\subsubsection{link\_authorized}
\label{sec:linkAuthorized}
The link\_authorized column tells the client whether the currently authenticated
identity is authorized to use the link. For VOTable, the FIELD \rfcmust\ be
\attval{datatype}{boolean}. This is generally a prediction to save
clients from trying to use a link and getting a permission denied response. Valid
values are:
\begin{itemize}
\item \verb|false| : current user is not authorized
\item \verb|true| : current user is authorized
\end{itemize}
If the value is \verb|false| and the caller tries to use the link anyway, it may be
challenged for credentials (e.g.\ HTTP 401 response with WWW-Authenticate headers) or
denied (e.g.\ HTTP 403 ``permission denied'').
If the value is \verb|true|, the caller should proceed with the same authentication
and should expect to succeed.
This field \rfcmay\ be null (blank) if the value is unknown.
\subsection{Successful Requests}
\label{sec:successfulRequests}
Successfully executed requests \rfcshould\ result in a response with HTTP
status code 200 (OK) and a body in the format requested by the client
or in the default format for the service. The content of the response
(for tabular formats) is described above,
with some additional details below.
Unless the incoming request included a RESPONSEFORMAT parameter requesting
a different format, the content-type header of the response \rfcmust\ be one of the
values allowed by the VOTable specification, which at the time of this writing includes
``application/x-votable+xml'' and ``text/xml''. The former value is preferred
and \rfcshould\ be augmented with the ``content'' parameter set to ``datalink'';
the canonical form given in \ref{sec:mime} is
strongly \rfcrecommended. In contrast to
all other uses of the string given in \ref{sec:mime}, however,
clients wishing to evaluate
the content type of the response must perform a full parse
of the header value. This specification cannot and does not outlaw content
types with additional parameters
(e.g.\ ``application/x-votable+xml; content=datalink;charset=iso-8859-1'')
or with extra spaces or quotes
(as allowed for MIME types, \citet{std:RFC2045}).
If the incoming request includes a DALI RESPONSEFORMAT parameter,
content-type follows the DALI rules.
\subsubsection{VOTable output}
\label{sec:output}
The table of links \rfcmust\ be returned in a RESOURCE with
\attval{type}{results}. The table \rfcmust\ be in TABLEDATA serialization
unless another serialization is specifically requested
(see \ref{sec:responseformat})
and supported by the implementation.
The name and UCD attributes for FIELD elements in the VOTable
(and the units in one case) are specified above (see \ref{sec:listOfLinks}).
The DALI specification states that the main ``results'' RESOURCE of VOTable output should include an
INFO element with \attval{name}{standardID} and the standardID string as a value.
\begin{verbatim}
<RESOURCE type="results">
...
<INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
...
<TABLE>
...
</TABLE>
...
</RESOURCE>
\end{verbatim}
From version 1.1 of this standard, the main ``results'' RESOURCE of a \blinks\ response \rfcmust\ include this
INFO element so that a table of links is easily identified by users and applications,
both when initially received from the service and when saved for later use.
\subsubsection{Other Output Formats}
This specification does not describe any other output formats, but allows
implementations to provide output in other formats
via the RESPONSEFORMAT parameter (see \ref{sec:responseformat}).
\subsection{Errors}
The error handling specified for DALI-sync resources applies
to service failure (where no links can be generated).
Services should return the
document format requested by the client (see \ref{sec:responseformat}).
For the standard
output format (VOTable) the error document \rfcmust\ also be VOTable.
For errors that occur while generating individual links, each
identifier may result in a link with only an error\_message
as described above.
In either case (error document or per-link error\_message),
the error message \rfcmust\ start with one of the strings in
Table \ref{tab:errors}, in order of specificity.
\begin{table}[ht]
\begin{center}
\begin{tabular}{|l|l|}
\hline
\vrule height 12pt depth 7pt width 0pt{\bf Error} & {\bf Meaning} \\
\hline
\vrule height 12pt width 0pt NotFoundFault & Unknown ID value \\
UsageFault & Invalid input (e.g.\ invalid ID value) \\
TransientFault & Service is not currently able to function \\
FatalFault & Service cannot perform requested action \\
DefaultFault & Default failure (not covered above) \\
\hline
\end{tabular}
\end{center}
\caption{Error Messages}
\label{tab:errors}
\end{table}
In all cases, the service \rfcmay\ append additional useful information to the
error strings above.
If there is additional text, it must be separated
from the error string with a colon (:) character, for example:
\begin{verbatim}
NotFoundFault: ivo://example.com/data?foo cannot be found
UsageFault: foo:bar is invalid, expected an ivo URI
\end{verbatim}
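
For the service-failure case with VOTable output, DALI conveys the error in
an INFO element with \attval{name}{QUERY\_STATUS} and \attval{value}{ERROR}.
A minimal sketch of how the first message above might appear in such an error
document is shown here; the surrounding VOTABLE element is omitted and the
identifier is the made-up one from the example above.
\begin{verbatim}
<RESOURCE type="results">
  <INFO name="QUERY_STATUS"
    value="ERROR">NotFoundFault: ivo://example.com/data?foo cannot be found</INFO>
</RESOURCE>
\end{verbatim}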
\section{Service Descriptors}
\label{sec:serviceDescriptors}
The DataLink service interface is designed to add functionality to data
discovery services by providing the connection between the discovered
datasets and the download of data files and access to services that act
on the data. When the \blinks\ capability returns links to services, the
response document also needs to describe the services so that clients can
figure out how to invoke them. This is done by including an additional
metadata resource in the response document to describe each type of
service that can be used.
Here we describe how to construct a resource that describes a service
and add it to a VOTable document. This ``service descriptor'' mechanism can
be used in any VOTable document, such as a data discovery response from a TAP query
or one of the simple DAL query protocols or the \blinks\ endpoint described above.
The linked services can be any HTTP service, including but not limited to the \blinks\
endpoint described above, other IVOA services (e.g.\ SODA), custom services, or other
kinds of internet resources like web pages (e.g.\ interactive applications, DOI landing
pages, or documentation).
\subsection{Service Resources}
\label{sec:serviceResources}
In a data discovery response, one RESOURCE element (usually the first)
will have an attribute \attval{type}{results} and tabular data; this resource
contains the query result. To describe an associated service, the VOTable document
would also contain one or more resources with attribute \attval{type}{meta} and
\attval{utype}{adhoc:service} (or \attval{utype}{adhoc:this} in case of
a self-describing service --- see \ref{sec:selfDescribing}). A resource of this
type has no tabular data, but may include a rich set of metadata. The utype attribute
makes it easy for clients to find the RESOURCE elements that describe services.
A short name attribute and a more verbose DESCRIPTION subelement
\rfcmay\ be added to the service descriptor RESOURCE to provide the user
with information about the service's purpose or semantics. This \rfcshould\
be done if the semantics are not obvious, and especially in the case
of multiple sibling service descriptors or non-standard services.
In cases where a response document contains several ``service descriptor'' RESOURCEs
and several ``results'' RESOURCEs, these RESOURCEs \rfcmay\ be nested
to make clear which descriptors are associated with which results.
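
The overall layout described in this section can be sketched as follows.
The name ``cutout'' and the DESCRIPTION text are purely illustrative
assumptions, and the elided parts stand for the table of results and the
PARAM and GROUP elements of the service descriptor.
\begin{verbatim}
<VOTABLE>
  <RESOURCE type="results">
    <TABLE>
      ...
    </TABLE>
  </RESOURCE>
  <RESOURCE type="meta" utype="adhoc:service" name="cutout">
    <DESCRIPTION>Extract a sub-region of the selected dataset.</DESCRIPTION>
    ...
  </RESOURCE>
</VOTABLE>
\end{verbatim}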
\subsection{Descriptive PARAMs}
\label{sec:descParams}
A service resource contains PARAM elements to describe the service.
The standard PARAM elements for a {\em service\/} resource
are described in Table \ref{tab:serviceParams}.
\begin{table}[h]
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
\vrule height 12pt depth 7pt width 0pt {\bf name} & {\bf value} & {\bf required} \\
\hline
\vrule height 12pt width 0pt accessURL & URL to invoke the capability & yes \\
standardID & URI for the capability & no \\
resourceIdentifier & IVOA registry identifier & no \\
contentType & Media type of the service response & no \\
exampleURL & example invocation of the service & no \\
\hline
\end{tabular}
\end{center}
\caption{Parameters Describing the Service}
\label{tab:serviceParams}
\end{table}
For services that implement an IVOA standard, the standardID is specified
as the value attribute of the PARAM with \attval{name}{standardID}.
For free or custom services, this PARAM is not included.
For registered services, the resourceIdentifier PARAM allows the client
to query an IVOA registry for complete resource metadata. This could be
used to find documentation, contact info, etc. Although they need not be,
free or custom services could be registered in an IVOA registry and thus
have a resourceIdentifier to enable lookup of the record.
For standard services, the value of the accessURL PARAM \rfcmust\ be the
accessURL for the capability specified by the standardID. The accessURL
is not generally usable as-is; the client must include extra parameters
as described below. If a standardID indicates a capability that supports
multiple HTTP verbs (GET, POST, etc.), the client may use any supported
verb. Otherwise, this version offers no way to specify that POST
(for example) is supported, so clients should assume that only HTTP GET
may be used. Since the accessURL may already contain parameters, clients must
parse the URL to decide how to append additional parameters when
invoking the service.
If the contentType is ``text/html'', the client \rfcshould\ send the result
of the service query to a web browser. This is appropriate for both HTML
documents and interactive web interfaces.
A service descriptor \rfcmay\ contain multiple exampleURL PARAMs.
In exampleURL PARAMs, operators can give valid service calls as GET-able
URLs in the PARAMs' value attribute. They are intended as an aid for
debugging, in particular to aid users and developers in making sure a
service is still operating as expected. The PARAM's description \rfcshould\
give an indication of what the call will result in. End-user clients
might indicate exampleURLs to the user after unexpected service failures.
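
To illustrate, the descriptive PARAMs of a service descriptor for a
hypothetical SODA cutout service might be written as in the following sketch.
The host example.org, the contentType value, and the query string of the
exampleURL are assumptions made for this example, and the standardID is
quoted as we understand it from the SODA standard. The elided part stands
for the input parameter group.
\begin{verbatim}
<RESOURCE type="meta" utype="adhoc:service">
  <PARAM name="accessURL" datatype="char" arraysize="*"
    value="http://example.org/soda/sync"/>
  <PARAM name="standardID" datatype="char" arraysize="*"
    value="ivo://ivoa.net/std/SODA#sync-1.0"/>
  <PARAM name="contentType" datatype="char" arraysize="*"
    value="application/fits"/>
  <PARAM name="exampleURL" datatype="char" arraysize="*"
    value="http://example.org/soda/sync?ID=abc123&amp;CIRCLE=12+34+0.5">
    <DESCRIPTION>A cutout of radius 0.5 deg around (12, 34).</DESCRIPTION>
  </PARAM>
  ...
</RESOURCE>
\end{verbatim}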
\subsection{Input PARAMs}
A service descriptor \rfcmust\ contain a GROUP element with \attval{name}{inputParams}
to describe the input parameters of the service. There are three types of
input params: params with a fixed value, params where the values come from the
``results'', and params where the value is variable and chosen/specified by the user.
For params with a fixed value (e.g.\ \attval{fly}{true}), the client \rfcmust\
treat it as a required parameter and include it in the service invocation; this allows
a service implementor to include constant params explicitly (and describe them via a
DESCRIPTION element) rather than just include them in the ``accessURL'' without the
possibility to explain them.
For params whose value(s) come from the ``results'' resource, the value
attribute is empty (\attval{value}{}) and the PARAM includes a ref attribute to indicate
the FIELD (column) that contains the values. For example, a TAP query result may contain
identifiers that can be used to invoke the \blinks\ service; the FIELD with the identifiers
\rfcmust\ have an XML ID attribute (e.g.\ \attval{ID}{abc}) and the input PARAM would include
the attribute \attval{ref}{abc}. When this mechanism is used, the client \rfcmust\
treat it as a required parameter and the parameter and value \rfcmust\ be included in
the service invocation.
For user-specified input PARAMs the value attribute is empty (\attval{value}{})
and the user supplies the value(s). The PARAM specifies the type of value required via
the datatype, arraysize, and xtype attributes; this may be augmented further by the ucd,
unit and utype\footnote{An example of utype usage for service
parameters is described in section 3.4 of the SODA specification.} attributes
and a child DESCRIPTION element. To allow for expressive, usable user
interfaces, operators \rfcshould\ indicate useful ranges of parameters in MIN and MAX children
or, for enumerated parameters, indicate the valid values in OPTIONS in case
these values cannot be inferred from relevant metadata retrieved
before the service descriptor discovery. In general, services
may have parameters of this type that are optional or required and this distinction is
not currently described; services \rfcshould\ use a child DESCRIPTION element to document any
requirements. Clients should assume that these user-specified parameters are optional, but
that specifying some of them may be necessary to have the service do something useful.
Services \rfcshould\ respond with an informative error message if the input is not adequate to
perform the operation(s).
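
The three kinds of input PARAMs might be combined as in the following sketch.
The parameter names other than ID, the DESCRIPTION texts, and the numeric
limits are invented for illustration; the ref value matches the
\attval{ID}{abc} example used above, and MIN and MAX are placed in a VALUES
child as in the VOTable schema.
\begin{verbatim}
<GROUP name="inputParams">
  <!-- fixed value: always sent as given -->
  <PARAM name="fly" datatype="char" arraysize="*" value="true">
    <DESCRIPTION>Generate the product on the fly.</DESCRIPTION>
  </PARAM>
  <!-- value taken from the results column with ID="abc" -->
  <PARAM name="ID" datatype="char" arraysize="*" value="" ref="abc"/>
  <!-- user-specified value with a useful range -->
  <PARAM name="BAND" datatype="double" arraysize="2" xtype="interval"
    unit="m" ucd="em.wl" value="">
    <DESCRIPTION>Wavelength interval to extract (optional).</DESCRIPTION>
    <VALUES>
      <MIN value="3.0e-7"/>
      <MAX value="8.0e-7"/>
    </VALUES>
  </PARAM>
</GROUP>
\end{verbatim}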
\subsection{Service self-description}
\label{sec:selfDescribing}
A service may include a service descriptor that describes itself with
its normal output.
In that case the utype ``adhoc:this'' indicates the self-describing
nature of the service descriptor.
This convention makes finding the self-description unambiguous in
cases where the output also contains other service descriptors.
This usage is comparable to prototype work on S3
(see \citet{note:s3})
and when combined with calling a service with no input parameters
(e.g., as allowed in \ref{sec:resourceId}),
and/or with the DALI \texttt{MAXREC=0} convention,
will make it easy for clients to obtain a
description of both standard and custom features.
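
A self-description returned by a \blinks\ service could, for instance, look
like the sketch below. The access URL is a made-up example; the standardID is
the one used in the INFO element shown in \ref{sec:output}.
\begin{verbatim}
<RESOURCE type="meta" utype="adhoc:this">
  <PARAM name="accessURL" datatype="char" arraysize="*"
    value="http://example.org/datalink/links"/>
  <PARAM name="standardID" datatype="char" arraysize="*"
    value="ivo://ivoa.net/std/DataLink#links-1.1"/>
  <GROUP name="inputParams">
    <PARAM name="ID" datatype="char" arraysize="*" value=""/>
  </GROUP>
</RESOURCE>
\end{verbatim}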