Gerard Lemson edited this page Apr 19, 2021 · 11 revisions

What is the role of the datamodels in the VO?

  • [LM] The model is a piece of structured documentation that allows people to understand each other when they talk about data content. VO models can be used to design protocol responses in the case of simple protocols, and, through a serialization mechanism, to provide data with an interoperable interface.
  • [GL] see introduction of VO-DML spec

Do we need another way to map data on datamodels?

  • [LM] For now, data are mapped onto models (in VOTable) using GROUPs and UTypes. This scheme, while very useful in many cases, cannot map complex data structures, which is why we need another way to do it.
  • [GL] In fact, the original VO-DML mapping used GROUPs and utypes identifying VO-DML types and roles, and could map complex data models. It needed a small extra model to represent some special features related to serializing the models. These made the mapping to VO-DML less explicit, which led to complaints; that is why we then proposed a new scheme with very explicit VOTable elements. That scheme was also not received with cheers, but now seems to be acceptable.
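To make the GROUP/utype mapping discussed above concrete, here is a minimal sketch of how a client might read it. The VOTable fragment and the `coords:SpaceFrame...` utype strings are invented for illustration (real utypes depend on the model in use), and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOTable fragment: a GROUP whose utype attributes point
# into a made-up coordinate model, in the style of the GROUP/utype mapping.
VOTABLE = """
<VOTABLE>
  <RESOURCE>
    <TABLE>
      <GROUP utype="coords:SpaceFrame">
        <PARAM name="frame" datatype="char" arraysize="*"
               value="ICRS" utype="coords:SpaceFrame.spaceRefFrame"/>
        <PARAM name="equinox" datatype="char" arraysize="*"
               value="J2000" utype="coords:SpaceFrame.equinox"/>
      </GROUP>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""

def utype_index(votable_xml):
    """Map each utype found on a GROUP's PARAMs to that PARAM's value."""
    root = ET.fromstring(votable_xml)
    index = {}
    for group in root.iter("GROUP"):
        for param in group.findall("PARAM"):
            utype = param.get("utype")
            if utype:
                index[utype] = param.get("value")
    return index
```

A client resolving the frame of a position would then look up `utype_index(doc)["coords:SpaceFrame.spaceRefFrame"]` rather than guessing from column names, which is the interoperability the mapping aims at.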

Should data annotations be considered as part of the data model specification or should they be model-agnostic?

  • [LM] A data annotation scheme must allow clients to obtain model-compliant data representations. A model-agnostic annotation mechanism allows the same code to build instances of different models. This is a strong argument for separating the modeling effort from the annotation syntax specification.

  • [LM] All models are based on a small number of patterns (class, attribute, association, ...); the annotation syntax must be able to render model instances using only these patterns.

  • [GL] If the annotation is not model-agnostic, then tools wishing to interpret data model annotations would need to be adjusted for each new model. VO-DML was designed to support model-agnostic annotations, and the VO-DML mapping spec worked out how to do that. It would be nice if we could finally continue discussing how far that was successful and where modifications may be required or desired.
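The model-agnostic idea argued for above can be sketched as follows: the reader only understands the generic patterns (instance-of-type, attribute-with-role) and never hard-codes any particular model. The annotation structure and the `meas:Position...` names below are invented for illustration, not the actual VO-DML mapping syntax.

```python
# Hypothetical annotation: a type reference plus role-to-column bindings.
# The same build_instance() function works for any model annotated this way.
annotation = {
    "type": "meas:Position",  # made-up VO-DML type reference
    "roles": {
        "meas:Position.coord": {"column": "ra_dec"},
        "meas:Position.error": {"column": "pos_err"},
    },
}

def build_instance(annotation, row):
    """Instantiate an annotated type generically from one table row."""
    instance = {"@type": annotation["type"]}
    for role, ref in annotation["roles"].items():
        # Follow each role's column binding; no model-specific code here.
        instance[role] = row[ref["column"]]
    return instance
```

Swapping in an annotation for a completely different model (provenance, photometry, ...) requires no change to `build_instance`, which is the point of keeping the syntax model-agnostic.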

Does the VO need one common annotation syntax?

  • [LM] There is no obligation for the mapping syntax to be unique in the VO; it is however highly desirable to get one and only one solution. This will save time and resources for the IVOA groups, the data annotators and the client developers. Furthermore, this improves the chances for such a solution to be adopted.

  • [GL] This would clearly seem preferable.

What are the annotation expectations from the client point of view?

  • [LM]

    • To identify the nature of the content of the dataset
    • To get a way to build interoperable data representations
    • To be able to run generic code extracting complex data structures.
  • [GL] Basically, to be able to infer how objects from IVOA-standard data models can be retrieved from a serialization.

How to make the data annotation process affordable for data providers?

  • [LM]

    • By allowing the use of small components
    • By hiding complex modeling features (inheritance, association vs aggregation ...).
  • [GL]

    • Provide tools that can assist in the annotation. For example, a tool like the VODML-Mapper, which has sufficient knowledge to propose required type instances for roles etc. and can generate the mapping syntax, can facilitate the work.
    • Allow annotation as a community effort. Again in the VODML-Mapper, TAP schemas, for example, can be imported and annotated by external users, who can share the results for final review by the data owners.
    • Create models that are explicit and do not need too many components to finally get to an atomic value that can be connected to a column for example.

Do we expect TAP servers to be able to generate annotations in the future?

  • [LM] There are different situations where the annotation of TAP responses can provide great added value:
    • Adding missing metadata such as coordinate frames
    • Restoring complex data structures (e.g. Provenance)
    • Assembling composite datasets (e.g. sources with photometric points)
  • [GL] If this is not done, we will have failed. On the other hand, this is one of the most appealing efforts to work on: creating query parsers that can interpret the metadata annotation of a TAP schema to infer how the objects stored in the database are represented in the final result.
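The "assembling composite datasets" case above (sources with photometric points) amounts to folding a flat, joined TAP result back into nested objects. A minimal sketch, with invented column names standing in for whatever the annotated TAP schema would actually declare:

```python
from collections import defaultdict

# Hypothetical flat TAP result: each row repeats the source identifier
# alongside one photometric point, as a relational join would return it.
rows = [
    {"source_id": "s1", "band": "g", "mag": 21.3},
    {"source_id": "s1", "band": "r", "mag": 20.8},
    {"source_id": "s2", "band": "g", "mag": 19.9},
]

def assemble_sources(rows):
    """Fold flat rows back into per-source lists of photometric points."""
    sources = defaultdict(list)
    for row in rows:
        sources[row["source_id"]].append(
            {"band": row["band"], "mag": row["mag"]}
        )
    return dict(sources)
```

In the envisioned setup, the grouping key and the nested structure would not be hard-coded as here but inferred by the client from the TAP schema's annotation.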