Laurent MICHEL edited this page Apr 23, 2021 · 17 revisions

Mango proposal

Mark CD comments

Models

  • Mango:
    • I've submitted issues to the Mango repository with comments/suggestions on the model.
    • Parameter.description: is this a modeled concept?
      • it is not clear (to me, anyway) whether something like this is a modeled concept (i.e. an attribute on an object), since it can apply to any object, is completely arbitrary, is often not included, and in no way contributes to the definition of the object itself.
      • the alternative is to not include it in the Model element(s), but let the Annotation address it in some way.
        "Any object may have a description. If supported by the implementation, the value is pulled from the VOTable DESCRIPTION sub-node of the associated PARAM|FIELD|TABLE" (we don't annotate GROUPs I think).
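A minimal sketch of that alternative, assuming the annotation layer resolves an element by its ID and reads the DESCRIPTION child directly. It uses only the standard library; the function name `description_of` and the sample document are illustrative, and the XML namespace is omitted for brevity (real VOTables declare one).

```python
import xml.etree.ElementTree as ET

# Illustrative, namespace-free VOTable fragment (hypothetical content).
VOTABLE = """
<VOTABLE>
  <RESOURCE>
    <TABLE ID="demo" name="demo">
      <DESCRIPTION>Demo table</DESCRIPTION>
      <FIELD ID="_ra" name="RA" datatype="double" unit="deg">
        <DESCRIPTION>Right ascension</DESCRIPTION>
      </FIELD>
      <FIELD ID="_dec" name="DEC" datatype="double" unit="deg"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def description_of(root, element_id):
    """Return the DESCRIPTION text of the PARAM, FIELD or TABLE with this ID.

    Returns None when the element is missing or carries no DESCRIPTION,
    which matches the "if supported" wording: the description is optional.
    """
    for elem in root.iter():
        if elem.tag in ("PARAM", "FIELD", "TABLE") and elem.get("ID") == element_id:
            # findtext only looks at direct children, so a TABLE does not
            # pick up the DESCRIPTION of one of its FIELDs.
            return elem.findtext("DESCRIPTION")
    return None


root = ET.fromstring(VOTABLE)
ra_description = description_of(root, "_ra")
dec_description = description_of(root, "_dec")
table_description = description_of(root, "demo")
print(ra_description, dec_description, table_description)
```

With this approach the model itself stays free of a description attribute; the parser decides whether to expose it.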

Annotations

  • Annotating content within a VOTable element (PARAM, FIELD, etc)
    This came up in the context of mango:Param.description where, if this is a modeled element, we may want to populate it from the VOTable content. However, the VOTable DESCRIPTION element cannot be referenced, because it is internal to other VOTable elements (PARAM, FIELD, GROUP, TABLE).
    • The annotation syntax Laurent uses (Vodml-instance-vot) has elements (SC_*) which are supposed to handle this sort of problem.
      • Section 3.12 describes these "shortcuts"
        • example case (Quantity) illustrates annotating an ivoa:Quantity using this syntax.
          • uses independent ATTRIBUTEs for the Quantity.value and Quantity.unit
        • The SC_*QUANTITY tags consolidate this grouping into a single annotation tag.
        • Question: is there no way to assign a Quantity.unit from the 'unit' arg of PARAM|FIELD?
      • given the description in Section 3.12, I do not see how it satisfies this case of annotating content within a VOTable element atom.
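To make the question about the 'unit' attribute concrete, here is a hedged sketch of what a client-side fallback could look like. This is not part of the annotation syntax: the helper `quantity_from_field`, the plain `value`/`unit` keys and the sample FIELD (including its unit) are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical FIELD declaration; the ID, name and unit are made up for this sketch.
FIELD_XML = '<FIELD ID="_err" name="PosErr" datatype="double" unit="arcsec"/>'


def quantity_from_field(field, value, annotated_unit=None):
    """Build a value/unit pair for a quantity mapped on a FIELD or PARAM.

    If the annotation gives no unit, fall back to the 'unit' attribute of
    the VOTable element; the result is None when neither is available.
    """
    unit = annotated_unit if annotated_unit is not None else field.get("unit")
    return {"value": value, "unit": unit}


field = ET.fromstring(FIELD_XML)
from_votable = quantity_from_field(field, 1.50765)
from_annotation = quantity_from_field(field, 1.50765, annotated_unit="mas")
print(from_votable, from_annotation)
```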

LM answers

MANGO

  • Although all use-cases are supported by Mango, some weaknesses have been pointed out. They have been filed as issues on the MANGO pages and will be addressed in a major PR after the workshop.

Annotations

  • unit element shouldn't be mandatory in SC*QUANTITY. This part of the spec must be enhanced.

GL - VizieR mango experiments (and proposal)

A beta-prototype has been implemented to provide rich VOTables using Mango:

http://viz-beta.u-strasbg.fr/viz-bin/Mango? [vizier-asu-parameters]

e.g.:

The beta-prototype is available for VizieR queries that return a single table.

The capabilities implemented are:

  • measure annotation available (currently) for positions, photometry, parallax
  • associated data is used to link the VizieR catalogue Provenance
  • grouping column mechanism to gather columns related to a same measure

The beta version is a piped process which takes a VizieR VOTable as input, extracts metadata from the VOTable (COOSYS and FIELD definitions) and outputs a VOTable enriched with a VODML(lite) section header. The process is completed with a VizieR database connection to extract some metadata not available in the VOTable (v1.4), such as the photometry systems.
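The metadata-extraction step can be pictured with the short sketch below. It is not the VizieR code: it assumes a namespace-free VOTable with a single table and uses only the standard library to collect the COOSYS attributes and the FIELD definitions. The sample document and the function name `extract_metadata` are illustrative.

```python
import xml.etree.ElementTree as ET

# Illustrative, namespace-free VOTable header (hypothetical content).
SAMPLE = """
<VOTABLE version="1.4">
  <RESOURCE>
    <COOSYS ID="J2000" system="eq_FK5" equinox="J2000"/>
    <TABLE name="demo">
      <FIELD ID="_ra" name="RAJ2000" ucd="pos.eq.ra;meta.main"
             unit="deg" datatype="double" ref="J2000"/>
      <FIELD ID="_de" name="DEJ2000" ucd="pos.eq.dec;meta.main"
             unit="deg" datatype="double" ref="J2000"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""

FIELD_ATTRIBUTES = ("ID", "name", "ucd", "unit", "ref")


def extract_metadata(xml_text):
    """Collect what an annotator needs: coordinate systems and column definitions."""
    root = ET.fromstring(xml_text)
    coosys = [dict(node.attrib) for node in root.iter("COOSYS")]
    fields = []
    # With a single table, the rank is the column number in the data rows.
    for rank, node in enumerate(root.iter("FIELD")):
        entry = {"rank": rank}
        entry.update({key: node.get(key) for key in FIELD_ATTRIBUTES})
        fields.append(entry)
    return {"coosys": coosys, "fields": fields}


metadata = extract_metadata(SAMPLE)
print(metadata["coosys"])
print([field["name"] for field in metadata["fields"]])
```

Metadata that is absent from the VOTable, such as the photometry systems, would have to come from another source, as described above.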

Measure annotation

Important: filters are not (always) in the original article but assigned by CDS.

Provenance

Provenance is important for end-users to be aware of the data origin. There is indeed no standard way today to annotate authors, articles or years of publication in a VOTable. Furthermore, the VizieR process can assign metadata (like photometry) which is not part of the original data. This information is described in a beta implementation of ProvDM.

In the VizieR Mango prototype, the provenance is added as associated data with URLs (e.g.: http://viz-beta.u-strasbg.fr/viz-bin/Mango?-out.max=10&-source=J/ApJ/789/115 - see collection with dmrole="mango:Source.associatedDataDock")

Grouping columns

Mango makes it possible to gather measures with their errors. The "associatedParameters" mechanism also makes it possible to group columns related to the same measure, for instance a quality flag which qualifies the measure given in another column.

e.g.: catalogue J/A+A/584/A5 (Evans, 2015)

The table “table2” contains 2 radial velocity columns: RV1 (primary velocity) and RV2 (secondary velocity). Each of these columns is qualified by an error and by the number of observations used to compute the radial velocity measure. So we can gather the columns as (RV1, e_RV1, o_RV1) and (RV2, e_RV2, o_RV2).
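The grouping above can be sketched in a few lines. The sketch assumes that a qualifier column is recognised purely by its name prefix (`e_` for the error, `o_` for the number of observations, as in the example); the prototype may rely on other metadata, and the `Name` column is hypothetical.

```python
# Prefixes assumed for this sketch: e_<col> = error, o_<col> = number of observations.
QUALIFIER_PREFIXES = ("e_", "o_")


def group_columns(column_names):
    """Group each main column with the qualifier columns that refer to it.

    Returns {main_column: [main_column, qualifier, ...]}; columns without
    any qualifier are left out.
    """
    known = set(column_names)
    qualifiers = {}
    for name in column_names:
        for prefix in QUALIFIER_PREFIXES:
            main = name[len(prefix):]
            if name.startswith(prefix) and main in known:
                qualifiers.setdefault(main, []).append(name)
                break
    return {main: [main] + extra for main, extra in qualifiers.items()}


columns = ["Name", "RV1", "e_RV1", "o_RV1", "RV2", "e_RV2", "o_RV2"]
groups = group_columns(columns)
print(groups)
```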

Examples:

LM: Mango parser

This section demonstrates the API prototype for the MANGO parser.

Annotated data samples can be found in dm-usecases/*/mango_based_proposal

  • Prerequisites
    • VOTables are annotated with MANGO using the ModelInstanceInVot syntax.
    • These 2 standards are applied as they were at workshop time. Requested enhancements are not taken into account.
    • The code presented here is a work in progress. It could be made more robust once all components (syntax, models) are more stable.
    • semantic fields are not properly set so far.

Python API.

  • API functions always return Python dictionaries where model objects are referenced by their roles (key = role, value = object)
  • In some cases, human-readable keys are generated either to improve readability or to avoid key duplication.
  • The code snippets below show Python dictionaries as pretty-printed JSON. This is just a convenient way to display the API outputs.
  • The API exposes the binding between model elements and columns (referenced by either ID or rank in the field list), so clients are free to choose where to stand between relying on the mapping and reading the native data, as stated in the use-case section of ModelInstanceInVot:
    • Ignoring the mapping
    • Using the mapping just to know which model(s) has been mapped
    • Taking the meta-data from the model and reading the data in a conventional way
    • Processing the whole VOTable through the mapping layer
  • An API based on dictionaries keyed by model elements makes it very easy to connect with other packages (partially demonstrated with Astropy)
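As an illustration of the third option (taking the metadata from the model and reading the data in a conventional way), the sketch below walks a parameter dictionary and lists the column bindings it contains. It assumes the dictionary layout shown in the outputs on this page, where a binding is a dictionary holding exactly `id` and `index`; the helper `collect_field_bindings` is not part of the API.

```python
def collect_field_bindings(node, path=()):
    """Yield (role_path, column_id, column_index) for every column binding."""
    if isinstance(node, dict):
        if set(node) == {"id", "index"}:
            # Leaf of interest: a reference to a VOTable column.
            yield path, node["id"], node["index"]
        else:
            for key, child in node.items():
                yield from collect_field_bindings(child, path + (key,))


# Excerpt of a position parameter as returned by the API (see the outputs on this page).
parameter = {
    "mango:stcextend.LonLatSkyPosition": {
        "field:latitude": {"id": "_dec_147", "index": 1},
        "field:longitude": {"id": "_ra_146", "index": 0},
    },
    "meas:Error": {
        "field:meas:Symmetrical.radius": {"id": "_poserr_148", "index": 2},
        "unit": "NotSet",
    },
    "measure_type": "mango:stcextend.LonLatSkyPosition",
    "ucd": "pos",
}

bindings = sorted(collect_field_bindings(parameter), key=lambda item: item[2])
for role_path, column_id, column_index in bindings:
    print(column_index, column_id, " / ".join(role_path))
```

A client can then read the columns with any VOTable reader, using either the column IDs or the indexes.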

Reading the MANGO annotation block

from client.parser.mango_browser import MangoBrowser
mango_browser = MangoBrowser(votable_path) 
  • This action has no output.
  • It builds an internal representation of the data-to-model binding.

Getting the list of the mapped parameters

Python

from utils.dict_utils import DictUtils

mango_parameters = mango_browser.get_parameters()
DictUtils.print_pretty_json(mango_parameters)
  • Returns the dictionary of all parameters.
  • Parameters are identified with a key made of the parameter rank + the parameter UCD (e.g. "#0 meta.id;meta.main")

Output

{
  "#0 meta.id;meta.main": {
    "mango:MangoObject.identifier": {
      "id": "namesaada",
      "index": 24
    },
    "measure_type": "mango:MangoObject.identifier"
  },
  "#1 pos": {
    "coord_type": "mango:stcextend.LonLatPoint",
    "coords:SpaceFrame": {
      "@ID": "SpaceFrame_ICRS",
      "@dmtype": "coords:SpaceFrame",
      "coords:SpaceFrame.equinox": {
        "@dmtype": "coords:Epoch",
        "@value": "NoSet"
      },
      "coords:SpaceFrame.refPosition": {
        "@dmtype": "coords:StdRefLocation",
        "coords:StdRefLocation.position": {
          "@dmtype": "ivoa:string",
          "@value": "NoSet"
        }
      },
      "coords:SpaceFrame.spaceRefFrame": {
        "@dmtype": "ivoa:string",
        "@value": "ICRS"
      }
    },
    "coosys_type": "coords:SpaceFrame",
    "description": "Corrected position",
    "error_type": "meas:Error",
    "mango:stcextend.LonLatSkyPosition": {
      "field:latitude": {
        "id": "_dec_147",
        "index": 1
      },
      "field:longitude": {
        "id": "_ra_146",
        "index": 0
      }
    },
    "meas:Error": {
      "field:meas:Symmetrical.radius": {
        "id": "_poserr_148",
        "index": 2
      },
      "unit": "NotSet"
    },
    "measure_type": "mango:stcextend.LonLatSkyPosition",
    "semantic": "#position.corrected",
    "ucd": "pos"
  },
...
}

(truncated output)
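Since the keys combine a rank and a UCD, a client may want to split them or to search them by UCD word. The helpers below are a sketch based only on the key format described above ("#rank", a space, then the UCD); they are not part of the API.

```python
def split_parameter_key(key):
    """Split a key such as "#0 meta.id;meta.main" into (rank, ucd)."""
    rank_part, _, ucd = key.partition(" ")
    if not rank_part.startswith("#") or not rank_part[1:].isdigit():
        raise ValueError(f"unexpected parameter key: {key!r}")
    return int(rank_part[1:]), ucd


def keys_with_ucd_word(keys, word):
    """Return the keys whose UCD contains the given word (UCD words are ';'-separated)."""
    return [key for key in keys if word in split_parameter_key(key)[1].split(";")]


keys = ["#0 meta.id;meta.main", "#1 pos", "#12 time;obs.start"]
print([split_parameter_key(key) for key in keys])
print(keys_with_ucd_word(keys, "pos"))
```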

Getting the space coordinate system of a specific mapped parameter

  • The searched parameter is identified by its key in the parameter list.
  • This identifier can be used for different purposes
    • retrieving the complete description of the parameter
    • retrieving the parameter associated with a particular data cell

Python

space_coosys = mango_browser.get_param_coordsys("#1 pos")
DictUtils.print_pretty_json(space_coosys)

Output

{
  "@ID": "SpaceFrame_ICRS",
  "@dmtype": "coords:SpaceFrame",
  "coords:SpaceFrame.equinox": {
    "@dmtype": "coords:Epoch",
    "@value": "NoSet"
  },
  "coords:SpaceFrame.refPosition": {
    "@dmtype": "coords:StdRefLocation",
    "coords:StdRefLocation.position": {
      "@dmtype": "ivoa:string",
      "@value": "NoSet"
    }
  },
  "coords:SpaceFrame.spaceRefFrame": {
    "@dmtype": "ivoa:string",
    "@value": "ICRS"
  }
}

Getting the Astropy counterpart of that coordinate system

print(mango_browser.get_astropy_space_frame("#1 pos"))
<ICRS Frame>
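The conversion itself is done by the parser; the sketch below only shows how the relevant values could be read from the dictionary before being handed to Astropy. It assumes that "NoSet" (and the "NotSet" variant seen in other outputs) marks an unset value. The helper names are illustrative and Astropy is deliberately not imported.

```python
# Placeholders assumed to mean "no value" in the API outputs.
UNSET = {"NoSet", "NotSet"}


def leaf_value(node):
    """Return the @value of a leaf dictionary, or None when absent or unset."""
    if not isinstance(node, dict):
        return None
    value = node.get("@value")
    return None if value in UNSET else value


def space_frame_summary(space_frame):
    """Extract the frame name (lower case) and the equinox from a coords:SpaceFrame."""
    name = leaf_value(space_frame.get("coords:SpaceFrame.spaceRefFrame"))
    equinox = leaf_value(space_frame.get("coords:SpaceFrame.equinox"))
    # Lower-case names such as "icrs" are what SkyCoord accepts as frame="icrs".
    return {"frame": name.lower() if name else None, "equinox": equinox}


space_frame = {
    "@ID": "SpaceFrame_ICRS",
    "@dmtype": "coords:SpaceFrame",
    "coords:SpaceFrame.equinox": {"@dmtype": "coords:Epoch", "@value": "NoSet"},
    "coords:SpaceFrame.spaceRefFrame": {"@dmtype": "ivoa:string", "@value": "ICRS"},
}
summary = space_frame_summary(space_frame)
print(summary)
```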

Getting the time coordinate system of a specific mapped parameter

Python

time_coosys = mango_browser.get_param_coordsys("#12 time;obs.start")
DictUtils.print_pretty_json(time_coosys)

Output

{
  "@ID": "TimeFrame_BARYCENTER",
  "@dmtype": "coords:TimeFrame",
  "coords:TimeFrame.refPosition": {
    "@dmtype": "coords:StdRefLocation",
    "coords:StdRefLocation.position": {
      "@dmtype": "ivoa:string",
      "@value": "BARYCENTER"
    }
  },
  "coords:TimeFrame.timescale": {
    "@dmtype": "ivoa:string",
    "@value": "TCB"
  }
}

Getting the Astropy counterpart of that coordinate system

Since the time frame is not an Astropy object, it is returned as a tuple (scale, location, format).

print(mango_browser.get_astropy_time_frame("#12 time;obs.start"))
('tcb', None, 'mjd')

Getting the position data

The parameters to be fetched are selected by either measure_type (model class name) or by ucd.

  • If no selector is set, all mapped data are returned.

Python

mango_data = mango_browser.get_data(measure_type="mango:stcextend.LonLatSkyPosition")
DictUtils.print_pretty_json(mango_data)

Output

Returns an object containing all requested data with their model references.

This method can be used either to read all data at once or to store the column-to-model binding in an easy way for further use (e.g. streaming readout).

Only the first row is currently read in order to limit the output size.

  • data: one array per row. The value order matches the model mapping not the column number
  • head: Reference to the model element matching the corresponding columns. This field contains the parameter key in [] in order to easily retrieve the complete description
  • selected_index: Gives the actual column number of each listed quantity.
{
  "data": [
    [
      340.91055060369,
      -17.071667101891,
      1.50765
    ]
  ],
  "head": [
    "field:longitude [#1 pos]",
    "field:latitude [#1 pos]",
    "error: field:meas:Symmetrical.radius [#1 pos]"
  ],
  "selected_index": [
    0,
    1,
    2
  ]
}
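The three entries are designed to be used together. As a sketch of both usages mentioned above, the helpers below turn the returned rows into dictionaries keyed by `head`, and apply `selected_index` to a raw VOTable row for a streaming readout. The helper names and the extra value in the raw row are illustrative.

```python
def rows_as_records(mango_data):
    """Pair each value with the model reference given at the same position in 'head'."""
    return [dict(zip(mango_data["head"], row)) for row in mango_data["data"]]


def select_from_raw_row(raw_row, mango_data):
    """Reorder a raw table row (column order) into the model-mapping order."""
    return [raw_row[index] for index in mango_data["selected_index"]]


mango_data = {
    "data": [[340.91055060369, -17.071667101891, 1.50765]],
    "head": [
        "field:longitude [#1 pos]",
        "field:latitude [#1 pos]",
        "error: field:meas:Symmetrical.radius [#1 pos]",
    ],
    "selected_index": [0, 1, 2],
}

records = rows_as_records(mango_data)
print(records[0]["field:latitude [#1 pos]"])

# A raw row read later by any VOTable reader; the last value is a made-up extra column.
raw_row = [340.91055060369, -17.071667101891, 1.50765, "unmapped"]
selected = select_from_raw_row(raw_row, mango_data)
print(selected)
```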

Reading Associated Data

Mango objects are made with 2 data docks:

  • one for the parameters (see above)
  • one for associated data (detections, spectra, ....)

In this example, Mango objects (table rows) are associated with the Provenance of the catalog they belong to.

Python

associated_data = mango_browser.get_associated_data()
DictUtils.print_pretty_json(associated_data)

Output

{
  "#1 mango:WebEndpoint": {
    "data_type": "mango:WebEndpoint",
    "description": "Complete VizieR catalogue Provenance",
    "mango:WebEndpoint": {
      "ContentType": "text/xml",
      "url": "https://cdsarc.unistra.fr/viz-bin/provenance?cat=IX/45&filter=true"
    },
    "semantic": "computed"
  },
  "#2 mango:WebEndpoint": {
    "data_type": "mango:WebEndpoint",
    "description": "Complete VizieR catalogue Provenance",
    "mango:WebEndpoint": {
      "ContentType": "image/png",
      "url": "https://cdsarc.unistra.fr/viz-bin/provenance?cat=IX/45&filter=true&out=prov:png"
    },
    "semantic": "computed"
  }
}
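A client interested in only one representation can filter the endpoints on their content type. This is a sketch over the dictionary shown above; the helper `endpoints_by_content_type` is not part of the API.

```python
def endpoints_by_content_type(associated_data, content_type):
    """Return the URLs of the mango:WebEndpoint entries having this content type."""
    urls = []
    for entry in associated_data.values():
        if entry.get("data_type") != "mango:WebEndpoint":
            continue
        endpoint = entry["mango:WebEndpoint"]
        if endpoint.get("ContentType") == content_type:
            urls.append(endpoint["url"])
    return urls


associated_data = {
    "#1 mango:WebEndpoint": {
        "data_type": "mango:WebEndpoint",
        "description": "Complete VizieR catalogue Provenance",
        "mango:WebEndpoint": {
            "ContentType": "text/xml",
            "url": "https://cdsarc.unistra.fr/viz-bin/provenance?cat=IX/45&filter=true",
        },
        "semantic": "computed",
    },
    "#2 mango:WebEndpoint": {
        "data_type": "mango:WebEndpoint",
        "description": "Complete VizieR catalogue Provenance",
        "mango:WebEndpoint": {
            "ContentType": "image/png",
            "url": "https://cdsarc.unistra.fr/viz-bin/provenance?cat=IX/45&filter=true&out=prov:png",
        },
        "semantic": "computed",
    },
}

xml_urls = endpoints_by_content_type(associated_data, "text/xml")
print(xml_urls)
```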

Reading Associated Parameter

Mango makes it possible to put together parameters that are related to each other:

Python

In this example, X-ray fluxes are associated with their bounds.

mango_parameters = mango_browser.get_parameters()
DictUtils.print_pretty_json(mango_parameters)

mango_data = mango_browser.get_data()
DictUtils.print_pretty_json(mango_data)

Output

 ...
 "#2 phot.flux;em.X-ray": {
    "associatedParameters": {
      "#1 stat.fit.param": {
        "coord_type": "meas:GenericMeasure.coord",
        "description": "lower bound (minimal value)",
        "meas:GenericMeasure": {
          "field:value": {
            "id": "b_Fb",
            "index": 2
          }
        },
        "measure_type": "meas:GenericMeasure",
        "semantic": "native",
        "ucd": "stat.fit.param"
      },
      "#2 stat.fit.param": {
        "coord_type": "meas:GenericMeasure.coord",
        "description": "upper bound (maximal value)",
        "meas:GenericMeasure": {
          "field:value": {
            "id": "B_Fb",
            "index": 3
          }
        },
        "measure_type": "meas:GenericMeasure",
        "semantic": "native",
        "ucd": "stat.fit.param"
      }
    },
    "coord_type": "meas:GenericMeasure.coord",
    "description": "main column",
    "meas:GenericMeasure": {
      "field:value": {
        "id": "Fb",
        "index": 4
      }
    },
    "measure_type": "meas:GenericMeasure",
    "semantic": "native",
    "ucd": "phot.flux;em.X-ray"
  }
...

Associated parameters are tagged as relating to their host parameter (e.g. (phot.flux;em.X-ray native main column)->field:value [#2 phot.flux;em.X-ray])

 ...
{
  "data": [
    [
      "J000000.0-093415",
      0.0,
      -9.57106,
      3.1499999955447375e-14,
      2.1800000566123308e-14,
      4.2899999552108506e-14
    ]
  ],
  "head": [
    "mango:MangoObject.identifier",
    "field:longitude [#1 pos]",
    "field:latitude [#1 pos]",
    "field:value [#2 phot.flux;em.X-ray]",
    "(phot.flux;em.X-ray native main column)->field:value [#2 phot.flux;em.X-ray]",
    "(phot.flux;em.X-ray native main column)->field:value [#2 phot.flux;em.X-ray]"
  ],
  "selected_index": [
    5,
    0,
    1,
    4,
    2,
    3
  ]
}
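Because both bounds carry the same label in `head`, a client that needs to tell them apart can go back to the parameter dictionary and use the descriptions and column indexes. The sketch below does that for one raw row given in column order (rebuilt here from `data` and `selected_index` above); the helper `value_with_associated` is illustrative.

```python
def value_with_associated(parameter, raw_row):
    """Read a measure and its associated parameters from a raw row (column order)."""

    def read(node):
        # The measure content is stored under the key named by 'measure_type'.
        for role, binding in node[node["measure_type"]].items():
            if role.startswith("field:"):
                return raw_row[binding["index"]]
        return None

    result = {"value": read(parameter)}
    for associated in parameter.get("associatedParameters", {}).values():
        result[associated["description"]] = read(associated)
    return result


flux = {
    "associatedParameters": {
        "#1 stat.fit.param": {
            "description": "lower bound (minimal value)",
            "meas:GenericMeasure": {"field:value": {"id": "b_Fb", "index": 2}},
            "measure_type": "meas:GenericMeasure",
        },
        "#2 stat.fit.param": {
            "description": "upper bound (maximal value)",
            "meas:GenericMeasure": {"field:value": {"id": "B_Fb", "index": 3}},
            "measure_type": "meas:GenericMeasure",
        },
    },
    "description": "main column",
    "meas:GenericMeasure": {"field:value": {"id": "Fb", "index": 4}},
    "measure_type": "meas:GenericMeasure",
}

# Raw row in column order: columns 0..5.
raw_row = [0.0, -9.57106, 2.1800000566123308e-14, 4.2899999552108506e-14,
           3.1499999955447375e-14, "J000000.0-093415"]

flux_values = value_with_associated(flux, raw_row)
print(flux_values)
```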

Architecture Notes

The MANGO parser is a two-layer library:

  1. The mapping processor: a model-agnostic layer translating the mapping block into an internal representation and connecting it with the VOTable API. The mapping processor API is rather developer oriented.
  2. MANGO parser: See above

TODO: This code makes extensive use of string constants for both mapping and model elements. These strings, so far duplicated throughout the code, must be grouped in a single dictionary.
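One possible shape for such a dictionary is sketched below. The symbolic names and the helper are hypothetical; only the string values are taken from the outputs on this page. A read-only mapping is used so that the constants cannot be altered by accident.

```python
from types import MappingProxyType

# Single place for the model strings (symbolic names are hypothetical).
MODEL_KEYS = MappingProxyType({
    "IDENTIFIER": "mango:MangoObject.identifier",
    "SKY_POSITION": "mango:stcextend.LonLatSkyPosition",
    "GENERIC_MEASURE": "meas:GenericMeasure",
    "ERROR": "meas:Error",
    "SPACE_FRAME": "coords:SpaceFrame",
    "TIME_FRAME": "coords:TimeFrame",
    "WEB_ENDPOINT": "mango:WebEndpoint",
})


def is_sky_position(parameter):
    """Example of a test written against the shared constants."""
    return parameter.get("measure_type") == MODEL_KEYS["SKY_POSITION"]


print(is_sky_position({"measure_type": "mango:stcextend.LonLatSkyPosition"}))
```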
