Merge pull request #345 from casangi/batch_msv4_review_fixes_improvements_326_328_334_343_344

Fixes and improvements in docs and schema from MSv4 review, batch 326 328 334 343 344
Jan-Willem authored Dec 9, 2024
2 parents 98f313c + a519134 commit cb1e83b
Showing 8 changed files with 1,262 additions and 609 deletions.
6 changes: 3 additions & 3 deletions docs/source/index.rst
@@ -23,9 +23,9 @@ Review Timeline

- **October 16, 2024**: Documentation release and community notification.
- **November 25, 2024**: Deadline for feedback via GitHub issues.
- **TBD **: Panel input for final agenda.
- **TBD **: Review meeting (3 half-days).
- **TBD **: Submission of review panel report.
- **TBD**: Panel input for final agenda.
- **TBD**: Review meeting (3 half-days).
- **TBD**: Submission of review panel report.

The report will be used by the XRADIO team to finalize the Measurement Set v4.0.0 schema and API.

1,397 changes: 1,077 additions & 320 deletions docs/source/measurement_set/guides/ALMA_SD.ipynb

Large diffs are not rendered by default.

301 changes: 90 additions & 211 deletions docs/source/measurement_set/guides/GBT_single_dish.ipynb

Large diffs are not rendered by default.

@@ -217,6 +217,10 @@ Measure arrays

.. xradio_array_schema_table:: xradio.measurement_set.schema.SkyCoordArray

.. autoclass:: xradio.measurement_set.schema.PointingBeamArray()

.. xradio_array_schema_table:: xradio.measurement_set.schema.PointingBeamArray

.. autoclass:: xradio.measurement_set.schema.LocalSkyCoordArray()

.. xradio_array_schema_table:: xradio.measurement_set.schema.LocalSkyCoordArray
2 changes: 1 addition & 1 deletion docs/source/measurement_set_overview.ipynb
@@ -20,7 +20,7 @@
"\n",
"## Schema Layout\n",
"\n",
"An xarray dataset conforming to the MSv4 schema contains data for a single observation, spectral window, polarization setup, observation mode, processor and beam per antenna (though finer partitioning, such as splitting by scan, is allowed if desired). This structure simplifies the MS v4 data representation relative to the MS v2, enabling it to be stored as n-dimensional arrays with consistent shapes over time (rare baseline dropouts are handled by NaN padding). Related datasets can be grouped together into a Processing Set (`ps`), which is useful for processing them together. Importantly, each MS v4 is fully self-describing. As shown in [Figure 1](#figure-1) (a simplified diagram; for full details, see the [Data Model Schema](measurement_set/schema_and_api/measurement_set_schema.rst)), the MS v4 is structured as a dataset (xds) of datasets comprising the `correlated_xds` along with `antenna_xds`, `pointing_xds`, `phase_calibration_xds`, `weather_xds`, `system_calibration_xds`, `gain_curve_xds`, and `phased_array_xds`, all stored in the attribute section. The `correlated_xds` contains the `VISIBILITY` (for interferometer data) or `SPECTRUM` (for single dish data), `UVW`, `WEIGHT`, and `FLAGS` data variables, along with info dictionaries in the attributes. The `field_and_source_xds` is specifically stored within the attributes of the `VISIBILITY`/`SPECTRUM` data variable.\n",
"An xarray dataset conforming to the MSv4 schema contains data for a single observation, spectral window, polarization setup, observation mode, processor and beam per antenna (though finer partitioning, such as splitting by scan or antenna, is allowed if desired). This structure simplifies the MS v4 data representation relative to the MS v2, enabling it to be stored as n-dimensional arrays with consistent shapes over time (rare baseline dropouts are handled by NaN padding). Related datasets can be grouped together into a Processing Set (`ps`), which is useful for processing them together. Importantly, each MS v4 is fully self-describing. As shown in [Figure 1](#figure-1) (a simplified diagram; for full details, see the [Data Model Schema](measurement_set/schema_and_api/measurement_set_schema.rst)), the MS v4 is structured as a dataset (xds) of datasets comprising the `correlated_xds` along with `antenna_xds`, `pointing_xds`, `phase_calibration_xds`, `weather_xds`, `system_calibration_xds`, `gain_curve_xds`, and `phased_array_xds`, all stored in the attribute section. The `correlated_xds` contains the `VISIBILITY` (for interferometer data) or `SPECTRUM` (for single dish data), `UVW`, `WEIGHT`, and `FLAGS` data variables, along with info dictionaries in the attributes. The `field_and_source_xds` is specifically stored within the attributes of the `VISIBILITY`/`SPECTRUM` data variable.\n",
"\n",
"<div style=\"text-align: center;\">\n",
" <figure id=\"figure-1\" style=\"display: inline-block;\">\n",
74 changes: 37 additions & 37 deletions src/xradio/measurement_set/_utils/_msv2/conversion.py
@@ -1029,7 +1029,7 @@ def get_observation_info(in_file, observation_id, intents):
datetime.timezone.utc
).isoformat(),
"xradio_version": importlib.metadata.version("xradio"),
"schema_version": "4.0.-9994",
"schema_version": "4.0.-9991",
"type": "visibility",
}
)
@@ -1399,58 +1399,58 @@ def antenna_ids_to_names(
return xds


def add_group_to_data_groups(
data_groups: dict, what_group: str, correlated_data_name: str, uvw: bool = True
):
"""
Adds one correlated_data variable to the data_groups dict.
A utility function to use when creating/updating data_groups from MSv2 data columns
/ data variables.
Parameters
----------
data_groups: dict
The data_groups dict of an MSv4 xds. It is updated in-place
what_group: str
Name of the data group: "base", "corrected", "model", etc.
correlated_data_name: str
Name of the correlated_data var: "VISIBILITY", "VISIBILITY_CORRECTED", "SPECTRUM", etc.
uvw: bool
Whether to add a uvw field to the data group (assume True = interferometric data).
"""
data_groups[what_group] = {
"correlated_data": correlated_data_name,
"flag": "FLAG",
"weight": "WEIGHT",
}
if uvw:
data_groups[what_group]["uvw"] = "UVW"


def add_data_groups(xds):
xds.attrs["data_groups"] = {}

data_groups = xds.attrs["data_groups"]
if "VISIBILITY" in xds:
xds.attrs["data_groups"]["base"] = {
"correlated_data": "VISIBILITY",
"flag": "FLAG",
"weight": "WEIGHT",
"uvw": "UVW",
}
add_group_to_data_groups(data_groups, "base", "VISIBILITY")

if "VISIBILITY_CORRECTED" in xds:
xds.attrs["data_groups"]["corrected"] = {
"correlated_data": "VISIBILITY_CORRECTED",
"flag": "FLAG",
"weight": "WEIGHT",
"uvw": "UVW",
}
add_group_to_data_groups(data_groups, "corrected", "VISIBILITY_CORRECTED")

if "VISIBILITY_MODEL" in xds:
xds.attrs["data_groups"]["model"] = {
"correlated_data": "VISIBILITY_MODEL",
"flag": "FLAG",
"weight": "WEIGHT",
"uvw": "UVW",
}
add_group_to_data_groups(data_groups, "model", "VISIBILITY_MODEL")

is_single_dish = False
if "SPECTRUM" in xds:
xds.attrs["data_groups"]["base"] = {
"correlated_data": "SPECTRUM",
"flag": "FLAG",
"weight": "WEIGHT",
"uvw": "UVW",
}
add_group_to_data_groups(data_groups, "base", "SPECTRUM", False)
is_single_dish = True

if "SPECTRUM_MODEL" in xds:
xds.attrs["data_groups"]["model"] = {
"correlated_data": "SPECTRUM_MODEL",
"flag": "FLAG",
"weight": "WEIGHT",
"uvw": "UVW",
}
add_group_to_data_groups(data_groups, "model", "SPECTRUM_MODEL", False)
is_single_dish = True

if "SPECTRUM_CORRECTED" in xds:
xds.attrs["data_groups"]["corrected"] = {
"correlated_data": "SPECTRUM_CORRECTED",
"flag": "FLAG",
"weight": "WEIGHT",
"uvw": "UVW",
}
add_group_to_data_groups(data_groups, "corrected", "SPECTRUM_CORRECTED", False)
is_single_dish = True

return xds, is_single_dish
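
The refactor in this hunk can be hard to follow because old and new lines are interleaved. The sketch below restates the post-commit logic in compact form. `add_group_to_data_groups` follows the added function; `build_data_groups` is a hypothetical stand-in for `add_data_groups` that takes a set of variable names instead of an xarray dataset, so it runs without xradio installed. It is an illustration of the behaviour, not the code in `conversion.py`.

```python
def add_group_to_data_groups(data_groups, what_group, correlated_data_name, uvw=True):
    """Add one correlated-data variable to the data_groups dict (updated in place)."""
    data_groups[what_group] = {
        "correlated_data": correlated_data_name,
        "flag": "FLAG",
        "weight": "WEIGHT",
    }
    if uvw:
        # Only interferometric (VISIBILITY*) groups reference a UVW variable.
        data_groups[what_group]["uvw"] = "UVW"


def build_data_groups(variable_names):
    """Hypothetical helper: same decisions as add_data_groups, on a set of names."""
    data_groups = {}
    is_single_dish = False
    for group, suffix in (("base", ""), ("corrected", "_CORRECTED"), ("model", "_MODEL")):
        if "VISIBILITY" + suffix in variable_names:
            add_group_to_data_groups(data_groups, group, "VISIBILITY" + suffix)
        if "SPECTRUM" + suffix in variable_names:
            # Single-dish data: no uvw entry in the group.
            add_group_to_data_groups(data_groups, group, "SPECTRUM" + suffix, uvw=False)
            is_single_dish = True
    return data_groups, is_single_dish


groups, is_single_dish = build_data_groups({"SPECTRUM", "SPECTRUM_CORRECTED"})
print(groups["base"], is_single_dish)
```

The behavioural change relative to the removed code is that the `SPECTRUM*` groups no longer carry a `"uvw": "UVW"` entry.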
5 changes: 3 additions & 2 deletions src/xradio/measurement_set/convert_msv2_to_processing_set.py
@@ -76,8 +76,9 @@ def convert_msv2_to_processing_set(
partition_scheme : list, optional
A MS v4 can only contain a single data description (spectral window and polarization setup), and observation mode. Consequently, the MS v2 is partitioned when converting to MS v4.
In addition to data description and polarization setup a finer partitioning is possible by specifying a list of partitioning keys. Any combination of the following keys are possible:
"FIELD_ID", "SCAN_NUMBER", "STATE_ID", "SOURCE_ID", "SUB_SCAN_NUMBER". For mosaics where the phase center is rapidly changing (such as VLA on the fly mosaics)
partition_scheme should be set to an empty list []. By default, ["FIELD_ID"].
"FIELD_ID", "SCAN_NUMBER", "STATE_ID", "SOURCE_ID", "SUB_SCAN_NUMBER", "ANTENNA1".
"ANTENNA1" is intended as a single-dish specific partitioning option.
For mosaics where the phase center is rapidly changing (such as VLA on the fly mosaics) partition_scheme should be set to an empty list []. By default, ["FIELD_ID"].
main_chunksize : Union[Dict, float, None], optional
Defines the chunk size of the main dataset. If given as a dictionary, defines the sizes of several dimensions, and acceptable keys are "time", "baseline_id", "antenna_id", "frequency", "polarization". If given as a float, gives the size of a chunk in GiB. By default, None.
with_pointing : bool, optional
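
To illustrate what the `partition_scheme` keys do, here is a minimal sketch that groups rows by a fixed pair of keys plus the user-supplied scheme. Everything in it is an assumption made for illustration: `partition_rows`, `ALWAYS_SPLIT`, and the row dictionaries are hypothetical, and the real converter works on MSv2 tables rather than Python dicts.

```python
from collections import defaultdict

# Simplified stand-ins for "data description" and "observation mode",
# which the docstring says always split the MS v2.
ALWAYS_SPLIT = ("DATA_DESC_ID", "OBS_MODE")


def partition_rows(rows, partition_scheme=("FIELD_ID",)):
    """Group rows by the always-on keys plus the keys in partition_scheme."""
    keys = ALWAYS_SPLIT + tuple(partition_scheme)
    partitions = defaultdict(list)
    for row in rows:
        partitions[tuple(row[k] for k in keys)].append(row)
    return dict(partitions)


rows = [
    {"DATA_DESC_ID": 0, "OBS_MODE": "OBSERVE_TARGET#ON_SOURCE", "FIELD_ID": f, "ANTENNA1": a}
    for f in (0, 1)
    for a in (0, 1)
]

print(len(partition_rows(rows, [])))            # no extra partitioning
print(len(partition_rows(rows, ["ANTENNA1"])))  # one partition per antenna
```

With an empty scheme all four rows land in one partition; adding `"ANTENNA1"` splits them per antenna, which is the single-dish use case the docstring mentions.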
82 changes: 47 additions & 35 deletions src/xradio/measurement_set/schema.py
@@ -125,7 +125,7 @@ class QuantityInHertzArray:
@xarray_dataarray_schema
class QuantityInMetersArray:
"""
Quantity with units of Hertz
Quantity with units of meters
"""

data: Data[ZD, float]
@@ -137,7 +137,7 @@ class QuantityInMetersArray:
@xarray_dataarray_schema
class QuantityInMetersPerSecondArray:
"""
Quantity with units of Hertz
Quantity with units of meters per second
"""

data: Data[ZD, float]
@@ -149,7 +149,7 @@ class QuantityInMetersPerSecondArray:
@xarray_dataarray_schema
class QuantityInRadiansArray:
"""
Quantity with units of Hertz
Quantity with units of radians
"""

data: Data[ZD, float]
@@ -304,6 +304,31 @@ class SkyCoordArray:
"""


@xarray_dataarray_schema
class PointingBeamArray:
"""Pointing beam data array in :py:class:`PointingXds`."""

data: Data[
Union[
tuple[Time, AntennaName, LocalSkyDirLabel],
tuple[TimePointing, AntennaName, LocalSkyDirLabel],
tuple[Time, AntennaName, LocalSkyDirLabel, nPolynomial],
tuple[TimePointing, AntennaName, LocalSkyDirLabel, nPolynomial],
],
numpy.float64,
]

type: Attr[SkyCoord] = "sky_coord"
units: Attr[UnitsOfSkyCoordInRadians] = ("rad", "rad")
frame: Attr[AllowedSkyCoordFrames] = "fk5"
"""
From fixvis docs: clean and the im tool ignore the reference frame claimed by the UVW column (it is often mislabelled
as ITRF when it is really FK5 (J2000)) and instead assume the (u, v, w)s are in the same frame as the phase tracking
center. calcuvw does not yet force the UVW column and field centers to use the same reference frame! Blank = use the
phase tracking frame of vis.
"""


@xarray_dataarray_schema
class LocalSkyCoordArray:
"""Measures array for the arrays that have coordinate local_sky_dir_label in :py:class:`PointingXds`"""
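
The `Union` in `PointingBeamArray` allows four dimension layouts. The sketch below spells them out as plain tuples and checks a candidate against them. The dimension strings are an assumption: they are the lowercase coordinate names that appear elsewhere in this schema (`time`, `time_pointing`, `antenna_name`, `local_sky_dir_label`, `n_polynomial`), and `is_valid_pointing_beam_dims` is a hypothetical helper, not part of xradio's schema checker.

```python
# The four layouts allowed by the Union in PointingBeamArray,
# written with assumed dimension names.
ALLOWED_POINTING_BEAM_DIMS = {
    ("time", "antenna_name", "local_sky_dir_label"),
    ("time_pointing", "antenna_name", "local_sky_dir_label"),
    ("time", "antenna_name", "local_sky_dir_label", "n_polynomial"),
    ("time_pointing", "antenna_name", "local_sky_dir_label", "n_polynomial"),
}


def is_valid_pointing_beam_dims(dims):
    """Return True if dims matches one of the allowed layouts exactly (order matters)."""
    return tuple(dims) in ALLOWED_POINTING_BEAM_DIMS


print(is_valid_pointing_beam_dims(["time_pointing", "antenna_name", "local_sky_dir_label"]))
print(is_valid_pointing_beam_dims(["time", "antenna_name"]))
```

The second call reflects the old inline definition of `POINTING_BEAM`, which listed the dimensions without `local_sky_dir_label`.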
@@ -660,7 +685,7 @@ class FrequencyArray:
"""Frequency coordinate in the main dataset."""

data: Data[Frequency, float]
""" Time, expressed in SI seconds since the epoch. """
""" Center frequencies for each channel. """
spectral_window_name: Attr[str]
""" Name associated with spectral window. """
frequency_group_name: Optional[Attr[str]]
@@ -698,7 +723,7 @@ class FrequencyCalArray:
only measures data, as opposed to the frequency array of the main dataset."""

data: Data[FrequencyCal, float]
""" Time, expressed in SI seconds since the epoch. """
""" Center frequencies for each channel. """

type: Attr[SpectralCoord] = "spectral_coord"
units: Attr[UnitsHertz] = ("Hz",)
@@ -1188,7 +1213,10 @@ class PartitionInfoDict:
""" List of source names. """
# source_id: missing / remove for good?
intents: list[str]
""" Infromation in obs_mode column of MSv2 State table. """
""" An intent string identifies one intention of the scan, such as to calibrate or observe a
target. See :ref:`scan intents` for possible values. When converting from MSv2, the list of
intents is derived from the OBS_MODE column of MSv2 state table (every comma separated value
is taken as an intent). """
taql: Optional[str]
""" The taql query used if converted from MSv2. """
line_name: list[str]
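
The new `intents` docstring says every comma-separated value of the MSv2 `OBS_MODE` string becomes one intent. A minimal sketch of that rule follows. `intents_from_obs_mode` is a hypothetical name, the whitespace stripping is my own assumption, and the sample string is only illustrative; the actual conversion code in xradio may differ.

```python
def intents_from_obs_mode(obs_mode):
    """Split an MSv2 OBS_MODE string into a list of intent strings."""
    # Assumption: surrounding whitespace and empty entries are dropped.
    return [value.strip() for value in obs_mode.split(",") if value.strip()]


print(intents_from_obs_mode("CALIBRATE_PHASE#ON_SOURCE,CALIBRATE_WVR#ON_SOURCE"))
```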
@@ -1469,7 +1497,7 @@ class WeatherXds:
time: Optional[Coordof[TimeInterpolatedCoordArray]]
""" Mid-point of the time interval. Labeled 'time' when interpolated to main time axis """
time_weather: Optional[Coordof[TimeWeatherCoordArray]]
""" Mid-point of the time interval. Labeled 'time_cal' when not interpolated to main time axis """
""" Mid-point of the time interval. Labeled 'time_weather' when not interpolated to main time axis """
antenna_name: Optional[Coordof[AntennaNameArray]]
""" Antenna identifier """
ellipsoid_pos_label: Optional[Coord[EllipsoidPosLabel, str]] = (
@@ -1574,7 +1602,7 @@ class PointingXds:
"""
Pointing dataset: antenna pointing information.
In the past the relationship and definition of the pointing infromation has not been clear. Here we attempt to clarify it by explaining the relationship between the ASDM, MSv2 and MSv4 pointing information.
In the past the relationship and definition of the pointing information has not been clear. Here we attempt to clarify it by explaining the relationship between the ASDM, MSv2 and MSv4 pointing information.
The following abbreviations are used:
@@ -1615,29 +1643,31 @@ class PointingXds:
Direction labels.
"""

POINTING_BEAM: Data[
Union[
tuple[Time, AntennaName],
tuple[TimePointing, AntennaName],
tuple[Time, AntennaName, nPolynomial],
tuple[TimePointing, AntennaName, nPolynomial],
],
LocalSkyCoordArray,
]
POINTING_BEAM: Dataof[PointingBeamArray]
"""
The direction of the peak response of the beam, equivalent to the MSv2 DIRECTION (M2_direction) with_pointing_correction=True, optionally expressed as polynomial coefficients.
"""

# Optional coords:

time: Optional[Coordof[TimeInterpolatedCoordArray]] = None
"""
Mid-point of the time interval for which the information in this row is
valid. Required to use the same time measure reference as in visibility dataset.
Labeled 'time' when interpolating to main time axis.
"""

time_pointing: Optional[Coordof[TimePointingCoordArray]] = None
""" Midpoint of time for which this set of parameters is accurate. Labeled
'time_pointing' when not interpolating to main time axis """

n_polynomial: Optional[Coord[nPolynomial, numpy.int64]] = None
"""
Polynomial index, when using polynomial coefficients to specify POINTING_BEAM
"""

# Optional data vars:

POINTING_DISH_MEASURED: Optional[
Data[
Union[
@@ -1901,15 +1931,6 @@ class VisibilityXds:
xradio_version: Optional[Attr[str]] = None
""" Version of XRADIO used if converted from MSv2. """

intent: Optional[Attr[str]] = None
"""Identifies the intention of the scan, such as to calibrate or observe a
target. See :ref:`scan intents` for possible values.
"""
data_description_id: Optional[Attr[str]] = None
"""
The id assigned to this combination of spectral window and polarization setup.
"""


@xarray_dataset_schema
class SpectrumXds:
@@ -2001,12 +2022,3 @@ class SpectrumXds:

xradio_version: Optional[Attr[str]] = None
""" Version of XRADIO used if converted from MSv2. """

intent: Optional[Attr[str]] = None
"""Identifies the intention of the scan, such as to calibrate or observe a
target. See :ref:`scan intents` for possible values.
"""
data_description_id: Optional[Attr[str]] = None
"""
The id assigned to this combination of spectral window and polarization setup.
"""
