RangeSet encoding for CIS 1.1 XML #23

Open
jerstlouis opened this issue Nov 14, 2021 · 6 comments

@jerstlouis
Member

jerstlouis commented Nov 14, 2021

Are XML / GeneralGridCoverage in use anywhere?

The <V></V> elements in the current examples seem overly verbose.
The V does not seem to refer to the names of the attributes either, so it is really useless bloat.

The GML 1.0 examples use a gml:DataBlock and gml:tupleList, with a comma delimiting attribute values (Data Record fields) within the same Data Record (range value) and spaces delimiting range values / DataRecords.
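To illustrate the delimiter convention described above, here is a minimal sketch of splitting such a tuple list into records and fields. The function name parse_tuple_list and the sample input are hypothetical, and the sketch assumes numeric values with the separators as described (comma between fields, whitespace between records).

```python
def parse_tuple_list(text, field_sep=","):
    """Split tupleList-style character data into records of float fields.

    Assumption: records are separated by whitespace and the fields of one
    record by `field_sep`, as described for the examples in this thread.
    """
    return [
        [float(field) for field in record.split(field_sep)]
        for record in text.split()  # split() with no argument splits on any whitespace
    ]


# Three range values (DataRecords), each with three fields.
records = parse_tuple_list("1,2,3 4,5,6 7,8,9")
print(records)
```

A coverage with a single field per range value degenerates to plain space-separated values, which the same function handles.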

EDIT: I discovered from the PointCloud example that cis:CV is used to separate DataRecords, but only when the DataRecord contains multiple attributes.

@joanma747

@pebau
Contributor

pebau commented Nov 14, 2021 via email

@jerstlouis
Member Author

jerstlouis commented Nov 14, 2021

@pebau

happy to see how finally the coverage specs are read even by coverage spec writers :-)

I have been trying to read and understand those specs spending a very considerable amount of time over several years, but they are very lengthy and difficult to grasp and follow. Hence why I propose a simplification / re-organization to make them more accessible.

By microsyntax, do you mean the spaces and commas used to separate attribute values inside XML character data in GMLCOV?

I am not clear on what this has to do with a "common modeling space". In the CIS JSON encoding of the RangeSet (which also had ambiguous examples about how one should encode a RangeSet), the values are all directly flattened into a single JSON array. Clear JSON / XML encoding rules could be established describing how a UML model should be translated to a particular encoding, as we have discussed with @joanma747 in the context of the 2DTMS revision. Such a rule could say something like "to serialize a large number of values in a compact manner (whether of a simple type or of multiple instances of the same composed type), XML character data can be used with spaces as a separator, while in JSON an array is used, where a set of composed values are expanded into their individual components". @samadammeek @rob-metalinkage
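As a rough sketch of what such an encoding rule could look like in practice, the same list of composed values can be flattened once and then serialized either as a JSON array or as space-separated XML character data. All function names here are hypothetical, and this is only one possible reading of the rule quoted above, not something taken from the CIS specification.

```python
import json


def flatten(records):
    """Expand composed values into their individual components, in order."""
    return [value for record in records for value in record]


def to_json_values(records):
    """JSON side of the rule: a single flat array of values."""
    return json.dumps(flatten(records))


def to_xml_character_data(records):
    """XML side of the rule: the same values, separated by spaces."""
    return " ".join(str(value) for value in flatten(records))


records = [(1, 2, 3), (4, 5, 6)]  # two range values with three fields each
print(to_json_values(records))
print(to_xml_character_data(records))
```

In both cases a reader needs the number of fields per range value, which the RangeType already describes, to regroup the flat list into records.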

About this <V></V>, the CIS specification 1.1.1 says in section 12.1 gml-coverage Requirement 32:

In a coverage encoded in GML, each atomic range value (i.e., cis:v element) shall contain exactly one value.

Note that the lowercase v in the requirement does not match the uppercase V used in the examples.

It's not clear from that clause how one encodes a composed range value (cis:CV is not referenced anywhere in the document). There is one example for a point cloud that shows the <CV></CV> tag for a range value, within which a <V></V> tag holds each individual field.

It is also odd XML-wise that the DataRecord (CV) is only used when it contains more than one Field (V). It is also inconsistent with the JSON encoding of the corresponding point cloud example, where the values are placed directly in the array.
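For illustration, the convention as inferred from the examples (not from the normative text) could be sketched as follows. The element names are written without a namespace prefix, and the DataBlock wrapper and the function name encode_data_block are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET


def encode_data_block(records):
    """Encode range values as V elements, wrapping a record in CV only
    when it has more than one field (the behaviour observed in the examples)."""
    block = ET.Element("DataBlock")  # assumed wrapper element name
    for record in records:
        # Single-field records are emitted as bare V elements;
        # multi-field records get a CV parent.
        parent = block if len(record) == 1 else ET.SubElement(block, "CV")
        for value in record:
            ET.SubElement(parent, "V").text = str(value)
    return ET.tostring(block, encoding="unicode")


print(encode_data_block([(1,), (2,)]))
print(encode_data_block([(1, 2), (3, 4)]))
```

The two calls produce structurally different documents for what is conceptually the same thing, a list of DataRecords, which is the oddity noted above.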

The JSON encoding conformance class description in CIS 1.1.1 is devoid of almost any information. If a simple flat list of values works for JSON, it should work just as well for XML and simply separating values by spaces in character data should work fine, as e.g. in GML coordinates.

In the end those <V> and <CV> provide no value and just make an uncompressed XML representation of a CIS RangeSet significantly larger than it needs to be.

@pebau
Contributor

pebau commented Nov 15, 2021

@jerstlouis

I have been trying to read and understand those specs spending a very considerable amount of time over several years, but they are very lengthy and difficult to grasp and follow.

Indeed, nobody said it's easy to write standards! Quite the opposite: we have to do the hard work so that others can benefit.
And to be exact and unambiguous is tough indeed. Just read the SQL standard - 1000 pages of rock-solid syntax and static and dynamic semantics definition...but it pays off: SQL simply works. No issue of mixing up Lat and Lon like in GML coordinates! Welcome to the hell of standardization... ;-)

Just an afterthought: If someone finds SWE Common unnecessarily difficult, why not join the Sensor people and help improving there? It would be an obvious severe disadvantage to detach from there.

@pebau
Contributor

pebau commented Nov 15, 2021

In the end those <V> and <CV> provide no value

that is not correct. It simply represents information structure in the way GML foresees.

@pebau
Contributor

pebau commented Nov 15, 2021

PS: GML and JSON have rather different models (and modeling constructs) - eg, GML does not know arrays whereas JSON does. This obviously begs for sometimes different methods to represent the same information.

@jerstlouis
Member Author

jerstlouis commented Nov 15, 2021

Just an afterthought: If someone finds SWE Common unnecessarily difficult, why not join the Sensor people and help improving there? It would be an obvious severe disadvantage to detach from there.

To clarify, I don't find SWE Common unnecessarily difficult, and I never suggested detaching CIS from it (it should still be normatively referenced). But the CIS specification only uses a tiny portion of SWE Common, so it should be easy to include the DataRecord tidbits (or a summary of them) relevant to the RangeType directly in the CIS specification (at least covering 90% of use cases), even if only as informative clarifications, so that most implementers do not need to dig into the separate, large specification. And of course the examples should be corrected :)

NOTE: I am somewhat confused how that relates to this particular issue? V and CV are not defined in SWE Common, are they?

why not join the Sensor people and help improving there?

Speaking for myself, there are tons of battlefronts in the OGC and I already picked way too many battles ;)

In the end those <V> and <CV> provide no value
that is not correct. It simply represents information structure in the way GML foresees.

PS: GML and JSON have rather different models (and modeling constructs) - eg, GML does not know arrays whereas JSON does. This obviously begs for sometimes different methods to represent the same information.

But GML has XML character data, and uses it for compact representation of vector features geometry coordinates (and GMLCOV uses it for RangeSet as well). I still think choosing <V>1</V><V>2</V><V>3</V> over 1,2,3 or 1 2 3 really introduces useless bytes significantly increasing uncompressed file size with no tangible advantage.
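A quick way to put numbers on the size difference is to build both forms and compare their byte lengths. This is only a sketch with hypothetical helper names, and it measures uncompressed size only; it says nothing about how either form behaves after compression or about the structural argument.

```python
def verbose_encoding(values):
    """One <V> element per value, as in the current CIS XML examples."""
    return "".join(f"<V>{value}</V>" for value in values)


def compact_encoding(values):
    """Space-separated character data."""
    return " ".join(str(value) for value in values)


values = [1, 2, 3]
verbose = verbose_encoding(values)  # "<V>1</V><V>2</V><V>3</V>"
compact = compact_encoding(values)  # "1 2 3"
print(len(verbose.encode("utf-8")), len(compact.encode("utf-8")))  # prints "24 5"
```

Each value costs seven extra bytes of markup in the element form against one separator byte in the compact form, so the overhead grows linearly with the number of values.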
