Suggestion for hierarchies in batch table #66

Closed
arneschilling opened this issue Jan 29, 2016 · 28 comments
Comments

@arneschilling

We are working on ideas for supporting hierarchies in B3DM.
The background is that we are working with 3D GIS and BIM data (as CityGML or IFC), which are usually more complicated than 3D viz datasets. We want to preserve as much information as possible.
Currently there is a way to group nodes in GLTF. However, we also want to optimize rendering as much as possible and make use of batches including attribute tables in B3DM.
In our scenarios, we have complex input data with groups, hierarchies and attributes on different levels. For instance,
a building has a unique id and a set of attributes and properties. The parts the building is made of also have their own sets of attributes, which are usually different. Sometimes we have three or more hierarchy levels.
The application we are working on must support selecting and highlighting entities at all available levels and displaying the attached attributes (as key-value pairs).
We want to click on buildings and display building-specific attributes, and we also want to click on building parts and display component-specific (wall, roof, window, door...) attributes without switching
to another data set with a different configuration. Which kind of element is selected is controlled by a toggle button or by other means.

The current design of the batch table is quite simple, it is basically a 2D grid.
Now the question is how we group together batches and attach attributes to these groups.

We found out that the number of columns is flexible and that it is not restricted to the number of batches. We can extend the batch table to include additional columns representing abstract features, for which we can include additional attributes.
This is not specified anywhere, but Cesium currently consumes it without complaint.

@jbo023 has set up a demo application
http://hosting.virtualcitysystems.de/demos/hierarchy/
Use the CityGML Explorer to access attributes and toggle between building selection and part selection using the buttons on the right.

Example B3DM file:
http://hosting.virtualcitysystems.de/demos/hierarchy/examples/data/buildings_semantic/15/35210/6826.b3dm

There is no formal specification of our approach because we see it as a workaround that takes into account the current limitations of B3DM.
Please let us know in case somebody is working on a similar topic. We are happy to discuss possible solutions.

As to the example file above, you will see a batch table with ids, attributes (ignore the strange format for now...) and a row called parentPosition.
The latter provides the information on how things are grouped together. E.g., the first two entries in this row are 1243, which means that these two batches are grouped together.
The id of this group can be found in the id array at position 1243. However, there is no batch with this number; it is just an abstract group feature.
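
A minimal sketch of how a client might resolve the group of a picked batch under this workaround. The table below is a made-up, shortened stand-in for the real file; only the `id` and `parentPosition` names come from the example. `getGroupId` is a hypothetical helper, and using -1 for "no group" is an assumption.

```javascript
// Hypothetical, shortened batch table: entries 0 and 1 are real batches,
// entry 2 is an abstract group feature that has no geometry.
const batchTable = {
    id: ['wall_a', 'roof_a', 'building_a'],
    parentPosition: [2, 2, -1] // assumption: -1 marks "no group"
};

// Returns the id of the group a batch belongs to, or undefined if it has none.
function getGroupId(table, batchId) {
    const position = table.parentPosition[batchId];
    return position >= 0 ? table.id[position] : undefined;
}

console.log(getGroupId(batchTable, 0)); // building_a
```

Both geometry-backed batches point at position 2, so picking either one leads to the same abstract group entry.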

What do you think of this approach?
Does anybody have alternative ideas or suggestions for accomplishing this? Our intention is to include this feature in the 3D Tiles specification so that we can base our framework on the master branch. We can try to formally describe our concept in the git repo and create a pull request.

best regards,
Arne

@pjcozzi
Contributor

pjcozzi commented Jan 29, 2016

Thanks for all the details for this use case. Is this related to #65?

@jbo023
Contributor

jbo023 commented Jan 29, 2016

It's related, but allowing JSON objects in the batch table is independent of this discussion.

@pjcozzi
Contributor

pjcozzi commented Feb 16, 2016

If I understand correctly, you want to be able to have more items in the batch table than features, e.g., buildings, in the .b3dm file so that fields in the batch table can be used as an index to data in the batch table. Is this correct?

This is because you have a nested data structure you want to navigate?

It seems like an ability to carry an app-specific payload as part of the tile or to get the data from another web service would be a cleaner approach that is general enough to put into the core spec.

@clausnagel

Yes, the problem is that the current design of the batch table is suitable for flat data but not for hierarchically structured data. So we would like to be able to express and access hierarchical data directly in the .b3dm file.

For example, assume a typical CityGML hierarchy: a building is described by a roof and wall surface and the wall surface contains a door.

building
  |- roof surface
  |- wall surface
      |- door

Now, for instance, we would like to be able to color only the roof surfaces of the buildings according to an attribute that is only assigned to the roof. Likewise, we would like to be able to color only the wall surfaces, or the entire building (which means coloring all its nested child elements).

And it should be possible to navigate the hierarchy. For instance, when clicking on the door it should be possible to retrieve its attributes but also its (transitive) parents and their attributes.

Please note that the extension of the batch table as done in the demo application is only a suggestion and the way we currently solved the problem. I think the core request is to have support for hierarchical data in the 3D Tiles specification. If there are other (better) ways than extending the batch table, we are happy with those too.

@arneschilling
Author

It's not necessarily an app-specific payload, since it's about having a general concept for dealing with nested data, but I agree that not everybody will want to use this kind of feature, so it should be kept optional. We were trying to find a workaround so that we can navigate upwards starting with a picked object ID and find parent group nodes which we can highlight.
Storing hierarchies and additional attributes in separated files is an option, but this would double the number of HTTP requests and the contained information must be merged with the B3DM content anyway. That's why we were looking for a way to have it directly in B3DM.

@pjcozzi
Contributor

pjcozzi commented Feb 25, 2016

Given that an element in the batch table arrays can be an array, couldn't each feature just have an array of batch ids for its children? In your example, building would have [roof, wall]. Likewise, each feature could also have a batch id for its parent (or an array for parents). Would this work?

@jbo023
Contributor

jbo023 commented Feb 25, 2016

Hmm, the problem is the building batch entries, because a building does not have a geometry; only the children [roof, wall] have geometries.

@pjcozzi
Contributor

pjcozzi commented Feb 25, 2016

Ah, I see. If this is a common enough use case then we could add an optional app-specific payload to each tile (the data could be JSON, binary, etc.). If we have one more user (including me if I run into any cases) who needs this, then I say that justifies the minor spec complexity, and that we add it.

Otherwise, you could store it in a separate payload or have a convention where, for example, batch id 0 includes an object with the metadata for geometry-less features.

@pjcozzi
Contributor

pjcozzi commented Jul 13, 2016

Labeling this as draft 1.0 since the ability to have multiple ids (or however it is implemented) to identify buildings and facades, for example, is becoming commonly requested.

@lilleyse
Contributor

lilleyse commented Nov 1, 2016

We have some ideas for defining a hierarchical batch table, I'm curious to know what others think.

@arneschilling @jbo023 @clausnagel @pmconne

Summary

We want to be able to pick a feature in the tile and get information from its own metadata, as well as metadata from its parent, grandparent, etc. In the current batch table spec, this is only easily possible by flattening all the hierarchy’s metadata in each feature, resulting in a lot of duplicate data.

Batch Table Hierarchy

This approach rethinks the batch table in terms of a hierarchy of items, where each item has a “class” associated with it. It supports metadata for features and abstract “groups” that aren’t backed by geometry - like in @arneschilling's example where the walls and doors are features, but buildings aren’t.

Example

Number of doors = 4
Number of walls = 3
Number of buildings = 2
Number of zones = 1
Number of features = 7 (doors and walls)
Number of items = 10 (doors, walls, buildings, zones --- buildings and zones are not backed by geometry so they are abstract items)

Organized like:

  • zone
    • building
      • wall
        • door
        • door
      • wall
    • building
      • wall
        • door
      • door
{
    CLASSES : [
        {
            name : 'door',
            door_mass : [10, 11, 14, 7],
            door_width : [1.2, 1.3, 1.21, 1.5],
            door_name : ['door0', 'door1', 'door2', 'door3']
        },
        {
            name : 'wall',
            wall_paint : ["red", "green", "pink"],
            wall_windows : [4, 6, 1],
            wall_name : ['wall0', 'wall1', 'wall2']
        },
        {
            name : 'building',
            building_height : [100, 20],
            building_name : ["building0", "building1"]
        },
        {
            name : 'zone',
            zone_name : ["zone0"]
        }
    ],
    CLASS_ID : [0, 0, 1, 1, 0, 0, 1, 2, 2, 3],
    PARENT_ID : [2, 2, 7, 7, 6, 8, 8, 9, 9, -1]
}

CLASSES defines the classes. In this example there are 4 classes: doors, walls, buildings, and zones. Each class is like a mini batch-table - storing the properties for all items of that class. The arrays can be JS arrays or batch table binary arrays.

CLASS_ID stores the class of each item. In the above example item 0 is "door0", item 1 is "door1", item 2 is "wall0", item 3 is "wall1", and item 4 is "door2".

PARENT_ID stores the parent of each item, as an index into the CLASS_ID section. -1 means the item has no parent. In the above example "door0"'s parent is "wall0".
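
A minimal sketch of how a client could walk from a picked item up to the root using these two arrays. The data is copied from the example above (per-class properties omitted for brevity); `getAncestors` is a hypothetical helper name, not part of any proposed API.

```javascript
const hierarchy = {
    CLASSES: [{ name: 'door' }, { name: 'wall' }, { name: 'building' }, { name: 'zone' }],
    CLASS_ID: [0, 0, 1, 1, 0, 0, 1, 2, 2, 3],
    PARENT_ID: [2, 2, 7, 7, 6, 8, 8, 9, 9, -1]
};

// Returns the item indices of all ancestors of an item, nearest first.
function getAncestors(h, itemId) {
    const ancestors = [];
    let parent = h.PARENT_ID[itemId];
    while (parent !== -1) {
        ancestors.push(parent);
        parent = h.PARENT_ID[parent];
    }
    return ancestors;
}

// "door0" (item 0) -> wall0 (item 2) -> building0 (item 7) -> zone0 (item 9)
const chain = getAncestors(hierarchy, 0)
    .map(i => hierarchy.CLASSES[hierarchy.CLASS_ID[i]].name);
console.log(chain); // [ 'wall', 'building', 'zone' ]
```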

Multiple Parents

In order to support an item having multiple parents, such as parents that act as classification tags, the approach can be extended so each item defines its parent count:

    CLASS_ID : [0, 0, 1, 1, 0, 0, 1, 2, 2, 3],
    PARENT_COUNT : [1, 2, 1, 1, 1, 1, 1, 1, 1, 0],
    PARENT_ID : [2, 2, 3, 7, 7, 6, 8, 8, 9, 9]

Now "door1" has two parents: "wall0" and "wall1".
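
A minimal sketch of how a loader might read this layout: since each item can have a different number of parents, a running sum over PARENT_COUNT gives the start of each item's slice in PARENT_ID. The helper names are hypothetical; the arrays are the ones shown above.

```javascript
const PARENT_COUNT = [1, 2, 1, 1, 1, 1, 1, 1, 1, 0];
const PARENT_ID = [2, 2, 3, 7, 7, 6, 8, 8, 9, 9];

// Prefix sum: result[i] is where item i's parents start inside PARENT_ID.
function buildParentOffsets(parentCount) {
    const result = new Array(parentCount.length);
    let offset = 0;
    for (let i = 0; i < parentCount.length; i++) {
        result[i] = offset;
        offset += parentCount[i];
    }
    return result;
}

// Returns the parent item indices of one item (empty array if it has none).
function getParents(itemId, parentOffsets) {
    const start = parentOffsets[itemId];
    return PARENT_ID.slice(start, start + PARENT_COUNT[itemId]);
}

const parentOffsets = buildParentOffsets(PARENT_COUNT);
console.log(getParents(1, parentOffsets)); // [ 2, 3 ]  ("door1" -> "wall0" and "wall1")
```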

Across Tiles

One challenge is supporting the concept of a hierarchical batch table across different tiles.

In 3D Tiles implementations, intermediary tiles that contain batch table metadata may be unloaded, so allowing PARENT_IDs to reference external tiles is dangerous.

Another approach is to contain the full hierarchy in each tile. The downside is duplicate data across sibling tiles, which could be minimal in some use cases but worse in others. This should still improve the original situation because duplicate data is stored across tiles rather than across features.

Another downside here is it may be difficult for implementations to support editing batch table values since they would need to edit the duplicate data that exists.

Any ideas or feedback on this approach?

@jbo023
Contributor

jbo023 commented Nov 2, 2016

At a first glance this looks like a nice concept.
In the paragraph on CLASS_ID you probably meant "item 3 is wall1"; if not, I haven't understood the concept yet.
Also, the PARENT_COUNT row probably has one item too many.

@lilleyse
Contributor

lilleyse commented Nov 2, 2016

Thanks for the corrections @jbo023, should be fixed now.

@jbo023
Contributor

jbo023 commented Nov 2, 2016

If I want to get the attributes for a given batchId:

I have to find the corresponding class id, which is just an array access with the batchId as the index.
But to find the corresponding array position for the class attributes, I have to iterate over the CLASS_ID row and count the appearances of that class id? Is this correct? This seems a bit expensive.

@lilleyse
Contributor

lilleyse commented Nov 2, 2016

Yes it would require knowing how many appearances of CLASS_ID came before it. This can be done once at load time so that each item stores its index into its class's array. The alternative is providing another array of indices in the batch table hierarchy - possibly not worth the extra data.
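
A minimal sketch of that load-time pass, using the CLASS_ID array and door names from the example above. `buildClassIndexes` is a hypothetical name; the idea is a single scan that keeps one running counter per class.

```javascript
const CLASS_ID = [0, 0, 1, 1, 0, 0, 1, 2, 2, 3];
const doorClass = { name: 'door', door_name: ['door0', 'door1', 'door2', 'door3'] };

// One pass at load time: for each item, record how many earlier items
// share its class. That count is the item's index into its class arrays.
function buildClassIndexes(classIds, classCount) {
    const nextIndex = new Array(classCount).fill(0);
    return classIds.map(classId => nextIndex[classId]++);
}

const classIndexes = buildClassIndexes(CLASS_ID, 4);
console.log(classIndexes); // [ 0, 1, 0, 1, 2, 3, 2, 0, 1, 0 ]

// Item 4 is a door (CLASS_ID[4] === 0), and it is the third door in the table.
console.log(doorClass.door_name[classIndexes[4]]); // door2
```

After this pass, a property lookup for any item is two array accesses, so styling does not need to rescan CLASS_ID.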

@jbo023
Contributor

jbo023 commented Nov 2, 2016

Ah yeah I didn't think about doing this on load. I was more thinking about doing this on the fly for styling. But this is really not worth the extra data in the b3dm tile if we can just generate this info on load.

@pmconne

pmconne commented Nov 2, 2016

This sounds pretty solid to me.
In my use cases, the same fixed set of 'parent' items will tend to be shared amongst all tiles - but the ratio of parents to children will tend to be quite small, so duplication of parent data shouldn't be a big deal.

@pjcozzi
Contributor

pjcozzi commented Nov 2, 2016

Thanks for the prompt input @pmconne and @jbo023.

As @lilleyse and I discussed offline, I think this is a great approach. Here are some notes for the spec and ideas for the schema:

Spec Content and Terminology

  • feature vs. item - a feature has actual geometry, whereas an item is a feature or a class instance (or class) that may not be backed by geometry since it represents an abstract type. item is generic and needs a lot of context to be meaningful. What term do you suggest we use instead?
    • In general adopt well known OO terminology as much as possible for the spec, e.g., parent ids create an inheritance relationship.
  • Above says "pick a feature in the tile and get information from its own metadata, as well as metadata from its parent, grandparent."
    • The spec needs to be very explicit that we mean parent features, not parent tiles. The feature hierarchy is separate from the spatial hierarchy; the feature hierarchy is built on semantics about the features, whereas the tileset hierarchy is built on spatial coherence. (Feel free to clean up these notes for the spec.)
  • Include a solid motivation section in the spec, including the common expected use cases.
  • Include an example for each use case (if reasonable and not redundant), including the "gmail label" case, and start with a simple example, e.g., just: one of each door -> wall -> building.
  • Be explicit about how the instance arrays in a class are indexed, e.g., there is no random access as @jbo023 mentioned, the loader needs to scan the CLASS_ID array and track an increasing index for each class type. I think this is fine for the implementations we expect, but the spec needs to be explicit on how a loader maps from batchId to the per-instance data.

Suggested Schema Changes

  • Some classes have instances, e.g., there are two building instances in the above example, and other classes are abstract so to speak, e.g., the zone above. Per-class vs. per-class instance data should be cleanly separated. Currently, it is:
        {
            name : 'building',
            building_height : [100, 20],
            building_name : ["building0", "building1"]
        },
        {
            name : 'zone',
            zone_name : ["zone0"]
        }

It is awkward to differentiate if ["zone0"] is class data or the first class instance's data (in this case there are no class instances).

Please think through the schema and propose something, but it could be as simple as adding an instances object property, e.g.,

        {
            name : 'building',
            instances : {
                building_height : [100, 20],
                building_name : ["building0", "building1"]
            }
        },
        {
            name : 'zone',
            zone_name : "zone0"
        }
  • "Another approach is to contain the full hierarchy in each tile..."
    • This may just be wording, but I would suggest only storing the full feature hierarchy for the features in that tile; not all features in all tiles. We also need to guarantee that class data and class instance data for the same class/instance is the same across tiles. We may need to introduce a unique id separate from the "id" used to index the arrays and carefully name the id vs. index. I suggest that we tackle this after implementing single and multiple parents.
    • We may also be able to store "heavy" class data in another file (like glTF buffer/bufferView) so each tile references the same external file. Let's hold to see if this is actually needed since @pmconne's use case doesn't need this.

Implementation

Here are a few use cases that we want to make sure are reasonable:

  • Avoid duplicate data in the batch table: this is straightforward, just move it to a class.
  • Style a feature of a particular class (including multiple levels of inheritance): it should be straightforward to walk the parent ids to implement something like isClass(feature).
    • How reasonable is this for point clouds with a GLSL backend?
  • Likewise, it should be reasonable to style a feature based on its ancestors' properties, e.g., "show" : "feature.zone_name === 'zone0'" where feature may be a building
    • It should be trivial for the implementation to re-evaluate the style when the inherited property changes (when supporting multiple tiles, the runtime would also need to sync these values).
  • Click on a feature and highlight its parent or ancestors:
    • feature.parent or feature.parent[i] or feature.parent.all. We need to think through this.
  • Click on a feature and highlight all the sibling features of the same class, e.g., click on a door and then highlight all doors in the room.
    • all(feature.parent.children).thatAre('DOOR') - that's not the real syntax; we need to think through this.

What other cases should we consider?
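
As a rough sketch of the class check and the sibling query, assuming the single-parent CLASS_ID / PARENT_ID layout from the earlier example: both functions below are hypothetical illustrations, not the syntax or API under discussion, and the `isClass` here takes the hierarchy and a class name as extra arguments for self-containment.

```javascript
const hierarchy = {
    CLASSES: [{ name: 'door' }, { name: 'wall' }, { name: 'building' }, { name: 'zone' }],
    CLASS_ID: [0, 0, 1, 1, 0, 0, 1, 2, 2, 3],
    PARENT_ID: [2, 2, 7, 7, 6, 8, 8, 9, 9, -1]
};

// True if the item itself, or any of its ancestors, is an instance of the named class.
function isClass(h, itemId, name) {
    for (let i = itemId; i !== -1; i = h.PARENT_ID[i]) {
        if (h.CLASSES[h.CLASS_ID[i]].name === name) {
            return true;
        }
    }
    return false;
}

// Items that share the given item's parent and class (the item itself included).
function siblingsOfSameClass(h, itemId) {
    const result = [];
    for (let i = 0; i < h.CLASS_ID.length; i++) {
        if (h.PARENT_ID[i] === h.PARENT_ID[itemId] && h.CLASS_ID[i] === h.CLASS_ID[itemId]) {
            result.push(i);
        }
    }
    return result;
}

console.log(isClass(hierarchy, 0, 'building')); // true: door0 -> wall0 -> building0
console.log(isClass(hierarchy, 5, 'wall'));     // false: door3 hangs directly off building1
console.log(siblingsOfSameClass(hierarchy, 0)); // [ 0, 1 ]
```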

@lilleyse
Contributor

lilleyse commented Nov 3, 2016

It is awkward to differentiate if ["zone0"] is class data or the first class instance's data (in this case there are no class instances).

In the example zone0 is still considered an instance of the "zone" class. I like the separation that the instances object provides, so the example JSON would be:

        {
            name : 'building',
            instances : {
                building_height : [100, 20],
                building_name : ["building0", "building1"]
            }
        },
        {
            name : 'zone',
            instances : {
                zone_name : ["zone0"]
            }
        }

Do we have a strong need to support Class data? I figured we would have a required set of Class properties like name, id, etc but not allow for custom properties, since that data really belongs to the instances.

@pjcozzi
Contributor

pjcozzi commented Nov 3, 2016

Do we have a strong need to support Class data?

The gmail labels example is a good example to consider, there may be a number of classes, for example: friendly, neutral, enemy, air, sea, ground, and we want to assign these to each feature, e.g., [friendly, ground], but none of the classes are physical features. Feels like we would want per-class data, not a dummy class instance to assign data to the class. Does this complicate the implementation or spec significantly?

@lilleyse
Contributor

lilleyse commented Nov 4, 2016

It doesn't complicate it too much. The main complication is that PARENT_ID could be either an index into the CLASS_ID array (when referring to an instance) or a reference to a CLASS's unique id. The CLASS's id would need to be greater than the number of items in the CLASS_ID array so that it's clear that an instance's parent is a class rather than an instance.

@pjcozzi
Contributor

pjcozzi commented Nov 4, 2016

Can you provide an example? It sounds like it might actually be easier to keep it as is even if it isn't as conceptually clean from a purist perspective.

@lilleyse
Contributor

lilleyse commented Nov 4, 2016

Edit: this is not a proposed solution, just an example case

{
    CLASSES : [
        {
            name : 'door',
            id : 3,
            instances : {
                door_mass : [10, 11],
                door_width : [1.2, 1.3],
                door_name : ['door0', 'door1']
            }
        },
        {
            name : 'wall',
            id : 4,
            instances : {
                wall_paint : ["red"],
                wall_windows : [4],
                wall_name : ['wall0']
            }
        },
        {
            name : 'ground',
            id : 5

        }
    ],
    CLASS_ID : [0, 0, 1],
    PARENT_ID : [2, 5, 5]
}

door0's parent is wall0, door1's parent is ground, wall0's parent is ground.

Since here the ground class has no instances, door1 and wall0 set their parent id to the ground class id (which is 5). The ground class id can't be 0, 1, or 2 because a parent id set to that value would reference one of the 3 instances. So the limitation is that all CLASS ids need to be greater than the number of instances.
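
A minimal sketch of how a reader of this example would have to disambiguate PARENT_ID, assuming any value below the instance count is an instance index and anything else is a class id. `resolveParent` is a hypothetical helper, and keeping -1 as "no parent" is an assumption carried over from the earlier example.

```javascript
const table = {
    CLASSES: [
        { name: 'door', id: 3 },
        { name: 'wall', id: 4 },
        { name: 'ground', id: 5 }
    ],
    CLASS_ID: [0, 0, 1],
    PARENT_ID: [2, 5, 5]
};

// Decides whether an item's parent is another instance or a class.
function resolveParent(t, itemId) {
    const parentId = t.PARENT_ID[itemId];
    if (parentId < 0) {
        return undefined; // assumption: -1 still means "no parent"
    }
    if (parentId < t.CLASS_ID.length) {
        return { kind: 'instance', index: parentId };
    }
    const parentClass = t.CLASSES.find(c => c.id === parentId);
    return { kind: 'class', name: parentClass.name };
}

console.log(resolveParent(table, 0)); // { kind: 'instance', index: 2 }
console.log(resolveParent(table, 1)); // { kind: 'class', name: 'ground' }
```

The extra branch and the id lookup are the added complexity being weighed here.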

Right now I'm more in favor of treating everything as an instance.

@arneschilling
Author

Hi,

regarding the terminology:
I would like to stick to the concept of FEATURES. In GIS features represent geometries with attributes. In my opinion it does not matter whether the geometry is explicit, i.e. defined as GLTF mesh, or aggregated, i.e. a collection of meshes representing a whole building.
A feature could also be an instance of a generic prototype (tree model) with custom instance attributes. So there is no need to distinguish between items that are backed by geometries and groups that are made of items. Both need metadata and can be used for nesting features.

I find CLASSES an appropriate term for features of a specific type. However, I would not bloat the CLASS concept with semantic meaning.
My understanding is that classes represent features with a specific set of attributes. As in high-level programming languages, classes have a fixed set of fields (in this case attributes) that may or may not be set by instances.
That's a nice concept because we have varying sets of attributes. A "zone" may have only an ID whereas a door may have 20 or more attributes. Merging all attributes into a single table (single class) is possible but results in many empty table entries. Having multiple classes helps compress the attribute data.
In our current implementation we use JSON objects containing attribute sets, but this creates redundancies as well, because the attribute names must be repeated every time, e.g.:

   "attributes" : [{"externalReference externalObjectName":"DENIAL4300009RYq","creationDate":"2016-07-06","gml:name":"DENIAL4300009RYq","HoeheGrund":"54.3",......

The classes concept sits in between these extremes. We can easily figure out which features we can associate with a class.

CLASSES : [
    {
        name : 'featuretype123',
        id : 3,
        instances : {
            'externalReference externalObjectName' : ['DENIAL4300009RYq', 'DENIAL430043345'],
            'creationDate' : ['2016-07-06', '2016-07-07'],
            'gml:name' : ['DENIAL4300009RYq', 'DENIAL4300009RYe'],
            'HoeheGrund' : [54.3, 14.4]
        }
    },
    {
        name : 'featuretype456',
        id : 4,
        instances : {
            'id' : ['zone123']
        }
    }
 ]
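
A minimal sketch of the conversion from per-feature attribute objects to the column layout of a class. The values are taken from the snippets above; `toClassInstances` is a hypothetical helper, and filling attributes that a feature lacks with null is an assumption.

```javascript
// Per-feature attribute objects: every attribute name is repeated for every feature.
const attributes = [
    { 'gml:name': 'DENIAL4300009RYq', creationDate: '2016-07-06', HoeheGrund: 54.3 },
    { 'gml:name': 'DENIAL4300009RYe', creationDate: '2016-07-07', HoeheGrund: 14.4 }
];

// Pivots the rows into one array per attribute, so each name is stored once.
function toClassInstances(rows) {
    const instances = {};
    rows.forEach((row, i) => {
        for (const key of Object.keys(row)) {
            if (!(key in instances)) {
                // assumption: attributes a feature does not set are stored as null
                instances[key] = new Array(rows.length).fill(null);
            }
            instances[key][i] = row[key];
        }
    });
    return instances;
}

const featureType = { name: 'featuretype123', instances: toClassInstances(attributes) };
console.log(featureType.instances.HoeheGrund); // [ 54.3, 14.4 ]
```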

Questions:

  • is it necessary to have class ids and names? If the class simply represents a set of fields/attributes, referencing the index within the CLASSES array is sufficient.
  • why do we need support for multiple parents? In scene graph concepts each node can have only one parent. If for performance reasons geometries need to be shared among nodes, we can do this using GLTF cross references and reusing vertex data.

@lilleyse
Contributor

lilleyse commented Dec 5, 2016

In part using the name "feature" is to distinguish something that is independently visible and styleable. Naming every instance in the class hierarchy a feature may result in some confusion, as the non-geometry-backed instances do not have the same styling abilities.

As for the naming of class, I'm curious if you have any suggestions.

is it necessary to have class ids and names? If the class simply represents a set of fields/attributes, referencing the index within the CLASSES array is sufficient.

In the Cesium implementation PR (CesiumGS/cesium#4625) I removed the id. The name is useful when checking if a feature is an instance of a certain class, via isClass, isExactClass, getClassName.

why do we need support for multiple parents? In scene graph concepts each node can have only one parent. If for performance reasons geometries need to be shared among nodes, we can do this using GLTF cross references and reusing vertex data.

Multiple parents is useful for grouping instances in more flexible ways. One example might be to group a random half of the instances into a "classifier_a" class and the others into a "classifier_b" class, in addition to the existing hierarchy.

@arneschilling
Author

In our application, we need a feature hierarchy first, not a class hierarchy.

Example of class hierarchy would be:

Object -> ManMadeObject -> Building -> ResidentialBuilding -> Villa

(the Villa is an instance of ResidentialBuilding as well as ManMadeObject. If you want to style all ResidentialBuildings you could use this class inheritance information)

Example of feature hierarchy:

City -> Building -> BuildingPart -> Wall -> Door -> DoorKnob

(the DoorKnob is part of a door, which is part of a wall, which is part of a BuildingPart etc. If you want to style all elements of a particular wall, you could use this grouping information)

I would suggest making a clear distinction between these two concepts.

In the examples above, the instances make the features. CLASSES are classes.

@lilleyse
Contributor

lilleyse commented Dec 6, 2016

Thanks for breaking it down. The spec may need to cover both concepts as use cases for the hierarchy. The cases don't need to be treated differently from a spec/implementation point of view though, and can even operate simultaneously with multiple parents.

@lilleyse lilleyse mentioned this issue Jan 18, 2017
1 task
@pjcozzi
Contributor

pjcozzi commented Mar 8, 2017

If anyone wants to review the spec for this, see #171

@pjcozzi
Contributor

pjcozzi commented Mar 10, 2017

Thanks for everyone's input. #171 was merged.

@pjcozzi pjcozzi closed this as completed Mar 10, 2017