Pose components #3

Open
osrf-migration opened this issue Oct 9, 2018 · 9 comments

@osrf-migration

Original report (archived issue) by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


This issue is meant to discuss the frame of reference, and more generally pose components, within the ECS.

Current implementation

Currently, ignition::gazebo::components::Pose holds an ignition::math::Pose3d for an entity, but its frame of reference is not clear from the component itself. For now, the physics and rendering systems have been assuming the SDF spec:

| Entity | Pose frame |
| --- | --- |
| Model | Parent entity (world or parent model) |
| Link | Parent model |
| Collision | Parent link |
| Visual | Parent link |
| Joint (not on ign-gazebo yet) | Child link |

Related discussions

Before any considerations, it would be interesting to be aware of previous documents:

Use cases

Several parts of ign-gazebo will handle pose components, and each of them has different use cases. For example:

  • Server: Must be able to parse SDF to create pose components.
  • Physics system: Creates its internal representation based on pose components, and updates them as time elapses.
  • Rendering system: Creates its scene based on pose components and updates the scene as the pose changes.
  • GUI: Displays entity poses to users and lets them edit the pose. For increased usability, the user should be able to choose the frame of reference, the units, and the rotation convention (e.g. Euler angles or quaternion).

Proposals

Here are some solutions I can think of, please feel free to discuss and propose alternatives below.

"Pose manager"

It looks like it would be convenient to have a central API which allows converting from any frame of reference to any other at any given time, and is accessible by all systems. This would be akin to the TF framework on ROS, but thinner and more specific to the ECS. This way, systems wouldn't need to keep track of frames; they could just query as needed.

Some considerations:

  • The manager could be implemented as a system, which gathers all pose information and exposes that to other systems on demand.
  • A synchronous interface would be ideal, so systems can make fast queries within their update callbacks. But we also need an asynchronous interface for when systems are spread across processes. So I think it makes sense to use Ignition Transport to provide pose query services, and it would be interesting to use sync/async as needed.
  • ignition::gazebo::systems::SceneBroadcaster already implemented some features which could be moved to the pose manager, like a pose/info publisher and a pose/graph service.

Pose component's frame

It has been suggested before that the pose component carry two pieces of information:

  • The ignition::math::Pose3d
  • The reference frame

This way, each entity could be explicit about which frame its pose is expressed in.

I think this adds complexity, as all systems will need to handle poses with arbitrary frames.

So my first proposal is: the pose component should not carry frame information; instead, the frame is defined by convention.

Then the question is what the convention should be. I lean towards the pose always being expressed w.r.t. the parent entity, as opposed to any other ancestor closer to the root (like the world or the root model). The main reason is that an entity knows who its immediate parent entity is, but it would be necessary to crawl the entity tree to figure out who any other ancestor would be.

Using the pose manager, any systems which use different conventions should be able to easily get the pose in the frame they're interested in.

Unanswered questions

  • Who is the parent of a joint entity and how does its pose get defined?
@osrf-migration

Original comment by Michael Grey (Bitbucket: mxgrey, GitHub: mxgrey).


Following up on this remark:

I'm not sure about the world frame being "almost always the most useful frame of reference for systems" though, lots of rendering frameworks position nodes relative to parent nodes.

Many rendering engine APIs (e.g. Ogre and OpenSceneGraph) allow users to define tree structures and provide relative transforms because they assume that a tree structure is convenient for users when expressing the layout of objects, and most of the time this is definitely the case.

However, this means the rendering engine is performing forward kinematics (FK) computations, but in our ECS framework we're already computing forward kinematics by necessity in the physics system. It would be wasteful and unnecessary for both systems to compute forward kinematics. In libdartsim-gui-osg (which is DART integrated with OpenSceneGraph) we construct a "scene graph" where every object node is just a direct child of the root, and DART simply feeds in the object transforms with respect to the world, which were already computed during the physics update. That saves the rendering engine from doing a whole ton of redundant matrix computations.

In general, the best possible way to be able to transform between different frames of reference is to have all transform data provided in the world frame and do a single inverse-matrix-multiply operation to switch between reference frames. We don't need the sophistication that TF has because TF assumes that there is no canonical "world frame", but as a simulation engine with full world knowledge, we have the benefit of a definitive world frame, and we should take advantage of that.

This world frame approach is essentially how Frame Semantics currently works in ign-physics. The Frame Semantics implementation assumes an underlying ECS framework because ign-physics is itself an ECS, so we could potentially adapt the frame semantics implementation for ign-gazebo by moving the core components upstream to ign-math. The overall implementation was intended for Eigen, but the core templates like RelativeQuantity and classes like FrameID can use ign-gazebo as their ECS and ign-math as their arithmetic library.

Note that we can represent pose information in the world frame while still maintaining a "parent" concept as a separate component. Then if the "pose relative to parent" of a child is requested, it would simply be P.inverse() * C where P is the parent pose in the world while C is the child pose in the world.
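As a rough illustration of the relation above, here is a minimal self-contained sketch. `Pose2` is a hypothetical planar pose (x, y, yaw) standing in for ignition::math::Pose3d so the example needs no dependencies, and `RelativeToParent` is an illustrative name rather than an existing API.

```cpp
#include <cmath>

// Hypothetical planar pose (x, y, yaw). A simplified stand-in for
// ignition::math::Pose3d, used only to keep the sketch dependency-free.
struct Pose2
{
  double x{0.0};
  double y{0.0};
  double yaw{0.0};

  // Composition: *this is a frame expressed in some reference frame,
  // _o is a pose expressed in *this. The result is _o in the reference.
  Pose2 operator*(const Pose2 &_o) const
  {
    const double c = std::cos(yaw);
    const double s = std::sin(yaw);
    return {x + c * _o.x - s * _o.y, y + s * _o.x + c * _o.y, yaw + _o.yaw};
  }

  // Inverse transform: rotate the negated translation by -yaw.
  Pose2 Inverse() const
  {
    const double c = std::cos(yaw);
    const double s = std::sin(yaw);
    return {-(c * x + s * y), s * x - c * y, -yaw};
  }
};

// Pose of the child relative to its parent, given both poses in the world
// frame: P.inverse() * C.
Pose2 RelativeToParent(const Pose2 &_parentInWorld, const Pose2 &_childInWorld)
{
  return _parentInWorld.Inverse() * _childInWorld;
}
```

For example, with the parent at (1, 2) rotated by 90 degrees and the child at (1, 3) with the same yaw, the child ends up one unit along the parent's x axis with zero relative yaw.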

@osrf-migration

Original comment by Michael Grey (Bitbucket: mxgrey, GitHub: mxgrey).


Who is the parent of a joint entity and how does its pose get defined?

This is a really tough question to be honest, since there are two transforms that could be considered:

  1. The transform from the parent link to the joint assuming zeroed-out joint positions

  2. The transform of (1) multiplied by the transform produced by the current set of joint positions

I think (1) is the more common convention for trying to define a "joint pose".

Although I would definitely say that the parent link should be considered the "parent" of a joint.
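To make the two candidates concrete, here is one way to write them. The notation is an assumption for illustration, not taken from SDF or ign-physics: ${}^{P}T_{J}$ is the fixed transform from the parent link to the joint at zero joint position, and $T_{\mathrm{motion}}(q)$ is the transform produced by the current joint positions $q$, with $T_{\mathrm{motion}}(0) = I$.

```latex
\text{(1)}\quad T_{1} = {}^{P}T_{J}
\qquad\qquad
\text{(2)}\quad T_{2}(q) = {}^{P}T_{J}\; T_{\mathrm{motion}}(q)
```

Under this notation, (2) reduces to (1) whenever the joint is at its zero position.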

@osrf-migration

Original comment by Michael Grey (Bitbucket: mxgrey, GitHub: mxgrey).


I lean towards the pose always being expressed w.r.t. the parent entity, as opposed to any other ancestor closer to the root (like the world or the root model). The main reason is that an entity knows who its immediate parent entity is, but it would be necessary to crawl the entity tree to figure out who any other ancestor would be.

This makes the ability to convert between different frames of reference strictly more difficult and more expensive. This means we always need to climb a tree and compute a chain of matrix transforms (including matrix inversions) in order to transform to a different arbitrary reference frame. However, if poses are always expressed in the world frame (e.g. T_A is the pose of body A in the world and T_B is the pose of body B in the world) then the transform of B relative to A (i.e. A_T_B) would always simply be A_T_B = T_A.inverse() * T_B. This is true for any two bodies A and B, no matter what kind of parent/child relationship they might or might not have.
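The cost difference can be written out as follows. This is only a sketch of the argument; the chain names $P_i$ and $Q_j$ are introduced here for illustration. Suppose A has the ancestor chain $W \to P_1 \to \dots \to P_k \to A$ and B has the chain $W \to Q_1 \to \dots \to Q_m \to B$.

```latex
% Parent-relative storage: world poses must first be rebuilt by climbing.
{}^{W}T_{A} = {}^{W}T_{P_1}\,{}^{P_1}T_{P_2}\cdots{}^{P_k}T_{A}
\qquad
{}^{W}T_{B} = {}^{W}T_{Q_1}\,{}^{Q_1}T_{Q_2}\cdots{}^{Q_m}T_{B}

{}^{A}T_{B} = \left({}^{W}T_{A}\right)^{-1}\,{}^{W}T_{B}

% World-frame storage: T_A and T_B are read directly from the components.
{}^{A}T_{B} = T_{A}^{-1}\,T_{B}
```

Both forms end with one inversion and one multiplication, but the parent-relative form needs $k + m$ extra multiplications to rebuild the two world poses. Factors shared by both chains cancel, yet finding the common ancestor still requires climbing the tree.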

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


There are a couple topics here:

  1. How is frame information represented on disk, such as in SDF and log files.

  2. How is frame information represented in memory.

  3. How is frame information communicated between libraries, and across the wire.

Thanks for opening this issue Louise, but I'm now feeling that this topic will grow sufficiently complex that we need a new (and final) design document. See pull request #58.

@osrf-migration

Original comment by Louise Poubel (Bitbucket: chapulina, GitHub: chapulina).


I agree with most of your points, @mxgrey, and I agree they make sense for a physics engine, but I think you're not taking a few characteristics of ign-gazebo into account. Having all nodes expressed w.r.t. the world frame is convenient when all frames are always being updated. But this is not the case here (at least for now).

Consider the following SDF:

<model ...>
  <link ...>
    <visual ...> ... </visual>
    <collision ...> ... </collision>
    <sensor ...>
      <imu ...> ... </imu>
    </sensor>
  </link>
</model>

When loaded, the following entities are created in the ECS, each with its own pose component: model, link, visual, collision, imu.

The physics system will handle model, link and collision, but it is unaware of visual and imu.

The rendering system builds a tree for model, link and visual, and an IMU system handles imu.

At the next time step, the model moves, but the link and collision poses w.r.t. the model remain the same (a very common situation).

Let's see what each system will need to do in each approach:

World as reference

  • Physics will need to update the model, link and collision poses, because they all changed w.r.t. the world.
  • Rendering will need to crawl the tree to find visual's parent link, and add the visual pose offset to update the node on ign-rendering that is attached to the world.
  • IMU system will need to find the IMU's parent link, add imu's pose offset, and then perform calculations.

Parent as reference

  • Physics only updates the model pose.
  • Rendering will update model's pose, and ign-rendering takes care of moving all descendants accordingly.
  • IMU system will need to compute FK all the way down to imu, and then perform calculations.

Summarizing the advantages of the 2nd approach:

  1. Offsets which are fixed most of the time don't need to be continuously updated through the ECS - which means less data flying around.
  2. Systems can delegate FK computation to other libraries, like ign-rendering, to take advantage of their scene graph.
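A toy sketch of the first advantage, under strong simplifying assumptions: offsets are pure translations (real poses also rotate), the entity tree is a flat vector, and `Node`, `Offset`, `MoveModelParentRef` and `MoveModelWorldRef` are hypothetical names. It only counts how many stored poses change when the model moves; it does not model which system performs each write.

```cpp
#include <vector>

// Translation-only offset, to keep the sketch short.
struct Offset
{
  double x{0.0};
  double y{0.0};
};

struct Node
{
  int parent;   // index of the parent node, -1 for the world
  Offset pose;  // w.r.t. parent or w.r.t. world, depending on the convention
};

// Parent as reference: only the model's own pose component changes.
int MoveModelParentRef(std::vector<Node> &_nodes, int _model,
                       const Offset &_delta)
{
  _nodes[_model].pose.x += _delta.x;
  _nodes[_model].pose.y += _delta.y;
  return 1;
}

// World as reference: the model and every descendant change.
int MoveModelWorldRef(std::vector<Node> &_nodes, int _model,
                      const Offset &_delta)
{
  int writes = 0;
  for (int i = 0; i < static_cast<int>(_nodes.size()); ++i)
  {
    // Climb from node i until we reach the model or the world.
    int cur = i;
    while (cur != -1 && cur != _model)
      cur = _nodes[cur].parent;
    if (cur == -1)
      continue;  // not the model and not one of its descendants

    _nodes[i].pose.x += _delta.x;
    _nodes[i].pose.y += _delta.y;
    ++writes;
  }
  return writes;
}
```

For the model / link / visual / collision / imu tree from the example, the first function touches one component and the second touches five.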

@osrf-migration

Original comment by Michael Grey (Bitbucket: mxgrey, GitHub: mxgrey).


Updating the transform of non-physical entities is a great point! That definitely rules out the assumption that the physics system can be singularly responsible for all FK computation.

Parent as a reference

Physics only updates the model pose.

I assume in this approach, physics will also update the relative poses of all physical entities. Note that in most cases (like ODE), this means the physics system will additionally have to compute T_P.inverse() * T_E for each entity E (where T_P is the world transform of the entity's parent and T_E is the world transform of the entity) and put this into the ECS. This isn't a terrible cost, but it's wasteful when the alternative is for it to just copy each T_E directly into the ECS and let rendering (and other systems) leverage world transforms of entities instead of recomputing the FK.

Offsets which are fixed most of the time don't need to be continuously updated through the ECS - which means less data flying around.

Anything that is relevant to physics or that gets rendered will need to have its FK computed somewhere at some point on each ECS cycle. I believe this is going to be the case for all entities that we consider to be frames. By making the relative pose the authoritative data in the ECS, all we're doing is forcing many redundant FK computations. It's worth noting that outside of collision detection and collision handling, FK is the most computationally expensive stage of most simulations.

Systems can delegate FK computation to other libraries, like ign-rendering, to take advantage of their scene graph.

I don't see how this is an advantage. The 2nd approach forces systems to rely on other libraries for FK, but I don't see how that's a positive thing.

Existing conventions

The conventional approach for most physics engines is to have one phase of the simulation cycle where the whole FK is authoritatively computed, and it never needs to be computed again until the next cycle.

As a counter-example to the conventional approach, dartsim does lazy evaluation for the FK, where it does some bookkeeping under the hood to keep track of when relative transforms are modified, and then if a user requests a transform that requires an FK computation, then dartsim will automatically update FK as needed and cache the results for later use.

The worst-case scenario is when FK needs to be recomputed multiple times in each simulation cycle. Each matrix multiplication involves a significant number of arithmetic operations, and the cost of those operations adds up quickly when performed redundantly on long chains. Physics engines typically avoid this aggressively, and I would urge us to do likewise.

Possible compromise?

I can't refute the inherent issue that not all possible frames will be physical, so the conventional approach of depending entirely on the physics engine might not be an option.

If we're willing to add a small amount of complexity, we may be able to do a lite version of what dartsim does. What I'm thinking is we would have a Pose component which is a bit more complex than a simple ignition::math::Pose3d object. It would privately store both a math::Pose3d value and an entity index that indicates the frame of reference for that math::Pose3d data. The only way to get information out of the component is to pass it through a function such as ignition::math::Pose3d FrameSemantics::Resolve(poseComponent, referenceFrame) where you explicitly specify what reference frame you want the result to be expressed in (default being world frame). The frame semantics object will identify whether FK is needed based on the current frame of reference ID cached inside the component. Any time the frame semantics computes FK, it will update the component's cache with the newly computed world transform.

This means the way data is stored within the pose component is opaque to systems. Systems that write to pose components would do something like

*poseComponent = PoseComponent(pose3d, referenceFrameID);

while systems that read from them would do

pose3d_relativeToSomeFrame = frameSemantics->Resolve(poseComponent, someFrameID);

So physics systems that are already computing things in the world frame can just pass in world frame data:

*poseComponent = PoseComponent(poseRelativeToWorld, WorldID());

while other systems can pass in poses with respect to any frame:

*poseComponent = PoseComponent(poseRelativeToParent, parentEntityID);

and these representations will be opaque to anyone who reads from the component later. Any request for information will always do the most efficient thing possible.
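Here is a rough, self-contained sketch of how the pieces above might fit together. Everything beyond the names `PoseComponent`, `FrameSemantics` and `Resolve` is an assumption made for illustration: `Pose2` is a planar stand-in for math::Pose3d, `Entity` and `kWorld` are hypothetical, components live in a plain map, and the frame graph is assumed to be acyclic with a pose component for every referenced frame.

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>

using Entity = std::uint64_t;
constexpr Entity kWorld = 0;  // hypothetical ID of the world frame

// Planar stand-in for math::Pose3d (assumption, see lead-in).
struct Pose2
{
  double x{0.0};
  double y{0.0};
  double yaw{0.0};

  Pose2 operator*(const Pose2 &_o) const
  {
    const double c = std::cos(yaw);
    const double s = std::sin(yaw);
    return {x + c * _o.x - s * _o.y, y + s * _o.x + c * _o.y, yaw + _o.yaw};
  }

  Pose2 Inverse() const
  {
    const double c = std::cos(yaw);
    const double s = std::sin(yaw);
    return {-(c * x + s * y), s * x - c * y, -yaw};
  }
};

// Opaque component: pose data plus the frame it is expressed in.
class PoseComponent
{
public:
  PoseComponent(const Pose2 &_pose, Entity _frame)
    : pose(_pose), frame(_frame) {}

private:
  Pose2 pose;
  Entity frame;
  friend class FrameSemantics;
};

class FrameSemantics
{
public:
  explicit FrameSemantics(std::unordered_map<Entity, PoseComponent> &_poses)
    : poses(_poses) {}

  // Pose of _comp expressed in _reference (world by default).
  Pose2 Resolve(PoseComponent &_comp, Entity _reference = kWorld)
  {
    const Pose2 inWorld = this->ToWorld(_comp);
    if (_reference == kWorld)
      return inWorld;
    return this->ToWorld(this->poses.at(_reference)).Inverse() * inWorld;
  }

private:
  // Runs FK only if the component is not already in the world frame,
  // then caches the world pose inside the component.
  const Pose2 &ToWorld(PoseComponent &_comp)
  {
    if (_comp.frame != kWorld)
    {
      _comp.pose = this->ToWorld(this->poses.at(_comp.frame)) * _comp.pose;
      _comp.frame = kWorld;
    }
    return _comp.pose;
  }

  std::unordered_map<Entity, PoseComponent> &poses;
};
```

In this sketch the cache simply overwrites the stored representation with the world pose, so a later `Resolve` on the same component skips the FK step until a system writes the component again.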

Assumptions of this proposal

For that proposal to always work correctly, it relies on two assumptions:

  1. The pose component of each entity will always be updated on each ECS cycle

  2. The pose component will always be updated before being read from

I think assumption (2) is a perfectly reasonable and unavoidable assumption. We already know that we're required to design the ECS so that components get written to before they're read from.

Assumption (1) might be more controversial, because there may be entities that are static with respect to their parents, and it might be desirable to "set and forget" their relative transform (i.e. we'll tell the ECS what the entity's relative transform is one time and then never write to the entity or component again, ever, and just let systems read from it). I think I would push back against this and say that we shouldn't have stale entities or components. The existence and management of entities should be an active process, or else we risk leaking entities and components into the ECS even when they're not needed anymore.

To satisfy assumption (1), I might propose something like a Static Transform System to manage abstract non-physical frames that aren't managed by any other system (e.g. markers? abstract reference frames?) whose transforms are static with respect to their parent frames. On each ECS update, the Static Transform System would essentially "remind" the ECS of what the static entity's transform with respect to its parent is. This would be analogous to the static transform publisher of ROS's TF. Entities like visuals or IMUs would still get updated by the Rendering or Sensors systems respectively. The Static Transform System would just be a catch-all for any reference frame entity that doesn't neatly fall under the jurisdiction of an existing system. (I don't know if this Static Transform System idea will even be necessary, but at least it offers us a guaranteed way to satisfy assumption [1] if we find any conceptual outliers.)

Side note

I'm not totally clear: Should we be moving this conversation to this PR, or should we migrate to there once we've determined what approach to take?

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


You can migrate to the PR when decisions are reached. The point of the PR is to have a place to store the conclusion.

@osrf-migration

Original comment by Nate Koenig (Bitbucket: Nathan Koenig).


Related to this is an SDF pull request that will handle pose and frame information: https://bitbucket.org/osrf/sdformat/pull-requests/468/wip-pose-dom/diff

@EricCousineau-TRI

EricCousineau-TRI commented Oct 8, 2020

\cc @azeey to see if this covers frames as first-class citizens, or file a new issue

related to (but not required for) Gazebo prototype for gazebosim/sdformat#278
