Add Coordinate systems and reference frames to WebVR #149

Closed
wants to merge 1 commit into from

Conversation

@RafaelCintron commented Nov 10, 2016

Currently, the WebVR standard assumes that at any given time there is a single world space frame of reference (which can be reset by resetPose). While this is a simple model that corresponds to how developers are used to thinking about things, it does not map 1:1 to how inside-out trackers see the world. Specifically, as you move away from the origin, an inside-out tracker has less information available to accurately locate where that original "world origin" is in relation to the HMD's current position. For that reason, placement of objects can start showing precision issues and drift.

To solve this, we would need to reason about multiple frames of reference in some way, so that apps can more accurately render their experiences as users move away from the original origin. This pull request represents a path to explicitly representing to developers the "squishy" nature of the tracking technologies. In this proposal there is no special "blessed" world space frame of reference. The user can create any number of frames of reference of various kinds, and will explicitly decide which one they're using for the experience they are trying to build. They will then supply this frame of reference whenever querying for transform data (such as getFrameData). In the near future we expect that this will be extended to include other types, such as anchors to specific locations in the world and surface reconstruction meshes.

One benefit of this system is that it maps very directly to inside-out-tracking algorithms, which reduces the risk that we over-simplify the mental model for developers. It also means we can support truly large scale applications where the user would move far from their original position (e.g. Pokemon Go in MR). In addition, it means that things like the stage can be naturally expressed as yet another frame of reference. A risk is that we could be introducing some relatively mind-bending concepts in core areas of the API.

This pull request is intended as a starting point for this conversation, and we’re looking forward to feedback from consumers of the API, other implementers, and HMD makers.
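To make the proposed shape concrete, here is a minimal sketch of how an application might drive its render loop under this model, assuming the createStageFrameOfReference() and getFrameData(coordinateSystem, frameData) signatures proposed in this change (render() is a hypothetical application-side helper):

// Create a frame of reference once; pass its coordinate system on every frame.
var stageFoR = vrDisplay.createStageFrameOfReference(); // may be null if no stage is available
var frameData = new VRFrameData();

function onVRFrame() {
  vrDisplay.requestAnimationFrame(onVRFrame);
  // Poses come back expressed relative to the explicitly chosen frame of reference.
  if (stageFoR && vrDisplay.getFrameData(stageFoR.coordinateSystem, frameData)) {
    render(frameData); // hypothetical draw function
  }
  vrDisplay.submitFrame();
}
vrDisplay.requestAnimationFrame(onVRFrame);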

@toji (Member) left a comment

Looking really good! Left some comments, with one really big concern about gamepads. Please excuse some bikeshedding on the names. :)

<dfn>Face-locked</dfn>
Content that is not related to the user's environment. Regardless of the user changing orientation or position, the content stays at the same place in the user's field of view.

<dfn>Head-locked</dfn>
Member

How about we just use one of these terms to prevent confusion? (Mild preference for "Head-Locked" on my part.)

Contributor

I'd agree with this; is there some difference between face-locked and head-locked somewhere else? Or is it just that different people may be using both of these terms?

Member

I've seen both terms used, so I wanted to be clear that they are referring to the same concept to reduce confusion. That said, I'm happy to have the definition on "head locked" rather than on "face locked" since the former is more commonly used.

Member

If we want to reduce confusion maybe tag onto the end of the head locked description "This is sometimes referred to as 'Face-locked'". That way we establish that they're the same concept, but also that we will be using a single canonical term to describe it.

intent and often does not provide the precision necessary for high-quality VR/MR. The WebVR API provides purpose-built interfaces
to VR/MR hardware to allow developers to build compelling, comfortable VR/MR experiences.

## Terminology ## {#intro-terminology}
Member

I feel like this section has been a long time coming. Thanks! :)

Contributor

awesome work here! 👍

<dfn method for="VRDisplay">createAttachedFrameOfReference()</dfn>
Creates a new {{VRAttachedFrameOfReference}} for the {{VRDisplay}}. The returned frame of reference's {{VRCoordinateSystem}} should be supplied to {{getFrameData()}} for 3DOF experiences (such as 360 video) and as a fallback for other experiences when positional tracking is unavailable.

While the returned {{VRAttachedFrameOfReference}} is body-locked, neck-modeling may be included and, as such, {{VRFrameData}} objects filled in by calls to {{getFrameData()}} using the {{VRAttachedFrameOfReference}}.{{VRAttachedFrameOfReference/coordinateSystem}} MAY include position information.
Member

Would also encourage adding something to the effect of "Use of a VRAttachedFrameOfReference may provide power savings on some devices relative to using a VRStationaryFrameOfReference or VRStageFrameOfReference" to give devs a little more motivation to use this when applicable rather than always defaulting to using a VRStationaryFrameOfReference and stripping positional data out.

Contributor

I like this. Encouraging devs to do the right thing by making it explicitly in their best interests.
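To make the 3DOF usage described above concrete, a minimal sketch (assuming the names in this proposal; the player logic is hypothetical):

// A 360 video player only needs orientation, so it can always use the
// attached (body-locked) frame of reference rather than a stationary or stage one.
var attachedFoR = vrDisplay.createAttachedFrameOfReference();
var frameData = new VRFrameData();
vrDisplay.getFrameData(attachedFoR.coordinateSystem, frameData);
// Per the note above, the returned pose MAY still contain position data
// (neck-modeling); a 360 player is free to ignore it.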

* sitting-space experiences.
*/
void resetPose();
VRStageFrameOfReference? createStageFrameOfReference();
Member

I know you're inheriting our previous name here, but 'stage' feels really awkward to me in this context. Not sure what a better one would be, though. "FloorLevel" seems not terrible? I'll give it some more thought.

Member

Open to suggestions!

* the current frame.
*/
boolean getFrameData(VRFrameData frameData);
VRAttachedFrameOfReference createAttachedFrameOfReference();
Member

I know that this terminology comes from Win Holographic, and I'm not really opposed to it, but it seems a little odd to me that we go out of our way to define things like "Body Locked" and "3DOF experience" above, and then use an unrelated term here. Could this maybe be create3DOFFrameOfReference? Reads a little weird. :P I mostly want to cut down on the amount of jargon we produce.

Member

Yeah, I hear ya. The concern I have with saying 3DOF is that there are positional elements when neck modeling is included so it's not actually 3DOF. Open to discussion!


Maybe the term "frame of reference" could also be explained in the terminology list above. I know, it does not solve the naming issue but I think it might be useful for developers (if we agree upon the name "frame of reference" and its need).

readonly attribute boolean hasPosition;
readonly attribute boolean hasOrientation;

boolean getPose(VRCoordinateSystem coordinateSystem, VRPose pose);
Member

Okay, so here's the big issue for Gamepad that I still don't have a good answer for: a gamepad may have a pose and not be related to a VRDisplay at all. This covers basically anything with a gyro in it (like the PS4 or Wii controllers). They'll need a way to report values in their own CoordinateSystem, which will likely look like an Attached one, but it's really hard to say.

Like I said, I don't have a good answer for this. Suggestions welcome!
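For reference, a minimal sketch of how the proposed Gamepad.getPose() might be used when the gamepad is associated with a display; the standalone-gamepad case above is exactly what this doesn't cover (drawController() is a hypothetical helper, and constructing a VRPose here is an assumption that mirrors how VRFrameData works in this proposal):

var stageFoR = vrDisplay.createStageFrameOfReference();
var pose = new VRPose();
// Query the gamepad in the same coordinate system used for rendering.
if (gamepad.displayId === vrDisplay.displayId &&
    gamepad.getPose(stageFoR.coordinateSystem, pose)) {
  drawController(pose); // hypothetical helper
} else {
  // Standalone gamepads (PS4/Wii style) have no display-derived coordinate system to use.
}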


<pre class="idl">
interface VRStageFrameOfReference {
readonly attribute VRCoordinateSystem coordinateSystem;

readonly attribute float sizeX;
Member

I'm not sure why we felt sizeX and sizeZ were appropriate here? I have an urge to change it to width and depth while we're mucking around anyway.

Member

Actually, how do folks feel about defining the stage boundaries as a polygon rather than a rect?

Member

I was actually wondering about that, especially after seeing Oculus' guardian system. I think it would be a good idea to expose a boundary polygon (defined as 2D points at floor level), but for developers' sanity maybe also provide a simple quad that fits within that as the prescriptive play area.

Worth noting, however, that the boundary polygon could be a hell of a fingerprinting data source.
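If that direction were taken, a hypothetical sketch of the shape (only coordinateSystem, sizeX, and sizeZ exist in this PR; boundsGeometry and VRBoundsPoint are invented for illustration):

interface VRStageFrameOfReference {
  readonly attribute VRCoordinateSystem coordinateSystem;
  // Hypothetical: polygon vertices (x, z) at floor level, e.g. from a guardian/chaperone setup.
  readonly attribute FrozenArray<VRBoundsPoint> boundsGeometry;
  // Conservative inscribed rectangle for content that only wants a simple play area.
  readonly attribute float sizeX;
  readonly attribute float sizeZ;
};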


<dfn attribute for="VRStageParameters">sizeX</dfn>
<dfn attribute for="VRStageFrameOfReference">sizeX</dfn>
Width of the play-area bounds in meters. The bounds are defined as an axis-aligned rectangle on the floor. The center of the rectangle is at (0,0,0) in standing-space coordinates. These bounds are defined for safety purposes. Content should not require the user to move beyond these bounds; however, it is possible for the user to ignore the bounds resulting in position values outside of this rectangle.
Member

While we're here can we add a note that this may be zero if the bounds aren't known?

Width of the play-area bounds in meters. The bounds are defined as an axis-aligned rectangle on the floor. The center of the rectangle is at (0,0,0) in standing-space coordinates. These bounds are defined for safety purposes. Content should not require the user to move beyond these bounds; however, it is possible for the user to ignore the bounds resulting in position values outside of this rectangle.

<dfn attribute for="VRStageParameters">sizeZ</dfn>
<dfn attribute for="VRStageFrameOfReference">sizeZ</dfn>
Depth of the play-area bounds in meters. The bounds are defined as an axis-aligned rectangle on the floor. The center of the rectangle is at (0,0,0) in standing-space coordinates. These bounds are defined for safety purposes. Content should not require the user to move beyond these bounds; however, it is possible for the user to ignore the bounds resulting in position values outside of this rectangle.
Member

Ditto
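To make the bounds semantics concrete, a minimal sketch of a bounds check, assuming sizeX/sizeZ as defined above (an axis-aligned rectangle centered at the stage origin) and a zero value meaning the bounds are unknown, per the note requested above:

function isInsideStageBounds(frameData, stageFoR) {
  if (stageFoR.sizeX === 0 || stageFoR.sizeZ === 0) return true; // bounds unknown
  var p = frameData.pose.position; // standing-space coordinates, in meters
  if (!p) return true; // no positional tracking this frame
  return Math.abs(p[0]) <= stageFoR.sizeX / 2 &&
         Math.abs(p[2]) <= stageFoR.sizeZ / 2;
}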


// Render the right eye's view to the right half of the canvas
gl.viewport(canvas.width * 0.5, 0, canvas.width * 0.5, canvas.height);
gl.uniformMatrix4fv(projectionMatrixLocation, false, frameData.rightProjectionMatrix);
gl.uniformMatrix4fv(viewMatrixLocation, false, frameData.rightViewMatrix);
drawGeometry();

// Indicate that we are ready to present the rendered frame to the VRDisplay
vrDisplay.submitFrame();
Member

So here's a fun question, though it seems Windows Holographic must already handle this: I render my scene using N different CoordinateSystems. Maybe a couple of stationary ones for objects around my play space and an attached one for the UI because reasons. When it comes time to call submitFrame what do I feed back to the VR API for reprojection purposes?

I think it actually works out fine on most devices that I'm familiar with, because they won't actually be reporting N different poses. They'll report one internally and the various CoordinateSystems will just be transforms from that, so the one internal one will be what's used as the basis for reprojection. Do we feel like that'll hold true for every device type, though?


<pre class="idl">
interface VRAttachedFrameOfReference {
readonly attribute VRCoordinateSystem coordinateSystem;
Member

I think we want the ability to call reset() on the FrameOfReferences to reorient the CoordinateSystem using the user's current orientation as the new forward vector. The complication is that sometimes it won't be resettable. For example: resetting a Vive's room-scale CoordinateSystem would be a no-op (always oriented to the room), and I've just found out that resetting a Daydream device's orientation programmatically is forbidden (always done with a user gesture on the controller), BUT if you use the same phone with a Cardboard harness, resetting is just fine. :P

At its most basic, supporting that feature would probably mean including a readonly boolean canReset attribute and a void reset() method. Maybe you could also argue that you can implicitly reset if you just call create___FrameOfReference again, and thus you don't need the method? Not sure how I feel about that.

Member

Yeah, I considered having a reset() function and ended up removing it because I didn't want to have two ways of accomplishing the same end result.
If we do just use create___FrameOfReference() to function as a "reset", do you have a scenario concern about the "origin" of the new FrameOfReference being the same as the previous one on Daydream devices?

Member

I'm not concerned about multiple instances of a FoR having the same origin. That'll implicitly be the case for a room-scale FoR anyway. I guess what I'm most interested in is finding a reasonable way to communicate to the developer whether or not a certain action will reset the origin or not, so they can design the UI accordingly.
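For reference, a hypothetical IDL sketch of the canReset/reset() idea floated above (not part of this PR):

partial interface VRAttachedFrameOfReference {
  // false e.g. for room-oriented systems or devices that only allow a user-gesture reset
  readonly attribute boolean canReset;
  // Re-orients the coordinate system to the user's current heading; no-op when canReset is false.
  void reset();
};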

Data returned in calls to {{VRDisplay}}.{{getFrameData()}} using the reference frame's {{VRCoordinateSystem}} will be relative to that fixed orientation and may also include position data if the {{VRDisplay}} performs neck-modeling.

<pre class="idl">
interface VRAttachedFrameOfReference {
Member

The Attached and Stationary FrameOfReference interfaces are identical at this point. Why not just have a single VRFrameOfReference that's returned by both? And then the StageFoR could inherit from it.
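A hypothetical sketch of that consolidation (illustrative only):

interface VRFrameOfReference {
  readonly attribute VRCoordinateSystem coordinateSystem;
};

// Stage adds its bounds on top of the shared base.
interface VRStageFrameOfReference : VRFrameOfReference {
  readonly attribute float sizeX;
  readonly attribute float sizeZ;
};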

@blairmacintyre (Contributor)

I'm glad to see this. I've been thinking about this very thing lately, as I've been looking at WebVR. I'm going to jump in here (even though I will say up front that I'm just "coming up to speed on this") because I really want to be part of this discussion. My group at Georgia Tech has been working with AR (what the MS folks would call MR) and the web for a while (see the current version of argon.js, our AR framework, at @argonjs and http://argonjs.io on github), and we have gravitated to making Frames of Reference a very explicit, first-class concept that developers use to specify where things are in the world, for similar reasons to those mentioned here. I've just joined Mozilla to work on AR, and I've started contemplating whether the best route to supporting AR/MR on the web would be to extend WebVR to support AR (MR).

This proposal seems like a great step toward that.

In support of these suggested changes, I will say that we've also found it really important to provide an abstraction to separate the local coordinate systems used for rendering (e.g., what is the origin of the graphics system) from the world reference frames, especially when pursuing examples that use full geospatial data (e.g., the mentioned "Pokemon GO for MR" example). We've gone "full geospatial" in @argonjs, by leveraging the math libraries from @AnalyticalGraphicsInc cesiumjs software, something that turned out to be a very powerful approach (in contrast, say, to focusing on having the programmer/developer define their own local coordinate frames for their application content).

Supporting full-blown geospatial coordinate frames may be something you want to consider here, if we want to start talking about adding AR/MR concepts to WebVR. I'm not suggesting using geospatial coordinates, but rather thinking about what it would take to let apps work with them cleanly, if they want. It may be that this should be managed at a higher layer (e.g., via a library like argon.js, which builds on cesium.js, and which could manage the WebVR frames of reference); but you should decide if that's the approach you want.

The ability to create many frames, use one as the local rendering frame, and then find out the transforms between that frame and other frames is essential. So, the suggestions here are really good. Especially when we start mixing very different frames of reference for the user and for content (e.g., geospatial frames of reference, local frames of reference from inside-out or outside-in tracking, and frames of reference used by other kinds of sensing, such as computer vision tracking like PTC's Vuforia), having more explicit queries to determine if one frame is known relative to another (the "getTransformTo" on VRCoordinateSystem suggestion) will end up being a core capability for making apps work robustly.

Here's an example of an issue when doing large-scale AR/MR. We eventually decided not to have the programmer specify the coordinate frame to use as the local origin for graphics rendering, but rather to have the system pick it. We want to support experiences over a very large scale, not just rooms or buildings, but experiences that may run as the user moves through a town or city, or even travels across the country. To support this, we have the system decide when to "recenter" the "local coordinates used by the system for rendering and math" (e.g., the local Euclidean system) at a new geospatial location, and have the programmer react when it changes. When you start working at geospatial scale (objects on the earth, likely represented in ECEF coordinates, or even off the earth, such as near-earth satellites or other planetary bodies), you need to keep the rendering local but still be able to represent the objects in their natural frames.

I'm not sure of the best way to manage that in WebVR. If you want a VR or MR app that will work as the user travels (kids playing "Pokemon GO for MR" in the backseat of the car on a road trip, or a VR app that lets you travel around a city or country), this will be a concern. Forcing each programmer to decide when to move their local frame of reference based on things like accuracy issues (do I need to get my head pose relative to a new frame of reference now?) seemed impractical to us. Certainly, devices like Hololens will have to deal with this as they move smoothly from working at "building scale" to "world scale".

An experience that utilizes knowledge of the floor plane and encourages users to walk around within specific bounds. An example of this category of experience is CAD modeling which allows a user to walk around the object being modeled.

<dfn>World-scale experience</dfn>
An experience that takes advantage of the ability to walk anywhere without bounds. In such experiences, there is no single floor plane. An example of this category of experience is turn-by-turn directions within a multistory building.
Contributor

I think you may want to differentiate between "building-scale" and "world-scale". As we move beyond "a single room with a single ground plane" into "a local world that is more complex", there are issues to be dealt with, but a local 3D coordinate system is still sufficient to relate things to each other and do rendering. As you move to geospatial coordinates (true world-scale), the use of a local coordinate system becomes difficult. For example, a natural coordinate system for the world is ECEF (earth-centered earth-fixed), which has the "ground plane from the user's viewpoint" not sitting on the X/Z plane with Y up (rather, the ground plane is a tangent plane to the earth). So, perhaps world-scale isn't the right term here.


I'm happy to see discussion of including base-level support for working in both geospatial and 3D computer graphics coordinate systems. I expect that there are plenty of good use cases where people would like to bring data from geospatial databases into a VR scene.

X3D resolved to use ECEF (WGS84) as the base world coordinate system, with an optional origin shift due to precision issues, while maintaining a local coordinate system rotated so that Y is up to support standard navigation functionality in the browser. We also realized that scene authors need the ability to work in both coordinate systems. Perhaps WebVR could have a VREarthFrameOfReference with a .getTransformTo() method for getting the transformation matrices between the coordinate systems?

Contributor

One of the things Cesium does is provide a convenient way of asking for the East-North-Up coordinate system on the surface of the WGS84 ellipsoid at a point on the surface (which is how we get a reasonable local coordinate system in @argonjs). The downside is it's not a cheap calculation, so you don't want people asking for this on a regular basis. But, yes, a VREarthFrameOfReference (ECEF) would be a nice extension. Especially since, once you can go between ECEF and WebVR coordinates, you can then choose to use libraries like cesium.js to do much more complex things in geospatial coordinates.
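For the curious, a minimal sketch of the math involved in deriving an east-north-up frame at a geodetic point (the function name is illustrative; a real implementation would go through a library like cesium.js and account for the ellipsoid when converting positions):

// ECEF -> local east-north-up rotation basis at geodetic latitude/longitude (radians).
function enuBasisAt(lat, lon) {
  var sLat = Math.sin(lat), cLat = Math.cos(lat);
  var sLon = Math.sin(lon), cLon = Math.cos(lon);
  return {
    east:  [-sLon,        cLon,         0   ],
    north: [-sLat * cLon, -sLat * sLon, cLat],
    up:    [ cLat * cLon,  cLat * sLon, sLat]
  };
}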

@@ -730,6 +822,15 @@ partial interface Gamepad {
<dfn attribute for="Gamepad" id="gamepad-getvrdisplays-attribute">displayId</dfn>
Return the {{VRDisplay/displayId}} of the {{VRDisplay}} this {{Gamepad}} is associated with. A {{Gamepad}} is considered to be associated with a {{VRDisplay}} if it reports a pose that is in the same space as the {{VRDisplay}} pose. If the {{Gamepad}} is not associated with a {{VRDisplay}}, this attribute should return 0.

<dfn attribute for="Gamepad">hasPosition</dfn>
Member

Here's a thought: Seems like the concepts of having a position/orientation/etc. are something that is actually tied to the FrameOfReference? The attached coordinate system has no position, while the others do, etc. Would it be reasonable to move these capability bits onto the FrameOfReference, and then make it so that the device's capabilities are inferred based on the FramesOfReference that it supports? That would eliminate the need to awkwardly patch these values into the Gamepad, at least. Or maybe I'm overthinking it. :)

Contributor

I was having similar thoughts. When reviewing getPose() for instance, there is this magic between the Gamepad and the VRCoordinateSystem of the VRFrameData...

Contributor

In general, a "pose" requires both position and orientation. When one isn't available, it's pretty common for a system to use a default (identity, for example). Having a flag that says which of these is "valid" would be great.

Contributor

While we're at it, I'd love to see all the pose() values we report from sensors having some accuracy estimates associated with them. For example, you could use them to estimate how precise the alignment of a virtual and physical thing is (we did this years ago for AR, it was super useful when you start getting out into real spaces, see http://www.cc.gatech.edu/projects/ael/projects/accounting.html)
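A hypothetical sketch of the "capability bits on the frame of reference" idea from this thread (none of this is in the PR):

interface VRFrameOfReference {
  readonly attribute VRCoordinateSystem coordinateSystem;
  readonly attribute boolean hasPosition;    // e.g. false for the attached/body-locked frame
  readonly attribute boolean hasOrientation;
};
// A device's overall capabilities would then be inferred from which
// create___FrameOfReference() calls succeed, rather than patched onto Gamepad.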

@toji added this to the 1.2 milestone Nov 14, 2016
The {{Gamepad}}.{{Gamepad/hasOrientation}} attribute MUST return whether the {{Gamepad}}'s orientation is capable of being tracked.

<dfn method for="Gamepad">getPose()</dfn>
Retrieves a {{VRPose}} in the supplied {{VRCoordinateSystem}} for the current {{VRFrameData}}. This function will return false if the Gamepad's pose cannot be expressed in the supplied {{VRCoordinateSystem}} and will return true otherwise.
Contributor

Gamepads can gain value from being tracked at a faster refresh rate or separately from the HMD, in some cases allowing for better embodied presence. Is it correct, then, to state that the Gamepad's pose must be related to the current VRFrameData?

Maybe this is how it has to be for WebVR to make sense, though; some normalization process has to occur, otherwise having Gamepads tracked in completely different ways from the HMD and maybe other tracked objects would be hard to reason about.

void resetPose();
VRStageFrameOfReference? createStageFrameOfReference();

boolean getFrameData(VRCoordinateSystem coordinateSystem, VRFrameData frameData);
Contributor

If you moved the create methods into an option on the new VRFrameData() constructor, then each VRFrameData could carry a read-only value for its frame of reference. I recognize that the frame of reference has to carry implementation-specific data internally to make sense, and thus its object identity is important. If it weren't, we could reduce it down to just an enumeration.

Member

We'd still need a way to query what frames of reference the VRDisplay supports.
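A hypothetical sketch of the constructor-option idea (the dictionary shape and "stage" value are invented for illustration):

// The frame of reference is chosen when the VRFrameData is constructed and
// exposed read-only, instead of being passed to getFrameData() on every call.
var frameData = new VRFrameData({ frameOfReference: "stage" });
vrDisplay.getFrameData(frameData);
console.log(frameData.frameOfReference); // "stage"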

<dfn method for="VRDisplay">createAttachedFrameOfReference()</dfn>
Creates a new {{VRAttachedFrameOfReference}} for the {{VRDisplay}}. The returned frame of reference's {{VRCoordinateSystem}} should be supplied to {{getFrameData()}} for 3DOF experiences (such as 360 video) and as a fallback for other experiences when positional tracking is unavailable.

While the returned {{VRAttachedFrameOfReference}} is body-locked, neck-modeling may be included and, as such, {{VRFrameData}} objects filled in by calls to {{getFrameData()}} using the {{VRAttachedFrameOfReference}}.{{VRAttachedFrameOfReference/coordinateSystem}} MAY include position information.
Contributor

We should assume that a developer will use all possible combinations of a frame of reference with all possible types of devices, and describe the fallbacks explicitly. The line on 197 (the last bit of it) implies that it should be "supplied" as a fallback; however, that implies that the author had to do some work to choose this attachment. Shouldn't it just be that they create the attachment they want, and then we have some guarantees about how an attachment of that sort works no matter what the underlying device supports?

@toji (Member) commented Dec 8, 2016

Thoughts that grew out of discussing this proposal with zSpace today: I'm still generally concerned about how to make this work with most gamepads, because they seem to fall into two camps:

  • The Gamepad is inherently linked with a VRDisplay (or similar object) and thus should be using a FrameOfReference retrieved from said display.
  • The Gamepad is standalone, with basic gyros/accelerometers but nothing with which to associate them except gravity (PS4 controller/Wiimote in the absence of an outside input), in which case it gets its FrameOfReference from... where?

Of course, if your device doesn't have any external frame of reference there's only so much it can report. Basically just the deltas from its last state. Which... sounds an awful lot like the proposed AttachedFrameOfReference. (Orientation only, no association with other FoRs, etc.) Now as far as I can see such a FoR should be supportable by ANYTHING with a motion sensor, and doesn't actually have much (if anything) to do with a particular piece of hardware. It deals exclusively with the hardware's motion relative to itself, while the other FoRs deal with the hardware's motion relative to something else. (Hope I'm making sense here...)

So given that, does it make sense that the AttachedFrameOfReference (sorry, still don't like that name) be "Global"? That is, you could retrieve one without any device, and use it with any device. Maybe something like:

gamepad.getPose(navigator.vr.attachedFrameOfReference);

Or:

var defaultFoR = new AttachedFrameOfReference();
vrDisplay.getFrameData(defaultFoR);
gamepad.getPose(defaultFoR);

This would affect where we handle pose resetting, since it would no longer be practical to do it on an object like this. In any case, any other FoR would be queried from the device somehow because they're all going to have something to do with the hardware's capabilities and how it views the world, but this one is "special" and can be used with minimal hassle anywhere.

I feel like I'm maybe overlooking something here. Thoughts?

@klausw (Contributor) commented Apr 3, 2017

The reference frame used isn't just informational, it can also affect behavior directly. I think we should make sure that the spec covers this distinction.

For example, on an OpenVR (Vive) system, requesting seated relative poses modifies the way that the Chaperone warning system works. As long as your headset remains close enough to the seated origin, the Chaperone display is suppressed. This is useful for cases where the seated position is at the edge of the roomscale stage area, or possibly even outside it. Using standing mode in this scenario would be very annoying if it permanently shows chaperone boundaries in your field of view. In this case, the underlying implementation should be able to call IVRCompositor::SetTrackingSpace(TrackingUniverseOrigin::TrackingUniverseSeated) or equivalent if it can infer that the WebVR application wants to use seated space.

I think this distinction is covered if poses are specifically requested for an appropriate coordinate system, but would be difficult to implement if it's based on providing conversion matrices where inferring the intended usage scenario may be difficult.

@toji (Member) commented May 13, 2017

This concept has been incorporated into the 2.0 explainer for a while now, so I'm closing this as a matter of housekeeping.

@toji closed this May 13, 2017
@cwilso modified the milestones: Spec-Complete for 1.0, 1.0 Apr 30, 2019