Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WebGL backend] Add max 1D texture dimension flag #6808

Merged
merged 16 commits into from
Sep 1, 2022

Conversation

Linchenn
Copy link
Collaborator

@Linchenn Linchenn commented Aug 31, 2022

Fix: #6775

Bug summary

Problem

Mali GPU could not provide interpolation values with enough precision for long skinny triangles, proved by the demo. The varying from the interpolation values will be used to calculate coords (in type) for sampling, finally causing sampling mismatch.

Standalone code to reproduce the issue

tf.ENV.set('WEBGL_PACK_DEPTHWISECONV', true)

let w = Array.from({length: 3 * 3 * 816}, () => Math.random())
let x = Array.from({length: 12 * 10 * 816}, () => Math.random())

let inputs = {
    filter: tf.tensor(w, [3, 3, 816, 1]),
    x: tf.tensor(x, [1, 12, 10, 816]),
    strides: 1,
    pad: [[0, 0], [1, 1], [1, 1], [0, 0]],
    dataFormat: "channelsLast",
    dilations: 1,
    activation: 'relu'
};

tf.setBackend('webgl')
tf.ENV.set('WEBGL_MAX_TEXTURE_SIZE', 4096)
let out_4096 = tf.fused.depthwiseConv2d(inputs);

tf.ENV.set('WEBGL_MAX_TEXTURE_SIZE', 2048)
inputs.x = tf.tensor(x, [1, 12, 10, 816])
inputs.filter = tf.tensor(w, [3, 3, 816, 1])
let out_2048 = tf.fused.depthwiseConv2d(inputs);

tf.setBackend('cpu')
inputs.x = tf.tensor(x, [1, 12, 10, 816])
inputs.filter = tf.tensor(w, [3, 3, 816, 1])
let out_reference = tf.fused.depthwiseConv2d(inputs);

const doTensorsDiffer = function(t0, t1) {
	return tf.any(tf.greater(tf.abs(tf.sub(t0, t1)), tf.scalar(1e-2))).dataSync()[0];
}

console.log("Default and 2048 differ? " + doTensorsDiffer(out_4096, out_2048));
console.log("Reference and 2048 differ? " + doTensorsDiffer(out_reference, out_2048));
console.log("Reference and 4096 differ? " + doTensorsDiffer(out_reference, out_4096));

The three differs are expected to be 0, while Mali GPU generates 1 for the first and the third.

Findings

Take the texture with physical size of [3672, 1] for example. As the second dimension increase, the precision becomes better. Actually, when the second dimension is increased to 2, the precision for calculating coords is enough. (reference data: for each table: the second column is the computed index from varying values; the third and forth columns are ground-truth; the fifth column is the difference between indices. You could find all difference between indices are 0.)

Solution

This PR expose a flag, WEBGL_MAX_TEXTURE_DIMENSION_FOR_1D_TEXTURE. For 1D texture (physical height or physical width is 1), if the physical length of any texture edges exceed the threshold, the other edge will be increased to 2.

To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.


This change is Reviewable

@Linchenn Linchenn requested a review from pyu10055 August 31, 2022 01:19
@Linchenn
Copy link
Collaborator Author

cc @shanumantesc

@Linchenn Linchenn requested a review from qjia7 August 31, 2022 01:22
Copy link
Contributor

@qjia7 qjia7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

Can you also add the depthwiseConv2d case described in the bug for webgl?

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)

Copy link
Contributor

@qjia7 qjia7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget to submit the test file?

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Lin for investigating and fixing this deep hidden bug. I am curious why we take the approach to double the dimension of shape 1, instead of using the squarish function?
I assume the squarish function to produce smaller texture size and potentially reduce the computation if the texture is output.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @Linchenn)


tfjs-backend-webgl/src/webgl_util.ts line 446 at r1 (raw file):

      env().getNumber('WEBGL_MAX_TEXTURE_DIMENSION_FOR_1D_TEXTURE') *
      (isPacked ? 2 : 1);
  if (textureShape[0] > maxDimFor1DTex || textureShape[1] > maxDimFor1DTex) {

It would be good to separate these two conditions to avoid unnecessary computation later.


tfjs-backend-webgl/src/webgl_util.ts line 449 at r1 (raw file):

    // For 1D texture, if the length exceeds maxDimFor1DTex, the texture will be
    // upgraded to 2D with a physical shape as [length, 2] or [2, length].
    textureShape[0] = Math.max(isPacked ? 4 : 2, textureShape[0]);

should the packed texture be 2?

@Linchenn Linchenn requested a review from pyu10055 August 31, 2022 18:38
Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Ping for the suggestion! Updated the algorithm to squarify the shape.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-backend-webgl/src/webgl_util.ts line 446 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

It would be good to separate these two conditions to avoid unnecessary computation later.

Updated.


tfjs-backend-webgl/src/webgl_util.ts line 449 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

should the packed texture be 2?

Using squarifying now.

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qjia7 Since this PR only updates the getTextureShapeFromLogicalShape function, I just added some tests for it.

For the depthwise's mismatch bug here, we do not guarantee this PR could fix it. Instead, this PR expose the flags to users and let users to tune the flag and correct the mismatch.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @Linchenn and @pyu10055)


tfjs-backend-webgl/src/webgl_util.ts line 374 at r2 (raw file):

      env().getNumber('WEBGL_MAX_SIZE_FOR_NARROW_TEXTURE');
  if (maxSizeForNarrorTex === Infinity &&
      env().getBool('WEBGL_AUTO_SQUARIFY_NARROW_TEXTURE_SHAPE_FOR_MALI_GPU')) {

we need to detect the GPU type if it is MALI, to ensure it only applies to MALI GPU.
or you can change the name to WEBGL_AUTO_SQUARIFY_NARROW_TEXTURE_SHAPE
so it is user's responsibility to detect MALI GPU.


tfjs-backend-webgl/src/webgl_util.ts line 404 at r2 (raw file):

      (textureShape: [number, number]) => {
        return Math.max(...textureShape) > maxSizeForNarrorTex &&
            Math.min(...textureShape) <= (isPacked ? 2 : 1);

if the texture is a 0 shaped tensor, we should not change it.
Which means Math.min(...textureShape) === (isPacked ? 2 : 1)


tfjs-backend-webgl/src/webgl_util.ts line 420 at r2 (raw file):

      logShape.length === 2 && logShape[0] <= maxTexSize &&
      logShape[1] <= maxTexSize &&
      !isLongNarrowTex(logShape as [number, number])) {

cache this boolean value, since it is checked in multiple places.

@Linchenn Linchenn requested a review from pyu10055 August 31, 2022 21:55
Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Ping!

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-backend-webgl/src/webgl_util.ts line 374 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

we need to detect the GPU type if it is MALI, to ensure it only applies to MALI GPU.
or you can change the name to WEBGL_AUTO_SQUARIFY_NARROW_TEXTURE_SHAPE
so it is user's responsibility to detect MALI GPU.

Renamed to WEBGL_AUTO_SQUARIFY_NARROW_TEXTURE_SHAPE, as we are not sure about when Mali GPU's varying becomes correct. We could add it when we want to maintain this.


tfjs-backend-webgl/src/webgl_util.ts line 404 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

if the texture is a 0 shaped tensor, we should not change it.
Which means Math.min(...textureShape) === (isPacked ? 2 : 1)

Since we need to include both Math.min(...textureShape) === 1 and Math.min(...textureShape) ===2 for isPacked === true, I instead added one more condition here.


tfjs-backend-webgl/src/webgl_util.ts line 420 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

cache this boolean value, since it is checked in multiple places.

Since the input for isLongNarrowTex is different for each branch, maybe we could not cache it

Copy link
Collaborator Author

@Linchenn Linchenn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)


tfjs-backend-webgl/src/webgl_util.ts line 420 at r2 (raw file):

Previously, Linchenn wrote…

Since the input for isLongNarrowTex is different for each branch, maybe we could not cache it

Moves checking isLongNarrowTex to the end of the function. Then we do not need to add isLongNarrowTex to multiple places.

@Linchenn Linchenn requested a review from qjia7 September 1, 2022 02:09
Copy link
Contributor

@qjia7 qjia7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nits. Thanks.

return [1, size];
let textureShape: [number, number];
if (logShape.length <= 1 && size <= maxTexSize &&
size <= maxSizeForNarrorTex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: size <= maxSizeForNarrorTex is not needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

@@ -395,30 +403,42 @@ export function getTextureShapeFromLogicalShape(
}

let size = util.sizeFromShape(logShape);
if (logShape.length <= 1 && size <= maxTexSize) {
return [1, size];
let textureShape: [number, number];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let textureShape = null. Then the last else {} can be removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @pyu10055)

@@ -368,8 +368,16 @@ export function getShapeAs3D(shape: number[]): [number, number, number] {
export function getTextureShapeFromLogicalShape(
logShape: number[], isPacked = false): [number, number] {
let maxTexSize = env().getNumber('WEBGL_MAX_TEXTURE_SIZE');
let maxSizeForNarrorTex =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let maxSizeForNarrorTex =
let maxSizeForNarrowTex =

(same everywhere else)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done.

@Linchenn Linchenn merged commit 83d69c9 into tensorflow:master Sep 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mismatch in packed depthwise conv 2d results on Mali GPU
4 participants