Fix WavePrefixCountBits() being off by one.
It was counting bits up to and including the current lane, whereas the
documentation says the current lane should be excluded. This now matches
dxc's behavior as well.

Fix KhronosGroup#2929
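
To illustrate the off-by-one, here is a minimal CPU-side sketch of the two scan flavours over a 32-lane ballot mask. The helper names `exclusivePrefixCount` and `inclusivePrefixCount` are hypothetical and are not part of glslang; the sketch assumes a single 32-bit ballot word, whereas the real builtin operates on a uvec4.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper: count set ballot bits in lanes strictly below `lane`.
// This is the behaviour the commit switches to (ExclusiveScan).
inline unsigned exclusivePrefixCount(std::uint32_t ballot, unsigned lane)
{
    // Mask selects lanes [0, lane); a lane index of 32 or more selects every lane.
    const std::uint32_t below =
        (lane >= 32u) ? 0xFFFFFFFFu : ((1u << lane) - 1u);
    return static_cast<unsigned>(std::bitset<32>(ballot & below).count());
}

// Hypothetical helper: count set ballot bits in lanes [0, lane].
// This is the previous, off-by-one behaviour (InclusiveScan).
inline unsigned inclusivePrefixCount(std::uint32_t ballot, unsigned lane)
{
    const unsigned own = (lane < 32u) ? ((ballot >> lane) & 1u) : 0u;
    return exclusivePrefixCount(ballot, lane) + own;
}
```

With a ballot of 0xB (lanes 0, 1 and 3 set), lane 3 gets 2 from the exclusive count and 3 from the inclusive count. The two results differ only on lanes whose own ballot bit is set.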
Ryp committed Apr 22, 2022
1 parent 06ac141 commit f906b89
Showing 2 changed files with 5 additions and 5 deletions.
6 changes: 3 additions & 3 deletions Test/baseResults/hlsl.waveprefix.comp.out
@@ -1126,7 +1126,7 @@ local_size = (32, 16, 1)
 0:54 0 (const int)
 0:54 Constant:
 0:54 0 (const int)
-0:54 subgroupBallotInclusiveBitCount ( temp uint)
+0:54 subgroupBallotExclusiveBitCount ( temp uint)
 0:54 subgroupBallot ( temp 4-component vector of uint)
 0:54 Compare Equal ( temp bool)
 0:54 direct index ( temp uint)
@@ -2289,7 +2289,7 @@ local_size = (32, 16, 1)
 0:54 0 (const int)
 0:54 Constant:
 0:54 0 (const int)
-0:54 subgroupBallotInclusiveBitCount ( temp uint)
+0:54 subgroupBallotExclusiveBitCount ( temp uint)
 0:54 subgroupBallot ( temp 4-component vector of uint)
 0:54 Compare Equal ( temp bool)
 0:54 direct index ( temp uint)
@@ -2818,7 +2818,7 @@ local_size = (32, 16, 1)
 390: 6(int) Load 389
 392: 391(bool) IEqual 390 26
 393: 13(ivec4) GroupNonUniformBallot 35 392
-394: 6(int) GroupNonUniformBallotBitCount 35 InclusiveScan 393
+394: 6(int) GroupNonUniformBallotBitCount 35 ExclusiveScan 393
 395: 42(ptr) AccessChain 24(data) 25 386 25 26
 Store 395 394
 Return
4 changes: 2 additions & 2 deletions glslang/HLSL/hlslParseHelper.cpp
@@ -5430,7 +5430,7 @@ void HlslParseContext::decomposeIntrinsic(const TSourceLoc& loc, TIntermTyped*&
         }
     case EOpWavePrefixCountBits:
         {
-            // Mapped to subgroupBallotInclusiveBitCount(subgroupBallot())
+            // Mapped to subgroupBallotExclusiveBitCount(subgroupBallot())
             // builtin

             // uvec4 type.
@@ -5444,7 +5444,7 @@ void HlslParseContext::decomposeIntrinsic(const TSourceLoc& loc, TIntermTyped*&
             TType uintType(EbtUint, EvqTemporary);

             node = intermediate.addBuiltInFunctionCall(loc,
-                EOpSubgroupBallotInclusiveBitCount, true, res, uintType);
+                EOpSubgroupBallotExclusiveBitCount, true, res, uintType);

             break;
         }