-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mipmaps: Disable slope mode pending a better method. Implementing it entirely as a LOD bias is not the way it works. #10130
Conversation
Are you sure? Maybe I forgot to push the pspautotest for it - I can't look until later, but I'm pretty sure the slope DID work like that. I had notes somewhere... Did you try testing? But I did find something else was affecting the slope calculation, just didn't know what it was and didn't have time then to look into it more. -[Unknown] |
See notes: I welcome any further tests that point to actual behavior. But I definitely saw slope being affected by bias, it was pretty clear. I think something else is affecting how slope is calculated before the bias. -[Unknown] |
No I'm not saying that it's not affected by bias, I think bias is applied on top of the SLOPE calculation. But remember that when you sample normally using a GPU, it does its own AUTO calculation. I don't think the slope calculation, whatever it is, happens on top of that, but as a replacement - and thus, it's wrong to apply SLOPE as a bias instead of directly as a level to the sampler together with the bias (which needs to be done through a shader instruction). And instead of doing that, let's turn it off until we understand it. |
That's also true of CONST, it is not a constant bias (that's an old misunderstanding from the original code trying to support it), it also directly selects a mip level. The point of setting a min and a max level was to try to force it to take the desired level. -[Unknown] |
As for CONST, yeah I understand that, and that should be correct. I try to do that in the modified code too. Though like I said, we should probably implement both slope and const using textureLod() : https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/textureLod.xhtml , but that's a later project. |
It might be good to actually run through tests, though. Here's the 2d test, although I may have hacked it to test something else: #include <common.h>
#include <malloc.h>
#include <pspdisplay.h>
#include <pspgu.h>
#include <pspkernel.h>
extern "C" int sceDmacMemcpy(void *dest, const void *source, unsigned int size);
extern "C" void sendCommandi(int cmd, int argument);
extern int HAS_DISPLAY;
u8 *fbp0 = 0;
u8 *dbp0 = fbp0 + 512 * 272 * sizeof(u32);
static u32 copybuf[512 * 272];
u32 *drawbuf;
unsigned int __attribute__((aligned(16))) list[262144];
enum {
IMG_HALVES,
IMG_EQUAL,
IMG_REVERSE,
};
u32 *imageData[3] = {};
u32 imageColors[] = {
0xFF000000,
0xFF101010,
0xFF202020,
0xFF303030,
0xFF404040,
0xFF505050,
0xFF606060,
0xFF707070,
};
typedef struct {
u16 u, v;
s16 x, y, z;
} Vertex;
enum {
VERTS_EXACT,
VERTS_MAGNIFY,
VERTS_2X,
VERTS_4X_W,
VERTS_4X_H,
VERTS_64X,
VERTS_256X,
};
inline s16 norm16x(int x) { return x * 65536 / 480 - 32768; }
inline s16 norm16y(int x) { return 32767 - x * 65536 / 272; }
Vertex verticesThrough[][2] = {
{ {0, 0, 0, 0, 0}, {2, 2, 2, 2, 0} },
{ {0, 0, 0, 0, 0}, {2, 2, 8, 8, 0} },
{ {0, 0, 0, 0, 0}, {4, 4, 2, 2, 0} },
{ {0, 0, 0, 0, 0}, {8, 1, 2, 2, 0} },
{ {0, 0, 0, 0, 0}, {1, 8, 2, 2, 0} },
{ {0, 0, 0, 0, 0}, {64, 64, 1, 1, 0} },
{ {0, 0, 0, 0, 0}, {256, 256, 1, 1, 0} },
};
Vertex verticesTransform[][2] = {
{ {0, 0, norm16x(0), norm16y(0), 65500}, {256, 256, norm16x(2), norm16y(2), 65535} },
{ {0, 0, norm16x(0), norm16y(0), 65500}, {256, 256, norm16x(8), norm16y(8), 65535} },
{ {0, 0, norm16x(0), norm16y(0), 65500}, {512, 512, norm16x(2), norm16y(2), 65535} },
{ {0, 0, norm16x(0), norm16y(0), 65500}, {1024, 128, norm16x(2), norm16y(2), 65535} },
{ {0, 0, norm16x(0), norm16y(0), 65500}, {128, 1024, norm16x(2), norm16y(2), 65535} },
{ {0, 0, norm16x(0), norm16y(0), 65500}, {8192, 8192, norm16x(1), norm16y(1), 65535} },
{ {0, 0, norm16x(0), norm16y(0), 65500}, {32768, 32768, norm16x(1), norm16y(1), 65535} },
};
void displayBuffer(const char *reason) {
sceKernelDcacheWritebackInvalidateAll();
sceDmacMemcpy(copybuf, drawbuf, sizeof(copybuf));
sceKernelDcacheWritebackInvalidateAll();
const u32 *buf = copybuf;
int c = buf[0] & 0x00FFFFFF;
checkpoint("%s: %06x", reason, c);
// Reset.
memset(copybuf, 0, sizeof(copybuf));
sceKernelDcacheWritebackInvalidateAll();
sceDmacMemcpy(drawbuf, copybuf, sizeof(copybuf));
sceKernelDcacheWritebackInvalidateAll();
}
u32 *getTexturePtr(int img, int l) {
int p = 0;
int w;
if (img == IMG_HALVES) {
w = 256;
} else if (img == IMG_EQUAL) {
w = 16;
} else if (img == IMG_REVERSE) {
w = 4;//1;
}
while (l-- > 0) {
p += w * w;
if ((p % 4) != 0) {
p += 4 - (p % 4);
}
if (img == IMG_HALVES) {
w >>= 1;
} else if (img == IMG_EQUAL) {
w = 16;
} else if (img == IMG_REVERSE) {
w <<= 1;
}
}
return imageData[img] + p;
}
void drawTexFlush(int img, unsigned int mode, const void *verts, u8 bias) {
sceGuStart(GU_DIRECT, list);
sceGuEnable(GU_TEXTURE_2D);
sceGuTexSync();
sceGuTexFlush();
sendCommandi(200, (bias << 16) | mode);
sceGuTexMode(GU_PSM_8888, 7, 0, GU_FALSE);
sceGuTexWrap(GU_CLAMP, GU_CLAMP);
sceGuTexFunc(GU_TFX_DECAL, GU_TCC_RGB);
sceGuTexSlope(2.0f);
for (int l = 0; l < 8; ++l) {
int w;
if (img == IMG_HALVES) {
w = 256 >> l;
} else if (img == IMG_EQUAL) {
w = 16;
} else if (img == IMG_REVERSE) {
//w = 1 << l;
w = 4 << l;
}
sceGuTexImage(l, w, w, w < 4 ? 4 : w, getTexturePtr(img, l));
}
sceGuDrawArray(GU_SPRITES, GU_TEXTURE_16BIT | GU_VERTEX_16BIT | GU_TRANSFORM_2D, 2, NULL, verts);
sceGuFinish();
sceGuSync(0, 0);
}
void init() {
void *fbp0 = 0;
drawbuf = (u32 *)sceGeEdramGetAddr();
sceGuInit();
sceGuStart(GU_DIRECT, list);
sceGuDrawBuffer(GU_PSM_8888, fbp0, 512);
sceGuDispBuffer(480, 272, fbp0, 512);
sceGuDepthBuffer(dbp0, 512);
sceGuOffset(2048 - (480 / 2), 2048 - (272 / 2));
sceGuViewport(2048, 2048, 480, 272);
sceGuDepthRange(65535, 65500);
sceGuDepthMask(0);
sceGuScissor(0, 0, 480, 272);
sceGuEnable(GU_SCISSOR_TEST);
sceGuFrontFace(GU_CW);
sceGuShadeModel(GU_SMOOTH);
ScePspFMatrix4 ones = {
{1, 0, 0, 0},
{0, 1, 0, 0},
{0, 0, 1, 0},
{0, 0, 0, 1},
};
sceGuSetMatrix(GU_MODEL, &ones);
sceGuSetMatrix(GU_VIEW, &ones);
sceGuSetMatrix(GU_PROJECTION, &ones);
sceGuFinish();
sceGuSync(0, 0);
sceDisplayWaitVblankStart();
sceGuDisplay(1);
memset(copybuf, 0, sizeof(copybuf));
sceKernelDcacheWritebackInvalidateAll();
sceDmacMemcpy(drawbuf, copybuf, sizeof(copybuf));
sceKernelDcacheWritebackInvalidateAll();
HAS_DISPLAY = 0;
}
void setupTexture() {
imageData[IMG_HALVES] = (u32 *)memalign(16, 4 * 256 * 344);
int w = 256;
for (int l = 0; l < 8; ++l) {
u32 *p = getTexturePtr(IMG_HALVES, l);
int bufw = w < 4 ? 4 : w;
//p[0] = imageColors[l];
//p[1] = imageColors[l];
//p[bufw] = imageColors[l];
//p[bufw + 1] = imageColors[l];
for (int i = 0; i < bufw * w; ++i) {
p[i] = imageColors[l];
}
w >>= 1;
}
imageData[IMG_EQUAL] = (u32 *)memalign(16, 4 * 16 * 16 * 8);
w = 16;
for (int l = 0; l < 8; ++l) {
u32 *p = getTexturePtr(IMG_EQUAL, l);
int bufw = w < 4 ? 4 : w;
for (int i = 0; i < bufw * w; ++i) {
p[i] = imageColors[l];
}
}
//imageData[IMG_REVERSE] = (u32 *)memalign(16, 4 * 256 * 344);
imageData[IMG_REVERSE] = (u32 *)memalign(16, 4 * 512 * 1024);
w = 4;//1;
for (int l = 0; l < 8; ++l) {
u32 *p = getTexturePtr(IMG_REVERSE, l);
int bufw = w < 4 ? 4 : w;
//p[0] = imageColors[l];
//p[1] = imageColors[l];
//p[bufw] = imageColors[l];
//p[bufw + 1] = imageColors[l];
for (int i = 0; i < bufw * w; ++i) {
p[i] = imageColors[l];
}
w <<= 1;
}
sceKernelDcacheWritebackInvalidateAll();
}
void testUvRotationBox(const char *title, int img, unsigned int mode, int vindex, u8 bias) {
const void *verts = verticesThrough[vindex];
drawTexFlush(img, mode, verts, bias);
sceDisplayWaitVblankStart();
displayBuffer(title);
}
void testUVRotation(const char *title, int img, unsigned int mode, u8 bias) {
char full[1024] = {0};
snprintf(full, sizeof(full) - 1, "%s +%02x:", title, bias);
checkpointNext(full);
testUvRotationBox(" 1:1", img, mode, VERTS_EXACT, bias);
testUvRotationBox(" Magnify", img, mode, VERTS_MAGNIFY, bias);
testUvRotationBox(" Minify 2x WH", img, mode, VERTS_2X, bias);
testUvRotationBox(" Minify 4x W", img, mode, VERTS_4X_W, bias);
testUvRotationBox(" Minify 4x H", img, mode, VERTS_4X_H, bias);
testUvRotationBox(" Minify 64x", img, mode, VERTS_64X, bias);
testUvRotationBox(" Minify 256x", img, mode, VERTS_256X, bias);
}
extern "C" int main(int argc, char *argv[]) {
init();
setupTexture();
sceDisplaySetFrameBuf(sceGeEdramGetAddr(), 512, GU_PSM_8888, PSP_DISPLAY_SETBUF_IMMEDIATE);
sceDisplaySetMode(0, 480, 272);
sceGuStart(GU_DIRECT, list);
//sceGuTexFilter(GU_NEAREST_MIPMAP_NEAREST, GU_NEAREST_MIPMAP_NEAREST);
sceGuTexFilter(GU_NEAREST_MIPMAP_LINEAR, GU_NEAREST_MIPMAP_LINEAR);
sceGuFinish();
sceGuSync(0, 0);
u8 biases[] = {0x00, 0x07, 0x08, 0x10, 0x70, 0x77, 0x78, 0x80, 0x87, 0x88, 0xF0};
for (size_t i = 0; i < ARRAY_SIZE(biases); ++i) {
testUVRotation("Typical mips (AUTO)", IMG_HALVES, GU_TEXTURE_AUTO, biases[i]);
testUVRotation("Equal mips (AUTO)", IMG_EQUAL, GU_TEXTURE_AUTO, biases[i]);
testUVRotation("Reversed mips (AUTO)", IMG_REVERSE, GU_TEXTURE_AUTO, biases[i]);
}
for (size_t i = 0; i < ARRAY_SIZE(biases); ++i) {
testUVRotation("Typical mips (CONST)", IMG_HALVES, GU_TEXTURE_CONST, biases[i]);
testUVRotation("Equal mips (CONST)", IMG_EQUAL, GU_TEXTURE_CONST, biases[i]);
testUVRotation("Reversed mips (CONST)", IMG_REVERSE, GU_TEXTURE_CONST, biases[i]);
}
for (size_t i = 0; i < ARRAY_SIZE(biases); ++i) {
testUVRotation("Typical mips (SLOPE)", IMG_HALVES, GU_TEXTURE_SLOPE, biases[i]);
testUVRotation("Equal mips (SLOPE)", IMG_EQUAL, GU_TEXTURE_SLOPE, biases[i]);
testUVRotation("Reversed mips (SLOPE)", IMG_REVERSE, GU_TEXTURE_SLOPE, biases[i]);
}
for (size_t i = 0; i < ARRAY_SIZE(biases); ++i) {
testUVRotation("Typical mips (3)", IMG_HALVES, 3, biases[i]);
testUVRotation("Equal mips (3)", IMG_EQUAL, 3, biases[i]);
testUVRotation("Reversed mips (3)", IMG_REVERSE, 3, biases[i]);
}
sceGuTerm();
free(imageData[IMG_HALVES]);
free(imageData[IMG_EQUAL]);
free(imageData[IMG_REVERSE]);
return 0;
} Gotta run for some hours. -[Unknown] |
See hrydgard/pspautotests#193, in case you don't have a setup to run PSP tests anymore. Software passes that test, but it gets #9772 wrong. So obviously there's a better test that could be written. I'm not against a workaround / disabling this SLOPE behavior, but I think it's best to be informed by how the actual hardware works - see stencil and other behaviors.
Note: about D3D9, you need to read the docs on how those settings work. You've got them wrong. They're backwards and confusing, blame Microsoft. Let's at least not break Direct3D 9 unnecessarily? -[Unknown] |
Thanks, I'll see what I can figure out to break as little as possible. I need to get my setup working again... What really annoys me is that there's absolutely no point for these games to be using SLOPE mode at all. We could just force it to AUTO and everything would be fine. I think it's a result of the games being lazy ports of PS2 games where you had to always compute the mipmap level manually and pass it in using a similar mechanism to SLOPE. |
Yeah, that sounds reasonable. See here for one usage of SLOPE (which we do get wrong but I think AUTO was wrong too): But generally you're probably right that treating it as AUTO is safer for now, at least until we figure it out. Just don't want to break the others. -[Unknown] |
7d3e95d
to
b1a93ad
Compare
Just wondering, in the output from test.py, how do you tell which of the tests failed? There's no context with each diff line.. Might be a good idea to change the test to output on every line whether it's SLOPE or AUTO etc. This is how I run the test, which produces hard-to-interpret output:
Vulkan should now be working better but has validation errors in debug mode when running headless. |
Oh, at some point I added a hack which uses this to a pspautotests/test.py: open("__tempfile.txt", "wt").write(output.replace("\r", "") + "\n")
proc = subprocess.Popen(["diff", "-U3", expected_filename, "__tempfile.txt"]) But yeah, could add it to each line, it's usually easier. -[Unknown] |
GPU/Common/TextureCacheCommon.cpp
Outdated
break; | ||
case GE_TEXLEVEL_MODE_CONST: | ||
key.maxLevel = (int)(lodBias * 256.0f); | ||
key.minLevel = (int)(lodBias * 256.0f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth thinking about if the bias is greater than the number of mip levels or less than zero.
Of course, in AUTO mode, this makes sense, as it's a bias (on top of the auto equation.) But in CONST, it's not a "bias", it's just specifying a CONSTant mip level. I think it clamps to available mip levels.
That mipmap test may not currently test this, unfortunately... I kept hacking it to try to figure out other things.
-[Unknown]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As previously mentioned, in all the APIs it seems to be OK to give out-of-range values for these.
GPU/Common/TextureCacheCommon.cpp
Outdated
key.minLevel = 0; | ||
key.lodBias = (int)(lodBias * 256.0f); | ||
break; | ||
case GE_TEXLEVEL_MODE_CONST: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, 3 is possible and operates the same way as CONST, I think this was handled by the old code, in case games accidentally use it.
-[Unknown]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixing.
Tried to break as little as possible. Will have to be good enough for now, let's revisit later. |
Works around #9772
@unknownbrackets What do you think? I think slope mode is not a bias on top of the regular automatic calculation, so using lod bias to implement it is wrong no matter what. Better disable it to avoid breaking Tony Hawk.