-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Artdink games geometry problem #5549
Comments
Is this affected by the prescale or software skinning options? -[Unknown] |
Not sure what you ment by "prescale", texture/rendering scalling? None of it affects it, software skinning on/off doesn't either. I also tried other options including cpu settings just to be sure, nothing really seems to affect it neither to better or worser. |
Looks like bad transform matrices to me, either world or bone matrices. Could be either a CPU bug or maybe even a GPU timing/scheduling problem, where the game replaces matrices in memory at the wrong point in time or something. |
Has this changed with the vrot change? If not, does multithreading affect it? -[Unknown] |
No change at all, multithreading also does not affect it. Edit: also what I didn't mention creating this issue, sometimes after special moves that instantly transform from mech to very fast flying fighter, the sky texture(or like 1 pixel from it) covers the screen in figher mode it also makes player loose all movement control until changing form(since it affects only fighter form) like it was zooming way out of the map, unfortunately it's random and sometimes hard to reproduce, have to use cheats, since otherwise special moves aren't really so common, looks like that: It covers whole screen with 1 color, like it was zooming in to that sky texture and concentrating on 1 pixel. |
Does this game ever call How terrible does this look in the software rendrerer? -[Unknown] |
Never calls that, also apart of a few extra glitches looks same in software. |
This might be a cpu core bug, happening both in JIT and interpreter wasn't exactly pointing to it, but IR interpreter makes a difference. |
Hmm, definitely could be a vfpu bug then. If you change in Core/MIPS/IR/IRCompVFPU.cpp: // #define CONDITIONAL_DISABLE { Comp_Generic(op); return; }
#define CONDITIONAL_DISABLE ; To: #define CONDITIONAL_DISABLE { Comp_Generic(op); return; }
// #define CONDITIONAL_DISABLE ; Does it go back to working the way it does in the interpreter? If yes, switch it back, and then above #undef CONDITIONAL_DISABLE
#define CONDITIONAL_DISABLE { Comp_Generic(op); return; } And you can keep doing that to "bisect" the file. When it's still working, move it upward. When it stops working, go downward. Then when you think you've found the function that does it, remove the lines and just add If you can find a function where -[Unknown] |
Thanks for the guide:
restores the issue when using IR interpreter, more specifically I removed that disable again and tried both cases there and disabling just:
restores it. So that's I guess where the bug must be in JIT & old interpreter. |
Hmm, interesting. It could be it's a prefix thing, then. What if you use You can also log the values (but this is on compilation, so it may log only once and well before the issue) with -[Unknown] |
Well it logs quite a bit, all with imm < 128, here's all log from [JIT] using IR interpreter:
|
I added logging of the current block when that happens, can you get the latest and run again? |
Note: I think it only logs to Visual Studio when you F5 in the Output tab. Not in the debug console. I guess this means we're handling a prefix more correctly in IR than in jit/interp? And so making it uneaten is breaking it? Or maybe it's a false positive. Maybe because it's now avoiding IR this made us not notice the real problem in some other op... Want to jump on #ppsspp if you have time? -[Unknown] |
I'm thinking it might be best to re-verify that it's this op by removing CONDITIONAL_DISABLE; from it, and then doing the bisect I described again. It's very possibly caused by a different op. -[Unknown] |
Couldn't see anything new in the log, but after rebuilding again got:
at that address it looks like: Well I checked most functions first time and that was the only one which disabling restores the bug. Through as I initially mentioned, IR has other issues, animations look all correct for a while, but after moving for a bit, they kind of get doubled, or more exactly run at double speed as movement is also affected that way, maybe that bug somehow just masks the other one, but then it does work completely fine at times, especially if I just walk in one direction doing nothing after spawning on the map to make a screenshot: |
Ha. That's an evil little prefix there. I guess we should inline that "if" somehow so we can maintain prefix info, maybe? I wonder how common that technique is. Hmm. You know, the thing is, if there's an uneaten prefix, we rebuild the block with vfpu jit off, basically. In Core/MIPS/x86/Jit.cpp, what if you just delete this part? // Drat. The VFPU hit an uneaten prefix at the end of a block.
if (js.startDefaultPrefix && js.MayHavePrefix()) {
WARN_LOG(JIT, "An uneaten prefix at end of block: %08x", GetCompilerPC() - 4);
js.LogPrefix();
// Let's try that one more time. We won't get back here because we toggled the value.
js.startDefaultPrefix = false;
cleanSlate = true;
} Or just delete the -[Unknown] |
Sorry got distracted ~ with this change IR interpreter has a lot of missing graphics, JIT however does appear to be fixed by it. |
Okay, what this means is that we probably have a bug in the interpreter only. Let me explain, because vfpu prefixes are tricky. Let me start with the important bit: As far as jit is concerned, the "prefixes" are part of the instruction. A prefix modifies how the instruction operates. Basically, I say "vfpxd [M, 0:1, 0:1, 0:1]" before a vfpu op, and now the vfpu op (only the next one) will: mask (not write) x, and clamp yzw each to [+0, 1]. I could set the prefix based on a GPR register, which could be read from a file. That would be very very evil, but I could do it. Handling this in the jit would make it even more complex than it already is... and it's rare. So we don't; if we find a prefix that we don't know the value of, we stop jitting vfpu ops entirely. We just give up, because the game is evil. It turns out, this game is evil. At least by our current definition. So whether you run jit or interpreter, the code in CompVFPU.cpp is not actually being used. It seems like the real problem is that we have a bug in interpreter which we don't have in jit. When you call You should get better results if you do the bisect while you have the code I quoted above removed. Now it won't be impacted by unused prefixes. Otherwise, I can't really see what mtv could be doing wrong. @hrydgard thinking about this, aside from fixing whatever interp bug, the best way to handle this behavior (if we want it to get the fast version of the jit) would be to detect when:
In that situation, we could compile the vfpu op (with prefixes) as the delay slot, and then branch to immediately after the vfpu op. This would keep us away from having any blocks starting with unknown prefixes. -[Unknown] |
Sorry, I think I added some confusion. The code in Core/MIPS/x86/Jit.cpp should only do anything if you are using the jit. The IR doesn't use that file at all - or it shouldn't. It uses Core/MIPS/IR/IRJit.cpp. Right now, IR and jit are completely disconnected. If you select jit, there's no IR. If you select IR, you're only using the IR interpreter. There's no IR jit yet. I was thinking you could bisect jit (ignore IR, pretend it doesn't exist) in Core/MIPS/x86/CompVFPU.cpp. I'm wondering which interpreter op is wrong. Does changing CONDITIONAL_DISABLE at the top of Core/MIPS/x86/CompVFPU.cpp not affect how cpu = jit (not ir) works? -[Unknown] |
Ah ok then, yeah changing conditional disable at the top of x86/CompVFPU restores the bug again in JIT, will try to find exact function now. Edit: Found it, putting disable under this:
restores the bug with JIT. |
Just a reminder that we don't stop entirely, we just stop jitting the first VFPU instruction in each block - as soon as the flags are "eaten" we can trust it again. But yes, this can surface bugs in the interpreter for sure. |
@unknownbrackets yes, it's pretty clear that the game just forgot to fill the delay slot with something else, and looking a little ahead into other blocks we can eliminate it like you say. |
Right, that's true, we'll jit after the first vfpu op in each block. -[Unknown] |
I wonder if this was somehow related to #6554. I know IR was different, though. Still happening the same? -[Unknown] |
It's still the same, however I made cwcheat workarounds for all games that have such problem(all I know of at least) by simply writing the same problematic code triggering uneaten prefix in a more wasteful way to avoid prefix opcode in the delay slot. (Well to be exact to avoid prefix opcode in the delay slot, followed by another prefix opcode) |
Sounds to me like #6554 is very different, that one was about not advancing the vertex pointer when decoding bbox calls, while this is about VFPU prefixes in delay slots, according to the comment #5549 (comment) |
So, this is either that vocp shouldn't apply prefixes on D (testable), mishandling of prefixes somehow from delay slots when bailing to interp (but probably not that), or interp not handling nan / some values properly. It would be great to reproduce this issue in a small hardware test. Starting from a clean build, you could replace alignas(16) u32 compareVocpValues[4];
u32 compareVocpEnable;
void Jit::Comp_Vocp(MIPSOpcode op) {
CONDITIONAL_DISABLE(VFPU_VEC);
// Hack: forcing this always on makes the geometry right...
//if (js.HasUnknownPrefix())
// DISABLE;
VectorSize sz = GetVecSize(op);
int n = GetNumVectorElements(sz);
u8 sregs[4], dregs[4];
GetVectorRegsPrefixS(sregs, sz, _VS);
GetVectorRegsPrefixD(dregs, sz, _VD);
// Flush SIMD.
fpr.SimpleRegsV(sregs, sz, 0);
fpr.SimpleRegsV(dregs, sz, MAP_NOINIT | MAP_DIRTY);
X64Reg tempxregs[4];
for (int i = 0; i < n; ++i) {
// HACK: Don't actually write to D regs, write to some temp regs.
int reg = fpr.GetTempV();
fpr.MapRegV(reg, MAP_NOINIT | MAP_DIRTY);
fpr.SpillLockV(reg);
tempxregs[i] = fpr.VX(reg);
}
MOV(PTRBITS, R(TEMPREG), ImmPtr(&one));
MOVSS(XMM1, MatR(TEMPREG));
for (int i = 0; i < n; ++i) {
MOVSS(XMM0, R(XMM1));
SUBSS(XMM0, fpr.V(sregs[i]));
MOVSS(tempxregs[i], R(XMM0));
}
for (int i = 0; i < n; ++i) {
MOVSS(M(&compareVocpValues[i]), tempxregs[i]);
}
ApplyPrefixD(dregs, sz);
fpr.ReleaseSpillLocks();
MOV(32, M(&compareVocpEnable), Imm32(1));
Comp_Generic(op);
MOV(32, M(&compareVocpEnable), Imm32(0));
} And then, replace };
using namespace ::MIPSComp;
namespace MIPSComp {
extern u32 compareVocpValues[4];
extern u32 compareVocpEnable;
};
namespace MIPSInt {
void Int_Vocp(MIPSOpcode op) {
float s[4], d[4];
int vd = _VD;
int vs = _VS;
VectorSize sz = GetVecSize(op);
ReadVector(s, sz, vs);
for (int i = 0; i < GetNumVectorElements(sz); i++) {
// Always positive NaN.
d[i] = my_isnan(s[i]) ? fabsf(s[i]) : 1.0f - s[i];
}
ApplyPrefixD(d, sz);
WriteVector(d, sz, vd);
PC += 4;
EatPrefixes();
if (MIPSComp::compareVocpEnable) {
u32 du[4] = {};
ReadVector(reinterpret_cast<float *>(du), sz, vd);
bool same = true;
for (int i = 0; i < GetNumVectorElements(sz); i++) {
same = same && du[i] == MIPSComp::compareVocpValues[i];
}
if (!same) {
NOTICE_LOG(CPU, "FOUND: Mismatched vocp result at %08x", PC);
}
}
} This would help us find the instruction. Could add logging there for the actual register values, then we could replicate and figure out whether we just have vocp prefix handling wrong or something else. -[Unknown] |
Sorry was trying to hack together quickly by typing - I've updated the code above, should compile now. Well, I'm after the VFPU regs for the misbehaving vocp instruction. That way I can fix interp, and potentially arm64jit if it has the same bug, etc. That address only affects That said, as I was typing this - I realized what I can't believe I didn't see until now... Instead of the above, try replacing: ReadVector(s, sz, vs);
for (int i = 0; i < GetNumVectorElements(sz); i++) { With: ReadVector(s, sz, vs);
ApplySwizzleS(s, sz);
for (int i = 0; i < GetNumVectorElements(sz); i++) { Then disable any hacks or cheats otherwise trying to make it work. Hopefully that makes the interp fallback behave correctly. -[Unknown] |
The uneaten prefix still logs, but the geometry problem seems to be fixed with that change:]. Yay! |
Also, guess that vsocp also applies prefixes. See hrydgard#5549.
Thanks - #11822. So are all Artdink games generally fixed by this? Might still be interesting as mentioned there to optimize the delay slot case (eating the target instruction if vfpu when the delay slot is a prefix op.) -[Unknown] |
I'm leaving to work in a bit so can't check all of them, but they all ran on same engine, used same function and suffered from exactly same issue so it's safe to assume they are all working now. I think there's only that mipmap weirdness(#6357) left for those games to be pretty much perfect. |
Does using this for void Jit::CompBranchExits(CCFlags cc, u32 targetAddr, u32 notTakenAddr, bool delaySlotIsNice, bool likely, bool andLink) {
if (andLink)
gpr.SetImm(MIPS_REG_RA, GetCompilerPC() + 8);
if (!likely && delaySlotIsNice && js.MayHavePrefix() && !js.HasUnknownPrefix()) {
auto addrReplacesPrefix = [&](Memory::Opcode op, MIPSInfo info) {
if (info & IS_VFPU) {
const char *targetName = MIPSGetName(op);
if (js.prefixS == 0xE4 && js.prefixT == 0xE4 && !strcmp(targetName, "vpfxd"))
return true;
if (js.prefixT == 0xE4 && js.prefixD == 0x00 && !strcmp(targetName, "vpfxs"))
return true;
if (js.prefixS == 0xE4 && js.prefixD == 0x00 && !strcmp(targetName, "vpfxt"))
return true;
}
return false;
};
// Check if the target would consume a prefix set in the delay slot.
auto targetOp = Memory::Read_Instruction(targetAddr);
auto targetInfo = MIPSGetInfo(targetOp);
bool targetReplacesPrefix = (targetInfo & OUT_EAT_PREFIX) || addrReplacesPrefix(targetOp, targetInfo);
auto notTakenOp = Memory::Read_Instruction(notTakenAddr);
auto notTakenInfo = MIPSGetInfo(notTakenOp);
bool notTakenReplacesPrefix = (notTakenInfo & OUT_EAT_PREFIX) || addrReplacesPrefix(notTakenOp, notTakenInfo);
if (targetReplacesPrefix && notTakenReplacesPrefix) {
Gen::FixupBranch ptr;
u32 prefixS = js.prefixS;
u32 prefixT = js.prefixT;
u32 prefixD = js.prefixD;
JitState::PrefixState prefixSFlag = js.prefixSFlag;
JitState::PrefixState prefixTFlag = js.prefixTFlag;
JitState::PrefixState prefixDFlag = js.prefixDFlag;
// Not using FlushAll() to avoid flushing the prefix.
gpr.Flush();
fpr.Flush();
ptr = J_CC(cc, true);
// Take the branch - including the vfpu op if it needs the prefix.
if (targetInfo & OUT_EAT_PREFIX) {
js.compilerPC += 4;
CheckJitBreakpoint(targetAddr, 0);
MIPSCompileOp(targetOp, this);
FlushAll();
CONDITIONAL_LOG_EXIT(targetAddr + 4);
WriteExit(targetAddr + 4, js.nextExit++);
} else {
CONDITIONAL_LOG_EXIT(targetAddr);
WriteExit(targetAddr, js.nextExit++);
}
// Not taken, but let's finish off the vfpu op and jump past it.
SetJumpTarget(ptr);
// Don't forget to restore the prefix state.
js.prefixS = prefixS;
js.prefixT = prefixT;
js.prefixD = prefixD;
js.prefixSFlag = prefixSFlag;
js.prefixTFlag = prefixTFlag;
js.prefixDFlag = prefixDFlag;
if (notTakenInfo & OUT_EAT_PREFIX) {
AddContinuedBlock(targetAddr);
js.compilerPC = targetAddr;
CheckJitBreakpoint(notTakenAddr, 0);
MIPSCompileOp(notTakenOp, this);
FlushAll();
CONDITIONAL_LOG_EXIT(notTakenAddr + 4);
WriteExit(notTakenAddr + 4, js.nextExit++);
} else {
CONDITIONAL_LOG_EXIT(notTakenAddr);
WriteExit(notTakenAddr, js.nextExit++);
}
js.compiling = false;
}
}
// We may want to try to continue along this branch a little while, to reduce reg flushing.
bool predictTakeBranch = PredictTakeBranch(targetAddr, likely); I haven't had a chance to test it (would need to setup a matching code block), but at least it will compile... heh, sorry. The idea is as above - if the target of the branch is a vfpu op, compile it as part of the branch. This does require both paths (take or don't take branch) consume the prefix, or else we'd still have an uneaten prefix. -[Unknown] |
Still shows:
|
This makes me suspect that some arm64 bugs might really be interpreter bugs - the x86 jit is more complete so if the arm64 falls back but the x86 doesnt... |
Anyway, I'm closing this - please reopen if testing shows that any more Artdink games have issues, and feel free to continue the above investigation in this thread. |
My tests are showing that vocp does NOT apply S prefixes, but instead applies the T prefix to the 1.0f. Do Artdink games still work if you use: void Int_Vocp(MIPSOpcode op)
{
float s[4], t[4], d[4];
int vd = _VD;
int vs = _VS;
VectorSize sz = GetVecSize(op);
ReadVector(s, sz, vs);
int n = GetNumVectorElements(sz);
for (int i = 0; i < n; ++i) {
t[i] = 1.0f;
}
ApplySwizzleT(t, sz);
for (int i = 0; i < n; i++) {
// Always positive NaN.
d[i] = my_isnan(s[i]) ? fabsf(s[i]) : t[i] - s[i];
}
ApplyPrefixD(d, sz);
WriteVector(d, sz, vd);
PC += 4;
EatPrefixes();
} And put -[Unknown] |
That's... wacky. I'd have expected that it might run as a regular subtraction where S would apply to the 1.0 and T to the input, if it's gonna behave in some strange way, but apparently not then.... |
That new code just restores the initial bug with oversized body parts. In fact it makes things worse, a workaround avoiding uneaten prefix no longer works with it. |
Actually, it seems to wire T regnum to one. The S prefix still applies, just forces/ignores negate. Here's more accurate behavior matching a better range of prefix tests: void Int_Vocp(MIPSOpcode op)
{
float s[4], t[4], d[4];
int vd = _VD;
int vs = _VS;
VectorSize sz = GetVecSize(op);
ReadVector(s, sz, vs);
// S prefix forces the negate flags.
u32 sprefix = currentMIPS->vfpuCtrl[VFPU_CTRL_SPREFIX];
ApplyPrefixST(s, sprefix | 0x000F0000, sz);
// T prefix forces constants on and regnum to 1.
// That means negate still works, and abs activates a different constant.
u32 tprefix = currentMIPS->vfpuCtrl[VFPU_CTRL_TPREFIX];
ApplyPrefixST(t, (tprefix & ~0x000000FF) | 0x00000055 | 0x0000F000, sz);
for (int i = 0; i < GetNumVectorElements(sz); i++) {
// Always positive NaN. Note that s is always negated from the registers.
d[i] = my_isnan(s[i]) ? fabsf(s[i]) : t[i] + s[i];
}
ApplyPrefixD(d, sz);
WriteVector(d, sz, vd);
PC += 4;
EatPrefixes();
} This actually sorta makes sense. It's basically a prefix hack that then hits regular vadd? Even the NAN behavior makes more sense. -[Unknown] |
This one does work(still with the DISABLE; at Comp_Vocp). |
See hrydgard#5549. Matches tests for various prefix settings.
With Unknown's latest investigations into the VFPU, this is now adequately explained and should be fixed. Closing again. |
There's a problem affecting potentially quite a few/all Artdink games as most of them(all?) seems to be about robots and run on same engine. Basically during some animations certain model parts seems to be disrupted/stretched and stand out in an ugly&weird way:
This is Macross Triangle Frontier with Variable Fighter 1D(VF-1D) as an example.
Gurlok in the forums mentioned it also affects Gundam Exia melee attack in Gundam Assault Survive and Claxus found out it also affects Battle Robot Damashii.
Worth noting that not all models/animations are affected by glitch from my observation in macross it seems that affected melee attacks are those witch longest reach, and transformation glitch affects mostly hand with gunpod like it would glitch only something getting too close/far from the camera or some central position, it could be accidential through.
Just to be sure I tried JIT on/off as well as software mode, it wasn't affecting the problem in any way. Also as far as I checked Macross games, the problem existed from the very first builds it could go in-game.
The text was updated successfully, but these errors were encountered: