Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Artdink games geometry problem #5549

Closed
LunaMoo opened this issue Feb 23, 2014 · 46 comments
Closed

Artdink games geometry problem #5549

LunaMoo opened this issue Feb 23, 2014 · 46 comments
Milestone

Comments

@LunaMoo
Copy link
Collaborator

LunaMoo commented Feb 23, 2014

There's a problem affecting potentially quite a few/all Artdink games as most of them(all?) seems to be about robots and run on same engine. Basically during some animations certain model parts seems to be disrupted/stretched and stand out in an ugly&weird way:
artdink geometry fubar

This is Macross Triangle Frontier with Variable Fighter 1D(VF-1D) as an example.
Gurlok in the forums mentioned it also affects Gundam Exia melee attack in Gundam Assault Survive and Claxus found out it also affects Battle Robot Damashii.

Worth noting that not all models/animations are affected by glitch from my observation in macross it seems that affected melee attacks are those witch longest reach, and transformation glitch affects mostly hand with gunpod like it would glitch only something getting too close/far from the camera or some central position, it could be accidential through.
Just to be sure I tried JIT on/off as well as software mode, it wasn't affecting the problem in any way. Also as far as I checked Macross games, the problem existed from the very first builds it could go in-game.

@unknownbrackets
Copy link
Collaborator

Is this affected by the prescale or software skinning options?

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 23, 2014

Not sure what you ment by "prescale", texture/rendering scalling? None of it affects it, software skinning on/off doesn't either. I also tried other options including cpu settings just to be sure, nothing really seems to affect it neither to better or worser.

@dbz400
Copy link
Contributor

dbz400 commented Feb 23, 2014

Presscale which is the texture coord speedhack option

untitled

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 23, 2014

Good to know, but nah it doesn't affect it anyway, none of the optional settings do. The only thing which seems to affect this glitch somehow is distance to camera.

For example if I'm on the ground in the VF-1D and just tap movement this is enough to cause the glitch for the left hand, when I tap up or right, because the animation happens in some distance from the camera it get's glitched less, if I tap down or left, the left hand is the closest thing to camera and the glitch is way bigger than it should be by just distance:
left hand far
left hand close

@hrydgard
Copy link
Owner

Looks like bad transform matrices to me, either world or bone matrices. Could be either a CPU bug or maybe even a GPU timing/scheduling problem, where the game replaces matrices in memory at the wrong point in time or something.

@unknownbrackets
Copy link
Collaborator

Has this changed with the vrot change? If not, does multithreading affect it?

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Jun 16, 2014

No change at all, multithreading also does not affect it.

Edit: also what I didn't mention creating this issue, sometimes after special moves that instantly transform from mech to very fast flying fighter, the sky texture(or like 1 pixel from it) covers the screen in figher mode it also makes player loose all movement control until changing form(since it affects only fighter form) like it was zooming way out of the map, unfortunately it's random and sometimes hard to reproduce, have to use cheats, since otherwise special moves aren't really so common, looks like that:

cover

It covers whole screen with 1 color, like it was zooming in to that sky texture and concentrating on 1 pixel.

@unknownbrackets
Copy link
Collaborator

Does this game ever call sceGeGetMtx? Just wondering.

How terrible does this look in the software rendrerer?

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Jan 24, 2016

Never calls that, also apart of a few extra glitches looks same in software.

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 15, 2016

This might be a cpu core bug, happening both in JIT and interpreter wasn't exactly pointing to it, but IR interpreter makes a difference.
v1.2.2-465-gd4e45f4 as well as #8740 (since v1.2.2-469-g661360e just crashes) with IR interpreter activated seems to work fine without supersized arms, after a while it gets different issues like animations starts repeating or your robot starts moving very fast.

@unknownbrackets
Copy link
Collaborator

Hmm, definitely could be a vfpu bug then.

If you change in Core/MIPS/IR/IRCompVFPU.cpp:

// #define CONDITIONAL_DISABLE { Comp_Generic(op); return; }
#define CONDITIONAL_DISABLE ;

To:

#define CONDITIONAL_DISABLE { Comp_Generic(op); return; }
// #define CONDITIONAL_DISABLE ;

Does it go back to working the way it does in the interpreter? If yes, switch it back, and then above void IRFrontend::Comp_Vi2f(MIPSOpcode op) { add:

#undef CONDITIONAL_DISABLE
#define CONDITIONAL_DISABLE { Comp_Generic(op); return; }

And you can keep doing that to "bisect" the file. When it's still working, move it upward. When it stops working, go downward. Then when you think you've found the function that does it, remove the lines and just add DISABLE; at the start of the function.

If you can find a function where DISABLE; at the top causes the arms to get supersized, then it's almost definitely a vfpu bug, and we can probably fix it for jit/interp too.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 15, 2016

Thanks for the guide:

void IRFrontend::Comp_Mftv(MIPSOpcode op) {
        DISABLE;

restores the issue when using IR interpreter, more specifically I removed that disable again and tried both cases there and disabling just:

        case 7: // mtv
            DISABLE;

restores it. So that's I guess where the bug must be in JIT & old interpreter.

@unknownbrackets
Copy link
Collaborator

Hmm, interesting. It could be it's a prefix thing, then. What if you use if (imm < 128) { DISABLE; } there? Or if (imm == VFPU_CTRL_CC + 128) {? I'm wondering which value imm has (you can bisect this even.)

You can also log the values (but this is on compilation, so it may log only once and well before the issue) with NOTICE_LOG(JIT, "mtv rt=%d, %d", rt, imm);.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 15, 2016

Well it logs quite a bit, all with imm < 128, here's all log from [JIT] using IR interpreter:

54:27:753 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=5, 0
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 1
54:27:772 user_main    W[JIT]: IR\IRFrontend.cpp:109 An uneaten prefix at end of block
54:27:772 user_main    W[JIT]: ppsspp\Core/MIPS/JitCommon/JitState.h:169 S: 000000e4 flag: 1
54:27:772 user_main    W[JIT]: ppsspp\Core/MIPS/JitCommon/JitState.h:169 T: 000600e4 flag: 1
54:27:772 user_main    W[JIT]: ppsspp\Core/MIPS/JitCommon/JitState.h:178 D: 00000000 flag: 1
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 66
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=11, 0
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=12, 33
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 65
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=11, 2
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=12, 32
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 34
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=11, 64
54:27:772 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=12, 1
54:27:827 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 34
54:27:841 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 33
54:27:876 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 35
54:27:876 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=1, 64
54:27:888 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=1, 64
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 66
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=11, 0
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=12, 33
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 65
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=11, 2
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=12, 32
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 34
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=11, 64
54:28:211 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=12, 1
54:28:564 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 0
54:28:564 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=9, 32
54:28:564 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=10, 64
54:28:580 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 4
54:28:580 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 4
54:43:053 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 8
54:43:053 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=9, 40
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 8
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=9, 40
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 9
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=9, 41
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 10
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=9, 42
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 16
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 17
54:43:058 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 18
54:43:544 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=5, 2
54:43:778 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 35
54:43:778 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=1, 64
54:47:399 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=1, 64
54:52:649 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 33
54:53:576 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 33
54:53:576 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 1
54:53:576 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 33
54:53:576 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 33
54:56:287 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 34
54:58:814 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 8
55:04:486 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 10
55:04:486 user_main    N[JIT]: IR\IRCompVFPU.cpp:907 mtv rt=8, 73

@hrydgard
Copy link
Owner

I added logging of the current block when that happens, can you get the latest and run again?

@unknownbrackets
Copy link
Collaborator

Note: I think it only logs to Visual Studio when you F5 in the Output tab. Not in the debug console.

I guess this means we're handling a prefix more correctly in IR than in jit/interp? And so making it uneaten is breaking it?

Or maybe it's a false positive. Maybe because it's now avoiding IR this made us not notice the real problem in some other op...

Want to jump on #ppsspp if you have time?

-[Unknown]

@unknownbrackets
Copy link
Collaborator

I'm thinking it might be best to re-verify that it's this op by removing CONDITIONAL_DISABLE; from it, and then doing the bisect I described again. It's very possibly caused by a different op.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 15, 2016

Couldn't see anything new in the log, but after rebuilding again got:

43:13:344 user_main    W[JIT]: x86\Jit.cpp:317 An uneaten prefix at end of block: 0881aba4
43:13:344 user_main    W[JIT]: ppsspp\Core/MIPS/JitCommon/JitState.h:169 S: 000000e4 flag: 1
43:13:344 user_main    W[JIT]: ppsspp\Core/MIPS/JitCommon/JitState.h:169 T: 000600e4 flag: 1
43:13:344 user_main    W[JIT]: ppsspp\Core/MIPS/JitCommon/JitState.h:178 D: 00000000 flag: 1

at that address it looks like:
uneaten prefix
copy paste of whole block(well everything around using same color) - https://gist.github.com/LunaMoo/575deb1963fcfd99cd1a443f37a79619

Well I checked most functions first time and that was the only one which disabling restores the bug. Through as I initially mentioned, IR has other issues, animations look all correct for a while, but after moving for a bit, they kind of get doubled, or more exactly run at double speed as movement is also affected that way, maybe that bug somehow just masks the other one, but then it does work completely fine at times, especially if I just walk in one direction doing nothing after spawning on the map to make a screenshot:
works fine

@unknownbrackets
Copy link
Collaborator

Ha. That's an evil little prefix there. I guess we should inline that "if" somehow so we can maintain prefix info, maybe? I wonder how common that technique is.

Hmm. You know, the thing is, if there's an uneaten prefix, we rebuild the block with vfpu jit off, basically.

In Core/MIPS/x86/Jit.cpp, what if you just delete this part?

    // Drat.  The VFPU hit an uneaten prefix at the end of a block.
    if (js.startDefaultPrefix && js.MayHavePrefix()) {
        WARN_LOG(JIT, "An uneaten prefix at end of block: %08x", GetCompilerPC() - 4);
        js.LogPrefix();

        // Let's try that one more time.  We won't get back here because we toggled the value.
        js.startDefaultPrefix = false;
        cleanSlate = true;
    }

Or just delete the cleanSlate = true; part. IR currently has a bug that makes that part not work.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 16, 2016

Sorry got distracted ~ with this change IR interpreter has a lot of missing graphics, JIT however does appear to be fixed by it.
Edit: I first removed whole code and that worked, removing just cleanSlate = true; doesn't seem to affect JIT in any visible way through while still breaking IR interpreter.

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented May 16, 2016

Okay, what this means is that we probably have a bug in the interpreter only. Let me explain, because vfpu prefixes are tricky.

Let me start with the important bit: As far as jit is concerned, the "prefixes" are part of the instruction.

A prefix modifies how the instruction operates. Basically, I say "vfpxd [M, 0:1, 0:1, 0:1]" before a vfpu op, and now the vfpu op (only the next one) will: mask (not write) x, and clamp yzw each to [+0, 1].

I could set the prefix based on a GPR register, which could be read from a file. That would be very very evil, but I could do it. Handling this in the jit would make it even more complex than it already is... and it's rare. So we don't; if we find a prefix that we don't know the value of, we stop jitting vfpu ops entirely. We just give up, because the game is evil.

It turns out, this game is evil. At least by our current definition.

So whether you run jit or interpreter, the code in CompVFPU.cpp is not actually being used. It seems like the real problem is that we have a bug in interpreter which we don't have in jit.

When you call DISABLE; or use CONDITIONAL_DISABLE in mtv, if there's a "unused prefix", it assumes the game is evil (this is because, it could end up branching anywhere, and that code would be jitted for the wrong prefix value.) I think that's why putting it in mtv made it broken, because it stopped using the VFPU jit.

You should get better results if you do the bisect while you have the code I quoted above removed. Now it won't be impacted by unused prefixes. Otherwise, I can't really see what mtv could be doing wrong.

@hrydgard thinking about this, aside from fixing whatever interp bug, the best way to handle this behavior (if we want it to get the fast version of the jit) would be to detect when:

  1. The delay slot sets a prefix.
  2. The op after the delay slot (untaken) immediately overwrites that prefix.
  3. The destination of the branch is a vfpu op, or else N prefix ops followed by a vfpu op.

In that situation, we could compile the vfpu op (with prefixes) as the delay slot, and then branch to immediately after the vfpu op. This would keep us away from having any blocks starting with unknown prefixes.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 16, 2016

Not exactly sure what I should be looking for, with Core/MIPS/x86/Jit.cpp code removed:

  • JIT doesn't have any issue I can see, so kind of fixes the whole issue,

  • IR interpreter misses most of the graphics including robot models so can't really see how they would be changing.

    When bisecting IR as earlier with:

    #undef CONDITIONAL_DISABLE
    #define CONDITIONAL_DISABLE { Comp_Generic(op); return; }

the missing graphics get restored and I see everything in pretty much same order as without removing Core/MIPS/x86/Jit.cpp code and it still points to Comp_Mftv. However if I remove that undef/def and just put DISABLE; under this function, it goes back to missing graphics and so I can't see if things get affected or not as pretty much nothing displays
uljs00321_00004
.

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented May 16, 2016

Sorry, I think I added some confusion.

The code in Core/MIPS/x86/Jit.cpp should only do anything if you are using the jit. The IR doesn't use that file at all - or it shouldn't. It uses Core/MIPS/IR/IRJit.cpp.

Right now, IR and jit are completely disconnected. If you select jit, there's no IR. If you select IR, you're only using the IR interpreter. There's no IR jit yet.

I was thinking you could bisect jit (ignore IR, pretend it doesn't exist) in Core/MIPS/x86/CompVFPU.cpp. I'm wondering which interpreter op is wrong. Does changing CONDITIONAL_DISABLE at the top of Core/MIPS/x86/CompVFPU.cpp not affect how cpu = jit (not ir) works?

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 16, 2016

Ah ok then, yeah changing conditional disable at the top of x86/CompVFPU restores the bug again in JIT, will try to find exact function now.

Edit: Found it, putting disable under this:

void Jit::Comp_Vocp(MIPSOpcode op) {

restores the bug with JIT.

@hrydgard
Copy link
Owner

@unknownbrackets

 if we find a prefix that we don't know the value of, we stop jitting vfpu ops entirely.
 We just give up, because the game is evil

Just a reminder that we don't stop entirely, we just stop jitting the first VFPU instruction in each block - as soon as the flags are "eaten" we can trust it again.

But yes, this can surface bugs in the interpreter for sure.

@hrydgard
Copy link
Owner

@unknownbrackets yes, it's pretty clear that the game just forgot to fill the delay slot with something else, and looking a little ahead into other blocks we can eliminate it like you say.

@unknownbrackets
Copy link
Collaborator

Right, that's true, we'll jit after the first vfpu op in each block.

-[Unknown]

@unknownbrackets
Copy link
Collaborator

I wonder if this was somehow related to #6554. I know IR was different, though.

Still happening the same?

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented May 7, 2018

It's still the same, however I made cwcheat workarounds for all games that have such problem(all I know of at least) by simply writing the same problematic code triggering uneaten prefix in a more wasteful way to avoid prefix opcode in the delay slot. (Well to be exact to avoid prefix opcode in the delay slot, followed by another prefix opcode)

@hrydgard
Copy link
Owner

hrydgard commented May 7, 2018

Sounds to me like #6554 is very different, that one was about not advancing the vertex pointer when decoding bbox calls, while this is about VFPU prefixes in delay slots, according to the comment #5549 (comment)

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented Feb 19, 2019

So, this is either that vocp shouldn't apply prefixes on D (testable), mishandling of prefixes somehow from delay slots when bailing to interp (but probably not that), or interp not handling nan / some values properly.

It would be great to reproduce this issue in a small hardware test.

Starting from a clean build, you could replace Comp_Vocp() with something like this:

alignas(16) u32 compareVocpValues[4];
u32 compareVocpEnable;

void Jit::Comp_Vocp(MIPSOpcode op) {
	CONDITIONAL_DISABLE(VFPU_VEC);

	// Hack: forcing this always on makes the geometry right...
	//if (js.HasUnknownPrefix())
	//	DISABLE;

	VectorSize sz = GetVecSize(op);
	int n = GetNumVectorElements(sz);

	u8 sregs[4], dregs[4];
	GetVectorRegsPrefixS(sregs, sz, _VS);
	GetVectorRegsPrefixD(dregs, sz, _VD);

	// Flush SIMD.
	fpr.SimpleRegsV(sregs, sz, 0);
	fpr.SimpleRegsV(dregs, sz, MAP_NOINIT | MAP_DIRTY);

	X64Reg tempxregs[4];
	for (int i = 0; i < n; ++i) {
		// HACK: Don't actually write to D regs, write to some temp regs.
		int reg = fpr.GetTempV();
		fpr.MapRegV(reg, MAP_NOINIT | MAP_DIRTY);
		fpr.SpillLockV(reg);
		tempxregs[i] = fpr.VX(reg);
	}

	MOV(PTRBITS, R(TEMPREG), ImmPtr(&one));
	MOVSS(XMM1, MatR(TEMPREG));
	for (int i = 0; i < n; ++i) {
		MOVSS(XMM0, R(XMM1));
		SUBSS(XMM0, fpr.V(sregs[i]));
		MOVSS(tempxregs[i], R(XMM0));
	}

	for (int i = 0; i < n; ++i) {
		MOVSS(M(&compareVocpValues[i]), tempxregs[i]);
	}

	ApplyPrefixD(dregs, sz);

	fpr.ReleaseSpillLocks();

	MOV(32, M(&compareVocpEnable), Imm32(1));
	Comp_Generic(op);
	MOV(32, M(&compareVocpEnable), Imm32(0));
}

And then, replace Int_Vocp() with something like:

	};

	using namespace ::MIPSComp;
	namespace MIPSComp {
		extern u32 compareVocpValues[4];
		extern u32 compareVocpEnable;
	};
	namespace MIPSInt {

	void Int_Vocp(MIPSOpcode op) {
		float s[4], d[4];
		int vd = _VD;
		int vs = _VS;
		VectorSize sz = GetVecSize(op);
		ReadVector(s, sz, vs);
		for (int i = 0; i < GetNumVectorElements(sz); i++) {
			// Always positive NaN.
			d[i] = my_isnan(s[i]) ? fabsf(s[i]) : 1.0f - s[i];
		}
		ApplyPrefixD(d, sz);
		WriteVector(d, sz, vd);
		PC += 4;
		EatPrefixes();

		if (MIPSComp::compareVocpEnable) {
			u32 du[4] = {};
			ReadVector(reinterpret_cast<float *>(du), sz, vd);

			bool same = true;
			for (int i = 0; i < GetNumVectorElements(sz); i++) {
				same = same && du[i] == MIPSComp::compareVocpValues[i];
			}

			if (!same) {
				NOTICE_LOG(CPU, "FOUND: Mismatched vocp result at %08x", PC);
			}
		}
	}

This would help us find the instruction. Could add logging there for the actual register values, then we could replicate and figure out whether we just have vocp prefix handling wrong or something else.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 21, 2019

:3 some errors I might not figure out on my own:

Severity	Code	Description	Project	File	Line	Suppression State
Error (active)	E1866	attribute does not apply to any entity	Core	D:\source\ppsspp\Core\MIPS\MIPSIntVFPU.cpp	546	
Error	C2660	'Gen::XEmitter::MOV': function does not take 2 arguments	Core	d:\source\ppsspp\core\mips\x86\compvfpu.cpp	2068	
Error (active)	E0413	no suitable conversion function from "Gen::OpArg" to "int" exists	Core	D:\source\ppsspp\Core\MIPS\x86\CompVFPU.cpp	2066	
Error (active)	E0165	too few arguments in function call	Core	D:\source\ppsspp\Core\MIPS\x86\CompVFPU.cpp	2066	
Error (active)	E0413	no suitable conversion function from "Gen::OpArg" to "int" exists	Core	D:\source\ppsspp\Core\MIPS\x86\CompVFPU.cpp	2068	
Error (active)	E0165	too few arguments in function call	Core	D:\source\ppsspp\Core\MIPS\x86\CompVFPU.cpp	2068	
Error	C2660	'Gen::XEmitter::MOV': function does not take 2 arguments	Core	d:\source\ppsspp\core\mips\x86\compvfpu.cpp	2066	

Including also where the problem happens in the game code:
uneaten prefix

A workaround I made which solves the issue is to simply branch the code a bit differently to avoid the delay slot instruction on false, could also just nop it as this function runs only right at the start in all affected games and I recall it never get's this condition true.

@unknownbrackets
Copy link
Collaborator

Sorry was trying to hack together quickly by typing - I've updated the code above, should compile now.

Well, I'm after the VFPU regs for the misbehaving vocp instruction. That way I can fix interp, and potentially arm64jit if it has the same bug, etc. That address only affects startDefaultPrefix, which is causing interp vs x86jit to be used. It doesn't actually trigger any bug in interp directly.

That said, as I was typing this - I realized what I can't believe I didn't see until now...

Instead of the above, try replacing:

		ReadVector(s, sz, vs);
		for (int i = 0; i < GetNumVectorElements(sz); i++) {

With:

		ReadVector(s, sz, vs);
		ApplySwizzleS(s, sz);
		for (int i = 0; i < GetNumVectorElements(sz); i++) {

Then disable any hacks or cheats otherwise trying to make it work. Hopefully that makes the interp fallback behave correctly.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 22, 2019

The uneaten prefix still logs, but the geometry problem seems to be fixed with that change:]. Yay!

unknownbrackets added a commit to unknownbrackets/ppsspp that referenced this issue Feb 22, 2019
Also, guess that vsocp also applies prefixes.  See hrydgard#5549.
@unknownbrackets
Copy link
Collaborator

Thanks - #11822. So are all Artdink games generally fixed by this?

Might still be interesting as mentioned there to optimize the delay slot case (eating the target instruction if vfpu when the delay slot is a prefix op.)

-[Unknown]

@unknownbrackets unknownbrackets added this to the v1.8.0 milestone Feb 22, 2019
@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 22, 2019

I'm leaving to work in a bit so can't check all of them, but they all ran on same engine, used same function and suffered from exactly same issue so it's safe to assume they are all working now. I think there's only that mipmap weirdness(#6357) left for those games to be pretty much perfect.

@unknownbrackets
Copy link
Collaborator

Does using this for Jit::CompBranchExits prevent the uneaten prefixes (everything past the last line here stays the same):

void Jit::CompBranchExits(CCFlags cc, u32 targetAddr, u32 notTakenAddr, bool delaySlotIsNice, bool likely, bool andLink) {
	if (andLink)
		gpr.SetImm(MIPS_REG_RA, GetCompilerPC() + 8);

	if (!likely && delaySlotIsNice && js.MayHavePrefix() && !js.HasUnknownPrefix()) {
		auto addrReplacesPrefix = [&](Memory::Opcode op, MIPSInfo info) {
			if (info & IS_VFPU) {
				const char *targetName = MIPSGetName(op);
				if (js.prefixS == 0xE4 && js.prefixT == 0xE4 && !strcmp(targetName, "vpfxd"))
					return true;
				if (js.prefixT == 0xE4 && js.prefixD == 0x00 && !strcmp(targetName, "vpfxs"))
					return true;
				if (js.prefixS == 0xE4 && js.prefixD == 0x00 && !strcmp(targetName, "vpfxt"))
					return true;
			}
			return false;
		};

		// Check if the target would consume a prefix set in the delay slot.
		auto targetOp = Memory::Read_Instruction(targetAddr);
		auto targetInfo = MIPSGetInfo(targetOp);
		bool targetReplacesPrefix = (targetInfo & OUT_EAT_PREFIX) || addrReplacesPrefix(targetOp, targetInfo);

		auto notTakenOp = Memory::Read_Instruction(notTakenAddr);
		auto notTakenInfo = MIPSGetInfo(notTakenOp);
		bool notTakenReplacesPrefix = (notTakenInfo & OUT_EAT_PREFIX) || addrReplacesPrefix(notTakenOp, notTakenInfo);

		if (targetReplacesPrefix && notTakenReplacesPrefix) {
			Gen::FixupBranch ptr;
			u32 prefixS = js.prefixS;
			u32 prefixT = js.prefixT;
			u32 prefixD = js.prefixD;
			JitState::PrefixState prefixSFlag = js.prefixSFlag;
			JitState::PrefixState prefixTFlag = js.prefixTFlag;
			JitState::PrefixState prefixDFlag = js.prefixDFlag;

			// Not using FlushAll() to avoid flushing the prefix.
			gpr.Flush();
			fpr.Flush();
			ptr = J_CC(cc, true);

			// Take the branch - including the vfpu op if it needs the prefix.
			if (targetInfo & OUT_EAT_PREFIX) {
				js.compilerPC += 4;

				CheckJitBreakpoint(targetAddr, 0);
				MIPSCompileOp(targetOp, this);
				FlushAll();
				CONDITIONAL_LOG_EXIT(targetAddr + 4);
				WriteExit(targetAddr + 4, js.nextExit++);
			} else {
				CONDITIONAL_LOG_EXIT(targetAddr);
				WriteExit(targetAddr, js.nextExit++);
			}

			// Not taken, but let's finish off the vfpu op and jump past it.
			SetJumpTarget(ptr);
			// Don't forget to restore the prefix state.
			js.prefixS = prefixS;
			js.prefixT = prefixT;
			js.prefixD = prefixD;
			js.prefixSFlag = prefixSFlag;
			js.prefixTFlag = prefixTFlag;
			js.prefixDFlag = prefixDFlag;
			if (notTakenInfo & OUT_EAT_PREFIX) {
				AddContinuedBlock(targetAddr);
				js.compilerPC = targetAddr;

				CheckJitBreakpoint(notTakenAddr, 0);
				MIPSCompileOp(notTakenOp, this);
				FlushAll();
				CONDITIONAL_LOG_EXIT(notTakenAddr + 4);
				WriteExit(notTakenAddr + 4, js.nextExit++);
			} else {
				CONDITIONAL_LOG_EXIT(notTakenAddr);
				WriteExit(notTakenAddr, js.nextExit++);
			}

			js.compiling = false;
		}
	}

	// We may want to try to continue along this branch a little while, to reduce reg flushing.
	bool predictTakeBranch = PredictTakeBranch(targetAddr, likely);

I haven't had a chance to test it (would need to setup a matching code block), but at least it will compile... heh, sorry. The idea is as above - if the target of the branch is a vfpu op, compile it as part of the branch. This does require both paths (take or don't take branch) consume the prefix, or else we'd still have an uneaten prefix.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 22, 2019

Still shows:

19:33:968 user_main    W[JIT]: x86\jit.cpp:296 An uneaten prefix at end of block: 0881aba8
19:33:968 user_main    W[JIT]: jitcommon\jitstate.h:162 S: 000000e4 flag: 1
19:33:968 user_main    W[JIT]: jitcommon\jitstate.h:162 T: 000600e4 flag: 1
19:33:968 user_main    W[JIT]: jitcommon\jitstate.h:171 D: 00000000 flag: 1

@hrydgard
Copy link
Owner

This makes me suspect that some arm64 bugs might really be interpreter bugs - the x86 jit is more complete so if the arm64 falls back but the x86 doesnt...

@hrydgard
Copy link
Owner

Anyway, I'm closing this - please reopen if testing shows that any more Artdink games have issues, and feel free to continue the above investigation in this thread.

@unknownbrackets
Copy link
Collaborator

My tests are showing that vocp does NOT apply S prefixes, but instead applies the T prefix to the 1.0f.

Do Artdink games still work if you use:

	void Int_Vocp(MIPSOpcode op)
	{
		float s[4], t[4], d[4];
		int vd = _VD;
		int vs = _VS;
		VectorSize sz = GetVecSize(op);
		ReadVector(s, sz, vs);

		int n = GetNumVectorElements(sz);
		for (int i = 0; i < n; ++i) {
			t[i] = 1.0f;
		}
		ApplySwizzleT(t, sz);

		for (int i = 0; i < n; i++) {
			// Always positive NaN.
			d[i] = my_isnan(s[i]) ? fabsf(s[i]) : t[i] - s[i];
		}
		ApplyPrefixD(d, sz);
		WriteVector(d, sz, vd);
		PC += 4;
		EatPrefixes();
	}

And put DISABLE; at the top of Comp_Vocp? The above version correctly matches tests, strangely enough...

-[Unknown]

@hrydgard hrydgard reopened this Feb 23, 2019
@hrydgard
Copy link
Owner

That's... wacky. I'd have expected that it might run as a regular subtraction where S would apply to the 1.0 and T to the input, if it's gonna behave in some strange way, but apparently not then....

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 23, 2019

That new code just restores the initial bug with oversized body parts. In fact it makes things worse, a workaround avoiding uneaten prefix no longer works with it.

@unknownbrackets
Copy link
Collaborator

unknownbrackets commented Feb 23, 2019

Actually, it seems to wire T regnum to one. The S prefix still applies, just forces/ignores negate. Here's more accurate behavior matching a better range of prefix tests:

	void Int_Vocp(MIPSOpcode op)
	{
		float s[4], t[4], d[4];
		int vd = _VD;
		int vs = _VS;
		VectorSize sz = GetVecSize(op);
		ReadVector(s, sz, vs);

		// S prefix forces the negate flags.
		u32 sprefix = currentMIPS->vfpuCtrl[VFPU_CTRL_SPREFIX];
		ApplyPrefixST(s, sprefix | 0x000F0000, sz);

		// T prefix forces constants on and regnum to 1.
		// That means negate still works, and abs activates a different constant.
		u32 tprefix = currentMIPS->vfpuCtrl[VFPU_CTRL_TPREFIX];
		ApplyPrefixST(t, (tprefix & ~0x000000FF) | 0x00000055 | 0x0000F000, sz);

		for (int i = 0; i < GetNumVectorElements(sz); i++) {
			// Always positive NaN.  Note that s is always negated from the registers.
			d[i] = my_isnan(s[i]) ? fabsf(s[i]) : t[i] + s[i];
		}
		ApplyPrefixD(d, sz);
		WriteVector(d, sz, vd);
		PC += 4;
		EatPrefixes();
	}

This actually sorta makes sense. It's basically a prefix hack that then hits regular vadd? Even the NAN behavior makes more sense.

-[Unknown]

@LunaMoo
Copy link
Collaborator Author

LunaMoo commented Feb 23, 2019

This one does work(still with the DISABLE; at Comp_Vocp).

unknownbrackets added a commit to unknownbrackets/ppsspp that referenced this issue Feb 23, 2019
See hrydgard#5549.  Matches tests for various prefix settings.
@hrydgard
Copy link
Owner

With Unknown's latest investigations into the VFPU, this is now adequately explained and should be fixed. Closing again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants