Subpixel precision (GTE accuracy) #28
Comments
After removing the garbage, some games are not affected by this hack at all, such as the previously mentioned Tomb Raider II (SLUS-00437) and Silent Hill (SLUS-00707), i.e. they look exactly the same as without it. |
I believe Crash Team Racing and Wipeout are problematic too. |
Okay, finally got my debugger up and running. So the first thing I looked into was how the BIOS displayed the "PlayStation" logo at the start. I used the SCPH7003 BIOS (NA, version 3.0). The first GTE XY FIFO access is here:

=> 0x8004e7d0: swc2 $12,0(a3) /* $a3: 0x80086eb8 */
0x8004e7d4: swc2 $13,0(t0)
0x8004e7d8: swc2 $14,0(t1)
0x8004e7dc: swc2 $8,0(t2)
0x8004e7e0: nop
0x8004e7e4: c2 0x158002d
0x8004e7e8: lw t1,28(sp)
0x8004e7ec: mfc2 t0,$7
0x8004e7f0: mfc2 t0,$7
0x8004e7f4: nop

Then I expected the BIOS to use the DMA to upload the completed commands to the GPU, but that's not the case. Instead, this code uploads the data to the GPU:

=> 0x80050b38: lw t6,0(a0) /* $a0: 0x80086eb8 */
0x80050b3c: move v0,a1
0x80050b40: addiu a1,a1,-1
0x80050b44: addiu a0,a0,4
0x80050b48: bnez v0,0x80050b38
=> 0x80050b4c: sw t6,0(v1) /* $v1: 0x1f801810 (GPU GP0) */

So we can see that instead of using the DMA, the BIOS copies the commands from RAM to the GPU in software using regular load and store instructions. This is an interesting situation for subpixel precision, because in order to handle it we need to tie the enhanced precision vertex data to one of the CPU's general purpose registers.

Of course, the BIOS is not the most interesting test case for subpixel precision, and it's not really a big deal if it breaks for the PlayStation logo, but I wouldn't be surprised if some games did something similar. |
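To make the register-tracking idea above more concrete, here is a minimal sketch of how precise vertex data could follow a word from the GTE store, through RAM, into a general purpose register and finally out to GP0. This is not the code from the subpixel branch: `PreciseXy`, `Shadow` and all of their methods are hypothetical names made up for illustration, and the validation against the packed integer value is an assumption about how stale data could be rejected.

```rust
use std::collections::HashMap;

/// Sub-pixel screen position plus the packed integer word the game sees.
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PreciseXy {
    pub x: f32,
    pub y: f32,
    /// Native value: X in the low 16 bits, Y in the high 16 bits.
    pub packed: u32,
}

impl PreciseXy {
    pub fn new(x: f32, y: f32) -> PreciseXy {
        let xi = x as i16 as u16 as u32;
        let yi = y as i16 as u16 as u32;
        PreciseXy { x, y, packed: xi | (yi << 16) }
    }
}

/// Hypothetical shadow state: an optional precise value per CPU register
/// and per RAM word written by the GTE.
pub struct Shadow {
    regs: [Option<PreciseXy>; 32],
    ram: HashMap<u32, PreciseXy>,
}

impl Shadow {
    pub fn new() -> Shadow {
        Shadow { regs: [None; 32], ram: HashMap::new() }
    }

    /// `swc2`: the GTE stores a screen coordinate to RAM.
    pub fn gte_store(&mut self, addr: u32, v: PreciseXy) {
        self.ram.insert(addr, v);
    }

    /// `lw`: the shadow follows the word into a register, but only if RAM
    /// still holds the integer value that the GTE wrote.
    pub fn load_word(&mut self, reg: usize, addr: u32, value: u32) {
        self.regs[reg] = self.ram.get(&addr).copied().filter(|p| p.packed == value);
    }

    /// Any other write to `reg` (ALU result, etc.) drops its shadow.
    pub fn clobber(&mut self, reg: usize) {
        self.regs[reg] = None;
    }

    /// `sw` to GP0: hand the precise vertex to the renderer if the register
    /// still carries one that matches the word being written.
    pub fn store_to_gp0(&self, reg: usize, value: u32) -> Option<PreciseXy> {
        self.regs[reg].filter(|p| p.packed == value)
    }
}

/// Replays the BIOS pattern: GTE store, `lw t6`, `sw t6` to GP0.
pub fn demo() -> (Option<PreciseXy>, Option<PreciseXy>) {
    let mut s = Shadow::new();
    let v = PreciseXy::new(100.25, 50.75);
    s.gte_store(0x8008_6eb8, v);
    s.load_word(14, 0x8008_6eb8, v.packed); // $t6 is register 14
    let hit = s.store_to_gp0(14, v.packed);
    s.clobber(14);
    let miss = s.store_to_gp0(14, v.packed);
    (hit, miss)
}
```

In `demo()` the first GP0 write gets the precise coordinates back, while the second one, after the register has been overwritten, falls back to the plain integer path. The cost is visible too: every load, store and register write has to touch the shadow state, which is the overhead discussed in the following comments.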
Ehhh, can that situation be detected and logged? If there are only a few games doing that, no offense, but I'd rather have them broken, or at least have a fast and a slow path (for those games), than slow everything down significantly. I know it's a hack, but the feature itself is a hack. |
Yeah maybe, I'm going to test more games. I haven't really settled on a solution yet. I was just interested to test the BIOS because I noticed that my current implementation didn't work there and wanted to figure out why it didn't. |
Also maybe it could be made optional, the hack could have various levels of complexity which could be turned on and off depending on the game and the capabilities of the host computer. |
I managed to get it working with Crash Bandicoot but not Spyro for some reason. @i30817 Did you also try to get perspective-correct mapping working, or just subpixel precision? Since I'm using an OpenGL renderer I thought I might try to pass the z-coordinate along with the floating point coordinates, but that doesn't seem to work well so far. |
I don't know much about graphics programming so I can't answer that about the z-coordinate precision on the GTE. In general, I guess if you manage to surpass the other emulators at graphical enhancement of PS1 games it would be a powerful draw to users, but making the feature optional and with as many fast vs. slow paths as possible seems best for the final solution (simpler prototyping is OK). If you manage to detect when the simpler technique fails and replace it with the more complex one without false positives or missed events, that would be best (certainly better than per-game configs, which sound troublesome given the size of the PS1 library, and too coarse a measure, since surely most games that need the more complex technique might not need it everywhere?). |
I see. Currently I manage to run Crash at full speed with the expensive version of the hack but I'm almost maxing out my CPU. I think I'm going to try to get better compatibility with my emu before I continue with this hack, I can't really test all the games I want. |
Is it possible to make this emulator multithreaded? For example, CPU and GTE on one thread and SPU and MDEC on the other, or even CPU and GTE on different threads? Though it'd be better if the GTE could be emulated on a GPU. By the way, could you give an option to switch between these hacks? |
I implemented it in a way that would make it possible to make it an option (with no performance hit when the option is disabled) but I haven't actually implemented the option yet. The GPU could be multithreaded but I'm not sure if there's a point since it's already de-facto offloaded to the host GPU through OpenGL so it shouldn't take too much CPU time. For the rest it's more difficult, the GTE is so tightly coupled with the CPU that it's going to be hard to make it run in a separate thread. The MDEC is coupled with the CPU and the DMA which is itself coupled with RAM (and CPU) so it's going to be pretty difficult too, although probably less so than the GTE. The MDEC has pretty specific use cases (FMV, pre-rendered backgrounds...) and generally it runs during loading times or while video is being displayed and the rest of the system is pretty much idle (except for SPU and CD-ROM, probably) so I'm not sure you'd see any significant improvement by threading the MDEC. The SPU might be doable, I'm not sure at this point if it's worth it. |
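One way an option like this can cost nothing when disabled is to select the code path at compile time and branch only once at the top level. The sketch below uses a const generic for that. It is only an illustration under assumptions: `Vertex`, `project` and `project_dyn` are hypothetical names, the projection is simplified floating point rather than the GTE's fixed-point arithmetic, and nothing here is taken from the actual implementation.

```rust
/// Hypothetical result of a vertex projection.
#[derive(Debug, PartialEq)]
pub struct Vertex {
    /// Native integer screen coordinates, always produced.
    pub x: i16,
    pub y: i16,
    /// Sub-pixel coordinates, only produced when the hack is enabled.
    pub precise: Option<(f32, f32)>,
}

/// Simplified perspective projection. `SUBPIXEL` is a compile-time flag, so
/// the `false` instantiation contains no code for the precise path at all.
pub fn project<const SUBPIXEL: bool>(x: f32, y: f32, z: f32, h: f32) -> Vertex {
    let sx = x * h / z;
    let sy = y * h / z;
    Vertex {
        x: sx as i16,
        y: sy as i16,
        precise: if SUBPIXEL { Some((sx, sy)) } else { None },
    }
}

/// Runtime switch: the user-facing option is tested once here, and each
/// branch calls a separately compiled version of `project`.
pub fn project_dyn(subpixel: bool, x: f32, y: f32, z: f32, h: f32) -> Vertex {
    if subpixel {
        project::<true>(x, y, z, h)
    } else {
        project::<false>(x, y, z, h)
    }
}
```

For example, `project_dyn(true, 10.5, 21.0, 4.0, 2.0)` yields integer coordinates (5, 10) together with the precise pair (5.25, 10.5), while passing `false` yields the same integers and no precise data.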
Audio is rather power hungry in many emulated consoles. |
Yeah but in order for threading to give us performance we must offset the cost of the resynchronizations. If the average game tinkers with the SPU very frequently (reading registers, uploading audio, waiting for interrupts...) the thread might spend all of its time resync'ing which might well end up being slower than optimized single threaded code. There is no such thing as a free lunch. |
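The trade-off can be put into a rough back-of-the-envelope model. The sketch below is deliberately idealized and every number fed into it is hypothetical: it assumes the CPU and SPU work overlaps perfectly when threaded, and that each resynchronization costs a fixed amount of time.

```rust
/// Frame cost in milliseconds when CPU and SPU emulation share one thread.
pub fn single_threaded_ms(cpu_ms: f64, spu_ms: f64) -> f64 {
    cpu_ms + spu_ms
}

/// Frame cost when the SPU runs on its own thread: the two workloads overlap,
/// but every resynchronization adds a fixed stall (idealized assumption).
pub fn threaded_ms(cpu_ms: f64, spu_ms: f64, resyncs: u32, resync_cost_ms: f64) -> f64 {
    cpu_ms.max(spu_ms) + f64::from(resyncs) * resync_cost_ms
}

/// Threading only helps if the total resync cost is smaller than the work
/// that was moved off the main thread.
pub fn threading_pays_off(cpu_ms: f64, spu_ms: f64, resyncs: u32, resync_cost_ms: f64) -> bool {
    threaded_ms(cpu_ms, spu_ms, resyncs, resync_cost_ms) < single_threaded_ms(cpu_ms, spu_ms)
}
```

With made-up figures of 12 ms of CPU work and 3 ms of SPU work per frame, 100 resyncs at 0.01 ms each still leave the threaded version ahead, but 1000 resyncs at the same cost make it slower than the single-threaded one.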
I get about 25 FPS on the BIOS screen using the master branch, so rustation is very slow for me, but I have an old CPU (Q6600). |
@simias By the way, did you already implement perspective-correct texture mapping in that subpixel branch, since you mentioned it here:
|
@Tapcio ouch, that is pretty slow. I haven't really spent time optimizing yet, hopefully that will be improved in the future. It still runs decently well on my Core i5-2450M @ 2.5GHz. @ADormant I tried. In the subpixel branch I store the Z coordinate of the vertex alongside the precise X and Y values and I feed them to OpenGL. I don't really know if it works though, Crash Bandicoot doesn't have a lot of obvious texture warping going on. I'm going to get more games working and give it another try. |
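For reference, this is why keeping Z matters for texture warping. The sketch below compares plain (affine) interpolation of a texture coordinate in screen space with the standard perspective-correct form, which interpolates u/z and 1/z and divides afterwards. The function names are made up for illustration; OpenGL performs the equivalent division itself when the depth is passed through the `w` component of the vertex position, so this only shows the arithmetic and not the renderer code from the branch.

```rust
/// Affine interpolation: what the PlayStation GPU effectively does, since it
/// only sees 2D screen coordinates. `t` runs from 0.0 to 1.0 along the edge.
pub fn affine_u(u0: f32, u1: f32, t: f32) -> f32 {
    u0 + (u1 - u0) * t
}

/// Perspective-correct interpolation: interpolate u/z and 1/z linearly in
/// screen space, then divide to recover u.
pub fn perspective_u(u0: f32, z0: f32, u1: f32, z1: f32, t: f32) -> f32 {
    let inv_z = (1.0 - t) / z0 + t / z1;
    let u_over_z = (1.0 - t) * u0 / z0 + t * u1 / z1;
    u_over_z / inv_z
}
```

Halfway along an edge running from (u = 0, z = 1) to (u = 1, z = 3), the affine version gives u = 0.5 while the perspective-correct version gives about 0.25. When both endpoints share the same depth the two agree, which fits the observation that a game with mostly screen-aligned surfaces shows little obvious warping.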
What is this exactly? @Tapcio's code or yet another implementation? It looks similar to what we were trying to do here as far as I can tell at a glance from the code. |
@simias iCatButler implemented perspective-correct texture mapping. Trilinear and anisotropic filtering should be doable with it. |
@simias Regarding iCat's implementation, PGXP: it's still not perfect, and an even more advanced implementation may require Getting the remaining vertex data will either mean much more widespread mirroring of CPU operations or some form of mesh reconstruction that will make a best guess at the exact 3D position from the low precision coordinates. |
Dynarec for PGXP |
I'm developing a prototype in the subpixel branch. More details to come...