RFC: Handle segment offset size for DisplayLists #137

Closed
Archez opened this issue Feb 27, 2024 · 1 comment · Fixed by #162


Archez commented Feb 27, 2024

Background

MM and OOT have assets that load a DL from a segment address provided by code. For OOT this is always a segment with a 0 offset. MM, however, is unique in that some assets also provide an offset value. This lets one DList be stored in a segment while assets "index" into specific parts of it through creative placement of gsSPEndDisplayList.

An example of this can be seen here:

static Gfx renderModeSetXluSingleCycleDL[] = {
    gsDPSetRenderMode(AA_EN | Z_CMP | IM_RD | CLR_ON_CVG | CVG_DST_WRAP | ZMODE_XLU | FORCE_BL |
                          GBL_c1(G_BL_CLR_IN, G_BL_0, G_BL_CLR_IN, G_BL_1),
                      G_RM_AA_ZB_XLU_SURF2),
    gsSPEndDisplayList(),
    // These instructions will never get executed
    gsDPSetRenderMode(AA_EN | Z_CMP | IM_RD | CLR_ON_CVG | CVG_DST_WRAP | ZMODE_XLU | FORCE_BL |
                          GBL_c1(G_BL_CLR_FOG, G_BL_A_SHADE, G_BL_CLR_IN, G_BL_1MA),
                      G_RM_AA_ZB_XLU_SURF2),
    gsSPEndDisplayList(),
};

This DList is 4 instructions long, but it is effectively used as two DLists because of the gsSPEndDisplayList in the middle. This DL is synced to segment 0x0C, and assets then control which "DL" they get by setting the segment offset: 0x0C000000 executes the first half, whereas 0x0C000010 executes the second half.

The Problem

This becomes a problem because we define Gfx words as uintptr_t instead of uint32_t. On 64bit machines, Gfx is twice the size it is on 32bit/N64 hardware, so segment offset values are invalid or index to the wrong location.

With the example above, 0x0C000010 has an offset of 0x10. Gfx has a size of 0x8 on 32bit and a size of 0x10 on 64bit. The original offset of 0x10 is meant to index 2 commands into the segment, but on a 64bit machine it translates into an index of only 1.
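To make the arithmetic concrete, here is a minimal sketch of the index calculation. The helper names and the two size constants are hypothetical stand-ins for illustration, not identifiers from 2ship or Fast3D; the only facts taken from the text above are the 0x8 and 0x10 command sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes quoted in the issue: one Gfx is 0x8 bytes on N64/32bit and
 * 0x10 bytes when its two words are uintptr_t on a 64bit host. */
enum { GFX_SIZE_32 = 0x8, GFX_SIZE_64 = 0x10 };

/* Hypothetical helpers: split a segment address into its two parts. */
static uint32_t seg_number(uint32_t segAddr) { return (segAddr >> 24) & 0x0F; }
static uint32_t seg_offset(uint32_t segAddr) { return segAddr & 0x00FFFFFF; }

/* Which command in the DList does the byte offset land on? */
static size_t gfx_index(uint32_t segAddr, size_t gfxSize) {
    assert(gfxSize != 0);
    return seg_offset(segAddr) / gfxSize;
}
```

Under these assumptions, gfx_index(0x0C000010, GFX_SIZE_32) gives 2 (the second gsDPSetRenderMode), while gfx_index(0x0C000010, GFX_SIZE_64) gives 1 (the first gsSPEndDisplayList).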


Possible solutions / Proposals

Option 1: Exporter Fix

We could update the exporter to adjust the segment offset for DList lookups based on the system performing the export. This would allow everything to work as expected without any changes in 2ship/Fast3D.

Example of proposed changes: Archez/OTRExporter@8ac66fe

Pros:

  • Keeps Fast3D and 2ship ignorant of the Gfx size difference

Cons:

  • Prevents exported OTRs from being portable to machines of the opposite architecture (OTRs generated on a 64bit machine will only work on a 64bit machine)

We would also probably need to track in the OTR what architecture was used to create it so we can warn/notify when it is used on an incorrect machine.
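As a rough illustration of what an export-time adjustment could look like (this is a sketch, not the code in the linked commit), the exporter would convert the original byte offset to a command index and scale it by the host's Gfx size. The function name and the N64_GFX_SIZE constant are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N64_GFX_SIZE 8u /* two 32-bit words per command on original hardware */

/* Hypothetical exporter helper: rescale a DList segment offset so that it
 * points at the same command when each Gfx occupies hostGfxSize bytes. */
static uint32_t rescale_dl_seg_addr(uint32_t segAddr, size_t hostGfxSize) {
    uint32_t segment = segAddr & 0xFF000000u;
    uint32_t offset = segAddr & 0x00FFFFFFu;
    assert(offset % N64_GFX_SIZE == 0); /* DList offsets are command-aligned */
    uint32_t index = offset / N64_GFX_SIZE;
    return segment | ((uint32_t)(index * hostGfxSize) & 0x00FFFFFFu);
}
```

On a 64bit exporter (hostGfxSize of 16) this turns 0x0C000010 into 0x0C000020, which is exactly why the resulting OTR would misbehave on a 32bit machine.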

Option 2: 2ship/Fast3D fix

We could change Fast3D to adjust the segment value for DList lookups at render time. This would require forcing the existing 64bit-modified segment values used by the master gfx DLs back to 32bit-sized offsets.

Example of proposed changes (also look at the LUS submodule change): Archez@6d479a8

Pros:

  • Preserves portability of generated OTRs

Cons:

  • Requires Fast3D to handle and be aware of the Gfx size difference
  • Requires 2ship and other ports using LUS be aware that DList segment values must be in the "32bit size"

Option 3: New custom opcode

We could add a new custom DL opcode for the exporter to use when it encounters DList segment addresses. This opcode can then signal Fast3D to perform the offset adjustment strictly for addresses coming from the OTR.

Example of proposed changes: Archez/libultraship@552e192 + Archez/OTRExporter@4c1a36b

Pros:

  • Preserves OTR portability
  • Keeps segment values coming from 2ship code at their true, unmodified offset values

Cons:

  • Yet another custom opcode to manage

Option 4: Same opcode with extra custom flag

Similar to option 3, but instead of adding a new opcode we can leverage unused space in the original G_DL opcode to set a flag (16 bits of free space through the l argument of gsDma1p). Fast3D can then use this flag to perform the offset adjustment. A new macro can be used to set the flag in the command.

Example of proposed changes: Archez/libultraship@057d374 + Archez/OTRExporter@4c1a36b

Pros:

  • All from option 3
  • No new opcodes

Cons:

  • Technically squeezes custom flags into an existing opcode, but should be safe assuming this bit range is truly unused on N64 hardware (it is at least unused in LUS)
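
A minimal sketch of how such a flag could be packed and tested, assuming the usual F3DEX2 layout in which gsDma1p places the opcode in bits 24-31, the push parameter in bits 16-23, and the l argument in bits 0-15 of the first word. The flag name and value are hypothetical, and the G_DL value shown is the stock F3DEX2 one, which may differ from what LUS defines.

```c
#include <assert.h>
#include <stdint.h>

#define G_DL 0xDE                     /* stock F3DEX2 value; assumed here */
#define G_DL_PUSH 0x00
#define DL_OFFSET_ADJUST_FLAG 0x0001u /* hypothetical flag in the unused l field */

/* Build the first word of a G_DL command with custom flags in bits 0-15. */
static uint32_t make_dl_w0(uint8_t pushMode, uint16_t flags) {
    return ((uint32_t)G_DL << 24) | ((uint32_t)pushMode << 16) | flags;
}

/* Renderer side: does this G_DL ask for the segment offset to be adjusted? */
static int dl_needs_offset_adjust(uint32_t w0) {
    return (w0 >> 24) == G_DL && (w0 & DL_OFFSET_ADJUST_FLAG) != 0;
}
```

Commands emitted by 2ship code through the normal macro would leave the field at 0, so only exporter-produced commands would take the adjusted path.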

Archez commented Mar 27, 2024

This has been addressed by louist103/libultraship#14 and louist103/OTRExporter#15. Option 3 was selected, with a tweak: the offset value is stored as an "index" value to simplify the math.
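
The index idea can be sketched as follows. This is an illustration of the concept only, not the code from the referenced pull requests; the Gfx stand-in and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the port's Gfx, whose words are pointer-sized. */
typedef struct {
    uintptr_t w0;
    uintptr_t w1;
} Gfx;

/* Exporter side: turn the original N64 byte offset into a command index
 * (each command is 8 bytes on N64), which is architecture-independent. */
static uint32_t dl_offset_to_index(uint32_t n64Offset) {
    assert(n64Offset % 8 == 0);
    return n64Offset / 8;
}

/* Renderer side: pointer arithmetic scales the index by sizeof(Gfx),
 * so the same index resolves correctly on 32bit and 64bit hosts. */
static const Gfx* resolve_dl(const Gfx* segmentBase, uint32_t index) {
    return segmentBase + index;
}
```

With the example DList above, the offset 0x10 becomes index 2, and resolve_dl lands on the third command regardless of how large Gfx is on the host.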

Archez closed this as completed Mar 27, 2024