Stencil test #58

Merged
merged 4 commits into from
Aug 3, 2024
4 changes: 4 additions & 0 deletions CMakeLists.txt
@@ -25,6 +25,8 @@ add_library(${TARGET} STATIC
src/call_lists.h
src/debug.c
src/debug.h
src/efb.c
src/efb.h
src/functions.c
src/gc_gl.c
src/image_DXT.c
@@ -34,6 +36,8 @@ add_library(${TARGET} STATIC
src/pixels.h
src/selection.c
src/state.h
src/stencil.c
src/stencil.h
src/texture.c
src/utils.h
)
56 changes: 56 additions & 0 deletions doc/src/opengx.tex
@@ -417,6 +417,62 @@ \subsubsection{Lit and textured render}
The HANDLE\_CALL\_LIST macro is called at the beginning of those GL functions that can be stored into a call list. It takes care of adding the operation to the active call list (if there's one) minimizing the visual impact of call lists on the code base.


\subsection {Stencil test}

The stencil test makes it possible to discard individual fragments depending on the outcome of a comparison between the value stored in the stencil buffer for that fragment and a reference value. The compared values can also be masked with a bitmask before the comparison, and the stencil buffer itself is generated by drawing primitives and performing logical and arithmetical operations on the drawn stencil pixels. The stencil test is typically used to render shadows and reflections. Unfortunately, neither the GameCube hardware nor the GX API provides any support for the stencil test, so we have to emulate it, partially in software and partially with an additional TEV stage.

\subsubsection {Discarding fragments}

Let's first see how opengx discards fragments which don't pass the stencil test; for the time being, let's assume that we have managed to build a stencil buffer (in opengx it can be 4 or 8 bits wide, with 4 being the default) and focus only on how to build a TEV stage which does the comparison and discards the fragment. We will be setting up the TEV stage to operate in \emph{compare mode}, where the inputs are combined according to this formula:

$$ output = d + ((a \; OP \; b) \; ? \; c : 0) $$

and $OP$ will be either “equal” (\lstinline{GX_TEV_COMP_A8_EQ}) or “greater than” (\lstinline{GX_TEV_COMP_A8_GT}). Our goal is to decide whether a fragment will be drawn or not, so we'll use the alpha channel as the output, set $d$ to zero and $c$ to the alpha from the previous stage: in this way, depending on the result of the comparison $a \; OP \; b$ (we'll see how to set $a$ and $b$ below), we can control whether to display the fragment with its original alpha or not display it at all.
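
The compare-mode formula can be modelled on the CPU to make the discard logic concrete. The following is only a sketch: the names prefixed with \lstinline{sketch_} and \lstinline{SKETCH_} are hypothetical stand-ins and not GX identifiers, and clamping the result to 8 bits is an assumption of this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two 8-bit alpha comparisons. */
typedef enum {
    SKETCH_COMP_EQ, /* a == b */
    SKETCH_COMP_GT  /* a > b */
} SketchCompareOp;

/* output = d + ((a OP b) ? c : 0); the clamp to 255 is an assumption. */
static uint8_t sketch_tev_compare(SketchCompareOp op, uint8_t a, uint8_t b,
                                  uint8_t c, uint8_t d)
{
    bool pass = (op == SKETCH_COMP_EQ) ? (a == b) : (a > b);
    unsigned int out = d + (pass ? c : 0u);
    return (uint8_t)(out > 255u ? 255u : out);
}

/* Alpha of a fragment after the stencil stage: d is zero and c is the
 * alpha coming from the previous stage, as described in the text. */
static uint8_t sketch_stencil_alpha(SketchCompareOp op, uint8_t a, uint8_t b,
                                    uint8_t prev_alpha)
{
    return sketch_tev_compare(op, a, b, prev_alpha, 0);
}
```

A fragment whose comparison fails ends up with alpha zero, which is what the alpha compare function can then reject.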

Note that we'll have to make the Z comparison happen after texturing rather than before it (by calling \lstinline{GX_SetZCompLoc(GX_DISABLE)}) and set the alpha compare function (\fname{GX\_SetAlphaCompare}) to reject all fragments whose alpha value is zero: this is important so that the discarded fragments won't update the Z-buffer.

The next problem we have to solve is setting up a texture coordinate generation that, once the stencil texture is loaded in our TEV stage, allows us to read its pixels using screen coordinates; in other words, for every fragment processed in this stage, the texel must coincide with the screen pixel. This can be achieved by setting up a texture coordinate generation matrix that multiplies the vertex \emph{position} and transforms it to the exact x and y coordinates that the vertex will occupy on the screen. Such a matrix can be built by concatenating the model-view matrix with the projection matrix, but we must take into account that the result transforms vertex coordinates into the \lstinline{[-1,1]x[-1,1]} range, whereas the TEV expects texture coordinates in the \lstinline{[0,1]x[0,1]} range, so we have to concatenate an additional matrix that scales the coordinates by half and translates them by half.
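
The remapping from the \lstinline{[-1,1]} range to the \lstinline{[0,1]} range can be illustrated with plain floating point math. This is a minimal sketch with hypothetical function names: it assumes a row-major 4x4 matrix that already combines model-view and projection, and it ignores any vertical flip between screen and texture space that a real implementation might need.

```c
/* Map a normalized device coordinate in [-1, 1] to a texture
 * coordinate in [0, 1]: scale by one half, then translate by one half. */
static float sketch_ndc_to_texcoord(float ndc)
{
    return ndc * 0.5f + 0.5f;
}

/* Transform the point (x, y, z, 1) by a row-major 4x4 matrix, apply the
 * perspective divide and remap x and y to texture coordinates (s, t). */
static void sketch_screen_texcoord(const float mvp[4][4],
                                   float x, float y, float z,
                                   float *s, float *t)
{
    float clip[4];
    for (int i = 0; i < 4; i++) {
        clip[i] = mvp[i][0] * x + mvp[i][1] * y + mvp[i][2] * z + mvp[i][3];
    }
    *s = sketch_ndc_to_texcoord(clip[0] / clip[3]);
    *t = sketch_ndc_to_texcoord(clip[1] / clip[3]);
}
```

With an identity matrix, the centre of the view volume maps to the centre of the texture and the corners map to the texture corners.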

Another issue is how to actually implement the comparison, since the OpenGL specification supports all kinds of arithmetical comparisons, whereas the TEV only supports comparing for equality (\lstinline{GX_TEV_COMP_A8_EQ}) and strict "greater than" (\lstinline{GX_TEV_COMP_A8_GT}); however, since we know that we are operating on integer values, most of these operations can be emulated by swapping the order of the operands in the TEV, or by altering the reference value by ±1, as shown in Figure~\ref{table:stencil1}.

\begin{figure}[ht]
\centering

\begin{tabular}{|l|l|l|}
\hline
{GL comparison} & {Formula} & {GX comparison} \\
\hline
GL\_EQUAL & $a = b$ & $a = b$ \\
\hline
GL\_GREATER & $a > b$ & $a > b$\\
\hline
GL\_LESS & $a < b$ & $b > a$\\
\hline
GL\_GEQUAL & $a \geq b$ & $a > b - 1$ \\
\hline
GL\_LEQUAL & $a \leq b$ & $b + 1 > a$ \\
\hline
\end{tabular}
\caption{Mapping OpenGL stencil comparisons into the TEV. $a$ is the value from the stencil texture, and $b$ is the reference value}
\label{table:stencil1}
\end {figure}
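
The mapping in the figure can be checked with a few lines of C. The sketch below uses hypothetical names and plain \lstinline{int} operands, so that $b - 1$ and $b + 1$ never overflow; an implementation working on 8-bit values would have to treat the boundary cases ($b = 0$ for \lstinline{GL_GEQUAL}, $b$ at its maximum for \lstinline{GL_LEQUAL}) separately.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the GL comparison functions in the figure. */
typedef enum {
    SKETCH_EQUAL, SKETCH_GREATER, SKETCH_LESS, SKETCH_GEQUAL, SKETCH_LEQUAL
} SketchGlFunc;

/* A comparison expressed only with "equal" or strict "greater than". */
typedef struct {
    bool use_gt; /* true: lhs > rhs, false: lhs == rhs */
    int lhs;
    int rhs;
} SketchTevCompare;

/* a is the value from the stencil texture, b is the reference value. */
static SketchTevCompare sketch_map_compare(SketchGlFunc func, int a, int b)
{
    switch (func) {
    case SKETCH_GREATER: return (SketchTevCompare){ true, a, b };
    case SKETCH_LESS:    return (SketchTevCompare){ true, b, a };
    case SKETCH_GEQUAL:  return (SketchTevCompare){ true, a, b - 1 };
    case SKETCH_LEQUAL:  return (SketchTevCompare){ true, b + 1, a };
    case SKETCH_EQUAL:
    default:             return (SketchTevCompare){ false, a, b };
    }
}

/* Evaluate the mapped comparison the way the two TEV modes would. */
static bool sketch_eval(SketchTevCompare c)
{
    return c.use_gt ? (c.lhs > c.rhs) : (c.lhs == c.rhs);
}
```

Evaluating the mapped comparison for every pair of small integers gives the same result as the original operator, which is the property the figure relies on.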

What is missing from that figure is the comparison for inequality (\lstinline{GL_NOTEQUAL}), which simply cannot be implemented using the comparisons provided by the TEV engine. For that reason, if the comparison mode is \lstinline{GL_NOTEQUAL}, opengx builds a special stencil texture in software, whose pixels are set to 1 if the stencil buffer value is different from the reference value and 0 otherwise, and then uses the \lstinline{GX_TEV_COMP_A8_GT} (greater than) comparison on this texture's pixels.

Note that while building the stencil texture in software sounds like a suboptimal decision (and indeed it is), opengx has to do it in any case, even for those comparisons from Figure~\ref{table:stencil1} that are supported by the TEV, in order to support the masking step described in the OpenGL specification: OpenGL does not compare the stencil buffer value directly with the reference value, but first combines both values with a programmer-supplied bitmask using a \emph{bitwise AND}. This means that before drawing a primitive on which the stencil test must run, opengx reads the stencil buffer and generates a stencil texture that can be used with the comparisons described above. Using the \fname{GX\_ReadBoundingBox} function allows us to minimize the work by updating only the area that changed.
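
The software step can be pictured as a simple loop over the stencil values. This is a sketch under simplifying assumptions: the function names are hypothetical, the buffers are flat arrays with one byte per pixel, and the tiled texel layout used by GX textures is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Masking step: every stencil value is AND'ed with the mask before it
 * is handed to the comparison stage. */
static void sketch_build_masked_texture(const uint8_t *stencil,
                                        uint8_t *texture, size_t count,
                                        uint8_t mask)
{
    for (size_t i = 0; i < count; i++) {
        texture[i] = stencil[i] & mask;
    }
}

/* GL_NOTEQUAL case: 1 where the masked stencil value differs from the
 * masked reference value, 0 otherwise, so that a "greater than zero"
 * comparison selects exactly the fragments that pass. */
static void sketch_build_notequal_texture(const uint8_t *stencil,
                                          uint8_t *texture, size_t count,
                                          uint8_t mask, uint8_t ref)
{
    uint8_t masked_ref = ref & mask;
    for (size_t i = 0; i < count; i++) {
        texture[i] = ((stencil[i] & mask) != masked_ref) ? 1 : 0;
    }
}
```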

\subsubsection {Drawing the stencil buffer}

When drawing to the stencil buffer, special care must be taken to avoid making this drawing visible, so before performing a stencil draw operation opengx saves the current contents of the EFB to a texture, then loads the previous contents of the stencil buffer into the EFB, performs the stencil rendering, saves the EFB to the stencil buffer and finally restores the previous contents of the EFB (the ones containing the visible graphics). This is clearly not optimal when drawing operations to the screen are interleaved with drawing operations to the stencil buffer (as is often the case), but unfortunately there's no way around this. We could still use the \fname{GX\_ReadBoundingBox} function to keep track of the areas that need saving and restoring, so there might be some room for optimization here.
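
The order of the save and restore steps can be simulated with three small arrays standing in for the EFB, the saved scene and the stencil buffer. Everything in this sketch is hypothetical (types, names and the tiny buffer size); it only illustrates the sequence described above, not the actual opengx code.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_PIXELS 4

typedef struct {
    uint8_t efb[SKETCH_PIXELS];        /* what is currently in the EFB */
    uint8_t scene_copy[SKETCH_PIXELS]; /* texture holding the saved scene */
    uint8_t stencil[SKETCH_PIXELS];    /* persistent stencil buffer */
} SketchBuffers;

typedef void (*SketchDrawFn)(uint8_t *efb, void *data);

static void sketch_stencil_draw(SketchBuffers *b, SketchDrawFn draw,
                                void *data)
{
    memcpy(b->scene_copy, b->efb, sizeof(b->efb)); /* 1. save the scene */
    memcpy(b->efb, b->stencil, sizeof(b->efb));    /* 2. load the stencil */
    draw(b->efb, data);                            /* 3. stencil rendering */
    memcpy(b->stencil, b->efb, sizeof(b->efb));    /* 4. save the stencil */
    memcpy(b->efb, b->scene_copy, sizeof(b->efb)); /* 5. restore the scene */
}

/* Example draw callback: writes the reference value to the first pixel. */
static void sketch_draw_first_pixel(uint8_t *efb, void *data)
{
    efb[0] = *(const uint8_t *)data;
}
```

After the call the visible contents are unchanged and only the stencil buffer carries the result of the draw.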

OpenGL specifies several logical and arithmetical operations that can be applied when rendering to the stencil buffer, the most commonly used by far being \lstinline{GL_KEEP} and \lstinline{GL_REPLACE}: the former is trivial to implement, since it just means that no drawing must happen on the stencil buffer; the latter replaces the stencil buffer's value with the reference value, which is not hard to realize either: we render the primitive with no lighting and no texturing, fixing the drawing color to the reference value. The \lstinline{GL_ZERO} operation is also easy to implement, being a special case of \lstinline{GL_REPLACE} where instead of the reference color we just draw solid black. Other operations are not currently implemented (see section~\ref{sec:stencillimitations} for details).
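
Expressed per pixel, the three supported operations reduce to a very small function. The enum and function names below are hypothetical and serve only to illustrate the semantics; opengx realizes them by drawing (or not drawing) with a fixed color rather than by calling code like this.

```c
#include <stdint.h>

/* Hypothetical stand-ins for GL_KEEP, GL_ZERO and GL_REPLACE. */
typedef enum {
    SKETCH_OP_KEEP,
    SKETCH_OP_ZERO,
    SKETCH_OP_REPLACE
} SketchStencilOp;

/* New stencil value for one pixel covered by the drawn primitive. */
static uint8_t sketch_apply_op(SketchStencilOp op, uint8_t stored,
                               uint8_t ref)
{
    switch (op) {
    case SKETCH_OP_ZERO:    return 0;      /* draw solid black */
    case SKETCH_OP_REPLACE: return ref;    /* draw the reference value */
    case SKETCH_OP_KEEP:
    default:                return stored; /* nothing is drawn */
    }
}
```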

\subsubsection {Limitations}
\label{sec:stencillimitations}

The stencil drawing operations \lstinline{GL_INCR}, \lstinline{GL_DECR}, \lstinline{GL_INCR_WRAP}, \lstinline{GL_DECR_WRAP} and \lstinline{GL_INVERT} are not implemented. It might be possible to implement at least some of them in hardware by setting an appropriate blending mode. Another possibility is to make these operations render not directly on the stencil buffer but on yet another offscreen buffer, and then process the results in software. Neither of these solutions has been implemented so far, given how rarely these operations are used (at least in public source code indexed by GitHub).
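
For reference, the per-pixel semantics that a software pass would have to reproduce are sketched below, following the OpenGL definitions (clamping for \lstinline{GL_INCR} and \lstinline{GL_DECR}, wrap-around for the \lstinline{_WRAP} variants, bitwise inversion for \lstinline{GL_INVERT}). The names are hypothetical and the \lstinline{bits} parameter is assumed to be 4 or 8, matching the stencil widths mentioned earlier; this is not code from opengx.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the unimplemented GL stencil operations. */
typedef enum {
    SKETCH_INCR, SKETCH_DECR, SKETCH_INCR_WRAP, SKETCH_DECR_WRAP,
    SKETCH_INVERT
} SketchArithOp;

/* value must fit in `bits` bits; bits is assumed to be 4 or 8. */
static uint8_t sketch_arith_op(SketchArithOp op, uint8_t value,
                               unsigned int bits)
{
    uint8_t max = (uint8_t)((1u << bits) - 1u);
    switch (op) {
    case SKETCH_INCR:      /* clamp at the maximum value */
        return (uint8_t)(value < max ? value + 1 : max);
    case SKETCH_DECR:      /* clamp at zero */
        return (uint8_t)(value > 0 ? value - 1 : 0);
    case SKETCH_INCR_WRAP: /* maximum wraps around to zero */
        return (uint8_t)((value + 1u) & max);
    case SKETCH_DECR_WRAP: /* zero wraps around to the maximum */
        return (uint8_t)((value - 1u) & max);
    case SKETCH_INVERT:    /* flip every bit of the stencil value */
        return (uint8_t)(~value & max);
    default:
        return value;
    }
}
```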


\subsection {Selection mode}

GL selection mode, also known as "picking mode", is typically used in applications to determine which objects are rendered at a certain screen position, in order to implement mouse interactions. When selection mode is active, drawing primitives do not result in any pixels (or even Z-pixels) being drawn, but instead produce a stack containing the names of the objects which would have been drawn.
31 changes: 25 additions & 6 deletions src/call_lists.c
@@ -32,6 +32,7 @@ POSSIBILITY OF SUCH DAMAGE.

#include "call_lists.h"
#include "debug.h"
#include "stencil.h"
#include "utils.h"

#include <GL/gl.h>
@@ -181,17 +182,35 @@ static Command *new_command(CommandBuffer **head)
}
}

static void run_command(Command *cmd)
static void flat_draw_list(void *cb_data)
{
struct GXDisplayList *gxlist = cb_data;

GX_CallDispList(gxlist->list, gxlist->size);
}

static void run_gx_list(struct GXDisplayList *gxlist)
{
struct client_state cs;

cs = glparamstate.cs;
glparamstate.cs = gxlist->cs;
_ogx_apply_state();
_ogx_setup_render_stages();
glparamstate.cs = cs;
GX_CallDispList(gxlist->list, gxlist->size);
glparamstate.draw_count++;

if (glparamstate.stencil.enabled) {
_ogx_stencil_draw(flat_draw_list, gxlist);
}
}

static void run_command(Command *cmd)
{
switch (cmd->type) {
case COMMAND_GXLIST:
cs = glparamstate.cs;
glparamstate.cs = cmd->c.gxlist.cs;
_ogx_apply_state();
glparamstate.cs = cs;
GX_CallDispList(cmd->c.gxlist.list, cmd->c.gxlist.size);
run_gx_list(&cmd->c.gxlist);
break;
case COMMAND_CALL_LIST:
glCallList(cmd->c.gllist);
1 change: 1 addition & 0 deletions src/debug.c
@@ -43,6 +43,7 @@ static const struct {
{ "call-lists", OGX_LOG_CALL_LISTS },
{ "lighting", OGX_LOG_LIGHTING },
{ "texture", OGX_LOG_TEXTURE },
{ "stencil", OGX_LOG_STENCIL },
{ NULL, 0 },
};

2 changes: 2 additions & 0 deletions src/debug.h
@@ -34,12 +34,14 @@ POSSIBILITY OF SUCH DAMAGE.

#include <ogc/system.h>
#include <errno.h>
#include <stdio.h>

typedef enum {
OGX_LOG_WARNING = 1 << 0,
OGX_LOG_CALL_LISTS = 1 << 1,
OGX_LOG_LIGHTING = 1 << 2,
OGX_LOG_TEXTURE = 1 << 3,
OGX_LOG_STENCIL = 1 << 4,
} OgxLogMask;

extern OgxLogMask _ogx_log_mask;
155 changes: 155 additions & 0 deletions src/efb.c
@@ -0,0 +1,155 @@
/*****************************************************************************
Copyright (c) 2011 David Guillen Fandos ([email protected])
Copyright (c) 2024 Alberto Mardegan ([email protected])
All rights reserved.

Attention! Contains pieces of code from others such as Mesa and GRRLib

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of copyright holders nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL COPYRIGHT HOLDERS OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
*****************************************************************************/

#include "efb.h"

#include "debug.h"
#include "state.h"
#include "utils.h"

#include <GL/gl.h>
#include <malloc.h>

OgxEfbContentType _ogx_efb_content_type = OGX_EFB_SCENE;

static GXTexObj s_efb_texture;
/* This is the ID of the drawing operation that was copied last (0 = none) */
static int s_draw_count_copied = 0;

void _ogx_efb_save(OgxEfbFlags flags)
{
/* TODO: support saving Z-buffer (code in selection.c) */

if (s_draw_count_copied == glparamstate.draw_count) {
printf("Not copying EFB\n");
/* We already copied this frame, nothing to do here */
return;
}

s_draw_count_copied = glparamstate.draw_count;

u16 width = glparamstate.viewport[2];
u16 height = glparamstate.viewport[3];
u16 oldwidth = GX_GetTexObjWidth(&s_efb_texture);
u16 oldheight = GX_GetTexObjHeight(&s_efb_texture);
uint8_t *texels = GX_GetTexObjData(&s_efb_texture);
if (texels) {
texels = MEM_PHYSICAL_TO_K0(texels);
}

if (width != oldwidth || height != oldheight) {
if (texels) {
free(texels);
}
u32 size = GX_GetTexBufferSize(width, height, GX_TF_RGBA8, 0, GX_FALSE);
texels = memalign(32, size);
DCInvalidateRange(texels, size);

GX_InitTexObj(&s_efb_texture, texels, width, height,
GX_TF_RGBA8, GX_CLAMP, GX_CLAMP, GX_FALSE);
GX_InitTexObjLOD(&s_efb_texture, GX_NEAR, GX_NEAR,
0.0f, 0.0f, 0.0f, 0, 0, GX_ANISO_1);
}

_ogx_efb_save_to_buffer(GX_TF_RGBA8, width, height, texels, flags);
}

void _ogx_efb_restore(OgxEfbFlags flags)
{
/* TODO: support restoring Z-buffer (code in selection.c) */

_ogx_efb_restore_texobj(&s_efb_texture);
}

void _ogx_efb_save_to_buffer(uint8_t format, uint16_t width, uint16_t height,
void *texels, OgxEfbFlags flags)
{
GX_SetCopyFilter(GX_FALSE, NULL, GX_FALSE, NULL);
GX_SetTexCopySrc(glparamstate.viewport[0],
glparamstate.viewport[1],
width,
height);
GX_SetTexCopyDst(width, height, format, GX_FALSE);
GX_CopyTex(texels, flags & OGX_EFB_CLEAR ? GX_TRUE : GX_FALSE);
/* TODO: check if all of these sync functions are needed */
GX_PixModeSync();
GX_SetDrawDone();
u32 size = GX_GetTexBufferSize(width, height, format, 0, GX_FALSE);
DCInvalidateRange(texels, size);
GX_WaitDrawDone();
}

void _ogx_efb_restore_texobj(GXTexObj *texobj)
{
_ogx_setup_2D_projection();
u16 width = GX_GetTexObjWidth(texobj);
u16 height = GX_GetTexObjHeight(texobj);
GX_LoadTexObj(texobj, GX_TEXMAP0);

GX_ClearVtxDesc();
GX_SetVtxDesc(GX_VA_POS, GX_DIRECT);
GX_SetVtxDesc(GX_VA_TEX0, GX_DIRECT);
GX_SetVtxAttrFmt(GX_VTXFMT0, GX_VA_POS, GX_POS_XY, GX_U16, 0);
GX_SetVtxAttrFmt(GX_VTXFMT0, GX_VA_TEX0, GX_TEX_ST, GX_U8, 0);
GX_SetTexCoordGen(GX_TEXCOORD0, GX_TG_MTX2x4, GX_TG_TEX0, GX_IDENTITY);
GX_SetNumTexGens(1);
GX_SetNumTevStages(1);
GX_SetNumChans(0);
GX_SetTevOp(GX_TEVSTAGE0, GX_REPLACE);
GX_SetTevOrder(GX_TEVSTAGE0, GX_TEXCOORD0, GX_TEXMAP0, GX_COLORNULL);

GX_SetCullMode(GX_CULL_NONE);
glparamstate.dirty.bits.dirty_cull = 1;

GX_SetZMode(GX_FALSE, GX_ALWAYS, GX_FALSE);
glparamstate.dirty.bits.dirty_z = 1;

GX_SetBlendMode(GX_BM_NONE, GX_BL_ZERO, GX_BL_ZERO, GX_LO_COPY);
glparamstate.dirty.bits.dirty_blend = 1;

GX_SetAlphaCompare(GX_ALWAYS, 0, GX_AOP_OR, GX_ALWAYS, 0);
glparamstate.dirty.bits.dirty_alphatest = 1;

GX_SetColorUpdate(GX_TRUE);
glparamstate.dirty.bits.dirty_color_update = 1;

GX_Begin(GX_QUADS, GX_VTXFMT0, 4);
GX_Position2u16(0, 0);
GX_TexCoord2u8(0, 0);
GX_Position2u16(0, height);
GX_TexCoord2u8(0, 1);
GX_Position2u16(width, height);
GX_TexCoord2u8(1, 1);
GX_Position2u16(width, 0);
GX_TexCoord2u8(1, 0);
GX_End();
}
61 changes: 61 additions & 0 deletions src/efb.h
@@ -0,0 +1,61 @@
/*****************************************************************************
Copyright (c) 2011 David Guillen Fandos ([email protected])
Copyright (c) 2024 Alberto Mardegan ([email protected])
All rights reserved.

Attention! Contains pieces of code from others such as Mesa and GRRLib

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of copyright holders nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL COPYRIGHT HOLDERS OR CONTRIBUTORS
BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
*****************************************************************************/

#ifndef OPENGX_EFB_H
#define OPENGX_EFB_H

#include <GL/gl.h>
#include <malloc.h>
#include <ogc/gx.h>

typedef enum {
OGX_EFB_NONE = 0,
OGX_EFB_CLEAR = 1 << 0,
OGX_EFB_COLOR = 1 << 1,
OGX_EFB_ZBUFFER = 1 << 2,
} OgxEfbFlags;

typedef enum {
OGX_EFB_SCENE = 1,
OGX_EFB_STENCIL,
} OgxEfbContentType;

extern OgxEfbContentType _ogx_efb_content_type;

void _ogx_efb_save(OgxEfbFlags flags);
void _ogx_efb_restore(OgxEfbFlags flags);

void _ogx_efb_save_to_buffer(uint8_t format, uint16_t width, uint16_t height,
void *texels, OgxEfbFlags flags);
void _ogx_efb_restore_texobj(GXTexObj *texobj);

#endif /* OPENGX_EFB_H */