Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rs_free_function #1494

Merged
merged 4 commits into from
May 4, 2022
Merged

Conversation

steenax86
Copy link
Contributor

@steenax86 steenax86 commented Apr 21, 2022

Description

In order to move towards shader execution on devices, we need to replace host side interactions with new concepts that can exist on the device (or host).
The current virtual function based RendererServices exemplifies this dependency
on host side objects.

To break the dependency we introduce free function based renderer services (rs_free_function.h)
To simplify its introduction, we only added the renderer service free functions required by opmatrix.cpp.
Goal is to have a single opmatrix that can be compiled for host or device (eventually).
Renderers can provide llvm bitcode representation of their renderer service
free functions to the ShadingSystem. During shader compilation/JIT the
renderer's bitcode is combined with opmatrix.cpp bitcode along with the JIT'd shader network.
The idea is the combined LLVM IR representation can then be lowered to the target
(be it on the host or device[SPIRV|PTX]). Thus a renderer's free function implementation can be shared.

opmatrix.cpp now only calls the free functions (not virtual RendererServices).
However liboslexec on the host contains an rs_fallback.cpp which forwards
calls through the existing virtual RendererServices for seamless compatibility
when a renderer does not provide bitcode for the renderer services.

Testshade has been modified to optionally to provide rs_bitcode,
utilizing a new Cmake macro, EMBED_LLVM_BITCODE_IN_CPP, to turn rs_simplerend.cpp (the free function implementation)
into bitcode.

Free functions can be enabled via --use-rs-bitcode flag in testshade.

Free functions need access to RenderState values because they lack a this
pointer like the RendererServices virtual methods have. Added a RenderState
containing minimal set of values required by free functions.
Testshade/SimpleRenderer is responsible for setting RenderState values.

Added ShadingSystem attribute rs_bitcode to hold renderer services free function bitcode (provided by the renderer)

Renderers using global variables (like ustring's) not in strdecls.h must register their global variables' addresses
using OSL::register_JIT_Global (const char *global_var_name, void *global_var_addr).
This is necessary for the JIT process to resolve global variables that are embedded in RendererServices bitcode.
A validation step occurs that emits an error with a list of unmapped global variables.

RendererServices can reference strings with the STRING_PARAM(varname) macro.
By default, STRING_PARAM(varname) works for any string in strdecls.h.
For any string NOT in strdecls.h, it will map to a global variable with a prefix of "RS_".
For example, STRING_PARAM(raster) maps to a global ustring RS_raster.
NOTE: C linkage with a "RS_" prefix is used to allow for unmangled
non-colliding global symbol names, so its easier to pass them to
OSL::register_JIT_Global(name, addr) for host execution
NOTE: the STRING_PARAMS macro adapts to OSL_HOST_RS_BITCODE macro
to utilize the RS_ prefix. RS_ prefixed versions of all OSL::Strings
intances have been created by rs_free_function.h, so the same STRING_PARAMS
macro can be used for renderer service or OSL strings.
NOTE: We expect the handling of strings to change in the future possibly being replaced by Handles entirely

It is the renderer's responsibility to instantiate any NEW ustring global variables with the proper RS_ prefix;
It is the renderer's free function implementation to import/extern any of these NEW global variables with the proper RS_ prefix;
For testshade, a rs_strdecls.h has been added and is utilized to instance the RS_ global variables and register their addresses with OSL in simplerend.cpp.
rs_strdecls.h is also used by rs_simplerend.cpp (free function implementation) to create the externs with the RS_ prefix.

"NOTE: DO NOT instantiate global variables inside the free function renderer services, because we are not sure their constructors/destructors will get called, and are considering removing global constructors/destructors as part of the pruning process.

Added LLVM_Util::validate_global_mapping to identify unresolved globals.

Added utility macro __OSL_STRINGIFY(string_value) in oslversion.h.in so the
proper naming prefix is used and we don't have multiple implementations around.

Added liboslexec build step to produce bitcode containing OSL library functions that depend on free functions (here, opmatrix.cpp)

Added a new Cmake EMBED_LLVM_BITCODE_IN_CPP function in llvm_macros.cmake that replaces the LLVM_COMPILE macro
and provides for both linking and serializing multiple OSL cpp files.

Tests

Made testing.cmake recognize RS_BITCODE and RS_BITCODE_REGRESSION tag files to exercise unit and regression test suites.
Test suites exercised using bitcode RendererServices: matrix-arithmetic-reg, matrix-compref-reg, matrix-reg, matrix.

Checklist:

  • I have read the contribution guidelines.
  • I have previously submitted a Contributor License Agreement.
  • I have updated the documentation, if applicable.
  • I have ensured that the change is tested somewhere in the testsuite (adding new test cases if necessary).
  • My code follows the prevailing code style of this project.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Apr 21, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

@AlexMWells AlexMWells requested a review from lgritz April 21, 2022 20:44
@tgrant-nv tgrant-nv mentioned this pull request Apr 28, 2022
5 tasks
# https://github.com/AcademySoftwareFoundation/OpenShadingLanguage


function ( EMBED_LLVM_BITCODE_IN_CPP src_list suffix output_name list_to_append_cpp extra_clang_args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to unify this with MAKE_CUDA_BITCODE at some point, since they're largely identical. And to generalize to other targets like SPIR-V. But that can wait until we're compiling for other non-host targets.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, they are very similar, but our ability to test the CUDA path is limited and we don't want to break it by accident, or create merge issues for any changes in the MAKE_CUDA_BITCODE that might happen. Maybe an interim solution would be to move both macros to a common file, so at least there right next to each other for reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just fine for now. It'll be something to keep in mind when the time comes to add MAKE_SPIRV_BITCODE, or whatever the next target might be.

llvm::Module* rs_free_functions_module =
ll.module_from_bitcode (static_cast<const char*>(rs_free_function_bitcode.data()),
rs_free_function_bitcode.size(), "rs_free_functions");
if (err.length())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are checking for an error message, but not passing err to the preceding module_from_bitcode call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steenax86 we could add that check and error report to all module_from_bitcode calls (there are 4 others here) as a general improvement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's up to you, @steenax86, whether to fix that here or just incorporate into a subsequent set of changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgritz we added the error checks.

@tgrant-nv
Copy link
Contributor

This looks good to me at a high level. It's very similar to what we are doing for OptiX/CUDA for shadeops that need access to the renderer services, the texture system, or the shading system. Such functions are not included in the regular bitcode compilation pipeline with the other shadeops, and are instead defined in the renderer source and provided as bitcode through a shading system attribute. It will be great when this free function mechanism can handle both use cases.

Copy link
Collaborator

@lgritz lgritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is all fine as-is. We will continue to iterate on it as we add more free functions.

This doesn't need to be fixed in this PR, but as a note to us for the future: Because renderers will need to serialize and incorporate bitcode modules, it's not enough to have llvm_macros.cmake and serialize-bc.py merely in our build tree -- we'll need them to be part of the install step itself. (And probably, as already pointed out, to unify the llvm_macros and cuda_macros.)

@lgritz
Copy link
Collaborator

lgritz commented May 3, 2022

Are there any touch-ups you'd like to make, Steena, before the merge?

steenax86 added 4 commits May 3, 2022 15:53
host side interactions with new concepts that can exist on the device (or host).
The current virtual function based RendererServices exemplifies this dependency
on host side objects.

To break the dependency we introduce free function based renderer services (rs_free_function.h)
To simplify its introduction, we only added the renderer service free functions required by opmatrix.cpp.
Goal is to have a single opmatrix that can be compiled for host or device (eventually).
Renderers can provide llvm bitcode representation of their renderer service
free functions to the ShadingSystem.  During shader compilation/JIT the
renderer's bitcode is combined with opmatrix.cpp bitcode along with the JIT'd shader network.
The idea is the combined LLVM IR representation can then be lowered to the target
(be it on the host or device[SPIRV|PTX]).  Thus a renderer's free function implementation can be shared.

opmatrix.cpp now only calls the free functions (not virtual RendererServices).
However liboslexec on the host contains an rs_fallback.cpp which forwards
calls through the existing virtual RendererServices for seamless compatibility
when a renderer does not provide bitcode for the renderer services.

Testshade has been modified to optionally to provide rs_bitcode,
utilizing a new Cmake macro, EMBED_LLVM_BITCODE_IN_CPP, to turn rs_simplerend.cpp (the free function implementation)
into bitcode.

Free functions can be enabled via --use-rs-bitcode flag in testshade.

Free functions need access to RenderState values because they lack a this
pointer like the RendererServices virtual methods have. Added a RenderState
containing minimal set of values required by free functions.
Testshade/SimpleRenderer is responsible for setting RenderState values.

Added ShadingSystem attribute rs_bitcode to hold renderer services free function bitcode (provided by the renderer)

Renderers using global variables (like ustring's) not in strdecls.h must register their global variables' addresses
using OSL::register_JIT_Global (const char *global_var_name, void *global_var_addr).
This is necessary for the JIT process to resolve global variables that are embedded in RendererServices bitcode.
A validation step occurs that emits an error with a list of unmapped global variables.

RendererServices can reference strings with the STRING_PARAM(varname) macro.
By default, STRING_PARAM(varname) works for any string in strdecls.h.
For any string NOT in strdecls.h, it will map to a global variable with a prefix of "RS_".
For example, STRING_PARAM(raster) maps to a global ustring RS_raster.
NOTE:  C linkage with a "RS_" prefix is used to allow for unmangled
non-colliding global symbol names, so its easier to pass them to
OSL::register_JIT_Global(name, addr) for host execution
NOTE:  the STRING_PARAMS macro adapts to OSL_HOST_RS_BITCODE macro
to utilize the RS_ prefix.  RS_ prefixed versions of all OSL::Strings
intances have been created by rs_free_function.h, so the same STRING_PARAMS
macro can be used for renderer service or OSL strings.
NOTE:  We expect the handling of strings to change in the future possibly being replaced by Handles entirely

It is the renderer's responsibility to instantiate any NEW ustring global variables with the proper RS_ prefix;
It is the renderer's free function implementation to import/extern any of these NEW global variables with the proper RS_ prefix;
For testshade, a rs_strdecls.h has been added and is utilized to instance the RS_ global variables and register their addresses with OSL in simplerend.cpp.
rs_strdecls.h is also used by rs_simplerend.cpp (free function implementation) to create the externs with the RS_ prefix.

NOTE: DO NOT instantiate global variables inside the free function renderer services, because we strip global constructors as part of the pruning process.

Added LLVM_Util::validate_global_mapping to identify unresolved globals.

Added utility macro __OSL_STRINGIFY(string_value) in oslversion.h.in so the
proper naming prefix is used and we don't have multiple implementations around.

Added liboslexec build step to produce bitcode containing OSL library functions that depend on free functions (here, opmatrix.cpp)

Added a new Cmake EMBED_LLVM_BITCODE_IN_CPP function in llvm_macros.cmake that replaces the LLVM_COMPILE macro
and provides for both linking and serializing multiple OSL cpp files.

Signed-off-by: Steena Monteiro <[email protected]>
…scripts.

Fixed formatting issues.

Signed-off-by: Steena Monteiro <[email protected]>
… on MacOS Apple Clang.

Left as future work.

Signed-off-by: Steena Monteiro <[email protected]>
@steenax86 steenax86 force-pushed the rs_free_function branch from 95c7922 to c0ee2dd Compare May 3, 2022 23:00
@lgritz lgritz merged commit e5dd190 into AcademySoftwareFoundation:main May 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants