Implement the NVTX3_FUNC_RANGE_IF_IN and NVTX3_FUNC_RANGE_IF macros #28
Comments
I have some code written to implement those macros. Just need to test it before I submit it for review. |
The other thing this would require is for the default ctor of
|
Hm, on second thought, my IILE idea won't work because it still requires the object to be movable... Can we just use |
If we don't touch
I would rather not for three reasons:
|
Adding a whole additional type for this one use case is kind of a lot of code. |
We can create a base |
I still don't love adding a new range type if we can avoid it. Here's how we can do it just by adding a tag type to pass to |
Also, I'm wondering about if There are no doubt numerous variants of this macro some people would find useful, e.g., variants that allow specifying parts of the Also the naming of |
Add the `NVTX3_FUNC_RANGE_IF` and `NVTX3_FUNC_RANGE_IN_IF` macros which are similar to the `NVTX3_FUNC_RANGE` and `NVTX3_FUNC_RANGE_IN` macros except that they only generate a range if the boolean expression passed as parameter evaluates to `true`. Closes NVIDIA#28
The NVTX++ interface currently has the `NVTX3_FUNC_RANGE` and `NVTX3_FUNC_RANGE_IN` macros, which generate an NVTX range from the lifetime of the block.

There are scenarios where we only want to generate NVTX annotations conditionally. For example, some developers might want to annotate their libraries but keep some kind of verbosity control. In that case, they need to control dynamically whether an annotation is emitted.
One possible implementation would be to add a class similar to `domain_thread_range` that enables the move constructor. The macro implementation would then be the following:

If the user wants the condition to be evaluated only once for the whole duration of the program, they can cache the result in a static variable.
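To make the proposal concrete, here is a rough sketch of what a movable range and a conditional macro built on it could look like. This is not the code from the issue or from NVTX itself: `movable_thread_range`, `SKETCH_FUNC_RANGE_IF`, `verbosity_enabled` and the `mock_nvtx` push/pop functions are hypothetical names, and the mock only counts nesting depth so the example builds without the NVTX headers.

```cpp
#include <cassert>

// Stand-ins for the NVTX push/pop calls so the sketch builds without the
// NVTX headers. They only track nesting depth; they are not NVTX functions.
namespace mock_nvtx {
inline int& depth() { static int d = 0; return d; }
inline void range_push(char const* /*name*/) { ++depth(); }
inline void range_pop() { assert(depth() > 0); --depth(); }
}  // namespace mock_nvtx

// Hypothetical movable range. A default-constructed or moved-from object is
// inactive and pops nothing in its destructor.
class movable_thread_range {
 public:
  movable_thread_range() noexcept : active_(false) {}
  explicit movable_thread_range(char const* name) : active_(true) {
    mock_nvtx::range_push(name);
  }
  movable_thread_range(movable_thread_range&& other) noexcept
      : active_(other.active_) {
    other.active_ = false;  // the moved-from object must not pop
  }
  movable_thread_range(movable_thread_range const&) = delete;
  movable_thread_range& operator=(movable_thread_range const&) = delete;
  movable_thread_range& operator=(movable_thread_range&&) = delete;
  ~movable_thread_range() {
    if (active_) { mock_nvtx::range_pop(); }
  }

 private:
  bool active_;
};

// Hypothetical macro name. The range is only pushed when `cond` is true.
// Before C++17 the initialization relies on the move constructor.
#define SKETCH_FUNC_RANGE_IF(cond)                 \
  movable_thread_range const sketch_func_range__ = \
      (cond) ? movable_thread_range(__func__) : movable_thread_range()

// Counts how often the (possibly expensive) verbosity check runs.
inline int& check_count() { static int n = 0; return n; }
inline bool verbosity_enabled() { ++check_count(); return true; }

// Returns the nesting depth observed inside the annotated function body.
inline int depth_inside(bool enabled) {
  SKETCH_FUNC_RANGE_IF(enabled);
  return mock_nvtx::depth();
}

// Caches the condition in a static so it is evaluated on the first call only.
inline int depth_inside_cached() {
  static bool const enabled = verbosity_enabled();
  SKETCH_FUNC_RANGE_IF(enabled);
  return mock_nvtx::depth();
}
```

In this sketch `depth_inside(true)` sees one open range while `depth_inside(false)` sees none, and repeated calls to `depth_inside_cached()` run the verbosity check only once.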
The downside of making a thread range movable is that it allows misuse of the API. For example, a user might create such a thread range and move it into a functor that is executed on another thread. If this is problematic, the class could be placed in the `detail` namespace and documented to warn users about those invalid cases.