Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Wrap complex global objects to control lifetime #2516

Merged
merged 26 commits into from
Oct 13, 2020

Conversation

alexbatashev
Copy link
Contributor

The order of global constructor/destructor invocation is undefined. At the same time, user may desire to make objects of SYCL classes global. Since there're lots of global objects inside SYCL runtime (scheduler and program manager, to name a few), they need to be wrapped with some trivial class/structure. The object of that structure is allocated on heap and deallocated in __attribute__((destructor))/DllMain, which guarantees, that SYCL global objects will be destroyed after user objects.

sycl/source/detail/platform_impl.hpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.hpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.hpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.hpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.hpp Outdated Show resolved Hide resolved
@vladimirlaz
Copy link
Contributor

Can we add some test which will check that no new "global" objects are added by a patch?
The following command shows more global objects (this is Rt without your patch so all objects you fixed are still in the list):

` # nm -a ../deploy/linux_prod/lib/libsycl.so | grep " b " | grep detail

00000000003957a8 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE1EE3getEvE6ValStr
00000000003954d8 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE2EE3getEvE6ValStr
0000000000395538 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE3EE3getEvE6ValStr
0000000000395000 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE7Backend
0000000000395810 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE6EE3getEvE6ValStr
0000000000395550 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE7EE3getEvE6ValStr
0000000000395578 b _ZGVZN2cl4sycl6detail14ProgramManager11getInstanceEvE8Instance
0000000000395568 b _ZGVZN2cl4sycl6detail14ProgramManager5buildESt10unique_ptrI11_pi_programPF10_pi_resultPS4_EESt10shared_ptrINS1_12context_implEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESK_RKP10_pi_deviceRSt3mapISt4pairINS1_12DeviceLibExtESM_ES6_St4lessISS_ESaISQ_IKSS_S6_EEEjE19EnableLevelZeroLink
00000000003956d0 b _ZGVZN2cl4sycl6detail7Command7enqueueERNS1_14EnqueueResultTENS1_9BlockingTEE12ThrowOnBlock
0000000000395520 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul
0000000000395510 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul
0000000000395500 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul
00000000003954f0 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEcE5__nul
0000000000394fe8 b _ZN2cl4sycl6detail12GPICallEventE
00000000003954c8 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE1EE14MValueFromFileE
00000000003954a0 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE1EE8MStorageE
0000000000395498 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE2EE14MValueFromFileE
0000000000395490 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE2EE8MStorageE
0000000000395488 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE3EE14MValueFromFileE
0000000000395080 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE3EE8MStorageE
0000000000395078 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE4EE14MValueFromFileE
0000000000395060 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE4EE8MStorageE
0000000000395058 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE5EE14MValueFromFileE
0000000000395040 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE5EE8MStorageE
0000000000395030 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE6EE14MValueFromFileE
0000000000395028 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE6EE8MStorageE
0000000000395020 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE7EE14MValueFromFileE
000000000039501c b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE7EE8MStorageE
0000000000394ff0 b _ZN2cl4sycl6detail15GSYCLGraphEventE
0000000000394fd0 b _ZN2cl4sycl6detail2pi12GlobalPluginE
0000000000394fe0 b _ZN2cl4sycl6detail2pi12XPTIInitDoneE
00000000003956e0 b _ZN2cl4sycl6detail9Scheduler8instanceE
00000000003957b0 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE1EE3getEvE6ValStr
00000000003954e0 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE2EE3getEvE6ValStr
0000000000395540 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE3EE3getEvE6ValStr
0000000000395010 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE10BackendPtr
0000000000395018 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE11Initialized
0000000000395008 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE7Backend
0000000000394ffc b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE5EE3getEvE11Initialized
0000000000394ff8 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE5EE3getEvE5Level
0000000000395818 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE6EE3getEvE6ValStr
0000000000395558 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE7EE3getEvE6ValStr
000000000039501b b _ZZN2cl4sycl6detail10readConfigEvE11Initialized
0000000000395580 b _ZZN2cl4sycl6detail14ProgramManager11getInstanceEvE8Instance
0000000000395570 b _ZZN2cl4sycl6detail14ProgramManager5buildESt10unique_ptrI11_pi_programPF10_pi_resultPS4_EESt10shared_ptrINS1_12context_implEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESK_RKP10_pi_deviceRSt3mapISt4pairINS1_12DeviceLibExtESM_ES6_St4lessISS_ESaISQ_IKSS_S6_EEEjE19EnableLevelZeroLink
0000000000394fc8 b _ZZN2cl4sycl6detail2pi10initializeEvE15PluginsInitDone
0000000000394fc0 b _ZZN2cl4sycl6detail2pi10initializeEvE7Plugins
0000000000394fb8 b _ZZN2cl4sycl6detail2pi9getPluginILNS0_7backendE1EEERKNS1_6pluginEvE6Plugin
0000000000394fb0 b _ZZN2cl4sycl6detail2pi9getPluginILNS0_7backendE2EEERKNS1_6pluginEvE6Plugin
00000000003957e0 b _ZZN2cl4sycl6detail4Sync11getInstanceEvE8Instance
00000000003956d8 b _ZZN2cl4sycl6detail7Command7enqueueERNS1_14EnqueueResultTENS1_9BlockingTEE12ThrowOnBlock
00000000003957b8 b _ZZN2cl4sycl6detail9Scheduler12GraphBuilder15printGraphAsDotEPKcE7Counter
0000000000395560 b _ZZNK2cl4sycl6detail14ProgramManager18getNextKernelSetIdEvE6Result
0000000000395528 b _ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul
0000000000395518 b _ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul
0000000000395508 b _ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul
`

@alexbatashev
Copy link
Contributor Author

Can we add some test which will check that no new "global" objects are added by a patch?
The following command shows more global objects (this is Rt without your patch so all objects you fixed are still in the list):

` # nm -a ../deploy/linux_prod/lib/libsycl.so | grep " b " | grep detail

00000000003957a8 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE1EE3getEvE6ValStr
00000000003954d8 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE2EE3getEvE6ValStr
0000000000395538 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE3EE3getEvE6ValStr
0000000000395000 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE7Backend
0000000000395810 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE6EE3getEvE6ValStr
0000000000395550 b _ZGVZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE7EE3getEvE6ValStr
0000000000395578 b _ZGVZN2cl4sycl6detail14ProgramManager11getInstanceEvE8Instance
0000000000395568 b _ZGVZN2cl4sycl6detail14ProgramManager5buildESt10unique_ptrI11_pi_programPF10_pi_resultPS4_EESt10shared_ptrINS1_12context_implEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESK_RKP10_pi_deviceRSt3mapISt4pairINS1_12DeviceLibExtESM_ES6_St4lessISS_ESaISQ_IKSS_S6_EEEjE19EnableLevelZeroLink
00000000003956d0 b _ZGVZN2cl4sycl6detail7Command7enqueueERNS1_14EnqueueResultTENS1_9BlockingTEE12ThrowOnBlock
0000000000395520 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul
0000000000395510 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul
0000000000395500 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul
00000000003954f0 b _ZGVZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb1EEclEcE5__nul
0000000000394fe8 b _ZN2cl4sycl6detail12GPICallEventE
00000000003954c8 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE1EE14MValueFromFileE
00000000003954a0 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE1EE8MStorageE
0000000000395498 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE2EE14MValueFromFileE
0000000000395490 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE2EE8MStorageE
0000000000395488 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE3EE14MValueFromFileE
0000000000395080 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE3EE8MStorageE
0000000000395078 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE4EE14MValueFromFileE
0000000000395060 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE4EE8MStorageE
0000000000395058 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE5EE14MValueFromFileE
0000000000395040 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE5EE8MStorageE
0000000000395030 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE6EE14MValueFromFileE
0000000000395028 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE6EE8MStorageE
0000000000395020 b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE7EE14MValueFromFileE
000000000039501c b _ZN2cl4sycl6detail14SYCLConfigBaseILNS1_8ConfigIDE7EE8MStorageE
0000000000394ff0 b _ZN2cl4sycl6detail15GSYCLGraphEventE
0000000000394fd0 b _ZN2cl4sycl6detail2pi12GlobalPluginE
0000000000394fe0 b _ZN2cl4sycl6detail2pi12XPTIInitDoneE
00000000003956e0 b _ZN2cl4sycl6detail9Scheduler8instanceE
00000000003957b0 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE1EE3getEvE6ValStr
00000000003954e0 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE2EE3getEvE6ValStr
0000000000395540 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE3EE3getEvE6ValStr
0000000000395010 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE10BackendPtr
0000000000395018 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE11Initialized
0000000000395008 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE4EE3getEvE7Backend
0000000000394ffc b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE5EE3getEvE11Initialized
0000000000394ff8 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE5EE3getEvE5Level
0000000000395818 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE6EE3getEvE6ValStr
0000000000395558 b _ZZN2cl4sycl6detail10SYCLConfigILNS1_8ConfigIDE7EE3getEvE6ValStr
000000000039501b b _ZZN2cl4sycl6detail10readConfigEvE11Initialized
0000000000395580 b _ZZN2cl4sycl6detail14ProgramManager11getInstanceEvE8Instance
0000000000395570 b _ZZN2cl4sycl6detail14ProgramManager5buildESt10unique_ptrI11_pi_programPF10_pi_resultPS4_EESt10shared_ptrINS1_12context_implEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESK_RKP10_pi_deviceRSt3mapISt4pairINS1_12DeviceLibExtESM_ES6_St4lessISS_ESaISQ_IKSS_S6_EEEjE19EnableLevelZeroLink
0000000000394fc8 b _ZZN2cl4sycl6detail2pi10initializeEvE15PluginsInitDone
0000000000394fc0 b _ZZN2cl4sycl6detail2pi10initializeEvE7Plugins
0000000000394fb8 b _ZZN2cl4sycl6detail2pi9getPluginILNS0_7backendE1EEERKNS1_6pluginEvE6Plugin
0000000000394fb0 b _ZZN2cl4sycl6detail2pi9getPluginILNS0_7backendE2EEERKNS1_6pluginEvE6Plugin
00000000003957e0 b _ZZN2cl4sycl6detail4Sync11getInstanceEvE8Instance
00000000003956d8 b _ZZN2cl4sycl6detail7Command7enqueueERNS1_14EnqueueResultTENS1_9BlockingTEE12ThrowOnBlock
00000000003957b8 b _ZZN2cl4sycl6detail9Scheduler12GraphBuilder15printGraphAsDotEPKcE7Counter
0000000000395560 b _ZZNK2cl4sycl6detail14ProgramManager18getNextKernelSetIdEvE6Result
0000000000395528 b _ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb0EEclEcE5__nul
0000000000395518 b _ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb0ELb1EEclEcE5__nul
0000000000395508 b _ZZNKSt8__detail11_AnyMatcherINSt7__cxx1112regex_traitsIcEELb0ELb1ELb0EEclEcE5__nul
`

I'm not sure how to implement such a check. Not all global objects cause problems, only those, that are not trivial. Having static bool Flag; somewhere in the codebase works fine. Problem with complex objects is that there's no defined order in which handles from __cxa_atexit must be called. So, if users have global objects of SYCL classes, they may accidentally access released resources. There's no such problem with trivial objects, because they don't own any resources, they're just data somewhere in memory.

@vladimirlaz
Copy link
Contributor

I'm not sure how to implement such a check. Not all global objects cause problems, only those, that are not trivial. Having static bool Flag; somewhere in the codebase works fine. Problem with complex objects is that there's no defined order in which handles from __cxa_atexit must be called. So, if users have global objects of SYCL classes, they may accidentally access released resources. There's no such problem with trivial objects, because they don't own any resources, they're just data somewhere in memory.

I filtered only objects in detail namespace. We can check for object types to be sycl classes (to ignore trivial objects).
In any case one time change will be not enough because regression can be introduced with next commit.


SpinLock MFieldsLock;

std::unique_ptr<Scheduler> MScheduler;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid truly knowing/enumerating all the objects handled in this wrapper, and just make it be a custom heap where arbitrary objects can be dealt with overloaded new/delete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smaslov-intel I was thinking about it. If we go with some custom memory space, we'd need some mechanism to call destructors on that memory. We'd either need to somehow store custom deleters and call them upon shutdown, or implement some other logic, where object is responsible for its destruction. Both look quite complicated solutions. Also, it's not clear how to access these objects. Use names and store pointers in a map? So, I decided to keep global wrapper with unique pointers for now. Adding a new global object shouldn't be that hard, and I don't expect us to add a new global object every other Tuesday as they're code smell anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a new global object shouldn't be that hard

But it is badly violating OOP encapsulation principle, I feel.

Also it makes it nearly impossible to use this global handler in the plugins (standalone parts of SYCL RT).
And then anything else we use for plugins would compete with 65535 destruction priority used for this data.

Having said that I admit this is definitely an improvement and we should go with it until something better is developed.

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good intent. But I think this deserves a design doc section - can you please create one?

sycl/include/CL/sycl/detail/spinlock.hpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.cpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.cpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.cpp Outdated Show resolved Hide resolved
sycl/source/detail/global_handler.cpp Outdated Show resolved Hide resolved
@keryell
Copy link
Contributor

keryell commented Sep 23, 2020

There is also https://en.cppreference.com/w/cpp/utility/program/atexit and https://en.cppreference.com/w/cpp/utility/program/at_quick_exit that might be useful.
Another solution is to use static allocation instead of heap allocation, but removing destructor call if this is the goal, for example by adapting idea from https://github.com/keryell/teaching_C-plus-plus/blob/master/non-initializing/remove_initialization.cpp

@alexbatashev
Copy link
Contributor Author

There is also https://en.cppreference.com/w/cpp/utility/program/atexit and https://en.cppreference.com/w/cpp/utility/program/at_quick_exit that might be useful.

Unfortunately, this does not guarantee, that our destructors will be run after user code. Consider the following (completely artificial) toy example:

// somewhere in coolsycllibrary.so:
void *createSyclDevice() { ... }
void freeSyclDevice(void*) { ... }

// User code:

struct Wrapper {
  void *device;
  ~Wrapper() {
    if (device) {
      freeSyclDevice(device);
    }
  }
};

Wrapper globalSyclDevice;

// before main is run, __cxa_atexit(~Wrapper()) is called.
int main() {
  void *handle = dlopen("coolsycllibrary.so"); // here libsycl.so global objects are initialized.
  // resolve symbols
  // Work with SYCL device.
  return 0;
}
// Destructors for all SYCL global objects are called.
// ~Wrapper() is called --> Oops.

Now, I can't imagine someone really does it in pure C++ (why, if you have an official C++ API), but I can imagine such solutions used for interoperability with other languages (like GPU-accelerated libraries for Python).

Another solution is to use static allocation instead of heap allocation, but removing destructor call if this is the goal, for example by adapting idea from https://github.com/keryell/teaching_C-plus-plus/blob/master/non-initializing/remove_initialization.cpp

That's a mind-blowing piece of C++ sorcery, I absolutely like it. But for the reasons I mentioned above we don't want to skip destruction of global objects. Such a library can be dlopened/dlclosed many times, resulting in significant memory leak.

@keryell
Copy link
Contributor

keryell commented Sep 24, 2020

Unfortunately, this does not guarantee, that our destructors will be run after user code. Consider the following (completely artificial) toy example:

Yes, this was just another suggestion for the solution landscape.
Actually I do not know how to solve this in a definitive way...

Otherwise, in a similar area, there is inside the SYCL committee this discussion: https://gitlab.khronos.org/sycl/Specification/-/issues/289
This would require some level of collaboration with the programmer. As usual, feedback is more than welcome... :-)

Alexander Batashev added 2 commits September 29, 2020 13:17
…_wrapper

* upstream/sycl: (1533 commits)
  [SYCL] XFAIL sub_group shuffle tests on GPU
  [SYCL] Add support for L0 loader validation layer (intel#2520)
  [NFC][LIT] Temporary disable function pointers as they hang on L0 (intel#2544)
  [SYCL] use release version of OpenCL ICD loader
  [SYCL] Improve testing of host-task (intel#2540)
  Revert 1291215
  [SYCL] Fix warning caused by [[nodiscard]] attribute (intel#2545)
  [SYCL] Workaround windows build failure
  [SYCL] Remove kernel_signature_start from int header (intel#2537)
  [SYCL] Fix ABI tests in post-commit (intel#2539)
  [SYCL][DOC] Update C-CXX-StandardLibrary doc to align with latest status (intel#2529)
  [SYCL][NFC] Fix static code analysis concerns (intel#2531)
  [SYCL][NFC] Improve testing for accessor_property_list (intel#2532)
  [SYCL] Avoid overuse of CPU on wait read-write lock loop (intel#2525)
  [SYCL] Implement no-decomposition for kernel types that don't need it. (intel#2477)
  [SYCL] Add group algorithm constraints (intel#2462)
  [BuildBot] Uplift Windows GPU RT from 8673 to 8778 (intel#2533)
  [SYCL][LIT][NFC] Extend ABI test suite (intel#2522)
  [SYCL][DebugInfo] Reinstate source locations for some kernel instructions (intel#2527)
  [SYCL][NFC] Replace the deprecated VectorType::getNumElements() (intel#2524)
  ...
@keryell
Copy link
Contributor

keryell commented Sep 29, 2020

But I think this deserves a design doc section - can you please create one?

At least some local comments are required.
Because there is no guarantee that a reader of the source will read in sync some other external documentation at the same time.

@gmlueck
Copy link
Contributor

gmlueck commented Oct 1, 2020

Note that the plugins may also have global objects that need to be deleted. An example of this is #2566. Maybe we need a new PI interface that ~GlobalHandler() calls?

@alexbatashev
Copy link
Contributor Author

Note that the plugins may also have global objects that need to be deleted. An example of this is #2566. Maybe we need a new PI interface that ~GlobalHandler() calls?

We've discussed it with @romanovvlad and decided, that it should be handled in a separate PR.

Alexander Batashev added 7 commits October 5, 2020 10:28
…_wrapper

* upstream/sycl: (1021 commits)
  [SYCL] Enable async_work_group_copy for scalar and vector bool types (intel#2582)
  [SYCL] Fix element type in handler::copy (intel#2590)
  [NFC][SYCL] Remove unnecessary if condition (intel#2585)
  [SYCL][NFC] Fix SYCL lit test execution on a system w/o GPU (intel#2584)
  [SYCL] Add error handling for non-uniform work group size case (intel#2569)
  [SYCL][ESIMD] Preserve undef initializer for globals in ESIMDLowerVecArg pass (intel#2555)
  [SYCL] Make Level-Zero events visible on the host (intel#2576)
  [Driver][SYCL][NFC] Add help information for -Wno-sycl-strict (intel#2570)
  [SYCL] Relax test to work in Win32 environment. (intel#2580)
  [SYCL] Emit suppressed warnings from SYCL headers (intel#2575)
  [SYCL][NFC] Cover more classes with ABI tests (intel#2577)
  [SYCL][ESIMD] Update ESIMD tests and add raw send support. (intel#2482)
  [SYCL] Make ESIMD on-device tests require linux,gpu,opencl. (intel#2560)
  [SYCL] Release commands with no dependencies after they're enqueued (intel#2492)
  [SYCL] Add multi-device and multi-platform support for SYCL_DEVICE_ALLOWLIST  (intel#2483)
  [SYCL] Try to enqueue host command depencies (intel#2561)
  [SYCL][ESIMD][NFC] Align namespace name with the spec guidelines (intel#2573)
  [SYCL][NFC] Add class layout ABI tests for memory objects (intel#2559)
  [SYCL] Change adress space for global variables (intel#2534)
  [NFC][SYCL] Fix comment. (intel#2541)
  ...
* sycl:
  [SYCL] Fix test failure due to size discrepancy in 32 and 64 bit environment (intel#2594)
  [BuildBot] Linux GPU driver uplift to 20.39.17972 (intel#2600)
  [SYCL][NFC] Remove cyclic dependency in headers (intel#2601)
  [SYCL][Doc] Fix Sphinx docs index (intel#2599)
  [SYCL][PI][L0] Add an assert to piEnqueueKernelLaunch() when the GlobalWorkOffset!=0 (intel#2593)
  [SYCL][Driver] Turn on -spirv-allow-extra-diexpressions option (intel#2578)
  [SYCL] Serialize queue related PI API in the Level-Zero plugin (intel#2588)
  Added missing math APIs for devicelib. (intel#2558)
  [SYCL][DOC] Fix path to FPGA device selector (intel#2563)
  [SYCL][CUDA] Add basic sub-group functionality (intel#2587)
  [SYCL] Add specialization constant feature design doc. (intel#2572)
  [SYCL] Expand release lit test to CPU and ACC (intel#2589)
  [SYCL] Add clang support for FPGA kernel attribute scheduler_target_fmax_mhz (intel#2511)
  [SYCL][ESIMD] Fixed compiler crash in LowerESIMDVecArg pass (intel#2556)
@alexbatashev
Copy link
Contributor Author

@kbobrovs @smaslov-intel ping

// time when "Plugins" could be deleted.
Plugins = new vector_class<plugin>;
initializePlugins(Plugins);
initializePlugins(&GlobalHandler::instance().getPlugins());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why repeat calls here instead of remembering the return value?

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the concise description in global_handler.hpp

@romanovvlad
Copy link
Contributor

@vladimirlaz Could you please approve as well?

@romanovvlad romanovvlad merged commit bb4ce61 into intel:sycl Oct 13, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Oct 14, 2020
* sycl: (566 commits)
  [SYCL] Fix explicit copy operation for host device (intel#2627)
  [SYCL] Fix initialization issue on Windows (intel#2632)
  [SYCL][CUDA] Disable image_write test on CUDA devices (intel#2630)
  [SYCL] Removes any knowledge of specific memory advices from PI API. (intel#2607)
  [BuildBot] Uplift GPU RT version for Linux to 20.40.18075 (intel#2626)
  [SYCL] Wrap complex global objects to control lifetime (intel#2516)
  [SYCL][CUDA] Image Basic Test (intel#1970)
  [SYCL] Align get_info<info::device::version>() with the SYCL spec (intel#2507)
  [Driver][SYCL] Correct optimization disabling option for gen (intel#2622)
  [SYCL][LIT] Add deleter func for test in buffer.cpp to avoid potential SegFault (intel#2616)
  [SYCL] Remove half type alias causing name conflicts (intel#2624)
  [BuildBot] OpenCL CPU/FPGAEMU driver uplift (intel#2620)
  [SYCL][Doc] Add overview of kernel-program caching (intel#2514)
  [SYCL] Remove two-input sub-group shuffles (intel#2614)
  [SYCL] Add support for new spelling of FPGA kernel attribute scheduler_target_fmax_mhz (intel#2618)
  [SYCL] Align image class constructors with the SYCL spec (intel#2603)
  [SYCL] Remove tests migrated to llvm-test-suite (intel#2611)
  [SYCL][NFC] Fix dependency for SYCL add_sycl_executable macro (intel#2613)
  [SYCL][PI][L0] Update environment variables from LEVEL0 to LEVEL_ZERO (intel#2612)
  [SYCL] Add KernelNameTypeVisitor validation check (intel#2596)
  ...
@alexbatashev alexbatashev deleted the global_constructors_wrapper branch November 13, 2020 06:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants