Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Always let the backend choose the binary #1587

Merged
merged 7 commits into from
May 12, 2020

Conversation

hiaselhans
Copy link
Contributor

@hiaselhans hiaselhans commented Apr 25, 2020

There is still a little issue that in nvptx binaries entries are not distributed.
When compiling with -fsycl-targets=nvptx64-nvidia-opencl-sycldevice,spir64-unknown-opencl-sycldevice programm_manager is only providing the spir64 binary which has the proper kernel name in entries. that's why the length of binaries is 1 and sycl directly continues to execute with the wrong binary and the error thrown at a later point is misleading:

terminate called after throwing an instance of 'cl::sycl::feature_not_supported'
  what():  Online compilation is not supported in this context -59 (CL_INVALID_OPERATION)

The overhead by always having the backend check for valid binary is not too big in my eyes and the error code becomes that:

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  OpenCL API failed. OpenCL API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)

While the issue with entry names remains (i will look into it but not sure where to start), i guess this also happened in a case where binaries are completely missing for the selected device backend.

When there is only one binary available the backend should still choose
the binary to avoid misleading cl_error_codes

Signed-off-by: hiaselhans <[email protected]>
@hiaselhans
Copy link
Contributor Author

see #1588 for the related binary-entries-table issue

@Ruyk
Copy link
Contributor

Ruyk commented Apr 27, 2020

Just to understand this, the intention of the patch is to provide a better error message? or are you going to add more later on?

@Ruyk Ruyk self-requested a review April 27, 2020 09:37
@hiaselhans
Copy link
Contributor Author

the only intention of this PR is to prevent sending wrong binaries to the backend.

the conditional exception allowed to enter a path that should not be allowed (continuing with a non-matching backend/binary combination).

kbobrovs
kbobrovs previously approved these changes Apr 28, 2020
Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kbobrovs
Copy link
Contributor

i will look into it but not sure where to start

sycl-post-link tool generates the symbol table - use -symbols switch.
As PTX does not support spec constants, post-link should generate 2-column table - first column lists binaries, second - corresponding symbol files. See Driver.cpp/appendLinkDependences code after the picture illustrating the action graph. To support symbol table, post-link should generate TY_Tempfiletable in case of PTX too. Use file-table-tform to process filenames in the table, similar to non-PTX path

@hiaselhans
Copy link
Contributor Author

So, with this PR there is a failing test: kernel_from_file

>>> ProgramManager::getDeviceImage(-1, "0", 0x89d880)
available device images:
  ++++++ Kernel set: 0
  --- Image 0xd934c0
    Version  : 1
    Kind     : 4
    Format   : 2
    Target   : <unknown>
    Bin size : 13532
    Compile options : 
    Link options    : 
    Entries  : 
    Properties [0-0]:
    OSModuleHandle=-2
    DYNAMICALLY CREATED

programm_manager fails to select the image because target is set to unknown via DynRTDeviceBinaryImage

for now DynRTDeviceBinaryImage is only used within the UseSpvEnv so we could hardcode the target to spir64?

Signed-off-by: hiaselhans <[email protected]>
@kbobrovs
Copy link
Contributor

for now DynRTDeviceBinaryImage is only used within the UseSpvEnv so we could hardcode the target to spir64?

The SPIRV format should have been determined byBin->Format = pi::getBinaryImageFormat(Bin->BinaryStart, DataSize); in the constructor. I think this should be investigated/fixed rather than hardcoding SPIRV.

@kbobrovs
Copy link
Contributor

OK, the problem is in target, not in the format. So I'd suggest then to initialize Bin->target with SPIRV64 depending on Bin->format.

@hiaselhans
Copy link
Contributor Author

OK, the problem is in target, not in the format. So I'd suggest then to initialize Bin->target with SPIRV64 depending on Bin->format.

yep, i just found that. so i will set target with a switch statement checking format:

pi.h:

static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_NONE = 0;
// specific to a device
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_NATIVE = 1;
// portable binary types go next
// SPIR-V
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_SPIRV = 2;
// LLVM bitcode
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_LLVMIR_BITCODE = 3;

@kbobrovs
Copy link
Contributor

yes, sounds good

kbobrovs
kbobrovs previously approved these changes Apr 28, 2020
Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hiaselhans
Copy link
Contributor Author

hiaselhans commented Apr 28, 2020

Did that, it still feels a little hard-coded but it's a bit of an edge case anyways.

thx @kbobrovs !

Signed-off-by: hiaselhans <[email protected]>
Signed-off-by: hiaselhans <[email protected]>
kbobrovs
kbobrovs previously approved these changes Apr 28, 2020
Copy link
Contributor

@Ruyk Ruyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just few comments, I haven't tried with the CUDA backend, does this patch solves the problem?

std::cerr << ">>> ProgramManager::getDeviceImage(" << M << ", \"" << KSId
<< "\", " << getRawSyclObjImpl(Context) << ")\n";

std::cerr << "available device images:\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could help when filtering the output...

Suggested change
std::cerr << "available device images:\n";
std::cerr << "ProgramManager: Available device images:\n";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually i just moved this part up because it might be unreachable at it's old place.

If i were to improve those messages i would query for PI_TRACE at runtime instead? what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be even better

Copy link
Contributor Author

@hiaselhans hiaselhans May 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbobrovs should i replace all DbgProgMgr checks with pi_trace in programmanager?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hiaselhans , sorry for delay. Yes, definitely makes sense. Can be done as a separate PR.

break;
default:
Bin->DeviceTargetSpec = PI_DEVICE_BINARY_TARGET_UNKNOWN;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work with the CUDA backend when there are multiple binaries?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you can only load one DynRTDeviceBinaryImage at a time using the SYCL_USE_KERNEL_SPV env-variable. this just sets the devicetarget to spir64 in case it is a spir-file so the opencl backend can recognize it.

@hiaselhans
Copy link
Contributor Author

Just few comments, I haven't tried with the CUDA backend, does this patch solves the problem?

It doesn't yet solve the issue quoted, but it does help when someone compiles only nvptx64 binaries and uses an opencl device and vice-versa. The if-clause prevented the backend from checking the image type before executing it.

@Ruyk
Copy link
Contributor

Ruyk commented Apr 29, 2020

#1543 rejects OpenCL CUDA platform on DPC++, so should help as well.

@hiaselhans
Copy link
Contributor Author

from looking at #1543 i do like the fact that there's now always getBackend() == backend::opencl
makes a lot of sense to me! :)

however there's still this: isDeviceBinaryTypeSupported

static bool isDeviceBinaryTypeSupported(const context &C,

so i wonder if we could have a PI call piDeviceSupported(pi_device *device, bool *support) and reject all those opencl 1.2 devices altogether?

@Ruyk
Copy link
Contributor

Ruyk commented May 1, 2020

I don't think we should ban OpenCL 1.2 devices, since lots of them could work (specially if they expose cl_khr_il_program). This is checked below in

for (const device &D : Devices) {
// We need cl_khr_il_program extension to be present
// and we can call clCreateProgramWithILKHR using the extension
vector_class<string_class> Extensions =
D.get_info<info::device::extensions>();
if (Extensions.end() ==
std::find(Extensions.begin(), Extensions.end(), "cl_khr_il_program"))
return false;
}
return true;
}

It is also, IMHO, not a problem of the PI API to decide if a device is valid for a plugin, one of the reason is because the pi_device object may come from a different object to the plugin you are calling the function and they are not necessarily compatible (there is no guarantee on PI API that PI types are compatible at all across implementations).
Since a pi_device is returned from querying a pi_platform , by definition, the pi_device should be valid.

In terms of CUDA, note the PiDeviceBinaryType check:

if (Format != PI_DEVICE_BINARY_TYPE_SPIRV)
return true;
,

Currently, a PTX binary is appearing as NATIVE, so this function itself is not used,

// assert(Format != PI_DEVICE_BINARY_TYPE_NONE && "Image format not set");
if (!isDeviceBinaryTypeSupported(Context, Format))
throw feature_not_supported(

The code is handled as a Binary blob loaded, like any other native binary format:

Format == PI_DEVICE_BINARY_TYPE_SPIRV
? createSpirvProgram(Ctx, RawImg.BinaryStart, ImgSize)
: createBinaryProgram(Ctx, RawImg.BinaryStart, ImgSize);

FYI, definition of Binary types:

// format is not determined
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_NONE = 0;
// specific to a device
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_NATIVE = 1;
// portable binary types go next
// SPIR-V
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_SPIRV = 2;
// LLVM bitcode
static constexpr pi_device_binary_type PI_DEVICE_BINARY_TYPE_LLVMIR_BITCODE = 3;

@hiaselhans
Copy link
Contributor Author

thx @Ruyk

Ok, i see. I was wrong in a way that isDeviceBinaryTypeSupported does return true on opencl >= 2.1 and not false <=. Wrong assumption from my side, but in a way i just meant if we could use that logic there to filter unsupported opencl devices.

So let me rephrase my point:

The isDeviceBinaryTypeSuported function does only check for format=PI_DEVICE_BINARY_TYPE_SPIRV if a device supports sprir-v.
It is the opencl backend to deliver these pi_devices, so why shouldn't it also be responsible to check if that device is a "good" opencl device? in a way it could also be a bool isSupported or similar stored with pi_device which can be checked in device_selector.

@hiaselhans hiaselhans changed the title Always let the backend choose the binary [SYCL] Always let the backend choose the binary May 1, 2020
@Ruyk
Copy link
Contributor

Ruyk commented May 1, 2020

It is the opencl backend to deliver these pi_devices, so why shouldn't it also be responsible to check if that device is a "good" opencl device? in a way it could also be a bool isSupported or similar stored with pi_device which can be checked in device_selector.

Any OpenCL device is a "good" opencl device, it is only the SYCL RT that can use or not a certain OpenCL (or PI device). PI Plugin should provide devices and platforms to the SYCL RT, decisions on what is valid is something that should be on the SYCL RT layers.

@hiaselhans
Copy link
Contributor Author

hiaselhans commented May 1, 2020

i think this discussion should be in #1543 because nothing of it is actually addressed here, in this pr.

in other words: If all devices are "good" devices, why do we need special treatment for the cuda opencl device? why not reusing the logic from isDeviceBinaryTypeSupported? and as that logic is quite opencl specific, why not moving it to the opencl backend?

forgive my naive questions but it just seems so much more straightforward to me...

@bader
Copy link
Contributor

bader commented May 9, 2020

@hiaselhans, sorry for the delay.
It looks like there is a conflict with c22e34b. Could you resolve it, please?

@hiaselhans
Copy link
Contributor Author

@bader done :)

@bader bader requested a review from kbobrovs May 10, 2020 09:16
@bader
Copy link
Contributor

bader commented May 12, 2020

@intel/llvm-reviewers-runtime, ping.

@bader bader merged commit 6233c68 into intel:sycl May 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants