Official releases #8

Closed
blueberry opened this issue May 8, 2016 · 102 comments

Comments

@blueberry
Copy link

Since CLBlast 0.7.0 is out, maybe we can prepare the release 0.7.0 of JOCLBlast (and also RC01 of JOCL)?
We have people that can build for all 3 major operating systems...

@gpu
Copy link
Owner

gpu commented May 9, 2016

Sounds like a plan. I have some (time and technical) constraints this week, but I should be able to

  • create a tag for JOCL-2.0.0-RC01 and
  • (check if there are relevant changes in CLBlast and) create a tag for JOCLBlast-0.7.0-RC01

These tags could then be used to build

  • the JOCL natives
  • the CLBlast natives (for 0.7.0) (!)
  • the JOCLBlast natives

so that, if everything works out as expected, both can be released to Maven Central next week

@blueberry
Copy link
Author

blueberry commented May 9, 2016

Cool. As for JOCLBlast, I vote to go straight to 0.7.0, since:

  1. CLBlast is itself beta, and 0.7.0 has already been released
  2. We have the minor version number for eventual fixes
  3. Currently there is a handful of users anyway, and CLBlast itself will have new versions every month or two.

I propose to:

  1. Create JOCLBlast 0.7.0-SNAPSHOT (or RC)
  2. We build CLBlast 0.7.0 for each OS, and test the latest JOCLBlast snapshot with our libraries/tests/apps (I guess this will work, since there weren't many changes since the last batch of issues)
  3. If everything is OK, you release 0.7.0 final with the same CLBlast binaries that we used to test the snapshot (RC)?

@gpu
Copy link
Owner

gpu commented May 9, 2016

OK, a first attempt:

The versions in the POMs are now "...0-SNAPSHOT", as this is the usual maven convention. The tags are both called "RC00".

I hope that the MVN install of JOCLBlast works as desired: I had to extend the POM a bit to specify the native library location of the CLBlast library for the unit tests.

(Again, I have some constraints this week, but have already built the CLBlast and JOCLBlast natives for Windows, x86 and x86_64. When the Linux x86 and x86_64 and the MacOS ones are available, I can (fingers crossed) publish both to Maven Central)

@blueberry
Copy link
Author

@gpu @amherag Here are the OS X and Linux builds for JOCL and JOCLBlast:
linux-dragan.zip
osx-amaury.zip

@gpu
Copy link
Owner

gpu commented May 11, 2016

Thanks @amherag and @blueberry !

@blueberry Could you also compile them for 32 bit? That would be great, as it is the "only" library that is missing now.

@blueberry
Copy link
Author

As far as I know, no, because I don't have a 32-bit system. If it can be done on a 64-bit system then yes, but I'd need someone to point me to information on how to do that.

Is 32 bit still a thing? I understand why someone would need that on Windows, but I haven't seen a 32-bit desktop Linux for a looong time.

@gpu
Copy link
Owner

gpu commented May 11, 2016

For Windows, the compiler toolchain can explicitly be selected in CMake ("Visual Studio..." vs "Visual Studio Win64"). For Linux, it should be possible as well, according to http://stackoverflow.com/questions/1272357/how-to-compile-a-32-bit-binary-on-a-64-bit-linux-machine-with-gcc-cmake . However, if you think that it is not worth the effort (I'm not sure how prevalent 32bit Linuxes still are...) then I'd try to assemble 2.0.0 tomorrow.
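
Which of these natives a given JVM needs depends on the operating system and on the bitness of the JVM itself. As a rough sketch (the class name `NativeClassifier` and the `os_arch` naming scheme are made up for illustration and are not necessarily what JOCL uses internally), the `os.name` and `os.arch` system properties could be mapped to a classifier like this:

```java
import java.util.Locale;

/**
 * Sketch: derive an "os_arch" classifier for a native library
 * from the values of the os.name and os.arch system properties.
 * The naming scheme is hypothetical.
 */
final class NativeClassifier
{
    /** Maps an os.name value to "windows", "linux", "apple" or "unknown" */
    static String osOf(String osName)
    {
        String s = osName.toLowerCase(Locale.ENGLISH);
        if (s.startsWith("windows")) return "windows";
        if (s.startsWith("linux")) return "linux";
        if (s.startsWith("mac")) return "apple";
        return "unknown";
    }

    /** Maps an os.arch value to "x86", "x86_64" or "unknown" */
    static String archOf(String osArch)
    {
        String s = osArch.toLowerCase(Locale.ENGLISH);
        if (s.equals("amd64") || s.equals("x86_64")) return "x86_64";
        if (s.equals("x86") || s.equals("i386") || s.equals("i686")) return "x86";
        return "unknown";
    }

    /** Combines both parts, for example "linux_x86_64" */
    static String classifier(String osName, String osArch)
    {
        return osOf(osName) + "_" + archOf(osArch);
    }

    public static void main(String[] args)
    {
        // Print the classifier for the JVM that runs this sketch
        System.out.println(classifier(
            System.getProperty("os.name"),
            System.getProperty("os.arch")));
    }
}
```

Note that `os.arch` describes the JVM rather than the machine, so a 32-bit JVM on a 64-bit Windows would be expected to report `x86` and to need the 32-bit native.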

@blueberry
Copy link
Author

I had looked at that page before, but on linux there are many more variables outside the project, starting with the dependent libraries.

I'd go with what we have since:

  1. clblast is going to be released often, so there will be opportunities for adding new platforms and architectures soon.
  2. there is a number of users who are eager to use linux/mac/windows 64bit libraries as soon as possible.
  3. if someone who needs linux 32bit appears - great! he will have the needed 32bit system and we can help him build the binaries and fill that gap!

@gpu
Copy link
Owner

gpu commented May 11, 2016

OK then, I'll try to build the package tomorrow (unfortunately, I'm not entirely sure whether this will be possible, but in any case, I'll do it ASAP)

@blueberry
Copy link
Author

Thank you.

@gpu
Copy link
Owner

gpu commented May 12, 2016

So after a bit of a hassle and back and forth *, JOCL and JOCLBlast should soon be available in Maven Central as

<dependency>
    <groupId>org.jocl</groupId>
    <artifactId>jocl</artifactId>
    <version>2.0.0</version>
</dependency>

and

<dependency>
    <groupId>org.jocl</groupId>
    <artifactId>jocl-blast</artifactId>
    <version>0.7.0</version>
</dependency>

respectively.


* Having several GitHub- and Sonatype-accounts always makes this a bit fiddly. I still think it's a nuisance that Maven performs the POM update and SCM tagging before it notices that it can not upload anything to the staging server. Once more, I had to delete tags and roll back the GitHub history to fix this...

And now, preparing everything for 2.0.1 and 0.7.1, because I'm pretty sure that something will have gone wrong with this release nevertheless ;-)

@gpu
Copy link
Owner

gpu commented May 13, 2016

It turned out that for (JO)CLBlast, users may have to download and install https://www.microsoft.com/en-us/download/confirmation.aspx?id=48145 , because the runtime libraries are linked dynamically by default (cf. CNugteren/CLBlast#59 ). Apart from that, after a quick test on a different PC, it basically seemed to work, at least on Windows.
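
For an application that uses the library, a missing runtime dependency typically shows up as an `UnsatisfiedLinkError` when the native library is loaded. A minimal sketch of turning that into a more helpful message (the helper `tryLoad` and the placeholder library name are hypothetical and not part of JOCLBlast):

```java
/**
 * Sketch: wrap a native library loading action and translate an
 * UnsatisfiedLinkError into a hint for the user.
 */
final class NativeLoadHint
{
    /**
     * Runs the given loading action. Returns null if it succeeded,
     * or a hint message if it failed with an UnsatisfiedLinkError.
     */
    static String tryLoad(Runnable loadAction)
    {
        try
        {
            loadAction.run();
            return null;
        }
        catch (UnsatisfiedLinkError e)
        {
            return "Could not load native library (" + e.getMessage() + "). "
                + "On Windows, a missing runtime library that the native "
                + "was linked against dynamically is one possible cause.";
        }
    }

    public static void main(String[] args)
    {
        // "hypothetical_native_lib" is a placeholder name for a library
        // that is not expected to exist on the library path
        String hint = tryLoad(
            () -> System.loadLibrary("hypothetical_native_lib"));
        System.out.println(hint == null ? "Loaded" : hint);
    }
}
```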

@blueberry
Copy link
Author

@gpu @amherag Cedric has just released CLBlast 0.7.1, which solves the Windows runtime issue and also two very important AMD performance issues.

@gpu, can you please prepare 0.7.1, so that @amherag and I can build the OS X and Linux versions? As soon as JOCLBlast 0.7.1 is in Maven Central, I am going to release Neanderthal.

@gpu
Copy link
Owner

gpu commented May 19, 2016

I'll try to do this ASAP (not sure whether I can do it today - hopefully tomorrow, but can not promise it)

@gpu
Copy link
Owner

gpu commented May 20, 2016

@blueberry @amherag A tag https://github.com/gpu/JOCLBlast/releases/tag/0.7.1-RC00 has been created for version 0.7.1, which may be used to build the binaries.

@amherag
Copy link

amherag commented May 20, 2016

Got it. I'll build the OS X binaries today.

@blueberry
Copy link
Author

I'll probably do it tomorrow.

@blueberry
Copy link
Author

@amherag
Copy link

amherag commented May 21, 2016

@gpu
jocl-blast-0.7.1-apple.zip

Sorry for the delay.

@gpu
Copy link
Owner

gpu commented May 22, 2016

Thanks @blueberry and @amherag! I'll pack it together ASAP (maybe today, more likely tomorrow).

@gpu
Copy link
Owner

gpu commented May 23, 2016

JOCLBlast 0.7.1 has been released and will soon be available in Maven central as

<dependency>
    <groupId>org.jocl</groupId>
    <artifactId>jocl-blast</artifactId>
    <version>0.7.1</version>
</dependency>

@blueberry
Copy link
Author

@gpu I just want to notify you that CLBlast 0.8.0 is out, so we can prepare the releases that fix the OS X bug uncomplicate/neanderthal#15.

It will require JOCL code generation, since there are new functions in CLBlast.

@gpu
Copy link
Owner

gpu commented Jun 29, 2016

Thanks for this pointer. Unfortunately, I'll hardly be able to do it this week, but will try to schedule it ASAP (likely Monday).

@blueberry
Copy link
Author

Thank you. Monday is perfectly fine. I'll include a pointer for @amherag so he can expect it.

@gpu
Copy link
Owner

gpu commented Jun 30, 2016

I managed to allocate some time today, and updated it, in https://github.com/gpu/JOCLBlast/releases/tag/0.8.0-RC00

There haven't been many new functions, except for the (non-BLAS) omatcopy ones, and the ones that operate on cl_half. Since this data type is not supported in Java at all, I now omitted these.
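
To illustrate why `cl_half` is awkward on the Java side: Java has no 16-bit floating-point primitive, so half values would have to be carried as raw bit patterns (for example in a `short`) and converted on the host. The following is only a sketch of such a conversion under the IEEE 754 binary16 layout; the class and method names are made up here, and nothing like this is part of JOCLBlast:

```java
/**
 * Sketch: convert a 16-bit half-precision bit pattern, stored in
 * a short, into a Java float.
 */
final class HalfFloat
{
    static float halfToFloat(short half)
    {
        int bits = half & 0xFFFF;
        int sign = (bits >>> 15) & 0x1;
        int exponent = (bits >>> 10) & 0x1F;
        int mantissa = bits & 0x3FF;

        if (exponent == 0)
        {
            // Zero or subnormal: the value is mantissa * 2^-24
            float value = mantissa * (1.0f / (1 << 24));
            return sign == 0 ? value : -value;
        }
        int floatBits;
        if (exponent == 0x1F)
        {
            // Infinity (mantissa == 0) or NaN (mantissa != 0)
            floatBits = (sign << 31) | 0x7F800000 | (mantissa << 13);
        }
        else
        {
            // Normal number: change the exponent bias from 15 to 127
            // and widen the mantissa from 10 to 23 bits
            floatBits = (sign << 31)
                | ((exponent - 15 + 127) << 23)
                | (mantissa << 13);
        }
        return Float.intBitsToFloat(floatBits);
    }

    public static void main(String[] args)
    {
        // 0x3C00 is the half-precision bit pattern of 1.0
        System.out.println(halfToFloat((short) 0x3C00));
    }
}
```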

@blueberry
Copy link
Author

@gpu @amherag Here is the linux build. Please note that you forgot to update the version in Maven pom - I updated that to 0.8.0-SNAPSHOT, but it should not affect native binaries that are relevant here (their version is OK).
jocl-blast-0.8.0-SNAPSHOT.zip

@amherag
Copy link

amherag commented Jul 6, 2016

@gpu @blueberry And here is the mac build.

jocl-blast-0.8.0-SNAPSHOT.jar.zip

When I issued mvn clean install on JOCLBlast 0.8.0, the jar was named as version 0.7.2. (Oh, nevermind, I just read @blueberry post before mine. I'll rename it to 0.8.0-SNAPSHOT too).

Sorry for the delay, although I'm already on vacation, I've been very busy.

@blueberry
Copy link
Author

Thank you, Amaury.

@gpu
Copy link
Owner

gpu commented Jul 6, 2016

Thanks @blueberry and @amherag for the contribution!

(The version number in the POM: Indeed, I created this tag (which was only intended for the binaries) before updating the maven build part - but it would indeed have been better to update the version number in the POM as well)

The new version should be available in Maven Central soon, at

<dependency>
    <groupId>org.jocl</groupId>
    <artifactId>jocl-blast</artifactId>
    <version>0.8.0</version>
</dependency>

Hopefully, this will also resolve uncomplicate/neanderthal#15

@gpu
Copy link
Owner

gpu commented May 10, 2017

I'll try to create the samples tomorrow (not sure how to test this "parameter override", but at least a batched example), so hopefully, the RC tag can be created tomorrow as well.

@CNugteren
Copy link

There is a small parameter override test in CLBlast, maybe that will help you: https://github.com/CNugteren/CLBlast/blob/master/test/correctness/misc/override_parameters.cpp

@gpu
Copy link
Owner

gpu commented May 11, 2017

@blueberry and @amherag The RC tag for 0.11.0 is at https://github.com/gpu/JOCLBlast/releases/tag/0.11.0-RC00

@CNugteren Thanks. I have created a "simplified port" of this class for testing the OverrideParameters functionality in JOCLBlast:

package org.jocl.samples.blast;

import static org.jocl.CL.*;

import java.util.*;

import org.jocl.*;
import org.jocl.blast.*;

/**
 * An example for using the OverrideParameters functionality of CLBlast.
 *
 * This example is basically a (simplified) port of the original test at
 * https://github.com/CNugteren/CLBlast/blob/
 *     f24c142948fc71d8b37826c1275259668fe0d0e5/test/
 *     correctness/misc/override_parameters.cpp
 *     
 */
public class JOCLBlastOverrideTest
{
    // The platform, device type and device number
    // that will be used
    static final int platformIndex = 0;
    static final long deviceType = CL_DEVICE_TYPE_ALL;
    static final int deviceIndex = 0;

    private static cl_device_id device;
    private static cl_context context;
    private static cl_command_queue commandQueue;

    public static void main(String[] args)
    {
	int errors = 0;
	int passed = 0;
	int kSeed = 42; // fixed seed for reproducibility

	// Determines the test settings
	String routine_name = "SGEMM";
	String kernel_name = "Xgemm";
	int precision = CLBlastPrecision.CLBlastPrecisionSingle;
	List<Map<String, Integer>> valid_settings = createValidSettings();
	List<Map<String, Integer>> invalid_settings = createInvalidSettings();

	// Retrieves the arguments
	int m = 256;
	int n = 256;
	int k = 256;
	int a_ld = k;
	int b_ld = n;
	int c_ld = n;
	int a_offset = 0;
	int b_offset = 0;
	int c_offset = 0;
	int layout = CLBlastLayout.CLBlastLayoutRowMajor;
	int a_transpose = CLBlastTranspose.CLBlastTransposeNo;
	int b_transpose = CLBlastTranspose.CLBlastTransposeNo;
	float alpha = 0.0f;
	float beta  = 0.0f;

	// Initialize OpenCL
	defaultInitialization();

	// Populate host matrices with some example data
	float host_a[] = new float[m * k];
	float host_b[] = new float[n * k];
	float host_c[] = new float[m * n];
	Random random = new Random(kSeed);
	populateVector(host_a, random);
	populateVector(host_b, random);
	populateVector(host_c, random);

	// Copy the matrices to the device
	cl_mem device_a = copyToDevice(host_a);
	cl_mem device_b = copyToDevice(host_b);
	cl_mem device_c = copyToDevice(host_c);

	System.out.printf(
	    "* Testing OverrideParameters for '%s'\n", routine_name);

	// Loops over the valid combinations: run before and run afterwards
	for (Map<String, Integer> override_setting : valid_settings)
	{
	    // Call with the default parameters
	    int status_before = CLBlast.CLBlastSgemm(
		layout, a_transpose, b_transpose, m, 
		n, k, alpha, device_a, a_offset, 
		a_ld, device_b, b_offset, b_ld, beta, 
		device_c, c_offset, c_ld, commandQueue, null);
	    CL.clFinish(commandQueue);

	    if (status_before != CLBlastStatusCode.CLBlastSuccess)
	    {
		errors++;
		continue;
	    }

	    // Overrides the parameters
	    int num_parameters = override_setting.size();
	    String parameters_names[] = 
		override_setting.keySet().toArray(new String[0]);
	    long[] parameters_values = 
		extractParameterValues(override_setting.values());
	    int status = CLBlast.CLBlastOverrideParameters(
		device, kernel_name, precision, num_parameters, 
		parameters_names, parameters_values);

	    if (status != CLBlastStatusCode.CLBlastSuccess)
	    {
		errors++;
		continue;
	    }

	    // Call with the overridden parameters
	    int status_after = CLBlast.CLBlastSgemm(
		layout, a_transpose, b_transpose, m, 
		n, k, alpha, device_a, a_offset, 
		a_ld, device_b, b_offset, b_ld, beta, 
		device_c, c_offset, c_ld, commandQueue, null);
	    CL.clFinish(commandQueue);

	    if (status_after != CLBlastStatusCode.CLBlastSuccess)
	    {
		errors++;
		continue;
	    }

	    passed++;
	}


	// Loops over the invalid combinations: the override is expected to fail
	for (Map<String, Integer> override_setting : invalid_settings)
	{
	    // Call with the default parameters
	    int status_before = CLBlast.CLBlastSgemm(
		layout, a_transpose, b_transpose, m, 
		n, k, alpha, device_a, a_offset, 
		a_ld, device_b, b_offset, b_ld, beta, 
		device_c, c_offset, c_ld, commandQueue, null);
	    CL.clFinish(commandQueue);

	    if (status_before != CLBlastStatusCode.CLBlastSuccess)
	    {
		errors++;
		continue;
	    }

	    // Overrides the parameters
	    int num_parameters = override_setting.size();
	    String parameters_names[] = 
		override_setting.keySet().toArray(new String[0]);
	    long[] parameters_values = 
		extractParameterValues(override_setting.values());
	    int status = CLBlast.CLBlastOverrideParameters(
		device, kernel_name, precision, num_parameters, 
		parameters_names, parameters_values);

	    if (status == CLBlastStatusCode.CLBlastSuccess) // expecting error
	    {
		errors++;
		continue;
	    }

	    // Call again (using the default parameters)
	    int status_after = CLBlast.CLBlastSgemm(
		layout, a_transpose, b_transpose, m, 
		n, k, alpha, device_a, a_offset, 
		a_ld, device_b, b_offset, b_ld, beta, 
		device_c, c_offset, c_ld, commandQueue, null);
	    CL.clFinish(commandQueue);

	    if (status_after != CLBlastStatusCode.CLBlastSuccess)
	    {
		errors++;
		continue;
	    }

	    passed++;
	}

	// Print the statistics
	System.out.printf("    %d test(s) passed\n", passed);
	System.out.printf("    %d test(s) failed\n", errors);
	System.out.printf("\n");
    }

    private static List<Map<String, Integer>> createValidSettings()
    {
	List<Map<String, Integer>> validSettings = 
	    new ArrayList<Map<String, Integer>>();

	Map<String, Integer> map = null;

	map = new LinkedHashMap<String, Integer>();
	map.put("KWG",16);
	map.put("KWI",2);
	map.put("MDIMA",4);
	map.put("MDIMC",4);
	map.put("MWG",16);
	map.put("NDIMB",4);
	map.put("NDIMC",4);
	map.put("NWG",16);
	map.put("SA",0);
	map.put("SB",0);
	map.put("STRM",0);
	map.put("STRN",0);
	map.put("VWM",1);
	map.put("VWN",1);
	validSettings.add(map);

	map = new LinkedHashMap<String, Integer>();
	map.put("KWG",32);
	map.put("KWI",2);
	map.put("MDIMA",4);
	map.put("MDIMC",4);
	map.put("MWG",32);
	map.put("NDIMB",4);
	map.put("NDIMC",4);
	map.put("NWG",32);
	map.put("SA",0);
	map.put("SB",0);
	map.put("STRM",0);
	map.put("STRN",0);
	map.put("VWM",1);
	map.put("VWN",1);
	validSettings.add(map);

	return validSettings;
    }

    private static List<Map<String, Integer>> createInvalidSettings()
    {
	List<Map<String, Integer>> invalidSettings = 
	    new ArrayList<Map<String, Integer>>();

	Map<String, Integer> map = null;

	map = new LinkedHashMap<String, Integer>();
	map.put("KWI",2);
	map.put("MDIMA",4);
	map.put("MDIMC",4);
	map.put("MWG",16);
	map.put("NDIMB",4);
	map.put("NDIMC",4);
	map.put("NWG",16);
	map.put("SA",0);
	invalidSettings.add(map);

	return invalidSettings;
    }

    private static long[] extractParameterValues(Collection<Integer> integers)
    {
	long result[] = new long[integers.size()];
	int index = 0;
	for (Integer integer : integers)
	{
	    result[index] = integer;
	    index++;
	}
	return result;
    }

    private static void populateVector(float a[], Random random)
    {
	for (int i=0; i<a.length; i++)
	{
	    a[i] = random.nextFloat();
	}
    }

    private static cl_mem copyToDevice(float host[])
    {
	cl_mem device = clCreateBuffer(context, CL_MEM_READ_WRITE,
	    host.length * Sizeof.cl_float, null, null);
	clEnqueueWriteBuffer(commandQueue, device, CL_TRUE, 0,
	    host.length * Sizeof.cl_float, 
	    Pointer.to(host), 0, null, null);
	return device;
    }

    /**
     * Default OpenCL initialization of the device, context and command queue
     */
    private static void defaultInitialization()
    {
	// Enable exceptions and subsequently omit error checks in this sample
	CL.setExceptionsEnabled(true);

	// Obtain the number of platforms
	int numPlatformsArray[] = new int[1];
	clGetPlatformIDs(0, null, numPlatformsArray);
	int numPlatforms = numPlatformsArray[0];

	// Obtain a platform ID
	cl_platform_id platforms[] = new cl_platform_id[numPlatforms];
	clGetPlatformIDs(platforms.length, platforms, null);
	cl_platform_id platform = platforms[platformIndex];

	// Initialize the context properties
	cl_context_properties contextProperties = new cl_context_properties();
	contextProperties.addProperty(CL_CONTEXT_PLATFORM, platform);

	// Obtain the number of devices for the platform
	int numDevicesArray[] = new int[1];
	clGetDeviceIDs(platform, deviceType, 0, null, numDevicesArray);
	int numDevices = numDevicesArray[0];

	// Obtain a device ID
	cl_device_id devices[] = new cl_device_id[numDevices];
	clGetDeviceIDs(platform, deviceType, numDevices, devices, null);
	device = devices[deviceIndex];

	// Create a context for the selected device
	context = clCreateContext(
	    contextProperties, 1, new cl_device_id[]{device},
	    null, null, null);

	String deviceName = getString(device, CL_DEVICE_NAME);
	System.out.printf("CL_DEVICE_NAME: %s\n", deviceName);

	// Create a command-queue for the selected device
	commandQueue = clCreateCommandQueue(
	    context, device, 0, null);

    }

    private static String getString(cl_device_id device, int paramName)
    {
	// Obtain the length of the string that will be queried
	long size[] = new long[1];
	clGetDeviceInfo(device, paramName, 0, null, size);

	// Create a buffer of the appropriate size and fill it with the info
	byte buffer[] = new byte[(int)size[0]];
	clGetDeviceInfo(device, paramName, buffer.length, 
	    Pointer.to(buffer), null);

	// Create a string from the buffer (excluding the trailing \0 byte)
	return new String(buffer, 0, buffer.length-1);
    }

}
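
The parameter handling in this sample can be looked at in isolation. `CLBlastOverrideParameters` receives the names and values as two parallel arrays, so both have to be read from the map in the same order; a `LinkedHashMap` iterates its keys and its values in insertion order. The sketch below (class and method names are hypothetical) also lists which names an override lacks compared to a complete one. That the incomplete setting above is rejected for exactly this reason is an assumption, not something verified here:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/**
 * Sketch: turn an override setting into parallel name/value arrays
 * and compare its names against a complete reference setting.
 */
final class OverrideSettings
{
    /** Returns the parameter names in the iteration order of the map */
    static String[] names(Map<String, Integer> setting)
    {
        return setting.keySet().toArray(new String[0]);
    }

    /** Returns the values, widened to long, in the same order as names() */
    static long[] values(Map<String, Integer> setting)
    {
        long[] result = new long[setting.size()];
        int index = 0;
        for (Integer value : setting.values())
        {
            result[index++] = value;
        }
        return result;
    }

    /** Returns the names that are in the reference but not in the candidate */
    static Set<String> missingNames(
        Map<String, Integer> reference, Map<String, Integer> candidate)
    {
        Set<String> missing = new LinkedHashSet<String>(reference.keySet());
        missing.removeAll(candidate.keySet());
        return missing;
    }

    public static void main(String[] args)
    {
        // Shortened example settings, only for illustration
        Map<String, Integer> complete = new LinkedHashMap<String, Integer>();
        complete.put("KWG", 16);
        complete.put("KWI", 2);
        complete.put("MWG", 16);

        Map<String, Integer> incomplete = new LinkedHashMap<String, Integer>();
        incomplete.put("KWI", 2);
        incomplete.put("MWG", 16);

        System.out.println(Arrays.toString(names(complete)));
        System.out.println(Arrays.toString(values(complete)));
        System.out.println("Missing: " + missingNames(complete, incomplete));
    }
}
```

Applied to the two kinds of maps in the sample, the invalid setting lacks `KWG`, `SB`, `STRM`, `STRN`, `VWM` and `VWN`.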

Also, a small test/example for the CLBlastCaxpyBatched function:

package org.jocl.samples.blast;

import static org.jocl.CL.*;
import static org.jocl.blast.CLBlast.CLBlastCaxpyBatched;

import java.nio.FloatBuffer;
import java.util.Locale;

import org.jocl.*;
import org.jocl.blast.CLBlast;

/**
 * An example for using the batched CAXPY function from CLBlast to compute
 * Y = a * X + Y
 * for several single-precision complex number vectors
 */
public class JOCLBlastCaxpyBatchedSample
{
    private static cl_context context;
    private static cl_command_queue commandQueue;

    /**
     * The entry point of this sample
     *
     * @param args Not used
     */
    public static void main(String args[])
    {
	CL.setExceptionsEnabled(true);
	CLBlast.setExceptionsEnabled(true);

	defaultInitialization();


	// Create the host input data. Each entry of these vectors consists 
	// of TWO values, which are the real- and imaginary part of the 
	// complex number
	int numVectors = 3;
	int vectorSize = 5;

	// 3 vectors, each with 5 dimensions (*2, for real- and imaginary part)
	float X[] =  
	{
	    1,1, 1,2, 1,3, 1,4, 1,5,
	    2,1, 2,2, 2,3, 2,4, 2,5,
	    3,1, 3,2, 3,3, 3,4, 3,5,
	};
	// 3 vectors, each with 5 dimensions (*2, for real- and imaginary part)
	float Y[] =
	{
	    4,1, 4,2, 4,3, 4,4, 4,5,
	    5,1, 5,2, 5,3, 5,4, 5,5,
	    6,1, 6,2, 6,3, 6,4, 6,5,
	};

	// Create the device input buffers
	cl_mem memX = clCreateBuffer(context, CL_MEM_READ_ONLY,
	    vectorSize * numVectors * Sizeof.cl_float2, null, null);
	cl_mem memY = clCreateBuffer(context, CL_MEM_READ_WRITE,
	    vectorSize * numVectors * Sizeof.cl_float2, null, null);

	// Copy the host data to the device
	clEnqueueWriteBuffer(commandQueue, memX, CL_TRUE, 0,
	    vectorSize * numVectors * Sizeof.cl_float2, 
	    Pointer.to(X), 0, null, null);
	clEnqueueWriteBuffer(commandQueue, memY, CL_TRUE, 0,
	    vectorSize * numVectors * Sizeof.cl_float2, 
	    Pointer.to(Y), 0, null, null);

	// 3 factors to be multiplied with X (*2, for real- and imaginary part)
	float alphas[] = { 1,2, 2,3, 3,4 };

	// Execute batched CAXPY: Y = alpha * X + Y
	cl_event event = new cl_event();
	CLBlastCaxpyBatched(vectorSize, alphas, 
	    memX, new long[] { 0, 5, 10 }, 1, 
	    memY, new long[] { 0, 5, 10 }, 1,  
	    numVectors, commandQueue, event);

	// Wait for the computation to be finished
	clWaitForEvents( 1, new cl_event[] { event });

	// Copy the result data back to the host
	float resultY[] = new float[vectorSize * numVectors * 2];
	clEnqueueReadBuffer(commandQueue, memY, CL_TRUE, 0,
	    vectorSize * numVectors * Sizeof.cl_float2, 
	    Pointer.to(resultY), 0, null, null);

	// Print the inputs and the result
	System.out.println("a:");
	printComplex2D(FloatBuffer.wrap(alphas), 1);

	System.out.println("X:");
	printComplex2D(FloatBuffer.wrap(X), vectorSize);

	System.out.println("Y:");
	printComplex2D(FloatBuffer.wrap(Y), vectorSize);

	System.out.println("Result:");
	printComplex2D(FloatBuffer.wrap(resultY), vectorSize);

	// Clean up
	clReleaseMemObject(memX);
	clReleaseMemObject(memY);
	clReleaseCommandQueue(commandQueue);
	clReleaseContext(context);        
    }

    /**
     * Default OpenCL initialization of the context and command queue
     */
    private static void defaultInitialization()
    {
	// The platform, device type and device number
	// that will be used
	final int platformIndex = 0;
	final long deviceType = CL_DEVICE_TYPE_ALL;
	final int deviceIndex = 0;

	// Enable exceptions and subsequently omit error checks in this sample
	CL.setExceptionsEnabled(true);

	// Obtain the number of platforms
	int numPlatformsArray[] = new int[1];
	clGetPlatformIDs(0, null, numPlatformsArray);
	int numPlatforms = numPlatformsArray[0];

	// Obtain a platform ID
	cl_platform_id platforms[] = new cl_platform_id[numPlatforms];
	clGetPlatformIDs(platforms.length, platforms, null);
	cl_platform_id platform = platforms[platformIndex];

	// Initialize the context properties
	cl_context_properties contextProperties = new cl_context_properties();
	contextProperties.addProperty(CL_CONTEXT_PLATFORM, platform);

	// Obtain the number of devices for the platform
	int numDevicesArray[] = new int[1];
	clGetDeviceIDs(platform, deviceType, 0, null, numDevicesArray);
	int numDevices = numDevicesArray[0];

	// Obtain a device ID
	cl_device_id devices[] = new cl_device_id[numDevices];
	clGetDeviceIDs(platform, deviceType, numDevices, devices, null);
	cl_device_id device = devices[deviceIndex];

	// Create a context for the selected device
	context = clCreateContext(
	    contextProperties, 1, new cl_device_id[]{device},
	    null, null, null);

	String deviceName = getString(device, CL_DEVICE_NAME);
	System.out.printf("CL_DEVICE_NAME: %s\n", deviceName);

	// Create a command-queue for the selected device
	commandQueue = clCreateCommandQueue(
	    context, device, 0, null);

    }

    /**
     * Print the given buffer as a matrix with the given number of columns.
     * This assumes that the elements of these buffers are complex 
     * numbers, consisting of a real- and an imaginary part.
     *
     * @param data The buffer
     * @param columns The number of columns
     */
    private static void printComplex2D(FloatBuffer data, int columns)
    {
	StringBuffer sb = new StringBuffer();
	for (int i=0; i<data.capacity() / 2; i++)
	{
	    sb.append(String.format(Locale.ENGLISH, "(%5.1f, %5.1fi) ",
		data.get(i * 2 + 0), data.get(i * 2 + 1)));
	    if (((i + 1) % columns) == 0)
	    {
		sb.append("\n");
	    }
	}
	System.out.print(sb.toString());
    }

    private static String getString(cl_device_id device, int paramName)
    {
	// Obtain the length of the string that will be queried
	long size[] = new long[1];
	clGetDeviceInfo(device, paramName, 0, null, size);

	// Create a buffer of the appropriate size and fill it with the info
	byte buffer[] = new byte[(int)size[0]];
	clGetDeviceInfo(device, paramName, buffer.length, 
	    Pointer.to(buffer), null);

	// Create a string from the buffer (excluding the trailing \0 byte)
	return new String(buffer, 0, buffer.length-1);
    }

}

Both seem to work well. I'll have to dive deeper into what OverrideParameters actually does to be sure that it has the intended effect, but I received some error messages from the OpenCL compiler when I called it with wrong parameters, so it at least does have an effect ;-)
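
One way to gain a bit more confidence in the batched result is to compare it against a plain host computation. The following sketch computes the same Y = a * X + Y on the CPU, assuming that the offsets are given in complex elements and that both increments are 1, as in the call above; `CaxpyBatchedReference` is a made-up helper, not part of JOCLBlast:

```java
import java.util.Arrays;

/**
 * Sketch: host reference for a batched complex AXPY, where each
 * complex number is stored as two consecutive floats (real, imaginary).
 */
final class CaxpyBatchedReference
{
    /**
     * Computes y = alpha * x + y for each batch, in place.
     * Offsets are counted in complex elements, increments are 1.
     */
    static void caxpyBatched(int n, float[] alphas,
        float[] x, long[] xOffsets,
        float[] y, long[] yOffsets, int batchCount)
    {
        for (int b = 0; b < batchCount; b++)
        {
            float ar = alphas[2 * b + 0];
            float ai = alphas[2 * b + 1];
            for (int i = 0; i < n; i++)
            {
                int xi = 2 * ((int) xOffsets[b] + i);
                int yi = 2 * ((int) yOffsets[b] + i);
                float xr = x[xi + 0];
                float xim = x[xi + 1];
                // Complex multiplication: (ar + ai*i) * (xr + xim*i)
                y[yi + 0] += ar * xr - ai * xim;
                y[yi + 1] += ar * xim + ai * xr;
            }
        }
    }

    public static void main(String[] args)
    {
        // The same input data as in the sample above
        float[] x =
        {
            1,1, 1,2, 1,3, 1,4, 1,5,
            2,1, 2,2, 2,3, 2,4, 2,5,
            3,1, 3,2, 3,3, 3,4, 3,5,
        };
        float[] y =
        {
            4,1, 4,2, 4,3, 4,4, 4,5,
            5,1, 5,2, 5,3, 5,4, 5,5,
            6,1, 6,2, 6,3, 6,4, 6,5,
        };
        float[] alphas = { 1,2, 2,3, 3,4 };
        long[] offsets = { 0, 5, 10 };

        caxpyBatched(5, alphas, x, offsets, y, offsets, 3);
        System.out.println(Arrays.toString(y));
    }
}
```

The printed values could then be compared, within a small tolerance, to the `resultY` array that is read back from the device.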


I still have to create a GitHub repo for all the JOCL samples, so that I can finally summarize the examples from http://jocl.org/samples/samples.html and the ones that are posted elsewhere (in the forum and here) in one place....

@blueberry
Copy link
Author

@gpu @amherag Here is the linux build for 0.11.0. Everything went smoothly.
jocl-blast-0.11.0-SNAPSHOT.zip

@gpu
Copy link
Owner

gpu commented May 22, 2017

(EDIT: Writing this overlapped with the comment at #9 (comment) )

I have done a small update for #9 (comment)

Although technically, it should not change anything for the linux version, it might be clearer if the linux version would also be compiled based on this state. (The change might still cause issues on Linux - although, of course, it should not, but just to be sure...)

@amherag
Copy link

amherag commented May 22, 2017

Here it is :)

jocl-blast-0.11.0-SNAPSHOT.jar.zip

@blueberry
Copy link
Author

And the linux build is also ready:
jocl-blast-0.11.0-SNAPSHOT-22-5-2017.zip

@gpu
Copy link
Owner

gpu commented May 22, 2017

You're great! I'll build the Maven package ASAP (maybe tomorrow, but most likely not later than Thursday)

@gpu
Copy link
Owner

gpu commented May 25, 2017

Thanks again to @amherag and @blueberry (and @CNugteren , for making all this possible in the first place ;-) )

The release will soon be available as

<dependency>
    <groupId>org.jocl</groupId>
    <artifactId>jocl-blast</artifactId>
    <version>0.11.0</version>
</dependency>

@blueberry
Copy link
Author

@amherag
Copy link

amherag commented Jul 30, 2017

@blueberry
Copy link
Author

@amherag Hi Amaury. I'm afraid that we first have to wait for @gpu to update JOCLBlast to the newest CLBlast 1.0 :)

@amherag
Copy link

amherag commented Jul 30, 2017

@blueberry Yeah, I was wondering why the versions didn't match. I was going to update my comment, but I decided to wait and see what you or @gpu were going to tell me :P

@gpu
Copy link
Owner

gpu commented Jul 31, 2017

Thanks for the heads-up. Apart from the *AMIN functions, there seem to be no changes in the API. I'll try to schedule the update ASAP (I'm a bit short on time this week, but will see what I can do)

@blueberry
Copy link
Author

Thank you, @gpu

@CNugteren
Copy link

Thanks again everyone! There was a bug fixed just after the release though, so I'll make a 1.0.1 release soon after (next week after everything is properly checked this time). Perhaps you should wait for that?

@blueberry
Copy link
Author

@CNugteren @gpu I'd prefer to wait for the proper release, as I am in no hurry. Thanks everyone!

@gpu
Copy link
Owner

gpu commented Aug 2, 2017

Yes, that sounds like a plan :-)

@CNugteren
Copy link

New 1.0.1 release is now made, sorry for any inconvenience. Greatly appreciate your effort with JOCLBlast!

@gpu
Copy link
Owner

gpu commented Aug 11, 2017

These efforts are nothing compared to the efforts that went into CLBlast itself 👍

(I'll do the update on Sunday/Monday and drop a note here)

@gpu
Copy link
Owner

gpu commented Aug 14, 2017

Although it's already Tuesday now, here is the tag for the 1.0.1 release:

https://github.com/gpu/JOCLBlast/releases/tag/1.0.1-RC00

@blueberry and @amherag Once the natives for JOCLBlast and CLBlast are available, I'll publish the Maven release.

(BTW: This issue is already rather long. I'd probably close this after the release, so that we can use dedicated issues for the subsequent releases)

@blueberry
Copy link
Author

I will be able to build and test it only in a few weeks. I hope that is OK. Sorry.

@gpu
Copy link
Owner

gpu commented Aug 15, 2017

OK for me. Maybe that's a chance for me to try and build this on a VirtualBox VM. This should work, but not being able to really test the resulting library would cause me to hesitate publishing it.

(Maybe I can build it on a VM, and you can try out whether the resulting lib works on a real machine. If it does, I could build the linux libs myself in the future)

@blueberry
Copy link
Author

Just testing whether it works wouldn't be that time-consuming for me, but the thing is that Cedric committed new tuning results for the GPU that I use from another user that tuned it with a newer GPU. However, that user was getting some results that were suspicious to me, so I need to investigate this and make some measurements to see whether these new changes do not introduce some noticeable performance regressions on my hardware (R9 290X)...

@blueberry
Copy link
Author

@gpu Hi Marco. I've finally come around to building JOCLBlast 1.0.1 for Linux. Sorry for the delay.

@amherag reminder :)

jocl-blast-1.0.1-SNAPSHOT.zip

@amherag
Copy link

amherag commented Sep 3, 2017

@blueberry Thanks for the reminder :D

jocl-blast-1.0.1-SNAPSHOT.jar.zip

@gpu
Copy link
Owner

gpu commented Sep 3, 2017

Thanks, @blueberry and @amherag , I'll do the Maven update ASAP

@gpu
Copy link
Owner

gpu commented Sep 5, 2017

JOCLBlast 1.0.1 has been uploaded to Maven Central, and will soon be available under the following coordinates:

<dependency>
    <groupId>org.jocl</groupId>
    <artifactId>jocl-blast</artifactId>
    <version>1.0.1</version>
</dependency>

Thanks again @blueberry @amherag and of course @CNugteren for making this possible!

(As mentioned above, I'll close this issue now. For future updates, dedicated issues can be created)

@gpu gpu closed this as completed Sep 5, 2017