some details on LMFitCuda::solve_equation_system() #50

lukkio88 · 2018-03-13T14:27:05Z

Hi All,
With reference to #46: I'm trying to fit a "vector function" using a model with more than 32 parameters, so I'm trying to hack the code a bit to support my case. Consider the function below:

void LMFitCUDA::solve_equation_system()
{
	dim3  threads(1, 1, 1);
	dim3  blocks(1, 1, 1);

	threads.x = info_.n_parameters_to_fit_*info_.n_fits_per_block_;
	blocks.x = n_fits_ / info_.n_fits_per_block_;

	cuda_modify_step_widths << < blocks, threads >> >(
		gpu_data_.hessians_,
		gpu_data_.lambdas_,
		gpu_data_.scaling_vectors_,
		info_.n_parameters_to_fit_,
		gpu_data_.iteration_failed_,
		gpu_data_.finished_,
		info_.n_fits_per_block_);
	CUDA_CHECK_STATUS(cudaGetLastError());

	int n_parameters_pow2 = 1;

	while (n_parameters_pow2 < info_.n_parameters_to_fit_)
	{
		n_parameters_pow2 *= 2;
	}

	//set up to run the Gauss Jordan elimination
	int const n_equations = info_.n_parameters_to_fit_;
	int const n_solutions = n_fits_;

	threads.x = n_equations + 1;
	threads.y = n_equations;
	blocks.x = n_solutions;

	//set the size of the shared memory area for each block
	int const shared_size
		= sizeof(float) * ((threads.x * threads.y)
			+ n_parameters_pow2 + n_parameters_pow2);

	//set up the singular_test vector
	int * singular_tests;
	CUDA_CHECK_STATUS(cudaMalloc((void**)&singular_tests, n_fits_ * sizeof(int)));

	//run the Gauss Jordan elimination
	cuda_gaussjordan << < blocks, threads, shared_size >> >(
		gpu_data_.deltas_,
		gpu_data_.gradients_,
		gpu_data_.hessians_,
		gpu_data_.finished_,
		singular_tests,
		info_.n_parameters_to_fit_,
		n_parameters_pow2);
	CUDA_CHECK_STATUS(cudaGetLastError());

	//set up to update the lm_state_gpu_ variable with the Gauss Jordan results
	threads.x = std::min(n_fits_, 256);
	threads.y = 1;
	blocks.x = int(std::ceil(float(n_fits_) / float(threads.x)));

	//update the lm_state_gpu_ variable
	cuda_update_state_after_gaussjordan << < blocks, threads >> >(
		n_fits_,
		singular_tests,
		gpu_data_.states_);
	CUDA_CHECK_STATUS(cudaGetLastError());

	CUDA_CHECK_STATUS(cudaFree(singular_tests));

	threads.x = info_.n_parameters_*info_.n_fits_per_block_;
	threads.y = 1;
	blocks.x = n_fits_ / info_.n_fits_per_block_;

	cuda_update_parameters << < blocks, threads >> >(
		gpu_data_.parameters_,
		gpu_data_.prev_parameters_,
		gpu_data_.deltas_,
		info_.n_parameters_to_fit_,
		gpu_data_.parameters_to_fit_indices_,
		gpu_data_.finished_,
		info_.n_fits_per_block_);
	CUDA_CHECK_STATUS(cudaGetLastError());
}

What is the meaning of the following settings of threads and blocks?

	threads.x = info_.n_parameters_to_fit_*info_.n_fits_per_block_;
	blocks.x = n_fits_ / info_.n_fits_per_block_;

	//set up to run the Gauss Jordan elimination
	int const n_equations = info_.n_parameters_to_fit_;
	int const n_solutions = n_fits_;

	threads.x = n_equations + 1;
	threads.y = n_equations;
	blocks.x = n_solutions;

	threads.x = info_.n_parameters_*info_.n_fits_per_block_;
	threads.y = 1;
	blocks.x = n_fits_ / info_.n_fits_per_block_;

The link above gives a workaround for implementing a vector function. However, because of the large number of parameters I have to modify the threads and blocks variables, and I don't want to break the way they are actually used. I have just one function to fit.
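To make the question concrete, here is a minimal host-side sketch that reproduces only the arithmetic of the three launch configurations quoted above, so the thread counts can be checked for a given parameter count. The helper names (`LaunchShape`, `gauss_jordan_shape`, and so on) are hypothetical and not part of the library, and the limit of 1024 threads per block is an assumption about the target GPU. The comments describe what the launch shapes suggest, not what the kernels are confirmed to do.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper type for illustration; not part of the library.
struct LaunchShape {
    int threads_x;
    int threads_y;
    int blocks_x;
    int threads_per_block() const { return threads_x * threads_y; }
};

// Assumption: the device allows at most 1024 threads per block.
constexpr int kMaxThreadsPerBlock = 1024;

// First snippet: n_parameters_to_fit * n_fits_per_block threads per block,
// which suggests one thread per (fit, fitted parameter) pair.
inline LaunchShape modify_step_widths_shape(int n_parameters_to_fit,
                                            int n_fits,
                                            int n_fits_per_block) {
    return {n_parameters_to_fit * n_fits_per_block, 1,
            n_fits / n_fits_per_block};
}

// Second snippet: one block per fit, with an (n + 1) x n thread grid,
// which matches the size of an augmented n x (n + 1) matrix.
inline LaunchShape gauss_jordan_shape(int n_parameters_to_fit, int n_fits) {
    return {n_parameters_to_fit + 1, n_parameters_to_fit, n_fits};
}

// State update: up to 256 threads per block, enough blocks to cover all fits.
// Integer ceiling division; assumes n_fits > 0.
inline LaunchShape update_state_shape(int n_fits) {
    int const threads = std::min(n_fits, 256);
    int const blocks = (n_fits + threads - 1) / threads;
    return {threads, 1, blocks};
}

// Shared memory requested for the Gauss-Jordan launch, as in the quoted code:
// sizeof(float) * ((n + 1) * n + 2 * next_power_of_two(n)).
inline std::size_t gauss_jordan_shared_bytes(int n_parameters_to_fit) {
    int pow2 = 1;
    while (pow2 < n_parameters_to_fit) {
        pow2 *= 2;
    }
    std::size_t const matrix_elems =
        static_cast<std::size_t>(n_parameters_to_fit + 1) * n_parameters_to_fit;
    return sizeof(float) * (matrix_elems + 2 * static_cast<std::size_t>(pow2));
}

// True if the shape stays within the assumed per-block thread limit.
inline bool fits_in_block(LaunchShape const& shape) {
    return shape.threads_per_block() <= kMaxThreadsPerBlock;
}
```

Under these assumptions, `gauss_jordan_shape(31, n_fits)` asks for 32 × 31 = 992 threads per block, while `gauss_jordan_shape(32, n_fits)` asks for 33 × 32 = 1056, which exceeds 1024. This covers only the constraints visible in `solve_equation_system()`; other kernels may impose limits of their own.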


superchromix · 2018-03-16T09:03:49Z

We will have a fix for models using >32 parameters very soon.

lukkio88 · 2018-03-16T10:44:34Z

Thank you.

superchromix · 2018-03-29T07:55:49Z

The previous limitation on the number of model parameters has been removed by changing the way in which the calculation of the Hessian is parallelized. The new limit is 1024 parameters.

superchromix closed this as completed Mar 29, 2018
