
add cuda generator #26786

Merged · 17 commits into PaddlePaddle:develop on Sep 4, 2020

Conversation

yaoxuefeng6 (Contributor):

PR types

Function optimization

PR changes

Others

Describe

add cuda generator

@paddle-bot-old:

Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Aug 31, 2020
@PaddlePaddle PaddlePaddle unlocked this conversation Aug 31, 2020
Comment on lines 34 to 38
static int64_t num_cuda_devices = -1;
static std::once_flag num_devices_init_flag;
static std::deque<std::once_flag> cuda_device_flags;

static std::vector<std::shared_ptr<Generator>> default_cuda_generators;

Contributor:
Try to avoid global variables that are not of POD Type. You can use static local variables initialized inside functions instead.
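
For illustration, a minimal sketch of the suggested pattern (a function-local static instead of a non-POD global); the Generator placeholder here is only a stand-in for the PR's real class, not the actual Paddle type:

#include <cstdint>
#include <memory>
#include <vector>

struct Generator {  // placeholder; the PR's Generator carries much more state
  uint64_t seed = 0;
};

// The vector is constructed lazily (and thread-safely since C++11) on first
// call, avoiding the static-initialization-order issues of a non-POD global.
static std::vector<std::shared_ptr<Generator>>& DefaultCUDAGenerators() {
  static std::vector<std::shared_ptr<Generator>> generators;
  return generators;
}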

Contributor (Author):
fixed

#endif
}

const std::shared_ptr<Generator>& getDefaultCUDAGenerator(int64_t device_id) {

Contributor:
getDefaultCUDAGenerator -> GetDefaultCUDAGenerator

Contributor (Author):
Done

@@ -38,6 +38,7 @@ static uint64_t GetRandomSeed() {
struct GeneratorState {
int64_t device = -1;
uint64_t current_seed = 34342423252;
uint64_t thread_offset;

Contributor:
Set the default value of thread_offset.

Contributor (Author):
done

this->engine_ = engine;
VLOG(4) << "initial seed: " << this->state_.current_seed
<< ", cpu engine: " << &this->state_.cpu_engine;
this->is_init_py_ = true; // TODO(zhiqiu): remove it in future
}
explicit Generator(uint64_t seed, uint64_t device_id) {

Contributor:
explicit is not necessary for a constructor that has more than one argument, is it?

Contributor (Author):
yes

Comment on lines 189 to 193
int64_t device_id = -1;
auto gen_cuda = framework::getDefaultCUDAGenerator(device_id);
if (gen_cuda->GetIsInitPy() && (context.Attr<bool>("fix_seed"))) {
std::cout << ">>>>>>>>CUDA DROPOUT GENERATOR" << std::endl;
auto seed_offset = gen_cuda->IncrementOffset(1);

Contributor:
Can we use a common function to abstract this logic?
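
A hedged sketch of what such a helper might look like; the Generator stub below only mimics the GetIsInitPy/IncrementOffset members visible in this PR, and GetSeedAndOffset itself is a hypothetical name, not an existing Paddle function:

#include <cstdint>
#include <memory>
#include <utility>

// Minimal stand-in for the PR's Generator, just enough to compile the sketch.
struct Generator {
  bool GetIsInitPy() const { return is_init_py_; }
  std::pair<uint64_t, uint64_t> IncrementOffset(uint64_t n) {
    uint64_t old = offset_;
    offset_ += n;
    return {seed_, old};  // {seed, offset at which this launch starts}
  }
  uint64_t seed_ = 34342423252ULL;
  uint64_t offset_ = 0;
  bool is_init_py_ = false;
};

// Hypothetical helper centralizing the logic the dropout/bernoulli kernels
// currently repeat: use the CUDA generator's (seed, offset) when it was
// initialized from Python, otherwise fall back to the op's own seed.
std::pair<uint64_t, uint64_t> GetSeedAndOffset(
    const std::shared_ptr<Generator>& gen_cuda, bool use_cuda_generator,
    uint64_t fallback_seed, uint64_t increment) {
  if (gen_cuda && gen_cuda->GetIsInitPy() && use_cuda_generator) {
    return gen_cuda->IncrementOffset(increment);
  }
  return {fallback_seed, 0};
}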

return state_list


def set_cuda_state(state_list):

Contributor:
Same as above.

core.default_cpu_generator()._is_init_py = True
return core.default_cpu_generator().manual_seed(seed)


def get_cuda_state():

Contributor:
If it is a public API, please add some docs.

Contributor (Author):
Added

Comment on lines 61 to 73
int64_t device_id = -1;
auto gen_cuda = framework::getDefaultCUDAGenerator(device_id);
if (gen_cuda->GetIsInitPy() && seed_flag) {
std::cout << ">>>>>>>>CUDA bernoulli GENERATOR" << std::endl;
// auto seed_offset = gen_cuda->IncrementOffset(1);
auto seed_gen = static_cast<int>(gen_cuda->GetCurrentSeed());
// int offset_step = 0;
// NOTE(xuefeng): Currently, we let offset step fixed to avoid
// unexpected results which may cause ut fail.
// we will fix this in future.
// int gen_offset = offset_step * seed_offset.second;
trans(*context, index_sequence_begin, index_sequence_begin + size,
in_data, out_data, BernoulliCudaFunctor<T>(seed_gen));

Contributor:
Why only use seed instead of the random cuda engine?

Contributor (Author):
Fixed. This was debug code.
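
For context, a sketch of how a (seed, offset) pair from IncrementOffset is typically consumed on the device, rather than the seed alone; this uses the standard cuRAND Philox API, and the kernel itself is illustrative, not the PR's final bernoulli implementation:

#include <cstdint>
#include <curand_kernel.h>

// Each thread seeds a Philox state with the host-provided seed and offset, so
// successive launches (with an advancing offset) draw from disjoint parts of
// the same random stream instead of replaying values from a fixed seed.
__global__ void BernoulliKernel(const float* probs, float* out, int n,
                                uint64_t seed, uint64_t offset) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx >= n) return;
  curandStatePhilox4_32_10_t state;
  curand_init(seed, static_cast<uint64_t>(idx), offset, &state);
  out[idx] = curand_uniform(&state) < probs[idx] ? 1.0f : 0.0f;
}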

for i in range(core.get_cuda_device_count()):
core.default_cuda_generator(i).set_state(state_list[i])


def _manual_program_seed(seed):

Contributor:
Can we delete _manual_program_seed after the CUDA generator is implemented, and remove the usage in unit tests?

Contributor (Author):
We can do this in the next PR.

@jzhang533 (Contributor) left a comment:
get_cuda_state / set_cuda_state
Looking only at these two API names, might users mistakenly think they get/set every kind of CUDA state, rather than thinking of random numbers?

Comment on lines 90 to 93
#print("x: {}".format(x_np))
#print("x1: {}".format(x1_np))
#print("x2: {}".format(x2_np))
#print("x3: {}".format(x3_np))

Contributor:
Remove unused code.

@@ -77,7 +77,7 @@ struct Generator {
this->engine_ = engine;
VLOG(4) << "initial seed: " << this->state_.current_seed
<< ", cpu engine: " << &this->state_.cpu_engine;
this->is_init_py_ = true; // TODO(zhiqiu): remove it in future
this->is_init_py_ = false; // TODO(zhiqiu): remove it in future

Contributor:
Why is the default false?

auto gen_cuda = framework::getDefaultCUDAGenerator(device_id);
if (gen_cuda->GetIsInitPy() && (context.Attr<bool>("fix_seed"))) {
std::cout << ">>>>>>>>CUDA DROPOUT GENERATOR" << std::endl;
auto gen_cuda = framework::GetDefaultCUDAGenerator(-1);

Contributor:
Why is the device_id -1 here? I think it should be obtained from the context.
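
In GitHub-suggestion style, a fragment of what the reviewer may have in mind (assuming, as in other Paddle op kernels, that the ExecutionContext exposes GetPlace(), and reusing the BOOST_GET_CONST pattern from the snippet below):

// Sketch only: derive the device id from the kernel's execution context
// instead of hard-coding -1; assumes the kernel is running on a CUDAPlace.
const auto& place = context.GetPlace();
int64_t device_id = BOOST_GET_CONST(platform::CUDAPlace, place).GetDeviceId();
auto gen_cuda = framework::GetDefaultCUDAGenerator(device_id);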

num_cuda_devices = paddle::platform::GetCUDADeviceCount();
cuda_device_flags.resize(num_cuda_devices);
default_cuda_generators.resize(num_cuda_devices);
});
platform::Place place;
if (device_id == -1)
device_id = BOOST_GET_CONST(platform::CUDAPlace, place).GetDeviceId();

Contributor:
In the case where the originally given device_id is -1, what will its value be after this?

.. code-block:: python

import paddle
gen = paddle.manual_seed(102)

Contributor:
This line seems unused.

Comment on lines 63 to 65
} // end Generator
} // end namespace pybind
} // namespace paddle

Contributor:
This may not be introduced by your PR, but it seems the namespace-ending comments are not in a consistent style.
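
For illustration, one consistent closing-comment style (assuming the Google-style convention the codebase appears to follow) would be:

namespace paddle {
namespace pybind {
// ... definitions ...
}  // namespace pybind
}  // namespace paddle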

@zhiqiu (Contributor) commented Sep 3, 2020:

get_cuda_state / set_cuda_state
Looking only at these two API names, might users mistakenly think they get/set every kind of CUDA state, rather than thinking of random numbers?

Agreed. I suggest paddle.set_cuda_rng_state/get_cuda_rng_state, or paddle.cuda.set_rng_state/paddle.cuda.get_rng_state.

@XiaoguangHu01 (Contributor) left a comment:
LGTM

@jzhang533 (Contributor) left a comment:
lgtm

@yaoxuefeng6 merged commit 7f3e6ca into PaddlePaddle:develop on Sep 4, 2020
yaoxuefeng6 added a commit to yaoxuefeng6/Paddle that referenced this pull request Sep 4, 2020
yaoxuefeng6 added a commit that referenced this pull request Sep 4, 2020