
Add a lookup table op and a CUDA helper. #3620

Merged: 11 commits merged into PaddlePaddle:develop on Aug 24, 2017

Conversation

qingqing01 (Contributor) commented Aug 22, 2017

Fix #3621

  1. Add a lookup table operator, which is used as the embedding layer in RNN networks.

    • For now the table is represented by a dense tensor rather than a SparseRow tensor; the TableProjection in Paddle also uses a dense matrix to represent the table by default.
    • An optimized lookup table operator based on SparseRow will be supported in the future.
  2. Add some CUDA helpers.
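The forward pass of a lookup table (embedding) op over a dense table is essentially a row gather: output row i is a copy of table row ids[i]. A minimal host-side sketch of that computation (names are hypothetical; this is not Paddle's actual kernel):

```cpp
#include <cstddef>
#include <vector>

// Sketch of a lookup-table (embedding) forward pass over a dense table:
// output row i is a copy of table row ids[i]. Illustrative only.
std::vector<float> LookupTable(const std::vector<std::vector<float>>& table,
                               const std::vector<int>& ids) {
  const std::size_t width = table.empty() ? 0 : table[0].size();
  std::vector<float> output;
  output.reserve(ids.size() * width);
  for (int id : ids) {
    const auto& row = table[id];  // gather one embedding row per id
    output.insert(output.end(), row.begin(), row.end());
  }
  return output;
}
```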

@@ -55,5 +55,6 @@ cc_library(paddle_pybind SHARED
recurrent_op
uniform_random_op
gaussian_random_op
lookup_table_op
Collaborator

The lookup_table_op and other ops defined here in framework/CMakeLists.txt are targets defined in operators/CMakeLists.txt. This implies a cyclic dependency --
framework and operators depend on each other. Can we remove this cyclic dependency?

Collaborator

It seems that the solution is to move target pybind into a third package, say, paddle/pybind, in parallel with paddle/framework and paddle/operators.
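After such a move, the build layout might look roughly like this (a sketch only; cc_library is Paddle's CMake helper, and the dependency list is abbreviated from the diff above):

```cmake
# Hypothetical paddle/pybind/CMakeLists.txt after moving the pybind target
# out of paddle/framework (not the actual change):
cc_library(paddle_pybind SHARED
    SRCS pybind.cc
    DEPS recurrent_op
         uniform_random_op
         gaussian_random_op
         lookup_table_op)
```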

Contributor Author

I will do it.

namespace functor {

template <typename T>
struct Set<platform::CPUPlace, T> {
Collaborator

The name Set here is a little confusing: it could refer either to a data container or to the assignment operation. If it is the latter, then since a class name should be a noun, we might want Setter instead of Set.

Collaborator

Set is not needed; use Eigen::Vector's operator= instead.

Contributor Author

@reyoung Done.

@@ -0,0 +1,5 @@
if(WITH_GPU)
Collaborator
@wangkuiyi wangkuiyi Aug 22, 2017

I see that we are creating a subdirectory and a namespace to hold a new class Set. Do we have a plan to add a lot more functors in the near future? If so, I think it is OK; otherwise, creating a subdirectory looks like overkill.

Contributor Author
@qingqing01 qingqing01 Aug 23, 2017

Now we have a paddle/operators/math subdirectory, but it uses plain functions rather than functors (function objects). I notice that TensorFlow and WarpCTC use functors to write some common math functions, and functors are also used in Paddle's function subdirectory. Functors have some advantages: they can carry state, and compared with plain functions they are easier to specialize for different data types (discussed with @hedaoyuan). We can see a lot of duplicated code between operators/math/math_function.cc and operators/math/math_function.cu for the single-precision float and double types.

So maybe it is better to move math/math_function to functor/math_functor.
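The specialization argument above can be sketched in a few lines: a functor template can be partially specialized per place (and per type) and can carry state, which a plain template function cannot. The names below (CPUPlace, Scale) are hypothetical, not Paddle's actual code:

```cpp
#include <vector>

struct CPUPlace {};  // stand-in for a platform place tag like platform::CPUPlace

// Primary template; each place gets its own partial specialization, which is
// how per-device implementations avoid duplicating the dispatch boilerplate.
template <typename Place, typename T>
struct Scale;

template <typename T>
struct Scale<CPUPlace, T> {
  explicit Scale(T alpha) : alpha_(alpha) {}  // state lives in the functor
  void operator()(std::vector<T>& data) const {
    for (auto& v : data) v *= alpha_;
  }
  T alpha_;
};
```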

Member

We have some internal operations on tensors in our code. These currently live in the paddle/operators/math directory as global functions. If functors make the code cleaner, we can use functors instead of functions (just like TensorFlow does).

@@ -0,0 +1,55 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Collaborator

cuda_helper.h => cuda.h ?

Contributor Author

CUDA itself already ships a cuda.h header. In Caffe2 the file with a similar purpose is called common_gpu.h, and in TensorFlow it is called cuda_kernel_helper.h.


namespace paddle {
namespace platform {

Collaborator

I don't think this header is necessary.

With CUDA's Thrust one can easily write an element-wise operator, and Thrust is a header-only library provided by the CUDA toolkit.

Contributor Author

This header defines atomicAdd for the float and double types. The code for the double type is a little complex, so I think we need a common file for basic, commonly used CUDA utilities.
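A native atomicAdd for double only exists on newer GPU architectures (compute capability 6.0+); the usual emulation retries an atomicCAS on the 64-bit bit pattern of the double. Below is a host-side C++ sketch of that compare-and-swap loop, with std::atomic standing in for CUDA's atomicCAS (illustrative only, not the header's actual code):

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Host-side sketch of the CAS loop used to emulate atomicAdd(double*, double)
// on pre-sm_60 GPUs; there, std::atomic is replaced by atomicCAS on the raw
// 64-bit representation of the double.
inline double atomic_add_double(std::atomic<std::uint64_t>* addr, double val) {
  std::uint64_t old_bits = addr->load();
  std::uint64_t new_bits;
  double old_val;
  do {
    std::memcpy(&old_val, &old_bits, sizeof(double));  // reinterpret the bits
    double new_val = old_val + val;
    std::memcpy(&new_bits, &new_val, sizeof(double));
    // On failure, old_bits is refreshed with the current value and we retry.
  } while (!addr->compare_exchange_weak(old_bits, new_bits));
  return old_val;  // like CUDA atomics, return the previous value
}
```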

Contributor Author

@reyoung Removed the Set functor and removed CUDA_1D_KERNEL_LOOP.

if (alpha == static_cast<T>(0)) {
memset(YData, 0, N * sizeof(T));
} else {
framework::EigenVector<T, Eigen::RowMajor, Eigen::DenseIndex>::Flatten(*Y)
Collaborator

It is fine to always use Eigen's setConstant in the operator kernel header.

memset is not always faster than std::copy or another carefully implemented setConstant or copy method, so there is no need to special-case zero.

Contributor Author

Ok, I'll modify the code.

Member
@QiJune QiJune Aug 23, 2017

Please refer to https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/fill_zeros_like_op.h#L29

This works for both CPU and GPU:

t.device(EigenDevice) = t.constant(T(scalar));

And Set can be a function in operators/math_function.h.

Contributor Author

@reyoung @QiJune Done. Thanks!

@qingqing01 qingqing01 merged commit 3663bd8 into PaddlePaddle:develop Aug 24, 2017
@qingqing01 qingqing01 deleted the lookup_table branch March 7, 2018 12:03
heavengate pushed a commit to heavengate/Paddle that referenced this pull request Aug 16, 2021