Add a lookup table op and a CUDA helper. #3620

qingqing01 · 2017-08-22T16:14:13Z

Add a lookup table operator, which is used as the embedding layer in RNN networks.
- For now, the table is represented as a dense tensor rather than a SparseRow tensor. The TableProjection in Paddle also uses a dense matrix for the table by default.
- An optimized lookup table operator based on SparseRow will be supported in the future.
Add some CUDA helpers.
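
For readers unfamiliar with the operator, a dense lookup table can be sketched as a row gather in the forward pass and a row scatter-add in the backward pass. The code below is a minimal CPU illustration only, not the kernel added in this PR; the function names LookupForward and LookupBackward and the flat row-major std::vector layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Forward: copy row ids[i] of a dense row-major table into row i of the output.
// `table` holds vocab_size * width values. (Hypothetical helper, not Paddle API.)
std::vector<float> LookupForward(const std::vector<float>& table,
                                 std::size_t width,
                                 const std::vector<std::int32_t>& ids) {
  assert(width > 0 && table.size() % width == 0);
  const std::size_t vocab_size = table.size() / width;
  std::vector<float> out(ids.size() * width);
  for (std::size_t i = 0; i < ids.size(); ++i) {
    assert(ids[i] >= 0 && static_cast<std::size_t>(ids[i]) < vocab_size);
    const std::size_t row = static_cast<std::size_t>(ids[i]);
    for (std::size_t j = 0; j < width; ++j) {
      out[i * width + j] = table[row * width + j];
    }
  }
  return out;
}

// Backward: add each output-gradient row into the table-gradient row selected
// by its id. Repeated ids accumulate, which is why a parallel GPU version
// needs atomic additions. (Hypothetical helper, not Paddle API.)
std::vector<float> LookupBackward(std::size_t vocab_size, std::size_t width,
                                  const std::vector<std::int32_t>& ids,
                                  const std::vector<float>& out_grad) {
  assert(out_grad.size() == ids.size() * width);
  std::vector<float> table_grad(vocab_size * width, 0.0f);
  for (std::size_t i = 0; i < ids.size(); ++i) {
    assert(ids[i] >= 0 && static_cast<std::size_t>(ids[i]) < vocab_size);
    const std::size_t row = static_cast<std::size_t>(ids[i]);
    for (std::size_t j = 0; j < width; ++j) {
      table_grad[row * width + j] += out_grad[i * width + j];
    }
  }
  return table_grad;
}
```

With a dense table the gradient has the full vocab_size * width shape even when only a few rows are touched, which is the cost a SparseRow representation would avoid.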

1. Finish lookup table CPU and GPU kernels 2. Add some CUDA helpers 3. Add some math functors

… lookup_table

wangkuiyi · 2017-08-22T17:32:02Z

paddle/framework/CMakeLists.txt

@@ -55,5 +55,6 @@ cc_library(paddle_pybind SHARED
    recurrent_op
    uniform_random_op
    gaussian_random_op
+    lookup_table_op


The lookup_table_op and the other ops listed here in framework/CMakeLists.txt are targets defined in operators/CMakeLists.txt. This implies a cyclic dependency: framework and operators depend on each other. Can we remove this cyclic dependency?

It seems that the solution is to move the pybind target into a third package, say paddle/pybind, alongside paddle/framework and paddle/operators.

I will do it.

wangkuiyi · 2017-08-22T17:36:27Z

paddle/operators/functor/math_functor.cc

+namespace functor {
+
+template <typename T>
+struct Set<platform::CPUPlace, T> {


The name Set here is a little confusing: it could refer to a data container or to the assignment operation. If the latter is meant, then since a class name should be a noun, we might want Setter instead of Set.

Set is not needed. Use Eigen::Vector's operator= instead.

@reyoung Done.

wangkuiyi · 2017-08-22T17:37:22Z

paddle/operators/functor/CMakeLists.txt

@@ -0,0 +1,5 @@
+if(WITH_GPU)


I see that we are creating a subdirectory and a namespace to hold a new class, Set. Do we plan to add many more functors in the near future? If so, I think it is OK; otherwise, creating a subdirectory looks like overkill.

We already have the subdirectory paddle/operators/math, but it uses plain functions rather than functors (function objects). I notice that TensorFlow and WarpCTC use functors for common math functions, and functors are also used in Paddle's function subdirectory. Functors have some advantages: they can hold state, and they are more convenient than plain functions to specialize for different data types (discussed with @hedaoyuan). There is a lot of duplicated code in operators/math/math_function.cc and operators/math/math_function.cu for the float and double types.

So maybe it's better to move math/math_function to functor/math_functor.

We have some internal operations on tensors in our code. They currently live in the paddle/operators/math directory as global functions. If functors make the code cleaner, we can use functors instead of functions (just as TensorFlow does).
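
To make the functor argument concrete, here is a minimal sketch of the two advantages mentioned above. A class template can be partially specialized on a device tag while staying generic in the element type, which a function template cannot do, and a functor can carry state between calls. The names CpuTag, FakeGpuTag, and SetConstant are hypothetical stand-ins chosen for the example; they are not the types in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device tags, standing in for the real place types.
struct CpuTag {};
struct FakeGpuTag {};

// Primary template: declared only. Each device supplies a partial
// specialization that still works for every element type T.
template <typename Place, typename T>
struct SetConstant;

// Stateless specialization for the CPU tag.
template <typename T>
struct SetConstant<CpuTag, T> {
  void operator()(std::vector<T>* data, T value) const {
    assert(data != nullptr);
    for (std::size_t i = 0; i < data->size(); ++i) (*data)[i] = value;
  }
};

// Stateful specialization: it counts how many elements it has written.
// It runs on the host here; the tag only illustrates the dispatch.
template <typename T>
struct SetConstant<FakeGpuTag, T> {
  std::size_t written = 0;
  void operator()(std::vector<T>* data, T value) {
    assert(data != nullptr);
    for (std::size_t i = 0; i < data->size(); ++i) (*data)[i] = value;
    written += data->size();
  }
};
```

A single SetConstant<CpuTag, T> definition covers float and double, which is the kind of duplication the separate float and double functions would otherwise require.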

wangkuiyi · 2017-08-22T17:37:56Z

paddle/platform/cuda_helper.h

@@ -0,0 +1,55 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.


cuda_helper.h => cuda.h ?

CUDA itself also has a cuda.h header. In Caffe2, the file with a similar role is called common_gpu.h, and in TensorFlow it is called cuda_kernel_helper.h.

reyoung · 2017-08-23T02:30:22Z

paddle/platform/cuda_helper.h

+
+namespace paddle {
+namespace platform {
+


I don't think this header is necessary.

With Thrust, it is easy to write an element-wise operator, and Thrust is a header-only library shipped with the CUDA toolkit.

This header defines atomicAdd for the float and double types. The code for the double type is a little complex. I think we need a common file for basic, commonly used CUDA utilities.

@reyoung Removed the Set functor and CUDA_1D_KERNEL_LOOP.
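
As background on why the double case is "a little complex": CUDA devices below compute capability 6.0 have no native atomicAdd for double, so it is usually emulated with a compare-and-swap loop over the 64-bit pattern of the value. The sketch below shows the same loop on the host with std::atomic, purely as an analogy. It is not the code in cuda_helper.h, and the names AtomicAddDouble, ToBits, and FromBits are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "the sketch assumes a 64-bit double");

// Reinterpret a double as its raw 64-bit pattern and back (hypothetical helpers).
std::uint64_t ToBits(double v) {
  std::uint64_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  return bits;
}

double FromBits(std::uint64_t bits) {
  double v;
  std::memcpy(&v, &bits, sizeof(v));
  return v;
}

// Add `value` to the double stored as raw bits in `*cell` and return the
// previous value. The loop retries until no other writer changed the cell
// between the read and the compare-and-swap.
double AtomicAddDouble(std::atomic<std::uint64_t>* cell, double value) {
  assert(cell != nullptr);
  std::uint64_t old_bits = cell->load();
  for (;;) {
    const double old_value = FromBits(old_bits);
    const std::uint64_t new_bits = ToBits(old_value + value);
    // On failure, compare_exchange_weak reloads old_bits with the current
    // contents of the cell, so the next iteration recomputes the sum.
    if (cell->compare_exchange_weak(old_bits, new_bits)) return old_value;
  }
}
```

In a CUDA kernel the same structure would use atomicCAS on an unsigned long long view of the address; keeping such a helper in one shared header avoids repeating the loop in every kernel that accumulates gradients.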

reyoung · 2017-08-23T02:33:28Z

paddle/operators/functor/math_functor.cc

+    if (alpha == static_cast<T>(0)) {
+      memset(YData, 0, N * sizeof(T));
+    } else {
+      framework::EigenVector<T, Eigen::RowMajor, Eigen::DenseIndex>::Flatten(*Y)


It is OK to always use Eigen's setConstant in the header of the operator kernel.

memset is not always faster than std::copy or another carefully implemented setConstant or copy method, so there is no need to handle zero specially.

Ok, I'll modify the code.

Please refer to https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/fill_zeros_like_op.h#L29

This works for both CPU and GPU:

t.device(EigenDevice) = t.constant(T(scalar));

Set can then be a function in operators/math_function.h

@reyoung @QiJune Done. Thanks!
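
The point about not special-casing zero can be illustrated without Eigen. The sketch below uses std::fill as a stand-in for a generic constant fill, with one code path for every value; the name FillConstant is hypothetical and this is not the code that was merged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Fill n elements with `value` through a single code path. There is no
// `value == 0` branch that falls back to memset: memset writes a byte
// pattern, so it can only express a few constants such as zero, while
// std::fill handles any value and is typically well optimized.
template <typename T>
void FillConstant(T* data, std::size_t n, T value) {
  assert(data != nullptr || n == 0);
  std::fill(data, data + n, value);
}
```

Calling FillConstant(y.data(), y.size(), 0.0f) and FillConstant(y.data(), y.size(), 1.5f) goes through the same loop, so the kernel needs no extra branch for the zero case.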


qingqing01 added 6 commits August 22, 2017 16:35

lookup table op, cuda helper and set functor

0f3b9e4

1. Finish lookup table CPU and GPU kernels 2. Add some CUDA helpers 3. Add some math functors

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

1795e57

… lookup_table

fix compile for paddle_pybind.

c91e542

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

31f59d2

… lookup_table

fix cuda_helper.h

9bc1a1a

fix bug.

a8d072c

wangkuiyi reviewed Aug 22, 2017


wangkuiyi mentioned this pull request Aug 22, 2017

Move pybind from package paddle/framework into paddle/pybind #3622

Closed

reyoung reviewed Aug 23, 2017


qingqing01 added 5 commits August 23, 2017 14:39

Remove set functor and add compare_grad test

f188e22

resolve conflicts

d8ea560

Merge branch 'develop' into lookup_table

fe480b9

Resolve conflicts.

068ddca

Merge branch 'develop' into lookup_table

aafeff0

wangkuiyi approved these changes Aug 24, 2017


qingqing01 merged commit 3663bd8 into PaddlePaddle:develop Aug 24, 2017

qingqing01 deleted the lookup_table branch March 7, 2018 12:03

heavengate pushed a commit to heavengate/Paddle that referenced this pull request Aug 16, 2021

Add GFL model and PicoDet (PaddlePaddle#3620)

fceab30

* add gfl model and PicoDet
