
Fixing documentation for a few more operators #5374

Merged: 6 commits, Nov 5, 2017
15 changes: 9 additions & 6 deletions paddle/operators/smooth_l1_loss_op.cc
@@ -77,14 +77,17 @@ class SmoothL1LossOpMaker : public framework::OpProtoAndCheckerMaker {
"A float scalar with default value 3.0.")
.SetDefault(3.0);
AddComment(R"DOC(
-Compute smooth l1 loss for input and target. The operator take the 1st
-dimension of input as batch size. For each instance, it will compute
-smooth l1 loss element by element first and sum all losses to one value.
-So the output shape is [batch_size, 1].
+Smooth L1 Loss Operator.
+
+This operator computes the smooth l1 loss for input and target.
+The operator takes the first dimension of input as the batch size.
+For each instance, it computes the smooth l1 loss element by element first
+and then sums all the losses. So the resulting output shape
+is [batch_size, 1].

The equation is:
-loss = 0.5 * (sigma * (x-y))^2 if abs(x - y) < 1 / sigma^2
-abs(x - y) - 0.5 / sigma^2 otherwise
+loss = $$0.5 * (\sigma * (x - y))^2$$ if $$|x - y| < \frac{1}{\sigma^2}$$
+       $$|x - y| - \frac{0.5}{\sigma^2}$$ otherwise

)DOC");
}
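As a rough reference for the formula above, here is a minimal numpy sketch of the smooth L1 loss (an illustration of the math only, not Paddle's C++ kernel); the function name smooth_l1_loss is made up here, and the default sigma=3.0 mirrors the attribute described in the doc.

import numpy as np

def smooth_l1_loss(x, y, sigma=3.0):
    # Element-wise absolute difference between input and target.
    diff = np.abs(x - y)
    # Quadratic branch inside the 1/sigma^2 threshold, linear branch outside.
    elementwise = np.where(diff < 1.0 / sigma**2,
                           0.5 * (sigma * diff)**2,
                           diff - 0.5 / sigma**2)
    # Sum the per-element losses for each instance: output is [batch_size, 1].
    return elementwise.reshape(x.shape[0], -1).sum(axis=1, keepdims=True)

x = np.random.rand(4, 5).astype(np.float32)
y = np.random.rand(4, 5).astype(np.float32)
print(smooth_l1_loss(x, y).shape)  # (4, 1)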
17 changes: 10 additions & 7 deletions paddle/operators/softmax_op.cc
@@ -44,20 +44,23 @@ class SoftmaxOpMaker : public framework::OpProtoAndCheckerMaker {
"2-D with shape [batch_size, input_feature_dimensions].");
AddOutput("Y", "The normalized values with the same shape as X.");
AddComment(R"DOC(
-The input of softmax operator is a 2-D tensor with shape N x K (N is the
+Softmax Operator.
+
+The input of the softmax operator is a 2-D tensor with shape N x K (N is the
batch_size, K is the dimension of input feature). The output tensor has the
same shape as the input tensor.

For each row of the input tensor, the softmax operator squashes the
K-dimensional vector of arbitrary real values to a K-dimensional vector of real
-values in the range [0, 1] that add up to 1. Specifically, it computes the
-exponential of the given dimension and the sum of exponential values of all
-the other dimensions in the K-dimensional vector input. Then the ratio of the
-exponential of the given dimension and the sum of exponential values of all
-the other dimensions is the output of the softmax operator.
+values in the range [0, 1] that add up to 1.
+It computes the exponential of the given dimension and the sum of exponential
+values of all the other dimensions in the K-dimensional vector input.
+Then the ratio of the exponential of the given dimension and the sum of
+exponential values of all the other dimensions is the output of the softmax
+operator.

For each row `i` and each column `j` in input X, we have:
-Y[i, j] = exp(X[i, j]) / sum_j(exp(X[i, j]))
+$$Y[i, j] = \frac{\exp(X[i, j])}{\sum_j \exp(X[i, j])}$$

)DOC");
}
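The ratio described above fits in a few lines of numpy (again an illustration of the math, not the operator's implementation); subtracting the row maximum is a standard numerical-stability trick and an assumption here, not something the doc specifies.

import numpy as np

def softmax(x):
    # Shift each row by its max for numerical stability; the ratio of
    # exponentials is unchanged by a constant shift.
    e = np.exp(x - x.max(axis=1, keepdims=True))
    # Divide each exponential by its row sum, so every row adds up to 1.
    return e / e.sum(axis=1, keepdims=True)

X = np.random.rand(4, 10)   # [batch_size, input_feature_dimensions]
Y = softmax(X)
print(Y.sum(axis=1))        # each row sums to 1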
30 changes: 16 additions & 14 deletions paddle/operators/softmax_with_cross_entropy_op.cc
@@ -51,32 +51,34 @@ class SoftmaxWithCrossEntropyOpMaker
"the given labels as soft labels.")
.SetDefault(false);
AddComment(R"DOC(
-Cross entropy loss with softmax are used as the output layer extensively. This
+Softmax With Cross Entropy Operator.
+
+Cross entropy loss with softmax is used as the output layer extensively. This
operator computes the softmax normalized values for each row of the input
-tensor, after which cross-entropy loss is then computed. This provides a more
+tensor, after which cross-entropy loss is computed. This provides a more
numerically stable gradient.

-Because this operators performs a softmax on logits internally, it expects
-unscaled logits. Please do not call this op with the output of softmax operator,
-which will produce incorrect results.
+Because this operator performs a softmax on logits internally, it expects
+unscaled logits. This operator should not be used with the output of the
+softmax operator, since that would produce incorrect results.

When the attribute softLabel is set to false, this operator expects mutually
-exclusive hard labels, each sample in a batch is in exactly one class with
-probabilities 1. Each sample in the batch with one and only one label.
+exclusive hard labels, each sample in a batch is in exactly one class with a
+probability of 1.0. Each sample in the batch will have a single label.

-Equation:
+The equation is as follows:

-1) hard label (one-hot label)
+1) Hard label (one-hot label, so every sample has exactly one class)

-Loss_j = \f$ -\text{Logit}_{Label_j} +
+$$Loss_j = -\text{Logit}_{Label_j} +
\log\left(\sum_{i=0}^{K}\exp(\text{Logit}_i)\right),
-j = 1, ..., K $\f
+j = 1, ..., K$$

-2) soft label (a distribution over all classes)
+2) Soft label (each sample can have a distribution over all classes)

-Loss_j = \f$ -\sum_{i=0}^{K}\text{Label}_i\left(\text{Logit}_i -
+$$Loss_j = -\sum_{i=0}^{K}\text{Label}_i\left(\text{Logit}_i -
\log\left(\sum_{i=0}^{K}\exp(\text{Logit}_i)\right)\right),
-j = 1,...,K $\f
+j = 1, ..., K$$

)DOC");
}
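To make the two label modes above concrete, here is a small numpy sketch of both equations (illustrative only; the helper names log_softmax, xent_hard, and xent_soft are made up here, and the log-sum-exp shift is the standard stability trick the doc alludes to).

import numpy as np

def log_softmax(logits):
    # Stable log(softmax): shift by the row max before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def xent_hard(logits, labels):
    # Hard labels (softLabel = false): one integer class index per sample.
    # Loss_j = -Logit_{Label_j} + log(sum_i exp(Logit_i))
    rows = np.arange(logits.shape[0])
    return -log_softmax(logits)[rows, labels].reshape(-1, 1)

def xent_soft(logits, label_dist):
    # Soft labels (softLabel = true): a per-sample distribution over classes.
    # Loss_j = -sum_i Label_i * (Logit_i - log(sum_k exp(Logit_k)))
    return -(label_dist * log_softmax(logits)).sum(axis=1, keepdims=True)

Note that both helpers take unscaled logits, which is exactly why the operator warns against feeding it the output of the softmax operator.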
11 changes: 7 additions & 4 deletions paddle/operators/transpose_op.cc
@@ -32,7 +32,7 @@ class TransposeOp : public framework::OperatorWithKernel {
size_t axis_size = axis.size();

PADDLE_ENFORCE_EQ(x_rank, axis_size,
"the input tensor's rank(%d) "
"The input tensor's rank(%d) "
"should be equal to the axis's size(%d)",
x_rank, axis_size);

@@ -64,12 +64,14 @@ class TransposeOpMaker : public framework::OpProtoAndCheckerMaker {
AddOutput("Out", "(Tensor)The output tensor");
AddAttr<std::vector<int>>(
"axis",
"(vector<int>)a list of values, and the size of the list should be "
"(vector<int>)A list of values, and the size of the list should be "
"the same with the input tensor rank, the tensor will "
"permute the axes according the the values given");
AddComment(R"DOC(
-The Tensor will be permuted according to the axis values given.
-The op is very much like the numpy.transpose function in python
+Transpose Operator.
+
+The input tensor will be permuted according to the axis values given.
+The op functions similarly to how numpy.transpose works in Python.
For example:
>> input = numpy.arange(6).reshape((2,3))
>> input
@@ -83,6 +85,7 @@ For example:
[2, 5]])
So, given an input tensor of shape (N, C, H, W) and the axis {0, 2, 3, 1},
the output tensor shape will be (N, H, W, C).

)DOC");
}
};
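Completing the numpy analogy from the doc above, the (N, C, H, W) to (N, H, W, C) case looks like this (an illustration with numpy, not the Paddle op itself):

import numpy as np

x = np.arange(24).reshape(2, 3, 2, 2)      # shape (N, C, H, W)
out = np.transpose(x, axes=(0, 2, 3, 1))   # axis = {0, 2, 3, 1}
print(out.shape)                           # (2, 2, 2, 3), i.e. (N, H, W, C)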