[Quantization] Load quantized resnet50 model #2016
Conversation
lib/Graph/Graph.cpp
Outdated
llvm::ArrayRef<unsigned_t> strides,
llvm::ArrayRef<unsigned_t> pads) {
  ShapeNHWC idim = ShapeNHWC(input.dims());
  (void)idim;
lib/Graph/VerifierHelper.cpp
Outdated
bool ret = expectCompareTrue("Mismatching isQuantized", A->isQuantizedType(),
                             B->isQuantizedType(), parent);
if (!ret) {
  printf("now quantization has issue\n");
Nice!!
The MaxPool implementation relies on the fact that the scale of the input and the scale of the output are the same.
You'd need to fix the CPU and Interpreter implementations. Also, check the special cases for Max/Avg Pool in the quantization procedure; we won't need a post-processing phase after you add proper handling.
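For intuition, a minimal standalone sketch (not Glow's actual kernel) of why identical input/output quantization parameters make quantized max pooling rescale-free:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Max over raw int8 values is valid because q -> scale * (q - offset) is
// monotonically increasing, and input/output share the same (scale, offset),
// so no requantization step is needed. Assumes a non-empty window.
int8_t maxPoolWindow(const std::vector<int8_t> &window) {
  return *std::max_element(window.begin(), window.end());
}
```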
lib/Importer/Caffe2ModelLoader.cpp
Outdated
/// For Glow: -127 <= orig_fp32/scale_1 + offset_1 < 128
/// For Caffe2: 0 <= orig_fp32/scale_2 + offset_2 < 255
/// Therefore, we can make scale_1 == scale_2, and offset_1 = offset_2 - 128
#define OFFSETSHIFT 128
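As an illustration of the remapping this comment describes, here is a standalone sketch (the helper names are illustrative, not part of the loader):

```cpp
#include <cstdint>

constexpr int32_t kOffsetShift = 128; // mirrors OFFSETSHIFT in the patch

// A Caffe2 uint8 activation q with (scale, offset) maps to a Glow int8
// activation q - 128 with (scale, offset - 128); the represented real value
// scale * (q - offset) is unchanged.
int8_t caffe2ToGlowQuantized(uint8_t q) {
  return static_cast<int8_t>(static_cast<int32_t>(q) - kOffsetShift);
}

int32_t caffe2ToGlowOffset(int32_t caffe2Offset) {
  return caffe2Offset - kOffsetShift;
}
```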
lib/Importer/Caffe2ModelLoader.cpp
Outdated
@@ -159,34 +169,18 @@ void Caffe2ModelLoader::loadOperator(const caffe2::OperatorDef &op) {
Tensor *w = getTensorByName(op.input(1));

// Transpose the weights to the right format. Glow expects to read the
// weights in the format CRSK. Caffe2 stores the operators as KCRS.
// C - output_depth, R - filter_height, S - filter_width, K - input_depth.
// weights in the format NHWC.
lib/Importer/Caffe2ModelLoader.cpp
Outdated
w->transpose(&wtag, NCHW2NHWC);
if (typeName != "Conv" && order == "NHWC")
  wtag.assign(w);
else
// The structure of the conv weights is: NHWC. We take the C, which is the
// number of filters. We use this value to calculate the size of the bias
// if it is not specified.
size_t depth = wtag.dims()[0];

// Construct the Filter field.
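A small standalone illustration of the transpose above, assuming the usual NCHW2NHWC shuffle of {0, 2, 3, 1} (output dim i takes input dim shuffle[i]; not Glow's actual Tensor API):

```cpp
#include <array>
#include <cstdio>

int main() {
  constexpr std::array<unsigned, 4> kNCHW2NHWC = {0u, 2u, 3u, 1u};
  std::array<unsigned, 4> kcrs = {64, 3, 7, 7}; // KCRS weights from Caffe2
  std::array<unsigned, 4> nhwc{};
  for (unsigned i = 0; i < 4; ++i)
    nhwc[i] = kcrs[kNCHW2NHWC[i]];
  // Prints 64 7 7 3: dims()[0] is still the filter count (depth).
  std::printf("%u %u %u %u\n", nhwc[0], nhwc[1], nhwc[2], nhwc[3]);
}
```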
Nice!
lib/Graph/Nodes.cpp
Outdated
@@ -199,7 +199,8 @@ static bool verifyPool(NodeValue src, NodeValue dest,
  ShapeNHWC exp(idim.n, outSz.first, outSz.second, idim.c);
  isValid &=
      expectCompareTrue("Unexpected output dimensions", exp, odim, parent);
  isValid &= checkTypeIgnoreShape(src, dest, dest.getNode());
  if (!src.getType()->isQuantizedType())
utils/download_caffe2_models.sh
Outdated
EOF
)

for model in $MODELS; do
  for file in predict_net.pbtxt predict_net.pb init_net.pb; do
  for file in predict_net.pbtxt predict_net.pb init_net.pb init_net.pbtxt; do
tests/images/run.sh
Outdated
@@ -41,3 +41,5 @@ done
for png_filename in tests/images/imagenet_299/*.png; do
  ./bin/image-classifier "$png_filename" -image_mode=0to1 -m=googlenet_v4_slim/googlenet_v4_slim.onnx -model_input_name=input:0 -image_layout=NHWC -label_offset=1 "$@"
done
# Quantized Resnet50 Caffe2 model test
./bin/image-classifier tests/images/imagenet/*.png -image_mode=0to1 -m=quant_resnet50/predict_net.pbtxt -m=quant_resnet50/init_net.pbtxt -model_input_name=gpu_0/data_0 -use-imagenet-normalization "$@"
cc: @jspark1105
Thank you for doing this! This is a lot of tricky code for one PR.
include/glow/Graph/Graph.h
Outdated
@@ -260,6 +270,9 @@ class Function final : public Named {
                                           NodeValue input, Storage *W,
                                           Storage *B);

  FullyConnectedNode *createFullyConnected(llvm::StringRef name, TypeRef outTy,
                                           NodeValue input, Storage *W,
                                           Storage *B);
  /// Create a fully connected node with the specified output type.
  /// Note, outputDepth is inferred based on the output type.
  FullyConnectedNode *createFullyConnected(llvm::StringRef name,
lib/Graph/Nodes.cpp
Outdated
@@ -199,7 +199,11 @@ static bool verifyPool(NodeValue src, NodeValue dest,
  ShapeNHWC exp(idim.n, outSz.first, outSz.second, idim.c);
  isValid &=
      expectCompareTrue("Unexpected output dimensions", exp, odim, parent);
  isValid &= checkTypeIgnoreShape(src, dest, dest.getNode());
  if (!src.getType()->isQuantizedType()) {
lib/Graph/Graph.cpp
Outdated
llvm::ArrayRef<unsigned_t> kernels,
llvm::ArrayRef<unsigned_t> strides,
llvm::ArrayRef<unsigned_t> pads) {
  ShapeNHWC idim = ShapeNHWC(input.dims());
lib/Importer/Caffe2ModelLoader.cpp
Outdated
// C - output_depth, R - filter_height, S - filter_width, K - input_depth.
// Caffe2 "Conv" op always stores the weight as KCRS, while for "Int8Conv",
Tensor wtag;
w->transpose(&wtag, NCHW2NHWC);
if (typeName != "Conv" && order == "NHWC") {
@@ -214,7 +279,49 @@ void Caffe2ModelLoader::loadOperator(const caffe2::OperatorDef &op) {
    return;
  }

  if (typeName == "MaxPool" || typeName == "AveragePool") {
  if (typeName == "Int8SumRelu") {
auto in0 = getNodeValueOrCreateConstantByName(op.input(0));
auto in1 = getNodeValueOrCreateConstantByName(op.input(1));
auto outDims = in0.getType()->dims();
auto outTy = G_.getParent()->uniqueType(
lib/Importer/Caffe2ModelLoader.cpp
Outdated
      loadInt(dict["Y_zero_point"]) - OFFSETSHIFT);
  node = G_.createRELU(opName, node, outTy1);
} else if (typeName == "Int8Conv") {
  node = G_.createConv(opName, tr, filter, bias, outTy, kernels, strides,
lib/Graph/Graph.cpp
Outdated
@@ -494,6 +494,16 @@ MaxPoolNode *Function::createMaxPool(llvm::StringRef name, NodeValue input,
  return addNode(new MaxPoolNode(name, OT, input, kernels, strides, pads));
}

MaxPoolNode *Function::createMaxPool(llvm::StringRef name, NodeValue input,
                                     TypeRef outTy,
include/glow/Graph/Graph.h
Outdated
@@ -243,6 +243,11 @@ class Function final : public Named {
                             llvm::ArrayRef<unsigned_t> strides,
                             llvm::ArrayRef<unsigned_t> pads);

  MaxPoolNode *createMaxPool(llvm::StringRef name, NodeValue input,
Since we enforce the output type to be the same as the input type for maxPool, I think we can just use the createMaxPool method above (no need to overload it).
Updated. Thanks!
lib/Graph/Graph.cpp
Outdated
(void)kdim;
assert((idim.w + pdim.left + pdim.right) >= kdim.width &&
       (idim.h + pdim.top + pdim.bottom) >= kdim.height &&
       "buffer too small for selected stride");
The message "buffer too small for selected stride" is a bit misleading, as the stride is not used in this computation at all.
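A possible clearer assertion along the lines of this comment (a sketch only; the small structs stand in for the diff's shape types):

```cpp
#include <cassert>

struct Dims { unsigned w, h; };
struct Pads { unsigned left, right, top, bottom; };
struct Kernel { unsigned width, height; };

void checkPoolGeometry(Dims idim, Pads pdim, Kernel kdim) {
  // The check compares the padded input extent against the kernel extent;
  // the stride is not involved, so the message should mention the kernel.
  assert((idim.w + pdim.left + pdim.right) >= kdim.width &&
         (idim.h + pdim.top + pdim.bottom) >= kdim.height &&
         "padded input too small for the kernel size");
  (void)idim; (void)pdim; (void)kdim;
}
```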
lib/Graph/Nodes.cpp
Outdated
@@ -200,7 +200,11 @@ static bool verifyPool(NodeValue src, NodeValue dest,
  ShapeNHWC exp(idim.n, outSz.first, outSz.second, idim.c);
  isValid &=
      expectCompareTrue("Unexpected output dimensions", exp, odim, parent);
  isValid &= checkTypeIgnoreShape(src, dest, dest.getNode());
  if (src.getType()->isQuantizedType()) {
    isValid &= checkSameIsQuantized(src.getType(), dest.getType(), parent);
This check is a bit relaxed. For MaxPool we'd like to check that the types match ignoring shape as well (not only that the input/output is quantized).
Slightly simplified the comparison: both need to be either quantized or not quantized, and in the case of max pool we also need to check for matching types ignoring shape.
isValid &= checkSameIsQuantized(src.getType(), dest.getType(), parent);
if (!isAvgPool) {
  isValid &= checkTypeIgnoreShape(src, dest, parent);
}
lib/Importer/Caffe2ModelLoader.cpp
Outdated
  bias = G_.getParent()->createConstant("conv.bias", biasTensor);
} else {
  assert(dict.count("Y_zero_point") &&
         "missing zero point for quantzied type");
Can you be more explicit that the problem is with the zero point for the output?
It has been explained before defining OFFSETSHIFT, if that is what you mean :):
/// For the quantized Caffe2 ops, the activations are quantized to uint_8.
/// In Glow, the activations are quantized to int_8. Therefore, for the offset
/// read from the quantized caffe2 model, we need to subtract 128 (i.e. INT8_MIN) to
/// make the activations become int8_t.
/// For Glow: -127 <= orig_fp32/scale_1 + offset_1 < 128
/// For Caffe2: 0 <= orig_fp32/scale_2 + offset_2 < 255
/// Therefore, we can make scale_1 == scale_2, and offset_1 = offset_2 - 128
const int32_t OFFSETSHIFT = 128;
Yeah, I was referring to "missing zero point for quantzied type", just to clarify that it's for the output type etc.
Also, there is a typo in "quantized" :)
Thanks! :)
Looking good! A few comments.
lib/Importer/Caffe2ModelLoader.cpp
Outdated
Node *node = G_.createConv(opName, tr, filter, bias, outTy, kernels,
                           strides, pads, group);
if (typeName == "Int8ConvRelu") {
Similarly to how you handle Int8SumRelu, we should be able to handle Int8ConvRelu without an additional Relu node.
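For intuition, a minimal sketch (assumed semantics, not Glow's implementation) of why the Relu can be folded into the quantized op's output instead of being a separate node:

```cpp
#include <algorithm>
#include <cstdint>

// real = scale * (q - offset), so real >= 0 is equivalent to q >= offset.
// Fusing the Relu therefore amounts to clamping the quantized result to
// q >= offset. Assumes the offset fits in the int8 range.
int8_t fuseReluQuantized(int8_t q, int32_t offset) {
  return static_cast<int8_t>(std::max<int32_t>(q, offset));
}
```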
lib/Importer/Caffe2ModelLoader.cpp
Outdated
auto outTy = G_.getParent()->uniqueType(
    ElemKind::Int8QTy, outDims, loadFloat(dict["Y_scale"]),
    loadInt(dict["Y_zero_point"]) - OFFSETSHIFT);
auto *node = G_.createAdd("tempadd", outTy, in0, in1);
"tempadd"? Why not the op name?
assert(dict.count("Y_zero_point") &&
       "missing zero point for quantzied type");
assert(dict.count("Y_scale") && "missing Y_scale for quantzied type");
ShapeNHWC idim = ShapeNHWC(tr->getType(0)->dims());
It would be less error-prone to get the type from the result and avoid explicit indexing.
tr could either be the input node directly or a node with a transposed type, so I think it is easier to get the type directly without casting.
I meant tr->getResult().getType(), which avoids explicit indexing.
@@ -238,7 +334,28 @@ void Caffe2ModelLoader::loadOperator(const caffe2::OperatorDef &op) {
  }

  Node *node = nullptr;
  if (typeName == "MaxPool") {

  if (typeName == "Int8MaxPool" || typeName == "Int8AveragePool") {
Do you need special casing for Int8MaxPool? I think typeName == "MaxPool" should handle both cases nicely. Anything I'm missing?
Yes, they are the same. However, "Int8MaxPool" has the "Y_zero_point" and "Y_scale" fields. I would like to combine it with Int8AveragePool to check that the fields are there (although for Int8MaxPool, those fields can be ignored).
    TH.raw(i) = ((uint8_t)(str.c_str()[i]) - OFFSETSHIFT);
  }
} else {
  T->reset(ElemKind::Int32QTy, dim, scale, offset);
What is the typeName for which we set the tensor type to be Int32?
Yes.
Int8GivenIntTensorFill: this feels a bit strange, as the Int range is bigger than the Int8 range. But since it's what's in C2, we probably cannot do anything.
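To make the distinction concrete, a hypothetical sketch (the helper and enum are illustrative stand-ins, not the loader's actual code):

```cpp
#include <string>

enum class ElemKind { Int8QTy, Int32QTy };

// Int8GivenTensorFill carries an int8 payload; Int8GivenIntTensorFill
// carries an int32 payload (typically conv biases), despite the "Int8"
// prefix in its op name.
ElemKind elemKindForFill(const std::string &typeName) {
  return typeName == "Int8GivenIntTensorFill" ? ElemKind::Int32QTy
                                              : ElemKind::Int8QTy;
}
```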
tests/images/run.sh
Outdated
@@ -41,3 +41,5 @@ done
for png_filename in tests/images/imagenet_299/*.png; do
  ./bin/image-classifier "$png_filename" -image_mode=0to1 -m=googlenet_v4_slim/googlenet_v4_slim.onnx -model_input_name=input:0 -image_layout=NHWC -label_offset=1 "$@"
done
# Quantized Resnet50 Caffe2 model test
./bin/image-classifier tests/images/imagenet/*.png -image_mode=0to1 -m=quant_resnet50 -model_input_name=gpu_0/data_0 -use-imagenet-normalization "$@"
Where do we get the model from? Likely we need to update the utils script for downloading the model.
Yes, the download_caffe2_models.sh is included in this PR.
missed that :(
Force-pushed from 491159e to e999ebb
"missing Y_scale for quantized output type"); | ||
// Construct the Bias field. | ||
Tensor biasTensor(ElemKind::Int32QTy, {depth}, 1.0, 0); | ||
biasTensor.zero(); |
This is fine, as we have the offset equal to 0, but isZero and zero do not properly work for quantized types currently.
Tracking this here: #2082
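For context on that issue, a minimal sketch of why a raw-memset style zero() is wrong for quantized tensors (illustrative helper, assuming the offset fits in the int8 range):

```cpp
#include <cstdint>

// For a quantized tensor, real = scale * (q - offset), so the raw value
// representing real 0.0 is the offset, not 0. A zero() that writes raw 0
// actually stores the real value scale * (0 - offset).
int8_t quantizedZero(int32_t offset) {
  return static_cast<int8_t>(offset);
}
```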
Got it. Will improve it in the following PR.
Nick was going to look at the issue above.
Can you create a small net and test the quantized loader? I'm a bit worried that we do not test it at all on CI ...
lib/Graph/Nodes.cpp
Outdated
@@ -200,7 +200,14 @@ static bool verifyPool(NodeValue src, NodeValue dest,
  ShapeNHWC exp(idim.n, outSz.first, outSz.second, idim.c);
  isValid &=
      expectCompareTrue("Unexpected output dimensions", exp, odim, parent);
  isValid &= checkTypeIgnoreShape(src, dest, dest.getNode());
  if (src.getType()->isQuantizedType() && isAvgPool) {
    // For quantized AvgPool, the scale and offset of its input and output could
Is this documented anywhere? Like, in the doc string of the node?
No, it hasn't been documented yet. However, after this PR is merged, we need to update our quantization docs.
@rdzhabarov Is it OK to create a separate PR for the net testing? I think the unit test for each quantized op should be added as well.
We already have tests for each quantized op (see operatorTest). The only thing missing is loader tests, since we do not run quantized resnet50.
@rdzhabarov I agree. Will test the Caffe2 loader for the new quantized Caffe2 ops.
Great work! Really happy to see this merged.
Pending: doc update + importer test
lib/Importer/Caffe2ModelLoader.cpp
Outdated
// Check if we have a serialized bias vector.
if (op.input_size() > 2) {
  auto &biasTensorName = op.input(2);
nit: const auto &
lib/Importer/Caffe2ModelLoader.cpp
Outdated
biasTensor.zero();
// Check if we have a serialized bias vector.
if (op.input_size() > 2) {
  auto &biasTensorName = op.input(2);
ditto
lib/Importer/Caffe2ModelLoader.cpp
Outdated
@@ -602,6 +736,73 @@ void Caffe2ModelLoader::loadWeight(const caffe2::OperatorDef &op) {
    return;
  }

  // Load tensors and quantize the values:
We do not really quantize values here, right? We're just loading already-quantized tensors.
@@ -26,6 +26,7 @@ vgg19
zfnet512
bvlc_alexnet
en2gr
quant_resnet50
Kinda unrelated question, but why don't we download directly from the C2 zoo?
Their model has a problem: its init_net.pb/init_net.pbtxt file has duplicated tensors, which is unexpected. I removed the duplicated tensors in our version. According to the model owner, they will update the model on their side in the next release.
Description:
This PR finishes the remaining work of loading the Caffe2 quantized Resnet50 model discussed in #1727.
The model can be found here: https://github.com/caffe2/models/tree/master/resnet50_quantized
On the Glow side, the model can be downloaded for testing by running utils/download_caffe2_models.sh.
Testing:
Added quantized ResNet50 to Glow's test suite (run.sh); tested on the Interpreter and CPU backends.
Documentation:
Fixes #1762, #1727