Automatic precision inference #855
Conversation
Clang doesn't seem to like this, while g++ compiles it just fine.
The issue seems to be that this: model.add(Conv2D(4, kernel_size=(1, 1), activation='relu', name='last_layer')) # Will become PointwiseConv2D doesn't actually become PointwiseConv2D for Quartus. The bug it uncovered seems tangential to this PR; nevertheless, we need to fix it, either as part of this PR or separately.
Is this the same issue that was observed in #878?
I believe so. If you have a filter of size 1, then things like
def _infer_precision(self, node, types_to_infer):
    node_class = node.class_name
    if node_class in ['Dense']:
I wonder if it's better to use something like isinstance(node, Dense) instead of matching to a class name. Matching to a class name doesn't deal with inheritance. For example, I can see the BatchNormalization matching failing for ApplyAlpha.
LOL, I did it specifically to avoid this since it is usually not what we want. The ApplyAlpha is an example of this.
With QONNX, ApplyAlpha does need the precision propagated. It worries me if derived classes have different behavior by default; that violates the "is a" principle. I think if you want different behavior, you should code it explicitly. What happens with ApplyAlpha if you don't forbid it? Should QONNX not use ApplyAlpha in this case?
You derive a class to have different behavior, not the same one. In principle, ApplyAlpha "is NOT" conceptually a BatchNormalization; it just happens to share the implementation (and honestly it shouldn't: both should share a common parent class, for example ScaleShift, and then your logic would hold). Another example of this is DepthwiseConv2D vs Conv2D. It's true they are both convolutions, but they have different behavior. I think it would be better for new layers that inherit from other layers instead of the base Layer class to be unsupported by this pass, rather than silently supported in a wrong way. I'm not gonna die on this hill though; if you feel strongly about this we could revisit, but I'd like stronger arguments 😄.
For ApplyAlpha, you can use it in QONNX; I can add it with the same behavior as BN if that is what is needed.
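As an aside, here is a minimal sketch of the behavioral difference being debated above; the class names are illustrative stand-ins, not the actual hls4ml layer hierarchy:

```python
class BatchNormalization:
    pass

class ApplyAlpha(BatchNormalization):
    # Shares the implementation purely via inheritance.
    pass

node = ApplyAlpha()

# Class-name matching: the subclass is NOT picked up by a 'BatchNormalization' rule.
print(node.__class__.__name__ in ['BatchNormalization'])  # False

# isinstance matching: the subclass IS picked up, whether or not that is wanted.
print(isinstance(node, BatchNormalization))  # True
```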
return inferred_types

def _infer_dense_precision(self, node, types_to_infer):
    n_ops = node.get_attr('n_in') * node.get_attr('n_out')
I don't think the total n_ops is the important value for the accumulator precision. Ignoring bias for now, each output value is a certain number of multiplies, resulting in the input_width + weight_width part of the equation, plus the accumulation, math.ceil(np.log2(num_acc)). But num_acc != num_ops. In particular, it's node.get_attr('n_in') (-1?), at least for the standard 1D Dense layer. Bias modifies things a bit, but the general trends are the same. This is the result we had from the CMS hackathon with Sioni: https://github.com/jmitrevs/hls4ml/blob/bit-correct/hls4ml/model/optimizer/passes/propagate_dense_precision.py
Ah, true, I'll fix this.
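For reference, a minimal sketch of the accumulator-width estimate described in the comment above, assuming plain integer/fixed operands and ignoring bias and signedness details:

```python
import math

def dense_accum_width(input_width, weight_width, n_in):
    # One input-times-weight product needs input_width + weight_width bits;
    # summing n_in such products grows the result by roughly ceil(log2(n_in)) bits.
    return input_width + weight_width + math.ceil(math.log2(n_in))

# e.g. 16-bit inputs, 8-bit weights, a Dense layer with 64 inputs -> 30 bits
print(dense_accum_width(16, 8, 64))
```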
return ['result_t']
def _infer_common_precision(self, node, types_to_infer, n_ops): |
I think this assumes integer or fixed precision types. Do we need to handle, e.g., xnor precision types?
I should test this PR with the binary model that has Xnor types. I wouldn't expect that you need to infer anything in that case, but perhaps it breaks the optimizer.
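A hedged sketch of the kind of guard that could keep the fixed/integer bit-growth formulas away from binary types; the helper name is hypothetical, while the precision classes come from hls4ml.model.types:

```python
from hls4ml.model.types import FixedPrecisionType, IntegerPrecisionType, XnorPrecisionType

def _can_infer_from(precision):
    # Hypothetical helper: xnor (1-bit binary) types have no meaningful
    # integer/fractional split, so the bit-growth formulas should skip them.
    if isinstance(precision, XnorPrecisionType):
        return False
    return isinstance(precision, (IntegerPrecisionType, FixedPrecisionType))
```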
I made vloncar #53 into your branch with changes for dense and standard convolution. Let me know what you think. If you like the way I made this, I can also add signed/unsigned support to the other precision propagations (like merge, bn, sepconv), either in that PR or a different one.
def _infer_bn_precision(self, node, types_to_infer):
    inferred_types = []

    if 'scale_t' in types_to_infer:
Wouldn't the input quantization in this case be the input mean and variance (+ other things), not the scale and bias?
(actually I have to see how this is handled by the qkeras parser)
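For context on why the question above matters: at inference time BatchNormalization is folded into a single scale and bias, so any precision chosen for scale_t and bias_t implicitly depends on the trained mean and variance. A minimal sketch of the standard fold, with the usual Keras BN parameters:

```python
import numpy as np

def fold_batchnorm(gamma, beta, moving_mean, moving_var, eps=1e-3):
    # y = gamma * (x - mean) / sqrt(var + eps) + beta  ==  scale * x + bias
    scale = gamma / np.sqrt(moving_var + eps)
    bias = beta - moving_mean * scale
    return scale, bias
```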
@@ -33,6 +33,7 @@
register_flow(
    'convert',
    [
        'infer_precision_types',
I would prefer having this towards the end of convert, or just leaving the call out of convert entirely, since we call it in the optimize flow anyway. What is the purpose of doing it here? It messes up some of the conversion steps I have for qonnx: those steps only run if the type is not set (they don't want to override set types), and inferring the types here sets them.
"optimize" is sort-of optional, it belongs to "convert" because after that stage the other optimizers don't expect "auto" to exist. If onnx has its own optimizers that run, why not group them into a flow and run before convert? the idea is that after "convert" it shouldn't matter where the model came from
But we can move it later in convert, right?
If you have onnx-specific optimizers that must go before, place them before this one.
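A hedged sketch of the grouping suggested above; register_flow is the real hls4ml registration helper, but the flow name, the pass names, and the requires keyword as used here are assumptions to illustrate the idea rather than a drop-in change:

```python
from hls4ml.model.flow import register_flow

# Hypothetical: a dedicated flow of ONNX-specific conversion passes that must
# run before precision inference. Names are placeholders.
register_flow(
    'convert_qonnx',
    ['qonnx_pass_1', 'qonnx_pass_2'],
)

# 'convert' then depends on it, so 'infer_precision_types' only runs after the
# ONNX-specific passes have had a chance to set (or deliberately leave unset) types.
register_flow(
    'convert',
    ['infer_precision_types'],     # remaining convert passes omitted for brevity
    requires=['convert_qonnx'],    # assumed keyword; check the flow API
)
```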
Description

This introduces the ability to specify auto as a precision string, which implies hls4ml should infer the precision in some way. This is not exposed by default via the config_from... functions for now. The goal is to have the framework for inferring types in some way within hls4ml (e.g., the QONNX parser) before fully exposing it to users. An initial inference of precision has been added via the infer_precision_types optimizer, based on previous attempts by various people. It's not advanced in any way. During testing, I encountered some issues with SeparableConv1D templates, which I fixed.

Type of change

- Fix for the SeparableConv1D issue

Tests

There are new tests in test_auto_precision.py that cover the few use cases.

Checklist

- I ran pre-commit on the files I edited or added.
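A hedged usage sketch of the auto precision string described above; the layer name is illustrative and the exact config keys accepted for inference are an assumption based on the usual per-layer hls4ml config layout:

```python
import hls4ml
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

model = Sequential([Dense(8, activation='relu', input_shape=(16,), name='dense1')])

config = hls4ml.utils.config_from_keras_model(model, granularity='name')
config['LayerName']['dense1']['Precision']['accum'] = 'auto'   # let hls4ml infer accum_t
config['LayerName']['dense1']['Precision']['result'] = 'auto'  # let hls4ml infer result_t

hls_model = hls4ml.converters.convert_from_keras_model(model, hls_config=config)
```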