Allow fp16 re-write of pack-lh nodes #7731
base: master
@alankelly I tried this out. It gets further than before, but I now get the following message:
Note (XNNPACK): Analyzing subgraph for FP16 compatibility (xnn_subgraph_rewrite_for_fp16, external/XNNPACK/src/subgraph.c:862)
If I disable SME with --define=xnn_enable_arm_sme=false --define=xnn_enable_arm_sme2=false, it goes through fine and the same F32 layer is executed as F16.
e27aeff to 09781cf
PiperOrigin-RevId: 720489462
09781cf to e072477
@@ -1009,7 +1022,7 @@ bool xnn_subgraph_rewrite_for_fp16(xnn_subgraph_t subgraph)
     value->fp16_id = XNN_INVALID_VALUE_ID;
     value->fp32_id = XNN_INVALID_VALUE_ID;
     if (value->fp16_compatible) {
-      assert(value->datatype == xnn_datatype_fp32);
+      assert(value->datatype == xnn_datatype_fp32 || value->datatype == xnn_datatype_pfp32);
I guess line 1105 needs this as well.
With that change, execution went further until it hit another assert. The last few lines of the log are:
VERBOSE: Replacing 1 out of 1 node(s) with delegate (TfLiteXNNPackDelegate) node, yielding 1 partitions for subgraph 0.
Note (XNNPACK): Analyzing subgraph for FP16 compatibility (xnn_subgraph_rewrite_for_fp16, external/XNNPACK/src/subgraph.c:862)
Note (XNNPACK): XNNPACK has switched to FP16 inference mode! (xnn_subgraph_rewrite_for_fp16, external/XNNPACK/src/subgraph.c:1216)
INFO: Successfully applied the default TensorFlow Lite delegate indexed at 0.
NOTE: because a delegate has been applied, the precision of computations should be unchanged, but the exact output tensor values may have changed. If such output values are checked in your code, like in your tests etc., please consider increasing error tolerance for the check.
Note (XNNPACK): Tile size for GEMM with num_groups=1, m=4096, n=32 and mr=6, nr=16 set to [6, 32] (683 tiles) (xnn_gemm_best_tile_size, external/XNNPACK/src/microkernel-utils.c:110)
INFO: Running benchmark for at least 1 iterations and at least 0.5 seconds but terminate if exceeding 150 seconds.
Assertion failed: (kernel_data == NULL), function setup_fully_connected_operator, file fully-connected.c, line 896.
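Since the review comment above suggests the same relaxed datatype check is needed at more than one place in subgraph.c, one option is to factor it into a shared predicate. The following is only a minimal sketch under stated assumptions: the enum is a stand-in for illustration (it is not XNNPACK's real header), and `sketch_is_fp32_like` and `sketch_check_fp16_compatible_value` are hypothetical names that do not exist in XNNPACK.

```c
#include <assert.h>
#include <stdbool.h>

// Stand-in enum for illustration only; the real datatype values are defined
// by XNNPACK and are not reproduced here.
enum sketch_datatype {
  sketch_datatype_invalid = 0,
  sketch_datatype_fp32,   // stand-in for xnn_datatype_fp32
  sketch_datatype_pfp32,  // stand-in for xnn_datatype_pfp32
  sketch_datatype_fp16,   // stand-in for xnn_datatype_fp16
};

// Hypothetical helper: true for either fp32 variant that the relaxed
// assert in the diff accepts.
static bool sketch_is_fp32_like(enum sketch_datatype datatype) {
  return datatype == sketch_datatype_fp32 ||
         datatype == sketch_datatype_pfp32;
}

// Hypothetical call site: every place that previously asserted on plain
// fp32 would call the shared predicate, so the sites cannot drift apart.
static void sketch_check_fp16_compatible_value(enum sketch_datatype datatype) {
  assert(sketch_is_fp32_like(datatype));
}
```

With a helper of this shape, the check at line 1022 and the one the reviewer points to at line 1105 would both go through the same predicate, so a later datatype addition only needs to be handled once.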