-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem with depth in Hardware kernel code of Vitis-Tutorials/Hardware_Acceleration/Design_Tutorials/01-convolution-tutorial #119
Comments
We're looking at this issue and will get back to you soon. Thanks. |
Hi @Ali-Flt - I have extracted the independent questions and answered them independently Why is it necessary to have a FIFO depth of "64" instead of "2" The Dataflow throughput is dynamic in nature and it is dependent on the FIFO full and empty conditions. When the user selects the FIFO of depth "2" they would see a small stall percentage because of FIFO getting full ( can be seen in the dataflow profiling information). To remove this stall, the author has increased it to"64", but the depth can go as low as "16" without any stall percentage. How the MAXI interface can infer a 512-bit data width for burst transfers the inferences of the 512-bit width is not dependent on the FIFO depth, it is dependent on the memory access pattern of the "dst" variable which is sequential. This infers the 512-bit width, there can be cases where the memory access pattern is random which fails the inferences Resource utilization of the FIFO for a depth of "64" and "2" The resource utilization of HLS is only an estimate, run the vivado to get more accurate results FIFO of the depth of "0" This comment will be removed, you cannot set a FIFO of depth "0", the default would be "2" |
Hi @akhilayyagari , About the setting FIFO depth to 0 comment, I was assuming that maybe by setting the depth to 0, the depth will be calculated by DATAFLOW and therefore the dataflow fifo depth will override the user input (0). I ran the synthesis with depth of 0 and the depth of the FIFO was actually set to 2, using no BRAMs. But about you saying you can't set the FIFO depth to "0", I actually changed the config_dataflow setting "-fifo_depth" to 0 and I observed that it synthesized the design without any errors. And the fifo depth in the synthesis resource profiling report was set to 0 too. How is this possible? Shouldn't the fifo depth be at least 1?
Which one is the correct scenario? Thanks again, |
hi @Ali-Flt , There are two ways you can optimize the FIFO depth,
The second issue you have reported may be a bug, it should not take the "0" which does not make sense. We will fix this in the future release. The minimum depth should be "1", the tool default is "2". if the depth of the FIFO is less than 1024, then they use registers unless explicitly specified using a pragma. |
beamforming update
9ba8ba0 Cleanup 1028 (Xilinx#119) 715ca77 update doc (Xilinx#118) bd2a926 Issue#111, solve timing issue, apply jacobian, reduce latency. (Xilinx#114) Co-authored-by: sdausr <[email protected]>
cf4065d Merge pull request Xilinx#123 from RepoOps/update_readme_5 4890779 update README fa29498 update README 61c2cb5 Merge pull request Xilinx#119 from RepoOps/update_doc_url_3 b871455 fix url 2f7fb05 Merge pull request Xilinx#122 from tuol/cr_1142093_2 59cf572 fix input of cscmv de579fa Merge pull request Xilinx#121 from tuol/cr_1140416 c00a509 update makefile and description.json for L2_tests_fp64_spmv 0a0771e update url and branch in doc a69541e Merge pull request Xilinx#118 from tuol/fix_version dfc5cb7 update version to 2022.2 Co-authored-by: sdausr <[email protected]>
Closing the thread as there are no open concerns. Thanks |
Hi,
I've read the whole tutorial and understood all of it. But an issue I have is that the depth of output_stream in the top-level function of the filter2d_hw.cpp file is set to 64.
Why is this? Is it 64 so that the M_AXI interface can perform data width auto-widening to 512 (64 x 8) bits? If so then why is the input stream (pixel_stream) depth not set to 64 too and only the stream for output has 64 depth?
What I assumed was that even if the depth is 2, the M_AXI interface will be able to understand that the data is flowing in the kernel without stall, therefore achieving 512 bit data width for burst transfers.
I also synthesized the code for both 2 and 64 as depth value for output_stream and both systems pass the C/RTL Co-simulation without problem. The only difference in the synthesis reports I saw was that the FIFOs used 51 more LUTs which is because of the output_stream FIFO depth of 64.
P.S.: Although the FIFO depth is 32 times more (2 x 32 = 64), the number of additional LUTs used is less than two times more (51). So I'm assuming HLS has understood that it doesn't need all of the 64 FIFOs. But even the slight addition of LUTs is strange to me because it should not need more than depth=2.
Also there is a hint comment saying you can set the stream depth to 0 for minimum resources, how can the FIFO be created with a depth of 0? should the depth not be at least 1? When I set depth to 0 suddenly there are no BRAMs used for the window_stream but for Depth of 3, 100 BRAMs are used. Why?
Thanks,
Ali
The text was updated successfully, but these errors were encountered: