[VTA] Fix an issue in updating uop_idx in the TensorGemm module #4694
Conversation
@kevinyuan Can you help reproduce the results? A hard requirement is that we should run the module only once with m.run(), instead of running it in time_evaluator. In addition, I have a patch

```diff
--- a/vta/hardware/dpi/tsim_device.cc
+++ b/vta/hardware/dpi/tsim_device.cc
@@ -141,6 +141,8 @@ int VTADPISim() {
     tfp->dump(static_cast<vluint64_t>(trace_count * 2 + 1));
 #endif
     trace_count++;
+    if ((trace_count % 1000000) == 1)
+      fprintf(stderr, "[traced %dM cycles]\n", trace_count / 1000000);
     while (top->sim_wait) {
       top->clock = 0;
       std::this_thread::sleep_for(std::chrono::milliseconds(100));
```

to trace the number of cycles in TSIM.
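The patch above throttles progress output so the log stays readable over hundreds of millions of simulated cycles. The same pattern can be sketched in Python (a hypothetical helper for illustration, not part of TVM or the TSIM driver):

```python
def run_simulation(total_cycles, report=print):
    """Toy simulation loop that reports progress once per million cycles,
    mirroring the (trace_count % 1000000) == 1 check in the C++ patch."""
    trace_count = 0
    for _ in range(total_cycles):
        # ... advance the simulated clock here ...
        trace_count += 1
        # Fires at trace_count 1, 1000001, 2000001, ... so long runs emit
        # only a handful of progress lines instead of one per cycle.
        if trace_count % 1000000 == 1:
            report("[traced %dM cycles]" % (trace_count // 1000000))
    return trace_count
```

Checking after the increment (rather than before) means the very first cycle also produces a line, which confirms early that tracing is alive.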
Hi @liangfu, with your recommended changes, TSIM produced the same prediction results as FSIM. Can you explain why m.run() must be used instead of time_evaluator to generate the correct prediction result? Furthermore, the cycle_count in "Execution statistics" is equal to 0. Is this expected, or can it be improved? Thanks.
time_evaluator takes the first run as a warm-up, whose timing results are ignored. The prediction results are generated from the second run, which might use a buffer contaminated by the first run.
It prints out the correct "Execution statistics" when we use time_evaluator. The problem is how to get correct prediction results with time_evaluator, instead of running the module directly. cc @tmoreau89
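The warm-up behavior described above can be illustrated with a small, purely hypothetical sketch (plain Python, not TVM code): if the evaluator reuses the same output buffer across runs and the kernel accumulates into it instead of overwriting it, the warm-up run contaminates what the timed run produces.

```python
def gemm_accumulate(out, a, b):
    # A buggy-style kernel that accumulates into `out` without zeroing it
    # first, analogous to hardware reusing a dirty accumulator buffer.
    n = len(a)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i][j] += a[i][k] * b[k][j]

def evaluator_style(kernel, out, a, b, repeat=2):
    # Mimics an evaluator that performs a warm-up run and then a timed run
    # against the same output buffer.
    for _ in range(repeat):
        kernel(out, a, b)
    return out

a = [[1, 0], [0, 1]]            # identity, so a @ b == b
b = [[2, 3], [4, 5]]

single = [[0, 0], [0, 0]]
gemm_accumulate(single, a, b)   # one run: correct product

double = [[0, 0], [0, 0]]
evaluator_style(gemm_accumulate, double, a, b)  # warm-up + run: every entry doubled
```

Running once with m.run() sidesteps the problem because the buffer only ever sees a single accumulation pass.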
Thank you @liangfu for the patch
Hi @liangfu, Have you got a chance to figure out the root cause of "The problem is how to get correct prediction results with time_evaluator, instead of running the module directly" ? Also, I found using the following wrong code #1 will also generate wrong prediction, which looks like the inference was run with 11 passed, since I saw [traced 329M cycles] at the end, while the correct run show [traced 29M cycles] at the end. ----------- wrong code #1 ------------
Output
----------- wrong code #2 ------------
Output
----------- correct ---------
Output
This PR fixes an issue in updating the uop_idx counter in the TensorGemm module.

Problem

When evaluating deploy_vision_on_vta.py with the TSIM backend, it is required to run the module with time_evaluator. However, it throws an instance of dmlc::Error. The root cause is that uop_idx increases unexpectedly.

This PR fixes the above issue and enables successful evaluation of deploy_vision_on_vta.py with the TSIM backend.
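The kind of counter bug described here can be sketched abstractly (hypothetical Python, not the actual RTL of the TensorGemm module): if the micro-op index keeps increasing across invocations instead of being reset when a new invocation starts, the second run walks past the end of the micro-op buffer and fails, which matches time_evaluator failing on its second (timed) run while a single m.run() succeeds.

```python
class TensorGemmModel:
    """Toy model of a module that walks a micro-op buffer with uop_idx.
    Illustration only; names and behavior are assumptions, not VTA code."""

    def __init__(self, uops):
        self.uops = uops
        self.uop_idx = 0  # persists across run() calls, like a hardware register

    def run(self, reset_idx):
        if reset_idx:
            self.uop_idx = 0  # the fix: start each invocation from uop 0
        executed = []
        for _ in range(len(self.uops)):
            if self.uop_idx >= len(self.uops):
                # Analogous to the dmlc::Error seen on the second run.
                raise IndexError("uop_idx ran past the micro-op buffer")
            executed.append(self.uops[self.uop_idx])
            self.uop_idx += 1
        return executed
```

With reset_idx=True every invocation replays the buffer correctly; with reset_idx=False the first invocation happens to work (the index starts at 0), but the second one starts from a stale index and fails.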