We found a regression of ~4% for our NDSv2 benchmark in our performance cluster after this change went in: #6604
The issue is that we acquire the semaphore too early, before the stream side has materialized its first batch. If the first stream batch requires data from a data source (such as a Parquet table), we hold the semaphore while doing all of the IO to materialize the stream side, which is not ideal. The issue is very similar to the one fixed in #4539, so the proposed fix is also very similar, just for broadcasts.
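To make the ordering problem concrete, here is a minimal Python sketch of the idea, not the project's actual code. All names in it (`TrackedSemaphore`, `stream_batches`, `join_eager`, `join_deferred`) are hypothetical stand-ins: the log records whether the semaphore was held while each stream batch did its IO. In the eager version the first batch's IO happens under the semaphore; in the deferred version the first batch is pulled before acquiring.

```python
import threading


class TrackedSemaphore:
    """Hypothetical stand-in for the GPU semaphore; tracks whether it is held."""

    def __init__(self):
        self._sem = threading.Semaphore(1)
        self.held = False

    def acquire(self):
        self._sem.acquire()
        self.held = True

    def release(self):
        self.held = False
        self._sem.release()


def stream_batches(semaphore, log):
    """Hypothetical stream side: producing each batch stands in for source IO."""
    for i in range(3):
        # Record whether the semaphore was held while this batch was read.
        log.append(semaphore.held)
        yield i


def join_eager(semaphore, stream):
    """Acquire first, then materialize the stream (the problematic ordering)."""
    semaphore.acquire()
    try:
        return list(stream)
    finally:
        semaphore.release()


def join_deferred(semaphore, stream):
    """Materialize the first stream batch, and only then acquire."""
    it = iter(stream)
    first = next(it, None)  # first-batch IO runs without the semaphore
    if first is None:
        return []
    semaphore.acquire()
    try:
        return [first, *it]
    finally:
        semaphore.release()
```

This sketch only moves the first batch's IO outside the semaphore, which is the case described above; how later batches interact with the semaphore is outside its scope.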