seeing more than one cpu id on one output stream with MAP_TO_RECORDED_OUTPUT #6354
Comments
Added check to verify that output stream's CPU ID matches the CPU ID markers seen in the trace. Issue: #6354
It looks like this is the culprit, this thread with zero instructions:
All the scheduling is instruction-count-based so this degenerate thread has 2 entries both at instruction count 0:
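A rough sketch of why two entries at the same instruction count are a problem. The `Entry` layout and both helper functions are hypothetical illustrations, not the actual drmemtrace schedule format or scheduler code:

```python
from collections import namedtuple

# Hypothetical schedule entry: the CPU that ran the thread, starting at a
# given instruction ordinal. NOT the real schedule file layout.
Entry = namedtuple("Entry", ["cpu", "start_instr"])


def cpu_for_instr(entries, instr_ord):
    """Return the CPU of the last entry whose start is <= instr_ord."""
    chosen = None
    for e in entries:
        if e.start_instr <= instr_ord:
            chosen = e
    return chosen.cpu if chosen else None


def ambiguous_starts(entries):
    """Return the start ordinals that are shared by more than one entry."""
    seen, dups = set(), set()
    for e in entries:
        if e.start_instr in seen:
            dups.add(e.start_instr)
        seen.add(e.start_instr)
    return sorted(dups)


# A zero-instruction thread recorded on two CPUs: both entries start at 0,
# so an instruction-count lookup cannot tell where one ends and the next begins.
degenerate = [Entry(cpu=3, start_instr=0), Entry(cpu=5, start_instr=0)]
```

With these assumed entries, `ambiguous_starts(degenerate)` reports ordinal 0, and a lookup by instruction count can only ever select one of the two entries.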
The first question is how this thread ended up with no instructions while still being output: I thought we had a filter to remove empty threads. That filter is based on bytes written, so maybe the futex syscall exit (it must have attached mid-futex, as there is no entry) is what made it past the filter?
Tightens the normal-mode output rule to discard a thread with no instructions (keeping the no-data check for i-filtered modes).

Adds the scheduler_launcher CPU check from PR #6355.

It is not easy to create a test with a thread that reliably executes zero instructions yet still outputs some data. I added an invariant check for a zero-instruction thread and confirmed it fires on the trace in PR #6355:

```
$ bin64/drrun -t drcachesim -simulator_type invariant_checker -indir ~/tmp/drmemtrace.schedtest.x64
Trace invariant failure in T85300 at ref # 12 (0 instrs since timestamp 13341391214912820): An unfiltered thread should have at least 1 instruction
```

Issue: #6354
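The tightened rule can be pictured roughly as follows. This is only a sketch of the decision described in the commit message; the function name and its parameters are hypothetical and do not correspond to the actual tracer code:

```python
def should_keep_thread(instr_count, bytes_written, instr_filtered):
    """Decide whether a thread's trace output is kept (hypothetical helper)."""
    if instr_filtered:
        # i-filtered modes may legitimately contain no instruction records,
        # so the older "wrote any data" check still applies there.
        return bytes_written > 0
    # Normal mode: a thread with zero instructions is discarded even if it
    # wrote some data (e.g., only a syscall exit marker).
    return instr_count > 0
```

Under this sketch, a thread that wrote a few bytes but executed no instructions is dropped in normal mode and kept only in an instruction-filtered mode.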
Thanks for the debug! I added a comment here: #6356 (comment)
The second case from #6355 (comment) is rep string:
All the rep string iterations have the same instr count, which means the instr count does not work well as a divider point! Will have to consider how to handle this.
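To make the rep string problem concrete, here is a minimal sketch. The record kinds (`"instr"`, `"instr_no_fetch"`, `"load"`, `"store"`) are simplified stand-ins chosen for illustration, not the real trace record types:

```python
def instr_ordinals(records):
    """Pair each record kind with the fetched-instruction count seen so far."""
    count = 0
    out = []
    for kind in records:
        if kind == "instr":  # only fetched instructions advance the ordinal
            count += 1
        out.append((kind, count))
    return out


# One ordinary instruction, then a rep string loop with three iterations:
# a single fetch followed by no-fetch iterations and their memory accesses.
trace = [
    "instr",
    "instr", "load", "store",
    "instr_no_fetch", "load", "store",
    "instr_no_fetch", "load", "store",
]
```

Every record from the rep string's single fetch onward carries the same ordinal, so a switch point expressed only as an instruction count cannot land in the middle of the loop.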
We want the instr ordinal for fast skipping. So do we have to store both the instr count and the record count? This affects record-replay too: though the switch point would just always be the
Theoretically the same thing could happen with a ton of markers in between two instructions.
Since the rep string loop could be quite long, it unfortunately doesn't seem like we can just ignore this case (for the marker case I would say it's not worth more effort than throwing away the offending cpuid marker in the middle).
@prasun3 do you have an opinion on how important this is: if real-world rep string loops are generally not very long, maybe never switching in the middle of one is ok? (There are still other aspects to sort out like record-replay with time quanta if we went this route.)
Tightens the normal-mode output rule to discard a thread with no instructions (keeping the no-data check for i-filtered modes) and expands the empty-thread behavior to include regular exit runs and not just detach.

Adds the scheduler_launcher CPU check from PR #6355.

It is not easy to create a test with a thread that reliably executes zero instructions yet still outputs some data. I added an invariant check for a zero-instruction thread and confirmed it fires on the trace in PR #6355:

```
$ bin64/drrun -t drcachesim -simulator_type invariant_checker -indir ~/tmp/drmemtrace.schedtest.x64
Trace invariant failure in T85300 at ref # 12 (0 instrs since timestamp 13341391214912820): An unfiltered thread should have at least 1 instruction
```

Issue: #6354
I think not switching in the middle of a rep should be okay.
@prasun3 -- another question: what is your preference for whether each iteration of a rep string loop should have a regular instruction fetch record, or not? We had decided in the past that it should not; but an alternative solution here would be to revisit that, to have a separate instruction ordinal for each iteration: either with a regular fetch or even without.
We'd prefer if each iteration did not have an instr fetch record.
For the rep string: Currently: scheduler records and replays the start only. Problems:
Proposal #A: Consider not a big deal to not switch mid-loop
Proposal #B: Count non-fetched instrs in all instr ordinals
Proposal #C: Reverse earlier decisions and insert instr record for each iter
Proposal #D: Store both an instruction ordinal and a record ordinal everywhere we store instruction ordinals today (as-traced and serial schedule files; record-replay files; range_t region-of-interest bounds; skip_instructions API) and update the skipping code to advance to mid-instruction points (any point?)
Proposal #E: Like #D but we store both an instruction ordinal and a sub-instruction "load/store ordinal". This then does not rely on having unfetched instr records (which we have proposed removing) and has more structure than an absolute record ordinal.
Proposal #F: New counter of {fetched+unfetched} instructions used by the scheduler. But skipping and zipfile chunks are based on fetched only.
We've decided that Proposal #E is the winner. This involves bumping the version for a new field in all the schedule file records (the raw2trace-written format and the record-replay format).
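One possible reading of Proposal #E, sketched under assumptions: each position is an (instruction ordinal, load/store ordinal) pair, with the load/store ordinal reset at every fetched instruction. The record kinds and the `positions` helper are hypothetical, and the real schedule file field may be defined differently:

```python
def positions(records):
    """Tag each record with an (instr ordinal, load/store ordinal) pair."""
    instr = 0
    memref = 0
    out = []
    for kind in records:
        if kind == "instr":
            instr += 1
            memref = 0  # assumed: the sub-ordinal restarts at each fetch
        elif kind in ("load", "store"):
            memref += 1
        # Unfetched instr records do not affect either counter.
        out.append((kind, (instr, memref)))
    return out


# A rep string loop: one fetch, then no-fetch iterations with memory accesses.
rep_trace = [
    "instr",
    "instr", "load", "store",
    "instr_no_fetch", "load", "store",
    "instr_no_fetch", "load", "store",
]
```

In this sketch every load and store inside the loop gets a distinct pair, and because Python tuples compare lexicographically, a switch point such as `(2, 3)` identifies a spot in the middle of the rep string without relying on the unfetched instr records.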
The rep string solution may also serve to support running instruction-filtered traces or portions of traces in the scheduler with instruction quanta, skips, and replayed schedules.
Describe the bug
See https://groups.google.com/g/DynamoRIO-Users/c/Dsf4As64r7o
In some of my tests, I am seeing more than one cpu id on one output stream with MAP_TO_RECORDED_OUTPUT.
The value returned by stream->get_output_cpuid matches the initial CPU_ID marker values, but at some point on some streams, we see other CPU_ID values as shown below.
Example:
To Reproduce
Steps to reproduce the behavior:
drrun -t drcachesim -offline
Expected behavior
With the MAP_TO_RECORDED_OUTPUT option, each output stream maps to exactly one CPU
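The expectation can be expressed as a simple check. This is a sketch only: `find_cpu_mismatches` is a hypothetical helper operating on plain integers, not the scheduler_launcher check itself; in a real tool the first argument would come from the stream's `get_output_cpuid` and the list from the CPU_ID markers seen on that stream:

```python
def find_cpu_mismatches(output_cpuid, marker_cpuids):
    """Return (index, value) for each CPU_ID marker that differs from the output's CPU."""
    return [(i, c) for i, c in enumerate(marker_cpuids) if c != output_cpuid]
```

An empty result means the stream saw exactly one CPU; any returned entry points at a marker that violates the one-CPU-per-output expectation.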