Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#6635 core filter, part 6: Add core-sharded record filter output #6704

Merged
merged 12 commits into from
Mar 13, 2024

Conversation

derekbruening
Copy link
Contributor

Multiple changes to allow the record filter to operate in core-sharded fashion:

Makes the pc2encoding table per-input, as one input can migrate across multiple core shards and thus one core can see a later instruction without ever having seen its encoding. To handle synchronization, there is no C++11 std:: rwlock, so we use mutexes -- but we limit their use to per-context-switch for the added global lock, and we assume there is no contention for the per-input lock as only one shard operates on one input at any one time.

Sets the memref counter reader to core_sharded_ to avoid asserts.

Appends footer records to ending-in-idle-record cores.

Adds an error check ensuring a single workload, as multiple will require expanding the keys used in some tables.

Renames the output files to include "core.<shard_index>" and not the tid. This is surprisingly complex, as an input filename is needed to determine the output filename compression type: yet not all shards are guaranteed to have an input at the start. A condition variable and mutex are used to coordinate this among shards.

Adds support for started-idle cores by synthesizing headers in record_filter; #6703 covers having the scheduler do this for all analyzers. Adds the version as another field available up front from the scheduler, and adds an idle-tid sentinel needed to be distinct from INVALID_THREAD_ID.

Adds two end-to-end tests, one with a single-threaded app scheduled onto 4 cores to test start-idle cores and one to test multiple threads. Adds a macro to share code with the existing end-to-end test.

Updates the unit test mock classes.

Issue: #6635, #6703

Multiple changes to allow the record filter to operate in core-sharded
fashion:

Makes the pc2encoding table per-input, as one input can migrate across
multiple core shards and thus one core can see a later instruction
without ever having seen its encoding.  To handle synchronization,
there is no C++11 std:: rwlock, so we use mutexes -- but we limit
their use to per-context-switch for the added global lock, and we
assume there is no contention for the per-input lock as only one shard
operates on one input at any one time.

Sets the memref counter reader to core_sharded_ to avoid asserts.

Appends footer records to ending-in-idle-record cores.

Adds an error check ensuring a single workload, as multiple will
require expanding the keys used in some tables.

Renames the output files to include "core.<shard_index>" and not the
tid.  This is surprisingly complex, as an input filename is needed to
determine the output filename compression type: yet not all shards are
guaranteed to have an input at the start.  A condition variable and
mutex are used to coordinate this among shards.

Adds support for started-idle cores by synthesizing headers in
record_filter; #6703 covers having the scheduler do this for all
analyzers.  Adds the version as another field available up front from
the scheduler, and adds an idle-tid sentinel needed to be distinct
from INVALID_THREAD_ID.

Adds two end-to-end tests, one with a single-threaded app scheduled
onto 4 cores to test start-idle cores and one to test multiple
threads.  Adds a macro to share code with the existing end-to-end test.

Updates the unit test mock classes.

Issue: #6635, #6703
Copy link
Contributor

@abhinav92003 abhinav92003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still trying to understand some things. So will do another round of review.

clients/drcachesim/reader/reader.cpp Show resolved Hide resolved
clients/drcachesim/reader/reader.cpp Outdated Show resolved Hide resolved
clients/drcachesim/reader/reader.cpp Outdated Show resolved Hide resolved
clients/drcachesim/scheduler/scheduler.cpp Outdated Show resolved Hide resolved
clients/drcachesim/common/utils.h Show resolved Hide resolved
clients/drcachesim/tools/filter/record_filter.cpp Outdated Show resolved Hide resolved
clients/drcachesim/tools/filter/record_filter.cpp Outdated Show resolved Hide resolved
@derekbruening
Copy link
Contributor Author

Failure is #4167 "Invalid trace entry type thread_exit before a bundle"

@derekbruening derekbruening merged commit 6d7b1a4 into master Mar 13, 2024
15 of 16 checks passed
@derekbruening derekbruening deleted the i6635-core-sharded-record-filter branch March 13, 2024 00:25
derekbruening added a commit that referenced this pull request Mar 14, 2024
derekbruening added a commit that referenced this pull request Mar 15, 2024
Improves record_filter_t subclass support for initially-idle cores by
refactoring get_output_basename() out of initialize_shard_output(),
allowing a subclass to share the complex initial setup while still using
its own output scheme. Also moves the setup variable to protected for
access to output_ext_ in subclasses.

Removes code that was refactored into initialize_shard_output() but
accidentally left in place in PR #6704.

Tested internally.

Issue: #6635
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants