-
Notifications
You must be signed in to change notification settings - Fork 571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#1684 xarch-IR: Support separate host and target arch #4325
Conversation
Adds support for building with different host and target architectures. The goal is to support running drdisas, drdecode, drraw2trace, and drcachesim trace analyzers on one host machine while targeting recorded bytes or traces from a different type of machine: e.g., processing aarch64 traces on an x86_64 machine. The goal is *not* to turn DR into a cross-ISA binary translator: we only support standalone mode here, not full DR managed mode. Long-term it would be nicer to split out the DR standalone mode interfaces into separate libraries but that is beyond the scope of the current work. Adds a top-level CMake option variable "TARGET_ARCH" which defaults to CMAKE_SYSTEM_PROCESSOR but can be set to a different value. The regular arch variables/defines X86, AARCH64, and ARM are set from TARGET_ARCH, while new values DR_HOST_{X86,AARCH64,ARM} are set for the host. The bulk of the code is built targeting the TARGET_ARCH, with only assembly code and other limited code using the DR_HOST_* arch. This limits the scope of the code changes as compared to the opposite approach. A new define DR_HOST_NOT_TARGET is used to simply disable cases where the host and target and intertwined and difficult to easily separate, such as TLS handling. The key code changed to use DR_HOST_* includes: + *.asm files + arch_exports.h inline asm for atomics and other key utilities + system call defines (we don't want to have our raw syscall wrappers running incorrect syscall numbers: we'll end up with fork bombs or other madness) Code using built-in __*__ arch defines is changed to use our defines: + sigcontext.h, which is also augmented for some missing AArch64 types + x86_code_test.c Code reading ELF headers is relaxed to allow *any* arch, since we need the host for standalone init code but the target for raw2trace module mapping. Nearly all tests are disabled not just from running but also from building for simplicity, to reduce code changes here. TODO: Add new runsuite build TODO: Add new drcachesim xarch tests TODO: A couple of NOCHECK's in the diff Issue: #1684, #4318
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very quick look just to get an idea of what you are doing. Discussion to continue on issue page.
Adds a new option -alt_module_dir which specifies an alternate directory in which to look for libraries in a modules.log offline drcachesim file. Hooks the option up to the opcode_mix and view tools and to raw2trace (both its standalone launcher and drcachesim frontend integration). This will be useful in general, but is needed for adding tests with checked-in raw files and binaries, where the checked-in modules.log has a hardcoded path that will never match all test environments. Adds a test that runs both raw2trace and opcode_mix on a new checked-in aarch64 raw trace (this test will evolve for future cross-arch testing, which is the reason for using aarch64). The binaries are stripped, with the two libraries "fake" binaries: libmemtrace.so is not executed, and libdynamorio.so is only used for dynamorio_sigreturn, so these are synthetic binaries (created from asm files) to keep them smaller. The tests deliberately fail on x86 and match an error message that will improve when architecture tags are added. Issue: #4318
Adds architecture tags to raw traces, propagated to final traces in a new marker type. Adds reader support for the new type. Adds a check for unhandled architectures to raw2trace. Adds checks for unhandled architectures to the opcode_mix and view tools. To reach that point, relaxes the ELF header checks to allow libraries of the same bitwidth but different architectures (i#1345 prevents allowing all libraries), so we can map in the other-arch libraries. Updates the existing aarch64 raw trace files to contain architecture tags, and updates the x86 failure message. Cross-architecture handling of these same files was tested manually with a prototype which will be committed in the future. Issue: #4318, #1345
Adds a new gzip wrapper around std::istream to support compressing .raw files. This lets us easily shrink checked-in test files, and aids other relocate-then-process situations with raw files. Updates the altbin test by compressing its .raw files. Issue: #4318
+ Fix misc host==target build issues from prototype + Fix drsyms arch + Fix asm to properly define in all cases, which requires passing in client flags who have no configure.h, and squashing the same flags for api.* tests + Remove no-longer-necessary IF_HOST from ELF code + Avoid a64 double-def sigcxt + Add fatal error if host and target bitwidths don't match (i#1345), and properly set the target bitwidth (don't base on compiler). + Enable altbindir test as xarch test + Finalize test suite approach: enabled 6 tests; used macro to set up tests and then return() to avoid if() on the other ~350 test setup cmake code Issue: #1345
…alt_module_dir_/;
…st-vs-default-target-arch
…commit: will add Windows drdecode in a separate step
@johnfxgalea since you already took a preliminary look: requesting a review. I separated Windows out and it will be a separate PR. I've got a new Travis job for aarch64-on-x86. |
No problem, happy to take a look. However, just a heads up, I don't think I'll have time to review this today but I'll try. |
Here's the key test in the new Travis job for drcachesim:
So it's x86 drraw2trace and opcode_mix analyzers operating on aarch64 traces. Pretty cool. |
Sure, thank you. There is not all that much beyond what you saw before: setting up the key drcachesim test required several other changes which I put in separately with separate reviews (PR's #4330, #4331, #4332). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to update the wiki/documentation detailing how to target an architecture that is different from the host.
From my understanding, you do this by setting the cflag -DTARGET_ARCH right?
Yes, that will have to be separate from this PR. I was going to get Windows working as well (though no good solution to the zero-length array issue except more ifdefs) first. |
…arifying comments; add drdisas test
Also, here I am only testing host=x86_64 target=aarch64. I think more work is needed for sthg like host=i386 target=arm (b/c of the -mthumb compiler flags). I don't want to add that support in this PR for sure: best to separate, and I don't think it affects what was done here: would be additional. Maybe an up-front check should be added though for what combinations are supported. |
Added a check for ARM up front, which is better than a compiler error later. |
PTAL |
Appveyor failure is client.winxfer which is a flaky test: #4058. |
Thank you for the detailed review! |
Adds support for building with different host and target
architectures. The goal is to support running drdisas, drdecode,
drraw2trace, and drcachesim trace analyzers on one host machine while
targeting recorded bytes or traces from a different type of machine:
e.g., processing aarch64 traces on an x86_64 machine. The goal is
not to turn DR into a cross-ISA binary translator: we only support
standalone mode here, not full DR managed mode. Long-term it would be
nicer to split out the DR standalone mode interfaces into separate
libraries but that is beyond the scope of the current work.
Adds a top-level CMake option variable "TARGET_ARCH" which defaults to
CMAKE_SYSTEM_PROCESSOR but can be set to a different value. The
regular arch variables/defines X86, AARCH64, and ARM are set from
TARGET_ARCH, while new values DR_HOST_{X86,AARCH64,ARM} are set for
the host.
The bulk of the code is built targeting the TARGET_ARCH, with only
assembly code and other limited code using the DR_HOST_* arch. This
limits the scope of the code changes as compared to the opposite
approach. A new define DR_HOST_NOT_TARGET is used to simply disable
cases where the host and target and intertwined and difficult to
easily separate, such as TLS handling.
The key code changed to use DR_HOST_* includes:
incorrect syscall numbers: we'll end up with fork bombs or other madness)
Code using built-in * arch defines is changed to use our defines:
Code reading ELF headers is relaxed to allow any arch, since we need
the host for standalone init code but the target for raw2trace module
mapping.
Nearly all tests are disabled not just from running but also from
building for simplicity, to reduce code changes here.
TODO: Add new runsuite build
TODO: Add new drcachesim xarch tests
TODO: A couple of NOCHECK's in the diff
Issue: #1684, #4318