i#2350 rseq: Use a local copy for native execution #3826

derekbruening · 2019-09-10T20:18:22Z

Eliminates the call-return reliance for the native execution step of
rseq support. Makes a local copy of the sequence right inside the
sequence-ending block and executes it. The sequence is inserted as
additional instructions and is then mangled normally (mangling changes
are assumed to be restartable), but it is not passed to clients. Any
exits are regular block exits, resulting in a block with many exits.

The prior call-return scheme is left under a temporary option
-rseq_assume_call, as a failsafe in case there are stability problems
discovered with this native execution implementation. Once we are
happy with the new scheme we can remove the option.

To make the local copy an rseq region, the per-thread rseq_cs address
is identified by watching system calls. For attach, it is identified
by searching the possible static TLS offsets. The assumption of a
constant offset is documented and verified.
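
As a rough illustration of the constant-offset assumption (not the code in this PR), the sketch below records the offset of the app's `struct rseq` from the thread's TLS base the first time a registration is observed, and checks that later threads agree. The type and function names (`app_rseq_t`, `rseq_tls_offset_t`, `observe_rseq_registration`, `rseq_cs_field_addr`) are hypothetical; the field layout mirrors the original Linux `struct rseq`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the original Linux struct rseq layout (the kernel ABI also
 * requires 32-byte alignment, which this sketch does not enforce). */
typedef struct {
    uint32_t cpu_id_start;
    uint32_t cpu_id;
    uint64_t rseq_cs; /* Pointer to the active struct rseq_cs, or 0. */
    uint32_t flags;
} app_rseq_t;

/* Hypothetical bookkeeping for the assumed-constant TLS offset. */
typedef struct {
    bool known;
    ptrdiff_t offset; /* Offset of struct rseq from the thread's TLS base. */
} rseq_tls_offset_t;

/* Called when a thread is seen registering rseq_addr via the rseq syscall.
 * Returns false if the offset differs from the one recorded earlier. */
static bool
observe_rseq_registration(rseq_tls_offset_t *state, uintptr_t tls_base,
                          uintptr_t rseq_addr)
{
    ptrdiff_t offset = (ptrdiff_t)(rseq_addr - tls_base);
    if (!state->known) {
        state->known = true;
        state->offset = offset;
        return true;
    }
    return state->offset == offset;
}

/* Address of the rseq_cs pointer field for a thread with the given TLS base. */
static uintptr_t
rseq_cs_field_addr(const rseq_tls_offset_t *state, uintptr_t tls_base)
{
    return tls_base + (uintptr_t)state->offset + offsetof(app_rseq_t, rseq_cs);
}
```

For attach, where no system call is observed, the same check could presumably be run against each candidate static TLS offset until one matches.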

The rseq_cs's abort handler is a new exit added with the app's
signature as data just before it, hidden in the operands of a nop
instruction to avoid problems with decoding the fragment. A local
jump skips over the data and exit.
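
The kernel requires the 4 bytes immediately before the abort handler to match the signature passed at rseq registration. A minimal sketch of hiding that signature in a nop's operand, assuming the 7-byte x86 form `0F 1F 80 disp32` (the exact encoding used in this PR may differ), with hypothetical helper names and an example signature value:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes a 7-byte nop (0F 1F /0 with a 32-bit displacement) whose
 * displacement bytes hold sig. Returns the number of bytes written.
 * The caller places the abort handler directly after these bytes. */
static size_t
emit_signature_nop(uint8_t *pc, uint32_t sig)
{
    pc[0] = 0x0f;
    pc[1] = 0x1f;
    pc[2] = 0x80; /* ModRM: mod=10 (disp32), reg=0, rm=0. */
    memcpy(pc + 3, &sig, sizeof(sig));
    return 3 + sizeof(sig);
}

/* Mimics the kernel's check: the 4 bytes just before the abort handler
 * at code[abort_offs] must equal the registered signature. */
static bool
signature_precedes_abort(const uint8_t *code, size_t code_len, size_t abort_offs,
                         uint32_t sig)
{
    uint32_t found;
    if (abort_offs < sizeof(found) || abort_offs > code_len)
        return false;
    memcpy(&found, code + abort_offs - sizeof(found), sizeof(found));
    return found == sig;
}
```

Because the signature sits in the operand of a valid instruction, a decoder walking the fragment sees an ordinary nop rather than raw data.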

A new rseq_cs structure is allocated for each sequence-ending
fragment. It is stored in a hashtable in the rseq module, to avoid
complexities and overhead of adding an additional fragment_t or
"subclass" field. A new flag is set to trigger calling into the rseq
module on fragment deletion.

The rseq_cs fields are filled in via a new post-emit control point,
using information stored in labels during mangling. The pointer to
the rseq_cs is inserted with a dummy value and patched in this new
control point using a new utility routine patch_mov_immed_ptrsz().
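
To illustrate the insert-then-patch idea, here is a simplified sketch for x86-64 only. It is not the actual `patch_mov_immed_ptrsz()` implementation; the helper names are hypothetical and it handles just the `REX.W + B8+rd imm64` encoding of a pointer-sized move.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOV_IMM64_LEN 10 /* REX.W prefix + opcode + 8-byte immediate. */

/* Emits "mov rax, imm64" at pc. Returns the instruction length. */
static size_t
encode_mov_rax_imm64(uint8_t *pc, uint64_t imm)
{
    pc[0] = 0x48; /* REX.W */
    pc[1] = 0xb8; /* B8+rd with rd=0 (rax). */
    memcpy(pc + 2, &imm, sizeof(imm));
    return MOV_IMM64_LEN;
}

/* Overwrites the immediate of an existing 64-bit mov-immediate at pc.
 * Returns 0 on success, or -1 if pc does not hold the expected encoding. */
static int
patch_mov_imm64(uint8_t *pc, uint64_t new_imm)
{
    if ((pc[0] & 0xf8) != 0x48 || (pc[1] & 0xf8) != 0xb8)
        return -1;
    memcpy(pc + 2, &new_imm, sizeof(new_imm));
    return 0;
}

/* Reads back the immediate, e.g. for verification after patching. */
static uint64_t
read_mov_imm64(const uint8_t *pc)
{
    uint64_t imm;
    memcpy(&imm, pc + 2, sizeof(imm));
    return imm;
}
```

The move is first emitted with a dummy immediate, and once the final address of the rseq_cs structure is known it is patched in place. A real implementation also has to ensure no other thread can execute the instruction while it is being rewritten.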

To avoid crashing due to invalid rseq bounds after freeing the rseq_cs
structure, the rseq pointer is cleared explicitly on completion, and
on midpoint exit by the fragment deletion hook along with a hook on
the shared fragment flushtime update, to ensure all threads are
covered.

The rseq test is augmented and expanded. An invalid instruction is
added to properly test the abort handler, under a conditional to allow
testing each sequence both to completion and on abort.

Future work is properly handling a midpoint exit during the
instrumentation execution: we need to invoke the native version as
well.

Adding aarchxx support is also future work: the
patch_mov_immed_ptrsz(), the writes to the rseq struct in TLS, and the
rseq tests are currently x86-only.

Issue: #2350


derekbruening · 2019-09-10T20:20:37Z

Unfortunately this is on the large side. Some small pieces could be split out, like the ATOMIC_1BYTE_WRITE, but to split out anything sizable would require committing some unexecuted code. Let me know if it's too much to review at once. The description should cover the key changes. I still wanted to update the wiki page with more details on all of the design decisions. Some of them we discussed offline so they should be familiar.

Carrotman42 · 2019-09-10T22:00:28Z

The CL size is not a big deal, thanks for the warning though :) I'll start taking a look

derekbruening · 2019-09-10T22:43:09Z

Spent some time looking at the AArch64 failures: a bunch of tests that all load app libraries fail with the assert core/utils.c:685 null_ok. I don't seem to have access to the Jenkins machine, but I do have access to a Packet machine, and I can reproduce the assert there. However, after failing to find a connection to this PR, I also tried running HEAD, and there those app-lib tests also have problems: they time out, hanging at the same point as the assert in this run. Yet the previous build on Jenkins has these tests passing. Still investigating, but given the HEAD behavior I suspect something unrelated to this PR.

derekbruening · 2019-09-10T23:25:36Z

The HEAD hang looks related to #1698: I'm seeing ldaxr involved.

derekbruening · 2019-09-11T04:55:16Z

Figured out the cause of the AArch64 issues: fixed in 7906b7f. The hangs are still there on this other machine. I expect Jenkins to be green now.

api/docs/bt.dox

core/arch/arch_exports.h

core/arch/mangle_shared.c

core/unix/rseq_linux.c

core/arch/x86/mangle.c

core/unix/rseq_linux.c

suite/tests/linux/rseq.c

derekbruening · 2019-09-11T21:47:45Z

PTAL

derekbruening requested a review from Carrotman42 September 10, 2019 20:20

derekbruening added the Google-Verified label Sep 10, 2019

Do not clobber TLS when rseq is not enabled

7906b7f

Carrotman42 requested changes Sep 11, 2019

Address reviewer comments.

51d5961

Carrotman42 approved these changes Sep 11, 2019

Fix incorrect complaint about a zero offset

d49012a

derekbruening merged commit d465def into master Sep 12, 2019

derekbruening deleted the i2350-rseq-run-copy branch September 12, 2019 01:33
