Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[doc] sys.addaudithook() documentation should be more explicit on its limitations #87604

Closed
vstinner opened this issue Mar 8, 2021 · 12 comments
Labels
3.10 only security fixes docs Documentation in the Doc dir

Comments

@vstinner
Copy link
Member

vstinner commented Mar 8, 2021

BPO 43438
Nosy @vstinner, @tiran, @zooba, @zkonge, @gousaiyang, @frankli0324

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-03-08.20:26:14.280>
labels = ['3.10', 'docs']
title = '[doc] sys.addaudithook() documentation should be more explicit on its limitations'
updated_at = <Date 2021-03-12.23:55:44.013>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2021-03-12.23:55:44.013>
actor = 'steve.dower'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2021-03-08.20:26:14.280>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 43438
keywords = []
message_count = 10.0
messages = ['388299', '388301', '388304', '388305', '388313', '388462', '388476', '388490', '388517', '388569']
nosy_count = 7.0
nosy_names = ['vstinner', 'christian.heimes', 'docs@python', 'steve.dower', 'zkonge', 'gousaiyang', 'frankli']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue43438'
versions = ['Python 3.10']

@vstinner
Copy link
Member Author

vstinner commented Mar 8, 2021

Recently, the PEP-578 audit hooks was used to build a Capture The Flag (CTF) security challenge, AntCTF x D^3CTF: https://d3ctf.io/

Multiple issues have been reported to the Python Security Response Team (PSRT) from this challenge. It seems like there was a misunderstanding on the intent of the PEP-578.

Building a sandbox using audit hooks is *explicitly* excluded from the PEP-578 design:
https://www.python.org/dev/peps/pep-0578/#why-not-a-sandbox

See also the PEP-551 for more details.

The problem is that these two PEPs are not well summarized in the Python documentation, especially in the sys.addaudithook() documentation:
https://docs.python.org/dev/library/sys.html#sys.addaudithook

The documentation should better describe limitations of audit hooks, and may also point to these two PEPs for more information (PEP-578 is already mentioned).

The bare minimum should be to explicitly say that it should not be used to build a sandbox.

By design, audit events is a whack a mole game. Rather than starting from a short "allow list", it is based on a "deny list", so it cannot be safe or complete by design. Every "forgotten" audit event can be "abused" to take the control on the application. And that's perfectly *fine*. It should just be documented.

@vstinner vstinner added the 3.10 only security fixes label Mar 8, 2021
@vstinner vstinner added the docs Documentation in the Doc dir label Mar 8, 2021
@vstinner
Copy link
Member Author

vstinner commented Mar 8, 2021

See also bpo-43439: [security] Add audit events on GC functions giving access to all Python objects.

@zooba
Copy link
Member

zooba commented Mar 8, 2021

To clarify my position on this (as the PEP author):

  • audit hooks added *after* initialization (including via the Python API) are not intended for security, but for logging/debugging, and so bypasses are not considered security issues
  • audit hooks added *before* Python is initialized should not be able to be bypassed *without* prior events indicating that a bypass is going to occur. Ways of bypassing/removing them without prior indicators should be reported as security issues

And note that all compile()d, imported or exec()d code should have been collected, which means any security bypass has to happen without arbitrary code execution.

These hooks are only one tool necessary to create a more secured environment, not the whole thing. (And note that I said "more secured" not "secure", because it's only as secure as you make it. The relative descriptor is deliberate.)

@tiran
Copy link
Member

tiran commented Mar 8, 2021

I agree with both of you.

The documention should explicitly state that the audit hooks are for auditing. They are not designed to sandbox Python. When used correctly, they can help to capture and analyze an event post-mortem.

The documentation of sys.addaudithook() should also mention that hooks may not trigger under some conditions, e.g. very late in the interpreter shutdown phase.

@vstinner
Copy link
Member Author

vstinner commented Mar 8, 2021

Another example of recent Capture The Flag challenge which used audit hooks: https://bugs.python.org/issue42800#msg384143

@frankli0324
Copy link
Mannequin

frankli0324 mannequin commented Mar 10, 2021

PEP-551 is confusing. It looked suggesting that it's a "security tool" that "detects, identifies and analyzes misuse of Python" to me (and apparently many others).

examples shown in the PEP includes WannaCrypt, APTs, all of which involves the good old remote code execution, which is basically a sandboxed environment it self, at least in some way.

also, the challenges provided the contestants with a "background story" that enables an attacker to execute arbitrary code doesn't mean that one HAVE to gain code execution to achieve the goal of bypassing the aevents. in this case, one only have to find the list object which contains the audit hooks registered, and clear it(or replace it). this clearly breaks the promise made in PEP-578 (Hooks cannot be removed or replaced). THIS SHOULD BE FIXED.

ALSO(again), the software is not always doing what it's designed to do. maybe, I mean maybe, developers should make changes according to what users are doing. I don't know, really.

@gousaiyang
Copy link
Mannequin

gousaiyang mannequin commented Mar 11, 2021

We understand that audit hooks should not be used to build a sandbox with Python. It is natural for audit hooks to appear in CTF challenges though, as many CTF challenges intentionally try to use a wrong way to secure a system (and let players prove it wrong).

With that being said, audit hooks should still be robust, even for logging purposes. We are no trying to prevent all kinds of malicious behaviors, but we want to detect them *as much as possible*. If audit hooks can be easily removed while triggering very few seemingly non-sensitive audit events (in this CTF challenge, only "import gc" is triggered, which probably looks "no so suspicious"), this allows attackers to hide details of further malicious behavior without being audited, which violated the motivation of audit hooks (to increase security transparency).

The recent gc patch introduced new events which will make the attack in that CTF challenge look more suspicious. But probably it is still better to harden the current data structure used to store per interpreter audit hooks. If an attacker happens to gain a reference to the list holding the hooks (although I'm not sure how that will still be possible without using gc), they can easily remove the hooks at the Python language level. Probably a Python tuple is already better than a Python list to store the hooks, since tuples are immutable at the language level. Although that means we should build new a tuple each time a new hook is added.

If the hook itself is fragile (e.g. a hook written in Python which relies on global variables), it is a user fault. But if the hook function itself is good, it shouldn't be too easy to remove. Any successful attempts to remove the hook must have already "pwned" the Python interpreter (i.e. gained arbitrary memory read/write or native code execution ability), either by using ctypes, by open('/proc/self/mem'), by crafting bytecode (which triggers code.__new__) or importing modules written in native code. (Overwriting hook.__code__ triggers object.__setattr__.)

@tiran
Copy link
Member

tiran commented Mar 11, 2021

Python's dynamic nature makes it hard to implement and reason about audit hooks written in Python. sys.addaudithook() is really only design for testing, debugging, and playing around with auditing. You absolutely have to write a custom interpreter if you want to take auditing serious.

Please also keep in mind that sys.addaudithook() does **not** add a global hook. The function adds a per-interpreter hook. It just looks global to most people because a process typically has just one interpreter. I have filed bpo-43472 to track the issue.

$ cat auditsub.py 
import sys
import _xxsubinterpreters
def hook(*args):
    print(args)

sys.addaudithook(hook)

import os
os.system('echo main interpreter')

sub = _xxsubinterpreters.create()
_xxsubinterpreters.run_string(sub, "import os; os.system('echo you got pwned')", None)
$ ./python auditsub.py 
('os.system', (b'echo main interpreter',))
main interpreter
you got pwned

@gousaiyang
Copy link
Mannequin

gousaiyang mannequin commented Mar 11, 2021

Please also keep in mind that sys.addaudithook() does **not** add a global hook. The function adds a per-interpreter hook.

Yes, I'm aware of this. And this should be better documented. When I was playing around with audit hooks and reading the source code, I see hooks written in Python (added via sys.addaudithook) are per-interpreter (stored in a Python list object as part of per-interpreter state) while hooks written in C (added via PySys_AddAuditHook) are global (stored in a C linked list). C hooks are more robust. The Python hooks are not inherited in a child process created via multiprocessing when the start method is "spawn" or "forkserver" (not "fork") since they are per-interpreter. I'm not sure whether we should audit the creation of new processes/threads via multiprocessing/threading modules (and also creation of sub-interpreters when PEP-554 gets accepted).

@zooba
Copy link
Member

zooba commented Mar 12, 2021

If someone offers a patch for replacing the list of per-interpreter hooks with something not easily discoverable via gc, I'm sure we'd take it. It's all internal, so just hiding it from the list of bases should be fine (there should never be more than one reference), but turning it into a new form of dynamically expanding array/list would also work, providing whoever reviews it doesn't think it's adding too much complexity/instability.

But file that as a separate issue, please. It's somewhat debatable, while fixing the docs _needs_ to happen. (And I will get to it eventually if nobody else does, but I'm just as happy to review someone else's contribution. And MUCH happier to review a contribution than to argue about it on the tracker :) )

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
zooba added a commit to zooba/cpython that referenced this issue Nov 11, 2022
zooba added a commit to zooba/cpython that referenced this issue Nov 11, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 11, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 11, 2022
miss-islington added a commit that referenced this issue Nov 11, 2022
miss-islington added a commit that referenced this issue Nov 11, 2022
ethanfurman pushed a commit to ethanfurman/cpython that referenced this issue Nov 12, 2022
zooba added a commit that referenced this issue Nov 14, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 14, 2022
… hooks via the gc module (pythonGH-99373)

(cherry picked from commit 4e4b13e)

Co-authored-by: Steve Dower <[email protected]>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 14, 2022
… hooks via the gc module (pythonGH-99373)

(cherry picked from commit 4e4b13e)

Co-authored-by: Steve Dower <[email protected]>
@zooba
Copy link
Member

zooba commented Nov 14, 2022

Have now pushed the doc fix and the GC fix (which should backport all the way through security fixes, but we'll see...)

@vstinner
Copy link
Member Author

Thanks @zooba!

miss-islington added a commit that referenced this issue Nov 15, 2022
… via the gc module (GH-99373)

(cherry picked from commit 4e4b13e)

Co-authored-by: Steve Dower <[email protected]>
miss-islington added a commit that referenced this issue Nov 15, 2022
… via the gc module (GH-99373)

(cherry picked from commit 4e4b13e)

Co-authored-by: Steve Dower <[email protected]>
CuriousLearner added a commit to CuriousLearner/cpython that referenced this issue Nov 16, 2022
* main: (8272 commits)
  Update Windows readme.txt to clarify Visual Studio required versions (pythonGH-99522)
  pythongh-99460 Emscripten trampolines on optimized METH_O and METH_NOARGS code paths (python#99461)
  pythongh-92647: [Enum] use final status to determine lookup or create (pythonGH-99500)
  pythongh-81057: Move Globals in Core Code to _PyRuntimeState (pythongh-99496)
  Post 3.12.0a2
  pythongh-99300: Use Py_NewRef() in Python/Python-ast.c (python#99499)
  pythongh-93649: Split pytime and datetime tests from _testcapimodule.c (python#99494)
  pythongh-99370: fix test_zippath_from_non_installed_posix (pythonGH-99483)
  pythonGH-99205: remove `_static` field from `PyThreadState` and `PyInterpreterState` (pythonGH-99385)
  pythongh-81057: Move the Remaining Import State Globals to _PyRuntimeState (pythongh-99488)
  pythongh-87604: Avoid publishing list of active per-interpreter audit hooks via the gc module (pythonGH-99373)
  pythongh-93649: Split getargs tests from _testcapimodule.c (python#99346)
  pythongh-81057: Move Global Variables Holding Objects to _PyRuntimeState. (pythongh-99487)
  pythonGH-98219: reduce sleep time in `asyncio` subprocess test (python#99464)
  pythonGH-99388: add `loop_factory` parameter to `asyncio.run` (python#99462)
  pythongh-99300: Use Py_NewRef() in PC/ directory (python#99479)
  pythongh-99300: Use Py_NewRef() in Doc/ directory  (python#99480)
  pythongh-99300: Use Py_NewRef() in Modules/ directory (python#99473)
  pythongh-99300: Use Py_NewRef() in Modules/ directory (python#99469)
  pythongh-99370: Calculate zip path from prefix when in a venv (pythonGH-99371)
  ...
ambv pushed a commit that referenced this issue Nov 21, 2022
ambv pushed a commit that referenced this issue Nov 21, 2022
… hooks via the gc module (GH-99373) (GH-99661)

(cherry picked from commit 7b98207)

Co-authored-by: Steve Dower <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes docs Documentation in the Doc dir
Projects
None yet
Development

No branches or pull requests

3 participants