Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soundness Issues with syscalls #6

Closed
toshipiazza opened this issue Feb 22, 2018 · 4 comments
Closed

Soundness Issues with syscalls #6

toshipiazza opened this issue Feb 22, 2018 · 4 comments

Comments

@toshipiazza
Copy link
Owner

Syscalls such as read, recv, uname, fstat{64,}, getdents{64,}, etc all write to a passed-in buffer. If we don't explicitly set taint over these buffers after the syscalls succeeds, we could have a buffer that is marked tainted but in reality isn't.

Should we handle this in drtaint proper?

Syscalls such as these currently cause a lot of coreutils to fail, for example ls -alh.

@vanhauser-thc
Copy link

yes you should as otherwise the tainter is worthless.
read, recv, recvfrom are the most important syscalls to hook for tainting.
And it makes sense to hook open and accept4 as well, as e.g. if you only want to taint network traffic that can be received with read() too, so you would need to track read() for the file descriptor that came with accept4().

have a look at this project which is based on Pin:
https://github.com/BinaryAnalysisPlatform/bap-pintraces

it is not a lot of work.

@toshipiazza
Copy link
Owner Author

Thanks for the suggestions!

read, recv, recvfrom are the most important syscalls to hook for tainting.

Agreed, we implement at least these and a few others, but not enough that our analysis is sound just yet (i.e. coreutils binaries routinely yield false positives).

And it makes sense to hook open and accept4 as well, as e.g. if you only want to taint network traffic that can be received with read() too, so you would need to track read() for the file descriptor that came with accept4().

Maybe we're not exactly on the same page--this technique seems to be aimed at tainting user input (possibly for the purposes of checking for control-flow infractions as in TaintCheck and TaintTrace). However, DrTaint should be a general library that allows users to taint what they want, for example at a syscall event.

The reason I call our analysis "unsound" is due to syscalls which overwrite some userspace buffer, for example uname. If we don't clear this taint (or subsequently set it, as a user might be interested in the information produced by uname for whatever reason) then the taint value of whatever is currently on the stack is preserved, which shouldn't happen, despite it having been overwritten.

@vanhauser-thc
Copy link

for very selfish reasons I just would like this tainter to be as flexible as possible and applicable to real-world problems :)
(and then add myself Intel x86 and x64 to it)

the many syscalls are an issue as it is a lot of effort to implement them, true.

@toshipiazza
Copy link
Owner Author

Just a heads up @vanhauser-thc , but potentially around April I plan on contributing the DrTaint library in some form to the main DrMemory project under the name DrTracker, XREF DynamoRIO/drmemory#825.

Derek has expressed interest in also splitting out the taint analysis code used for uninitialized read tracking for DrMemory (x86 only) and contributing it to DrTracker, if you are interested :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants