Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

every vuln chain #81

Merged
merged 23 commits into from
Mar 27, 2018
Merged

every vuln chain #81

merged 23 commits into from
Mar 27, 2018

Conversation

KevinHock
Copy link
Collaborator

@KevinHock KevinHock commented Feb 28, 2018

Motivation

What this PR does is change the way PyT finds vulnerabilities, as mentioned in the docs and the thesis, PyT originally used reaching definitions -- slightly modified to keep reassignments going (i.e. keeping the definition reaching) and, ran this slightly modified reaching definitions on every 'secondary' node in addition to the source.

The thing I wanted to do, was handle 'blackbox' function calls well, does url_for with a tainted argument yield a tainted value? What about os.path.join? Or some_random_web_framework_function? Having an answer would have helped us in our last evaluation.

There is a problem with only knowing that a definition reaches a sink, and not how. You can know arg = request.args.get('foo') reaches subprocess.check_output, but if there are multiple paths from source to sink, there may be one path that's vulnerable and one that's not. i.e. If there is a blackbox node that doesn't propagate taint or a sanitizer along the path we don't want to say it "might" not be vulnerable just because we don't know the path. You can see an example in the multi_chain.py test with corresponding mapping. I use 'path' and 'chain' interchangeably.

It's also confusing IMHO to just see all of the secondary nodes in the output as opposed to the ones that lead to the vulnerability, I tried to remedy this by implementing the trim option, which stepped backwards along one path, but that might not be the vulnerable path. Though I still left the default as secondary the nodes if you don't have the trim or interactive flag on.

This also gives us the opportunity to ask the user about if a blackbox function we haven't encountered propagates taint, (if they set the interactive command line option) and then save the mapping to a config, since we can't know all the web frameworks functions in the world.

How It Works

The way this works is by, after knowing that either a source or a secondary node reaches a sink (i.e. doing what we've always done), we construct a definition-use mapping, and traversing through the uses of the source, then the defs of those uses, and so on and so forth until the use is the sink. This was a great opportunity to use yield from and recursion, once we have a 'chain' we pass it to how_vulnerable which checks the chain against the blackbox mapping and sanitizers, if it comes back as FALSE we loop through the other chains and call how_vulnerable on until we find a vulnerable one.

We now use a vuln_factory and pass a dictionary, vuln_deets, as args.

Additional details

We can tell it's definitely sanitized when the sanitizer is an assignment node. Right now we can only say potentially if the sanitizer is an if statement, this is because we're currently not smart enough to tell if e.g. the condition leads to a return. I am okay with this.

Future Work

  • Being able to tell if the tainted argument is at the beginning or end of the string that ends up in the sink.
def client_passport():
    code = request.args.get('code')
    uri = 'http://localhost:5000/oauth?response_type=%s&client_id=%s&redirect_uri=%s' %(code,client_id,redirect_uri)
    return redirect(uri)

was a false-positive in our last evaluation due to this, this is also helpful for SSRF and other vulnerabilities.

  • Currently in this PR we stop and return a vulnerability if it's not FALSE, I'll change this in the future to be better and keep on going if the first thing we find is sanitized or unknown (and the e.g. 2nd thing we find is a TRUE vulnerability.)

  • JSON output

  • Maybe using textwrap for the vulnerability descriptions, only if it looks better.

  • I also might implement a command line option to list all of the vulnerable paths, though this is low-priority as it won't help us in our future evaluations.

@KevinHock KevinHock mentioned this pull request Feb 28, 2018
@KevinHock KevinHock added the cool label Mar 2, 2018
…ctory + alphabetize enums, make all end line comments have 2 spaces not 1, add AssignmentCallNode docstrings and DRYness, cleanup dead SinkArgsError code, rename vulnerability_log to vulnerability_helper, refactor all [0:..] to [:..]
…Visiting during an AssignmentCallNode, Rinsed build_def_use_chain, Wiped get_uses
…en we can default arg, Wipe old is_sanitised tests
…def_args_in_temp, make all tests pass, change double quotes to single quotes, cleanup stmt_star_handler/connect_if_allowed/comments/docstrings
@KevinHock
Copy link
Collaborator Author

LGTM

@KevinHock
Copy link
Collaborator Author

👍

@KevinHock KevinHock merged commit 558d753 into master Mar 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants