[WIP] Fix false-negatives and false-positives #99
…-p's f-n's and the infinite loop in get_first_node
if not url:
    abort(400)

if is_image(url):
This one seems bypass-able, so it is not a false-positive, but it easily could have been. To do this well we would need to be able to solve predicates, via something like Z3str3. I feel these are acceptable for the foreseeable future.
…(rhs_vars), Move add_blackbox_or_builtin_call to expr_visitor where it should be, and CALL_IDENTIFIER to expr_visitor_helper, some flake8 on vulnerabilities_test
pyt/expr_visitor.py
# .args is only used in get_sink_args
# We make a new list because right_hand_side_variables is extended in assignment_call_node
call_node.right_hand_side_variables = rhs_vars
call_node.args = list(rhs_vars)
If we did not make a new list, things like this would happen:
> User input at line 8, trigger word "request.args.get(":
¤call_1 = ret_request.args.get('image_name')
Reassigned in:
File: example/vulnerable_code/path_traversal_sanitised.py
> Line 8: image_name = ¤call_1
File: example/vulnerable_code/path_traversal_sanitised.py
> reaches line 10, trigger word "replace(":
¤call_2 = ret_image_name.replace('..', '')
though the problem on master was that we didn't include ~call_N in .args, and thus in the return value of get_sink_args, and so we had a false-negative that's now fixed. However, or_in_sink.py remains unsolved for the time being.
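The aliasing problem described above can be reproduced in isolation. This is a minimal sketch, not pyt's actual code: `FakeCallNode` and `build` are hypothetical stand-ins, and only the two attribute names (`right_hand_side_variables`, `args`) come from the diff.

```python
class FakeCallNode:
    """Hypothetical stand-in for a pyt call node (assumption, not the real class)."""

    def __init__(self):
        self.right_hand_side_variables = []
        self.args = []


def build(rhs_vars, copy_args):
    node = FakeCallNode()
    node.right_hand_side_variables = rhs_vars
    # list(rhs_vars) is a shallow copy, so later in-place extends do not leak into .args
    node.args = list(rhs_vars) if copy_args else rhs_vars
    # Simulate a later step extending right_hand_side_variables in place
    node.right_hand_side_variables.extend(['extra_var'])
    return node


aliased = build(['image_name'], copy_args=False)
copied = build(['image_name'], copy_args=True)
```

Without the copy, `aliased.args` picks up `'extra_var'` as well, which is the kind of spurious sink argument shown in the report above; with the copy, `copied.args` stays as it was when assigned.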
The reason or_in_sink.py doesn't pass yet is that we don't have an isinstance check for BoolOp in the 'loop through args' code in visit_Call. That should be easy enough to add.
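A rough illustration of the kind of check meant here, using only the standard `ast` module. This is a sketch under assumptions, not pyt's `visit_Call`: the helper names `flatten_boolops` and `names_in_call_args` are hypothetical, and the loop only collects plain variable names.

```python
import ast


def flatten_boolops(node):
    """Yield the operands of (possibly nested) BoolOps; any other node yields itself."""
    if isinstance(node, ast.BoolOp):
        for value in node.values:
            yield from flatten_boolops(value)
    else:
        yield node


def names_in_call_args(source):
    """Return the variable names passed as positional arguments to the outermost call in `source`."""
    # ast.walk is breadth-first, so the first Call found is the outermost one
    call = next(n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Call))
    names = []
    for arg in call.args:
        # Without the BoolOp handling, `a or b` would be skipped entirely
        for operand in flatten_boolops(arg):
            if isinstance(operand, ast.Name):
                names.append(operand.id)
    return names
```

For example, `names_in_call_args("redirect(a or b and c, d)")` visits all four names, whereas a loop that only accepts `ast.Name` arguments directly would see just `d`.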
So I was halfway done with …
… vulnerabilities_test.py
…ALL_IDENTIFIER back to what it is on master
WHAT IS LEFT
- Write more tests, especially ones with many BoolOps, e.g. like in if_else_in_sink.py
- Make existing tests pass
- Make ControlFlowExpr inherit nicely from namedtuple
- Clean up ControlFlowNode.connect and ControlFlowExpr.connect
random:
An aside:
AFTER WORK
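For the namedtuple item, one common pattern is to subclass the generated tuple type and add methods to it. The sketch below is only an assumption about what "inherit nicely" could look like: `ControlFlowExprSketch`, `SketchNode`, and the field names `test` and `last_nodes` are all hypothetical and not taken from pyt.

```python
from collections import namedtuple


class SketchNode:
    """Hypothetical minimal CFG node with a list of outgoing edges."""

    def __init__(self, label):
        self.label = label
        self.outgoing = []

    def connect(self, successor):
        self.outgoing.append(successor)


class ControlFlowExprSketch(namedtuple('ControlFlowExprSketch', ('test', 'last_nodes'))):
    """Hypothetical namedtuple subclass; the field names are assumptions."""

    __slots__ = ()  # keep instances as lightweight as a plain tuple

    def connect(self, successor):
        # Connect every exit point of the expression to the successor
        for node in self.last_nodes:
            node.connect(successor)


test_node = SketchNode('if')
branch_a, branch_b = SketchNode('a'), SketchNode('b')
expr = ControlFlowExprSketch(test=test_node, last_nodes=[branch_a, branch_b])
next_node = SketchNode('next')
expr.connect(next_node)
```

The subclass keeps tuple unpacking and field access while giving the expression its own `connect`, which may help when cleaning up the two `connect` methods mentioned in the list.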
The current problem with return redirect(request.args.get('The') if 'hey' or 'you' else request.args.get('French') if 'foo' else request.args.get('Aces') and 'c') is that
--- a/pyt/cfg/expr_visitor.py
+++ b/pyt/cfg/expr_visitor.py
@@ -86,6 +86,7 @@ class ExprVisitor(StmtVisitor):
variables = list()
visual_variables = list()
first_node = None
+ last_node_should_not_be_ignored = False
node_not_to_step_past = self.nodes[-1]
for expr in exprs:
@@ -112,7 +113,8 @@ class ExprVisitor(StmtVisitor):
label.visit(expr)
visual_variables.append(label.result)
- if not isinstance(node, StrNode):
+ last_node_should_not_be_ignored = not isinstance(node, StrNode)
+ if last_node_should_not_be_ignored:
cfg_expressions.append(node)
if not first_node:
if isinstance(node, ControlFlowExpr):
@@ -130,7 +132,7 @@ class ExprVisitor(StmtVisitor):
return ConnectExpressions(
first_expression=first_node,
all_expressions=cfg_expressions,
- last_expressions=get_last_expressions(cfg_expressions) if cfg_expressions else [],
+ last_expressions=get_last_expressions(cfg_expressions) if last_node_should_not_be_ignored else [],
variables=variables,
visual_variables=visual_variables
)
Now it only reports a vuln if …
…a StrNode the last expression, instead of nothing. e.g. return redirect(request.args.get('The') if 'hey' or 'you' else request.args.get('French') if 'foo' else request.args.get('Aces') and 'c')
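Why a trailing string literal matters can be seen from plain Python semantics: `x and 'c'` evaluates to `'c'` whenever `x` is truthy, and to `x` itself (an empty or None value) otherwise, so a non-empty user-controlled value never flows out of the expression. A small sketch, where `last_of_and` and `tainted` are illustrative names only:

```python
def last_of_and(tainted):
    # `and` returns its first falsy operand, or else its last operand
    return tainted and 'c'


from_truthy_input = last_of_and('http://attacker.example')
from_empty_input = last_of_and('')
from_missing_input = last_of_and(None)
```

Only the literal `'c'`, the empty string, or `None` can come out, which is why the last operand of an `and` being a string literal should not be reported as tainted.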
…ast_var_in_and_is_not_tainted with test
… match. Did boolop.connect to each boolop last expression.
inner_most_call notes

HOW WE CURRENTLY SET THE INNER MOST CALL ATTRIBUTE FOR BLACKBOX CALLS:
When looping through args and keyword args in add_blackbox_call
This code essentially does 3 things:
In save_def_args_in_temp of process_function we do pretty much the same thing.

HOW WE CURRENTLY USE THE INNER MOST CALL ATTRIBUTE:
When connecting a ControlFlowNode to an AssignmentCallNode
So blackbox calls are done well, and user-defined calls are done lazily. Code more or less:
This line seems dead (Line 308 in fbca4d2) b/c of the above (i.e. we only use first_node of user-defined functions).

OPEN QUESTIONS
Question: Why do we only do this when connecting a ControlFlowNode to an AssignmentCallNode?
Answer: We do connect_to_allowed -1 stuff that we are trying to get away from
Question: What tests do we currently have for the above?
Answer: ???
Removed the if CALL_IDENTIFIER in garbage
Removed the node_not_to_step_past.connect(first_node)
Fixed up _get_inner_most_expression, connect_nodes and wrote the top 4 expr_cfg_tests
# This is used in connect_nodes
call_node.first_expression = connected_expressions.first_expression
Fixed remaining expr_cfg_tests
Left off BinOp for another day
So after my last vulnerabilities PR and my 2 re-organization PRs I think I'll try my hand at finding the false-negatives of our last evaluation. Fixing bugs and false-positives is easier than false-negatives so I'm looking forward to this :D
(The crashes in the last evaluation were caused by us analyzing invalid Python, and there was also an infinite loop in get_first_node that I have now fixed in 104cf80.)