Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbolic model for strcpy #1681

Merged
merged 62 commits into from
Jun 9, 2020
Merged
Changes from 34 commits
Commits
Show all changes
62 commits
Select commit Hold shift + click to select a range
88fc5b6
Working copy for strcpy - without symbolic input
sschriner Apr 24, 2020
3977b0a
Corrected null term error
sschriner Apr 24, 2020
590e0e8
Needs to be tested but general algorithm idea for handling symbolic i…
sschriner Apr 24, 2020
b62ae4a
Updated according to feedback
sschriner Apr 24, 2020
063f2cd
Beginning of strcpy tests
sschriner Apr 24, 2020
4ae8527
Revising strcpy algorithm - realised it was flawed
sschriner Apr 27, 2020
f94244a
This should be a more correct way of handling symbolic bytes. Needs t…
sschriner Apr 27, 2020
5db5296
Get rid of redundant code
sschriner Apr 27, 2020
425232a
Fixed another typo
sschriner Apr 27, 2020
ac2c56a
Update manticore/native/models.py
sschriner Apr 27, 2020
8ac2185
Clarified and updated comments
sschriner Apr 27, 2020
9f836fa
Fixed some errors in model while working on tests
sschriner Apr 28, 2020
509a964
Progress on tests
sschriner Apr 28, 2020
0b1729f
Found typo in models
sschriner Apr 28, 2020
c05a4e6
Removed unnecassary whitespace
sschriner Apr 28, 2020
1b5f11a
Added some helper functions for tests and to clean things up
sschriner Apr 28, 2020
f53a89b
Added comments for helper functions
sschriner Apr 28, 2020
4c034d4
WIP: current state of strcpy tests
sschriner Apr 29, 2020
bb13eaf
Updated type hints from Expression to BitVec
sschriner Apr 29, 2020
9f0cb0f
Fixed linter complaints
sschriner Apr 29, 2020
f727753
Updated strcpy return type to make linter happy
sschriner Apr 29, 2020
a64bcaf
Added some helper properties to expression BitVecITE
sschriner Apr 29, 2020
38c0ed1
Overwrote equal op in ArraySelect
sschriner Apr 29, 2020
828dbe6
Fixed formatting in expression
sschriner Apr 29, 2020
82c61eb
Almost done just needs some cleaning
sschriner Apr 30, 2020
2c51906
Attempt at overriding __eq__ again
sschriner Apr 30, 2020
62a547c
Cleaned up tests some still have unresolved __eq__ issues
sschriner Apr 30, 2020
767b360
Removed stuff from expression and updated test_models.py
sschriner May 1, 2020
c1c9f20
Removed stuff from expression and updated test_models.py
sschriner May 1, 2020
9afad4f
Basicly done probably needs some more comments for clarity
sschriner May 1, 2020
7601831
Cleaned up the tests a little
sschriner May 7, 2020
f2dd84b
Corrected linter error and added more comments to strcpy model functions
sschriner May 7, 2020
a353e84
Merge branch 'master' of github.com:trailofbits/manticore into c_str_…
sschriner May 7, 2020
f8be8eb
Remove duplicated word in comment
sschriner May 8, 2020
35cc9fc
Update manticore/native/models.py
sschriner May 14, 2020
1ce0424
Update manticore/native/models.py
sschriner May 14, 2020
edb721d
Update tests/native/test_models.py
sschriner May 14, 2020
f2f58b2
Merge branch 'master' of github.com:trailofbits/manticore into c_str_…
sschriner May 14, 2020
4296893
Updates according to feedback
sschriner May 14, 2020
5028cda
Updates according to feedback
sschriner May 14, 2020
794f943
Swapped can_be_true and must be true
sschriner May 14, 2020
dceb351
Updates for feedback
sschriner May 19, 2020
9b0aad5
Forgot to lint
sschriner May 19, 2020
dde7d0f
Fixed some typos
sschriner May 20, 2020
1a6d1ce
Largely simplified ITE being built
sschriner May 20, 2020
adec0f8
Forgot to lint
sschriner May 20, 2020
657ab4f
This is messy but seems to be concretizing to the correct states
sschriner May 26, 2020
dd9447b
Very very messy state of the code before attempting to concretize on …
sschriner May 29, 2020
e65c185
Cleaned up model with forking on every possible NULL
sschriner Jun 3, 2020
ee93ffb
Removed trailing whitespace for CodeClimate
sschriner Jun 3, 2020
1dad5c3
Added state.context for offset value back
sschriner Jun 3, 2020
72516ab
New symbolic test with regex for expected output
sschriner Jun 5, 2020
e20b21c
New symbolic test with regex for expected output
sschriner Jun 5, 2020
f9204e5
Updated for mypy
sschriner Jun 5, 2020
600b44f
Merge branch 'master' of github.com:trailofbits/manticore into c_str_…
sschriner Jun 8, 2020
38cb66d
Updated comments in tests
sschriner Jun 8, 2020
dbd627a
Updated cant_be_null to cannot_be_null
sschriner Jun 9, 2020
a6f4dab
Corrected docstring location for _test_strcpy
sschriner Jun 9, 2020
eab68d7
Add break for efficiency
sschriner Jun 9, 2020
ecb35be
Added compiler information and source file
sschriner Jun 9, 2020
3491d52
Updated regex to use raw strings
sschriner Jun 9, 2020
3cd7b46
Merge branch 'master' of github.com:trailofbits/manticore into c_str_…
sschriner Jun 9, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions manticore/core/smtlib/expression.py
Original file line number Diff line number Diff line change
@@ -1263,6 +1263,18 @@ def __init__(
assert false_value.size == size
super().__init__(size, condition, true_value, false_value, *args, **kwargs)

@property
def condition(self):
return self.operands[0]

@property
def true_value(self):
return self.operands[1]

@property
def false_value(self):
return self.operands[2]


Constant = (BitVecConstant, BoolConstant)
Variable = (BitVecVariable, BoolVariable, ArrayVariable)
130 changes: 126 additions & 4 deletions manticore/native/models.py
Original file line number Diff line number Diff line change
@@ -2,10 +2,12 @@
Models here are intended to be passed to :meth:`~manticore.native.state.State.invoke_model`, not invoked directly.
"""

from .cpu.abstractcpu import ConcretizeArgument
from ..core.smtlib import issymbolic
from .cpu.abstractcpu import Cpu, ConcretizeArgument
from .state import State
from ..core.smtlib import issymbolic, BitVec
from ..core.smtlib.solver import Z3Solver
from ..core.smtlib.operators import ITEBV, ZEXTEND
from typing import Union, List

VARIADIC_FUNC_ATTR = "_variadic"

@@ -57,7 +59,7 @@ def _find_zero(cpu, constrs, ptr):
return offset


def strcmp(state, s1, s2):
def strcmp(state: State, s1: Union[int, BitVec], s2: Union[int, BitVec]):
"""
strcmp symbolic model.

@@ -113,7 +115,7 @@ def strcmp(state, s1, s2):
return ret


def strlen(state, s):
def strlen(state: State, s: Union[int, BitVec]):
"""
strlen symbolic model.

@@ -140,3 +142,123 @@ def strlen(state, s):
ret = ITEBV(cpu.address_bit_size, byt == 0, offset, ret)

return ret


def is_NULL(byte, constrs) -> bool:
"""
Checks if a given byte read from memory is NULL or effectively NULL

:param byte: byte read from memory to be examined
:param constrs: state constraints
:return: whether a given byte is NULL or constrained to NULL
"""
if issymbolic(byte):
return not Z3Solver.instance().can_be_true(constrs, byte != 0)
else:
return byte == 0


def cant_be_NULL(byte, constrs) -> bool:
"""
Checks if a given byte read from memory is not NULL or cannot be NULL

:param byte: byte read from memory to be examined
:param constrs: state constraints
:return: whether a given byte is not NULL or cannot be NULL
"""
if issymbolic(byte):
return not Z3Solver.instance().can_be_true(constrs, byte == 0)
else:
return byte != 0


def can_be_NULL(byte, constrs) -> bool:
"""
Checks if a given byte read from memory can be NULL

:param byte: byte read from memory to be examined
:param constrs: state constraints
:return: whether a given byte is NULL or can be NULL
"""
if issymbolic(byte):
return Z3Solver.instance().can_be_true(constrs, byte == 0)
else:
return byte == 0


def strcpy(state: State, dst: Union[int, BitVec], src: Union[int, BitVec]) -> Union[int, BitVec]:
"""
strcpy symbolic model

Algorithm: Copy every byte from the src to dst until finding a byte that can be or is NULL.
If the byte is NULL or is constrained to only the NULL value, append the NULL value to dst
and return. If the value can be NULL or another value write an `Expression` for every following
byte that sets a value to the src or dst byte according to the preceding bytes until a NULL
byte or effectively NULL byte is found.

:param state: current program state
:param dst: destination string address
:param src: source string address
:return: pointer to the dst
"""
if issymbolic(src):
raise ConcretizeArgument(state.cpu, 1)

if issymbolic(dst):
raise ConcretizeArgument(state.cpu, 2)

cpu = state.cpu
constrs = state.constraints
ret = dst

# Copy until '\000' is reached or symbolic memory that can be '\000'
c = cpu.read_int(src, 8)
while cant_be_NULL(c, constrs):
cpu.write_int(dst, c, 8)
src += 1
dst += 1
c = cpu.read_int(src, 8)

# If the byte is symbolic and constrained to '\000' or is '\000' write concrete val and return
if is_NULL(c, constrs):
cpu.write_int(dst, 0, 8)
return ret

offset = 0
zeros: List[int] = []
src_val = cpu.read_int(src, 8)
while not is_NULL(src_val, constrs):
if can_be_NULL(c, constrs):
# If a byte can be NULL set the src_val for NULL, build the ITE, & add to the list of nulls
src_val = ITEBV(8, src_val != 0, src_val, 0)
_build_ITE(zeros, cpu, src, dst, offset, src_val)
zeros.append(offset)
else:
# If it can't be NULL just build the ITE
_build_ITE(zeros, cpu, src, dst, offset, src_val)
offset += 1
src_val = cpu.read_int(src + offset, 8)

# Build ITE Tree for NULL byte
src_val = 0
_build_ITE(zeros, cpu, src, dst, offset, src_val)

return ret


def _build_ITE(
zeros: List[int],
cpu: Cpu,
src: Union[int, BitVec],
dst: Union[int, BitVec],
offset: int,
src_val: Union[int, BitVec],
) -> None:
"""
Builds ITE tree for each symbolic dst byte
"""
dst_val = cpu.read_int(dst + offset, 8)
for zero in reversed(zeros):
c = cpu.read_int(src + zero, 8)
src_val = ITEBV(8, c != 0, src_val, dst_val)
cpu.write_int(dst + offset, src_val, 8)
155 changes: 152 additions & 3 deletions tests/native/test_models.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,28 @@
import unittest
import os

from manticore.core.smtlib import ConstraintSet, Z3Solver
import random

from manticore.core.smtlib import (
ConstraintSet,
Operators,
Z3Solver,
issymbolic,
ArraySelect,
BitVecITE,
)
from manticore.native.state import State
from manticore.platforms import linux

from manticore.native.models import variadic, isvariadic, strcmp, strlen
from manticore.native.models import (
variadic,
isvariadic,
strcmp,
strlen,
strcpy,
is_NULL,
can_be_NULL,
cant_be_NULL,
)


class ModelMiscTest(unittest.TestCase):
@@ -43,6 +60,16 @@ def _push_string(self, s):
cpu.write_bytes(cpu.RSP, s)
return cpu.RSP

def _push_string_space(self, l):
cpu = self.state.cpu
cpu.RSP -= l
return cpu.RSP

def _pop_string_space(self, l):
cpu = self.state.cpu
cpu.RSP += l
return cpu.RSP

def assertItemsEqual(self, a, b):
# Required for Python3 compatibility
self.assertEqual(sorted(a), sorted(b))
@@ -198,3 +225,125 @@ def test_symbolic_mixed(self):
self.state.constrain(sy[3] != 0)
ret = strlen(self.state, s)
self.assertTrue(self.state.must_be_true(ret == 4))


class StrcpyTest(ModelTest):
def _check_BitVecITE(self, dst, dst_val):
# Iterate and check the nested ITE tree for a symbolic byte
self.assertTrue(issymbolic(dst))
while type(dst.true_value) is BitVecITE: # check each if/else in dst
self.assertEqual(dst.false_value, dst_val) # dst = false_val
dst = dst.true_value
return dst

def _assert_start(self, s, d):
# Checks all characters are copied until the 1st that could be NULL
cpu = self.state.cpu
src = cpu.read_int(s, 8)
dst = cpu.read_int(d, 8)
offset = 0
while cant_be_NULL(src, self.state.constraints):
self.assertTrue(not issymbolic(dst))
self.assertEqual(src, dst)
offset += 1
src = cpu.read_int(s + offset, 8)
dst = cpu.read_int(d + offset, 8)
return offset

def _assert_end(self, s, d, offset, org_dest_val):
cpu = self.state.cpu
src = cpu.read_int(s + offset, 8)
dst = cpu.read_int(d + offset, 8)

# src string was entirely symbolic
if not issymbolic(dst):
self.assertTrue(is_NULL(src, self.state.constraints))
self.assertEqual(0, dst)

# Check each symbolic byte in the dst
else:
# Loop till src must be NULL or assumed dst space allowed is gone
while not is_NULL(src, self.state.constraints) and offset < len(org_dest_val):
# Check ITE tree in symbolic dst
dst = self._check_BitVecITE(dst, org_dest_val[offset])

# Check that true value is src
if type(dst.true_value) is ArraySelect:
self.assertEqual(type(src), ArraySelect)
self.assertEqual(src.index, dst.true_value.index)
self.assertTrue(src.array is dst.true_value.array)
else:
self.assertTrue(Operators.ITE(dst.true_value == src, True, False))

# Check the false value is dst or offset
if issymbolic(src) and can_be_NULL(src, self.state.constraints):
self.assertEqual(dst.false_value, 0)
else:
self.assertEqual(dst.false_value, org_dest_val[offset]) # dst = false_val

offset += 1
src = cpu.read_int(s + offset, 8)
dst = cpu.read_int(d + offset, 8)

# Check last sym dst byte
dst = self._check_BitVecITE(dst, org_dest_val[offset])
self.assertEqual(dst.true_value, 0)

def _test_strcpy(self, string, dst_len=None):
# Create src and dsty strings
if dst_len is None:
dst_len = len(string)
cpu = self.state.cpu
s = self._push_string(string)
d = self._push_string_space(dst_len)
dst_vals = [None] * dst_len
for i in range(dst_len):
# Set each dst byte to a random char to simplify equal compairisons
c = random.randrange(255)
cpu.write_int(d + i, c, 8)
dst_vals[i] = c

ret = strcpy(self.state, d, s)

# addresses should match
self.assertEqual(ret, d)
# assert everything is copied up to the 1st possible 0 is copied
offset = self._assert_start(s, d)
# check all symbolic values created until a value that must be 0 is found
self._assert_end(s, d, offset, dst_vals)

# Delete stack space created
self._pop_string_space(dst_len + len(string))

def test_concrete(self):
self._test_strcpy("abc\0")
self._test_strcpy("a\0", dst_len=10)
self._test_strcpy("abcdefghijklm\0")
self._test_strcpy("a\0", dst_len=5)

def test_concrete_empty(self):
self._test_strcpy("\0")
self._test_strcpy("\0", dst_len=10)

def test_symbolic_mixed(self):
src = self.state.symbolicate_buffer("++\0")
self._test_strcpy(src, dst_len=4)

src = self.state.symbolicate_buffer("xy++\0")
self._test_strcpy(src, dst_len=6)

src = self.state.symbolicate_buffer("iv+++")
self.state.constraints.add(src[-1] == 0)
self._test_strcpy(src, dst_len=6)

src = self.state.symbolicate_buffer("+++jl\0")
self._test_strcpy(src, dst_len=6)

src = self.state.symbolicate_buffer("+++df+++")
self.state.constraints.add(src[-1] == 0)
self._test_strcpy(src, dst_len=12)

def test_only_symbolic(self):
src = self.state.symbolicate_buffer("++++++++++++")
self.state.constraints.add(src[-1] == 0)
self._test_strcpy(src, dst_len=15)