Skip to content

Commit

Permalink
Import of GitHub PR #587: Add minimum changes required for py3.8 support
Browse files Browse the repository at this point in the history
#587
Merge 917097e into 549363b

GITHUB_PR_BODY=This is a continuation of @pganssle's work on #559

Beyond just making it compile, this also adds some of the newer VM and marshaling stuff such as positional only arguments. It also addresses and adds tests for a 3.6 feature that allows the line number to opcode table (`lnotab`) to be non monotonic. After this the remaining failing tests are:

```
** pytype.vm_test
** pytype.tests.py3.test_variable_annotations
** pytype.tools.xref.indexer_test
** pytype.tests.py3.test_typing_namedtuple
** pytype.tests.test_decorators
** pytype.tests.test_with
** pytype.tests.test_exceptions
** pytype.tests.test_flow
** pytype.tests.py3.test_coroutine
** pytype.tests.py3.test_annotations
** pytype.tests.py3.test_stdlib
** pytype.tests.test_builtins1
** pytype.tests.test_basic
```

I'll take a look at them later or someone else can carry through with those. Either way I'd like to land this first to make the reviews manageable.

-----

As @pganssle noted on their PR, there is some weird behavior going on with the tests that causes them to fail but not produce any output in the console or the log files.

Upon further investigation this was caused by a segmentation fault in `cfg.cc`. In Python 3.8, `PyStructSequence_InitType` was changed so that it errors out if a type with a reference count greater than 0 is passed in: https://github.com/python/cpython/blame/3.8/Objects/structseq.c#L396-L400

This error was being silently swallowed because we were just assuming that `PyStructSequence_InitType` was running successfully. The problem was we were using `PyObject_HEAD_INIT` which [initializes the refcount to 1](https://github.com/python/cpython/blob/35b8a4da7494cd51f3edeaa07764dd8f14bac0bf/Include/object.h#L83-L85).

With this fixed there should now be actual output from the test runs so hopefully this work should be easier to continue by others.

----

### Notes on the workflow

As Paul mentioned, the workflow on the open-source end of pytype is pretty bespoke and not too friendly. I would recommend using `tox` and separating out the `googletest` based C/C++ tests from the Python ones. The `build_scripts` folder has a lot of magic that masks the details of how the build system works underneath. CMake is currently being used to build the extension, run Python tests and run C++ tests. Building and testing C++ is CMake's forte but there's a lot of code to handle the discovery and running of Python tests. This might have been done to keep symmetry with the `py_test` build rule internally but it is completely alien to most open-source Python developers. Instead, using something like the builtin `unittest` module or `pytest` to drive the Python tests would be a more comfortable experience.

For example, one of the issues I had was figuring out how to make CMake use my locally compiled Python 3.8 build. Passing custom arguments to CMake is nearly impossible due to the build_scripts. I'd recommend this, if you're on a debian based distro, add the deadsnakes ppa or similar and use:
```shell
sudo apt install python3.7 python3.7-dev python3.7-venv
python3.7 -m venv venv
. ./venv/bin/activate
python3.7 build_scripts/build.py
python3.7 build_scripts/run_tests.py
```
Swap out 3.7 for whatever host Python version you want to test with.

Next, for debugging issues like the mysterious failing tests without output, the most important file is `out/ninja.log`. This actually shows the commands being invoked to run the tests. The exact incantation seems to be:

`TYPESHED_HOME=/home/aaskar/workspace/pytype/typeshed /home/aaskar/workspace/pytype/venv/bin/python3.8 -B /home/aaskar/workspace/pytype/build_scripts/test_module.py pytype.tests.test_basic -o /home/aaskar/workspace/pytype/out/pytype/tests/test_basic.log -P /home/aaskar/workspace/pytype/out -s -p`

Failing tests can also be run with unittest instead of the custom runner for a workflow where you don't have to repeatedly check a log file after making changes like so:

`TYPESHED_HOME=$PWD/typeshed python3.8 -m unittest pytype.tools.traces.traces_test`

PiperOrigin-RevId: 313493580
  • Loading branch information
ammaraskar authored and rchen152 committed Jun 1, 2020
1 parent f56f465 commit 39afb06
Show file tree
Hide file tree
Showing 12 changed files with 176 additions and 33 deletions.
1 change: 1 addition & 0 deletions pytype/blocks.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ class OrderedCode(object):
Attributes:
co_argcount: Same as loadmarshal.CodeType.
co_posonlyargcount: Same as loadmarshal.CodeType.
co_kwonlyargcount: Same as loadmarshal.CodeType.
co_nlocals: Same as loadmarshal.CodeType.
co_stacksize: Same as loadmarshal.CodeType.
Expand Down
19 changes: 13 additions & 6 deletions pytype/pyc/loadmarshal.py
Original file line number Diff line number Diff line change
Expand Up @@ -97,14 +97,15 @@ class CodeType(object):
CO_FUTURE_PRINT_FUNCTION = 0x10000
CO_FUTURE_UNICODE_LITERALS = 0x20000

def __init__(self, argcount, kwonlyargcount, nlocals, stacksize, flags, code,
consts, names, varnames, filename, name, firstlineno, lnotab,
freevars, cellvars, python_version):
def __init__(self, argcount, posonlyargcount, kwonlyargcount, nlocals,
stacksize, flags, code, consts, names, varnames, filename,
name, firstlineno, lnotab, freevars, cellvars, python_version):
assert isinstance(nlocals, int)
assert isinstance(stacksize, int)
assert isinstance(flags, int)
assert isinstance(filename, (bytes, six.string_types))
self.co_argcount = argcount
self.co_posonlyargcount = posonlyargcount
self.co_kwonlyargcount = kwonlyargcount
self.co_nlocals = nlocals
self.co_stacksize = stacksize
Expand Down Expand Up @@ -367,6 +368,11 @@ def load_dict(self):
def load_code(self):
"""Load a Python code object."""
argcount = self._read_long()
# Python 3.8+ has positional only arguments.
if self.python_version >= (3, 8):
posonlyargcount = self._read_long()
else:
posonlyargcount = -1
if self.python_version[0] >= 3:
kwonlyargcount = self._read_long()
else:
Expand All @@ -391,9 +397,10 @@ def load_code(self):
# https://github.com/python/cpython/blob/master/Objects/lnotab_notes.txt:
# 'an array of unsigned bytes disguised as a Python bytes object'.
lnotab = self.load()
return CodeType(argcount, kwonlyargcount, nlocals, stacksize, flags,
code, consts, names, varnames, filename, name, firstlineno,
lnotab, freevars, cellvars, self.python_version)
return CodeType(argcount, posonlyargcount, kwonlyargcount, nlocals,
stacksize, flags, code, consts, names, varnames, filename,
name, firstlineno, lnotab, freevars, cellvars,
self.python_version)

def load_set(self):
n = self._read_long()
Expand Down
8 changes: 8 additions & 0 deletions pytype/pyc/magic.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,14 @@
3392: (3, 7), # a4
3393: (3, 7), # b1
3394: (3, 7), # b5

# Python 3.8
3400: (3, 8), # a1
3401: (3, 8), # a1
3410: (3, 8), # a1
3411: (3, 8), # b2
3412: (3, 8), # b2
3413: (3, 8), # b4
}


Expand Down
54 changes: 49 additions & 5 deletions pytype/pyc/opcodes.py
Original file line number Diff line number Diff line change
Expand Up @@ -319,6 +319,15 @@ class BEFORE_ASYNC_WITH(Opcode):
__slots__ = ()


class BEGIN_FINALLY(Opcode):
__slots__ = ()


class END_ASYNC_FOR(Opcode):
FLAGS = HAS_JUNKNOWN | NO_NEXT # might re-raise an exception
__slots__ = ()


class STORE_MAP(Opcode):
__slots__ = ()

Expand Down Expand Up @@ -835,6 +844,17 @@ class CALL_METHOD(OpcodeWithArg): # Arg: #args
__slots__ = ()


class CALL_FINALLY(OpcodeWithArg): # Arg: Jump offset to finally block
FLAGS = HAS_JREL|HAS_ARGUMENT
__slots__ = ()


class POP_FINALLY(OpcodeWithArg):
FLAGS = HAS_ARGUMENT|HAS_JUNKNOWN # might re-raise an exception or
# jump to a finally
__slots__ = ()


python2_mapping = {
0: STOP_CODE, # removed in Python 3
1: POP_TOP,
Expand Down Expand Up @@ -1097,16 +1117,29 @@ def _overlay_mapping(mapping, new_entries):
161: CALL_METHOD,
})

python_3_8_mapping = _overlay_mapping(python_3_7_mapping, {
6: ROT_FOUR, # ROT_FOUR returns under a different, cooler id!
53: BEGIN_FINALLY,
54: END_ASYNC_FOR,
80: None, # BREAK_LOOP was removed in 3.8
119: None, # CONTINUE_LOOP was removed in 3.8
120: None, # SETUP_LOOP was removed in 3.8
121: None, # SETUP_EXCEPT was removed in 3.8
162: CALL_FINALLY,
163: POP_FINALLY,
})


class _LineNumberTableParser(object):
"""State machine for decoding a Python line number array."""

def __init__(self, lnotab, firstlineno):
def __init__(self, python_version, lnotab, firstlineno):
assert not len(lnotab) & 1 # lnotab always has an even number of elements
self.lnotab = lnotab
self.lineno = firstlineno
self.next_addr = six.indexbytes(self.lnotab, 0) if self.lnotab else 0
self.pos = 0
self.python_version = python_version

def get(self, i):
"""Get the line number for the instruction at the given position.
Expand All @@ -1121,7 +1154,14 @@ def get(self, i):
The line number corresponding to the position at i.
"""
while i >= self.next_addr and self.pos < len(self.lnotab):
self.lineno += six.indexbytes(self.lnotab, self.pos + 1)
line_diff = six.indexbytes(self.lnotab, self.pos + 1)
# The Python docs have more details on this weird bit twiddling.
# https://github.com/python/cpython/blob/master/Objects/lnotab_notes.txt
# https://github.com/python/cpython/commit/f3914eb16d28ad9eb20fe5133d9aa83658bcc27f
if self.python_version >= (3, 6) and line_diff >= 0x80:
line_diff -= 0x100
self.lineno += line_diff

self.pos += 2
if self.pos < len(self.lnotab):
self.next_addr += six.indexbytes(self.lnotab, self.pos)
Expand Down Expand Up @@ -1222,12 +1262,15 @@ def _wordcode_reader(data, mapping):
start = pos + 2


def _dis(data, mapping, reader,
def _dis(python_version, data, mapping, reader,
co_varnames=None, co_names=None, co_consts=None, co_cellvars=None,
co_freevars=None, co_lnotab=None, co_firstlineno=None):
"""Disassemble a string into a list of Opcode instances."""
code = []
lp = _LineNumberTableParser(co_lnotab, co_firstlineno) if co_lnotab else None
if co_lnotab:
lp = _LineNumberTableParser(python_version, co_lnotab, co_firstlineno)
else:
lp = None
offset_to_index = {}
if co_cellvars is not None and co_freevars is not None:
cellvars_freevars = co_cellvars + co_freevars
Expand Down Expand Up @@ -1271,9 +1314,10 @@ def dis(data, python_version, *args, **kwargs):
(3, 5): python_3_5_mapping,
(3, 6): python_3_6_mapping,
(3, 7): python_3_7_mapping,
(3, 8): python_3_8_mapping
}[(major, minor)]
reader = _wordcode_reader if (major, minor) > (3, 5) else _bytecode_reader
return _dis(data, mapping, reader, *args, **kwargs)
return _dis(python_version, data, mapping, reader, *args, **kwargs)


def dis_code(code):
Expand Down
52 changes: 50 additions & 2 deletions pytype/pyc/opcodes_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,10 @@
class _TestBase(unittest.TestCase):
"""Base class for all opcodes.dis testing."""

def dis(self, code):
def dis(self, code, **kwargs):
"""Return the opcodes from disassbling a code sequence."""
return opcodes.dis(compat.int_array_to_bytes(code), self.python_version)
return opcodes.dis(compat.int_array_to_bytes(code),
self.python_version, **kwargs)

def assertSimple(self, opcode, name):
"""Assert that a single opcode byte diassembles to the given name."""
Expand All @@ -28,6 +29,14 @@ def assertDisassembly(self, code, expected):
else:
self.assertEqual(e, (o.name, o.arg))

def assertLineNumbers(self, code, co_lnotab, expected):
"""Assert that the opcodes have the expected line numbers."""
ops = self.dis(code, co_lnotab=compat.int_array_to_bytes(co_lnotab),
co_firstlineno=1)
self.assertEqual(len(ops), len(expected))
for o, e in zip(ops, expected):
self.assertEqual(e, o.line)


class CommonTest(_TestBase):
"""Test bytecodes that are common to Python 2 and 3 using python 2."""
Expand Down Expand Up @@ -191,5 +200,44 @@ def test_call_method(self):
self.assertName([161, 0], 'CALL_METHOD')


class Python38Test(_TestBase):
python_version = (3, 8, 0)

def test_non_monotonic_line_numbers(self):
# Make sure we can deal with line number tables that aren't
# monotonic. That is:
#
# line 1: OPCODE_1
# line 2: OPCODE_2
# line 1: OPCODE 3

# Compiled from:
# f(
# 1,
# 2
# )
code = [
0x65, 0, # LOAD_NAME, arg=0th name
0x64, 0, # LOAD_CONST, arg=0th constant
0x64, 1, # LOAD_CONST, arg=1st constant
0x83, 0x2, # CALL_FUNCTION, arg=2 function arguments
0x53, 0x0 # RETURN_VALUE
]
expected = [
('LOAD_NAME', 0),
('LOAD_CONST', 0),
('LOAD_CONST', 1),
('CALL_FUNCTION', 2),
('RETURN_VALUE',)
]
self.assertDisassembly(code, expected)
lnotab = [
0x2, 0x1, # +2 addr, +1 line number
0x2, 0x1, # +2 addr, +1 line number
0x2, 0xfe # +2 addr, -2 line number
]
self.assertLineNumbers(code, lnotab, [1, 2, 3, 1, 1])


if __name__ == '__main__':
unittest.main()
7 changes: 4 additions & 3 deletions pytype/tests/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -175,9 +175,10 @@ class MakeCodeMixin(object):
def make_code(self, int_array, name="testcode"):
"""Utility method for creating CodeType objects."""
return loadmarshal.CodeType(
argcount=0, kwonlyargcount=0, nlocals=2, stacksize=2, flags=0,
consts=[None, 1, 2], names=[], varnames=["x", "y"], filename="",
name=name, firstlineno=1, lnotab=[], freevars=[], cellvars=[],
argcount=0, posonlyargcount=0, kwonlyargcount=0, nlocals=2,
stacksize=2, flags=0, consts=[None, 1, 2], names=[],
varnames=["x", "y"], filename="", name=name, firstlineno=1,
lnotab=[], freevars=[], cellvars=[],
code=compat.int_array_to_bytes(int_array),
python_version=self.python_version)

Expand Down
5 changes: 5 additions & 0 deletions pytype/tools/traces/traces.py
Original file line number Diff line number Diff line change
Expand Up @@ -218,6 +218,11 @@ def match_Call(self, node):
def match_Ellipsis(self, node):
return self._match_constant(node, Ellipsis)

def match_Constant(self, node):
# As of Python 3.8, bools, numbers, bytes, strings, ellipsis etc are
# all constants instead of individual ast nodes.
return self._match_constant(node, node.s)

def match_Import(self, node):
return list(self._match_import(node, is_from=False))

Expand Down
24 changes: 18 additions & 6 deletions pytype/tools/traces/traces_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -283,28 +283,40 @@ def f(x, y, z):
class MatchConstantTest(MatchAstTestCase):

def test_num(self):
matches = self._get_traces("v = 42", ast.Num)
matches = self._get_traces(
"v = 42",
ast.Constant if sys.version_info >= (3, 8) else ast.Num)
self.assertTracesEqual(matches, [((1, 4), "LOAD_CONST", 42, ("int",))])

def test_str(self):
matches = self._get_traces("v = 'hello'", ast.Str)
matches = self._get_traces(
"v = 'hello'",
ast.Constant if sys.version_info >= (3, 8) else ast.Str)
self.assertTracesEqual(matches, [((1, 4), "LOAD_CONST", "hello", ("str",))])

def test_unicode(self):
matches = self._get_traces("v = u'hello'", ast.Str)
matches = self._get_traces(
"v = u'hello'",
ast.Constant if sys.version_info >= (3, 8) else ast.Str)
self.assertTracesEqual(matches, [((1, 4), "LOAD_CONST", "hello", ("str",))])

def test_bytes(self):
matches = self._get_traces("v = b'hello'", ast.Bytes)
matches = self._get_traces(
"v = b'hello'",
ast.Constant if sys.version_info >= (3, 8) else ast.Bytes)
self.assertTracesEqual(
matches, [((1, 4), "LOAD_CONST", b"hello", ("bytes",))])

def test_bool(self):
matches = self._get_traces("v = True", ast.NameConstant)
matches = self._get_traces(
"v = True",
ast.Constant if sys.version_info >= (3, 8) else ast.NameConstant)
self.assertTracesEqual(matches, [((1, 4), "LOAD_CONST", True, ("bool",))])

def test_ellipsis(self):
matches = self._get_traces("v = ...", ast.Ellipsis)
matches = self._get_traces(
"v = ...",
ast.Constant if sys.version_info >= (3, 8) else ast.Ellipsis)
self.assertTracesEqual(
matches, [((1, 4), "LOAD_CONST", Ellipsis, ("ellipsis",))])

Expand Down
16 changes: 8 additions & 8 deletions pytype/typegraph/cfg.cc
Original file line number Diff line number Diff line change
Expand Up @@ -812,11 +812,7 @@ PyTypeObject PyCFGNode = {

// --- Origin --------------------------------------------------------------

static PyTypeObject PyOrigin = {
// initialized in inittypegraph
PyVarObject_HEAD_INIT(nullptr, 0)
tp_name: nullptr
};
static PyTypeObject PyOrigin;

PyDoc_STRVAR(origin_doc,
"An \"origin\" is an explanation of how a binding was "
Expand Down Expand Up @@ -1474,8 +1470,10 @@ static struct PyModuleDef typegraph_moduledef = {

static PyObject* InitModule(PyObject* module) {
PyObject* module_dict = PyModule_GetDict(module);
if (PyOrigin.tp_name == 0)
PyStructSequence_InitType(&PyOrigin, &origin_desc);
if (PyOrigin.tp_name == 0) {
if (PyStructSequence_InitType2(&PyOrigin, &origin_desc) == -1)
return NULL;
}
PyDict_SetItemString(module_dict, "Program",
reinterpret_cast<PyObject*>(&PyProgram));
PyDict_SetItemString(module_dict, "CFGNode",
Expand Down Expand Up @@ -1542,7 +1540,9 @@ PyMODINIT_FUNC initcfg(void) {
PyType_Ready(&PyCFGNode);
PyType_Ready(&PyVariable);
PyType_Ready(&PyBinding);
InitModule(module);
PyObject* module = InitModule(module);
if (module == NULL)
return NULL;
pytype::typegraph::internal::CFGLogger::Init();
}
#endif
4 changes: 2 additions & 2 deletions pytype/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -91,9 +91,9 @@ def validate_version(python_version):
# that typing.py isn't introduced until 3.5, anyway.
raise UsageError(
"Python versions 3.0 - 3.4 are not supported. Use 3.5 and higher.")
elif python_version > (3, 7):
elif python_version > (3, 8):
# We have an explicit per-minor-version mapping in opcodes.py
raise UsageError("Python versions > 3.7 are not yet supported.")
raise UsageError("Python versions > 3.8 are not yet supported.")


def strip_prefix(string, prefix):
Expand Down
2 changes: 1 addition & 1 deletion pytype/utils_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ def test_validate_version(self):
self._validate_version_helper((2, 1))
self._validate_version_helper((2, 8))
self._validate_version_helper((3, 1))
self._validate_version_helper((3, 8))
self._validate_version_helper((3, 9))

def _validate_version_helper(self, python_version):
with self.assertRaises(utils.UsageError):
Expand Down
Loading

0 comments on commit 39afb06

Please sign in to comment.