# Integration of pyflakes-enhanced-AST into coala

| Metadata | |
| -------- | ----------------------------------------------- |
| cEP      | 25                                               |
| Version | 1.0 |
| Title | Integration of pyflakes-enhanced-AST into coala |
| Authors | Ankit Joshi <mailto:[email protected]> |
| Status | Proposed |
| Type | Feature |

## Abstract

This document describes how to create a metabear to integrate
pyflakes-enhanced-AST into coala and demonstrates its use by
creating a simple dependent bear.

## Introduction

Flake8 is the most commonly used linter framework for static code
analysis in Python. It wraps the pyflakes, pycodestyle and mccabe
tools. These tools are implemented against different interfaces, and
flake8 combines their capabilities behind a single command.

Flake8 also provides a plugin mechanism. Additional linters are
supported through the flake8 plugin system and may be based on the
Python AST or built using pycodestyle's internal logical-line-based
checker API (see [Hacking](https://github.com/openstack-dev/hacking)).

However, there has been no option to use the pyflakes-enhanced-AST
based checker API to create custom linters. Thus, the full potential of
the enhanced AST is not utilized: a lot of rework is required just to
do the basic traversal and collection of important nodes. Pyflakes
already provides a basic API that performs this traversal, so a
developer using the enhanced AST only needs to implement the new logic
of the plugin and does not have to worry about the fidelity of the
basic node handlers.

The reason for choosing pyflakes over equivalents such as
[Astroid](https://github.com/PyCQA/astroid) lies in its simplicity and
the large amount of re-usable analysis it provides. Astroid is more
detailed and powerful, which also makes it more complicated to use.
In addition, there is a much larger ecosystem of flake8 plugins.

## Proposed Change

Here is a brief overview of the architecture:

1. There will be a separate repository named `coala-pyflakes` which will
be installable via `pip install pyflakes-bears`.
2. The repository will contain a new package `pyflakes-bears` which
will house all bears that use the pyflakes-enhanced-AST.
3. `pyflakes-bears` will contain a metabear `PyFlakesASTBear`
which will be used by all other plugin bears.
4. The repository will also contain another package `pyflakes-generic-plugins`
that will be independent of coala and can be run using flake8.

## Management of the new `coala-pyflakes` repository

The repository will use a mechanism similar to the one used in
`coala-bears` to test all its bears.
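
As an illustration, a bear without dependencies could be tested the way
`coala-bears` does it, using coala's `verify_local_bear` helper. This is
a minimal sketch: `SomeSimpleBear` and its import path are placeholders,
and bears that declare `BEAR_DEPS` need the richer
`LocalBearTestHelper` based setup instead:

```python
from coalib.testing.LocalBearTestHelper import verify_local_bear

# SomeSimpleBear stands in for any pyflakes-bears bear without
# dependencies; the module path is hypothetical.
from pyflakes_bears.SomeSimpleBear import SomeSimpleBear

good_file = 'import os\nprint(os.getcwd())\n'
bad_file = 'from __future__ import division\n'

# verify_local_bear builds a unittest.TestCase that runs the bear over
# each file and asserts that only the invalid files yield results.
SomeSimpleBearTest = verify_local_bear(SomeSimpleBear,
                                       valid_files=(good_file,),
                                       invalid_files=(bad_file,))
```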

## Implementation of PyFlakesASTBear

1. PyFlakesASTBear will be a LocalBear returning results of type
HiddenResult, since these results are not meant for the user but
for the dependent bears. The result structure is described by the
PyFlakesResult class.
2. PyFlakesResult consists of a snapshot of the module at class,
function, module, generator, and doctest level, thus providing
the developer multiple views of a file and allowing them to retrieve
results as needed.
3. Apart from scope information, the metabear also returns the list of
messages generated by pyflakes itself. These messages can be used
by the developer to add an additional filter layer that the user must
resolve before the plugin executes. This is primarily done because
pyflakes detects additional syntax errors which are not caught by the
Python interpreter and may affect the working of a plugin. This check
is optional and may or may not be incorporated by the developer.
4. The helper function `get_node_location` returns the location of a
node in the source file.
5. The helper function `get_nodes` yields all the nodes of a
particular type present in a scope.

Here is a prototype of the implementation of PyFlakesASTBear:

```python
# Imports omitted for brevity

class PyFlakesResult(HiddenResult):

    def __init__(self, origin, deadScopes, pyflakes_messages):

        Result.__init__(self, origin, message='')

        self.module_scope = self.get_scopes(ModuleScope, deadScopes)
        self.class_scopes = self.get_scopes(ClassScope, deadScopes)
        self.function_scopes = self.get_scopes(FunctionScope, deadScopes)
        self.generator_scopes = self.get_scopes(GeneratorScope, deadScopes)
        self.doctest_scopes = self.get_scopes(DoctestScope, deadScopes)
        self.pyflakes_messages = pyflakes_messages

    def get_scopes(self, scope_type, scopes):
        return list(filter(lambda scope: isinstance(scope, scope_type),
                           scopes))

    def get_node_location(self, node):
        return (node.source.lineno,
                node.source.lineno + node.source.depth - 1,
                node.source.col_offset)

    def get_nodes(self, scope, node_type):
        for _, node in scope.items():
            if isinstance(node, node_type):
                yield node


class PyFlakesASTBear(LocalBear):
    AUTHORS = {'The coala developers'}
    AUTHORS_EMAILS = {'[email protected]'}
    LICENSE = 'AGPL-3.0'

    def run(self, filename, file):
        """
        Generates the pyflakes-enhanced-AST for the input file and
        returns the dead scopes as a HiddenResult.

        :return:
            One HiddenResult containing a dictionary with keys being
            type of scope and values being a list of scopes generated
            from the file.
        """
        tree = ast.parse(''.join(file))
        result = Checker(tree, filename=filename, withDoctest=True)

        yield PyFlakesResult(self, result.deadScopes, result.messages)
```

## Demonstrating the use of PyFlakesASTBear by creating a simple bear

NoFutureImportBear provides a hassle-free solution to the problem of
finding all the `__future__` imports present in the source code.

Here is a prototype of NoFutureImportBear, which uses PyFlakesASTBear:

```python
# Imports omitted for brevity

class NoFutureImportBear(LocalBear):
    """
    Uses the pyflakes-enhanced-AST to detect use of __future__ imports.
    """
    BEAR_DEPS = {PyFlakesASTBear}

    def run(self, filename, file,
            dependency_results=dict()
            ):
        for result in dependency_results.get(PyFlakesASTBear.name, []):
            for node in result.get_nodes(result.module_scope[0],
                                         FutureImportation):
                yield Result.from_values(
                    origin=self,
                    message='Future import %s found' % node.name,
                    file=filename,
                    line=result.get_node_location(node)[0])
```

## Demonstrating the use of PyFlakesASTBear by creating a fixing bear

The same bear can be extended into a hassle-free solution to the problem
of removing all the `__future__` imports present in the source code.

Here is a prototype of the fixing version of NoFutureImportBear, which
uses PyFlakesASTBear:

```python
# Imports omitted for brevity

class NoFutureImportBear(LocalBear):
    """
    Uses the pyflakes-enhanced-AST to remove __future__ imports.
    """
    BEAR_DEPS = {PyFlakesASTBear}

    def remove_future_imports(self, file, lineno,
                              max_col_offset, affected_lines):
        """
        Removes all __future__ imports from the input line.
        """
        diff = Diff(file)
        line = file[lineno - 1]
        # If the line ends with a \ line continuation
        if line.rstrip()[-1] == '\\':
            next_line = file[lineno]
            semicolon_index = next_line.find(';')
            diff.delete_line(lineno)
            if semicolon_index == -1:
                diff.delete_line(lineno + 1)
            else:
                replacement = next_line[semicolon_index + 1:].lstrip()
                diff.modify_line(lineno + 1, replacement)
        else:
            semicolon_indices = [i for i, a in enumerate(line) if a == ';']
            if len(semicolon_indices) == 0:
                diff.delete_line(lineno)
            else:
                for index in semicolon_indices:
                    if index > max_col_offset:
                        replacement = line[index + 1:].lstrip()
                        diff.modify_line(lineno, replacement)
                        break

        return diff

    def run(self, filename, file,
            dependency_results=dict()
            ):
        """
        Uses PyFlakesASTBear to get only FutureImportation nodes.
        """
        # Preprocess all the lines containing __future__ imports
        affected_lines = set()
        # Get the maximum column offset of FutureImportation for a line
        max_column_for_a_line = dict()
        for result in dependency_results.get(PyFlakesASTBear.name, []):
            for node in result.get_nodes(result.module_scope[0],
                                         FutureImportation):
                lineno = result.get_node_location(node)[0]
                col_offset = result.get_node_location(node)[2]
                if lineno in max_column_for_a_line:
                    max_column_for_a_line[lineno] = max(
                        max_column_for_a_line[lineno],
                        col_offset)
                else:
                    max_column_for_a_line[lineno] = col_offset
                affected_lines.add(lineno)

        for line in affected_lines:
            corrected = self.remove_future_imports(
                file, line, max_column_for_a_line[line],
                affected_lines
            )
            yield Result.from_values(
                origin=self,
                message='Future import(s) found',
                file=filename,
                diffs={filename: corrected},
                line=line)
```

## A generic implementation for detecting `__future__` imports

This implementation does not require coala to be installed and is
designed so that it can be used with other tools like flake8. It will
be housed in `pyflakes-generic-plugins`.

Here is a prototype for NoFutureImport, which uses the pyflakes AST:

```python
# Imports omitted for brevity
__version__ = '0.1'

CODE = 'F482'


class NoFutureImport(object):
    name = 'no_future'
    version = __version__

    def __init__(self, tree, filename):
        self.tree = tree
        self.filename = filename
        self._checker = None
        self._module_scope = None

    @property
    def checker(self):
        if self._checker is None:
            self._checker = Checker(self.tree, self.filename)
        return self._checker

    @checker.setter
    def checker(self, checker):
        self._checker = checker

    @property
    def module_scope(self):
        if self._module_scope is None:
            self._module_scope = list(
                filter(lambda scope: isinstance(scope, ModuleScope),
                       self.checker.deadScopes))[0]
        return self._module_scope

    def set_pyflakes_data(self, checker):
        self.checker = checker

    def run(self):
        for _, node in self.module_scope.items():
            if isinstance(node, FutureImportation):
                message = ('{code}: Future import {name} found'
                           .format(name=node.name,
                                   code=CODE))
                yield (node.source.lineno, node.source.col_offset,
                       message, NoFutureImport)
```
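
For flake8 to pick up the plugin, it has to be registered as a
setuptools entry point in the `flake8.extension` group. Below is a
minimal sketch of how this could look for the
`pyflakes-generic-plugins` package; the module path
`pyflakes_generic_plugins.no_future` is an assumption, not a decided
layout:

```python
# setup.py of pyflakes-generic-plugins (sketch, hypothetical layout)
from setuptools import setup, find_packages

setup(
    name='pyflakes-generic-plugins',
    version='0.1',
    packages=find_packages(),
    install_requires=['flake8', 'pyflakes'],
    entry_points={
        'flake8.extension': [
            # flake8 maps the error code prefix to the plugin class.
            'F482 = pyflakes_generic_plugins.no_future:NoFutureImport',
        ],
    },
)
```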
