diff --git a/cEP-0025.md b/cEP-0025.md new file mode 100644 index 000000000..9cb9fbd9f --- /dev/null +++ b/cEP-0025.md @@ -0,0 +1,305 @@ +# Integration of pyflakes-enhanced-AST into coala + +| Metadata | | +| -------- | ----------------------------------------------- | +| cEP | 22 | +| Version | 1.0 | +| Title | Integration of pyflakes-enhanced-AST into coala | +| Authors | Ankit Joshi | +| Status | Proposed | +| Type | Feature | + +## Abstract + +This document describes how to create a metabear to integrate +**pyflakes-enhanced-AST** into coala and demonstrates its use by +creating a simple dependent bear. + +## Introduction + +**Flake8** is the most commonly used linter framework when it comes to static +code analysis for python. It wraps around the tools **pyflakes**, +**pycodestyle** and **mccabe** scripts. These tools are implemented on different +interfaces and flake8 combines their potential. + +It even provides with a plugin mechanism. Additional linters are +supported using the flake8 plugin system, and may use **CPython** ast standard +library or may be built using pycodestyle's internal logical-line-based +checker API (checkout [Hacking](https://github.com/openstack-dev/hacking)). + +There has not been an option to use the pyflakes-enhanced-AST based checker +API to create custom linters. Thus, the full potential of enhanced AST isn't +utilized , a whole lot of rework is required to do the basic traversing +and collection of important nodes. Pyflakes provides with a basic API +that does the traversing. So, if a developer uses enhanced AST he just +needs to work on the implementation of the new logic that his/her plugin +provides and not about the fidelity of the basic node handlers. + +The reason for choosing pyflakes over its equivalents like +[Astroid](https://github.com/PyCQA/astroid) lies in its simplicity +and provision of lots of re-usable analysis. Astroid is more detailed and +powerful, thus making it complicated. Plus, there is a much larger ecosystem +of flake8 plugins. + +## Proposed Change + +Here is a brief overview of the architecture: + +1. There will be a separate repository named as coala-pyflakes which will + be installable via `pip install pyflakes-bears`. +2. The repository will contain a new package `pyflakes-bears` which + would house all bears which use pyflakes-enhanced-ast. +3. The `pyflakes-bears` would contain a metabear `PyFlakesASTBear` + which would be used by all other plugin bears. +4. The repository would even contain another package + `pyflakes-generic-plugins` that would be independent of coala and could + be run by flake8 or coala or any other similar tool. + +## Management of the new `coala-pyflakes` repository + +The repository will use a similar mechanism of what is being +used in `coala-bears` to test all its bears. + +## Implementation of PyFlakesASTBear + +1. `PyFlakesASTBear` will be `LocalBear` returning result of the type + `HiddenResult` since these results are not meant for the users but + for the dependent bear. The result structure is described by + `PyFlakesResult` class. +2. `PyFlakesResult` consists of a snapshot of the module at class, + function, module, generator and doctest level. Thus, providing + the developer multiple views of a file allowing him to retrieve + results as per his demand. +3. Apart from scope information, the metabear also returns a list of + **messages generated by pyflakes** itself. These s can be used + by the developer to add an additional filter layer that is + required to be resolved by the user before executing the plugin. + This is primarily done because pyflakes emits additional syntax + errors which are not caught by python interpretor and may affect + the working of a plugin. This check is optional and may or may not + be incorparated by the developer. +4. Helper function `get_node_location` provides with node location + in source file. +5. Helper function `get_nodes` provides with all the nodes of a + particular type present in a scope. + +Here is a prototype for the implementations of `PyFlakesASTBear`: + +```python +# Imports not mentioned to maintain brevity + +class PyFlakesResult(HiddenResult): + + def __init__(self, origin, deadScopes, pyflakes_messages): + + Result.__init__(self, origin, message='') + + self.module_scope = self.get_scopes(ModuleScope, deadScopes)[0] + self.class_scopes = self.get_scopes(ClassScope, deadScopes) + self.function_scopes = self.get_scopes(FunctionScope, deadScopes) + self.generator_scopes = self.get_scopes(GeneratorScope, deadScopes) + self.doctest_scopes = self.get_scopes(DoctestScope, deadScopes) + self.pyflakes_messages = pyflakes_messages + + def get_scopes(self, scope_type, scopes): + return list(filter(lambda scope: isinstance(scope, scope_type), + scopes)) + + def get_node_location(self, node): + return {'lineno': node.source.lineno, + 'depth': node.source.depth - 1, + 'col_offset': node.source.col_offset} + + def get_nodes(self, scope, node_type): + for _, node in scope.items(): + if isinstance(node, node_type): + yield node + + +class PyFlakesASTBear(LocalBear): + AUTHORS = {'The coala developers'} + AUTHORS_EMAILS = {'coala-devel@googlegroups.com'} + LICENSE = 'AGPL-3.0' + + def run(self, filename, file): + """ + Generates pyflakes-enhance-AST for the input file and returns + deadScopes as HiddenResult + :return: + One HiddenResult containing a dictionary with keys being + type of scope and values being a list of scopes generated + from the file. + """ + tree = ast.parse(''.join(file)) + result = Checker(tree, filename=filename, withDoctest=True) + + yield PyFlakesResult(self, result.deadScopes, result.messages) +``` + +## Demonstrating use of PyFlakesASTBear by creating a simple bear + +A hassle free solution to the problem of finding all the future imports +present in the source code. + +Here is a prototype for the `NoFutureImportBear` that uses `PyFlakesASTBear`: + +```python +# Imports not mentioned to maintain brevity + +class NoFutureImportBear(LocalBear): + """ + Uses pyflakes-enhance-AST to detect use of future imports + """ + BEAR_DEPS = {PyFlakesASTBear} + + def run(self, filename, file, + dependency_results=dict() + ): + for result in dependency_results.get(PyFlakesASTBear.name, []): + for node in result.get_nodes(result.module_scope, + FutureImportation): + lineno = result.get_node_location(node)['lineno'] + corrected = self.remove_import(file, lineno, node.name) + yield Result.from_values( + origin=self, + message='Future import %s found' % node.name, + file=filename, + diffs={filename: corrected}, + line=result.get_node_location(node)['lineno']) +``` + +## A generic implementation for detecting `__future__` imports + +This implementation doesn't requires coala to be installed and is +designed so that it can be used in other tools like flake8. coala +will also be able to use this plugin, but it is also usable by flake8. +This implementation will be housed in `pyflakes-generic-plugins`. + +Here is a prototype for the `NoFutureImport` using **pyflakes-enhanced-AST**: + +```python +# Imports not mentioned to maintain brevity +__version__ = '0.1' + +CODE = 'F482' + + +class NoFutureImport(object): + name = 'no_future' + version = __version__ + + def __init__(self, tree, filename): + self.tree = tree + self.filename = filename + self._checker = None + self._module_scope = None + + @property + def checker(self): + if self._checker is None: + self._checker = Checker(self.tree, self.filename) + return self._checker + + @checker.setter + def checker(self, checker): + self.checker = checker + + @property + def module_scope(self): + if self._module_scope is None: + self._module_scope = list(filter(lambda scope: isinstance(scope, + ModuleScope), + self.checker.deadScopes))[0] + return self._module_scope + + def run(self): + for _, node in self.module_scope.items(): + if isinstance(node, FutureImportation): + message = '{code}: Future import {name} found' + .format(name=node.name, + code=CODE) + yield (node.source.lineno, node.source.col_offset, + message, NoFutureImport) +``` + +## Demonstrating use of PyFlakesASTBear by creating a fixes bear + +A hassle free solution to the problem of removing all the future imports +present in the source code. + +Here is a prototype for the `NoFutureImportBear` that uses `PyFlakesASTBear`: + +```python +# Imports not mentioned to maintain brevity + +class NoFutureImportBear(LocalBear): + """ + Uses pyflakes-enhance-AST to remove future imports + """ + BEAR_DEPS = {PyFlakesASTBear} + + def remove_future_imports(self, file, lineno, + max_col_offset, affected_lines): + """ + Removes all __future__ imports from the input line + """ + diff = Diff(file) + line = file[lineno - 1] + # If a line contains \ in the end + if line.rstrip()[-1] == '\\': + next_line = file[lineno] + semicolon_index = next_line.find(';') + diff.delete_line(lineno) + if semicolon_index == -1: + diff.delete_line(lineno + 1) + else: + replacement = next_line[semicolon_index + 1:].lstrip() + diff.modify_line(lineno+1, replacement) + else: + semicolon_indices = [i for i, a in enumerate(line) if a == ';'] + if len(semicolon_indices) == 0: + diff.delete_line(lineno) + else: + for indice in semicolon_indices: + if indice > max_col_offset: + replacement = line[indice + 1:].lstrip() + diff.modify_line(lineno, replacement) + break + + return diff + + def run(self, filename, file, + dependency_results=dict() + ): + """ + Uses PyFlakesASTBear to get only FutureImportation + """ + # Preprocess all the lines containing __future__ imports + affected_lines = set() + # Get the maximum column offset of FutureImportation for a line + max_column_for_a_line = dict() + for result in dependency_results.get(PyFlakesASTBear.name, []): + for node in result.get_nodes(result.module_scope, + FutureImportation): + lineno = result.get_node_location(node)['lineno'] + col_offset = result.get_node_location(node)['col_offset'] + if lineno in max_column_for_a_line: + max_column_for_a_line[lineno] = max( + max_column_for_a_line[lineno], + col_offset) + else: + max_column_for_a_line[lineno] = col_offset + affected_lines.add(lineno) + + for line in affected_lines: + corrected = self.remove_future_imports( + file, line, max_column_for_a_line[lineno], + affected_lines + ) + yield Result.from_values( + origin=self, + message='Future import(s) found', + file=filename, + diffs={filename: corrected}, + line=line) +```