From da6a4477179d36a198a84268f5d6825ec0f9297d Mon Sep 17 00:00:00 2001
From: Muhammad Kaisar Arkhan
Date: Thu, 10 May 2018 21:04:48 +0700
Subject: [PATCH] cEP 23: Separation of bears' metadata

Closes https://github.com/coala/cEPs/issues/138
---
 cEP-0023.md | 378 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 378 insertions(+)
 create mode 100644 cEP-0023.md

diff --git a/cEP-0023.md b/cEP-0023.md
new file mode 100644
index 000000000..67c81581e
--- /dev/null
+++ b/cEP-0023.md
@@ -0,0 +1,378 @@
+# Separation of bears' metadata
+
+| Metadata |                                               |
+|----------|-----------------------------------------------|
+| cEP      | 0023                                          |
+| Version  | 1.0                                           |
+| Title    | Separation of bears' metadata                 |
+| Authors  | Muhammad Kaisar Arkhan                        |
+| Status   | Proposed                                      |
+| Type     | Feature                                       |
+
+## Abstract
+
+This cEP proposes a way to separate bears' metadata from their code and to
+remove the need for Python when writing bears.
+
+## How bears are written currently
+
+Most bears are composed of Python boilerplate code containing the metadata
+coala needs, some more metadata to identify what a bear is, and docstrings
+for the bear description.
+
+[GoVetBear][GoVetBear]
+
+Of course not all bears are just boilerplate code. Some require Python code to
+help coala execute the linters, parse logs, make configuration files, etc.
+
+[CoffeeLintBear][CoffeeLintBear]
+
+Some bears are written in-house by the coala team.
+
+[SpaceConsistencyBear][SpaceConsistencyBear]
+
+## Problems with the current way of writing bears
+
+### Duplicate code all over the place
+
+Duplicated boilerplate makes it painful to introduce a new feature that
+deprecates the old methods.
+
+When writing a bear, you have to copy the Python boilerplate and fill in the
+metadata.
+
+When a new feature deprecates the old way of doing things, we have to
+change the code of almost every bear.
+
+[Example 1][Example 1]
+
+### Python is not needed
+
+Bears such as [GoVetBear][GoVetBear] don't need Python to declare metadata.
+
+The `@linter` decorator helps suppress a lot of boilerplate code, but it
+still has the issue of requiring Python just to declare metadata.
+
+Some projects/orgs may need to write their own bears so coala can use their
+exclusive tools (such as commercial code safety checks that are commonly used
+by embedded software projects).
+
+Not all projects/organizations want snippets of Python code in their projects
+just to declare how to use a linter, and not everyone can write
+Python.
+
+### Development is slow
+
+This is specific to bears that are made in-house or require a lot of fancy
+code to run.
+
+When writing a bear, we have to test it.
+
+This requires setting up a coala development environment, making sure
+coala-bears isn't installed or declaring the bears directory (which may result
+in a conflict), and running coala with a long list of arguments or writing a
+`.coafile`.
+
+Alternatively, do it the other way around: write the tests first and just run
+`py.test` to test your fresh new bear.
+
+Either way, both add a lot of time just to test a bear during
+development. You shouldn't need to write a lot of unnecessary boilerplate code
+just to run the bear ad hoc. It should be as simple as running it in your
+shell.
+
+### Dual functionality of bears
+
+Are bears linters or are they just metadata to instruct coala to run linters?
+
+Should bears just declare metadata and have the code that makes them coala-able
+separated?
+
+This has been an issue for a while and it generates inconsistencies all over
+the place.
+
+Some bears need extra code to generate configuration files, such as
+[CoffeeLintBear][CoffeeLintBear].
+
+Some bears contain the checking code themselves, such as
+[SpaceConsistencyBear][SpaceConsistencyBear].
+
+Some of the Python bears just call the linter's functions directly, such as
+[PEP8Bear][PEP8Bear].
+
+I believe bears should be simply metadata while the actual linter tool should
+be separated from them.
+
+Supporting code, such as generating config files, can easily be moved into an
+external script.
+
+### Dependency Hell
+
+Tracking coala and coala-bears has been a problem. coala and coala-bears must
+be released together, and releases are quite slow because coala needs a lot of
+changes, while bears should be able to be released sooner.
+
+This holds back a lot of new bears and bug fixes.
+
+coala-bears should have a steady and frequent release cycle so people can enjoy
+bug fixes and new bears without coala development holding them back.
+
+Sadly, this is hard to do because coala-bears is a bunch of Python
+code that calls things from coala that may or may not be there.
+
+This creates a dependency cycle between coala and coala-bears that should
+not be ignored.
+
+### Security
+
+When bear code runs inside the context of the coala process, it is
+possible to introduce bugs that have access to the coala process.
+
+This is bad since it is possible to leak information and possibly gain code
+execution, which in theory allows services such as continuous integration, or
+other specific deployments of coala, to be exploited and leak
+information such as secret deployment keys (for example, for the Play Store).
+
+coala should simply run linters in a separate process. It should not run
+them inside the same context.
+
+If we treat bears as just metadata, good security practices such as
+privilege separation, operating system specific mitigations, and many more
+become possible and much easier to implement.
+
+## Objective
+
+coala-bears can be simplified by an order of magnitude if it is treated as a
+repository filled with metadata that instructs coala on how to use linters.
+coala-bears should operate independently of coala development, enabling a
+faster release cycle that delivers bug fixes and new bears sooner.
+
+## Structure of Bears
+
+Collections of bears will be put inside the directories declared in
+`$COALA_BEAR_PATH`, with defaults such as
+`$HOME/.coala/bears:/usr/local/lib/coala/bears:/usr/lib/coala/bears`, in addition
+to a possible local `.coala` directory inside the project, where bears are
+located inside `.coala/bears`.
+
+```
+  /usr/local/lib/coala/bears
+...
+  |
+  |_ GoVetBear
+  |  |_ metadata.toml
+  |
+  |_ CoffeeLintBear
+  |  |_ metadata.toml
+  |  |_ bear.py
+  |  |_ generate_config.py
+  |
+  |_ SpaceConsistencyBear
+  |  |_ metadata.toml
+  |  |_ bear.py
+  |
+  |_ PEP8Bear
+  |  |_ metadata.toml
+  |  |_ bear.py
+...
+
+  .coala/bears
+  |_ AeroplaneSafetyComplianceBear
+  |  |_ metadata.toml
+  |
+  |_ MemoryStructureFormatBear
+     |_ metadata.toml
+     |_ check_memory_structure.sh
+```
+
+The `metadata.toml` file will declare the metadata required to instruct coala on
+how to use the tool, what arguments to pass when executing it, what dependencies
+it requires, etc.
+
+Inside the folder, a script or an executable can be added that runs without
+needing coala, thus removing the dependency cycle.
+
+The script will be launched using a general fork+exec model to prevent it
+from doing malicious things inside the context of coala.
+
+This also enables coala itself to add more safety features, such as operating
+system specific mechanisms (FreeBSD Capsicum, OpenBSD pledge, Linux
+seccomp, etc.) and more fine-grained privilege separation. However, those
+aren't part of this cEP and will be covered another time.
+
+## `metadata.toml`
+
+`metadata.toml` is a TOML file declaring the information coala
+needs.
+
+TOML is chosen since it has enough features to do what we want. We may need to
+research whether ini files are good enough, since support for those is already
+in Python's standard library.
+
+Here are a couple of examples:
+
+**GoVetBear/metadata.toml**
+```toml
+[identity]
+name = "GoVetBear"
+description = """\
+    Analyze Go code and raise suspicious constructs, such as printf calls \
+    whose arguments do not correctly match the format string, useless \
+    assignments, common mistakes about boolean operations, unreachable code, \
+    etc.\
+    """
+languages = ["Go"]
+authors = ["The coala developers"]
+authors_email = ["coala-devel@googlegroups.com"]
+license = "AGPL-3.0"
+can_detect = ["Unused code", "Smell", "Unreachable Code"]
+
+[[requirements]]
+type = "AnyOneOf"
+
+  [[requirements.child]]
+  type = "binary"
+  name = "go"
+
+  [[requirements.child]]
+  type = "apt"
+  name = "golang"
+
+[[requirements]]
+type = "GoRequirement"
+package = "golang.org/cmd/vet"
+flag = "-u"
+
+[run]
+executable = "go"
+arguments = "vet"
+use_stdout = false
+use_stderr = true
+output_format = "regex"
+output_regex = '.+:(?P<line>\d+): (?P<message>.*)'
+```
+
+**SpaceConsistencyBear/metadata.toml**
+```toml
+[identity]
+name = "SpaceConsistencyBear"
+description = """\
+    Check and correct spacing for all textual data. This includes usage of \
+    tabs vs. spaces, trailing whitespace and (missing) newlines before \
+    the end of the file.\
+    """
+languages = ["All"]
+authors = ["The coala developers"]
+authors_email = ["coala-devel@googlegroups.com"]
+license = "AGPL-3.0"
+can_detect = ["Formatting"]
+
+[[params]]
+name = "use_spaces"
+description = "True if spaces are to be used instead of tabs."
+type = "bool"
+
+[[params]]
+name = "allow_trailing_whitespace"
+description = "Whether to allow trailing whitespace or not."
+type = "bool"
+default = false
+
+[[params]]
+name = "indent_size"
+description = "Number of spaces per indentation level"
+type = "int"
+default = 8
+
+[[params]]
+name = "enforce_newline_at_EOF"
+description = "Whether to enforce a newline at the end of file"
+type = "bool"
+default = true
+format = "enforce-newline={}"
+
+[run]
+executable = "bear.py"
+local = true
+use_coala_logging_style = true
+```
+
+As the SpaceConsistencyBear example shows, the bear is treated not as Python
+code running under coala but rather as if it were its own linter. The `local`
+variable simply indicates that the file is inside the bear directory and not in
+`$PATH`, and the `use_coala_logging_style` variable tells coala that the bear
+uses the common log format.
+
+Parameters will be passed to the process as command-line arguments at launch.
+With the defaults of the above example, this results in the following
+command:
+
+```sh
+/usr/local/lib/coala/bears/general/SpaceConsistencyBear/bear.py \
+    --allow_trailing_whitespace=false \
+    --indent_size=8 \
+    enforce-newline=true
+```
+
+The above example is formatted for readability; the real command will be on one
+line.
+
+**CoffeeLintBear/metadata.toml**
+```toml
+[identity]
+name = "CoffeeLintBear"
+description = "Check CoffeeScript for a clean and consistent file"
+url = "http://www.coffeelint.org"
+languages = ["CoffeeScript"]
+authors = ["The coala developers"]
+authors_email = ["coala-devel@googlegroups.com"]
+license = "AGPL-3.0"
+can_detect = ["Syntax", "Formatting", "Smell", "Complexity", "Duplication"]
+
+[severity_map]
+normal = "warn"
+major = "error"
+info = "ignore"
+
+[[requirements]]
+type = "binary"
+name = "coffeelint"
+
+[[params]]
+name = "max_line_length"
+description = "Maximum number of characters per line."
+type = "int"
+default = 79
+
+...
+
+[prerun]
+executable = "generate_config.py"
+local = true
+use_coala_logging_style = true
+
+[run]
+executable = "bear.py"
+ignore_params = true
+local = true
+use_coala_logging_style = true
+```
+
+The CoffeeLintBear example above shows what the metadata looks like if a bear
+requires special treatment such as generating configuration files and
+translating the output of the linter.
+
+If special treatment is required after the linter is executed, a `postrun`
+section can be added as well.
+
+The `prerun` and `postrun` sections have the same format as the `run` section.
+
+## Process
+
+TODO
+
+[GoVetBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/go/GoVetBear.py
+[CoffeeLintBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/coffee_script/CoffeeLintBear.py
+[SpaceConsistencyBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/general/SpaceConsistencyBear.py
+[PEP8Bear]: https://github.com/coala/coala-bears/blob/c5a5e201a42c44c159b9c118b062417e4ae4b17f/bears/python/PEP8Bear.py
+[Example 1]: https://github.com/coala/coala-bears/commit/3cb9b148adc0dda51ac890188b38fd968f6058fd
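
The way `[[params]]` entries turn into command-line arguments for the `[run]` executable, as described for `SpaceConsistencyBear/metadata.toml`, could be sketched in Python roughly as follows. This is only an illustration, not coala's implementation. The names `build_command` and `render_value` are hypothetical, and so is the `settings` dictionary of user-supplied values. The sketch also assumes that `ignore_params = true` means no parameters are passed, and that the `arguments` string is split on whitespace. The metadata is written as a plain dictionary standing in for the parsed TOML.

```python
import os


def render_value(value):
    """Render a value as in the shell example (booleans in lowercase)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_command(bear_dir, metadata, settings=None):
    """Build the argument list for the `run` section of a parsed metadata.toml.

    `settings` (hypothetical) maps parameter names to user-supplied values
    that override the defaults declared in the metadata.
    """
    settings = settings or {}
    run = metadata["run"]
    executable = run["executable"]
    if run.get("local", False):
        # `local = true`: the file lives in the bear directory, not in $PATH.
        executable = os.path.join(bear_dir, executable)
    argv = [executable]
    # Assumption: `arguments` is a plain string that is split on whitespace.
    argv.extend(run.get("arguments", "").split())
    if run.get("ignore_params", False):
        # Assumption: `ignore_params = true` suppresses all parameters.
        return argv
    for param in metadata.get("params", []):
        name = param["name"]
        if name in settings:
            value = settings[name]
        elif "default" in param:
            value = param["default"]
        else:
            # No user value and no default: the parameter is omitted.
            continue
        # An explicit `format` replaces the `--name=value` form.
        template = param.get("format", "--" + name + "={}")
        argv.append(template.format(render_value(value)))
    return argv


# Stand-in for the parsed SpaceConsistencyBear/metadata.toml shown earlier.
SPACE_CONSISTENCY = {
    "params": [
        {"name": "use_spaces", "type": "bool"},
        {"name": "allow_trailing_whitespace", "type": "bool",
         "default": False},
        {"name": "indent_size", "type": "int", "default": 8},
        {"name": "enforce_newline_at_EOF", "type": "bool", "default": True,
         "format": "enforce-newline={}"},
    ],
    "run": {"executable": "bear.py", "local": True,
            "use_coala_logging_style": True},
}

command = build_command(
    "/usr/local/lib/coala/bears/general/SpaceConsistencyBear",
    SPACE_CONSISTENCY)
print(" ".join(command))
```

Returning a list rather than a single string means the result can be handed to something like `subprocess.run` without going through a shell, which fits the fork+exec model proposed above.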