SPEC 12: Formatting mathematical expressions #326
Open: tupui wants to merge 23 commits into scientific-python:main from tupui:spec_12
Commits (23)
- ec37a94 SPEC 13: Recommended targets and naming conventions (tupui)
- ff7d1ab SPEC 12: Formatting mathematical expressions (tupui)
- f46a93e Attempt to make SPEC 12 complete and unambiguous (mdhaber)
- 5f4b20a Apply suggestions from code review (tupui)
- 34aa825 Improvements per self-review (mdhaber)
- 63d45e6 Apply suggestions from code review (mdhaber)
- f2a96a6 Apply suggestions from code review (mdhaber)
- efa2ea8 Remove old rule 9 (mdhaber)
- 8065ec6 Merge pull request #1 from mdhaber/spec_12 (tupui)
- a3d70f8 Update index.md (tupui)
- 6c1174e Update index.md (tupui)
- 39d2fc2 Update index.md (tupui)
- 07e93f4 Run linter (stefanv)
- 92093c5 Fix spelling error (stefanv)
- 5e7a285 Update spec-0012/index.md (mdhaber)
- 503ce73 MAINT: adjustments per review (mdhaber)
- e4f3c6f [pre-commit.ci 🤖] Apply code format tools to PR (pre-commit-ci[bot])
- 101fde4 Merge remote-tracking branch 'origin/main' into spec_12 (stefanv)
- 246f73c Merge branch 'main' into spec_12 (mdhaber)
- 5b738a7 Apply suggestions from code review (mdhaber)
- 716808b Merge branch 'main' into spec_12 (mdhaber)
- bed0e02 Apply suggestions from code review (mdhaber)
- 8b5b767 Update spec-0012/index.md (mdhaber)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,201 @@ | ||
--- | ||
title: "SPEC 12 — Formatting mathematical expressions" | ||
number: 12 | ||
date: 2024-06-06 | ||
author: | ||
- "Pamphile Roy <[email protected]>" | ||
- "Matt Haberland <[email protected]>" | ||
discussion: https://discuss.scientific-python.org/t/spec-12-formatting-mathematical-expressions | ||
endorsed-by: | ||
--- | ||
|
||
## Description | ||
|
||
[PEP 8](https://peps.python.org/pep-0008) | ||
and other established styling documents either | ||
|
||
- lack comprehensive guidelines about mathematical expressions, or | ||
- provide simple rules that ignore the relationship between formatting and readability. | ||
|
||
In practice, this leads to varying, even conflicting, mathematical expression | ||
styles across the ecosystem. We seek to standardize the representation of | ||
mathematical code for the same reason we standardize formatting of other code: | ||
it brings consistency to the ecosystem and allows collaborators to focus on | ||
more important aspects of their work. | ||
|
||
## Implementation | ||
|
||
These rules are intended to respect and | ||
complement the [PEP 8 standards](https://peps.python.org/pep-0008), such as using | ||
[implied line continuation](https://peps.python.org/pep-0008/#maximum-line-length) and | |
[breaking lines before binary operators](https://peps.python.org/pep-0008/#should-a-line-break-before-or-after-a-binary-operator)[^1]. | |
|
||
0. Unless otherwise specified, rely on the implicit order of operations; | ||
i.e., do not add extraneous parentheses. For example, prefer `u**v + y**z` | ||
over `(u**v) + (y**z)`, and prefer `x + y + z` over `(x + y) + z`. A full | ||
list of implicit operator priority levels is given by | ||
[Operator Precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence). | ||
1. Always use the `**` operator and unary `+`, `-`, and `~` operators _without_ | ||
surrounding whitespace. For example, prefer `-x**4` over `- (x ** 4)`. | ||
|
||
2. Always surround non-PEMDAS[^2] operators with whitespace, and always make the priority of | ||
non-PEMDAS operators explicit. For example, prefer `(x == y) or (w == t)` over | ||
`x==y or w==t`.[^3] | ||
3. Always surround AS[^2] operators with whitespace. | ||
4. Typically, surround MD[^2] operators with whitespace, except in the following situations. | ||
- When there are lower-priority operators (namely AS) within the same compound | ||
expression[^4]. For example, prefer `z = -x * y**t` over `z = -x*y**t`, but | ||
prefer `z = w - x*y**t` over `z = w - x * y**t` due to the presence of the | ||
lower-priority addition operator. | ||
|
||
- When the division operation would be written mathematically as a fraction with a | ||
horizontal bar. For example, prefer `z = t/v * x/y` over `z = t / v * x / y` | ||
if this would be written mathematically as the product of two fractions, | ||
e.g. $\frac{t}{v} \cdot \frac{x}{y}$. | ||
|
||
5. Considering the previous rules, only `**`, `*`, `/`, and the unary `+`, `-`, and `~` | ||
operators can appear in implicit subexpressions[^4] without spaces. In such expressions, | ||
|
||
- Use at most one unary operator, and if used, ensure that it is the leftmost operator. | ||
- Use at most one `**` operator, and if used, ensure that it is the rightmost operator. | ||
|
||
- Use at most one `/` operator, and if used, ensure that it is the rightmost operator except for `**`. | ||
|
||
To achieve these goals, simplification or the addition of parentheses may be required. | ||
For example: | ||
|
||
- The expressions `--x` and `-~x` would be implicit subexpressions without spaces | ||
containing more than one unary operator. The former can be simplified to `+x` or | ||
simply `x`, and the latter requires explicit parentheses, i.e. `-(~x)`. | ||
- The expression `x**y**z` would be an implicit subexpression without spaces | ||
containing more than one `**` operator. This code would be executed as `x**(y**z)` | ||
following the implicit order, but the explicit parentheses should be included for | ||
clarity. | ||
- In the expression `t**v*x**y + z`, no spaces are used around the multiplication | ||
operator due to the presence of the lower-priority addition operator. However, | ||
this would lead to `t**v*x**y` being an implicit subexpression without spaces | ||
containing more than one `**` operator. This code would be executed as | ||
`(t**v)*(x**y) + z`, but the explicit parentheses should be included for clarity. | ||
- In the expression `z + x**y/w`, no spaces are used around the division operator | ||
due to the presence of the lower-priority addition operator. However, this would | ||
lead to `x**y/w` being an implicit subexpression without spaces containing `**` | ||
to the left of another operator. This code would be executed as `z + (x**y)/w`, | ||
but the explicit parentheses should be included for clarity. | ||
|
||
6. Simplify combinations of unary and binary `+` and `-` operators when possible. | ||
For example, | ||
- prefer `x + y` over `x + +y`, | ||
- prefer `x + y` over `x - -y`, | ||
- prefer `x - y` over `x - +y`, and | ||
- prefer `x - y` over `x + -y`. | ||
7. If required to satisfy other style requirements, include line breaks before | ||
the outermost explicit subexpression possible. For example, if | ||
`t + (w + (x + (y + z)))` must be broken, prefer | |
```python3 | |
(t | |
+ (w + (x + (y + z)))) | |
``` | |
over | |
```python3 | |
(t + (w + (x + (y | |
+ z)))) | |
``` | |
``` | ||
If there are multiple candidates, include the break at the first opportunity. | ||
8. If line breaks must occur within a compound subexpression, the break should | ||
be placed before the operator with lowest priority. For example, if | ||
`(x + y*z)` must be broken, prefer | |
```python3 | ||
(x | ||
+ y*z) | ||
``` | ||
over | ||
```python3 | ||
(x + y | ||
* z) | ||
``` | ||
If there are multiple candidates, include the break at the first opportunity. | ||
9. Any of the preceding rules may be broken if there is a clear reason to do so, for example: | |
- _Conflict with other style rules_. For example, there is not supposed to be | ||
whitespace surrounding the `**` operator, but one can imagine a chain of `**` | ||
operations that exhausts the character limit of a line. | ||
- _Domain knowledge_. For instance, in the expression | ||
`t = (x + y) - z`, it may be important to emphasize that the addition should be | ||
performed first for numerical reasons or because `(x + y)` is a conceptually | ||
important quantity. In such cases, consider adding a comment, e.g. | ||
```python3 | ||
t = (x + y) - z # perform `x + y` first for precision | ||
``` | ||
or breaking the expressions into separate logical lines, e.g. | ||
```python3 | ||
w = x + y | ||
t = w - z | ||
``` | ||
|
||
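
The equivalences that rules 0, 1, and 5 rely on can be checked mechanically. The following is a minimal sketch, not part of the SPEC: it uses the standard-library `ast` module, and the helper name `same_parse` is hypothetical. Two spellings that parse to the same syntax tree differ only in formatting, because parentheses and whitespace are not recorded in the tree.

```python
import ast


def same_parse(a: str, b: str) -> bool:
    """Return True if two expression strings parse to the same syntax tree."""
    tree_a = ast.dump(ast.parse(a, mode="eval"))
    tree_b = ast.dump(ast.parse(b, mode="eval"))
    return tree_a == tree_b


# Pairs taken from the rules above: (implicit spelling, explicit spelling).
EQUIVALENT = [
    ("u**v + y**z", "(u**v) + (y**z)"),      # rule 0
    ("x + y + z", "(x + y) + z"),            # rule 0
    ("-x**4", "- (x ** 4)"),                 # rule 1
    ("-~x", "-(~x)"),                        # rule 5
    ("x**y**z", "x**(y**z)"),                # rule 5
    ("t**v*x**y + z", "(t**v)*(x**y) + z"),  # rule 5
    ("z + x**y/w", "z + (x**y)/w"),          # rule 5
]

for implicit, explicit in EQUIVALENT:
    assert same_parse(implicit, explicit), (implicit, explicit)

# Parentheses that change the order of operations give a different tree.
assert not same_parse("x**y**z", "(x**y)**z")
```

The simplifications in rule 6 (for example `x - -y` to `x + y`) do change the syntax tree, so a check of this kind does not apply to them; they are equivalent in value for ordinary numeric types rather than in parse.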
## Terminology | ||
|
||
An "explicit" expression is a code expression enclosed within parentheses or | ||
otherwise syntactically separated from other expressions (i.e. by code other | ||
than operators, whitespace, literals, or variables). For example, in the list | ||
comprehension: | ||
|
||
```python3 | ||
[j for j in range(1, i + 1)] | ||
``` | ||
|
||
The output expression `j` is one explicit expression and the input sequence | ||
`range(1, i + 1)` is another. | ||
|
||
A "subexpression" is a subset of an expression that is either explicit or could | |
be made explicit (i.e. with parentheses) without affecting the order of | ||
operations. In the example above, `j` and `range(1, i + 1)` can also be | ||
referred to as explicit subexpressions of the whole expression, and `1` and | ||
`i + 1` are explicit subexpressions of the expression `range(1, i + 1)`. `i` and | ||
`1` are "implicit" subexpressions of `i + 1`: they could be written as explicit | ||
subexpressions `(i)` and `(1)` without affecting the order of operations, but they | ||
are not explicit as written. | ||
|
||
As another example, in `x + y*z`, `y*z` is a subexpression because it could be made | ||
explicit as in `x + (y*z)` without changing the order of operations. However, `x + y` | ||
would not be a subexpression because `(x + y)*z` would change the order of operations. | ||
Note that `x + y*z` as a whole may also be referred to as a "subexpression" rather than | ||
an "expression" even though `(x + y*z)` is not a proper subset of the whole. | ||
|
||
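
The definition of a subexpression can be tested directly: wrap the candidate in parentheses and see whether the parse changes. This is a rough sketch under stated assumptions, not part of the SPEC. The helper name `is_subexpression` is hypothetical, the candidate must appear verbatim in the expression, and only its first occurrence is wrapped.

```python
import ast


def is_subexpression(expr: str, part: str) -> bool:
    """Check whether `part` can be parenthesized in `expr` without changing the parse.

    Assumption: `part` appears verbatim in `expr`; only the first
    occurrence is wrapped in parentheses.
    """
    if part not in expr:
        raise ValueError(f"{part!r} does not appear in {expr!r}")
    wrapped = expr.replace(part, f"({part})", 1)
    try:
        original_tree = ast.dump(ast.parse(expr, mode="eval"))
        wrapped_tree = ast.dump(ast.parse(wrapped, mode="eval"))
    except SyntaxError:
        # Wrapping produced invalid code, so `part` was not an expression.
        return False
    return original_tree == wrapped_tree


# Examples from the text above.
assert is_subexpression("x + y*z", "y*z")        # x + (y*z) keeps the order
assert not is_subexpression("x + y*z", "x + y")  # (x + y)*z changes the order
assert is_subexpression("i + 1", "i")            # (i) + 1 keeps the order
```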
A "simple" expression is an expression involving only one operator priority level | ||
without considering the operators within explicit subexpressions. | ||
A "compound" expression is an expression involving more than one operator | ||
priority level without considering the contents of explicit subexpressions. | ||
For example, | ||
|
||
- `x + y - z` is a simple expression because `+` and `-` have the | ||
same priority level. There are no explicit subexpressions to be ignored. | ||
- `x * (y + z)` is also a simple expression because there is only one operator | ||
between `x` and the explicit subexpression `(y + z)`; we ignore the contents - and | ||
especially the operator - within the explicit subexpression; conceptually, it may | |
be regarded as `(...)`. | |
- `x*y + z` is a compound expression; there are two operators with different priority | |
levels and no explicit subexpressions that can be ignored. | |
|
||
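
The simple/compound distinction can be approximated with the standard-library `tokenize` module, which, unlike `ast`, still sees the parentheses. The sketch below is illustrative only and not part of the SPEC. The names `priority_levels` and `classify` are hypothetical, keyword operators (`and`, `or`, `not`, `in`, `is`) are not handled, and treating an expression with no operators as "simple" is an assumption the SPEC does not address.

```python
import io
import tokenize

# Priority levels of symbolic binary operators, grouped as in Python's
# operator precedence table. The level names are labels chosen for this sketch.
BINARY_LEVEL = {
    "**": "power",
    "*": "MD", "/": "MD", "//": "MD", "%": "MD", "@": "MD",
    "+": "AS", "-": "AS",
    "<<": "shift", ">>": "shift",
    "&": "bitand", "^": "bitxor", "|": "bitor",
    "<": "comparison", "<=": "comparison", ">": "comparison",
    ">=": "comparison", "==": "comparison", "!=": "comparison",
}
OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}
OPERANDS = (tokenize.NAME, tokenize.NUMBER, tokenize.STRING)


def priority_levels(expr: str) -> set:
    """Collect priority levels of operators outside explicit subexpressions."""
    levels = set()
    depth = 0                 # bracket nesting; contents at depth > 0 are ignored
    prev_is_operand = False   # distinguishes unary from binary `+` and `-`
    for tok in tokenize.generate_tokens(io.StringIO(expr).readline):
        if tok.type in OPERANDS:
            prev_is_operand = True
        elif tok.type == tokenize.OP:
            s = tok.string
            if s in OPENERS:
                depth += 1
                prev_is_operand = False
            elif s in CLOSERS:
                depth -= 1
                prev_is_operand = True
            else:
                if depth == 0:
                    if s == "~" or (s in ("+", "-") and not prev_is_operand):
                        levels.add("unary")
                    elif s in BINARY_LEVEL:
                        levels.add(BINARY_LEVEL[s])
                prev_is_operand = False
    return levels


def classify(expr: str) -> str:
    """Return "simple" for at most one priority level, otherwise "compound"."""
    return "simple" if len(priority_levels(expr)) <= 1 else "compound"


# The three examples from the list above.
assert classify("x + y - z") == "simple"
assert classify("x * (y + z)") == "simple"
assert classify("x*y + z") == "compound"
```

Counting bracket depth is what implements "without considering the contents of explicit subexpressions": the `+` inside `(y + z)` is never recorded, so `x * (y + z)` reports a single level.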
[^1]: | ||
Although examples do not show the use of hanging indent, any of the indentation styles | ||
allowed by [PEP 8 Indentation](https://peps.python.org/pep-0008/#indentation) are permitted | ||
by this SPEC. | ||
|
||
[^2]: | ||
The acronym PEMDAS commonly refers to "parentheses", "exponentiation", "multiplication", | ||
"division", "addition", and "subtraction". Herein, we will consider these operators | ||
to be "PEMDAS operators", and we will also include the unary `+`, `-`, and `~` in | ||
this category for convenience. The order of operations of PEMDAS operators is typically | ||
taught in primary school and reinforced throughout a programmer's training and | ||
experience, so it is assumed that most programmers are comfortable relying on the | ||
implicit order of operations of expressions involving a few PEMDAS operations. Implicit | ||
order of operations becomes less obvious as the number of distinct operator priority | ||
levels increases and when multiple non-PEMDAS operators are involved. Portions of this | ||
acronym, namely MD and AS, will be used to refer to the corresponding operators. | ||
|
||
[^3]: | ||
There is a case for simply eliminating spaces to reinforce the implicit order | ||
of operations, as in `x==y or w==t`. However, if this were the rule, following | ||
the rule would require users to remember the full order of operations hierarchy | ||
and apply it without mistakes. Use of explicit parentheses with non-PEMDAS | ||
operators leads to simpler rules, is more explicit, and is not uncommon in | ||
existing code. | ||
|
||
[^4]: | ||
For definitions of "explicit"/"implicit" and "simple"/"compound" | ||
"expressions"/"subexpressions", see Terminology. |
Should any of these rules differ based on the expression type? E.g., all of the examples below use `Name` nodes (like `x`, `y`, etc.). What if the expression uses calls or subscripts or similar? Like `f() ** 2`?

We might want to add examples, but no - to keep things simple, I didn't consider changing the rules based on that.