Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use original source when building f-strings #298

Merged
merged 1 commit into from
Apr 9, 2021

Conversation

andersk
Copy link
Contributor

@andersk andersk commented May 2, 2020

Working at the AST level meant we only handled a small subset of Python when generating f-strings. By extracting the original code as a string we can handle many more cases.

A disadvantage of this approach is that it requires Python ≥ 3.8 for ast.expr.end_lineno, ast.expr.end_col_offset. This now uses the token stream like everything else, avoiding that requirement.

@asottile
Copy link
Owner

asottile commented May 2, 2020

the current approach is intentional, to avoid putting complex expressions into fstrings

@andersk
Copy link
Contributor Author

andersk commented May 2, 2020

Maybe, but it misses a lot of simple expressions too, such as 0 (which is tricky to handle at the AST level, because what if it was 0x0?).

@asottile
Copy link
Owner

asottile commented May 2, 2020

your approach also breaks 'foo {}'.format(x['bar']) fwiw, and dropping python3.6 and python3.7 is not going to be accepted either

for what it's worth, there's helpers already in pyupgrade to extract the positional arguments of function calls -- you can use those to do more expressions, but please keep the args which are upgraded simple :)

@andersk
Copy link
Contributor Author

andersk commented May 2, 2020

I handle 'foo {}'.format(x['bar']) by disallowing quotes at the string level in _simple_arg.

Yeah that’s what I started out doing, but I kept finding that I was writing more and more complicated code just to get a better approximation of the identity function. So what do you think about trying to do this at the token level?

@asottile
Copy link
Owner

asottile commented May 2, 2020

So what do you think about trying to do this at the token level?

(not sure if the comments raced, but I addressed this 😉)

for what it's worth, there's helpers already in pyupgrade to extract the positional arguments of function calls -- you can use those to do more expressions, but please keep the args which are upgraded simple :)

@andersk
Copy link
Contributor Author

andersk commented May 2, 2020

Right, that’s what I was replying to to when I said “that’s what I started out doing”. But

  1. there are a lot of cases,
  2. some of the existing cases are subtly wrong: _unparse(ast.parse('a[b,]').body[0].value)'a[b]',
  3. some distinctions are impossible to preserve at the AST level: 0 and 0x0 parse to the same AST.

@andersk
Copy link
Contributor Author

andersk commented May 2, 2020

And I don’t think any definition of “simple” is going to satisfy everyone. One can make “simple” expressions using any of a wide variety of syntax elements, or “complicated” expressions using a very limited set of syntax elements repeatedly. At least the string-based definition (does not contain quotes or backslash or newline) is objective, and a user can always extract a variable if they don’t like the result. But that’s just my view.

@asottile
Copy link
Owner

asottile commented May 2, 2020

(2) is very easily fixable, but also is forbidden in the fstrings rewriter now (and would only apply to type annotations where I believe such a thing isn't possible)
(1) and (3) are fixable by using the token stream which is what I was indicating above

I forget the name of the helper but you essentially get this information out of it at the token level which you can use to exactly preserve the source (I probably should have used it to do the typing classes, but decided it was easier to cut corners and go the unparse route since there's very few things that can be type annotations)

format(a['b'], (c.d()), name, x[1])
      ^                           ^
       [    ]  [     ]  [  ]  [  ]

@andersk
Copy link
Contributor Author

andersk commented May 2, 2020

Oh I see. Yes, that sounds helpful.

@andersk andersk force-pushed the fstrings branch 4 times, most recently from c1ccf9e to 7ab3bf6 Compare May 26, 2020 07:34
@andersk
Copy link
Contributor Author

andersk commented May 26, 2020

I rewrote this to use _parse_call_args on the token stream, like the other similar transforms, so there’s no longer a weird Python 3.8 requirement.

@hexagonrecursion
Copy link

I was going to take a stab at implementing the below conversions but that would probably conflict with this PR. Seeing how this PR had no activity for months I don't expect it to get merged or closed any time soon.

"{foo}".format(**locals())  # f"{foo}"
"{foo}".format(**globals())  # f"{foo}"

@asottile
Copy link
Owner

ah yeah I probably should have merged this before rewriting the way pyupgrade is organized :(

@andersk
Copy link
Contributor Author

andersk commented Feb 12, 2021

Rebased on v2.10.0.

@andersk
Copy link
Contributor Author

andersk commented Apr 6, 2021

Rebased for a simple conflict with #417. Still interested in merging this? There’s no time like the present. 🙂

Copy link
Owner

@asottile asottile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asottile asottile merged commit 4469c71 into asottile:master Apr 9, 2021
@andersk andersk deleted the fstrings branch October 21, 2021 04:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants