Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-94694: Fix column offsets for multi-line method lookups #94697

Merged
merged 6 commits into from
Jul 10, 2022

Conversation

brandtbucher
Copy link
Member

@brandtbucher brandtbucher commented Jul 8, 2022

If we're given an AST that doesn't make much sense here, just wipe out the column info instead of crashing later.

Also, don't try to include the dot in the column span.

@brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) needs backport to 3.11 only security fixes labels Jul 8, 2022
@brandtbucher brandtbucher requested a review from pablogsal July 8, 2022 23:51
@brandtbucher brandtbucher self-assigned this Jul 8, 2022
@brandtbucher brandtbucher added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jul 9, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @brandtbucher for commit 33dc4d8 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jul 9, 2022
@brandtbucher
Copy link
Member Author

wasm32 buildbot failures are unrelated: #94549 (comment)

@@ -1181,6 +1181,27 @@ def test_complex_single_line_expression(self):
self.assertOpcodeSourcePositionIs(compiled_code, 'BINARY_OP',
line=1, end_line=1, column=0, end_column=27, occurrence=4)

def test_multiline_assert_rewritten_as_method_call(self):
# GH-94694: Copying location information from a "real" node to a
# handwritten one should always be valid!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it really always be valid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove the comment if it's too controversial. It just seems to me that the utility/safety of ast.copy_location and ast.fix_missing_locations is dramatically reduced if they can't at least promise crash-free behavior.

Python/compile.c Outdated Show resolved Hide resolved
Python/compile.c Outdated Show resolved Hide resolved
@pablogsal
Copy link
Member

pablogsal commented Jul 9, 2022

Is a bit unfortunate that we cannot just propagate correctly the offsets from the AST nodes. The disadvantage of a method relying on inspecting the source (here through the method name) is that ASTs created from abstract sources (or no source at all) may fall through the cracks in these kind of checks.

What's preventing us from correctly grab the original column offsets from the multi line method call? (I know this has more to be with the original issue, but I would like to fully understand this before adding more band aids :) )

@iritkatriel
Copy link
Member

The issue here is that we are flattening a multi line location into a single line. So we are trying to construct the location info for the attribute name only (which is not available directly in the AST, it seems).

@pablogsal
Copy link
Member

The issue here is that we are flattening a multi line location into a single line. So we are trying to construct the location info for the attribute name only (which is not available directly in the AST, it seems).

For our own nodes, I can fix that in the parser so our nodes make those available. For random AST nodes it may still be a problem.

@iritkatriel
Copy link
Member

The issue here is that we are flattening a multi line location into a single line. So we are trying to construct the location info for the attribute name only (which is not available directly in the AST, it seems).

For our own nodes, I can fix that in the parser so our nodes make those available. For random AST nodes it may still be a problem.

I think we almost have it. The location of meth->v.Attribute seems to be the dot plus the attr name. Is that right? Can you exclude the dot?

@iritkatriel
Copy link
Member

For random AST nodes it may still be a problem.

The pytest nodes are not completely random, if we copy to them the location of the attr name from the source code the traceback will probably look right - the source line from the file will appear with the right thing hilighted.

@brandtbucher
Copy link
Member Author

What's preventing us from correctly grab the original column offsets from the multi line method call?

The root of the issue is that attr is stored as an identifier (string) in the AST. If it was a full Name node, this whole hack could just be replaced with a single SET_LOC.

I think we almost have it. The location of meth->v.Attribute seems to be the dot plus the attr name. Is that right? Can you exclude the dot?

Unfortunately, Attribute nodes include the entire span of the attribute access, including everything to the left of the dot.

This also creates a weird (but probably fine) inconsistency. If the attribute access is all on one line, the whole thing is underlined in the traceback. If it spans multiple lines, only the attribute name is underlined.

All this just comes from our need to trace the line that the attribute name is on, and trying (after the fact) to figure out what columns that should include. :(

@pablogsal
Copy link
Member

The root of the issue is that attr is stored as an identifier (string) in the AST. If it was a full Name node, this whole hack could just be replaced with a single SET_LOC.

This is exactly what I'm referring to. We could modify the AST to add this information so we don't need to guess. This way, it should also have work with arbitrary AST nodes.

The price is that although this is something we can do, it will force everyone to adapt.

@brandtbucher
Copy link
Member Author

Yeah. I think it would be an acceptable change for 3.12. While some consumers of the AST would need to change their code slightly, it wouldn't be removing any functionality that's currently present. If anything, it's adding functionality.

@brandtbucher brandtbucher merged commit 264b3dd into python:main Jul 10, 2022
@miss-islington
Copy link
Contributor

Thanks @brandtbucher for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Sorry, @brandtbucher, I could not cleanly backport this to 3.11 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker 264b3ddfd561d97204ffb30be6a7d1fb0555e560 3.11

@brandtbucher
Copy link
Member Author

I think this one's gonna need a manual backport.

brandtbucher added a commit to brandtbucher/cpython that referenced this pull request Jul 10, 2022
…ps (pythonGH-94697).

(cherry picked from commit 264b3dd)

Co-authored-by: Brandt Bucher <[email protected]>
@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Jul 10, 2022
@bedevere-bot
Copy link

GH-94721 is a backport of this pull request to the 3.11 branch.

@brandtbucher brandtbucher deleted the fix-method-cols branch July 21, 2022 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants