Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: end_line and end_column for api.classes.BaseName #1576

Closed
pappasam opened this issue May 10, 2020 · 16 comments · Fixed by #1584
Closed

Feature: end_line and end_column for api.classes.BaseName #1576

pappasam opened this issue May 10, 2020 · 16 comments · Fixed by #1584

Comments

@pappasam
Copy link
Contributor

pappasam commented May 10, 2020

Feature

When receiving a Name, there are use-cases where it's helpful to know the end_line and end_column associated with a symbol.

Name already supports line and column, so it doesn't seem like too much of a stretch to add these two new attributes.

Use case

In jedi-language-server, these values would be useful to support symbol range selection (used to create text objects based on symbols found in the document, including Functions, Classes, whatever LSP defines).

I currently support this by accessing a hidden variable, and the result isn't perfect. The end_column is normally 0 and does not cover the full range of characters (eg, it'll omit the return statement of a function). I therefore hack it by adding a huge number in my return value. It'd be nice if Name's public API supported this so it doesn't need to be a hack.

def _document_symbol_range(name: Name) -> Range:
    """Get accurate full range of function

    Thanks https://github.com/CXuesong from
    https://github.com/palantir/python-language-server/pull/537/files for the
    inspiration!

    Note: I add tons of extra space to make dictionary completions work. Jedi
    cuts off the end sometimes before the final function statement. This may be
    the cause of bugs at some point.
    """
    _name = (
        name._name.tree_name.get_definition()  # pylint: disable=protected-access
    )
    (start_line, start_column) = _name.start_pos
    (end_line, end_column) = _name.end_pos
    return Range(
        start=Position(start_line - 1, start_column),
        end=Position(end_line - 1, end_column + 20000),
    )

See here for this function in context: https://github.com/pappasam/jedi-language-server/blob/v0.13.1/jedi_language_server/jedi_utils.py#L87

@davidhalter
Copy link
Owner

I currently support this by accessing a hidden variable, and the result isn't perfect. The end_column is normally 0 and does not cover the full range of characters (eg, it'll omit the return statement of a function). I therefore hack it by adding a huge number in my return value. It'd be nice if Name's public API supported this so it doesn't need to be a hack.

Can you give me an example?


In general, how is your approach holding up? Is it good enough?

It's also unclear to me where you want the end position to be. Is it at the end of the last symbol of a function? At the end of the last comment? At the start of the next symbol?

@pappasam
Copy link
Contributor Author

@davidhalter I just did a proof of concept and it seems like coc is the main culprit of the weird ranges, not Jedi for the most part

The only edge case / weird behavior I've found with Jedi itself is its handling of functions at the end of a file. Based on my approach, the end of a function is the 0th column on the line after the function definition. That said, if the function is the last line in a file, the last character in the function is considered the end.

See the positional result of goodbye in my poc below:

from jedi import Script
from jedi.api.classes import Name
SCRIPT = """
x = 1
def hello():

    return "world"
y = 2
def goodbye():
    return "world"
""".strip()
JEDI_SCRIPT = Script(code=SCRIPT)

def _document_symbol_range(name: Name):
    _name = (
        name._name.tree_name.get_definition()  # pylint: disable=protected-access
    )
    (start_line, start_column) = _name.start_pos
    (end_line, end_column) = _name.end_pos

    print(name.full_name, "start:", _name.start_pos, "end:", _name.end_pos)

names = JEDI_SCRIPT.get_names()
for name in names:
    _document_symbol_range(name)

Result:

__main__.x start: (1, 0) end: (1, 5)
__main__.hello start: (2, 0) end: (5, 0)
__main__.y start: (5, 0) end: (5, 5)
__main__.goodbye start: (6, 0) end: (7, 18)

For the most part, my approach is working fine for the most part. The public API update could just be a wrapper around these private calls / maybe handling for the edge case described above.

@davidhalter
Copy link
Owner

The only edge case / weird behavior I've found with Jedi itself is its handling of functions at the end of a file.

Why is this weird? IMO this is the only valid end position, the next line does not even exist in Jedi in the source code.

@pappasam
Copy link
Contributor Author

pappasam commented May 11, 2020

@davidhalter It's only weird because of differs in behavior from __main__.hello, which "ends" and the same place that __main__.y begins (on the next line).

So, either __main__.goodbye is weird, __main__.hello is weird, or we accept this questionably necessary edge case because it's already implemented.

Note: I'm personally fine keeping things the way they are at this point, just wanted to point out the inconsistency in case it matters to you.

@pappasam
Copy link
Contributor Author

Actually, on further consideration, for a public end_line and end_column, I think it would make more sense to stick to __main__.goodbye's behavior. The last line character/line in a function definition would ideally be the in the function body itself to prevent Names from overlapping. Thoughts?

@davidhalter
Copy link
Owner

I agree.

How about something like:

definition = name._name.tree_name.get_definition()
if definition is None:
    return name.end_pos
if self.type in ('function', 'class'):
    last_leaf = definition.get_last_leaf()
    if last_leaf.type == 'newline':
        return last_leaf.get_previous_leaf().end_pos
    return last_leaf.end_pos
return definition.end_pos

@pappasam
Copy link
Contributor Author

That works amazingly! Interestingly, I've found that the start_pos and end_pos are also useful in my case because name.line and name.column return the position of the function / class name, not the the beginning of the function/class definition. So, if you think explosing a public interface for this would be a good idea, it might be nice to explose both start_pos and end_pos through a separate public interface.

For reference, i've appropriated your above example into jedi-language-server and it's the first perfectly working code I have involving symbol ranges: https://github.com/pappasam/jedi-language-server/blob/v0.13.2/jedi_language_server/jedi_utils.py#L87 Many thanks for your help!

@davidhalter
Copy link
Owner

Alright. I'm still thinking about the best function names for this:

  • get_definition_end_pos
  • get_scope_end_pos
  • get_context_end_pos
  • get_definition_end_position

Do you have any other ideas?

@pappasam
Copy link
Contributor Author

pappasam commented May 14, 2020

Ugh, names are the hardest part!

How about: get_definition_start_pos and get_definition_end_pos, both returning Tuple[int, int]?

This makes sense to me because the variables provide the start and end positions over which a Name is defined.

Alternatively, both can be combined into one get_definition_range, returning Tuple[Tuple[int, int], Tuple[int, int]]. This might be preferable to avoid multiple calls to ...tree_name.get_definition. Then again, the return value is atrocious, so there's that...

@davidhalter
Copy link
Owner

I would probably prefer get_definition_start_position, because it's clearer. However I agree with everything else.

Can you create a PR for that? I'm happy to merge it with a few simple tests. And if you are too lazy to write tests that's fine, too. Then I will write them.

Ugh, names are the hardest part!

True, haha always very annoying :)

@pappasam
Copy link
Contributor Author

Great, I’ll try have something by Sunday evening, hopefully sooner

@pappasam
Copy link
Contributor Author

pappasam commented May 16, 2020

Unfortunately, I'm having trouble running the tests locally, so I'll just push the code changes and leave the tests to you (unless you have any recommendations for me based on the debugging info below).

Note: I've changed the api to definition_start_position and definition_end_position since I've made them properties.

E           jedi.api.environment.InvalidPythonEnvironment: Could not get version information for '~/.asdf/shims/python3.7': InternalError("The subprocess /home/sroeca/.asdf/shims/python3.7 has crashed (EOFError('Ran out of input'), stderr=).")

I manage Python versions with asdf, which uses pyenv under the hood. Things work fine during runtime, I just can't execute the tests. Maybe related to this, I'm not sure: #1540

I've tried this on Python 3.7 and 3.8:

platform linux -- Python 3.7.7, pytest-5.4.2, py-1.8.1, pluggy-0.13.1
rootdir: /home/sroeca/src/Personal/forks/jedi, inifile: pytest.ini, testpaths: jedi, test
collected 3589 items

jedi/__init__.py .                                                                                [  0%]
jedi/api/__init__.py .                                                                            [  0%]
jedi/api/classes.py .....                                                                         [  0%]
jedi/api/replstartup.py .                                                                         [  0%]
jedi/inference/context.py .                                                                       [  0%]
jedi/inference/docstrings.py ..                                                                   [  0%]
jedi/inference/sys_path.py .                                                                      [  0%]
test/conftest.py .                                                                                [  0%]
test/test_cache.py EE                                                                             [  0%]
test/test_compatibility.py .                                                                      [  0%]
test/test_debug.py .                                                                              [  0%]
test/test_deprecation.py EEEEEE                                                                   [  0%]
test/test_file_io.py ..                                                                           [  0%]
test/test_integration.py EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE^C

@davidhalter
Copy link
Owner

Can you show me one stack trace? Otherwise pytest -I might do the trick.

@pappasam
Copy link
Contributor Author

@davidhalter pytest -I does fix it! Why?

One stack trace without -I:

____________________________ ERROR at setup of test_completion[docstring:26] _____________________________

self = <CompiledSubprocess _executable='/home/sroeca/.asdf/shims/python3.7', _pickle_protocol=2, is_crashed=True, pid=15012>
inference_state_id = None, function = <function _get_info at 0x7f09a93395f0>, args = (), kwargs = {}

    def _send(self, inference_state_id, function, args=(), kwargs={}):
        if self.is_crashed:
            raise InternalError("The subprocess %s has crashed." % self._executable)

        if not is_py3:
            # Python 2 compatibility
            kwargs = {force_unicode(key): value for key, value in kwargs.items()}

        data = inference_state_id, function, args, kwargs
        try:
            pickle_dump(data, self._get_process().stdin, self._pickle_protocol)
        except (socket.error, IOError) as e:
            # Once Python2 will be removed we can just use `BrokenPipeError`.
            # Also, somehow in windows it returns EINVAL instead of EPIPE if
            # the subprocess dies.
            if e.errno not in (errno.EPIPE, errno.EINVAL):
                # Not a broken pipe
                raise
            self._kill()
            raise InternalError("The subprocess %s was killed. Maybe out of memory?"
                                % self._executable)

        try:
>           is_exception, traceback, result = pickle_load(self._get_process().stdout)

jedi/inference/compiled/subprocess/__init__.py:261:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

file = <_io.BufferedReader name=15>

    def pickle_load(file):
        try:
            if is_py3:
>               return pickle.load(file, encoding='bytes')
E               EOFError: Ran out of input

jedi/_compatibility.py:396: EOFError

During handling of the above exception, another exception occurred:

self = <[AttributeError("'Environment' object has no attribute 'version_info'") raised in repr()] Environment object at 0x7f09a58c1390>

    def _get_subprocess(self):
        if self._subprocess is not None and not self._subprocess.is_crashed:
            return self._subprocess

        try:
            self._subprocess = CompiledSubprocess(self._start_executable)
>           info = self._subprocess._send(None, _get_info)

jedi/api/environment.py:75:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <CompiledSubprocess _executable='/home/sroeca/.asdf/shims/python3.7', _pickle_protocol=2, is_crashed=True, pid=15012>
inference_state_id = None, function = <function _get_info at 0x7f09a93395f0>, args = (), kwargs = {}

    def _send(self, inference_state_id, function, args=(), kwargs={}):
        if self.is_crashed:
            raise InternalError("The subprocess %s has crashed." % self._executable)

        if not is_py3:
            # Python 2 compatibility
            kwargs = {force_unicode(key): value for key, value in kwargs.items()}

        data = inference_state_id, function, args, kwargs
        try:
            pickle_dump(data, self._get_process().stdin, self._pickle_protocol)
        except (socket.error, IOError) as e:
            # Once Python2 will be removed we can just use `BrokenPipeError`.
            # Also, somehow in windows it returns EINVAL instead of EPIPE if
            # the subprocess dies.
            if e.errno not in (errno.EPIPE, errno.EINVAL):
                # Not a broken pipe
                raise
            self._kill()
            raise InternalError("The subprocess %s was killed. Maybe out of memory?"
                                % self._executable)

        try:
            is_exception, traceback, result = pickle_load(self._get_process().stdout)
        except EOFError as eof_error:
            try:
                stderr = self._get_process().stderr.read().decode('utf-8', 'replace')
            except Exception as exc:
                stderr = '<empty/not available (%r)>' % exc
            self._kill()
            _add_stderr_to_debug(self._stderr_queue)
            raise InternalError(
                "The subprocess %s has crashed (%r, stderr=%s)." % (
                    self._executable,
                    eof_error,
>                   stderr,
                ))
E           jedi.api.exceptions.InternalError: The subprocess /home/sroeca/.asdf/shims/python3.7 has crashed (EOFError('Ran out of input'), stderr=).

jedi/inference/compiled/subprocess/__init__.py:273: InternalError

During handling of the above exception, another exception occurred:

request = <SubRequest 'environment' for <Function test_cache_get_signatures>>

    @pytest.fixture(scope='session')
    def environment(request):
        version = request.config.option.env
        if version is None:
            version = os.environ.get('JEDI_TEST_ENVIRONMENT', str(py_version))

        if request.config.option.interpreter_env or version == 'interpreter':
            return InterpreterEnvironment()

>       return get_system_environment(version[0] + '.' + version[1:])

conftest.py:105:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
jedi/api/environment.py:355: in get_system_environment
    return Environment(exe)
jedi/api/environment.py:67: in __init__
    self._get_subprocess()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <[AttributeError("'Environment' object has no attribute 'version_info'") raised in repr()] Environment object at 0x7f09a58c1390>

    def _get_subprocess(self):
        if self._subprocess is not None and not self._subprocess.is_crashed:
            return self._subprocess

        try:
            self._subprocess = CompiledSubprocess(self._start_executable)
            info = self._subprocess._send(None, _get_info)
        except Exception as exc:
            raise InvalidPythonEnvironment(
                "Could not get version information for %r: %r" % (
                    self._start_executable,
>                   exc))
E           jedi.api.environment.InvalidPythonEnvironment: Could not get version information for '/home/sroeca/.asdf/shims/python3.7': InternalError("The subprocess /home/sroeca/.asdf/shims/python3.7 has crashed (EOFError('Ran out of input'), stderr=).")

jedi/api/environment.py:80: InvalidPythonEnvironment

@davidhalter
Copy link
Owner

I'm not sure what the issue is with your environment.

pytest -I does fix it! Why?

-I loads the interpreter environment. This means basically that all Python commands will not be made against a subprocess, but against the currently running process.

Are you going to write the tests or should I?

@pappasam
Copy link
Contributor Author

I’ll take a stab at it when I find a moment in the next couple days. Thanks for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants