Improve the autocomplete feature of the builtin python IDE #8420

Closed
wants to merge 9 commits into from

dimven
Contributor

@dimven dimven commented Dec 30, 2017

Purpose

I've been having some instabilities when using the built-in IDE with large scripts and for extended periods of time, and I traced that back to the current autocomplete implementation. I've revised the implementation so that it raises far fewer exceptions than it did. I then tried to improve the type guessing as much as possible.

The type guessing should be relatively accurate for simple statements. We can never guarantee 100% coverage for complicated variable definitions, because we don't have the luxury of running in the same scope as the actual script or pre-executing the entire code of the script.

Here's a before capture; notice the constant exceptions being thrown in the console:
current autocomplete

And this is with the proposed changes:
new autocomplete

Declarations

Check these if you believe they are true

  • The code base is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning, and are documented in the API Changes document.

Reviewers

@mjkkirschner

FYIs

This entire class is pretty independent of the rest of Dynamo, so it should work fine in both current and future versions of Dynamo.

@mjkkirschner
Member

@dimven can you try to improve the diff here? It looks like there are a bunch of whitespace-only changes, or is that much of the code actually new?

@dimven
Contributor Author

dimven commented Jan 5, 2018

👍 Not sure why this happened and why git is detecting whitespace as a change. I must have "Ctrl+I"-ed the whole text at some point. I'll fix it over the weekend and resubmit.

@dimven
Contributor Author

dimven commented Jan 5, 2018

Now that I'm back from the holidays, I did some more testing and still see the slowdowns and instability whenever I use import statements of the form from XYZ import ABC. I think I'll need to reevaluate those as well. Hmm:
2018-01-05_11-47-56

I don't observe the same when I use from XYZ import *

seems like the 'commaDelimitedVariableNamesRegex' is too greedy; need to change from:
firefox_2018-01-05_15-27-50

to:
firefox_2018-01-05_15-28-11
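
The before and after patterns from the screenshots are not recoverable here, so the following is only a sketch of the kind of problem being described. It uses Python's `re` module rather than the C# `Regex` class the PR actually touches, and it takes the `commaDelimitedVariableNamesRegex` pattern as quoted later in the review. `STRICT` and `split_names` are hypothetical names, not the fix that was committed. The point is that the original pattern makes both the comma and the whitespace optional, so it happily swallows space-separated words, and the nested `+` quantifiers give the engine many ways to split the same run of characters.

```python
import re

# Pattern as quoted in this PR's review diff: comma and whitespace are both optional.
ORIGINAL = re.compile(r"(([0-9a-zA-Z_]+,?\s?)+)")

# Hypothetical stricter alternative (an assumption, not the committed fix):
# a further name is only accepted after an explicit comma.
STRICT = re.compile(r"[0-9a-zA-Z_]+(?:\s*,\s*[0-9a-zA-Z_]+)*")


def split_names(text):
    """Return the names in a comma-delimited list, or [] if the text is not one."""
    text = text.strip()
    if STRICT.fullmatch(text) is None:
        return []
    return [name.strip() for name in text.split(",")]


print(split_names("a, b, c"))            # a proper comma-delimited list
print(split_names("a b"))                # rejected by the strict pattern
print(bool(ORIGINAL.fullmatch("a b")))   # the original accepts it anyway
```

Requiring the separator removes the ambiguity about where one name ends and the next begins, which is usually what keeps a backtracking regex engine from doing unnecessary work on long inputs.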

cleaned up the weird whitespace diffs.
@Dewb
Contributor

Dewb commented Jan 9, 2018

Hi @dimven, thanks for tackling this. Any incremental improvements you can make to the autocompletion engine would be great! It also might be worth looking into replacing the entire internal engine with a complete Python autocompletion solution like Jedi -- it could in the end be less work than wrestling with regexes.

Both Atom and VS Code use Jedi for their Python autocompletion. Here are the cores of their implementations:
https://github.com/brennv/autocomplete-python-jedi/blob/master/lib/provider.coffee
https://github.com/DonJayamanne/pythonVSCode/blob/master/src/client/providers/completionSource.ts

Atom's provider API is in Coffeescript, and VS Code's is in Typescript, while Jedi is implemented in Python itself. Both of these extensions just launch a Python process to run Jedi and communicate with it from the editor's language. I don't see any reason why the same couldn't be done from C#.
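
As a rough illustration of the "launch a Python process and talk to it" approach, here is a minimal Python sketch. It does not use Jedi: the worker is a stand-in that only lists module attributes with `dir()`, and the JSON-lines message format, `WORKER_SOURCE` and `request_completions` are assumptions made up for this example. A host written in C# would follow the same pattern with its own process API.

```python
import json
import subprocess
import sys

# Worker script run in a separate Python process. It is a stand-in for a real
# completion engine such as Jedi: it only lists module attributes via dir().
WORKER_SOURCE = r'''
import importlib, json, sys

for line in sys.stdin:
    request = json.loads(line)
    module_name, _, partial = request["prefix"].rpartition(".")
    try:
        module = importlib.import_module(module_name)
        names = sorted(n for n in dir(module) if n.startswith(partial))
    except Exception:
        names = []  # unknown module: return no completions instead of crashing
    sys.stdout.write(json.dumps(names) + "\n")
    sys.stdout.flush()
'''


def request_completions(prefix):
    """Send one completion request to a fresh worker process and parse its reply."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps({"prefix": prefix}) + "\n",
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(request_completions("math.sq"))
```

A real integration would keep the worker alive between requests instead of starting a new process each time; the one-shot call here just keeps the sketch short.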

Of course, this strategy is not without its pitfalls. It will probably be easiest to run Jedi in normal Python, and then you could run into challenges due to autocomplete running in normal Python while Dynamo actually runs IronPython. (In that case, this might help: https://github.com/gtalarico/ironpython-stubs)

So perhaps this wouldn't necessarily be easier than just tweaking regexes, but I thought I'd mention it regardless!

Another worthwhile line of investigation might be adding the ability to select an external editor for Python nodes, so editing them can pop up a window in your favorite IDE, which presumably already has Python autocompletion (though the normal Python vs. IronPython issues might still apply.)

@jnealb
Collaborator

jnealb commented Jan 9, 2018

@Dewb @dimven @mjkkirschner I vote for the very last suggestion 👍

@radumg
Collaborator

radumg commented Jan 18, 2018

I also vote for the last suggestion; I had that discussion in another issue too: #6513 (comment)

@dimven
Contributor Author

dimven commented Jan 18, 2018

My view on this aligns with @andydandy74's: unless Dynamo is properly set up and distributed with a default external IDE out of the box (i.e. the way the #develop environment is used in Revit macros), it would be impractical to completely forgo the current approach. Otherwise we're just asking for more deployment hell.

Without a doubt, it would be great to have an easy to customize access to any IDE but that shouldn't compromise the experience of the majority of the users.

On a side note, why was #6513 closed? Last time I checked, nothing had been implemented for this in 2.0.

Add and keep track of clr library references. That fixes a lot of the import fails later on
- Docstrings are automatically excluded from the code analysis. This is a really big improvement for cases where we handle large multiline string blocks in our code.
- The autocomplete scope now picks up references to external assemblies and can support additional imports.
- The default location of the python builtin library is added to the module search path and now those will be picked up too
- Some of the regex syntax has been revised and instantiated as compiled to hopefully improve performance. 

TODO:
- I'd like to further rework the regex syntax for variable detection and the import statements
- Add support for basic variable unpacking (e.g. a, b, c = 1, 2, 3)
@dimven
Contributor Author

dimven commented Jan 24, 2018

  • Docstrings are automatically excluded from the code analysis. This is a really big improvement for cases where we handle large multiline string blocks in our code.
  • The autocomplete scope now picks up references to external assemblies and can support additional imports.
  • The default location of the python builtin library is added to the module search path and now those will be picked up too
  • Some of the regex syntax has been revised and instantiated as compiled to hopefully improve performance.

TODO:

  • I'd like to further rework the regex syntax for variable detection and the import statements
  • Add support for basic variable unpacking (e.g. a, b, c = 1, 2, 3)
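
To make the docstring point concrete, here is a small Python sketch of stripping triple-quoted blocks before any per-line analysis runs. It is not the PR's C# implementation; `TRIPLE_QUOTED` and `strip_triple_quoted` are hypothetical names, and the pattern is deliberately simple (for instance, it does not know about triple quotes that appear inside comments).

```python
import re

# Matches a block opened by three double or three single quotes and closed by
# the same delimiter; the lazy quantifier stops at the first closing delimiter.
TRIPLE_QUOTED = re.compile(r"(\"\"\"|''')[\s\S]*?\1")


def strip_triple_quoted(code):
    """Replace every triple-quoted block with an empty string literal.

    Keeping a literal in place means assignments such as x = <block> stay
    valid, while the potentially huge block body is never scanned again.
    """
    return TRIPLE_QUOTED.sub('""', code)


source = 'def f():\n    """Long\n    docstring."""\n    return 1\n'
print(strip_triple_quoted(source))
```

Compiling the pattern once at module level mirrors the "instantiated as compiled" item in the list above: the same compiled object is reused for every keystroke rather than rebuilt.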

- All completion tests are now outdated and need to be revised!
- All regex statements have been revised for speed and responsiveness
- import statement detection is much more robust and powerful and can have custom names
- similarly, we now have variable assignment unpacking & other tweaks
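
For readers wondering what "custom names" covers, the sketch below parses the common import forms, including `as` aliases, and returns nothing for an incomplete statement instead of raising. It is a Python illustration under assumptions, not the C# code in this PR: `parse_import`, `IMPORT_RE` and `FROM_RE` are hypothetical names, and multi-line or parenthesised imports are not handled.

```python
import re

IMPORT_RE = re.compile(r"^\s*import\s+(?P<items>.+)$")
FROM_RE = re.compile(r"^\s*from\s+(?P<module>[\w.]+)\s+import\s+(?P<items>.+)$")


def parse_import(line):
    """Return a list of (module, name, alias) tuples for one import line.

    module is None for plain "import x" statements. An empty list means the
    line is not an import or is still being typed (e.g. "import numpy as").
    """
    match = FROM_RE.match(line)
    if match:
        module = match.group("module")
    else:
        match = IMPORT_RE.match(line)
        if not match:
            return []
        module = None

    results = []
    for item in match.group("items").split(","):
        parts = item.split()
        if len(parts) == 1:
            name = alias = parts[0]
        elif len(parts) == 3 and parts[1] == "as":
            name, alias = parts[0], parts[2]
        else:
            return []  # incomplete or malformed item: skip the whole statement
        results.append((module, name, alias))
    return results


print(parse_import("from XYZ import ABC as D, E"))
print(parse_import("import numpy as"))
```

Returning an empty result for half-typed statements is one way to avoid feeding the engine code that is guaranteed to throw.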
@dimven
Contributor Author

dimven commented Feb 8, 2018

@mjkkirschner just pushed a big update and would love to hear what you think about it:

  • All completion tests are now outdated and need to be revised!
  • All regex statements have been revised for speed and responsiveness
  • Import statement detection is much more robust and powerful and can have all kinds of custom names
  • We now have variable assignment unpacking
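
A minimal Python sketch of what assignment unpacking means for type guessing follows. It is an illustration only, not the PR's C# logic: `unpack_assignment` and `guess_type` are hypothetical helpers, and the naive comma split is an assumption that breaks on values containing commas (tuples, calls, strings with commas).

```python
import ast


def unpack_assignment(line):
    """Map each name on the left of '=' to the expression text on the right.

    Returns {} when the line is not a simple unpacking assignment or the
    number of names and values differs.
    """
    left, sep, right = line.partition("=")
    if not sep:
        return {}
    names = [n.strip() for n in left.split(",")]
    values = [v.strip() for v in right.split(",")]
    if len(names) != len(values) or not all(names) or not all(values):
        return {}
    return dict(zip(names, values))


def guess_type(expression):
    """Return the type name of a literal expression, or None if it is not a literal."""
    try:
        return type(ast.literal_eval(expression)).__name__
    except (ValueError, SyntaxError):
        return None


assignments = unpack_assignment("a, b, c = 1, 2.0, 'x'")
print({name: guess_type(expr) for name, expr in assignments.items()})
```

Non-literal right-hand sides (such as another variable) come back as `None` here; resolving those would need the kind of scope tracking the thread describes.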

2018-02-08_21-42-29

The autocomplete now feels snappier and more responsive with larger scripts too:

2018-02-08_21-55-43

variables referencing other variables are recognized correctly now
@Racel
Contributor

Racel commented Feb 21, 2018

@dimven - This is awesome. We will review and get back to you. But just so you know, we are currently heads down on Dynamo 2.0, and this may have to wait for a 2.1 release. Thanks for your patience. We will get back to you soon.

@dimven
Contributor Author

dimven commented Feb 22, 2018

@Racel no hurry on my end :) . I'm pretty happy with how the new code behaves on my end and the feedback I got from others is positive, so I don't think I'll modify it any further for the time being.

I also got around to implementing a very rudimentary overall autocomplete functionality:

2018-02-20_17-22-42

2018-02-20_17-40-28

but I'll first have to make sure it works well with the changes in 2.0 and will start a separate PR for it.

@phliberato
Contributor

@Racel do you have any forecast for when Dynamo 2.0 will be released?

@Dewb
Contributor

Dewb commented Jun 18, 2018

Thanks @dimven! I've skimmed your changes and they look pretty good. If the current implementation is causing crashes like #8931 I think that raises the urgency on this. I'll talk to folks and see what we need to do to move this forward.

@andydandy74
Contributor

Would be great to have this in the next stable release...

- avoids another exception thrown at incomplete import statements
- adds additional functionality for enum types
- minor fixes & refactoring
@mjkkirschner
Member

mjkkirschner commented Sep 26, 2018

I will find time to look at this more closely as soon as I can. I'm sorry it has been open so long without a solid review.

One thing I see immediately is the altering of public methods, which we can't do without evaluating the risk, as it will break binary compatibility. We can keep the old public methods, mark them obsolete, and create new ones, though. Where possible we should also be more conservative with access modifiers.

Another initial thought is on the loading of the standard library from the install directory. I think we were considering moving Dynamo's Python install into a Dynamo subfolder to isolate it when updating to 2.7.8. Not sure that needs to be resolved in this PR though.

johnpierson added the priority (Related to a release.) and 2.x (Issues related to 2.x versions of Dynamo.) labels Sep 26, 2018
/// <summary>
/// Maps a basic variable regex to a basic python type.
/// </summary>
public List<Tuple<Regex, Type>> BasicVariableTypes;
Member

does this actually need to be public - can it be internal?

/// <summary>
/// Tracks already referenced CLR modules
/// </summary>
public HashSet<string> clrModules { get; set; }
Member

internal?

/// <summary>
/// Keeps track of failed statements to avoid poluting the log
/// </summary>
public Dictionary<string, int> badStatements { get; set; }
Member

internal?

public static string variableName = @"([0-9a-zA-Z_]+(\.[a-zA-Z_0-9]+)*)";
public static string doubleQuoteStringRegex = "(\"[^\"]*\")";
public static string singleQuoteStringRegex = "(\'[^\']*\')";
public static string commaDelimitedVariableNamesRegex = @"(([0-9a-zA-Z_]+,?\s?)+)";
Member

mjkkirschner Sep 27, 2018

Same question for some of these members. I know that the previous ones were all public, but unless this is actually useful to some external user, making these public makes refactoring the code between major releases very painful.

public static string singleQuoteStringRegex = "(\'[^\']*\')";
public static string commaDelimitedVariableNamesRegex = @"(([0-9a-zA-Z_]+,?\s?)+)";
public static string variableName = @"([0-9a-zA-Z_]+(\.[a-zA-Z_0-9]+)*)";
public static string quotesStringRegex = "[\"']([^\"']*)[\"']";
Member

I think it would be very useful to add comments/summaries to some of these - especially if they are public to describe anything that is not obvious from the field names.


#endregion

private static readonly Regex MATCH_LAST_NAMESPACE = new Regex(@"[\w.]+$", RegexOptions.Compiled);
Member

I don't have a problem with the readonly fields being uppercase - but curious - is this from a specific style guide?

}
catch
{
Log("Failed to load Revit types for autocomplete. Python autocomplete will not see Autodesk namespace types.");
Log("Failed to load Revit types for autocomplete. Python autocomplete will not see Autodesk namespace types.");
Member

can we get more specific with this message?

}
}

- if (assemblies.Any(x => x.FullName.Contains("ProtoGeometry")))
+ if (assemblies.Any(x => x.GetName().Name == "ProtoGeometry"))
Member

why this change? speed?

}
}


string pythonLibDir = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFilesX86),
Member

I'm just thinking this might change soon - but that should not hold up this pr.

try
{
var pyLibImports = String.Format("import sys\nsys.path.append(r'{0}')\n", pythonLibDir);
engine.CreateScriptSourceFromString(pyLibImports, SourceCodeKind.Statements).Execute(scope);
Member

I believe your previous work made the performance impact of this change minimal - is that true?

/// <returns>Return a list of IronPythonCompletionData </returns>
- public ICompletionData[] GetCompletionData(string line)
+ public ICompletionData[] GetCompletionData(string code, bool expand=false)
Member

we can't change this method's signature at this time - even adding an optional parameter will break binary compatibility:
https://stackoverflow.com/questions/1456785/a-definitive-guide-to-api-breaking-changes-in-net

It's very easy to miss cases here. :(

}

/// <summary>
/// Generates completion data for the specified text, while import the given types into the
/// Generates completion data for the specified text, while import the given types into the
Member

this comment does not seem grammatically correct.

/// Find all import statements and import into scope. If the type is already in the scope, this will be skipped.
/// The ImportedTypes dictionary is
/// Find all import statements and import into scope. If the type is already in the scope, this will be skipped.
/// The ImportedTypes dictionary is
Member

comment just ends :(

{
if (scope.ContainsVariable(import.Key))
int previousTries = 0;
Member

var

if (scope.ContainsVariable(import.Key))
int previousTries = 0;
badStatements.TryGetValue(statement, out previousTries);
if (previousTries > 3)
Member

what is the significance of the 3? Can it be pulled out to a constant?
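
One way to read the request is sketched below in Python (the PR itself is C#). The cap on retries gets a name, and the bookkeeping lives in one small class. `MAX_STATEMENT_TRIES` and `StatementTracker` are hypothetical names chosen for this example; the `>` comparison mirrors the `previousTries > 3` check in the diff.

```python
# Hypothetical named constant replacing the bare 3 from the diff.
MAX_STATEMENT_TRIES = 3


class StatementTracker:
    """Counts failures per statement so repeat offenders can be skipped quietly."""

    def __init__(self):
        self.bad_statements = {}

    def should_skip(self, statement):
        # Same comparison as the diff: skip once the count exceeds the cap.
        return self.bad_statements.get(statement, 0) > MAX_STATEMENT_TRIES

    def record_failure(self, statement):
        self.bad_statements[statement] = self.bad_statements.get(statement, 0) + 1


tracker = StatementTracker()
for _ in range(4):
    tracker.record_failure("import missing_module")
print(tracker.should_skip("import missing_module"))
```

With the limit named, the reasoning behind the number can sit in a comment next to the constant rather than at each call site.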

string libName = MATCH_FIRST_QUOTED_NAME.Match(statement).Groups[1].Value;
if (!clrModules.Contains(libName))
{
if (statement.Contains("AddReferenceToFileAndPath"))
Member

please pull this string out as a constant

continue;
}

if(AppDomain.CurrentDomain.GetAssemblies().Any(x => x.GetName().Name == libName))
Member

hmm, maybe it is worth making a map of assemblies and checking if the name exists there instead of this loop - it appears lookup might happen a lot.
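
The suggestion could look roughly like the Python sketch below, which stands in for the C# code and uses plain strings in place of loaded assemblies. `AssemblyIndex` and the `list_names` callable are assumptions for illustration. Because assemblies can be loaded after the index is built, this version rebuilds the set once when a lookup misses.

```python
class AssemblyIndex:
    """Caches simple assembly names in a set for constant-time membership checks."""

    def __init__(self, list_names):
        # list_names is a callable returning the currently loaded names;
        # in the real code this would enumerate the loaded assemblies.
        self._list_names = list_names
        self._names = set(list_names())

    def contains(self, name):
        if name in self._names:
            return True
        # Something may have been loaded since the last snapshot: refresh once.
        self._names = set(self._list_names())
        return name in self._names


loaded = ["ProtoGeometry"]
index = AssemblyIndex(lambda: list(loaded))
print(index.contains("ProtoGeometry"))
```

Hits never enumerate the full list again; only misses pay for a refresh, so the saving depends on how often the looked-up names are already loaded.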

}

var importStatements = FindAllImportStatements(code);
foreach (var i in importStatements)
Member

mjkkirschner Sep 27, 2018

can you add a comment about the general output/side effects of this for loop.

/// </summary>
/// <param name="text"></param>
/// <returns></returns>
- string GetName(string text)
+ string GetLastName(string text)
Member

please add private access modifier

@Racel
Contributor

Racel commented Oct 18, 2018

@dimven We think this looks pretty close and would like to get this into the next release. We would like to take this PR over, are you ok with that?

@dimven
Contributor Author

dimven commented Oct 22, 2018

Hi @Racel & @mjkkirschner

Sorry, I've been really busy the last few weeks and haven't had a chance to look at this again. I would absolutely love it if you could take over and wrap it up for 2.1 :)

@alfarok
Contributor

alfarok commented Jan 10, 2019

Picking this work back up in #9402 as per the discussion above. I am going to close this PR but feel free to continue any conversations/concerns in the new PR.

alfarok closed this Jan 10, 2019