Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

features to align assignment operators and dictionary colons #1020

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

lizawang
Copy link

@lizawang lizawang commented Aug 18, 2022

Hello all yapf founders!

I have some new features I want to share with the yapf community.
This is also a response to issue #194 and issue #346.

All the alignment features added are referred to the yapf function _AlignTrailingComments(final_lines).

The knobs

The main functions for all the knobs are added in reformatter.py.

knob: ALIGN_ASSIGNMENT

This is an option to align the assignment and augmented assignment operators of different variable assignment lines in the same block.

  • For the case when there are two assignment operators on one line, only the first assignment operator is aligned:
    timeend   = record[ "timeend" ] = ts_end
    self      = "xyz"
    tablename = "treditab"
    
  • New block starts with new alignment. There are 5 cases that are featured to indicate a new block:
    • If there is any blank line in between:

      a   = 1
      abc = 2
      
      b  = 3
      bc = 4
      
    • If there is any comment line in between:

      a   = 1
      abc = 2
      # comment
      b  = 3
      bc = 4
      
    • If there is any if/ while/ for line or function definition def line in between:

      a   = 1
      abc = 2
      if condition == None:
          var = ''
      b  = 3
      bc = 4
      
    • If there is any object(e.g. a dictionary, a list, a argument list, a set, a function call) that with its items on newlines after '=' in between:

      tablename = "treditab"
      list      = [
          {
              "field"    : "ediid",
              "type"     : "text",
              "required" : True,
              "html"     : {},
          }
      ]
      test = "another block"
      
    • If there is any variable assignment lines with different indentation in between:

      a   = 1
      abc = 2
      if condition == None:
          var       += ''
          var_long  -= 4
      b  = 3
      bc = 4
      

knob: ALIGN_ARGUMENT_ASSIGNMENT

This is an option to align the assignment operators of each block inside the argument list of a function call or a function definition.

  • The condition : it is only effective if all arguments inside the argument list are placed on separate lines.

  • The case when the typed argument name is too long that it has newlines, we can increase the column limit, otherwise it will be aligned with the last line of the unwrapped typed argument name that has the assignment operator:

    def __init__(
        self,
        title: Optional[ str ],
        diffs: Collection[ BinaryDiff ]       = (),
        charset: Union[ Type[ AsciiCharset ],
                        Type[ LineCharset ] ] = AsciiCharset,
        preprocess: Callable[ [ str ], str ]  = identity,
        # TODO(somebody): Make this a Literal type.
        justify: str = 'rjust' ):
      pass
    
  • New block of arguments starts with new alignment. There are 3 cases featured as an indication for new alignment block:

    • Each nested argument list inside with different indentation depth is a new block:
      nested_arglist = test(
          a = fun(
              var_first  = 0,
              var_second = '',
              var_third  = 1,
              var_fourth = None,
              var_fifth  = None ),
          abc = 1,
          d   = 2 )
      
    • A comment line inside a argument list of same level creates a new block:
      def get_dataframes_from_excel(
          self,
          *,
          info_sheet  = 0,
          skip_sheets = None,
          # comment
          df_sheet_names = [],
          rstrip_str     = True,
          **read_excel_kwargs ):
        pass
      
    • An object(a list/a dictionary/an argument list/ a set/a function call, etc.) with its items in newlines in between creates a new block inside the argument list of the same level:
      arglist = test(
          var_first  = 0,
          var_second = '',
          var_dict   = {
              "key_1" : '',
              "key_2" : 2,
              "key_3" : True,
          },
          var_third     = 1,
          var_very_long = None )
      
      
      arglist = test(
          var_first  = 0,
          var_second = '',
          get_dataframe_from_excel_sheet(
              filepath, usecols = [ 0 ], header = None, dtype = str ),
          var_third     = 1,
          var_very_long = None )
      

knob: ALIGN_DICT_COLON

This is an option to align colons inside dictionary.

  • The condition : it is only effective if all key-value pairs inside the dictionary are placed on separate lines, e.g. when 'EACH_DICT_ENTRY_ON_SEPARATE_LINE' is set True.
  • New block of key-value pairs starts with new alignment. There are 3 cases featured as an indication for new alignment block:
    • Each nested dictionary is a new block:
      fields = [
          {
              "field"    : "ediid",
              "type"     : "text",
              "required" : True,
              "html"     :
                  {
                      "attr"   : 'style="width: 250px;" maxlength="30"',
                      "page"   : 0,
                      "span"   : 8,
                      "column" : 0,
                  },
          }
      ]
      
    • A comment line inside a dictionary of same level creates a new block:
      fields =  
          {
              "field" : "ediid",
              "type"  : "text",
              # key: value
              "required" : True,
          }
      
    • An object(a list/a dictionary/a set/a argument list) with its items in newlines in between creates a new block inside the dictionary of the same level:
      fields = {
          "field"      : "ediid",
          "long_value" :
              get_dataframe_from_excel_sheet(
                  filepath, usecols = [ 0 ], header = None, dtype = str ),
          "type"     : "text",
          "required" : True,
      }
      

knob: NEW_ALIGNMENT_AFTER_COMMENTLINE

This option is made under the consideration that some programmers may prefer to remain the alignment with comment lines inside.

  • If it is set False, a comment line in between won't create a new alignment block. Entries before and after the comment line/lines will align with their assignment operators or colons.

Preparations

To make the alignment functions possible, some other changes are made to other files.

  • subtype_assigner.py: a change is made to include the subtype TYPED_NAME_ARG_LIST to all tokens inside typed argument name. This is to recognize all tokens before '=' of very long typed name in arguments with typed names, e.g., a typed argument as below:

    charset: Union[ Type[ AsciiCharset ], Type[ LineCharset ] ] = AsciiCharset,

  • format_token.py: necessary boolean functions are added, is_assign, is_augassign, is_dict_key, is_dict_key_start, is_argassign, is_argname, is_argname_start.

Finally

In the end, the knobs are added into the style.py and tests for each one are also updated in yapftests folder.

  • style.py: all the 4 knobs are set to be False by default.
  • Unitests are added into format_token_test.py, subtype_assigner_test.py and reformatter_basic_test.py.



@ufukty
Copy link

ufukty commented Sep 14, 2022

I don't understand why this doesn't get more attention. Undoubtedly one of the top missing features of python/yapf when compared with Go fmt. Hopefully shipped soon.

@mattrussell2
Copy link

Fantastic. Thanks @lizawang! Have been hunting everywhere for this. Until the merge, am working with your branch directly.

@bwendling
Copy link
Member

@lizawang Please run YAPF over your changes.

This is looking good. However, this change is massive and very hard to review. Could you split it up into one PR per knob?

@lizawang
Copy link
Author

lizawang commented Sep 24, 2022

@gwelymernans Hello, by run Yapf over my changes, do you mean to format yapf codes with the new alignment changes?

For the reviewing, the three alignment features are implemented separately in the module (so they can be reviewed separately by setting the other two knobs false), for each one you only need to review the following three main functions respectively in reformatter.py: _AlignAssignment(final_lines), _AlignArgAssign(final_lines), and _AlignDictColon(final_lines). The auxiliary functions are boolean functions in token_format.py. And for the knob NEW_ALIGNMENT_AFTER_COMMENTLINE, it's implemented in all three alignment functions, so when you review the alignment functions you will see it in each of them.

The rest changes are just follow-up updates, e.g. adding the knobs to style.py and writing unit tests to the according *_test.py.


I hope this will help to start the review. And I will split the the three alignment knobs into three branches soon.

This reverts commit 506612e.
change formatting back to original.
@bwendling
Copy link
Member

@lizawang Don't apply your changes to YAPF, but we do still require that YAPF is "clean" when it's ran on itself. So I normally do something like:

$ python -m yapf -r -d .

to see what changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants