Skip to content

Latest commit

 

History

History
338 lines (227 loc) · 11.9 KB

grammar_debugging.md

File metadata and controls

338 lines (227 loc) · 11.9 KB

Grammar Debugging

How to debug the Puppet Grammar

Understanding what the grammar is doing can be done by reading its static output and manually tracing through what will happen when running.

In the directory: puppet/pops/grammar - run this command:

make egrammar.output

Will produce an output file in text form.

This file is however quite hard to read and it is easy to get lost after just a few steps.

You can also get the parser to output each shift/reduce and follow how the parser builds up the resulting expression. To do this the parser must be generated for debugging. (It is much slower, so this should not be done by default. (And if you generate a parser for debugging, never ever check it in).

Modify the makefile to read like this:

eparser.rb: egrammar.ra
    racc -g -o$@ egrammar.ra

It is the -g that adds debugging. Remove the -g when done with debugging. If nothing happens you may need to touch the egrammar.ra file.

Next step is to get some output. There is no instrumentation for this. You need to modify parser_support.rb, and modify the beginning of the _parse method to read like this:

  def _parse()
    begin
      @yydebug = true
      @racc_debug_out = $stdout
      require 'debugger'; debugger
      main = yyparse(@lexer,:scan)
      . . .

There is normally just a @yydebug = false at this place (to turn off debugging). The @racc_debug_out is by default directed to $stderr (which may be redirected depending on how you are running), change that to $stdout.

When parsing, use only small snippets that trigger the problem or you will drown in output. I use a breakpoint, so I can do 'next' over the parsing and immediately see the result (and terminate if I did not like what I got, tweak, regenerate and run again until happy).

Here is a session as an example:

$ bundle exec puppet apply --parser future -e 'notice [1]'

/Users/henrik/git/puppet/lib/puppet/pops/parser/parser_support.rb:190
main = yyparse(@lexer,:scan)

[185, 194] in /Users/henrik/git/puppet/lib/puppet/pops/parser/parser_support.rb
   185    def _parse()
   186      begin
   187        @yydebug = true
   188        @racc_debug_out = $stdout
   189        require 'debugger'; debugger
=> 190        main = yyparse(@lexer,:scan)
   191        # #Commented out now because this hides problems in the racc grammar while developing
   192        # # TODO include this when test coverage is good enough.
   193        #      rescue Puppet::ParseError => except
   194        #        except.line ||= @lexer.line

(rdb:1) next

read    :NAME(NAME) {:line=>1, :pos=>1, :offset=>0, :length=>6, :value=>"notice"}

shift   NAME
        [ (NAME {:line=>1, :pos=>1, :offset=>0, :length=>6, :value=>"notice"}) ]

goto    55
        [ 0 55 ]

reduce  NAME --> name
        [ (name notice) ]

goto    33
        [ 0 33 ]

reduce  name --> text_or_name
        [ (text_or_name notice) ]

goto    30
        [ 0 30 ]

reduce  text_or_name --> literal_expression
        [ (literal_expression notice) ]

goto    15
        [ 0 15 ]

reduce  literal_expression --> primary_expression
        [ (primary_expression notice) ]

goto    35
        [ 0 35 ]

read    :LISTSTART(LISTSTART) {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}

reduce  primary_expression --> call_function_expression
        [ (call_function_expression notice) ]

goto    14
        [ 0 14 ]

reduce  call_function_expression --> higher_precedence
        [ (higher_precedence notice) ]

goto    9
        [ 0 9 ]

reduce  higher_precedence --> expression
        [ (expression notice) ]

goto    10
        [ 0 10 ]

reduce  expression --> resource_expression
        [ (resource_expression notice) ]

goto    8
        [ 0 8 ]

reduce  resource_expression --> relationship_expression
        [ (relationship_expression notice) ]

goto    7
        [ 0 7 ]

reduce  relationship_expression --> any_expression
        [ (any_expression notice) ]

goto    6
        [ 0 6 ]

reduce  any_expression --> syntactic_statement
        [ (syntactic_statement notice) ]

goto    5
        [ 0 5 ]

reduce  syntactic_statement --> syntactic_statements
        [ (syntactic_statements [notice]) ]

goto    4
        [ 0 4 ]

shift   LISTSTART
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) ]

goto    48
        [ 0 4 48 ]

read    :NAME(NAME) {:line=>1, :pos=>9, :offset=>8, :length=>1, :value=>"1"}

shift   NAME
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (NAME {:line=>1, :pos=>9, :offset=>8, :length=>1, :value=>"1"}) ]

goto    55
        [ 0 4 48 55 ]

reduce  NAME --> name
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (name 1) ]

goto    33
        [ 0 4 48 33 ]

reduce  name --> text_or_name
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (text_or_name 1) ]

goto    30
        [ 0 4 48 30 ]

reduce  text_or_name --> literal_expression
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (literal_expression 1) ]

goto    15
        [ 0 4 48 15 ]

reduce  literal_expression --> primary_expression
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (primary_expression 1) ]

goto    35
        [ 0 4 48 35 ]

read    :RBRACK(RBRACK) {:line=>1, :pos=>10, :offset=>9, :length=>1, :value=>"]"}

reduce  primary_expression --> call_function_expression
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (call_function_expression 1) ]

goto    14
        [ 0 4 48 14 ]

reduce  call_function_expression --> higher_precedence
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (higher_precedence 1) ]

goto    9
        [ 0 4 48 9 ]

reduce  higher_precedence --> expression
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (expression 1) ]

goto    122
        [ 0 4 48 122 ]

reduce  expression --> expressions
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (expressions [1]) ]

goto    125
        [ 0 4 48 125 ]

shift   RBRACK
        [ (syntactic_statements [notice]) (LISTSTART {:line=>1, :pos=>8, :offset=>7, :length=>1, :value=>"["}) (expressions [1]) (RBRACK {:line=>1, :pos=>10, :offset=>9, :length=>1, :value=>"]"}) ]

goto    224
        [ 0 4 48 125 224 ]

reduce  LISTSTART expressions RBRACK --> array
        [ (syntactic_statements [notice]) (array ([] 1)) ]

goto    25
        [ 0 4 25 ]

reduce  array --> literal_expression
        [ (syntactic_statements [notice]) (literal_expression ([] 1)) ]

goto    15
        [ 0 4 15 ]

reduce  literal_expression --> primary_expression
        [ (syntactic_statements [notice]) (primary_expression ([] 1)) ]

goto    35
        [ 0 4 35 ]

read    false($end) false

reduce  primary_expression --> call_function_expression
        [ (syntactic_statements [notice]) (call_function_expression ([] 1)) ]

goto    14
        [ 0 4 14 ]

reduce  call_function_expression --> higher_precedence
        [ (syntactic_statements [notice]) (higher_precedence ([] 1)) ]

goto    9
        [ 0 4 9 ]

reduce  higher_precedence --> expression
        [ (syntactic_statements [notice]) (expression ([] 1)) ]

goto    10
        [ 0 4 10 ]

reduce  expression --> resource_expression
        [ (syntactic_statements [notice]) (resource_expression ([] 1)) ]

goto    8
        [ 0 4 8 ]

reduce  resource_expression --> relationship_expression
        [ (syntactic_statements [notice]) (relationship_expression ([] 1)) ]

goto    7
        [ 0 4 7 ]

reduce  relationship_expression --> any_expression
        [ (syntactic_statements [notice]) (any_expression ([] 1)) ]

goto    6
        [ 0 4 6 ]

reduce  any_expression --> syntactic_statement
        [ (syntactic_statements [notice]) (syntactic_statement ([] 1)) ]

goto    63
        [ 0 4 63 ]

reduce  syntactic_statements syntactic_statement --> syntactic_statements
        [ (syntactic_statements [notice, ([] 1)]) ]

goto    4
        [ 0 4 ]

reduce  syntactic_statements --> statements
        [ (statements [(invoke notice ([] 1))]) ]

goto    2
        [ 0 2 ]

reduce  statements --> program
        [ (program (invoke notice ([] 1))) ]

goto    1
        [ 0 1 ]

shift   $end
        [ (program (invoke notice ([] 1))) ($end false) ]

goto    61
        [ 0 1 61 ]

shift   $end
        [ (program (invoke notice ([] 1))) ($end false) ($end false) ]

goto    134
        [ 0 1 61 134 ]

accept

Even if you are not immediately familiar with the grammar, you can pretty much guess what is going on here. Notice that the built AST is shown in lisp notation form, so you can see how the parser stacks up expression snippets (i.e. 'shift') and then combines them into an expression ('reduce'). A 'shift' pushes a token, or result onto the stack, the state is then shifted ('goto'). A 'reduce' pops a state and may result in an actual reduction on the stack (many rules simply result in the same value).

As an example look at the rule that shifts RBRACK, it then shifts to a state that results in LISTSTART expressions RBRACK --> array, and this is where the parser decides it has "seen an array". You also see the "lisp" notation; (syntactic_statements [notice]) (array ([] 1)) which consists of a mix of rule names (syntactic_statements, array) and resulting values [notice] (a ruby array with a NAME token having value notice, and ([] 1) the puppet model for the [] operator with 1 as an argument.

As an exercise, it is useful to start with following the actions taken once RBRACK has been seen until it reaches the end with the (program (invoke notice ([] 1))) - this part (the unwinding/reduction) is somewhat easier to follow than the nesting that takes place at the beginning until you are familiar with the rules in the grammar.

The state numbers are useful to find the details in the output file. If you are surprised why the next step was a particular shift or reduce, you can see the table of where different input takes the parser next.

The above relies on the following method having been added to the Puppet::Pops::Model::Factory:

# Useful when debugging as this dumps the expression in compact form
def to_s
  @@dumper ||= Puppet::Pops::Model::ModelTreeDumper.new
  @@dumper.dump(self)
end

as you would otherwise get an unintelligent listing of ruby instance references. If you are debugging an older version of the grammar / factory, you probably want to add that method. (And yes, while the parser is building the result, each fragment of the expression tree is wrapped in an instance of Factory - it acts as a builder that can manipulate the model elements. (But that is a topic for another post).

  • henrik