Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposals #18

Closed
itdependsnetworks opened this issue Oct 11, 2017 · 17 comments · May be fixed by #38
Closed

Proposals #18

itdependsnetworks opened this issue Oct 11, 2017 · 17 comments · May be fixed by #38
Assignees

Comments

@itdependsnetworks
Copy link

Trying to get a feel for what the appetite for a few updates into textfsm are. Using it a fair amount I have identified a few short comings (that you can work around, but are difficult.)

Missed captures
Example: I recently had a case in which on a "show interface status" I hit a condition in about %2 of interfaces. My data size what over 60K of interfaces so it was not easy to find.

Proposal: Set a flag that indicates all lines must match. Not that there must be a capture group in a given match, but that simply you have seen every line, and have profiled it already. Meaning that if it completes the rule set without a match, fail. This would be helpful when trying to match large routing tables, or large amounts of data to identify an issue early.

Repeated Data
Example: As noted in #11 when data is repeated, there is not an elegant solution to capture the multiple items

Proposal: A new key (RepeatedItem) that will reprocess a line with the assumption that there is repeated data, and to add to the list.

Trim()
Example: Often it is tough (reverse or forward lookbacks is the only way I have found) to get rid of pre/post white space.

Proposal: A new key (Trim) that will do minor post processing on the data after the match

Column Count
Example: Some output is merely based on column count (routing tables especially) and difficult to match when data may or may not be there.

Proposal: This is clearly the flimsiest of the proposals, however I would like a mechanism to identify the regex in one space, and just define space counts in the next.

@itdependsnetworks
Copy link
Author

Ping

@buxtronix
Copy link
Collaborator

A few comments on this:

Missed captures: You can already do this by having a catch-all rule contain an 'Error' action.

Repeated Data: I'll discuss with harro, there are certainly use cases that make it worth trying to address properly. I'm not sure how the implementation would work yet, do you have something in mind?

Trim: We've usually preferred textfsm to do capture only and not data manipulation, but let me consider whether this simple case is feasible. Can you provide a case where this has been difficult for you?

Column count: I'm not quite sure what a specific example might be - can you provide a more concrete example here?

Thanks.

@itdependsnetworks
Copy link
Author

Missed captures: Brilliant and much easier than my example
Repeated Data : Potentially two regex's, an outside bound and and inside bound. E.g.

Value INTERFACES RepeatedItem (\S+) "(\S+?),"

Start
  ^\${PORT}\s+${PROTO}\s+${INTERFACES}

Raw: Po101(U) LACP(a) Et3/1/1,Et3/1/2,Et3/1/3,Et3/1/4

Trim() https://github.com/networktocode/ntc-templates/blob/master/templates/cisco_ios_show_ip_bgp.template
https://github.com/networktocode/ntc-templates/tree/master/tests/cisco_ios/show_ip_bgp
Column Count Same as previous

You can see that the example it is largely positional based, and forward/reverse lookups being used. I get's complicated real quick

@mihudec
Copy link

mihudec commented Jan 4, 2018

I am just starting with textfsm, but I would also appreciate a better way to handle Repeated Data, possibly by using two regex's as @itdependsnetworks suggested. I think it would have major application for many types of outputs and would greatly reduce the need for post-processing in Python. An example of such command could be show vlan with it's interfaces, show interfaces trunk with list of allowed VLANs and so on.

Thanks.

@harro
Copy link
Contributor

harro commented Mar 4, 2018

To second buxtronix's comments, the preference is to do post processing outside of textfsm, so it would need to be a fairly compelling case to add such functionality.

Repeated Data:

In cases like AS Path, lumping into a single field and splitting later has been our MO to date. e.g.

...
Value ASPath ((\d+ )*(i|\?))

Start
  ^\s*${Status} ${Best}${Origin}c?\s+${Prefix}\s+${NextHop}(\s+u)?\s+${Metric}\s+${LocalPref}\s+${Weight}\s+${ASPath}\s*$$ -> Next.Record

EOF

The use of string.Template for matching keeps the textfsm logic simple. I suppose for a RepeatedItem we could merge 'match' and 'separator' like the above ASPath entry, then use them separately for splitting post match.

@harro harro self-assigned this Mar 4, 2018
@itdependsnetworks
Copy link
Author

@harro the question is would you except a PR? My colleague has put in a PR for better error handling and it hasn’t been picked up yet. Willing to do the work, if going to be accepted.

@harro
Copy link
Contributor

harro commented Mar 9, 2018

Will chat with @buxtronix about it next week. Get back to then.

@harro
Copy link
Contributor

harro commented Mar 22, 2018

Apologies for the delay.

The appetite for functionality that can be easily provided in post processing is not high.
With trim() and ColumnCount falling under that category.

The Repeated Data function is interesting and improves parsing power, which certainly makes it worthy of consideration.

The concern being that we get a clean implementation. Presently the limited syntax and functionality allows us to keep the template and data parsing code simple. So it would be preferably if an incremental increase in functionality came with a similarly modest increase in code complexity.

@gachteme
Copy link

@harro the regex module is fully backwards compatible with re and has functionality to add the repeated data functionality.
using match.captures() instead of match.group() enables this.

a = regex.match('(ab)+(bc)', 'ababbc')
b = re.match('(ab)+(bc)', 'ababbc')
a.groups()

('ab', 'bc')

b.groups()

('ab', 'bc')

a.captures(1)

['ab', 'ab']

a.group(1)

'ab'

@sumkincpp
Copy link
Contributor

sumkincpp commented Sep 2, 2019

Can there be an action for retrying the match but using different state?

Something like Continue <StateName> which is implicitly forbidden now. But, since it's forbidden, it's easy to implement it. This will mean "Keep current line and process it with state ".

Or may be something like KeepLine <StateName>

I.e. something like this:

SUBPART
  <all subpart matches>
  ^[^\s] -> KeepLine.Record Start

@gachteme
Copy link

gachteme commented Sep 2, 2019

@sumkincpp I've been thinking about adding an import statement that would more or less copy and paste one state into another. This would be useful for the situation you suggest and would allow the inserted matches to be at the top or middle of a state, which would be hard to do with your on failure condition. I definitely have written a few templates that had a lot of copying/pasting of parts of states that could have been reduced if there was an import feature.

To be clear I mean importing within the same file.

E.g:

Start
  \<some expressions\>
  $$IMPORT ToImportState
  \<some expressions\>

ToImportState
  \<some expressions\>
  ^\S -> Start

this solution would avoid the infinite loop problem presented by continue coupled with a state transition, while also reducing template development time and complexity.

Maybe it could also provide a feature that suppresses state transitions on imports? Thoughts?

@harro
Copy link
Contributor

harro commented Sep 4, 2019

This seems to me that you would still have looping issues i.e. the jump from 'ToImportState' back to 'Start' could loop back to 'ToImportState' again?

If the $$IMPORT is limited to push/pop behavior then that might avoid loops i.e. if the $$IMPORT is simply an inline macro expansion.

@harro
Copy link
Contributor

harro commented Sep 4, 2019

@sumkincpp as @gachteme has pointed out, the big issue with keeping a line and changing to a new state is that it is possible to loop indefinitely and avoiding this possibility is a strong aspect of the design.

Do you have an example text where this behavior is required?

@gachteme
Copy link

gachteme commented Sep 4, 2019

@harro that is a good point. I was planning on implementing it as a recursive macro. The infinite loop problem could be solved either by only ever going one import deep, removing any imports that have already been used, or removing any self imports. I would lean towards removing any imports that have already been pasted.

This could have somewhat negative implications for debugging as well as the visual debugger under development though it’s not something that couldn’t be overcome.

@harro
Copy link
Contributor

harro commented Sep 4, 2019

For detection in your preferred case, then a stack might suffice - Check the stack for presence of the macro label, error if found, otherwise push the label and expand, pop once done.

It would likely be necessary to parse the input template at least twice, first to parse the macros, second to do the expansion.

For line numbers in error reporting, a nested notation for line numbers might suffice but does entail more work on behalf of the reader. For example an error line might be 2.3.1, where line 2 will correspond to a first macro, and only by looking at the 3rd line of that macro, would we find the final macro, where it is the 1st line there that is the issue.

Might be worth promoting this to its own issue. To get a feel for how helpful/desirable this functionality might be?

@gachteme
Copy link

gachteme commented Sep 4, 2019

So the reason I was suggesting using a pool instead of the classic stack implementation of this is that it will reduce duplicate expressions.

For example if I have:

Start
  $$import A
  $$import B

B
  $$import A

A
  ^stuff

Then the start state ends up with:

  ^stuff
  ^stuff

Which is not an error but clearly the second import of A will never match because the first one will always catch the line first. It’s dead code.

However, if the stack is made and saved it may be easier to display the errors in the way you suggested.

@harro
Copy link
Contributor

harro commented Nov 15, 2019

With Repeat option and import functionality opened as separate issues I think this issue is OK to close.

@harro harro closed this as completed Nov 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants