Handling blank lines in x12xml_simple.seg, x12file._parse_segment #35

Open: gumption wants to merge 1 commit into master
gumption commented Mar 6, 2015

I've been parsing a bunch of 835 files and found the following error terminating the parsing of many of them:

Traceback: Traceback (most recent call last):
  ...
  File "/usr/local/lib/python2.7/dist-packages/pyx12/x12n_document.py", line 240, in x12n_document
    xmldoc.seg(node, seg)
  File "/usr/local/lib/python2.7/dist-packages/pyx12/x12xml_simple.py", line 72, in seg
    if child_node.usage == 'N' or seg_data.get('%02i' % (i + 1)).is_empty():
AttributeError: 'NoneType' object has no attribute 'usage'

It appears this error is caused by 835 files that have blank lines, which I realize are malformed (out of spec), but my assumption is that the goal is to enable X12 parsing to continue (i.e., not terminate) while logging errors encountered during parsing.

I fixed this error by adding a child_node is None clause in that if statement, but I see there is an else clause at the end that raises an EngineError, so I'm not sure if this fix aligns with the desired behavior.
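
The guard described above can be sketched roughly as follows. This is a stand-alone illustration, not the pyx12 source: `FakeNode` and `should_skip` are hypothetical names, and only the `usage` attribute seen in the traceback is modeled. Whether a missing node should be skipped quietly or reported (as the existing EngineError branch suggests) is left open here.

```python
class FakeNode(object):
    """Hypothetical stand-in for a map node; only `usage` is modeled."""

    def __init__(self, usage):
        self.usage = usage


def should_skip(child_node, element_is_empty):
    """Return True when an element should be skipped instead of emitted.

    The `child_node is None` test comes first, so `.usage` is never
    read from None (the cause of the AttributeError in the traceback).
    """
    return child_node is None or child_node.usage == 'N' or element_is_empty
```

With this ordering, a lookup that returns None falls into the "skip" branch rather than raising.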

Once this was fixed, a new IndexError was generated by x12file._parse_segment(), due - I think - to the presence of a blank line between an invalid IEA segment and a valid ISA segment.

I fixed this by adding a check for empty loops before checking the preceding loop to generate error messages, but was not sure what error code to use ... and am less confident that my fix in this file is consistent with desired behavior.
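
As a rough illustration of the "check for empty loops first" idea, here is a minimal sketch. It assumes the open envelopes are tracked on a simple list used as a stack; `CLOSERS`, `close_loop`, and the message text are hypothetical and are not taken from x12file.py.

```python
# Standard X12 envelope pairs: each trailer segment closes one header.
CLOSERS = {'IEA': 'ISA', 'GE': 'GS', 'SE': 'ST'}


def close_loop(open_loops, seg_id):
    """Pop the loop closed by `seg_id`; return an error message or None.

    `open_loops` is a list used as a stack of currently open header IDs.
    The emptiness check runs before any indexing, so a trailer that
    arrives with nothing open yields a message instead of an IndexError.
    """
    expected = CLOSERS[seg_id]
    if not open_loops:
        return '%s segment found with no open %s loop' % (seg_id, expected)
    current = open_loops.pop()
    if current != expected:
        return '%s segment closes %s, but %s was open' % (seg_id, expected, current)
    return None
```

For example, `close_loop([], 'IEA')` returns an error string, while `close_loop(['ISA'], 'IEA')` returns None.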

FWIW, here is the section of the log file that is generated for one of the files with errant blank lines and a malformed IEA segment:

20150306 16:37:58 pyx12.error_handler ERROR: Line:4383 SEG:1 - Segment
ISA*00 not found. Started at /ISA_LOOP/IEA
20150306 16:37:58 pyx12.error_handler ERROR: No current segment in error_handler. Line:4383 SEG:1 - Segment identifier "
ISA" is invalid
20150306 16:37:58 pyx12.error_handler ERROR: Line:4610 SEG:5 - Segment IEA exceeded max count. Found 2, should have 1
20150306 16:37:58 pyx12.error_handler ERROR: Line:4382 ISA:000 - IEA loop with malformed preceding segment

I thought it might be more useful to submit a pull request - even though one or both fixes may not be acceptable - to help more easily identify the problem areas. I suspect that if these errors are to be caught, a more extensive set of error checks will need to be added to one or both files.

If it is more helpful to simply report errors than trying to fix them, let me know, and I'll switch tactics.

Blank lines in 835 files were causing AttributeError exceptions in
x12xml_simple._seg() and IndexError exceptions in
x12file._parse_segment(). Added code to check for error conditions,
which may or may not align with desired behavior.
azoner commented Mar 19, 2015

I'll take a look at the pull requests. Any unhandled exception like this
is not ideal.

There is a limit to the level of malformed input this library should
cleanly handle and recover from. My preference when dealing with this kind
of data would be to run a preprocessor to more cleanly structure the input
segments.
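
A preprocessor of the kind suggested here could be as small as the sketch below. It is only an assumption about what such a step might look like, not part of pyx12: `strip_blank_lines` is a hypothetical helper that drops lines containing nothing but whitespace and leaves every other line, including its line ending, untouched.

```python
def strip_blank_lines(text):
    """Return `text` with empty or whitespace-only lines removed.

    Line endings on the surviving lines are kept as-is, so segment
    terminators and fixed-width ISA content are not altered.
    """
    return ''.join(line for line in text.splitlines(True) if line.strip())
```

The cleaned text could then be written to a temporary file and handed to the parser in place of the original input.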


John Holland

