Alternative implementation of XML parser #8121

M0rdecay · 2020-09-13T21:28:04Z

Required for all PRs:

Signed CLA.
Associated README.md updated.
Has appropriate unit tests.

This is a slightly different implementation of the parser from PR #7460.
This version is closer to the json parser.

On the one hand, this parser is less flexible than the other proposed solutions. On the other hand, because its naming is similar to that used in the json parser, it does not lose data when the names of nodes or attributes are repeated.
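
To illustrate the naming idea, here is a minimal Python sketch, not the Go code from this PR. It assumes that repeated sibling names get a numeric index (as in the `NODE_0_...` fields shown later in this thread) and that attributes are stored as `<key>_<attr>`. The `flatten` helper and both naming rules are assumptions made for illustration; the PR's actual rules may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def flatten(element, prefix=""):
    """Flatten an element's children into {key: text}, indexing repeated names."""
    fields = {}
    # Count sibling names so that only repeated names receive an index.
    counts = Counter(child.tag for child in element)
    seen = Counter()
    for child in element:
        name = child.tag
        if counts[child.tag] > 1:
            name = "%s_%d" % (child.tag, seen[child.tag])
            seen[child.tag] += 1
        key = prefix + name
        # Assumed naming for attributes: <key>_<attribute name>.
        for attr, value in child.attrib.items():
            fields["%s_%s" % (key, attr)] = value
        if len(child):
            # The element has children: recurse with the key as the new prefix.
            fields.update(flatten(child, key + "_"))
        elif child.text and child.text.strip():
            fields[key] = child.text.strip()
    return fields


doc = """
<root>
  <NODE><SOME_VALUE>11</SOME_VALUE><SOME_DATA>33</SOME_DATA></NODE>
  <NODE><SOME_VALUE>22</SOME_VALUE><SOME_DATA>44</SOME_DATA></NODE>
</root>
"""
fields = flatten(ET.fromstring(doc.strip()))
for key in sorted(fields):
    print(key, fields[key])
```

Because each repeated `NODE` gets its own index, the second node's values do not overwrite the first node's, which is the data-loss case described above.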

I would like to know your opinion.


M0rdecay · 2020-09-14T06:18:55Z

I think I would like to combine functionality from both.
The ability to add tags and fields from an arbitrary location seems useful.

Of course, if you approve of it.

M0rdecay · 2020-09-14T07:00:33Z

It looks like the test timed out because it didn't produce any output for 10 minutes.
That shouldn't be related to the last commit, which only changed the README.

M0rdecay · 2020-09-15T12:17:16Z

Friends, @ssoroka, @reimda, anyone? :(
Please...

As I wrote earlier, to work with XML I currently wrap the parser and run it as an external processor, but this does not work as well as I would like.
When processing large documents (~7 thousand lines), reading data back from the processor fails with an error:

E! [processors.execd] Error reading stdout: bufio.Scanner: token too long

The cause is clear, but the point is that we do have to work with metrics this large.
If we use the parser inside Telegraf, everything is fine.

We are looking forward to the news.

M0rdecay · 2020-09-15T14:09:14Z

Taking this opportunity, I want to share the solution we use to split a metric that contains the result of parsing nested arrays into separate metrics. The input, the processor configuration, and the resulting output follow in that order:

measurement,tag=1115 SOME_VALUE=55i,SOME_DATA=66i 1600178359000000000
measurement,tag=1337 NODE_0_SOME_VALUE=11i,NODE_0_SOME_DATA=33i,NODE_1_SOME_VALUE=22i,NODE_1_SOME_DATA=44i 1600178359000000000
measurement,tag=1226 NODE_0_SOME_VALUE=11i,NODE_0_SOME_DATA=33i,NODE_1_SOME_VALUE=22i,NODE_1_SOME_DATA=44i 1600178359000000000

[[processors.strings]]
  order = 1
  namepass = [ "measurement" ]
  [[processors.strings.trim_prefix]]
    field_key = "*"
    prefix = "NODE_"
  [[processors.strings.trim_prefix]]
    tag_key = "*"
    prefix = "NODE_"

[[processors.starlark]]
  order = 4
  namepass = [ "measurement" ]
  source = '''
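def apply(metric):
    # Collect the distinct numeric indexes that prefix field names, e.g. "0" in "0_SOME_VALUE".
    ids = []

    for v in metric.fields.keys():
        id = v.split("_")[0]
        if id.isdigit() and id not in ids:
            ids.append(id)

    if len(ids) > 0:
        # Emit one copy of the metric per index, keeping only that index's fields.
        metrics = []
        for id in ids:
            m = deepcopy(metric)
            prefix = "%s_" % (id)

            for k, v in m.fields.items():
                if k.startswith(prefix):
                    # Strip only the leading "<id>_"; replace() would also remove later occurrences.
                    new_field = k[len(prefix):]
                    m.fields[new_field] = v

                # Drop the original key, whether it was indexed or not.
                m.fields.pop(k, None)

            metrics.append(m)
        return metrics
    else:
        # No indexed fields: pass the metric through unchanged.
        return metric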
'''

measurement,tag=1115 SOME_VALUE=55i,SOME_DATA=66i 1600178359000000000
measurement,tag=1337 SOME_VALUE=11i,SOME_DATA=33i 1600178359000000000
measurement,tag=1226 SOME_VALUE=22i,SOME_DATA=44i 1600178359000000000
measurement,tag=1337 SOME_VALUE=22i,SOME_DATA=44i 1600178359000000000
measurement,tag=1226 SOME_VALUE=11i,SOME_DATA=33i 1600178359000000000
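
To check the splitting logic outside Telegraf, the same idea can be sketched in plain Python on dictionaries. This is not code for the Starlark plugin: `split_fields` is a hypothetical helper, a metric is modeled only as a dict of fields, and the `NODE_` prefix is assumed to be trimmed already.

```python
def split_fields(fields):
    """Split index-prefixed fields into one dict per index.

    {"0_A": 1, "1_A": 2} becomes [{"A": 1}, {"A": 2}].
    Fields without a numeric prefix pass through unchanged.
    """
    # Collect the distinct numeric prefixes in first-seen order.
    ids = []
    for key in fields:
        head = key.split("_")[0]
        if head.isdigit() and head not in ids:
            ids.append(head)

    if not ids:
        return [dict(fields)]

    result = []
    for idx in ids:
        prefix = idx + "_"
        # Keep only this index's fields and strip the leading "<idx>_".
        # As in the Starlark script, all other fields are dropped.
        result.append(
            {k[len(prefix):]: v for k, v in fields.items() if k.startswith(prefix)}
        )
    return result


parts = split_fields(
    {"0_SOME_VALUE": 11, "0_SOME_DATA": 33, "1_SOME_VALUE": 22, "1_SOME_DATA": 44}
)
for part in parts:
    print(part)
```

Modeling the metric as a plain dict keeps the sketch testable without Telegraf; tags, the measurement name, and the timestamp are simply copied per output metric in the real processor.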


M0rdecay · 2020-10-23T08:36:07Z

Linked - #6968

sjwang90 · 2020-10-23T16:02:31Z

XML Input Issue #1758

ssoroka · 2020-11-16T16:49:36Z

@M0rdecay
the long buffer line issue has since been resolved.
I'm going to close this in favor of #8047, which I think is going to be a more flexible approach. Are there any features from here that you'd like to see implemented there, as well?

M0rdecay · 2020-11-16T20:32:20Z

@ssoroka ah, never mind. I'll adjust to the accepted implementation.
At least if problems come up, I can open issues.
Hopefully the PR will be merged before the 1.17 release.

ssoroka · 2020-11-16T21:43:02Z

@M0rdecay we're going to work on getting it merged soon. I do want to give you a chance to provide your feedback on the other PR. If there's anything you would like to see added, please add a comment to the other PR. Feel free to do a full review and bring up any cases you want to see supported right away.

Thanks for your work here! I appreciate it.

M0rdecay and others added 5 commits September 13, 2020 16:29

Alternative implementation

d050e1a

Create README.md

bd70c36

Corrected parsing of the main node in the array

d86b8b6

Merge branch 'xml_parser_v2' of https://github.com/M0rdecay/telegraf …

0c6b2e0

…into xml_parser_v2

Update README.md

bb57ed5

Update README.md

f626cb2

M0rdecay mentioned this pull request Sep 14, 2020

Adds a new parser for XML data #7460

Closed

3 tasks

ssoroka added the area/xml label Sep 15, 2020

M0rdecay mentioned this pull request Sep 17, 2020

Increasing the metric buffer #8145

Merged

3 tasks

M0rdecay and others added 4 commits September 20, 2020 18:02

Update DATA_FORMATS_INPUT.md

54784c1

Added XML to data formats list

Merge branch 'master' into xml_parser_v2

9d0be84

Trying to resolve conflicts

80f2456

Merge branch 'master' into xml_parser_v2

9e4ccd4

M0rdecay mentioned this pull request Oct 23, 2020

XML parser #6968

Closed

sjwang90 mentioned this pull request Oct 23, 2020

XML input #1758

Closed

M0rdecay added 3 commits November 4, 2020 04:58

config merged

0b19f7c

Merge remote-tracking branch 'upstream/master' into xml_parser_v2

8705911

Update config.go

3a5dd12

ssoroka closed this Nov 16, 2020
