[pkg/ottl] Add ParseSimplifiedXML Converter
djaglowski committed Sep 24, 2024
1 parent ce964b0 commit 2ef6000
Showing 1 changed file with 110 additions and 0 deletions.
110 changes: 110 additions & 0 deletions pkg/ottl/ottlfuncs/README.md
@@ -1215,6 +1215,116 @@ Examples:
- `ParseKeyValue("k1!v1_k2!v2_k3!v3", "!", "_")`
- `ParseKeyValue(attributes["pairs"])`

### ParseSimplifiedXML

`ParseSimplifiedXML(target)`

The `ParseSimplifiedXML` Converter returns a `pcommon.Map` struct that is the result of parsing the target string as XML, discarding attributes and extraneous text content.

The goal of this Converter is to produce a more user-friendly representation of XML data than the `ParseXML` Converter,
which produces a verbose *encoding* of XML data.

This Converter disregards certain aspects of XML, specifically attributes and extraneous text content, in order to produce
a direct representation of XML data. Users are encouraged to simplify their XML documents prior to using `ParseSimplifiedXML`.

See other functions which may be useful for preparing XML documents:

- `ElementizeAttributesXML`
- `ElementizeValuesXML`
- `RemoveXML`
- `AssociateXML`
- `AddElementXML`

#### Formal Definitions

A "Simplified XML" document contains no attributes and no extraneous text content.

An element has "extraneous text content" when it contains both text and element content. For example:

```xml
<foo>
  bar <!-- extraneous text content -->
  <hello>world</hello> <!-- element content -->
</foo>
```
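
The definition above can be sketched as a check in Go. This is only an illustration built on the standard `encoding/xml` package, not the Converter's implementation: `isSimplified` and `frame` are hypothetical names, and treating whitespace-only text as insignificant is an assumption.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// frame records what one open element has contained so far (hypothetical helper).
type frame struct{ hasText, hasChild bool }

// isSimplified reports whether doc has no attributes and no element that
// mixes non-whitespace text with child elements.
func isSimplified(doc string) (bool, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var stack []*frame
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return true, nil
		}
		if err != nil {
			return false, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if len(t.Attr) > 0 {
				return false, nil // attributes are not allowed
			}
			if len(stack) > 0 {
				parent := stack[len(stack)-1]
				parent.hasChild = true
				if parent.hasText {
					return false, nil // text followed by a child element
				}
			}
			stack = append(stack, &frame{})
		case xml.EndElement:
			stack = stack[:len(stack)-1]
		case xml.CharData:
			// Assumption: whitespace between elements does not count as text.
			if len(stack) > 0 && strings.TrimSpace(string(t)) != "" {
				cur := stack[len(stack)-1]
				cur.hasText = true
				if cur.hasChild {
					return false, nil // child element followed by text
				}
			}
		}
	}
}

func main() {
	for _, doc := range []string{
		"<foo><hello>world</hello></foo>",
		"<foo>bar<hello>world</hello></foo>",
		`<foo id="1">bar</foo>`,
	} {
		ok, err := isSimplified(doc)
		fmt.Println(doc, ok, err)
	}
}
```

Only the first document in `main` satisfies the definition; the second has extraneous text content and the third has an attribute.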

#### Parsing logic

1. The Converter will NOT error due to the presence of attributes or extraneous text content.
   However, it will omit those values from the result.
2. Elements which contain a value are converted into key/value pairs.
   e.g. `<foo>bar</foo>` becomes `"foo": "bar"`
3. Elements which contain child elements are converted into a key/value pair where the value is a map.
   e.g. `<foo> <bar>baz</bar> </foo>` becomes `"foo": { "bar": "baz" }`
4. Sibling elements that share the same tag will be combined into a slice.
   e.g. `<a> <b>1</b> <c>2</c> <c>3</c> </a>` becomes `"a": { "b": "1", "c": [ "2", "3" ] }`.
5. Empty elements are dropped, but they can determine whether a value should be a slice or a single value.
   e.g. `<a> <b>1</b> <b/> </a>` becomes `"a": { "b": [ "1" ] }` instead of `"a": { "b": "1" }`
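
The rules above can be approximated in a few lines of Go. The sketch below uses the standard `encoding/xml` and `encoding/json` packages and plain Go maps rather than `pcommon.Map`; it is not the Converter's actual code. The names `node`, `parseTree`, `toMap` and `simplify` are hypothetical, trimming surrounding whitespace from values is an assumption, and edge cases (for example, an element whose children are all empty) may behave differently in the real Converter.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// node is a hypothetical intermediate tree; attributes are never stored.
type node struct {
	name     string
	text     string
	children []*node
}

// parseTree reads doc into a tree hanging off an unnamed sentinel root.
func parseTree(doc string) (*node, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	root := &node{}
	stack := []*node{root}
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			n := &node{name: t.Name.Local}
			parent := stack[len(stack)-1]
			parent.children = append(parent.children, n)
			stack = append(stack, n)
		case xml.EndElement:
			stack = stack[:len(stack)-1]
		case xml.CharData:
			stack[len(stack)-1].text += string(t)
		}
	}
	return root, nil
}

// toMap converts the children of n following rules 1 to 5.
func toMap(n *node) map[string]any {
	counts := map[string]int{}
	for _, c := range n.children {
		counts[c.name]++ // empty siblings are counted too (rule 5)
	}
	out := map[string]any{}
	for _, c := range n.children {
		var val any
		switch text := strings.TrimSpace(c.text); {
		case len(c.children) > 0:
			val = toMap(c) // rule 3; text next to children is ignored (rule 1)
		case text != "":
			val = text // rule 2
		default:
			continue // rule 5: the empty element itself is dropped
		}
		if counts[c.name] > 1 {
			existing, _ := out[c.name].([]any)
			out[c.name] = append(existing, val) // rule 4
		} else {
			out[c.name] = val
		}
	}
	return out
}

// simplify returns the result as JSON so it can be compared with the examples.
func simplify(doc string) (string, error) {
	root, err := parseTree(doc)
	if err != nil {
		return "", err
	}
	b, err := json.Marshal(toMap(root))
	return string(b), err
}

func main() {
	for _, doc := range []string{
		"<foo>bar</foo>",
		"<a> <b>1</b> <c>2</c> <c>3</c> </a>",
		"<a> <b>1</b> <b/> </a>",
	} {
		out, err := simplify(doc)
		fmt.Println(out, err)
	}
}
```

Counting siblings before converting them is what lets an empty element influence the shape of the result even though it contributes no value of its own.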

#### Examples

Parse a Simplified XML document from the body:

```xml
<event>
  <id>1</id>
  <user>jane</user>
  <details>
    <time>2021-10-01T12:00:00Z</time>
    <description>Something happened</description>
    <cause>unknown</cause>
  </details>
</event>
```

```json
{
  "event": {
    "id": "1",
    "user": "jane",
    "details": {
      "time": "2021-10-01T12:00:00Z",
      "description": "Something happened",
      "cause": "unknown"
    }
  }
}
```

Parse a Simplified XML document with unique child elements:

```xml
<x>
  <y>1</y>
  <z>2</z>
</x>
```

```json
{
  "x": {
    "y": "1",
    "z": "2"
  }
}
```

Parse a Simplified XML document with multiple elements of the same tag:

```xml
<a>
  <b>1</b>
  <b>2</b>
</a>
```

```json
{
  "a": {
    "b": ["1", "2"]
  }
}
```

### ParseXML
