Not expand Json Literal Strings in Grok processor #197639

SoniaSanzV · 2024-10-24T13:24:06Z

Summary

When a new pipeline was created and a Grok processor was selected, the pattern definitions were being parsed. This happened only in the visualization, not in the data being saved. The cause was that the form used XJsonEditor as a component, which in turn uses useXJsonMode. This hook contains a method that parses JSON-like strings, which caused some characters not to be rendered because they were treated as escaped. That is not wanted in this specific editor.

So for this case I've created a component that uses the same editor options as XJsonEditor but uses the value without parsing, so the processor pattern does not change.

Now, if we enter this pattern, update the processor and edit it again, the displayed pattern looks the same:

{
    "ISSUE": "aaa\"bbb",
    "ISSUE2": "aaa\\(bbb"
}

Grok.mov

elasticmachine · 2024-10-24T14:14:30Z

Pinging @elastic/kibana-management (Team:Kibana Management)

ElenaStoeva

Thanks for working on this @SoniaSanzV! I tested locally and it works as expected now. I have a suggestion about reusing the XJsonEditor component instead of creating a new one.

ElenaStoeva · 2024-10-25T13:54:11Z

...onents/pipeline_editor/components/processor_form/field_components/unexpanded_json_editor.tsx

+  lineNumbers: 'off',
+};
+
+export const UnexpandedJsonEditor: FunctionComponent<Props> = ({ field, editorProps }) => {


I'm wondering whether we really need to introduce this new component, given that it reuses most of the logic in the XJsonEditor component. Also, it technically uses the xjson language too, so the new name might be confusing. Wouldn't it be better to add a prop to the XJsonEditor component that determines whether we expand the strings or not?

SoniaSanzV · 2024-10-25

Yes, that is also a good idea. And you are right about the name, so I will implement what you suggest.

ElenaStoeva

Thanks for refactoring the changes @SoniaSanzV! I tested again and realised that the current behavior in this PR may not be entirely correct. With the current logic, we don't allow triple-quoted strings, but they are allowed by Elasticsearch. For example, the following request works in Console:

PUT _ingest/pipeline/test-pipeline
{
  "processors": [
    {
      "grok": {
        "field": "test",
        "patterns": [
          "^%{ISSUE}$"
        ],
        "pattern_definitions": {
          "ISSUE": """aaa\"bbb""",
          "ISSUE2": "aaa\\(bbb"
        }
      }
    }
  ]
}

But if you type in this pattern definition in the UI, it says it's invalid:

Also, I found something else: Elasticsearch seems to automatically expand the strings if they contain quotes inside. For example, if you run GET _ingest/pipeline/test-pipeline, the response is:

{
  "test-pipeline": {
    "processors": [
      {
        "grok": {
          "field": "test",
          "patterns": [
            "^%{ISSUE}$"
          ],
          "pattern_definitions": {
            "ISSUE": """aaa\"bbb""",
            "ISSUE2": """aaa\(bbb"""
          }
        }
      }
    ]
  }
}

(the second pattern definition now uses triple quotes even though we set it with single quotes).

I think we should still allow triple quotes in the UI, but make sure the value in the editor is not expanded or collapsed unnecessarily and is shown the way the user typed it in or the way Elasticsearch returned it.
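The rewrite seen in the GET response above (a regular string with escapes coming back as a triple-quoted one) can be illustrated with a minimal sketch. This is not Kibana's or Elasticsearch's actual code; `expandEscapedStrings` is a hypothetical helper written only to show the idea, and it ignores edge cases such as multi-line values.

```typescript
// Hypothetical helper for illustration only; not the real Kibana implementation.
// Rewrites JSON string values that contain escape sequences as triple-quoted literals.
function expandEscapedStrings(json: string): string {
  // A JSON string literal: a quote, then escaped characters or ordinary characters, then a quote.
  const stringLiteral = /"(?:\\.|[^"\\])*"/g;
  return json.replace(stringLiteral, (literal: string, offset: number) => {
    // Leave object keys untouched: a key is followed by optional whitespace and a colon.
    const rest = json.slice(offset + literal.length);
    if (/^\s*:/.test(rest)) return literal;
    // Only values that actually contain a backslash escape are rewritten.
    if (!literal.includes('\\')) return literal;
    const decoded = JSON.parse(literal) as string;
    // Simplification: skip values that cannot be written safely between triple quotes.
    if (decoded.includes('"""') || decoded.endsWith('"')) return literal;
    return `"""${decoded}"""`;
  });
}

// The input text is {"ISSUE2": "aaa\\(bbb"} (two backslashes in the JSON source).
console.log(expandEscapedStrings('{"ISSUE2": "aaa\\\\(bbb"}'));
// {"ISSUE2": """aaa\(bbb"""}
```

Keys and values without escapes pass through unchanged, so only the pattern definitions that contain a backslash change appearance.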

SoniaSanzV · 2024-11-05T10:59:13Z

I've been working on that and I'm quite lost as to which is the correct solution. I'm starting to think that the original approach is the right one, since we are displaying the same pattern format that ES returns.
What I don't think makes sense is mixing escaped values with unescaped values. I mean, if the user enters the following pattern:

{
    "ISSUE": "aaa\"bbb",
    "ISSUE2": "aaa\\(bbb",
    "ISSUE3": """aaa\"bbb"""
}

we can treat the JSON as escaped or unescaped, but not both. In fact, once the pipeline is created, the user clicks update processor and we fetch the data from ES, we have no way to know whether the user's original input was "ISSUE2": "aaa\\(bbb" or "ISSUE2": """aaa\(bbb""".
Whether we escape the quotation marks or not, I believe that what we show has to be consistent and cohesive.
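The ambiguity can be shown with a small sketch. It assumes that only the decoded string value survives the round trip through ES, so the quoting style the user typed is not recoverable. The function names are hypothetical and exist only for this illustration.

```typescript
// The value as stored after saving: the characters aaa\(bbb (one backslash).
const storedValue = 'aaa\\(bbb';

// Rendering 1: a regular JSON string, with the backslash escaped.
function renderEscaped(value: string): string {
  return JSON.stringify(value);
}

// Rendering 2: a triple-quoted literal, where the content is written verbatim.
// Simplification: assumes the value contains no `"""` and does not end in `"`.
function renderTripleQuoted(value: string): string {
  return `"""${value}"""`;
}

console.log(renderEscaped(storedValue)); // "aaa\\(bbb"
console.log(renderTripleQuoted(storedValue)); // """aaa\(bbb"""
```

Both renderings describe the same stored value, which is why the editor has to pick one format and apply it consistently.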


SoniaSanzV · 2024-11-05T11:39:55Z

I have uploaded a new proposal (we can discuss it). In the end I went with the following: the user can enter any value they want as long as, once parsed, it is valid JSON. Note that in the video, when "ISSUE3": """aaa 'bbb'"" is entered, the color of the editor is not red since the quotes have not yet been escaped and therefore it is not a proper JSON.
Once the processor is saved, it is always displayed with escaped values.

GROK.mov

The invalid JSON validator keeps working as expected
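A rough sketch of this proposal follows, assuming triple-quoted literals are first collapsed into regular JSON strings and the result is then checked with `JSON.parse`. `collapseTripleQuotes` and `validateDefinitions` are hypothetical names, not functions from the PR, and the regular expression is a simplification that does not try to skip `"""` sequences inside ordinary strings.

```typescript
// Hypothetical helper: turn """...""" literals into regular JSON strings.
function collapseTripleQuotes(text: string): string {
  return text.replace(/"""([\s\S]*?)"""/g, (_match: string, body: string) =>
    JSON.stringify(body)
  );
}

interface ValidationResult {
  valid: boolean;
  // The escaped form that would be displayed once the processor is saved.
  escaped?: string;
}

// Accept any input that is valid JSON after collapsing triple quotes.
function validateDefinitions(text: string): ValidationResult {
  try {
    const parsed: unknown = JSON.parse(collapseTripleQuotes(text));
    return { valid: true, escaped: JSON.stringify(parsed, null, 2) };
  } catch (_error) {
    return { valid: false };
  }
}

// A triple-quoted value is accepted; an unescaped quote in a regular string is not.
console.log(validateDefinitions('{"ISSUE3": """aaa\\"bbb"""}').valid);
console.log(validateDefinitions('{"ISSUE3": "aaa "bbb""}').valid);
```

Because the displayed text is regenerated from the parsed value, the original triple quotes are not preserved, which matches the "always displayed with escaped values" behavior described above.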

elasticmachine · 2024-11-05T13:21:07Z

💛 Build succeeded, but was flaky

Buildkite Build
Commit: 564f8da

Failed CI Steps

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`ingestPipelines`	405.2KB	405.3KB	+66.0B

History

💛 Build #246001 was flaky 22dbf1f69f5e256919e724f7bad82c3f1866ef58
💚 Build #245621 succeeded 501df3466fdae1ae05c474572d87a87e5b75610a

cc @SoniaSanzV

ElenaStoeva

Thanks for addressing my feedback @SoniaSanzV. I tested again and I noticed that now we have the opposite problem - triple quotes get transformed into single quotes, which is also something that users might complain about.

Before these changes:

Screen.Recording.2024-11-07.at.19.06.43.mov

With these changes:

Screen.Recording.2024-11-07.at.19.04.53.mov

Overall, I don't think that the initial behavior was really a bug, since Kibana was simply transforming the strings into another format. Also, if you send the same request in Console, you will see that even Elasticsearch transforms single quotes with escape chars into triple quotes:

I think we should either do no transformation at all (i.e. the final request should be exactly as the user types it in) or we should leave it and let Kibana do the transformation as Elasticsearch does it.

SoniaSanzV · 2024-11-08T06:22:22Z

Yep, that's what I mentioned in my last comments. I don't think that the initial implementation was a bug either. My last implementation was just another option to have. But, as I said, we may be able to accept input that mixes escaped and unescaped JSON before the user saves the processor, but there is no way to know from the ES response which format it was in originally, so we will always have to go with one of the two formats. With that in mind, I think the one that matches Dev Tools is better.

SoniaSanzV self-assigned this Oct 24, 2024

SoniaSanzV marked this pull request as ready for review October 24, 2024 14:14

SoniaSanzV requested a review from a team as a code owner October 24, 2024 14:14

ElenaStoeva self-requested a review October 25, 2024 13:42

ElenaStoeva reviewed Oct 25, 2024


SoniaSanzV force-pushed the ingestPipelines/grokProcessor_#175753 branch from 501df34 to 22dbf1f Compare October 25, 2024 15:01

SoniaSanzV enabled auto-merge (squash) October 25, 2024 15:41

SoniaSanzV force-pushed the ingestPipelines/grokProcessor_#175753 branch from 22dbf1f to 622553a Compare October 28, 2024 07:01

ElenaStoeva reviewed Oct 28, 2024


SoniaSanzV added 4 commits November 5, 2024 12:34

Not expand Json Literal Strings in Grok processor

2c68019

Delete unexpanded_json_editor and use a condition in xjson_editor ins…

080fa3d

…tead

Change name of varibale to 'rawValue'

5561e0e

Allow unescaped json as input

564f8da

SoniaSanzV force-pushed the ingestPipelines/grokProcessor_#175753 branch from 622553a to 564f8da Compare November 5, 2024 11:35

ElenaStoeva reviewed Nov 7, 2024


SoniaSanzV closed this Nov 8, 2024

auto-merge was automatically disabled November 8, 2024 06:22
Pull request was closed

SoniaSanzV deleted the ingestPipelines/grokProcessor_#175753 branch November 8, 2024 06:38
