JSON Parsing - deadlock/stuck on parsing Json to HTML #36

Open
JochnGst opened this issue Aug 4, 2023 · 5 comments

Comments

JochnGst commented Aug 4, 2023

I'm trying to parse this JSON, which is shipped to my Blazor page. Because of some odd regex parsing issue, the process gets stuck without throwing any exception. Can somebody tell me where the problem could be?
Here you can find my test project: ColorCodeTest

This is my test code:

string _jsonString3 = "{\r\n \"raw_causes\": [\r\n      \"Winterglatter Fahrbahn\",\r\n      \"Nicht angepasste Geschwindigkeit\",\r\n      \"test3\"\r\n    ]\r\n      }";
try
{
    var _formatter = new HtmlFormatter();
    var language = ColorCode.Languages.FindById("json");
    _jsonHtml = _formatter.GetHtmlString(_jsonString3, language); // hangs here and never returns
}
catch (Exception)
{
    // Never reached: the call above hangs instead of throwing.
    throw;
}

It gets stuck after the array element "Winterglatter Fahrbahn", when it calls regexMatch = regexMatch.NextMatch(); and I have no idea why this happens:

        private void Parse(string sourceCode,
                           CompiledLanguage compiledLanguage,
                           Action<string, IList<Scope>> parseHandler)
        {
            Match regexMatch = compiledLanguage.Regex.Match(sourceCode);

            if (!regexMatch.Success)
                parseHandler(sourceCode, new List<Scope>());
            else
            {
                int currentIndex = 0;

                try
                {
                    while (regexMatch.Success)
                    {
                        string sourceCodeBeforeMatch = sourceCode.Substring(currentIndex, regexMatch.Index - currentIndex);
                        if (!string.IsNullOrEmpty(sourceCodeBeforeMatch))
                            parseHandler(sourceCodeBeforeMatch, new List<Scope>());

                        string matchedSourceCode = sourceCode.Substring(regexMatch.Index, regexMatch.Length);
                        if (!string.IsNullOrEmpty(matchedSourceCode))
                        {
                            List<Scope> capturedStylesForMatchedFragment = GetCapturedStyles(regexMatch, regexMatch.Index, compiledLanguage);
                            List<Scope> capturedStyleTree = CreateCapturedStyleTree(capturedStylesForMatchedFragment);
                            parseHandler(matchedSourceCode, capturedStyleTree);
                        }

                        currentIndex = regexMatch.Index + regexMatch.Length;
                        regexMatch = regexMatch.NextMatch();
                    }
                }
                catch (Exception)
                {
                    throw;
                }

                string sourceCodeAfterAllMatches = sourceCode.Substring(currentIndex);
                if (!string.IsNullOrEmpty(sourceCodeAfterAllMatches))
                    parseHandler(sourceCodeAfterAllMatches, new List<Scope>());
            }
        }
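
While the root cause is being tracked down, a match timeout can at least turn the silent hang into an exception. The sketch below is not part of ColorCode: `RegexGuard` and `CountMatches` are hypothetical names, and it assumes you control how the `Regex` is constructed. It relies on the standard .NET `Regex(string, RegexOptions, TimeSpan)` constructor, which makes `Match` and `NextMatch` throw `RegexMatchTimeoutException` once the timeout elapses.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of ColorCode.
public static class RegexGuard
{
    // Counts all matches of pattern in input.
    // Returns null if the regex engine exceeds the timeout.
    public static int? CountMatches(string pattern, string input, TimeSpan timeout)
    {
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        int count = 0;
        try
        {
            // Both Match and NextMatch run the engine, so both sit inside the try block.
            for (Match m = regex.Match(input); m.Success; m = m.NextMatch())
            {
                count++;
            }
            return count;
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }
}
```

For example, `RegexGuard.CountMatches(@"\d+", "a1b22c333", TimeSpan.FromSeconds(1))` returns 3. If the `Regex` is created inside a library without a timeout, setting the `REGEX_DEFAULT_MATCH_TIMEOUT` AppDomain value at application startup may achieve a similar effect; whether that applies to ColorCode depends on how it constructs its `Regex`, which this thread does not show.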

JochnGst commented Aug 4, 2023

I found out that the conflict is in the LanguageRule for JSON keys:

new LanguageRule(
     $@"[,\{{]\s*({Regex_String})\s*:",
     new Dictionary<int, string>
         {
             {1, ScopeName.JsonKey}
         }),

In my case it works when I use this regex instead: $@"[,\{{]\s*(""\w*"")\s*:"
However, I know that this will not catch all edge cases for a JSON key.
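
To make the edge cases concrete, here is a small standalone check of the `\w*` workaround. `KeyRuleCheck` and `MatchesKey` are hypothetical names used only for illustration; the pattern is the one from the comment above with the interpolation escapes resolved.

```csharp
using System;
using System.Text.RegularExpressions;

public static class KeyRuleCheck
{
    // The workaround pattern: a key may only contain \w characters.
    private static readonly Regex WordOnlyKey = new Regex(@"[,\{]\s*(""\w*"")\s*:");

    public static bool MatchesKey(string json) => WordOnlyKey.IsMatch(json);

    public static void Main()
    {
        Console.WriteLine(MatchesKey("{\"raw_causes\": 1}"));   // True
        Console.WriteLine(MatchesKey("{\"content-type\": 1}")); // False: '-' is not matched by \w
        Console.WriteLine(MatchesKey("{\"first name\": 1}"));   // False: ' ' is not matched by \w
    }
}
```

Keys containing hyphens, spaces, or escaped quotes would therefore not be highlighted as keys with this rule.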

@GuildOfCalamity

I made this to clean up the RegEx pattern:

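       // Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
       public static List<string> ExtractKeys(string jsonString)
       {
           var keys = new List<string>();
           // \s* after [,{] allows whitespace or line breaks before the opening quote of the key.
           var matches = Regex.Matches(jsonString, "[,\\{]\\s*\"(.*?)\"\\s*:");
           foreach (Match match in matches)
           {
               keys.Add(match.Groups[1].Value);
           }
           return keys;
       }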

niltor commented Mar 25, 2024

@GuildOfCalamity @JochnGst
I'm encountering the same problem. What is the recommended solution?

Yomodo commented Jul 26, 2024

This has become a serious issue for us: showing the colored JSON of certain Intune policies locks up our whole Blazor app, making it unusable and taking down the environment with it.

Kompiler commented Sep 9, 2024

I believe the issue is caused by excessive regex backtracking when parsing JSON keys.

An atomic group avoids this. Change the original LanguageRule from

new LanguageRule(
     $@"[,\{{]\s*({Regex_String})\s*:",
     new Dictionary<int, string>
         {
             {1, ScopeName.JsonKey}
         })

to

new LanguageRule(
     $@"[,\{{]\s*(?>{Regex_String})\s*:",
     new Dictionary<int, string>
         {
             {1, ScopeName.JsonKey}
         })
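
One detail worth checking with the atomic-group variant: in .NET, `(?>...)` is itself a non-capturing group, so replacing `({Regex_String})` with `(?>{Regex_String})` removes capture group 1 unless `Regex_String` contains a capturing group of its own. If the key should still be assigned `ScopeName.JsonKey`, nesting the atomic group inside a capturing group, `((?>{Regex_String}))`, keeps both properties. The sketch below shows the difference. `StringPattern` is a hypothetical stand-in, since the definition of `Regex_String` is not shown in this thread, and `AtomicKeyDemo` is an illustrative name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicKeyDemo
{
    // Hypothetical stand-in for Regex_String: a double-quoted string with backslash escapes.
    private const string StringPattern = @"""(?:[^""\\]|\\.)*""";

    // Atomic group only: the pattern has no capture group 1.
    public static readonly Regex AtomicOnly =
        new Regex($@"[,\{{]\s*(?>{StringPattern})\s*:");

    // Atomic group nested in a capturing group: group 1 holds the key.
    public static readonly Regex AtomicCaptured =
        new Regex($@"[,\{{]\s*((?>{StringPattern}))\s*:");

    public static void Main()
    {
        string json = "{\r\n \"raw_causes\": [\r\n      \"Winterglatter Fahrbahn\",\r\n      \"test3\"\r\n    ]\r\n      }";

        Console.WriteLine(AtomicOnly.Match(json).Groups[1].Success);   // False
        Console.WriteLine(AtomicCaptured.Match(json).Groups[1].Value); // "raw_causes"
    }
}
```

Whether the missing group matters in practice depends on how ColorCode maps the capture dictionary onto the compiled regex, so treat the nested form as something to try rather than a confirmed fix.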
