Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPA Unable to Handle Escape Characters when UTF-8 with BOM encoded #6988

Closed
adhilto opened this issue Sep 4, 2024 · 4 comments · Fixed by #6989
Closed

OPA Unable to Handle Escape Characters when UTF-8 with BOM encoded #6988

adhilto opened this issue Sep 4, 2024 · 4 comments · Fixed by #6989
Labels

Comments

@adhilto
Copy link

adhilto commented Sep 4, 2024

Short description

This problem was a problem on the ScubaGear project (though we've identified a fix). The input json files are generated via PowerShell 5, which encodes files as UTF-8 with BOM (byte-order mark). OPA is able to handle such files in most cases, but there are some json sequences that result in OPA outputting unable to parse input: yaml

It's possible there are other sequences that are problematic, but \/ was a character sequence that consistently produced the issue.

Tested with version 0.64.1 on Windows 10.

Steps To Reproduce

A concrete example:

First, generate a UTF-8 with BOM json file. For example, on PowerShell 5, run @{"dates"=@(Get-Date)} | ConvertTo-Json | Set-Content "example.json" -Encoding "utf8".

That will save a file looking like this:

{
    "dates":  [
                  {
                      "value":  "\/Date(1725472274633)\/",
                      "DisplayHint":  2,
                      "DateTime":  "Wednesday, September 4, 2024 10:51:14 AM"
                  }
              ]
}

Use a rego file (example.rego) such as this:

package example
import future.keywords

dates contains date if{
    some date in input.dates
    startswith(date.DateTime, "Wednesday")
}

Execute OPA: .\opa_windows_amd64.exe eval data.example.dates -i example.json -d example.rego -f values. Output: unable to parse input: yaml: line 4: found unknown escape character.

Copy example.json into a new file and save it as just UTF-8 (no BOM). Name it example_nobom.json. Execute OPA again: .\opa_windows_amd64.exe eval data.example.dates -i example_nobom.json -d example.rego -f values. Output:

[
  [
    {
      "DateTime": "Wednesday, September 4, 2024 10:43:58 AM",
      "DisplayHint": 2,
      "value": "/Date(1725471838770)/"
    }
  ]
]

Expected behavior

OPA to be able to parse the escape sequence. Alternatively, if UTF-8 with BOM isn't supported, for that to be appropriately documented (if it is already, I couldn't find it) and have a more user-friendly error message.

Additional context

@adhilto adhilto added the bug label Sep 4, 2024
@ashutosh-narkar
Copy link
Member

I would image this is the case with the latest OPA release as well. If you'd like to contribute a fix that would be great!

@anderseknert
Copy link
Member

BOM! Now that's an issue I haven't dealt with in years :P

JSON doesn't seem to like that, but..

implementations that parse JSON texts MAY ignore the presence of a byte order mark rather than treating it as an error

So I guess that's what we should do. Since we consume the whole input before passing it to the JSON decoder, this should be fairly easy to do. I'll take a look.

anderseknert added a commit to anderseknert/opa that referenced this issue Sep 4, 2024
@adhilto
Copy link
Author

adhilto commented Sep 4, 2024

BOM! Now that's an issue I haven't dealt with in years :P

JSON doesn't seem to like that, but..

implementations that parse JSON texts MAY ignore the presence of a byte order mark rather than treating it as an error

So I guess that's what we should do. Since we consume the whole input before passing it to the JSON decoder, this should be fairly easy to do. I'll take a look.

TIL. Thanks!

@anderseknert
Copy link
Member

Yeah, that was easy :) PR up here #6989

anderseknert added a commit to anderseknert/opa that referenced this issue Sep 4, 2024
anderseknert added a commit to anderseknert/opa that referenced this issue Sep 4, 2024
ashutosh-narkar pushed a commit that referenced this issue Sep 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants