Comments in Json #402

Closed
vito-c opened this issue Jun 11, 2014 · 14 comments

@vito-c

vito-c commented Jun 11, 2014

Is there a way to have jq ignore comments that have been added to the json file:
{
/* People insist on putting comments in data */
"foo": "test",
"bar": "baz"
}

@nicowilliams
Contributor

Is there a way to have jq ignore comments that have been added to the json file:
{
/* People insist on putting comments in data */
"foo": "test",
"bar": "baz"
}

No. JSON is specified by RFC7159, and it doesn't have a format for
comments. Even if jq allowed it (it could), it couldn't preserve them
on output, it couldn't insert them into objects, and so on.
Outputting them would break interop with other JSON parsers, and
that's the real deal breaker, otherwise I'd suggest something like:

{
  "foo": "test" /* comment */,
  ..
}

as that way we could at least logically preserve comments. But as it
is, "no". Sorry :(

I have needed this in the past, so what I tend to do is this:

{
  "comment_0": "People insist on putting comments in data",
  "foo": "test",
  "comment_1": "another comment",
  ...
}

or

{
  "foo__comment": "People insist on putting comments in data",
  "foo": "test",
  "bar__comment": "another comment",
  "bar": ...
}

No comments in arrays, nor at the top-level in any case. C'est la vie.
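
A consumer that wants the data without the comments can drop those keys after parsing. A minimal Python sketch, assuming the two naming conventions shown above (`comment_<n>` and `<field>__comment`); the `strip_comment_keys` helper is hypothetical and not part of jq:

```python
import json
import re

# Assumed convention from the examples above: keys named "comment_<n>"
# or ending in "__comment" carry comments rather than data.
COMMENT_KEY = re.compile(r"^comment_\d+$|__comment$")


def strip_comment_keys(value):
    """Recursively drop comment-carrying keys from parsed JSON."""
    if isinstance(value, dict):
        return {
            key: strip_comment_keys(child)
            for key, child in value.items()
            if not COMMENT_KEY.search(key)
        }
    if isinstance(value, list):
        return [strip_comment_keys(item) for item in value]
    return value


doc = json.loads("""
{
  "comment_0": "People insist on putting comments in data",
  "foo": "test",
  "bar__comment": "another comment",
  "bar": "baz"
}
""")
print(json.dumps(strip_comment_keys(doc)))  # {"foo": "test", "bar": "baz"}
```

Because the comments are ordinary members, every JSON parser and encoder preserves them until a consumer chooses to filter them out.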

@pkoppstein
Contributor

More generally, it would be great if there could be a "--javascript" switch that would authorize jq to accept JavaScript-style JSON. This would, for example, allow {a: "1"} to be "translated" to {"a": "1"}.

@nicowilliams
Contributor

@pkoppstein My intention is to allow multiple parser types. Ditto
encoders. You could write such a parser. But the built-in/default JSON
parser and encoder will deal strictly with RFC7159 JSON, full stop.

@pkoppstein
Contributor

@nicowilliams wrote:

No. .... jq ... couldn't preserve them on output

Please note that the original request was explicitly that they be ignored.

... the built-in/default JSON parser and encoder will deal strictly with RFC7159 JSON, full stop.

Understood, but don't forget that currently jq is not as strict as the "full stop" would suggest. See e.g. #348

@nicowilliams
Contributor

@pkoppstein Yeah, I know, but I think that's just unfortunate. I suppose I could add "ignore comments" to the current parser, but I don't really feel up to it. I'm very busy with other things and any cycles I can spare to jq I'd rather use for more deserving things. Even if you send me a PR for that, I think I might not take it, mostly because the parser is a bit messy and I don't want to make it messier. (I tried extending it to support streaming, but that made it much too messy, and that's when I decided on having multiple parsers.)

OTOH, a PR for multiple parser types... that'd be nice. I'm thinking of how to integrate that properly with the I/O builtins I'm still baking.

@nicowilliams
Contributor

Also, quite frankly, there's a difference between accepting unescaped control characters and ignoring comments that jq couldn't produce. The latter makes one wonder what the point of comments is. It's much better to use a commenting convention like I described than comments that no parser will/should preserve! Mind you, there are problems with accepting unescaped control characters. I may very well remove that, possibly for just some control characters, possibly for all the ASCII ones -- do not rely on jq's willingness to accept unescaped control characters!

@pkoppstein
Contributor

Regarding the handling of comments in JSON, see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json

@vito-c
Author

vito-c commented Sep 11, 2015

@pkoppstein Thanks! Do you know if that will also work for mongodb generated json ... this typically looks something like this:

{
        "_id" : ObjectId("1234567890"),
        "foo" : ObjectId("123456789"),
        "bar" : ObjectId("8897865866758"),
        "timestamp" : ISODate("2015-04-05T16:00:00.174Z")
}

@pkoppstein
Contributor

@vito-c - That of course is, as they put it, "mongodb-extended-json", not JSON. (http://docs.mongodb.org/master/reference/mongodb-extended-json)

MongoDB does provide a utility for producing JSON from a MongoDB instance (but apparently not from a text file):

mongoexport is a utility that produces a JSON or CSV export of data stored in a MongoDB instance

Details about mongoexport are at: http://docs.mongodb.org/master/reference/program/mongoexport/#bin.mongoexport

Here is the URL of a script that purports to convert bson to json:
https://gist.github.com/tedsparc/1763326
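
For shell-style text output like the sample above, a text-level rewrite may be enough. The following is a rough Python sketch, not a MongoDB tool: `shell_to_strict` is a hypothetical helper, it handles only `ObjectId("...")` and `ISODate("...")`, and it assumes those wrappers never occur inside string values. The `$oid` and `$date` key names follow the strict-mode form described in the MongoDB Extended JSON documentation as I understand it; check them against your MongoDB version.

```python
import json
import re

# Hypothetical mapping from shell-mode wrappers to strict-mode keys (assumption).
WRAPPERS = {"ObjectId": "$oid", "ISODate": "$date"}
PATTERN = re.compile(r'\b(ObjectId|ISODate)\(\s*("[^"]*")\s*\)')


def shell_to_strict(text):
    """Rewrite Name("value") as {"$key": "value"} so a JSON parser accepts it."""
    def repl(match):
        return '{"%s": %s}' % (WRAPPERS[match.group(1)], match.group(2))
    return PATTERN.sub(repl, text)


sample = '''{
        "_id" : ObjectId("1234567890"),
        "timestamp" : ISODate("2015-04-05T16:00:00.174Z")
}'''
print(json.dumps(json.loads(shell_to_strict(sample)), indent=2))
```

The rewritten text is plain JSON, so it can be passed on to jq or any other JSON parser.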

@pauldraper

pauldraper commented Sep 27, 2016

To quote the great @pkoppstein,

Nothing I've read about jq suggests that it should reject invalid JSON. In fact, it would be great if it had the ability (perhaps governed by a switch) to transform imperfect JSON into JSON.

Please also note that the most recent "Proposed Standard" for JSON (http://tools.ietf.org/html/rfc7159) explicitly says:

A JSON parser MAY accept non-JSON forms or extensions.

For example, jq currently accepts strings containing U+0083, which is not valid JSON according to RFC7159. In fact, this is almost exactly the same as the comments issue; both are valid JavaScript but not valid JavaScript Object Notation.

It would be reasonable to have a "strict" (or "non-strict") flag. I don't expect jq to output non-strict JSON. But it would be good for it to accept non-strict JSON, as it already does today.
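
As an illustration of such a switch in another tool (not jq), Python's standard `json` module has a `strict` parameter on `json.loads`. With the default `strict=True` it rejects unescaped control characters inside strings; with `strict=False` it accepts them, and `json.dumps` still emits escaped, strictly valid output. A small sketch:

```python
import json

# A JSON text whose string value contains a raw TAB (U+0009). RFC 7159
# requires control characters U+0000 through U+001F to be escaped.
raw = '{"foo": "a\tb"}'

try:
    json.loads(raw)  # strict=True is the default
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# Lenient mode accepts the raw control character on input.
lenient = json.loads(raw, strict=False)

print(strict_ok)            # False
print(json.dumps(lenient))  # {"foo": "a\tb"}  (TAB written as an escape)
```

This covers only control characters; Python's `json` module has no corresponding option for comments.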

@alexkli

alexkli commented Oct 17, 2016

It would be reasonable to have a "strict" (or "non-strict") flag. I don't expect jq to output non-strict JSON. But it would be good for it to accept non-strict JSON, as it already does today.

👍

@RichardBronosky

RichardBronosky commented Mar 15, 2017

Douglas Crockford explains why he removed comments from JSON in https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaGSr (It's not that they were not put in. They were there. They were removed. They should not be put back.) But, in the last sentence he also gives you the solution.

$ brew install jsmin

$ cat > example.json <<'EOF'
{
/* People insist on putting comments in data */
"foo": "test",
"bar": "baz"
}
EOF

$ cat example.json | jq '.["bar"]'
parse error: Invalid numeric literal at line 2, column 3

$ cat example.json | jsmin | jq '.["bar"]'
"baz"
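
The same preprocessing step can be approximated without installing anything. The following is a rough Python sketch, not jsmin itself (jsmin also removes whitespace); `strip_json_comments` is a hypothetical helper that removes `/* ... */` and `// ...` comments while leaving string contents untouched:

```python
import json


def strip_json_comments(text):
    """Remove /* ... */ and // ... comments that sit outside JSON strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so \" does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            # An unterminated block comment swallows the rest of the input.
            i = n if end == -1 else end + 2
        elif text.startswith("//", i):
            end = text.find("\n", i)
            # Stop at the newline that ends the comment; the newline is kept.
            i = n if end == -1 else end
        else:
            out.append(ch)
            i += 1
    return "".join(out)


example = """{
/* People insist on putting comments in data */
"foo": "test", // trailing note
"bar": "baz /* not a comment */"
}"""
print(json.loads(strip_json_comments(example))["bar"])  # baz /* not a comment */
```

Tracking whether the scanner is inside a string is the important part: a naive regex would also delete comment-like text from string values.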

These posts are becoming a common theme for me. hashicorp/packer#1768 (comment)

nicowilliams mentioned this issue Apr 28, 2017

@ricsam

ricsam commented Nov 18, 2022

For some lightweight parsing and modifying of jsonc documents I've created jsonc-cli which is just a cli front-end for Microsoft's jsonc-parser.

@ghost

ghost commented Jun 21, 2023

If you have node installed, save this bash snippet as /usr/local/bin/stripjsonc and make it executable (chmod +x /usr/local/bin/stripjsonc):

#!/bin/bash
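# With no argument, ""||0 evaluates to 0, so node reads stdin (file descriptor 0).
# "${1}" is double-quoted so that paths containing spaces are not word-split.
node -p 'JSON.stringify(eval(`(${require("fs").readFileSync("'"${1}"'"||0, "utf-8").toString()})`))'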

⚠️ uses eval so you must trust that your input is actually JSON/JSONC

Then

echo '{"foo":42 /*hello*/}' | stripjsonc # => {"foo":42}
# or
cat /tmp/foo.jsonc # { /*hello*/ "foo":7}
stripjsonc /tmp/foo.jsonc # => {"foo":7}
