-
-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-unique keys in objects. #375
Comments
The standard says (emphasis added):
The standard is open how the implementations should react. It explicitly lists two ways, and this library chose the first one. This is also mentioned in the documentation. I don't think that throwing here makes much sense. This also would change the public API and may break more code than it helps. |
(Duplicate keys are also no an issue in https://github.com/miloyip/nativejson-benchmark and http://seriot.ch/parsing_json.php which cover a lot of JSON compliance issues) |
Sounds reasonable. In terms of documentation, the example with the multimap is indicative enough about how this library treats the duplicate keys, so this can be closed. |
Still, it seems that unable to find out about such situation is strange. In many cases, it means logically that someone who makes such json for most cases missed something or got mistaken and such treatment leads to undefined behavior - which value gets parsed as a result. So maybe it would be a good idea to add some way to signalize about it? Maybe it somehow possible? As far as I understand the library use std::map for underlying object representation, and probably for performance reason don't check before inserting new value..probably it can be some flag..or what is the correct way to handle such cases? |
It is possible to pass a callback function to the parser which implements a check if a parsed key was already stored. |
Thanks a lot for an answer on such short notice. Possible related to parser_callback_t.. |
thanks! |
What should I use from client code as a BasicJsonType& parsed argument to parse callback method? receive compile error |
The third argument is a reference to the recently parsed JSON value (or object key). |
I just tried to come up with an example, but I realized that we provide too little information to the callback function to properly keep track of the object keys. Here was my try: json::parser_callback_t cb = [](int depth, json::parse_event_t event, json & parsed)
{
static std::set<std::string> keys;
if (event == json::parse_event_t::object_end)
{
keys.clear();
}
if (event == json::parse_event_t::key)
{
auto p = keys.insert(parsed.get<std::string>());
if (not p.second)
{
std::cerr << "key " << parsed << " was already seen in this object!" << std::endl;
return false;
}
}
// yes, store parsed value
return true;
}; Unfortunately, this function does not work with nested objects. I figured that we are missing a way to access the object we are adding to. :-/ |
Yes, I came up with almost the same code. In my json format I do have nested objects, but I know at what depth I need to check keys for uniqueness. But for any possible nested levels, it's currently seems impossible... |
I always wanted to overwork the callback method anyway, so I will keep this use case in mind. |
Hi, I think this behavior is broken since 85c7680. I tried updating our codebase using 3.1.0 from 2.1.1 and one of our unit tests failed because using Thus either this commit should be reverted, or the documentation fixed to reflect the new behavior:
|
@jeremysingy You are right. This subtlety has been overlooked. May I ask you to open a separate issue, so we can discuss this independently of this issue? |
Would it be a possibility to at least have a build option (when building json) which allows to throw on duplicate keys? Either setting of a macro before the single include file, or passing an option to the cmake configuration? |
Well, libjansson has this option JSON_REJECT_DUPLICATES
I think it would be nice to have that option in nlohmann too, and with a similar API for simplicity. |
PRs welcome ;) |
In fact, this feature could probably be built with the current API using a parser callback function. |
Is something like this what you had in mind? Or is there a better way to leverage what has already been parsed to check against existing keys? std::stack<std::set<json>> parse_stack;
json::parser_callback_t check_duplicate_keys =
[&](int /* depth */, json::parse_event_t event, json &parsed) {
switch (event) {
case json::parse_event_t::object_start: parse_stack.push(std::set<json>()); break;
case json::parse_event_t::object_end: parse_stack.pop(); break;
case json::parse_event_t::key:
const auto result = parse_stack.top().insert(parsed);
if (!result.second) {
std::cerr << "key " << parsed << " was already seen in this object!" << std::endl;
return false;
}
break;
default: break;
}
// yes, store parsed value
return true;
}; Edit: this is very similar to your first attempt above, but should now take care of nested objects via the stack of sets. |
JSON RFC allows (quite unfortunately, IMHO) non-unique keys in objects.
The current implementation only keeps the last one. I know this is mentioned in the doc in the example where json is constructed from a multimap. In this scenario if the user had put non-unique keys in there - well that's their fault.
In general case, however, when we're deserializing an unknown input and silently discarding duplicate keys, that doesn't seem like the most correct thing to do - this looks more like a silent failure. Perhaps this should throw instead?
The text was updated successfully, but these errors were encountered: