-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Iterator.from_json
and #to_json
#10437
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add documentation with usage examples to the public method?
I'm not sure about FromJson
, I think usually those internal iterator types are marked private
.
Also would be great to get the same for YAML.
src/json/from_json.cr
Outdated
elsif @pull.kind.end_array? | ||
@pull.read_next | ||
@end = true | ||
raise JSON::ParseException.new("Invalid JSON", *@pull.location) if @check_eof && [email protected]? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this check is necessary. If the next token isn't EOF, the YAML is malformed and the parser already notices that (at read_next
) and raises. Try parsing [a,b]]
with check_eof: false
for example.
Would be great to add specs for failure behaviour, though (invalid YAML and non-matching types).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic: missing white lines
src/yaml/from_yaml.cr
Outdated
@@ -48,6 +48,18 @@ private def parse_scalar(ctx, node, type : T.class) forall T | |||
end | |||
end | |||
|
|||
module Iterator(T) | |||
# Reads the content for an iterator from a YAML sequence | |||
def self.new(ctx : YAML::ParseContext, node : YAML::Nodes::Node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if adding this method for YAML actually makes sense, if it can't be lazy. It's pretty much the same as Array.from_yaml
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it just for completeness. I was also wondering if it's required. If you prefer a PR without this, I will remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd prefer to leave it. It's easier to add later if necessary, than to remove it.
Let's wait for other opinions, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it and also squashed all commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't do force pushes (see https://github.com/crystal-lang/crystal/blob/master/CONTRIBUTING.md#minimum-requirements).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for that.
Bump. I've reached for this many times. |
|
||
# Creates a new iterator which iterates over a JSON array. See also `Iterator#from_json`. | ||
# | ||
# WARNING: The `JSON::PullParser` can't be used by anything else until the iterator is fully consumed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is generally true when you pass a pull parser. So I'm not sure we should include this warning. Like, a pull parser shouldn't have more than one owner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, we can remove these things later on.
# With this method it should be possible to process a huge JSON array, without | ||
# the requirement that the whole array fits into memory. | ||
# | ||
# The following example produces a huge file, uses a lot of CPU but should not require much memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering does this actually use "a lot of CPU"? It might take some time, sure. But I wouldn't expect it to be a particularly CPU-heavy task.
Iterator.from_json
and #to_json
For details see discussion in crystal forum