Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Iterator.from_json and #to_json #10437

Merged
merged 2 commits into from
Mar 22, 2022

Conversation

wonderix
Copy link
Contributor

For details see discussion in crystal forum

Copy link
Member

@straight-shoota straight-shoota left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add documentation with usage examples to the public method?
I'm not sure about FromJson, I think usually those internal iterator types are marked private.

Also would be great to get the same for YAML.

spec/std/json/serialization_spec.cr Outdated Show resolved Hide resolved
elsif @pull.kind.end_array?
@pull.read_next
@end = true
raise JSON::ParseException.new("Invalid JSON", *@pull.location) if @check_eof && [email protected]?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this check is necessary. If the next token isn't EOF, the YAML is malformed and the parser already notices that (at read_next) and raises. Try parsing [a,b]] with check_eof: false for example.

Would be great to add specs for failure behaviour, though (invalid YAML and non-matching types).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor

@Sija Sija left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic: missing white lines

src/json/from_json.cr Show resolved Hide resolved
src/json/from_json.cr Show resolved Hide resolved
spec/std/yaml/serialization_spec.cr Outdated Show resolved Hide resolved
@@ -48,6 +48,18 @@ private def parse_scalar(ctx, node, type : T.class) forall T
end
end

module Iterator(T)
# Reads the content for an iterator from a YAML sequence
def self.new(ctx : YAML::ParseContext, node : YAML::Nodes::Node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if adding this method for YAML actually makes sense, if it can't be lazy. It's pretty much the same as Array.from_yaml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it just for completeness. I was also wondering if it's required. If you prefer a PR without this, I will remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to leave it. It's easier to add later if necessary, than to remove it.
Let's wait for other opinions, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it and also squashed all commits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for that.

@caspiano
Copy link
Contributor

Bump. I've reached for this many times.


# Creates a new iterator which iterates over a JSON array. See also `Iterator#from_json`.
#
# WARNING: The `JSON::PullParser` can't be used by anything else until the iterator is fully consumed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is generally true when you pass a pull parser. So I'm not sure we should include this warning. Like, a pull parser shouldn't have more than one owner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, we can remove these things later on.

# With this method it should be possible to process a huge JSON array, without
# the requirement that the whole array fits into memory.
#
# The following example produces a huge file, uses a lot of CPU but should not require much memory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering does this actually use "a lot of CPU"? It might take some time, sure. But I wouldn't expect it to be a particularly CPU-heavy task.

@beta-ziliani beta-ziliani added this to the 1.4.0 milestone Mar 18, 2022
@straight-shoota straight-shoota changed the title Implement from_json and to_json for Iterator Implement Iterator.from_json and #to_json Mar 21, 2022
@straight-shoota straight-shoota merged commit a860168 into crystal-lang:master Mar 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants