Implement Iterator.from_json and #to_json #10437

Merged: 2 commits, Mar 22, 2022
20 changes: 20 additions & 0 deletions spec/std/json/serialization_spec.cr
@@ -67,6 +67,22 @@ describe "JSON serialization" do
      Deque(String).from_json(%(["a", "b"])).should eq(Deque.new(["a", "b"]))
    end

    it "does Iterator(String)#from_json" do
      assert_iterates_iterator ["a", "b"], Iterator(String).from_json(%(["a", "b"]))
    end

    it "raises an error Iterator(String)#from_json with invalid types" do
      expect_raises(JSON::ParseException) do
        Iterator(String).from_json(%([1, 2])).to_a
      end
    end

    it "raises an error Iterator(String)#from_json with invalid JSON" do
      expect_raises(JSON::ParseException) do
        Iterator(String).from_json(%(["a")).to_a
      end
    end

    it "does Hash(String, String)#from_json" do
      Hash(String, String).from_json(%({"foo": "x", "bar": "y"})).should eq({"foo" => "x", "bar" => "y"})
    end
@@ -523,6 +539,10 @@ describe "JSON serialization" do
      Set(Int32).new([1, 1, 2]).to_json.should eq("[1,2]")
    end

    it "does for Iterator" do
      (1..3).each.to_json.should eq("[1,2,3]")
    end

    it "does for Hash" do
      {"foo" => 1, "bar" => 2}.to_json.should eq(%({"foo":1,"bar":2}))
    end
51 changes: 51 additions & 0 deletions src/json/from_json.cr
@@ -65,6 +65,57 @@ def Deque.from_json(string_or_io) : Nil
  end
end

module Iterator(T)
  # Reads the content of a JSON array into an iterator in a lazy way.
  # With this method it should be possible to process a huge JSON array, without
  # the requirement that the whole array fits into memory.
  #
  # The following example produces a huge file, uses a lot of CPU but should not require much memory.
Review comment (Member):
Just wondering does this actually use "a lot of CPU"? It might take some time, sure. But I wouldn't expect it to be a particularly CPU-heavy task.

  #
  # ```
  # File.open("/tmp/test.json", "w+") do |f|
  #   (0..1_000_000_000).each.to_json(f)
  # end
  #
  # File.open("/tmp/test.json", "r") do |f|
  #   p Iterator(Int32).from_json(f).skip(1_000_000_000).to_a
  # end
  # ```
  #
  # WARNING: The `string_or_io` can't be used by anything else until the iterator is fully consumed.
  def self.from_json(string_or_io)
    Iterator(T).new(JSON::PullParser.new(string_or_io))
  end

  # Creates a new iterator which iterates over a JSON array. See also `Iterator.from_json`.
  #
  # WARNING: The `JSON::PullParser` can't be used by anything else until the iterator is fully consumed.
Review comment (Member):
I think this is generally true when you pass a pull parser. So I'm not sure we should include this warning. Like, a pull parser shouldn't have more than one owner.

Review comment (Member):
That said, we can remove these things later on.

  def self.new(pull : JSON::PullParser)
    FromJson(T).new(pull)
  end

  private class FromJson(T)
    include Iterator(T)

    def initialize(@pull : JSON::PullParser)
      @pull.read_begin_array
      @end = false
    end

    def next
      if @end
        stop
      elsif @pull.kind.end_array?
        @pull.read_next
        @end = true
        stop
      else
        T.new(@pull)
      end
    end
  end
end

def Nil.new(pull : JSON::PullParser)
  pull.read_null
end
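
A minimal usage sketch of the lazy reader in this file, assuming a Crystal version that includes this change. The in-memory `IO::Memory` input and the variable names are illustrative stand-ins for a large file and are not part of the PR:

```crystal
require "json"

# Hypothetical input: a small in-memory array standing in for a huge file.
io = IO::Memory.new(%([1, 2, 3, 4, 5]))

# Elements are converted to Int32 one at a time as the iterator advances.
iter = Iterator(Int32).from_json(io)

# Only the first two elements are requested, so the remaining
# elements are never turned into Int32 values.
first_two = iter.first(2).to_a
p first_two # => [1, 2]
```

Because the iterator owns the underlying pull parser, `io` should not be read by anything else until iteration is finished, as the doc comment warns.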
10 changes: 10 additions & 0 deletions src/json/to_json.cr
@@ -114,6 +114,16 @@ struct Set
  end
end

module Iterator(T)
  # Converts the content of an iterator into a JSON array in a lazy way.
  # See `Iterator.from_json` for an example.
  def to_json(json : JSON::Builder)
    json.array do
      each &.to_json(json)
    end
  end
end

class Hash
  # Serializes this Hash into JSON.
  #
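
A short sketch of how `Iterator#to_json` might be used to serialize a lazily computed sequence without first collecting it into an `Array`. The `squares` iterator is a made-up example, and `String.build` is used only to keep the sketch self-contained; any `IO` (such as an open file) should work the same way:

```crystal
require "json"

# Hypothetical data source: squares computed on demand, never stored as an Array.
squares = (1..5).each.map { |n| n * n }

# Each element is written to the IO as soon as the iterator yields it.
json = String.build do |io|
  squares.to_json(io)
end
p json # => "[1,4,9,16,25]"
```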
12 changes: 12 additions & 0 deletions src/yaml/to_yaml.cr
@@ -39,6 +39,18 @@ class Array
  end
end

module Iterator(T)
  # Converts the content of an iterator to YAML.
  # The conversion is done in a lazy way.
  # In contrast to `Iterator#to_json` this operation requires memory
  # for the complete YAML document.
  def to_yaml(yaml : YAML::Nodes::Builder)
    yaml.sequence(reference: self) do
      each &.to_yaml(yaml)
    end
  end
end

struct Tuple
  def to_yaml(yaml : YAML::Nodes::Builder) : Nil
    yaml.sequence do
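
For the YAML side, a small sketch assuming a Crystal version that includes this change. It only checks that an iterator serializes to the same document as the equivalent array; as the doc comment notes, the node tree for the whole document is still built in memory before it is emitted:

```crystal
require "yaml"

# The iterator is consumed element by element while the YAML node tree is built.
from_iterator = (1..3).each.to_yaml

# The equivalent Array, serialized for comparison.
from_array = [1, 2, 3].to_yaml

puts from_iterator
p from_iterator == from_array
```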