tcpsocket supports a send buffer #6876

Closed: amitmurthy wants to merge 3 commits

Conversation

@amitmurthy (Contributor)

In the same spirit as #6768, this PR avoids a copy while transmitting large arrays between workers. It does this by

  1. moving the buffering functionality of worker connections into tcpsocket itself, and
  2. writing large arrays directly to the socket instead of via an intermediate buffer (sketched below).
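
A minimal, self-contained sketch of that write policy (in modern Julia syntax, which postdates this PR; the BufferedSocket type and helper names here are illustrative, not the PR's actual code, and SENDBUF_SZ is the 1 MB threshold discussed later in this thread):

const SENDBUF_SZ = 1024 * 1024  # 1 MB threshold, per the discussion below

mutable struct BufferedSocket{T<:IO}
    sock::T            # the underlying socket (any IO for this sketch)
    sendbuf::IOBuffer  # coalesces small pending writes
end
BufferedSocket(sock::IO) = BufferedSocket(sock, IOBuffer())

# Drain buffered bytes to the socket before a direct write,
# so byte ordering on the wire is preserved.
function flushbuf(s::BufferedSocket)
    data = take!(s.sendbuf)
    isempty(data) || write(s.sock, data)
    nothing
end

function Base.write(s::BufferedSocket, a::Vector{UInt8})
    if sizeof(a) >= SENDBUF_SZ
        flushbuf(s)
        write(s.sock, a)      # large array: straight to the socket, no intermediate copy
    else
        write(s.sendbuf, a)   # small write: into the send buffer
    end
end

Flushing before the direct write keeps bytes ordered on the wire; a real implementation would also need to drain the buffer on flush/close.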

@hisq commented May 17, 2014

Probably also fixes #4508.

@JeffBezanson (Member)

Awesome!! Looks pretty good but I will read it in more detail later.

@vtjnash (Member) commented May 19, 2014

would it make sense to make this a feature of all our streams, add generic lock/unlock functions, and utilize them whenever entering/exiting IO-related functions (e.g. print/show)?

@Keno (Member) commented May 19, 2014

I agree with Jameson. This looks fine to me and I agree that we shouldn't by default buffer just a single write call, since that would be confusing.

@amitmurthy (Contributor, Author)

@vtjnash did you mean all streams (including IOStream) or only AsyncStreams?

@vtjnash (Member) commented May 19, 2014

perhaps only AsyncStreams, but all streams would be even nicer. it doesn't entirely make sense for IOBuffer to be pre-buffered, but it does make sense for it to have a write lock.

This would make IOStream double-buffered, but we could then replace IOStream with uv_file in general and get non-blocking IO for everything (which could potentially make the Pkg manager faster, since it appears to be I/O limited doing independent operations on metadata for a lot of files).

@JeffBezanson (Member)

Yes, with a buffer on the libuv File we can get rid of IOStream entirely (assuming it will perform well).

@amitmurthy (Contributor, Author) commented on the diff, at the line

make_lockable(s::AsyncStream) = (s.write_lock = RemoteRef())

Currently using a RemoteRef for locking. Wondering if I should change to use libuv mutexes. Local remote refs in workers suffer from the fact that they cannot be initialized till the worker id is known. Also libuv mutexes should be faster and more lightweight.
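
For context, a RemoteRef can act as a lock because it holds at most one value: put! blocks while the slot is full, and take! empties it. A hypothetical sketch (lock_w/unlock_w are the names used later in this thread; this may not match the PR's exact code):

# Julia 0.3-era sketch: the RemoteRef slot is the lock.
lock_w(s::AsyncStream)   = put!(s.write_lock, true)   # acquire; blocks if already held
unlock_w(s::AsyncStream) = take!(s.write_lock)        # release; empties the slot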

@amitmurthy (Contributor, Author)

Just realized that libuv mutexes are recursive, and all Julia tasks run on the same thread, so a second task trying to acquire the lock would succeed immediately instead of blocking. They may not be usable.

@amitmurthy (Contributor, Author)

Have moved the implementation to support all AsyncStreams (except UDPSocket).

@amitmurthy (Contributor, Author)

No tidy way to solve #3787 without locks though. With locking (if we decide to export it), user code could look like:

Base.make_lockable(STDOUT)
@sync begin
    for i in 1:3
        @async begin
            Base.lock_w(STDOUT)
            println("Getting garbled")
            Base.unlock_w(STDOUT)
        end
    end
end

@amitmurthy (Contributor, Author)

Bump. As stated it now supports all AsyncStreams. Feedback?

Merging of File and IOStream is a bit more involved and I would like to do that separately.

@Keno (Member) commented Jun 2, 2014

What's the need for having the lock inside the IOStream rather than as a separate thing?

@amitmurthy (Contributor, Author)

It is more intuitive to lock a stream and then write to it multiple times, rather than have user code manage a lock separately. Sort of like lockf/fcntl on an open file fd.

Anyway, the lock in the IOStream object is not initialized till make_lockable is called on the stream. This is required since a local RemoteRef cannot be instantiated properly on a worker till its Julia pid is known, and the pid is known only when the worker receives the first initialization message from the master process.

@amitmurthy (Contributor, Author)

Bump. It would be great if this could make it into 0.3.

@andreasnoack (Member)

What is the status here? I get a nice speedup from this on large arrays. See the yellow graph below: it shows the time it takes to copy an array of Float64s to another process on my laptop, as a function of the array size. The hump seems to be genuine, as I have rerun the tests a couple of times and the reported times are minima of ten runs. Maybe it is related to the way the buffer grows?

There also seems to be quite a bit of overhead in @spawn compared to MPI Send/Recv. Is this tracked in an issue, or just considered unavoidable because of the way @spawn works?

[plot: time to copy a Float64 array to another process vs. array size]

@amitmurthy (Contributor, Author)

It is waiting for a review by @JeffBezanson.

The hump comes about because this optimization does not kick in for arrays smaller than 1 MB in byte size, which is the value of const SENDBUF_SZ.

I think quite a bit of the overhead in @spawn is due to serialization itself. For example:

julia> iob = IOBuffer()

julia> a = ones(10^8);

julia> @time remotecall_fetch(2, x->x, a);
elapsed time: 0.925497984 seconds (1769649824 bytes allocated, 5.42% gc time)

julia> @time remotecall_fetch(2, x->x, a);
elapsed time: 0.914441734 seconds (1767317752 bytes allocated, 3.63% gc time)

julia> @time serialize(iob, a);
elapsed time: 0.239022428 seconds (967168640 bytes allocated, 5.84% gc time)

julia> seekstart(iob)

julia> @time serialize(iob, a);
elapsed time: 0.131713773 seconds (112 bytes allocated)

So, a local serialization into a pre-allocated buffer takes 0.13 seconds; if not pre-allocated, it takes 0.23 seconds. Deserialization and serialization on the remote end will show similar figures.

@amitmurthy (Contributor, Author)

Closed in favor of #10073

@amitmurthy closed this Feb 4, 2015