serializing closures is slow on master #16508

Closed
amitmurthy opened this issue May 22, 2016 · 11 comments
Labels: parallelism (Parallel or distributed computation), performance (Must go faster), regression (Regression in behavior compared to a previous version)
@amitmurthy (Contributor) commented May 22, 2016

using Serialization   # stdlib on Julia 0.7+; serialize/deserialize were in Base on 0.4/0.5

# Round-trips a closure through an IOBuffer 10^n times.
function foo(n)
    io = IOBuffer()
    f = x -> x
    @time for i in 1:10^n
        serialize(io, f)
        seekstart(io)
        deserialize(io)
    end
end

# Same loop, but round-tripping a plain string for comparison.
function bar(n)
    io = IOBuffer()
    @time for i in 1:10^n
        serialize(io, "Hello World")
        seekstart(io)
        deserialize(io)
    end
end

foo(1)   # warm-up, so the 10^4 runs below measure runtime, not compilation
bar(1)
foo(4)
bar(4)

On 0.4

julia> foo(4)
  0.208748 seconds (680.01 k allocations: 35.706 MB, 1.70% gc time)

julia> bar(4)
  0.010271 seconds (100.00 k allocations: 9.003 MB, 24.10% gc time)

On master

julia> foo(4)
  3.409323 seconds (9.96 M allocations: 541.994 MB, 1.60% gc time)

julia> bar(4)
  0.012568 seconds (80.06 k allocations: 9.616 MB, 19.17% gc time)

@andreasnoack, your hunch was correct: this looks like the cause of the slowdown in DistributedArrays (https://travis-ci.org/JuliaParallel/DistributedArrays.jl/builds/131800408).

@amitmurthy added the performance and regression labels on May 22, 2016
@amitmurthy (Contributor, Author)

cc: @shashi, this could also explain the slowdown you are seeing in ComputeFramework.

@ViralBShah added this to the 0.5.0 milestone on May 22, 2016
@ViralBShah (Member)

This is important for 0.5.

@StefanKarpinski (Member) commented Jun 2, 2016

#16695 helps but is kind of a workaround?
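
A note for context: independent of what exactly that PR changed, the mitigation that exists in Distributed today for repeatedly shipping the same closure is CachingPool, which serializes a function to each worker only once and reuses the cached copy on later calls. A minimal usage sketch (the worker count and workload are illustrative):

using Distributed
addprocs(2)

wp = CachingPool(workers())   # caches serialized functions on each worker

factor = 2.5
# The closure is serialized to a given worker only the first time it runs
# there through this pool; subsequent calls reuse the cached copy.
pmap(x -> x * factor, wp, 1:8)

clear!(wp)   # drop the cached closures on the workers when done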

@amitmurthy (Contributor, Author) commented Jun 15, 2016

I'll leave this issue open, as #16774 optimized it only for parallel processing, and still not up to 0.4 levels. I'll remove the 0.5 tag though because, as discussed in #16774, this is the best we can do for now.

@amitmurthy removed this from the 0.5.0 milestone on Jun 15, 2016
@JeffBezanson modified the milestone (0.5.0) on Jun 15, 2016
@ExpandingMan (Contributor) commented Sep 21, 2016

Not completely sure it's related to this, but deserializing large objects (in this case DataFrames) from disk seems to take significantly longer on 0.5 (9 seconds for a 38 MB DataFrame). I'm afraid I can't give my 0.4 number because I replaced the 0.4 build, but I'm quite sure it was much faster (maybe 1 s at most). Serialization, on the other hand, seems fine.

I did some further testing and it seems that this depends strongly on the stored data types. A 300 MB DataFrame of floats loads in about 1 s, but 38 MB of mixed String, DateTime and Float64 columns takes forever (9 s).
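
For anyone trying to narrow this down, here is a minimal timing sketch (mine, not ExpandingMan's original test) that round-trips vectors of different element types through an IOBuffer; the element types and sizes are illustrative only:

using Serialization, Dates

function roundtrip_time(xs)
    io = IOBuffer()
    # warm-up pass so compilation is excluded from the measurement
    serialize(io, xs); seekstart(io); deserialize(io)
    seekstart(io)
    @elapsed begin
        serialize(io, xs)
        seekstart(io)
        deserialize(io)
    end
end

n = 10^5
for (name, xs) in [
        ("Float64",  rand(n)),
        ("String",   [string("row ", i) for i in 1:n]),
        ("DateTime", [DateTime(2016, 1, 1) + Second(i) for i in 1:n]),
    ]
    println(rpad(name, 9), ": ", roundtrip_time(xs), " s")
end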

@oscardssmith (Member)
Do we still care about this? It seems not to have improved significantly between 0.5 and master, but maybe we should close it as won't-fix since no one has complained in four years?

@oxinabox (Contributor) commented Dec 7, 2020

Serializing closures is pretty crucial for our distributed processing, so yes?
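
To make the use case concrete, here is a minimal sketch (mine, not oxinabox's code) of the pattern this issue penalizes: the anonymous function below captures offset, so the whole closure is serialized and shipped to the worker on every call.

using Distributed
addprocs(1)

function shifted_sum(offset, xs)
    # `ys -> sum(ys) + offset` captures `offset`, so Distributed has to
    # serialize the closure itself, not just the arguments, on each call.
    remotecall_fetch(ys -> sum(ys) + offset, first(workers()), xs)
end

shifted_sum(10.0, 1:5)   # 25.0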

@timholy (Member) commented Dec 7, 2020

Julia 0.5 is when fast closures were added, so maybe it's just inevitable?

@ViralBShah added the parallelism label on Sep 6, 2022
@ViralBShah (Member)

I just tried the benchmark again, and it is still as slow as reported.

@vtjnash closed this as completed on Aug 24, 2023
@vtjnash (Member) commented Aug 24, 2023

Yes, but we recommend not sending closures (rather, load packages on all nodes), so this isn't really an issue.
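
For reference, a minimal sketch of that recommendation (shifted_sum is a stand-in name; in practice the kernel would live in a package loaded on every node): define a named function on all workers and pass captured values as explicit arguments, so only the function name and the arguments are serialized instead of a closure.

using Distributed
addprocs(2)

# In practice this would be `@everywhere using MyKernels` for a package
# installed on every node; here the kernel is defined inline.
@everywhere shifted_sum(xs, offset) = sum(xs) + offset

# Only the name `shifted_sum` and the two arguments cross the wire:
remotecall_fetch(shifted_sum, first(workers()), 1:5, 10.0)   # 25.0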

@andreasnoack (Member)
If that is the case, then it would be helpful to have some hints on how to avoid closures when communicating between processes in, e.g., DistributedArrays, as mentioned above.
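
One possible workaround (my suggestion, not something stated in the thread) is to replace a capturing anonymous function with a named callable whose type already exists on every worker, for example Base.Fix2, so that only the wrapped value is serialized rather than a new closure definition:

using Distributed
addprocs(2)

factor = 2.5

# Capturing anonymous function: the closure definition itself is shipped.
slow = pmap(x -> x * factor, 1:8)

# Base.Fix2 is defined in Base on every worker, so only the wrapped value
# `factor` needs to travel over the wire.
fast = pmap(Base.Fix2(*, factor), 1:8)

slow == fast   # true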
