Flux.loadparams! is slow. #1764

Closed
Gregliest opened this issue Nov 9, 2021 · 2 comments

Comments

@Gregliest
Contributor

According to the test case below, Flux.loadparams! is about 7x slower than creating a new Flux model and allocates far more (191 allocations vs. 5). Is there a reason for that? loadparams! also seems to scale worse than creating a new model as more layers are added.

using Flux, BenchmarkTools

model = Chain(
  Dense(2, 3),
  Dense(3, 2),
  Dense(2, 2),
  Dense(2, 1)
)

p = Flux.params(model)

function buildModel(params)
  return Chain(
    Dense(params[1], params[2]),
    Dense(params[3], params[4]),
    Dense(params[5], params[6])
  )
end

@btime buildModel(p) # 691.101 ns (5 allocations: 224 bytes)
@btime Flux.loadparams!(model, p) # 4.963 μs (191 allocations: 7.08 KiB)

@ToucheSir
Member

loadparams! has to iterate through the entire model structure recursively because it calls params, so the additional time and allocations are expected:

for (p, x) in zip(params(m), xs)

Generally there shouldn't be a need to run it on any hot code path, though. Is there a reason you want to repeatedly call loadparams!?
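
For reference, the loop quoted above can be sketched as a standalone function. This is only an approximation of what loadparams! did in Flux releases from around the time of this issue, not a copy of the library source; the name sketch_loadparams! is hypothetical, and it assumes a Flux version where Flux.params is still available. The point is that each call re-collects the parameters by walking the whole model before copying anything.

```julia
using Flux

# Hypothetical helper: an approximation of the loop inside loadparams!.
# Flux.params(m) traverses the model recursively on every call, which is
# the traversal the comment above points to as the source of the extra
# time and allocations; copyto! then writes into the existing arrays.
function sketch_loadparams!(m, xs)
    for (p, x) in zip(Flux.params(m), xs)
        size(p) == size(x) ||
            error("Expected param size $(size(p)), got $(size(x))")
        copyto!(p, x)  # in-place overwrite of the existing parameter array
    end
    return m
end

# Usage: overwrite every parameter of a small model with ones.
m = Chain(Dense(2, 3), Dense(3, 1))
xs = [fill!(similar(p), 1) for p in Flux.params(m)]
sketch_loadparams!(m, xs)
```

Because the copy is an in-place mutation, this pattern suits one-off loading (for example, restoring saved weights) better than a hot loop.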

@Gregliest
Contributor Author

Ok, that makes sense. I was thinking of using it in a Turing model to replace parameters when sampling stochastic weights, but I don't think that works with the gradient calculation.
