Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

References as pointers #84

Closed
joaquimg opened this issue Jul 26, 2017 · 17 comments
Closed

References as pointers #84

joaquimg opened this issue Jul 26, 2017 · 17 comments

Comments

@joaquimg
Copy link
Member

What if...

References were pointers instead of UInt´s?

With UInts we can have simple vector as mappings IF the solver does not support deletions. However, if the solver supports deletions we need some mapping. This mapping could be a vector, which would grow a lot with lots of deletions; could be a linked list which is fast to delete but not fast to access; could be a Dict which is easy to delete but not so fast to access.

Given that we usually do a lot more additions than deletions, may we should not be much worried of the time it takes to delete, but with the time it takes to add.

My proposal goes as follows, instead of giving the user a unique UInt everytime a variable/constraint is added we could give a pointer to an Int that directly matches the solver´s index, the SolverInstance would also hold a list of those pointers. In case of adding a constraint there would be no need for a lookup to get the solver´s corresponding index. In case of a deletion we can set that pointer to a zero value and fix all the other pointers since the SolverInstance has a list of them, then the user´s reference are fixed.

@joaquimg
Copy link
Member Author

Is connected to #63 and #71

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

There's a huge performance penalty if references objects have a field that contains a reference to a GC-managed object. What type precisely are you imagining as a pointer?

@joaquimg
Copy link
Member Author

joaquimg commented Jul 26, 2017

My very first idea was simply:

struct VariableReference
    value::Vector{Int}
end

and that vector has length 1.
I imagine one can do better that this... The SolverInstance would keep a vector of these vectors in order to apply the updates after deletions.

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

The GC will explode if you have 10,000,000 variables. I don't have the link off hand, but it's a big issue with the Julia compiler that may or may not be fixed by 1.0.

@joaquimg
Copy link
Member Author

joaquimg commented Jul 26, 2017

So we can´t have 10,000,000 vectors in julia? :(

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

By explode I mean take up a majority of the execution time.

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

@joaquimg
Copy link
Member Author

joaquimg commented Jul 26, 2017

julia> using BenchmarkTools

julia> function refs()
       [Int[3] for i in 1:10_000_000]
       end
refs (generic function with 1 method)
julia> @benchmark refs()
BenchmarkTools.Trial:
  memory estimate:  991.82 MiB
  allocs estimate:  10000002
  --------------
  minimum time:     1.185 s (46.46% GC)
  median time:      2.013 s (65.95% GC)
  mean time:        1.870 s (63.30% GC)
  maximum time:     2.412 s (69.37% GC)
  --------------
  samples:          3
  evals/sample:     1
julia> function dicts()
       Dict(zip(collect(1:10_000_000),collect(1:10_000_000)))
       end
dicts (generic function with 1 method)

julia> @benchmark dicts()
BenchmarkTools.Trial:
  memory estimate:  693.76 MiB
  allocs estimate:  79
  --------------
  minimum time:     2.202 s (12.19% GC)
  median time:      2.252 s (12.26% GC)
  mean time:        2.275 s (13.38% GC)
  maximum time:     2.370 s (15.55% GC)
  --------------
  samples:          3
  evals/sample:     1

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

The extra GC overhead will easily consume the factor of 2 that you're looking to gain by avoiding dictionaries.

@joaquimg
Copy link
Member Author

yep :(

@joaquimg
Copy link
Member Author

Out of curiosity, do you see other drawbacks besides this GC problem?

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

Having to

fix all the other pointers

forces an O(n) operation for every deletion. I'd rather have constant-time addition and constant-time deletion that's a tiny bit slower. But I think there are ways to tweak the idea to also have constant-time deletion.

@joaquimg
Copy link
Member Author

My understanding of the current usage suggestion is that we have a dict:
variable_map = Dict{MOI.VariableRerence, Int}() or variable_map = Dict{UInt64, Int}()

The Int is the variable index for the solver, when a deletion happens, GLPK, Gurobi and Xpress, shift the indices and thus one have to loop over the dictionary to correct the Int´s to match the correct variables right?

@mlubin
Copy link
Member

mlubin commented Jul 26, 2017

Not all solvers have poor deletion semantics like that. Some solvers like SCIP have their own reference objects and don't force you to keep the indices, so we should design the interface so that it's possible for deletion to be fast in that case.

@blegat
Copy link
Member

blegat commented Jul 26, 2017

In theory, you could avoid the O(n) by using a Fenwick Tree to encode the deletions. You will add a deletion to the Fenwick Tree on O(log(n)) and you will be able to retrieve the number of deletion done before an index i in O(log(n)). Not sure if it is worth doing it practice though

@mlubin
Copy link
Member

mlubin commented Aug 1, 2017

Ok to close or is there more to discuss here?

@joaquimg
Copy link
Member Author

joaquimg commented Aug 1, 2017

For now we should close, maybe if the issues in julia are fixed we can reopen.

@joaquimg joaquimg closed this as completed Aug 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants