Skip to content
This repository has been archived by the owner on Oct 21, 2021. It is now read-only.

dijkstra_shortest_paths_explicit() #141

Merged
merged 1 commit into from
Dec 30, 2014
Merged

dijkstra_shortest_paths_explicit() #141

merged 1 commit into from
Dec 30, 2014

Conversation

sbromberger
Copy link
Contributor

Function that will return the actual nodes involved in dijkstra_shortest_paths as an array of (arrays of vertices).

julia> g = simple_graph(4)
julia> add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
julia> show_dijkstra_shortest_paths(g,2)
4-element Array{Array{Int64,N},1}:
 Int64[]
 [2]
 [2,3]
 [2,3,4]

Ref: #139

function show_dijkstra_shortest_paths{V}(g::AbstractGraph{V},all...)
state = dijkstra_shortest_paths(g,all...)
allvertices = g.vertices
patharr = Array(Array{V},0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be Array(Vector{V},0), I believe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - sorry. Let me change that.

@mlubin
Copy link
Contributor

mlubin commented Dec 28, 2014

Seems reasonable to me, though travis is failing. I would name it something like dijkstra_shorest_paths_explicit for improved tab completion. And docs need to be updated.

@sbromberger
Copy link
Contributor Author

Travis appears to be failing because the change that allowed dijkstra_shortest_paths() with default weights of 1 (#130) isn't being used by the validator (I think). I can change the test but would prefer not to as there should be no reason to explicitly define weights on the graph. Advice please?


g3 = simple_graph(4)
add_edge!(g3,1,2); add_edge!(g3,1,3); add_edge!(g3,2,3); add_edge!(g3,3,4)
sps = show_dijkstra_shortest_paths(g3,"2")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are integer vertices, so should be 2 not "2" here.

@sbromberger
Copy link
Contributor Author

I got it - thanks for catching my stupid mistakes. For some reason I'm unable to rebase/squash these commits.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 6c321e5 on sbromberger:show_paths into ab89beb on JuliaLang:master.

@sbromberger
Copy link
Contributor Author

(I'll update the docs separately if that's ok.)

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 5e59955 on sbromberger:show_paths into ab89beb on JuliaLang:master.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling aadbdce on sbromberger:show_paths into ab89beb on JuliaLang:master.

@mlubin
Copy link
Contributor

mlubin commented Dec 28, 2014

Okay, would be nicer if this were squashed with a separate commit for the whitespace changes. Leaving open (for a short time) for others' comments.

@sbromberger
Copy link
Contributor Author

I tried squashing but it's giving me errors. I'll keep at it for a bit.

Also, just a question: the docs imply that dijkstra_shortest_paths() can take multiple sources. I can't seem to get that working and haven't tested more than one source for dijkstra_shortest_paths_explicit(). Am I misunderstanding the docs?

@sbromberger
Copy link
Contributor Author

I keep getting rebase errors on squash. I think this is the best I can do unless you'd prefer I start clean. Your call.

@sbromberger sbromberger changed the title show_dijkstra_shortest_paths() dijkstra_shortest_paths_explicit() Dec 30, 2014
@mlubin
Copy link
Contributor

mlubin commented Dec 30, 2014

Are you using git rebase -i HEAD~6? All that should be needed is to mark the last 5 commits as fixup commits to squash them.

@sbromberger
Copy link
Contributor Author

HEAD~6 shows

pick e0dd99e helper for shortest paths with default edge weights=1
pick 305ffcf edge_dists is now optional for Dijkstra
pick b8341cc show_dijkstra_shortest_paths()
pick e68e19a show_dijkstra_shortest_paths()
pick 6c321e5 silly test fix
pick 5e59955 rename per suggestion
pick aadbdce updated docs

I think the problem is that I've got b8341cc and e68e19a with the same thing. In any case, I'm getting

error: could not apply e68e19a... show_dijkstra_shortest_paths()

Let me see if the friendly folks on freenode#github can help :)

@mlubin
Copy link
Contributor

mlubin commented Dec 30, 2014

Would probably be less effort just to open a new PR.

@mlubin
Copy link
Contributor

mlubin commented Dec 30, 2014

We've all done it before :)

@sbromberger
Copy link
Contributor Author

It's an academic challenge at this point. Give me a few before I throw in the towel. :)

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 742fdc7 on sbromberger:show_paths into ab89beb on JuliaLang:master.

@sbromberger
Copy link
Contributor Author

Et voilà.

@mlubin
Copy link
Contributor

mlubin commented Dec 30, 2014

Was the issue with multiple sources resolved? If not, could you update the method signature to only allow a single source? E.g.,

function dijkstra_shortest_paths_explicit{V}(g::AbstractGraph{V},source::V,all...)

@sbromberger
Copy link
Contributor Author

It was not, but I couldn't figure out how/whether it even worked in dijkstra_shortest_paths(). I can change the signature and resquash, and we can revisit later.

change Array to Vector

silly test fix

rename per suggestion

updated docs

explicit: source cannot be Vector/Array
@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling ad6a276 on sbromberger:show_paths into ab89beb on JuliaLang:master.

mlubin added a commit that referenced this pull request Dec 30, 2014
dijkstra_shortest_paths_explicit()
@mlubin mlubin merged commit 5b4889a into JuliaAttic:master Dec 30, 2014
@mlubin
Copy link
Contributor

mlubin commented Dec 30, 2014

Thanks!

@sbromberger
Copy link
Contributor Author

Thank you for all the guidance. I'm going to be working on more centrality measures (see #129). Could you let me know whether the proposed licensing text will work?

@yeesian
Copy link
Contributor

yeesian commented Dec 31, 2014

Just saw this commit; it is great work, thanks! Some remarks:

  1. OpenStreetMap.jl actually provides something similar, and we should have abstracted out that portion of it into Graphs.jl.
  2. We operate on the DijkstraStates object returned by the algorithm, rather than providing a new function, since it seems more like a property/method of the object, than the algorithm
  3. Is the empty array also a valid path?
  4. We do actually use multiple starting points in bellmanford's, so I vote for keeping the version with multiple sources, although I haven't checked that it also works for dijkstra's.
  5. Does it make sense to provide a vector of distances/weights, to accompany the vector of nodes?

@tedsteiner

@sbromberger
Copy link
Contributor Author

Hi @yeesian (btw, your name is doing something really weird to my email client.)

  1. I wasn't aware of OpenStreetMap. I would've gladly just reused that had I known about it, but I asked in Is there a builtin for the actual shortest path (by vertex)? #139 and nobody knew of one.
  2. I originally used DijkstraStates but it turns out I needed the graph and source anyway, since I'm extracting routes for all vertices, so it seemed wasteful to ask for graph, source, and states when I could just derive the third from the first two.
  3. The empty array indicates no path exists between the source node and that target.
  4. I'm not sure multiple sources makes sense here. What does that look like, and how does it work with dijkstra_shortest_paths()? We can certainly add it if there's some benefit, but I couldn't figure out what multiple sources with Dijkstra was supposed to do.
  5. My use case (betweenness centrality) doesn't need the distances, but it's certainly something we can provide. I'd opt to defaulting the edge_distances parameter to ones() much like the convenience function I wrote for dijkstra_shortest_paths().

Also: I think yours will fail with simple_graphs since there is no Inf defined on integers, so I'm curious how an unreachable node from simple_graph will not get caught in a loop after having passed https://github.com/tedsteiner/OpenStreetMap.jl/blob/master/src/routing.jl#L217.

@yeesian
Copy link
Contributor

yeesian commented Dec 31, 2014

Hey @sbromberger (I've corrected my name to be more decorous, sorry about messing up your email client!),

  1. The fault was mine for not spotting that issue earlier, sorry about that!
  2. You're right about being able to derive the states from the graph and source, rather than going through DijkstraStates, and there's no harm in having both. The usecase for working with DijkstraStatesis the following: often I don't know if I want to see all paths in advance (if the graph is pretty enormous, for e.g. road networks), and by the time I've completed Dijkstra's algorithm, I want to be able to query the result, rather than re-performing the algorithm.
  3. And yeah, the empty array makes sense now, thanks! I'm curious: how about returning a Dict{V,Vector{V}} instead? Where the keys are the end-vertices, and the values are the paths to the end-vertices? That way we don't need to assume that the nodes/vertices need to be integers*. That'll generalize more naturally to the case with multiple starting points (below)
  4. I think the best way to illustrate multiple sources is by example:
# construct a graph and the edge distance vector

g = simple_inclist(5)

inputs = [       # each element is (u, v, dist)
    (1, 2, 10.),
    (1, 3, 5.),
    (2, 3, 2.),
    (3, 2, 3.),
    (2, 4, 1.),
    (3, 5, 2.),
    (4, 5, 4.),
    (5, 4, 6.),
    (5, 1, 7.),
    (3, 4, 9.) ]

ne = length(inputs)
dists = zeros(ne)

for i = 1 : ne
    a = inputs[i]
    add_edge!(g, a[1], a[2])   # add edge
    dists[i] = a[3]             # set distance
end

# Single source: Vertex 1
r = dijkstra_shortest_paths(g, dists, 1)

@assert r.parents == [1, 3, 1, 2, 3]
@assert r.dists == [0., 8., 5., 9., 7.]

# Multiple sources: Vertices 1 and 4
r = dijkstra_shortest_paths(g, dists, [1, 4]) #

@assert r.parents == [1, 3, 1, 4, 4]
@assert r.dists == [0., 8., 5., 0., 4.]
  1. I'm with you on defaulting the edge parameters to ones. Perhaps a better approach might be to provide keyword args for deciding whether or not to provide the (vector of) distances? It is just convenient for the moment to provide both, and ignore the parts that we don't need.

You're right about it failing (in the general case) with the implementation with OSM. There we had the luxury of assuming that our weights are Float64s, because we constructed the network that way. A more generic implementation that also works for Ints would be nice.

*The vertices in OSM are also integers (since they're node ids), so we're not in conflict there. But I'll leave that discussion to #143?

@sbromberger
Copy link
Contributor Author

Hi @yeesian

  1. No worries. I needed something quickly, did a search, didn't find anything, asked, ... it's not reasonable to expect everyone to have full knowledge. :)

  2. I see the benefit now that I understand the use case. Would a separate dispatch method (with DijkstraState) be acceptable, with the understanding that you won't get vertex information?

  3. See, e.g., Decision: should per-vertex summary functions return dict, array, or other? #142 - I'm actually learning towards @pozorvlak 's point of view, since if we're processing vertices then getting back to the indices via vertex_index() is at best O(1) and could be worse. (See also Discussion: why should we allow edges / vertices to be anything other than Ints? #143).

  4. Let me take some time to think about this. Thanks for the example.

  5. I'm generally ok with this but I think there's an argument to being consistent with, e.g., dijkstra_shortest_paths() which apparently does it via multiple dispatch. Right now since I glob all parameters, it's transparent when I call dijkstra_shortest_paths().

Also: the failing is not with weights. With vertices that are Ints, the DijkstraState.parents returns internally-inconsistent and ambiguous values when there is no parent (sometimes it's 0, sometimes it's an uninitialized array value), and with vertices that are other types, the "no parent" condition is also inconsistent - floats get Inf, strings get #undef, etc. - see #140 for more details.

@yeesian
Copy link
Contributor

yeesian commented Dec 31, 2014

Ah sorry I'm out for the moment; I'll have a look when I'm back? I think you're on to the right direction for (5) too. Have a good new years eve! (:

@sbromberger
Copy link
Contributor Author

@yeesian Happy New Year to you as well!

BTW, this is how your name was coming across in Mac Mail and in iOS:
screen shot 2014-12-31 at 10 31 56

@yeesian
Copy link
Contributor

yeesian commented Dec 31, 2014

Oh haha that was actually intentional, and inspired by a thread on stackoverflow

@sbromberger
Copy link
Contributor Author

Brilliant. And here I was getting ready to blame Apple for breaking i18n.

@sbromberger
Copy link
Contributor Author

Ah, if I understand correctly, a dijkstra_shortest_paths() with multiple sources treats each source as parent 0, and will correctly calculate the distances from all other vertices to the closest source.

That complicates things a bit, I would imagine. Let me think on this.

@yeesian
Copy link
Contributor

yeesian commented Jan 1, 2015

Sorry for the delay,

It seems you're onto the issues at hand (really thanks!), do let me/us know what you eventually decide, and where I might be able to help?


  1. I see the benefit now that I understand the use case. Would a separate dispatch method (with DijkstraState) be acceptable, with the understanding that you won't get vertex information?
    [...]
  2. I'm generally ok with this but I think there's an argument to being consistent with, e.g., dijkstra_shortest_paths() which apparently does it via multiple dispatch. Right now since I glob all parameters, it's transparent when I call dijkstra_shortest_paths().

Yeah sure, although if we leave out vertex information, perhaps we should avoid having to work with DijkstraState altogether, and use some notion of isdefined (on the vertices), rather than examining the distances. What do you think? One way I can see it pan out:

paths = enumerate_paths(parents::Vector{V}) # assumed to be all destinations
paths = enumerate_paths(parents::Vector{V}, dest::V) # to a single destination
paths = enumerate_paths(parents::Vector{V}, dest::Vector{V}) # to a subset of the nodes

The user can then map the vertex ids to the respective distances (in DijkstraState), or we could provide convenience functions for those too, i.e.

paths, distances = enumerate_paths(dijkstra::DijkstraState) # assumed to be all destinations
paths, distances = enumerate_paths(dijkstra::DijkstraState, dest::V) # to a single destination
paths, distances = enumerate_paths(dijkstra::DijkstraState, dest::Vector{V}) # to a subset of the nodes
  1. See, e.g., Decision: should per-vertex summary functions return dict, array, or other? #142 - I'm actually learning towards @pozorvlak 's point of view, since if we're processing vertices then getting back to the indices via vertex_index() is at best O(1) and could be worse. (See also Discussion: why should we allow edges / vertices to be anything other than Ints? #143).

I haven't noticed the discussion over at #142, and am perfectly happy with the decision to provide an "array indexed by vertex-index" instead of a Dict.

@yeesian
Copy link
Contributor

yeesian commented Jan 1, 2015

Also, if I understand it correctly,

paths = dijkstra_shortest_paths_explicit(g,source)

is equivalent to

result = dijkstra_shortest_paths(g,source) # L252
paths = enumerate_paths(result.parents) #L253-283

(And I think the latter is a little clearer in intent)

@sbromberger
Copy link
Contributor Author

@yeesian
Happy New Year :)

perhaps we should avoid having to work with DijkstraState altogether

But I thought the use case you needed was one in which you didn't know in advance whether explicit path determination would be necessary, but you had DijkstraState info which was already expensive to calculate.

In any case, what might be best is for you to submit a PR or gist that we could discuss so that we make sure we're talking about the same thing. (This thread is taxing my ability to keep the issues/proposals straight.) I'm totally open to whatever method works best, as long as I can get explicit path information out of it :)

I'm also nervous about using isdefined because in at least two cases (vertices of Ints and Float64s), the result is, in fact defined - but specifically in the former, is indeterminate. This should be fixed using Nullable but I don't know enough about this type to submit a PR myself.

ETA:

Also, if I understand it correctly, ...

Yes, essentially. The difference is pretty much the fact that we're referencing the vertices explicitly via g instead of relying on vertex indexing in results.parents. This was the result of a previous incarnation of the code in which I mapped the paths to vertices via dict and might be revisited if we assume a DijkstraState as an argument.

@yeesian
Copy link
Contributor

yeesian commented Jan 1, 2015

Yeah sure! Sorry about extending this thread beyond its due. And happy new year to you too! (:

@sbromberger
Copy link
Contributor Author

Let me rewrite to use DijkstraState, if that's ok with you, and I'll post a gist for discussion.

@yeesian
Copy link
Contributor

yeesian commented Jan 1, 2015

No worries, I'll submit a PR, and we can discuss there?

@sbromberger
Copy link
Contributor Author

OK, sounds good. I'll wait for your PR :)

Just a caution: the reason we need the vertices is because you need to test "connectedness" for the case where vertices are Ints (https://github.com/JuliaLang/Graphs.jl/blob/master/src/dijkstra_spath.jl#L260). Since DijkstraStates doesn't include vertex information (except as instances in parents which is not helpful since it's not inclusive and could contain elements not in V), we need g.vertices.

@yeesian
Copy link
Contributor

yeesian commented Jan 1, 2015

Oh! I see now that you require access to g.vertices in enumerating your paths, although I'm wondering why that is necessary, and figuring it out might take me alot longer than I expected. Sorry about it

@mlubin
Copy link
Contributor

mlubin commented Jan 1, 2015

@sbromberger
Copy link
Contributor Author

Oh! I see now that you require access to g.vertices in enumerating your paths, although I'm wondering why that is necessary, and figuring it out might take me alot longer than I expected. Sorry about it

See above: the reason you need the actual vertices is because (until #140 is resolved) you need to test for membership in V in order to determine whether or not a valid parent actually exists.

for i in 1:length(parents)
path = V[]
currvertex = allvertices[i]
connected = isdefined(parents,i) && (parents[i] in allvertices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to be pretty inefficient since allvertices is usually quite large; see the workaround proposed in #140

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For SimpleGraph, allvertices is just a range hence pretty efficient. For other graph types, you're right.

This was referenced Jan 1, 2015
ggggggggg pushed a commit to ggggggggg/Graphs.jl that referenced this pull request Mar 17, 2015
to use enumerate_paths to return the paths corresponding to a (sub)set
of destination vertices. See discussion in JuliaAttic#141
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants