Skip to content

Commit

Permalink
Improve graph structure a little bit (#114)
Browse files Browse the repository at this point in the history
* Add a function transpose_graph

* Use the formatter on src/graph.jl

* Remove old code

* Add unit tests for transpose_graph

* Clean up transpose

* No docs

* Docs

* Codecov

---------

Co-authored-by: Alexis Montoison <[email protected]>
  • Loading branch information
gdalle and amontoison authored Sep 26, 2024
1 parent 620436c commit 939c9df
Show file tree
Hide file tree
Showing 7 changed files with 118 additions and 63 deletions.
1 change: 1 addition & 0 deletions docs/src/dev.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ SparseMatrixColorings.vertices
SparseMatrixColorings.neighbors
SparseMatrixColorings.adjacency_graph
SparseMatrixColorings.bipartite_graph
transpose
```

## Low-level coloring
Expand Down
1 change: 1 addition & 0 deletions src/SparseMatrixColorings.jl
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ using SparseArrays:
SparseMatrixCSC,
dropzeros,
dropzeros!,
ftranspose!,
nnz,
nonzeros,
nzrange,
Expand Down
26 changes: 13 additions & 13 deletions src/coloring.jl
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,8 @@ The vertices are colored in a greedy fashion, following the `order` supplied.
function partial_distance2_coloring(
bg::BipartiteGraph, ::Val{side}, order::AbstractOrder
) where {side}
color = Vector{Int}(undef, length(bg, Val(side)))
forbidden_colors = Vector{Int}(undef, length(bg, Val(side)))
color = Vector{Int}(undef, nb_vertices(bg, Val(side)))
forbidden_colors = Vector{Int}(undef, nb_vertices(bg, Val(side)))
vertices_in_order = vertices(bg, Val(side), order)
partial_distance2_coloring!(color, forbidden_colors, bg, Val(side), vertices_in_order)
return color
Expand Down Expand Up @@ -76,11 +76,11 @@ The vertices are colored in a greedy fashion, following the `order` supplied.
"""
function star_coloring(g::Graph{false}, order::AbstractOrder)
# Initialize data structures
nvertices = length(g)
color = zeros(Int, nvertices)
forbidden_colors = zeros(Int, nvertices)
first_neighbor = fill((0, 0), nvertices) # at first no neighbors have been encountered
treated = zeros(Int, nvertices)
n = nb_vertices(g)

This comment has been minimized.

Copy link
@amontoison

amontoison Sep 27, 2024

Author Collaborator

Why not keeping nvertices?
It's more explicit than n.

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

Because it was very close to the new function name nb_vertices

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner
color = zeros(Int, n)
forbidden_colors = zeros(Int, n)
first_neighbor = fill((0, 0), n) # at first no neighbors have been encountered
treated = zeros(Int, n)
star = Dict{Tuple{Int,Int},Int}()
hub = Int[]
vertices_in_order = vertices(g, order)
Expand Down Expand Up @@ -269,12 +269,12 @@ The vertices are colored in a greedy fashion, following the `order` supplied.
"""
function acyclic_coloring(g::Graph{false}, order::AbstractOrder)
# Initialize data structures
nvertices = length(g)
nedges = nnz(g) ÷ 2 # symmetric sparse matrix with empty diagonal

This comment has been minimized.

Copy link
@amontoison

amontoison Sep 27, 2024

Author Collaborator

Why renaming the variable nedges to a variable name e??
I can understand nv and ne for brevity but not n and e.

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

Fine, you can do a renaming PR if you want, I just wanted to avoid excessive similarity with the new functions that's all

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner
color = zeros(Int, nvertices)
forbidden_colors = zeros(Int, nvertices)
first_neighbor = fill((0, 0), nvertices) # at first no neighbors have been encountered
first_visit_to_tree = fill((0, 0), nedges)
n = nb_vertices(g)
e = nb_edges(g) ÷ 2 # symmetric sparse matrix with empty diagonal
color = zeros(Int, n)
forbidden_colors = zeros(Int, n)
first_neighbor = fill((0, 0), n) # at first no neighbors have been encountered
first_visit_to_tree = fill((0, 0), e)
forest = DisjointSets{Tuple{Int,Int}}()
vertices_in_order = vertices(g, order)

Expand Down
75 changes: 48 additions & 27 deletions src/graph.jl
Original file line number Diff line number Diff line change
Expand Up @@ -3,63 +3,72 @@
"""
Graph{loops,T}
Undirected graph structure stored in Compressed Sparse Column (CSC) format.
Note that the underyling CSC may not be square, so the term "graph" is slightly abusive.
Store a sparse matrix (in CSC) without its values, keeping only the pattern of nonzeros.
It can be seen as a graph mapping columns to rows, hence the name `Graph`.
The type parameter `loops` must be set to:
- `true` if coefficients `(i, i)` present in the CSC are counted as edges in the graph (e.g. for each half of a bipartite graph)
- `false` otherwise (e.g. for an adjacency graph)
# Fields
- `colptr::Vector{T}`: same as for `SparseMatrixCSC`
- `rowval::Vector{T}`: same as for `SparseMatrixCSC`
Copied from `SparseMatrixCSC`:
- `m::Int`: number of rows
- `n::Int`: number of columns
- `colptr::Vector{T}`: column `j` is in `colptr[j]:(colptr[j+1]-1)`

This comment has been minimized.

Copy link
@amontoison

amontoison Sep 27, 2024

Author Collaborator

column j is stored in rowval between the indices colptr[j] and colptr[j+1] - 1

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

dude I copied the docstring word for word from SparseMatrixCSC, are you really going to nitpick on this ^^?

- `rowval::Vector{T}`: row indices of stored values
"""
struct Graph{loops,T<:Integer}
m::Int

This comment has been minimized.

Copy link
@amontoison

amontoison Sep 27, 2024

Author Collaborator

Is it not more relevant to store the number of edges and vertices instead of m and n?

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

The advantage of this storage method is that if we need to adapt any code from SparseArrays we can just copy paste it right away

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

Also this graph is not really symmetric, if the matrix is rectangular there are incoming and outgoing vertices (kind of), so it's good to keep track of m

n::Int
colptr::Vector{T}
rowval::Vector{T}
end

function Graph{loops}(S::SparseMatrixCSC{Tv,Ti}) where {loops,Tv,Ti}
return Graph{loops,Ti}(S.colptr, S.rowval)
return Graph{loops,Ti}(S.m, S.n, S.colptr, S.rowval)
end

Base.length(g::Graph) = length(g.colptr) - 1
SparseArrays.nnz(g::Graph) = length(g.rowval)

This comment has been minimized.

Copy link
@amontoison

amontoison Sep 27, 2024

Author Collaborator

I think nnz should not be defined for Graph.
I don't see in which context we could use it.
It could also be misleading in the case of Graph{false}, because its output is different from nb_edges.

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

That's why I separated the definition of nb_edges into a different function. In our context, nnz can be useful whenever we make operations on the underlying sparse CSC structure

SparseArrays.rowvals(g::Graph) = g.rowval
SparseArrays.nzrange(g::Graph, j::Integer) = g.colptr[j]:(g.colptr[j + 1] - 1)

nb_vertices(g::Graph) = g.n
vertices(g::Graph) = 1:nb_vertices(g)

SparseArrays.nnz(g::Graph{true}) = length(g.rowval)
nb_edges(g::Graph{true}) = length(g.rowval)

function SparseArrays.nnz(g::Graph{false})
n = 0
for j in 1:(length(g.colptr) - 1)
for k in g.colptr[j]:(g.colptr[j + 1] - 1)
i = g.rowval[k]
function nb_edges(g::Graph{false})
e = 0

This comment has been minimized.

Copy link
@amontoison

amontoison Sep 27, 2024

Author Collaborator

e -> nedges ?

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner

let's settle for ne

This comment has been minimized.

Copy link
@gdalle

gdalle Sep 27, 2024

Author Owner
for j in vertices(g)
for k in nzrange(g, j)
i = rowvals(g)[k]
if i != j
n += 1
e += 1
end
end
end
return n
return e
end

vertices(g::Graph) = 1:length(g)

function neighbors(g::Graph{true}, v::Integer)
return view(g.rowval, g.colptr[v]:(g.colptr[v + 1] - 1))
return view(rowvals(g), nzrange(g, v))
end

function neighbors(g::Graph{false}, v::Integer)
neighbors_with_loops = view(g.rowval, g.colptr[v]:(g.colptr[v + 1] - 1))
neighbors_with_loops = view(rowvals(g), nzrange(g, v))
return Iterators.filter(!=(v), neighbors_with_loops) # TODO: optimize
end

function degree(g::Graph{true}, v::Integer)
return length(g.colptr[v]:(g.colptr[v + 1] - 1))
return length(nzrange(g, v))
end

function degree(g::Graph{false}, v::Integer)
d = length(g.colptr[v]:(g.colptr[v + 1] - 1))
for k in g.colptr[v]:(g.colptr[v + 1] - 1)
if g.rowval[k] == v
d = length(nzrange(g, v))
for k in nzrange(g, v)
if rowvals(g)[k] == v
d -= 1
end
end
Expand All @@ -69,6 +78,17 @@ end
maximum_degree(g::Graph) = maximum(Base.Fix1(degree, g), vertices(g))
minimum_degree(g::Graph) = minimum(Base.Fix1(degree, g), vertices(g))

"""
transpose(g::Graph)
Return a [`Graph`](@ref) corresponding to the transpose of (the underlying matrix of) `g`.
"""
function Base.transpose(g::Graph{loops,T}) where {loops,T}
S = SparseMatrixCSC{T,T}(g.m, g.n, g.colptr, g.rowval, g.rowval)
Sᵀ = convert(SparseMatrixCSC, transpose(S)) # TODO: use ftranspose! without segfault?
return Graph{loops}(Sᵀ)
end

## Bipartite graph

"""
Expand All @@ -88,16 +108,17 @@ struct BipartiteGraph{T<:Integer}
g2::Graph{true,T}
end

Base.length(bg::BipartiteGraph, ::Val{1}) = length(bg.g1)
Base.length(bg::BipartiteGraph, ::Val{2}) = length(bg.g2)
SparseArrays.nnz(bg::BipartiteGraph) = nnz(bg.g1)
nb_vertices(bg::BipartiteGraph, ::Val{1}) = nb_vertices(bg.g1)
nb_vertices(bg::BipartiteGraph, ::Val{2}) = nb_vertices(bg.g2)

nb_edges(bg::BipartiteGraph) = nb_edges(bg.g1)

"""
vertices(bg::BipartiteGraph, Val(side))
Return the list of vertices of `bg` from the specified `side` as a range `1:n`.
"""
vertices(bg::BipartiteGraph, ::Val{side}) where {side} = 1:length(bg, Val(side))
vertices(bg::BipartiteGraph, ::Val{side}) where {side} = 1:nb_vertices(bg, Val(side))

"""
neighbors(bg::BipartiteGraph, Val(side), v::Integer)
Expand Down Expand Up @@ -159,7 +180,7 @@ function bipartite_graph(A::SparseMatrixCSC; symmetric_pattern::Bool=false)
checksquare(A) # proxy for checking full symmetry
g1 = g2
else
g1 = Graph{true}(convert(SparseMatrixCSC, transpose(A))) # rows to columns
g1 = transpose(g2) # rows to columns
end
return BipartiteGraph(g1, g2)
end
4 changes: 2 additions & 2 deletions src/order.jl
Original file line number Diff line number Diff line change
Expand Up @@ -41,11 +41,11 @@ end
RandomOrder() = RandomOrder(default_rng())

function vertices(g::Graph, order::RandomOrder)
return randperm(order.rng, length(g))
return randperm(order.rng, nb_vertices(g))
end

function vertices(bg::BipartiteGraph, ::Val{side}, order::RandomOrder) where {side}
return randperm(order.rng, length(bg, Val(side)))
return randperm(order.rng, nb_vertices(bg, Val(side)))
end

"""
Expand Down
53 changes: 40 additions & 13 deletions test/graph.jl
Original file line number Diff line number Diff line change
@@ -1,34 +1,50 @@
using LinearAlgebra
using SparseArrays
using SparseMatrixColorings: Graph, adjacency_graph, bipartite_graph, degree, neighbors
using SparseMatrixColorings:
Graph, adjacency_graph, bipartite_graph, degree, nb_vertices, nb_edges, neighbors
using Test

## Standard graph

@testset "Graph" begin
g = Graph{true}(sparse([
1 0 1
1 1 0
0 0 0
1 0 1 1
1 1 0 0
0 0 0 1
]))
gᵀ = transpose(g)

@test length(g) == 3
@test nnz(g) == 4
@test nb_vertices(g) == 4
@test nb_edges(g) == 6
@test nnz(g) == 6
@test neighbors(g, 1) == [1, 2]
@test neighbors(g, 2) == [2]
@test neighbors(g, 3) == [1]
@test neighbors(g, 4) == [1, 3]
@test degree(g, 1) == 2
@test degree(g, 2) == 1
@test degree(g, 3) == 1
@test degree(g, 4) == 2

@test nb_vertices(gᵀ) == 3
@test nb_edges(gᵀ) == 6
@test nnz(gᵀ) == 6
@test neighbors(gᵀ, 1) == [1, 3, 4]
@test neighbors(gᵀ, 2) == [1, 2]
@test neighbors(gᵀ, 3) == [4]
@test degree(gᵀ, 1) == 3
@test degree(gᵀ, 2) == 2
@test degree(gᵀ, 3) == 1

g = Graph{false}(sparse([
1 0 1
1 1 0
0 0 0
]))

@test length(g) == 3
@test nnz(g) == 2
@test nb_vertices(g) == 3
@test nb_edges(g) == 2
@test nnz(g) == 4
@test collect(neighbors(g, 1)) == [2]
@test collect(neighbors(g, 2)) == Int[]
@test collect(neighbors(g, 3)) == [1]
Expand All @@ -49,8 +65,8 @@ end;

bg = bipartite_graph(A; symmetric_pattern=false)
@test_throws DimensionMismatch bipartite_graph(A; symmetric_pattern=true)
@test length(bg, Val(1)) == 4
@test length(bg, Val(2)) == 8
@test nb_vertices(bg, Val(1)) == 4
@test nb_vertices(bg, Val(2)) == 8
# neighbors of rows
@test neighbors(bg, Val(1), 1) == [1, 6, 7, 8]
@test neighbors(bg, Val(1), 2) == [2, 5, 7, 8]
Expand All @@ -73,8 +89,8 @@ end;
1 1 0 1
])
bg = bipartite_graph(A; symmetric_pattern=true)
@test length(bg, Val(1)) == 4
@test length(bg, Val(2)) == 4
@test nb_vertices(bg, Val(1)) == 4
@test nb_vertices(bg, Val(2)) == 4
# neighbors of rows and columns
@test neighbors(bg, Val(1), 1) == neighbors(bg, Val(2), 1) == [1, 3, 4]
@test neighbors(bg, Val(1), 2) == neighbors(bg, Val(2), 2) == [2, 4]
Expand All @@ -94,7 +110,7 @@ end;

B = transpose(A) * A
g = adjacency_graph(B - Diagonal(B))
@test length(g) == 8
@test nb_vertices(g) == 8
@test collect(neighbors(g, 1)) == [6, 7, 8]
@test collect(neighbors(g, 2)) == [5, 7, 8]
@test collect(neighbors(g, 3)) == [5, 6, 8]
Expand All @@ -104,3 +120,14 @@ end;
@test collect(neighbors(g, 7)) == [1, 2, 4, 5, 6, 8]
@test collect(neighbors(g, 8)) == [1, 2, 3, 5, 6, 7]
end

@testset "Transpose" begin
for _ in 1:1000
A = sprand(rand(100:1000), rand(100:1000), 0.1)
g = Graph{true}(A)
gᵀ = transpose(g)
gᵀ_true = Graph{true}(sparse(transpose(A)))
@test gᵀ.colptr == gᵀ_true.colptr
@test gᵀ.rowval == gᵀ_true.rowval
end
end
21 changes: 13 additions & 8 deletions test/suitesparse.jl
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,8 @@ using SparseMatrixColorings:
degree,
minimum_degree,
maximum_degree,
nb_vertices,
nb_edges,
neighbors,
partial_distance2_coloring,
star_coloring,
Expand All @@ -34,15 +36,16 @@ colpack_table_6_7 = CSV.read(
original_mat = matrixdepot("$(row[:group])/$(row[:name])")
mat = dropzeros(original_mat)
bg = bipartite_graph(mat)
@test length(bg, Val(1)) == row[:V1]
@test length(bg, Val(2)) == row[:V2]
@test nnz(bg) == row[:E]
@test nb_vertices(bg, Val(1)) == row[:V1]
@test nb_vertices(bg, Val(2)) == row[:V2]
@test nb_edges(bg) == row[:E]
@test maximum_degree(bg, Val(1)) == row[:Δ1]
@test maximum_degree(bg, Val(2)) == row[:Δ2]
color_N1 = partial_distance2_coloring(bg, Val(1), NaturalOrder())
color_N2 = partial_distance2_coloring(bg, Val(2), NaturalOrder())
@test length(unique(color_N1)) == row[:N1]
@test length(unique(color_N2)) == row[:N2]
yield()
end
end;

Expand All @@ -61,9 +64,9 @@ what_table_31_32 = CSV.read(
original_mat = matrixdepot("$(row[:group])/$(row[:name])")
mat = original_mat # no dropzeros
bg = bipartite_graph(mat)
@test length(bg, Val(1)) == row[:m]
@test length(bg, Val(2)) == row[:n]
@test nnz(bg) == row[:nnz]
@test nb_vertices(bg, Val(1)) == row[:m]
@test nb_vertices(bg, Val(2)) == row[:n]
@test nb_edges(bg) == row[:nnz]
@test minimum_degree(bg, Val(1)) == row[:ρmin]
@test maximum_degree(bg, Val(1)) == row[:ρmax]
@test minimum_degree(bg, Val(2)) == row[:κmin]
Expand All @@ -74,6 +77,7 @@ what_table_31_32 = CSV.read(
else
@test_broken length(unique(color_Nb)) == row[:K]
end
yield()
end
end;

Expand All @@ -91,11 +95,12 @@ what_table_41_42 = CSV.read(
mat = dropzeros(sparse(original_mat))
ag = adjacency_graph(mat)
bg = bipartite_graph(mat)
@test length(ag) == row[:V]
@test nnz(ag) ÷ 2 == row[:E]
@test nb_vertices(ag) == row[:V]
@test nb_edges(ag) ÷ 2 == row[:E]
@test maximum_degree(ag) == row[]
@test minimum_degree(ag) == row[]
color_N, _ = star_coloring(ag, NaturalOrder())
@test_skip row[:KS1] <= length(unique(color_N)) <= row[:KS2] # TODO: find better
yield()
end
end;

2 comments on commit 939c9df

@amontoison
Copy link
Collaborator

@amontoison amontoison commented on 939c9df Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gdalle I added 6 comments.

Update: I see why you need g.n, g.m and nnz(g) now for transpose(g) .
If we define transpose_graph(S) directly, as I did before that you modified my PR, these routines are not needed.

@gdalle
Copy link
Owner Author

@gdalle gdalle commented on 939c9df Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It felt cleaner to reuse the transpose function from Base and have it work from graph to graph, rather than from a matrix to a graph.

Please sign in to comment.