COO FeaturedGraph #204
About graph networks, here is the paper I suggest reading, especially chapter 3, "Graph networks".
Yes, I know that paper, which is what the `update_edge(l::Layer, M, E) = M` interface is modeled on.
I think these are two positive changes, but they are not particularly relevant here; I can revert them if you want.
```julia
edge_index
num_nodes::Int
num_edges::Int
# ndata::Dict{String, Any} # https://github.com/FluxML/Zygote.jl/issues/717
```
Is it possible to include various kinds of features in a `FeaturedGraph`? I had thought about this before, but I gave up and turned to `MetaGraph`, which implements a similar structure to this one. Another problem then arises: we should provide a way for users to specify which node/edge/global features they want to train on. Would it be better to use `Symbol` as the key?
This requires some extra thought; I don't plan to implement it in this PR, I just wanted to write down a stub.
Both PyTorch Geometric and DGL support adding an arbitrary number of edge and node features, and we should do the same at some point, since it is very convenient to have (e.g. when you want to store node labels, training masks, etc.).
MetaGraph.jl is not a good fit. Besides being scarcely maintained, it provides maps keyed by single nodes/edges, which is not good for vectorized operations; we want to store arrays.
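A minimal sketch of what such storage could look like, keeping whole arrays keyed by `Symbol` so access stays vectorized. The struct and field names below are hypothetical, not this PR's code; the Zygote issue linked above is the reason the `ndata` field is still commented out in the diff.

```julia
# Hypothetical sketch, not this PR's implementation: named node/edge
# features stored as whole arrays keyed by Symbol.
struct MultiFeaturedGraph
    edge_index::Tuple{Vector{Int}, Vector{Int}}    # COO source/target
    num_nodes::Int
    num_edges::Int
    ndata::Dict{Symbol, AbstractArray}             # e.g. :x, :train_mask
    edata::Dict{Symbol, AbstractArray}             # e.g. :w
end

g = MultiFeaturedGraph(
        ([1, 2, 3], [2, 3, 1]), 3, 3,
        Dict{Symbol, AbstractArray}(:x => rand(Float32, 16, 3), :train_mask => trues(3)),
        Dict{Symbol, AbstractArray}(:w => rand(Float32, 3)))

g.ndata[:x]   # a (num_features × num_nodes) array, ready for batched operations
```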
```julia
#     l.σ.(l.weight * x * L̃ .+ l.bias)
# end

message(l::GCNConv, xi, xj) = xj
```
I am quite confused by `GCNConv` being implemented as a message-passing neural network. It doesn't leverage anything from the MPNN. As I said, all it needs are algebraic operations, e.g. `l.weight * x .+ l.bias`, instead of indexing by `xi` or `xj`. So I don't think `GCNConv <: MessagePassing` is good. Furthermore, people wouldn't classify `GCNConv` as a spatial-based graph convolution layer; it is a spectral-based graph convolution layer.
> I am quite confused by GCNConv being implemented as a message-passing neural network. It doesn't leverage anything from the MPNN.

It is leveraging `gather` and `scatter(+, ...)`, which basically implement a sparse matrix multiplication.
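A toy illustration of this point (not code from the PR), showing that identity messages gathered over COO edges and scattered with `+` amount to a sparse matrix product:

```julia
using NNlib: gather, scatter
using SparseArrays

s = [1, 2, 3, 1]               # source node of each edge
t = [2, 3, 1, 3]               # target node of each edge
X = rand(Float32, 4, 3)        # num_features × num_nodes

M  = gather(X, s)              # one message per edge: X[:, s[k]]
X′ = scatter(+, M, t)          # sum the messages into their target nodes

A = sparse(s, t, ones(Float32, length(s)), 3, 3)   # A[i, j] = #edges i -> j
@assert X′ ≈ X * A             # gather + scatter(+, ...) == sparse matmul
```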
> As I said, all it needs are algebraic operations, e.g. `l.weight * x .+ l.bias`, instead of indexing by `xi` or `xj`. So I don't think `GCNConv <: MessagePassing` is good. Furthermore, people wouldn't classify GCNConv as a spatial-based graph convolution layer; it is a spectral-based graph convolution layer.

Although it was introduced as a spectral-based operator, I think most people these days think about it in the general message-passing scheme. In any case, both interpretations are valid. The important thing is to do something well adapted to the underlying graph implementation. Calling `adjacency_matrix(fg)` is a huge performance hit when the underlying representation is COO or an adjacency list; we have to avoid these performance traps.

I plan to expand FeaturedGraph support for sparse adjacency matrix representations, so that we can have back the algebraic form of GCNConv. I would do that in a separate PR though, if you don't mind; this one is already quite huge.
> I think most people these days think about it in the general message-passing scheme.

Sorry, I can't accept this statement. It is a definition; please don't break definitions. If most people do, they were wrong from the beginning.

> The important thing is to do something well adapted to the underlying graph implementation.

I agree with this statement.

> Calling `adjacency_matrix(fg)` is a huge performance hit when the underlying representation is COO or an adjacency list; we have to avoid these performance traps.

Given the definition, the performance issue cannot be taken as an excuse. If the performance hit is that large, then we should find a way to resolve the conversion penalty rather than break definitions.
> I plan to expand FeaturedGraph support for sparse adjacency matrix representations.

I agree with you on this point, and this is the way I planned it a long time ago.

> I would do that in a separate PR though, if you don't mind; this one is already quite huge.

Yeah, I have the same feeling. It's huge. Let's break it down.
I think this is good to go; I don't want to overload a single PR, which is quite big already.
""" | ||
MessagePassing | ||
|
||
The abstract type from which all message passing layers are derived. | ||
|
||
Related methods are [`propagate`](@ref), [`message`](@ref), | ||
[`update`](@ref), [`update_edge`](@ref), and [`update_global`](@ref). | ||
""" | ||
abstract type MessagePassing end | ||
|
||
""" | ||
propagate(mp::MessagePassing, fg::FeaturedGraph, aggr) | ||
propagate(mp::MessagePassing, fg::FeaturedGraph, E, X, u, aggr) | ||
|
||
Perform the sequence of operation implementing the message-passing scheme | ||
and updating node, edge, and global features `X`, `E`, and `u` respectively. | ||
|
||
The computation involved is the following: | ||
|
||
```julia | ||
M = compute_batch_message(mp, fg, E, X, u) | ||
E = update_edge(mp, M, E, u) | ||
M̄ = aggregate_neighbors(mp, aggr, fg, M) | ||
X = update(mp, M̄, X, u) | ||
u = update_global(mp, E, X, u) | ||
``` |
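For illustration, here is a rough sketch of how a custom layer might hook into this scheme. The layer name, the exact argument lists of `update`, and the forward call are assumptions pieced together from the docstring and the snippets above, not the package's actual API.

```julia
# Hypothetical layer (names and exact signatures are assumptions):
# sum neighbor features, then apply a learned linear map.
struct SumConv{W} <: MessagePassing
    weight::W
end

# Per-edge message: forward the neighbor's features, as in the GCNConv
# snippet discussed earlier. (In a real package one would `import`
# `message`/`update` before adding methods to them.)
message(l::SumConv, xi, xj) = xj

# Node update after aggregation (arity assumed from the docstring).
update(l::SumConv, m̄, x) = l.weight * m̄

# Forward pass: assumed to dispatch to `propagate` with `+` aggregation.
(l::SumConv)(fg::FeaturedGraph, X) = propagate(l, fg, nothing, X, nothing, +)
```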
This is really annoying. It confuses the definition of a message-passing network with that of a graph network. We shouldn't break the definitions. By the definition in the paper, there are two functions to be defined:

- `message`: it processes information from the node's own feature and its neighbors' features, as well as the edge feature, and then gives a message as output.
- `update`: the message after aggregation is passed to the `update` function. It takes the message and the node's own feature as input and outputs a new node feature for the next layer.

Between these, an aggregation function is needed to aggregate the messages.
But for a graph network this is not the case. A graph network is a more general scheme that includes message-passing networks. The paper defines a series of APIs and the algorithm for a graph network. It contains:

- Update functions:
  - for nodes
  - for edges
  - for the global feature
- Aggregate functions:
  - for aggregating edge features into node features
  - for aggregating edge features into the global feature
  - for aggregating node features into the global feature
This architecture is flexible because the functions above are optional. They can be "turned on/off": if a definition is given by the user, it is "turned on". Thus, users can compose their own neural network by providing definitions or not.

Thus, a message-passing network is one specialization of a graph network. It is specialized as follows:

- `message` is the edge update function of the graph network.
- The aggregation in message passing is the aggregation from edge features into node features in the graph network.
- `update` is the node update function of the graph network.
I think the engineering must be done by satisfying the definitions above. If the behavior is different, we shouldn't call it by the same name: `MessagePassing` is for message-passing networks and `GraphNet` is for graph networks.
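A purely illustrative sketch of this relationship (every name below is invented for illustration, not taken from either package): a graph-network block with all components optional, and a message-passing network obtained by "turning on" only three of them.

```julia
Base.@kwdef struct GNBlock{Fe,Fv,Fu,Rev,Reu,Rvu}
    ϕe::Fe   = nothing    # edge update      (≙ MPNN `message`)
    ϕv::Fv   = nothing    # node update      (≙ MPNN `update`)
    ϕu::Fu   = nothing    # global update    (unused by an MPNN)
    ρev::Rev = nothing    # edge → node aggregation   (≙ MPNN aggregation)
    ρeu::Reu = nothing    # edge → global aggregation (unused by an MPNN)
    ρvu::Rvu = nothing    # node → global aggregation (unused by an MPNN)
end

# A message-passing network is then the graph network with only these
# three components "turned on":
mpnn = GNBlock(ϕe  = (xi, xj, e) -> xj,       # message
               ρev = +,                        # aggregate neighbor messages
               ϕv  = (x, m̄) -> x .+ m̄)        # update
```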
I'm breaking this into 2 or 3 PRs. The first one is #215.
Closing this due to the merge of yuehhua/GraphSignals.jl#54.
This is just an investigation of what having a COO implementation for the FeaturedGraph would entail.
I reimplemented FeaturedGraph inside this repo, as a subtype of LightGraphs.AbstractGraph. Tests aren't passing yet.
For the time being, this PR drops GraphSignals.jl as a dependency. If the investigation turns out to be successful, I will move the code to GraphSignals.

UPDATE:
I have done a large redesign of the library; the code is much simpler, and overall performance should be much better (especially on GPU).
- The `source`, `target` (COO) edge representation is the natural fit for message passing. Everything is handled in a very concise way by `NNlib.gather` and `NNlib.scatter`.
- `ChebConv`!
- `adjacency_matrix` and `normalized_laplacian` can be expressed in a GPU-friendly way.
- `GCNConv` now has two implementations, both GPU friendly: one based on message passing, one on multiplication by the normalized Laplacian. The second one is commented out, waiting for adjacency matrix storage support in FeaturedGraph (which is the case where it would make sense to use the Laplacian algebra instead of message passing). A rough sketch of the two forms follows the fix list below.
- We can guarantee efficient, GPU-friendly, flexible, and consistent implementations for every layer.
- Merged the `GraphNet` and `MessagePassing` types into `MessagePassing`. There seems to be no need (efficiency/flexibility/convenience) to have both.
- The `message` and `update` functions now deal with batched node/edge features coming from `gather` and `scatter`. Performance-wise, this is much better than the previous implementation relying on `mapreduce`.

Fix #185 fix #194 fix #195 fix #197 fix #200 fix #209
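As referenced above, here is a toy, self-contained illustration (not code from this PR) of why the algebraic and the message-passing forms agree, using the standard GCN normalization `Â = D̃^(-1/2) (A + I) D̃^(-1/2)` and feature matrices with one column per node:

```julia
using LinearAlgebra, SparseArrays
using NNlib: gather, scatter, relu

s, t = [1, 2, 3], [2, 3, 1]             # directed COO edges i -> j
n = 3
X = rand(Float32, 2, n)                 # num_features × num_nodes
W = rand(Float32, 4, 2); b = zeros(Float32, 4)
act = relu

# (a) algebraic form: needs a materialized normalized adjacency Â
A  = sparse(s, t, ones(Float32, length(s)), n, n)
Ã  = A + I                              # add self-loops
d  = vec(sum(Ã; dims=1))                # degrees including self-loops
D  = Diagonal(1f0 ./ sqrt.(d))
Â  = D * Ã * D
Y1 = act.(W * X * Â .+ b)

# (b) message-passing form: same result from the COO edges, no matrix built
ss, tt = vcat(s, 1:n), vcat(t, 1:n)     # edges plus self-loops
coef = 1f0 ./ sqrt.(d[ss] .* d[tt])     # per-edge normalization 1/√(d̃ᵢ d̃ⱼ)
M    = gather(X, ss) .* coef'           # normalized messages
M̄    = scatter(+, M, tt)                # sum into target nodes
Y2   = act.(W * M̄ .+ b)
@assert Y1 ≈ Y2
```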
TODO list (this PR or future ones):

- `GCNConv` when using message passing
- `gather/scatter`: handle nicely the case of isolated nodes
- `ChebConv` with message passing
- `scaled_laplacian`
- `FeaturedGraph` support for a (sparse) adjacency matrix underlying representation
- move the `FeaturedGraph` implementation back to `GraphSignals`