-
Notifications
You must be signed in to change notification settings - Fork 310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create cugraph-equivariant
package
#4036
Create cugraph-equivariant
package
#4036
Conversation
@stadlmax I've updated the wrapper based on the new fused_tp kernels. Can you take a look? Tests won't pass until we have the new kernels reflected in pylibcugraphops nightly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem if mlp_fast_first_layer=False
but src_scalars
and dst_scalars
are provided.
(optional) It would be nice if we can remove that mlp_fast_first_layer
and simply do the thing when src_scalars
and dst_scalars
are provided.
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Outdated
Show resolved
Hide resolved
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Outdated
Show resolved
Hide resolved
batch_norm: bool = True, | ||
mlp_channels: Optional[Sequence[int]] = None, | ||
mlp_activation: Optional[Callable[..., nn.Module]] = nn.GELU, | ||
mlp_fast_first_layer: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this argument need to be here? Can't the special just happen when src_scalars
and dst_scalars
are provided?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Mario here. There are a few branches when it comes to computing the tp weights:
- user direct input the precomputed weights to forward
- feed only the edge_emb to the mlp to compute the weight. This could include the case that the node embeddings are already concatenated into the edge_emb, which we can't distinguish
- As in point 2 but additionally have one node embedding tensor, i.e., the graph is non-directed
- As in point 2 but additionally have separate src and dst node embedding tensors
Edit: we already handled cases 1, 2 and 4, despite some argument against the need for mlp_fast_first_layer
in the init(). Do we want to handle case 3 explicitly in the API? i.e., if only src_scalars but not dst_scalars is given, do we want to assume the dst embedding are also index from src_scalars or do we always want the user to supply the same tensor to both src_scalars=node_attr, dst_scalars=node_attr
? I guess it can't hurt to do the latter so it's up to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the code to support arbitrary number of scalar features in [src, dst]_scalars. They can also be None
if needed.
Regarding use case 3, users should input src_scalars=node_attr, dst_scalars=node_attr
.
|
||
from .tensor_product_conv import FullyConnectedTensorProductConv | ||
|
||
DiffDockTensorProductConv = FullyConnectedTensorProductConv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this alias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from one of our discussions on Monday, but it's totally optional. @mariogeiger Do you think we need the alias?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for me, the name FullyConnectedTensorProductConv
is perfect by its own
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Outdated
Show resolved
Hide resolved
\sum_{b \in \mathcal{N}_a} Y\left(\hat{r}_{a b}\right) | ||
\otimes_{\psi_{a b}} \mathbf{h}_b | ||
|
||
where the path weights :math:`\psi_{a b}` are from user input to the forward() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Diffdock paper, 1) \psi is itself the MLP for computing the weights from the edge and node embeddings. 2) This implementation has the option to either directly input the weights to the forward() or compute the weights using the MLP (\psi) from the edge and/or the node embeddings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\psi denotes the weights while \Psi is the MLP. I have added a brief note to show the two options here
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Show resolved
Hide resolved
mlp = [] | ||
for i in range(len(dims) - 1): | ||
mlp.append(nn.Linear(dims[i], dims[i + 1])) | ||
if mlp_activation is not None and i != len(dims) - 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would mlp_activation
be None
?
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Show resolved
Hide resolved
@mariogeiger @DejunL I will also remove the |
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Show resolved
Hide resolved
python/cugraph-equivariant/cugraph_equivariant/nn/tensor_product_conv.py
Outdated
Show resolved
Hide resolved
in_irreps: o3.Irreps, | ||
sh_irreps: o3.Irreps, | ||
out_irreps: o3.Irreps, | ||
batch_norm: bool = True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Diffdock training code, we have customized batch_norm function. Instead of being a boolean option, can this take a callable that defaults to e3nn.nn.BatchNorm
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Will make the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't come up with a neat way to support this because if we change the argument to be a callable (with the default value =Batchnorm
), the customized batch_norm function must have the same function signature as e3nn.nn.Batchnorm. With that, I would suggest applying customized function outside of the conv layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both the e3nn.nn.BatchNorm and the modified version we used are nn.module so I should have suggested change the type of the batch_norm
argument to nn.module
. But then it becomes a moot point since we can apply the BatchNorm outside of the conv layer like you said. Let me double check with Guoqing, who authored the modified version of BatchNorm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just confirmed with Guoqing that the current API is OK. So it's up to you if you want to keep it as it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it make more sense to just not have BN in the API then?
Yes, for now in the Diffdock model, edge_emb and src_scalars are really scalars. But in general, we don't want to constrain them to be scalar or of same dimensionality. Thank you! |
cugraph-equivariant
packagecugraph-equivariant
package
--channel pytorch \ | ||
--channel nvidia \ | ||
cugraph-equivariant | ||
pip install e3nn==0.5.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, but would this be better off as a dependency in the py_test_cugraph_equivariant
section of the dependencies.yaml
file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally yes, but e3nn
depends on PyTorch. Having that in pyproject.toml might pull wrong versions of pytorch for users. cugraph-dgl and -pyg's pyproject.toml does not have pytorch either (I guess for the same reason).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tingyu66 do you want me to remove pytorch from the dependencies of e3nn?
python/cugraph-equivariant/cugraph_equivariant/tests/test_tensor_product_conv.py
Outdated
Show resolved
Hide resolved
…or_product_conv.py Co-authored-by: Mario Geiger <[email protected]>
…ct_conv.py Co-authored-by: Mario Geiger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
besides the minor comment w.r.t. having BN in the API, no further comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, looks good except that BN that I'm not super super happy about
/merge |
Bring up
cugraph-equivariant
package and add TensorProduct conv layers.