[DL Edition] T034: GNN based molecular property prediction #287

Merged (10 commits) on Apr 11, 2023

PaulaKramer (Collaborator) commented Dec 6, 2022

Details

  • Talktorial ID: 034
  • Title: [DL Edition] T034: GNN based molecular property prediction
  • Original authors: Paula Kramer
  • Reviewer(s): XXX
  • Date of review: DD-MM-YYYY

Content

  • One line summary: Introduction to Graph Neural Networks for Property Prediction
  • Potential labels or categories (e.g. machine learning, small molecules, online APIs): Machine learning, small molecules, graph neural networks
  • Time it took to execute (approx.): 7 min
  • I have used the talktorial template and followed the content and formatting suggestions there
  • Packages must be open-sourced and should be installable from conda-forge. If you are adding new packages to the TeachOpenCADD environment, please check if already installed packages can perform the same functionality and if not leave a sentence explaining why the new addition is needed. If the new package is not on conda-forge, please list them and their intended usage here.
    • numpy, matplotlib: Already in TeachOpenCADD
    • pytorch 1.12.1, pytorch-cluster 1.6.0, pytorch-scatter 2.1.0, pytorch-sparse 0.6.15, pyg 2.2.0 (conda-forge): I use it for implementing graph neural networks
  • Data must be publicly available, preferably accessible via a webserver or downloadable via a URL. Please list the data resources that you use and how to access them:

Content style

  • Talktorial includes cross-references to other talktorials if applicable
  • The table of contents reflects the talktorial story-line; order of #, ##, ### headers is correct
  • URLs are linked with meaningful words, instead of pasting the URL directly or linking words like here.
  • I have spell-checked the notebook
  • Images have enough resolution to be rendered with quality, without being too heavy.
  • All figures have a description
  • Markdown cell content is still in-line with code cell output (whenever results are discussed)
  • I have checked that cell outputs are not incredibly long (this applies also to DataFrames)
  • Formatting looks correct on the Sphinx render (bold, italics, figure placing)

Code style

  • Variable and function names follow snake case rules (e.g. a_variable_name vs aVariableName)
  • Spacing follows PEP8 (run Black on the code cells if needed)
  • Code lines are under 99 characters each (run black-nb -l 99)
  • Comments are useful and well placed
  • There are no unpythonic idioms like for i in range(len(list)) (see slides)
  • All 3rd party dependencies are listed at the top of the notebook
  • I have marked all code cells whose output is referenced in markdown cells with the label # NBVAL_CHECK_OUTPUT
  • I have identified potential candidates for a code refactor / useful functions
  • All import ... lines are at the top (practice part) cell, ordered by standard library / 3rd party packages / our own (teachopencadd.*)
  • I have used absolute paths instead of relative paths
    HERE = Path(_dh[-1])
    DATA = HERE / "data"
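
A minimal sketch of the two code-style items above (absolute paths and avoiding `for i in range(len(list))`). The fallback to the current working directory and the names `notebook_directory`, `atom_symbols` and `indexed_symbols` are assumptions for illustration and are not part of the talktorial template; `_dh` only exists inside IPython/Jupyter.

```python
from pathlib import Path


def notebook_directory():
    """Return the notebook's start directory, or the current directory outside Jupyter."""
    try:
        # `_dh` is IPython's directory history and is undefined in plain Python.
        return Path(_dh[-1])  # noqa: F821
    except NameError:
        # Hypothetical fallback so the same cell also runs as a script.
        return Path.cwd()


HERE = notebook_directory()
DATA = HERE / "data"

# Iterate with enumerate instead of indexing via range(len(...)).
atom_symbols = ["C", "N", "O"]
indexed_symbols = [f"{index}: {symbol}" for index, symbol in enumerate(atom_symbols)]
```

Building `DATA` from `HERE` keeps every file access independent of the directory from which the notebook server was started.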

Website

We present our talktorials on our TeachOpenCADD website (https://projects.volkamerlab.org/teachopencadd/), so we have to check as well if the Jupyter notebook renders nicely there.

  • If this PR adds a new talktorial, please follow these steps:
    • Add your talktorial to the complete list of talktorials here (at the end).
    • Add your talktorial to one or multiple of the collections here. Or propose a new collection section in your PR.
    • Add your talktorial's nblink file by running python generate_nblinks.py from within the directory teachopencadd/docs/talktorials.
    • Please compile the website following the instructions here.
  • Check the rendering of the talktorial of this PR.
  • Is your talktorial listed in the talktorial list?
  • Is your talktorial listed in the talktorial collections?
    • Add a picture for your talktorial in the collection view by following these instructions.

AndreaVolkamer changed the title from "Start branch" to "[DL Edition] T034: GNN based molecular property prediction" Dec 8, 2022
AndreaVolkamer added the "new talktorial" label Dec 8, 2022

gerritgr (Collaborator) commented Jan 30, 2023

  • GNNs should first be defined as differentiable, trainable, permutation-equivariant (or -invariant) functions. The architectures should then be introduced as specific instances of such functions.
  • The relationship between message passing as a general and powerful framework and GCN/GIN as (less powerful) instances could be clarified, as could the relationship between the aggregation/pooling function (whose input is a set) and permutation invariance.
  • Refer to T33 in the introduction.
  • d is overloaded: it denotes both the degree and the feature dimension.
  • The advantages of a GNN library should be stated (sparse matrices, graph batching); also mention www.dgl.ai.
  • A better property? ChatGPT suggests electronegativity, ionization potential, or bond angles and distances; no idea if they make sense, though.
  • Say explicitly that the pooling layer is invariant to the order of its input (the same as the aggregation function).
  • "True vs predicted value": I would say "Ground truth vs prediction".
  • Can you give an intuition for what makes GIN more powerful?
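
To make the permutation and GIN points above concrete, here is a rough pure-Python sketch; it is not the talktorial's code and does not use PyG. It assumes a simplified message-passing layer without learned weights (each node adds the sum of its neighbors' features to its own), and all function and variable names are hypothetical. It checks that the layer is permutation equivariant, that a sum readout (pooling) is permutation invariant, and that sum aggregation separates two neighborhoods that mean and max aggregation cannot tell apart, which is one common intuition for GIN's expressive power.

```python
from statistics import mean


def message_passing_layer(adjacency, features):
    """Simplified layer: new h_v = h_v + sum of h_u over neighbors u (no weights)."""
    updated = []
    for node, own in enumerate(features):
        neighbors = [u for u, connected in enumerate(adjacency[node]) if connected]
        # The sum runs over a set of neighbors, so their order does not matter.
        aggregated = [sum(features[u][k] for u in neighbors) for k in range(len(own))]
        updated.append([a + b for a, b in zip(own, aggregated)])
    return updated


def sum_readout(node_embeddings):
    """Graph-level pooling: sum each feature over all nodes."""
    return [sum(column) for column in zip(*node_embeddings)]


def permute_graph(adjacency, features, order):
    """Relabel nodes so that new node i is old node order[i]."""
    new_adjacency = [[adjacency[a][b] for b in order] for a in order]
    new_features = [features[a] for a in order]
    return new_adjacency, new_features


# Path graph 0-1-2-3 with integer node features (exact arithmetic).
adjacency = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
features = [[1, 0], [0, 2], [3, 1], [2, 2]]
order = [2, 0, 3, 1]

out = message_passing_layer(adjacency, features)
adjacency_perm, features_perm = permute_graph(adjacency, features, order)
out_perm = message_passing_layer(adjacency_perm, features_perm)

# Equivariance: permuting the input permutes the node outputs the same way.
equivariant = out_perm == [out[a] for a in order]
# Invariance: the pooled graph representation ignores the node order.
invariant = sum_readout(out_perm) == sum_readout(out)

# Two different neighborhoods (multisets of neighbor features).
neighborhood_a = [1, 1]
neighborhood_b = [1]
mean_collides = mean(neighborhood_a) == mean(neighborhood_b)
max_collides = max(neighborhood_a) == max(neighborhood_b)
sum_separates = sum(neighborhood_a) != sum(neighborhood_b)
```

In this toy case the mean and max aggregators map both neighborhoods to the same value, while the sum keeps them apart; a real GIN layer additionally passes the summed features through an MLP.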

@gerritgr gerritgr merged commit f7542d6 into DL_edition Apr 11, 2023
@mbackenkoehler mbackenkoehler deleted the pk-034-gnns branch January 29, 2024 10:02