Merge pull request #94 from maxxxzdn/master

add NeurIPS 2023 blog
AMLab-Amsterdam · Dec 1, 2023 · 2d75ff1 · 2d75ff1
2 parents 227ff35 + a1b77a5
commit 2d75ff1
Show file tree

Hide file tree

Showing 6 changed files with 223 additions and 0 deletions.
diff --git a/_posts/2023-11-20-neurips.md b/_posts/2023-11-20-neurips.md
@@ -0,0 +1,223 @@
+---
+layout: post
+title: AMLab Papers at NeurIPS 2023
+description: |
+    Read here about the 12 papers that AMLab members will be presenting at NeurIPS this year. 
+author: Max Zhdanov
+date: 2023-11-20 8:00:00+0100
+---
+
+<div style="background-color: rgba(82, 67, 100, 0.1); padding: 10px 0; text-align: center;">
+    <img src="https://neurips.cc/media/Press/NeurIPS_logo.svg" alt="NeurIPS" width="240" height="120" style="margin: auto;">
+</div>
+
+
+<hr style="margin-top: 20px;">
+
+<table>
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">Clifford Group Equivariant Neural Networks</strong><br>
+      <em>David Ruhe, Johannes Brandstetter, Patrick Forré</em><br>
+
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+        <a href="https://openreview.net/forum?id=n84bzMrGUD&referrer=%5BAuthor%20Console%5D(%2Fgroup%5DNeurIPS.cc%2F2023%2FConference%2FAuthors%23your-submissions)" style="text-decoration: none; color: #007bff;">Conference paper</a> &middot; 
+        <a href="https://openreview.net/pdf?id=n84bzMrGUD" style="text-decoration: none; color: #007bff;">PDF</a> &middot; 
+        <a href="https://github.com/DavidRuhe/clifford-group-equivariant-neural-networks" style="text-decoration: none; color: #007bff;">GitHub repo</a>
+      </span>
+
+      <strong>Abstract:</strong> We introduce Clifford Group Equivariant Neural Networks: a novel approach for constructing O(n)- and E(n)-equivariant models. We identify and study the Clifford group: a subgroup inside the Clifford algebra tailored to achieve several favorable properties. Primarily, the group's action forms an orthogonal automorphism that extends beyond the typical vector space to the entire Clifford algebra while respecting the multivector grading. This leads to several non-equivalent subrepresentations corresponding to the multivector decomposition. Furthermore, we prove that the action respects not just the vector space structure of the Clifford algebra but also its multiplicative structure, i.e., the geometric product. These findings imply that every polynomial in multivectors, including their grade projections, constitutes an equivariant map with respect to the Clifford group, allowing us to parameterize equivariant neural network layers. An advantage worth mentioning is that we obtain expressive layers that can elegantly generalize to inner-product spaces of any dimension. We demonstrate, notably from a single core implementation, state-of-the-art performance on several distinct tasks, including a three-dimensional N-body experiment, a four-dimensional Lorentz-equivariant high-energy physics experiment, and a five-dimensional convex hull experiment. <br>
+
+      <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+        #GeometricAlgebra #Equivariance #RepresentationTheory
+      </span>
+
+    </td>
+    <td style="position: relative; width: 50%;">
+      <img src="/assets/neurips2023/clifford.png" alt="Graphical Abstract" style="height: auto; max-height: 90%; width: auto; max-width: 90%;  margin-left: 5%;">
+      <span style="position: absolute; top: 14%; right: 0; background-color: #ffcc00; color: black; padding: 5px; font-size: 20px; transform: rotate(35deg); transform-origin: top right;">
+        Oral Presentation
+      </span>
+    </td>
+  </tr>
+</table>
+
+<hr style="margin-top: 20px;">
+
+<table>
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">Rotating Features for Object Discovery</strong><br>
+      <em>Sindy Löwe, Phillip Lippe, Francesco Locatello, Max Welling</em><br>
+
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+        <a href="https://arxiv.org/abs/2306.00600" style="text-decoration: none; color: #007bff;">Paper</a> &middot; 
+        <a href="https://arxiv.org/pdf/2306.00600.pdf" style="text-decoration: none; color: #007bff;">PDF</a> &middot; 
+        <a href="https://github.com/loeweX/RotatingFeatures" style="text-decoration: none; color: #007bff;">GitHub repo</a>
+      </span>
+
+      <strong>Abstract:</strong> The binding problem in human cognition, concerning how the brain represents and connects objects within a fixed network of neural connections, remains a subject of intense debate. Most machine learning efforts addressing this issue in an unsupervised setting have focused on slot-based methods, which may be limiting due to their discrete nature and difficulty to express uncertainty. Recently, the Complex AutoEncoder was proposed as an alternative that learns continuous and distributed object-centric representations. However, it is only applicable to simple toy data. In this paper, we present Rotating Features, a generalization of complex-valued features to higher dimensions, and a new evaluation procedure for extracting objects from distributed representations. Additionally, we show the applicability of our approach to pre-trained features. Together, these advancements enable us to scale distributed object-centric representations from simple toy to real-world data. We believe this work advances a new paradigm for addressing the binding problem in machine learning and has the potential to inspire further innovation in the field. <br>
+
+      <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+        #ObjectDiscovery #RepresentationLearning #ComplexAutoEncoder
+      </span>
+
+    </td>
+    <td style="position: relative; width: 50%;">
+      <img src="https://github.com/loeweX/RotatingFeatures/blob/main/imgs/RotatingFeatures.png?raw=true" alt="Graphical Abstract" style="height: auto; max-height: 90%; width: auto; max-width: 90%; margin-left: 5%;">
+      <span style="position: absolute; top: 14%; right: 0; background-color: #ffcc00; color: black; padding: 5px; font-size: 20px; transform: rotate(35deg); transform-origin: top right;">
+        Oral Presentation
+      </span>
+    </td>
+  </tr>
+</table>
+
+<hr style="margin-top: 20px;">
+
+<table>
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">Deep Gaussian Markov Random Fields for graph-structured dynamical systems</strong><br>
+      <em>Fiona Lippert, Bart Kranstauber, E. Emiel van Loon, Patrick Forré</em><br>
+
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+        <a href="https://arxiv.org/abs/2306.08445" style="text-decoration: none; color: #007bff;">Conference paper</a> &middot; 
+        <a href="https://arxiv.org/pdf/2306.08445.pdf" style="text-decoration: none; color: #007bff;">PDF</a>
+      </span>
+
+      <strong>Abstract:</strong> Probabilistic inference in high-dimensional state-space models is computationally challenging. For many spatiotemporal systems, however, prior knowledge about the dependency structure of state variables is available. We leverage this structure to develop a computationally efficient approach to state estimation and learning in graph-structured state-space models with (partially) unknown dynamics and limited historical data. Building on recent methods that combine ideas from deep learning with principled inference in Gaussian Markov random fields (GMRF), we reformulate graph-structured state-space models as Deep GMRFs defined by simple spatial and temporal graph layers. This results in a flexible spatiotemporal prior that can be learned efficiently from a single time sequence via variational inference. Under linear Gaussian assumptions, we retain a closed-form posterior, which can be sampled efficiently using the conjugate gradient method, scaling favourably compared to classical Kalman filter based approaches. <br>
+
+      <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+        #GaussianMarkovRandomFields #SpatiotemporalModels #VariationalInference
+      </span>
+
+    <td style="width: 50%;">
+      <!-- Image placeholder since no image link is provided -->
+      <img src="/assets/neurips2023/ST-DGMRF_graphical_abstract.png" alt="Graphical Abstract" style="height: auto; max-height: 80%; width: auto; max-width: 80%;  margin-left: 20%;">
+    </td>
+    </td>
+  </tr>
+</table>
+
+<hr style="margin-top: 20px;">
+
+<table>
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">Implicit Convolutional Kernels for Steerable CNNs</strong><br>
+      <em>Maksim Zhdanov, Nico Hoffmann, Gabriele Cesa</em><br>
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+    <a href="https://openreview.net/forum?id=2YtdxqvdjX" style="text-decoration: none; color: #007bff;">Conference paper</a> &middot; 
+    <a href="https://openreview.net/pdf?id=2YtdxqvdjX" style="text-decoration: none; color: #007bff;">PDF</a> &middot; 
+    <a href="https://github.com/maxxxzdn/implicit-steerable-cnns" style="text-decoration: none; color: #007bff;">GitHub repo</a>
+  </span>
+
+  <strong>Abstract:</strong> Steerable convolutional neural networks (CNNs) provide a general framework for building neural networks equivariant to translations and other transformations belonging to an origin-preserving group \(G\), such as reflections and rotations. They rely on standard convolutions with \(G\)-steerable kernels obtained by analytically solving the group-specific equivariance constraint imposed onto the kernel space. As the solution is tailored to a particular group \(G\), the implementation of a kernel basis does not generalize to other symmetry transformations, which complicates the development of general group equivariant models. We propose using implicit neural representation via multi-layer perceptrons (MLPs) to parameterize \(G\)-steerable kernels. The resulting framework offers a simple and flexible way to implement Steerable CNNs and generalizes to any group \(G\) for which a \(G\)-equivariant MLP can be built. We prove the effectiveness of our method on multiple tasks, including N-body simulations, point cloud classification, and molecular property prediction. <br>
+
+  <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+    #SteerableCNNs #Equivariance #MolecularSimulations
+  </span>
+
+</td>
+    <td style="width: 50%;">
+      <img src="https://github.com/maxxxzdn/implicit-steerable-kernels/blob/main/assets/abstract.png?raw=true" alt="Graphical Abstract" style="height: auto; max-height: 60%; width: auto; max-width: 60%; margin-left: 20%;">
+    </td>
+
+  </tr>
+</table>
+<hr style="margin-top: 20px;">
+
+<table>
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">The Memory-Perturbation Equation: Understanding Model's Sensitivity to Data</strong><br>
+      <em>Peter Nickl, Lu Xu*, Dharmesh Tailor*, Thomas Möllenhoff, Emtiyaz Khan</em><br><em>(*=equal contribution)</em><br>
+
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+        <a href="https://openreview.net/forum?id=dqS1GuoG2V" style="text-decoration: none; color: #007bff;">Conference paper</a> &middot; 
+        <a href="https://arxiv.org/abs/2310.19273" style="text-decoration: none; color: #007bff;">PDF</a> &middot; 
+        <a href="https://github.com/team-approx-bayes/memory-perturbation" style="text-decoration: none; color: #007bff;">GitHub repo</a>
+      </span>
+
+      <strong>Abstract:</strong> Understanding model's sensitivity to its training data is crucial but can also be challenging and costly, especially during training. To simplify such issues, we present the Memory-Perturbation Equation (MPE) which relates model's sensitivity to perturbation in its training data. Derived using Bayesian principles, the MPE unifies existing sensitivity measures, generalizes them to a wide-variety of models and algorithms, and unravels useful properties regarding sensitivities. Our empirical results show that sensitivity estimates obtained during training can be used to faithfully predict generalization on unseen test data. The proposed equation is expected to be useful for future research on robust and adaptive learning. <br>
+
+      <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+        #MemoryPerturbationEquation #Sensitivity #Generalization
+      </span>
+
+    </td>
+    <td style="width: 50%;">
+      <!-- Image placeholder since no image link is provided -->
+      <img src="/assets/neurips2023/mpe_blog_image.png" alt="Graphical Abstract" style="height: auto; max-height: 60%; width: auto; max-width: 60%;  margin-left: 20%;">
+    </td>
+  </tr>
+</table>
+
+
+<hr style="margin-top: 20px;">
+
+<table>
+
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">Topological Obstructions and How to Avoid Them</strong><br>
+      <em>Babak Esmaeili, Robin Walters, Heiko Zimmermann, Jan-Willem van de Meent</em><br>
+
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+        <a href="https://openreview.net/forum?id=1tviRBNxI9" style="text-decoration: none; color: #007bff;">Conference paper</a> &middot; 
+        <a href="https://openreview.net/pdf?id=1tviRBNxI9" style="text-decoration: none; color: #007bff;">PDF</a> &middot; 
+        <!-- GitHub link not provided -->
+      </span>
+
+      <strong>Abstract:</strong> Incorporating geometric inductive biases into models can aid interpretability and generalization, but encoding to a specific geometric structure can be challenging due to the imposed topological constraints. In this paper, we theoretically and empirically characterize obstructions to training encoders with geometric latent spaces. We show that local optima can arise due to singularities (e.g. self-intersection) or due to an incorrect degree or winding number. We then discuss how normalizing flows can potentially circumvent these obstructions by defining multimodal variational distributions. Inspired by this observation, we propose a new flow-based model that maps data points to multimodal distributions over geometric spaces and empirically evaluate our model on 2 domains. We observe improved stability during training and a higher chance of converging to a homeomorphic encoder. <br>
+
+      <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+        #GeometricInductiveBiases #DataTopology #NormalizingFlows
+      </span>
+
+    </td>
+    <td style="width: 50%;">
+      <!-- Image placeholder since no image link is provided -->
+      <img src="/assets/neurips2023/top_obst.jpeg" alt="Graphical Abstract" style="height: auto; max-height: 90%; width: auto; max-width: 90%;  margin-left: 5%;">
+    </td>
+  </tr>
+</table>
+
+<hr style="margin-top: 20px;">
+
+<table>
+  <tr>
+    <td style="width: 50%;">
+      <strong style="font-size: 20px;">Learning Dynamic Attribute-factored World Models for Efficient Multi-object Reinforcement Learning</strong><br>
+      <em>Fan Feng, Sara Magliacane</em><br>
+
+      <span style="font-size: 14px; display: block; margin-bottom: 10px;">
+        <a href="https://arxiv.org/abs/2307.09205" style="text-decoration: none; color: #007bff;">Paper</a> &middot; 
+        <a href="https://arxiv.org/pdf/2307.09205.pdf" style="text-decoration: none; color: #007bff;">PDF</a> &middot; 
+        <!-- GitHub repo to be updated later -->
+      </span>
+
+      <strong>Abstract:</strong> In many reinforcement learning tasks, the agent has to learn to interact with many objects of different types and generalize to unseen combinations and numbers of objects. Often a task is a composition of previously learned tasks (e.g. block stacking). These are examples of compositional generalization, in which we compose object-centric representations to solve complex tasks. Recent works have shown the benefits of object-factored representations and hierarchical abstractions for improving sample efficiency in these settings. In this paper, we introduce the Dynamic Attribute FacTored RL (DAFT-RL) framework. In DAFT-RL, we leverage object-centric representation learning to extract objects from visual inputs, classify them in classes, and infer their latent parameters. For each class of object, we learn a class template graph that describes the dynamics and reward of an object of this class factorized according to its attributes. We also learn an interaction pattern graph that describes how objects of different classes interact with each other at the attribute level. Through these graphs and a dynamic interaction graph that models the interactions between objects, we learn a policy that can then be directly applied in a new environment by just estimating the interactions and latent parameters. We evaluate DAFT-RL in three benchmark datasets and show our framework outperforms the state-of-the-art in generalizing across unseen objects with varying attributes and latent parameters, as well as in the composition of previously learned tasks. <br>
+
+      <span style="font-size: 14px; font-weight: bold; color: #6c757d;">
+        #ReinforcementLearning #ObjectCentricRepresentations
+      </span>
+
+    </td>
+    <td style="width: 50%;">
+      <!-- Image placeholder since no image link is provided -->
+      <img src="/assets/neurips2023/dyn_attr.png" alt="Graphical Abstract" style="height: auto; max-height: 100%; width: auto; max-width: 100%;  margin-left: 5%;">
+    </td>
+  </tr>
+</table>
+
+<hr style="margin-top: 20px;">
+
+
+
+
+
+
+
+
+
diff --git a/assets/neurips2023/ST-DGMRF_graphical_abstract.png b/assets/neurips2023/ST-DGMRF_graphical_abstract.png
diff --git a/assets/neurips2023/clifford.png b/assets/neurips2023/clifford.png
diff --git a/assets/neurips2023/dyn_attr.png b/assets/neurips2023/dyn_attr.png
diff --git a/assets/neurips2023/mpe_blog_image.png b/assets/neurips2023/mpe_blog_image.png
diff --git a/assets/neurips2023/top_obst.jpeg b/assets/neurips2023/top_obst.jpeg