Merge pull request #198 from basf/master

PyPi release v1.1.0
basf · Jan 3, 2025 · 90f3547
2 parents 46cae81 + 648d029
Showing 130 changed files with 5,519 additions and 5,778 deletions.
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -25,4 +25,4 @@ If applicable, add screenshots to help explain your problem.
  - Mambular Version [e.g. 0.1.2]
 
 **Additional context**
-Add any other context about the problem here.
+Add any other context about the problem here.
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
@@ -1 +1 @@
-blank_issues_enabled: false
+blank_issues_enabled: false
diff --git a/.github/ISSUE_TEMPLATE/doc_request.md b/.github/ISSUE_TEMPLATE/doc_request.md
@@ -8,4 +8,4 @@ assignees: ''
 ---
 
 **Description of the question**
-A clear and concise description of what should be documented.
+A clear and concise description of what should be documented.
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -17,4 +17,4 @@ A clear and concise description of what you want to happen.
 A clear and concise description of any alternative solutions or features you've considered.
 
 **Additional context**
-Add any other context or screenshots about the feature request here.
+Add any other context or screenshots about the feature request here.
diff --git a/.github/ISSUE_TEMPLATE/question.md b/.github/ISSUE_TEMPLATE/question.md
@@ -14,4 +14,4 @@ Gives some context if needed (environment, system, hardware).
 A clear and concise description of what the task is.
 
 **Describe the solution you'd like**
-A clear and concise description of what you want to happen.
+A clear and concise description of what you want to happen.
diff --git a/.github/workflows/build-publish-pypi.yml b/.github/workflows/build-publish-pypi.yml
@@ -6,7 +6,7 @@ on:
       - release
 
 jobs:
-  publish:
+  build-publish:
     runs-on: ubuntu-latest
 
     steps:
@@ -18,15 +18,19 @@ jobs:
         with:
           python-version: "3.8"
 
-      - name: Install dependencies
+      - name: Install Poetry
         run: |
-          python -m pip install --upgrade pip
-          pip install setuptools wheel twine
+          curl -sSL https://install.python-poetry.org | python3 -
+          echo "$HOME/.local/bin" >> "$GITHUB_PATH"
+
+      - name: Install dependencies
+        run: poetry install
 
-      - name: Build and publish package
+      - name: Build package
+        run: poetry build
+
+      - name: Publish to PyPI
         env:
           TWINE_USERNAME: __token__
           TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
-        run: |
-          python setup.py sdist bdist_wheel
-          twine upload dist/*
+        run: poetry publish --username $TWINE_USERNAME --password $TWINE_PASSWORD
diff --git a/.gitignore b/.gitignore
@@ -172,4 +172,5 @@ examples/lightning_logs
 docs/_build/doctrees/*
 docs/_build/html/*
 
+
 dev/*
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -0,0 +1,31 @@
+exclude: "^$"
+fail_fast: false
+default_stages: [commit, push]
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.5.0
+    hooks:
+      - id: check-case-conflict
+      - id: check-merge-conflict
+      - id: end-of-file-fixer
+      - id: mixed-line-ending
+      - id: trailing-whitespace
+        args: [--markdown-linebreak-ext=md]
+
+  - repo: https://github.com/charliermarsh/ruff-pre-commit
+    rev: v0.1.14
+    hooks:
+      - id: ruff-format
+        types_or: [python, pyi, jupyter]
+      - id: ruff
+        types_or: [python, pyi, jupyter]
+        args: [ --fix, --exit-non-zero-on-fix ]
+
+  - repo: https://github.com/pre-commit/mirrors-prettier
+    rev: v4.0.0-alpha.8
+    hooks:
+      - id: prettier
+        types:
+          - yaml
+          - markdown
+          - json
diff --git a/.vscode/settings.json b/.vscode/settings.json
@@ -0,0 +1,10 @@
+{
+    "editor.formatOnSave": true,
+    "editor.codeActionsOnSave": {
+        "source.organizeImports": "explicit",
+        "source.fixAll": "explicit"
+    },
+    "[python]": {
+        "editor.defaultFormatter": "charliermarsh.ruff"
+    }
+}
diff --git a/docs/codeofconduct.md b/CODE_OF_CONDUCT.md
@@ -1,6 +1,7 @@
+
 # Code of Conduct
 
-- **Purpose**:  The purpose of this Code of Conduct is to establish a welcoming and inclusive community around the `Mambular` project. We want to foster an environment where everyone feels respected, valued, and able to contribute to the project.
+- **Purpose**:  The purpose of this Code of Conduct is to establish a welcoming and inclusive community around the `STREAM` project. We want to foster an environment where everyone feels respected, valued, and able to contribute to the project.
 
 - **Openness and Respect**: We strive to create an open and respectful community where everyone can freely express their opinions and ideas. We encourage constructive discussions and debates, but we will not tolerate any form of harassment, discrimination, or disrespectful behavior.
 

diff --git a/LICENSE b/LICENSE
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+SOFTWARE.
diff --git a/README.md b/README.md
@@ -19,7 +19,7 @@
     <h1>Mambular: Tabular Deep Learning Made Simple</h1>
 </div>
 
-Mambular is a Python library for tabular deep learning. It includes models that leverage the Mamba (State Space Model) architecture, as well as other popular models like TabTransformer, FTTransformer, TabM and tabular ResNets. Check out our paper `Mambular: A Sequential Model for Tabular Deep Learning`, available [here](https://arxiv.org/abs/2408.06291). Also check out our paper introducing [TabulaRNN](https://arxiv.org/pdf/2411.17207) and analyzing the efficiency of NLP inspired tabular models. 
+Mambular is a Python library for tabular deep learning. It includes models that leverage the Mamba (State Space Model) architecture, as well as other popular models like TabTransformer, FTTransformer, TabM and tabular ResNets. Check out our paper `Mambular: A Sequential Model for Tabular Deep Learning`, available [here](https://arxiv.org/abs/2408.06291). Also check out our paper introducing [TabulaRNN](https://arxiv.org/pdf/2411.17207) and analyzing the efficiency of NLP inspired tabular models.
 
 <h3> Table of Contents </h3>
 
@@ -66,6 +66,7 @@ Mambular is a Python package that brings the power of advanced deep learning arc
 | `TabulaRNN`      | A Recurrent Neural Network for Tabular data, introduced [here](https://arxiv.org/pdf/2411.17207).                                                   |
 | `MambAttention`  | A combination between Mamba and Transformers, also introduced [here](https://arxiv.org/pdf/2411.17207).                                             |
 | `NDTF`           | A neural decision forest using soft decision trees. See [Kontschieder et al.](https://openaccess.thecvf.com/content_iccv_2015/html/Kontschieder_Deep_Neural_Decision_ICCV_2015_paper.html) for inspiration. |
+| `SAINT`          | Improves neural networks via Row Attention and Contrastive Pre-Training, introduced [here](https://arxiv.org/pdf/2106.01342).                                             |
 
 
 
@@ -90,7 +91,7 @@ If you want to use the original mamba and mamba2 implementations, additionally i
 pip install mamba-ssm
 ```
 
-Be careful to use the correct torch and cuda versions: 
+Be careful to use the correct torch and cuda versions:
 
 ```sh
 pip install torch==2.0.0+cu118 torchvision==0.15.0+cu118 torchaudio==2.0.0+cu118 -f https://download.pytorch.org/whl/cu118/torch_stable.html
@@ -115,7 +116,7 @@ Mambular simplifies data preprocessing with a range of tools designed for easy t
 - **Polynomial Features**: Automatically generates polynomial and interaction terms for numerical features, enhancing the ability to capture higher-order relationships.  
 - **Box-Cox & Yeo-Johnson Transformations**: Performs power transformations to stabilize variance and normalize distributions.  
 - **Custom Binning**: Enables user-defined bin edges for precise discretization of numerical data.  
- 
+
 
 
 
@@ -147,15 +148,15 @@ preds = model.predict_proba(X)
 ```
 
 <h3> Hyperparameter Optimization</h3>
-Since all of the models are sklearn base estimators, you can use the built-in hyperparameter optimization from sklearn. 
+Since all of the models are sklearn base estimators, you can use the built-in hyperparameter optimization from sklearn.
 
 ```python
 from sklearn.model_selection import RandomizedSearchCV
 
 param_dist = {
-    'd_model': randint(32, 128),   
-    'n_layers': randint(2, 10),   
-    'lr': uniform(1e-5, 1e-3) 
+    'd_model': randint(32, 128),  
+    'n_layers': randint(2, 10),  
+    'lr': uniform(1e-5, 1e-3)
 }
 
 random_search = RandomizedSearchCV(
@@ -179,10 +180,10 @@ print("Best Score:", random_search.best_score_)
 Note that you can also optimize the preprocessing this way. Just use the prefix ``prepro__`` when specifying the preprocessor arguments you want to optimize:
 ```python
 param_dist = {
-    'd_model': randint(32, 128),   
-    'n_layers': randint(2, 10),   
+    'd_model': randint(32, 128),  
+    'n_layers': randint(2, 10),  
     'lr': uniform(1e-5, 1e-3),
-    "prepro__numerical_preprocessing": ["ple", "standardization", "box-cox"] 
+    "prepro__numerical_preprocessing": ["ple", "standardization", "box-cox"]
 }
 
 ```
@@ -239,16 +240,16 @@ model = MambularLSS(
     dropout=0.2,
     d_model=64,
     n_layers=8,
- 
+
 )
 
 # Fit the model to your data
 model.fit(
-    X, 
-    y, 
-    max_epochs=150, 
-    lr=1e-04, 
-    patience=10,     
+    X,
+    y,
+    max_epochs=150,
+    lr=1e-04,
+    patience=10,  
     family="normal" # define your distribution
     )
 
@@ -305,7 +306,7 @@ Here's how you can implement a custom model with Mambular:
        def forward(self, num_features, cat_features):
            x = num_features + cat_features
            x = torch.cat(x, dim=1)
-           
+
            # Pass through linear layer
            output = self.linear(x)
            return output

diff --git a/docs/api/base_models/BaseModels.rst b/docs/api/base_models/BaseModels.rst
@@ -48,3 +48,7 @@ mambular.base_models
 .. autoclass:: mambular.base_models.NDTF
     :members:
     :no-inherited-members:
+
+.. autoclass:: mambular.base_models.SAINT
+    :members:
+    :no-inherited-members:
diff --git a/docs/api/base_models/index.rst b/docs/api/base_models/index.rst
@@ -22,6 +22,7 @@ Modules                                       Description
 :class:`NDTF`                                 Neural Decision Tree Forest (NDTF) model for tabular tasks, blending decision tree concepts with neural networks.
 :class:`TabulaRNN`                            Recurrent neural network (RNN) model, including LSTM and GRU architectures, tailored for sequential or time-series tabular data.
 :class:`MambAttention`                        Attention-based architecture for tabular tasks, combining feature importance weighting with advanced normalization techniques.
+:class:`SAINT`                                SAINT model. Transformer-based model using row and column attention.
 =========================================    =======================================================================================================
 
 

diff --git a/docs/api/configs/Configurations.rst b/docs/api/configs/Configurations.rst
@@ -44,3 +44,7 @@ Configurations
 .. autoclass:: mambular.configs.DefaultTabMConfig
    :members:
    :undoc-members:
+
+.. autoclass:: mambular.configs.DefaultSAINTConfig
+   :members:
+   :undoc-members:
diff --git a/docs/api/configs/index.rst b/docs/api/configs/index.rst
@@ -95,6 +95,14 @@ Dataclass                                   Description
 :class:`DefaultTabMConfig`                  Default configuration for the TabM model (Batch-Ensembling MLP).
 =======================================    =======================================================================================================
 
+SAINT
+-----
+=======================================    =======================================================================================================
+Dataclass                                   Description
+=======================================    =======================================================================================================
+:class:`DefaultSAINTConfig`                 Default configuration for the SAINT model.
+=======================================    =======================================================================================================
+
 .. toctree::
    :maxdepth: 1
 

diff --git a/docs/api/models/Models.rst b/docs/api/models/Models.rst
@@ -5,7 +5,7 @@ mambular.models
     :members:
     :inherited-members:
 
-.. autoclass:: mambular.models.MambularRegressor 
+.. autoclass:: mambular.models.MambularRegressor
     :members:
     :inherited-members:
 
@@ -29,7 +29,7 @@ mambular.models
     :members:
     :undoc-members:
 
-.. autoclass:: mambular.models.MLPRegressor 
+.. autoclass:: mambular.models.MLPRegressor
     :members:
     :undoc-members:
 
@@ -49,7 +49,7 @@ mambular.models
     :members:
     :undoc-members:
 
-.. autoclass:: mambular.models.ResNetClassifier 
+.. autoclass:: mambular.models.ResNetClassifier
     :members:
     :undoc-members:
 
@@ -101,7 +101,7 @@ mambular.models
     :members:
     :inherited-members:
 
-.. autoclass:: mambular.models.TabMRegressor 
+.. autoclass:: mambular.models.TabMRegressor
     :members:
     :inherited-members:
 
@@ -113,7 +113,7 @@ mambular.models
     :members:
     :inherited-members:
 
-.. autoclass:: mambular.models.NODERegressor 
+.. autoclass:: mambular.models.NODERegressor
     :members:
     :inherited-members:
 
@@ -125,14 +125,26 @@ mambular.models
     :members:
     :inherited-members:
 
-.. autoclass:: mambular.models.NDTFRegressor 
+.. autoclass:: mambular.models.NDTFRegressor
     :members:
     :inherited-members:
 
 .. autoclass:: mambular.models.NDTFLSS
     :members:
     :undoc-members:
 
+.. autoclass:: mambular.models.SAINTClassifier
+    :members:
+    :inherited-members:
+
+.. autoclass:: mambular.models.SAINTRegressor
+    :members:
+    :inherited-members:
+
+.. autoclass:: mambular.models.SAINTLSS
+    :members:
+    :undoc-members:
+
 .. autoclass:: mambular.models.SklearnBaseClassifier
     :members:
     :undoc-members:

diff --git a/docs/api/models/index.rst b/docs/api/models/index.rst
@@ -117,6 +117,16 @@ Modules                                     Description
 :class:`NDTFLSS`                            Distributional tasks using a Neural Decision Forest.
 =======================================    =======================================================================================================
 
+SAINT
+-----
+=======================================    =======================================================================================================
+Modules                                     Description
+=======================================    =======================================================================================================
+:class:`SAINTClassifier`                    Multi-class and binary classification tasks using SAINT.
+:class:`SAINTRegressor`                     Regression tasks using SAINT.
+:class:`SAINTLSS`                           Distributional tasks using SAINT.
+=======================================    =======================================================================================================
+
 Base Classes
 ------------
 =======================================    =======================================================================================================
@@ -129,5 +139,5 @@ Modules                                     Description
 
 .. toctree::
    :maxdepth: 1
-   
+
    Models
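
The README hunk in this diff says that search parameters carrying the ``prepro__`` prefix are passed to the preprocessor rather than to the model. A minimal sketch of how such prefix routing could work is shown here. `split_params` is a hypothetical helper written for illustration only; it is not taken from Mambular, and the library's actual handling may differ.

```python
from typing import Any, Dict, Tuple

# Prefix named in the README; the routing logic below is an assumption.
PREFIX = "prepro__"


def split_params(
    params: Dict[str, Any], prefix: str = PREFIX
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate model arguments from prefixed preprocessor arguments."""
    model_args: Dict[str, Any] = {}
    prepro_args: Dict[str, Any] = {}
    for key, value in params.items():
        if key.startswith(prefix):
            # Strip the prefix so the preprocessor sees its own argument name.
            prepro_args[key[len(prefix):]] = value
        else:
            model_args[key] = value
    return model_args, prepro_args


# One candidate as a randomized search might sample it (values are illustrative).
sampled = {
    "d_model": 64,
    "n_layers": 4,
    "lr": 1e-4,
    "prepro__numerical_preprocessing": "ple",
}
model_args, prepro_args = split_params(sampled)
print(model_args)
print(prepro_args)
```

Under this assumption, a single flat parameter dictionary can drive both the model and its preprocessing, which is what lets a search object from sklearn tune them together.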