Clarify that datasets and readers are deprecated since v3
They only exist for backwards compatibility, and will be removed eventually.
tomaarsen committed Jan 21, 2025
1 parent 57395b4 commit 14b9cae
Showing 13 changed files with 125 additions and 0 deletions.
11 changes: 11 additions & 0 deletions sentence_transformers/datasets/DenoisingAutoEncoderDataset.py
@@ -1,3 +1,14 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
See this script for more details on how to use the new training API:
https://github.com/UKPLab/sentence-transformers/blob/master/examples/unsupervised_learning/TSDAE/train_stsb_tsdae.py
"""

from __future__ import annotations

import numpy as np
10 changes: 10 additions & 0 deletions sentence_transformers/datasets/NoDuplicatesDataLoader.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
In particular, you can pass "no_duplicates" to `batch_sampler` in the `SentenceTransformerTrainingArguments` class.
"""

from __future__ import annotations

import math
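
The "no_duplicates" batch sampler named in the docstring above exists because losses that treat the other samples in a batch as negatives are hurt by a text that occurs twice in the same batch: the repeat acts as a false negative. The block below is a minimal pure-Python sketch of that idea, not the library's implementation. The function name `no_duplicate_batches` and the greedy strategy are assumptions made for illustration.

```python
def no_duplicate_batches(
    pairs: list[tuple[str, str]], batch_size: int
) -> list[list[tuple[str, str]]]:
    """Greedily group pairs so that no text occurs in two pairs of one batch.

    Illustrative sketch only: unlike a real sampler it does not shuffle,
    and it keeps batches that end up smaller than `batch_size`.
    """
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, leftover = [], set(), []
        for pair in remaining:
            # Accept the pair only if the batch has room and none of its
            # texts already appear in the batch.
            if len(batch) < batch_size and seen.isdisjoint(pair):
                batch.append(pair)
                seen.update(pair)
            else:
                leftover.append(pair)
        batches.append(batch)
        # Pairs that did not fit are retried in the next batch.
        remaining = leftover
    return batches


pairs = [("a", "b"), ("a", "c"), ("d", "e"), ("b", "f")]
for batch in no_duplicate_batches(pairs, batch_size=2):
    print(batch)
```

In actual v3+ training none of this is written by hand: as the docstring says, "no_duplicates" is passed to `batch_sampler` in `SentenceTransformerTrainingArguments`.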
10 changes: 10 additions & 0 deletions sentence_transformers/datasets/ParallelSentencesDataset.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations

import gzip
10 changes: 10 additions & 0 deletions sentence_transformers/datasets/SentenceLabelDataset.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
In particular, you can pass "group_by_label" to `batch_sampler` in the `SentenceTransformerTrainingArguments` class.
"""

from __future__ import annotations

import logging
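
The "group_by_label" batch sampler named in the docstring above is meant for losses that need several samples of the same label in each batch. The block below is a rough pure-Python sketch of such grouping, not the library's implementation. The function name `group_by_label_batches`, the `samples_per_label` parameter, and the choice to drop leftover samples are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_label_batches(
    labels: list[int], batch_size: int, samples_per_label: int = 2
) -> list[list[int]]:
    """Return batches of dataset indices in which every label that is
    present occurs at least `samples_per_label` times.

    Illustrative sketch only: no shuffling, and samples that cannot be
    placed in a full chunk or a full batch are dropped.
    """
    if batch_size < samples_per_label:
        raise ValueError("batch_size must be at least samples_per_label")

    # Collect the indices that belong to each label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    # Cut each label's indices into full chunks of `samples_per_label`.
    chunks = []
    for indices in by_label.values():
        for start in range(0, len(indices) - samples_per_label + 1, samples_per_label):
            chunks.append(indices[start:start + samples_per_label])

    # Concatenate a fixed number of chunks into each batch.
    chunks_per_batch = batch_size // samples_per_label
    batches = []
    for start in range(0, len(chunks) - chunks_per_batch + 1, chunks_per_batch):
        selected = chunks[start:start + chunks_per_batch]
        batches.append([index for chunk in selected for index in chunk])
    return batches


print(group_by_label_batches([0, 1, 0, 1, 2, 0], batch_size=4))
```

In actual v3+ training, as the docstring says, "group_by_label" is passed to `batch_sampler` in `SentenceTransformerTrainingArguments` instead.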
8 changes: 8 additions & 0 deletions sentence_transformers/datasets/SentencesDataset.py
@@ -1,3 +1,11 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
"""

from __future__ import annotations

from torch.utils.data import Dataset
8 changes: 8 additions & 0 deletions sentence_transformers/datasets/__init__.py
@@ -1,3 +1,11 @@
"""
This directory contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
"""

from __future__ import annotations

from .DenoisingAutoEncoderDataset import DenoisingAutoEncoderDataset
10 changes: 10 additions & 0 deletions sentence_transformers/readers/InputExample.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations
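
When migrating from lists of `InputExample` objects to a `datasets` `Dataset`, as the docstring above recommends, the main change is moving from one object per sample to one list per column. The block below is a minimal sketch of that reshaping in plain Python. `LegacyExample` is a hypothetical stand-in that mirrors only the `texts` and `label` fields of `InputExample`, and the function and column names are assumptions made for illustration.

```python
class LegacyExample:
    """Hypothetical stand-in with the `texts` and `label` fields of InputExample."""

    def __init__(self, texts, label=0):
        self.texts = texts
        self.label = label


def examples_to_columns(examples, text_columns=("sentence1", "sentence2")):
    """Turn per-sample objects into a dict that maps column name to values."""
    columns = {name: [] for name in text_columns}
    columns["label"] = []
    for example in examples:
        if len(example.texts) != len(text_columns):
            raise ValueError("every example needs one text per text column")
        for name, text in zip(text_columns, example.texts):
            columns[name].append(text)
        columns["label"].append(example.label)
    return columns


examples = [
    LegacyExample(["A man is eating.", "A man eats food."], 0.9),
    LegacyExample(["A man is eating.", "A plane takes off."], 0.1),
]
columns = examples_to_columns(examples)
print(columns)
# With the `datasets` library installed, a dict of this shape can be
# passed to `datasets.Dataset.from_dict(columns)`.
```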


10 changes: 10 additions & 0 deletions sentence_transformers/readers/LabelSentenceReader.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations

import os
10 changes: 10 additions & 0 deletions sentence_transformers/readers/NLIDataReader.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations

import gzip
10 changes: 10 additions & 0 deletions sentence_transformers/readers/PairedFilesReader.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations

import gzip
10 changes: 10 additions & 0 deletions sentence_transformers/readers/STSDataReader.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations

import csv
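
For file-based readers such as this one, the replacement suggested in the docstring above amounts to parsing the file into columns and building a `datasets` `Dataset` from them. The block below is a minimal sketch using only the standard library. The three-column tab-separated layout, the column names, the function name `sts_rows_to_columns`, and the division by a maximum score of 5 are all assumptions made for illustration, not a description of this reader's behaviour.

```python
import csv
import io


def sts_rows_to_columns(tsv_text, max_score=5.0):
    """Parse `sentence1<TAB>sentence2<TAB>score` rows into column lists.

    Scores are divided by `max_score` so they fall in the 0..1 range
    (assumed layout and scaling, for illustration only).
    """
    columns = {"sentence1": [], "sentence2": [], "score": []}
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for sentence1, sentence2, score in reader:
        columns["sentence1"].append(sentence1)
        columns["sentence2"].append(sentence2)
        columns["score"].append(float(score) / max_score)
    return columns


sample = (
    "A plane is taking off.\tAn air plane is taking off.\t5.0\n"
    "A man is playing a flute.\tA man is eating.\t2.5\n"
)
columns = sts_rows_to_columns(sample)
print(columns["score"])
# With the `datasets` library installed, a dict of this shape can be
# passed to `datasets.Dataset.from_dict(columns)`.
```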
10 changes: 10 additions & 0 deletions sentence_transformers/readers/TripletReader.py
@@ -1,3 +1,13 @@
"""
This file contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
Instead, you should create a `datasets` `Dataset` for training: https://huggingface.co/docs/datasets/create_dataset
"""

from __future__ import annotations

import csv
8 changes: 8 additions & 0 deletions sentence_transformers/readers/__init__.py
@@ -1,3 +1,11 @@
"""
This directory contains deprecated code that can only be used with the old `model.fit`-style Sentence Transformers v2.X training.
It exists for backwards compatibility with the `model.old_fit` method, but will be removed in a future version.
Nowadays, with Sentence Transformers v3+, it is recommended to use the `SentenceTransformerTrainer` class to train models.
See https://www.sbert.net/docs/sentence_transformer/training_overview.html for more information.
"""

from __future__ import annotations

from .InputExample import InputExample