Releases: Lightning-Universe/lightning-flash
Compatibility patch
[0.8.2] - 2023-06-30
Changed
- Added GATE backbone for Tabular integrations (#1559)
Fixed
- Fixed datamodule can't load files with square brackets in names (#1501)
- Fixed channel dim selection on segmentation target (#1509)
- Fixed use of `jsonargparse`, avoiding reliance on non-public internal logic (#1620)
- Compatibility with `pytorch-tabular>=1.0` (#1545)
- Compatibility with latest `numpy` (#1595)
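The square-bracket fix (#1501) is easiest to understand with a small reproduction. The sketch below is an assumption about the underlying cause, not the code from the PR: Python's `glob` treats `[...]` as a character class, so a folder name containing brackets silently matches nothing unless it is escaped. The helper `list_files` and the folder name are hypothetical.

```python
import glob
import os
import tempfile


def list_files(folder: str, pattern: str = "*") -> list:
    """List files in ``folder`` matching ``pattern``, treating the folder name literally."""
    # glob.escape neutralises glob special characters in the folder part only,
    # so the wildcard in ``pattern`` keeps working.
    return sorted(glob.glob(os.path.join(glob.escape(folder), pattern)))


with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, "images [v2]")
    os.makedirs(folder)
    open(os.path.join(folder, "cat.png"), "w").close()

    # Unescaped: "[v2]" is parsed as "one character, either v or 2".
    naive = glob.glob(os.path.join(folder, "*"))
    escaped = list_files(folder)

naive_count, escaped_count = len(naive), len(escaped)
```

With the unescaped pattern the file is not found, while the escaped variant returns it.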
New Contributors
- @kjappelbaum made their first contribution in #1503
- @yurijmikhalevich made their first contribution in #1501
- @pirj made their first contribution in #1520
- @ArjunSharda made their first contribution in #1517
- @izikgo made their first contribution in #1509
- @manujosephv made their first contribution in #1559
- @mauvilsa made their first contribution in #1620
Full Changelog: 0.8.1.post0...0.8.2
Dependency's adjustments
What's Changed
- fixed type of 'n_gram' from bool to int in TranslationTask by @BrightXiaoHan in #1486
- pinned `torchmetrics` version for compatibility by @Borda in #1495
- pinned `sahi` to fix object detection when installing in a fresh environment by @ethanwharris in #1496
- pinned `numpy` for type compatibility by @Borda in #1504
New Contributors
- @BrightXiaoHan made their first contribution in #1486
- @kjappelbaum made their first contribution in #1503
Full Changelog: 0.8.1...0.8.1.post0
Minor compatibility patch
What's Changed
- Add CLIP backbones for text / image classification by @ethanwharris in #1458
- Replace DP/DDP/DDPSpawn plugins to strategies, keep the old for compatibility by @krshrimali in #1451
- Integration of `lightning_utilities` function into `flash` by @uakarsh in #1457
- refactored `image_classifier_head` to `classifier_head` by @Abelarm in #1464
- Raise better error if `icevision` not installed if module isn't found (loading data) by @krshrimali in #1474
- Add support for Lightning 1.8 + Fixes for the CI by @krshrimali in #1470 and #1479
- Fix compatibility with TM 0.10 by @ethanwharris in #1469
New Contributors
Full Changelog: 0.8.0...0.8.1
TPU Support, Remote Data Loading, Video Classification from tensors: Feature Rich Release
We are elated to announce the release of Lightning Flash v0.8, a feature-rich release with improved testing to ensure a better user experience for all our lovely users! The team at Lightning AI and our community contributors have been working hard on this release, and nothing makes us happier than sharing all their lovely contributions with you.
We discuss major features and changes below. For a curated list, scroll to the bottom to see all the pull requests included for this release.
TPU Support 🦸🏻
Before this release, Lightning Flash worked well on a single-core TPU (training, validation, and prediction), but failed on multiple cores. This release enables training and validation on multi-core TPUs, allowing users to try out their models on TPUs using Lightning Flash. Prediction on multi-core TPUs is an ongoing effort, and we hope to bring it to you in the near future.
| | Before v0.8 | After v0.8 |
|---|---|---|
| Single core | Training, Validation, Prediction | Training, Validation, Prediction |
| Multiple cores | Not supported | Training, Validation |
As we move ahead, and we see more users trying the TPUs with Lightning Flash, we expect that there might be unseen errors or issues, and we will be looking forward to addressing them as we get a chance. So please don't hesitate to let us know your experience!
Remote Data Loading: `fsspec` arrives in Lightning Flash ☁️
Before this release, users had to download a dataset or a file from the URL and pass it to our data loader classes. This was a pain point that we are happy to let go of in this release. Starting v0.8, you'll not have to download any of those files locally, and you can just pass the file URL - and expect it to work!
Before v0.8, download titanic.csv from the URL and pass the path to the `train_file` argument:

from flash.tabular import TabularClassificationData

datamodule = TabularClassificationData.from_csv(
    categorical_fields=["Age", "Cabin"],
    numerical_fields="Fare",
    target_fields="Survived",
    train_file="titanic.csv",
    val_split=0.1,
    batch_size=8,
)

After v0.8, just pass the URL to the `train_file` argument:

from flash.tabular import TabularClassificationData

datamodule = TabularClassificationData.from_csv(
    categorical_fields=["Age", "Cabin"],
    numerical_fields="Fare",
    target_fields="Survived",
    train_file="https://pl-flash-data.s3.amazonaws.com/titanic.csv",
    val_split=0.1,
    batch_size=8,
)
For more details, feel free to check out the documentation here.
Video Classification from Tensors 📹
At times, you need to load raw data or pre-process videos before loading them and training the model. Raw data for Video Classification is mostly available as tensors, and before this release one had to save those tensors back to video files and pass the paths to the data loading classes in Flash. Starting with this release, we support loading data from tensors for Video Classification.
import torch
from flash.video import VideoClassifier, VideoClassificationData
import flash
# Shape (C, T, H, W): 3 channels, 5 frames, height = 10 and width = 10
mock_tensors = torch.randint(size=(3, 5, 10, 10), low=0, high=255)
datamodule = VideoClassificationData.from_tensors(
train_data=[mock_tensors, mock_tensors], # can also stack: torch.stack((mock_tensors, mock_tensors))
train_targets=["patient", "doctor"],
predict_data=[mock_tensors],
batch_size=1,
)
model = VideoClassifier(num_classes=datamodule.num_classes, pretrained=False, backbone="slow_r50", labels=datamodule.labels)
trainer = flash.Trainer(max_epochs=1)
trainer.finetune(model, datamodule=datamodule)
This will also come in handy for those having multi-modal pipelines who don't want to save the output of a model to files and instead pass the raw data to the next model, saving you quite a lot of time wasted in the conversion process.
Refactored Transforms in Lightning Flash ⚙️
This is one of the community-driven contributions that we are proud to share. Before this release, a user had to pass an input transform class for each stage, which was cumbersome. With this release, you can just pass `transform=<YourTransformClass>` to the required method. This is a breaking change, and if you are not sure how to resolve this, please create an issue and we'll be happy to help!
Before v0.8:

dm = XYZTask_DataModule.from_xyz(
    train_file=train_file,
    val_file=val_file,
    test_file=test_file,
    predict_file=predict_file,
    train_transform=InputTransform,
    val_transform=InputTransform,
    test_transform=InputTransform,
    predict_transform=InputTransform,
    transform_kwargs=transform_kwargs,
)

After v0.8:

dm = XYZTask_DataModule.from_xyz(
    train_file=train_file,
    val_file=val_file,
    test_file=test_file,
    predict_file=predict_file,
    transform=InputTransform(**transform_kwargs),
)
Note that, within your `InputTransform` class, you can have `<stage>_per_batch_transform_on_device` methods to support various stages.
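from typing import Callable

from flash.core.data.io.input_transform import InputTransform


class SampleInputTransform(InputTransform):
    # Applied to every sample, regardless of stage.
    def per_sample_transform(self) -> Callable:
        def fn(x):
            return x

        return fn

    # Stage-specific hooks: each returns the callable to apply for that stage.
    def train_per_batch_transform_on_device(self) -> Callable:
        return ...

    def val_per_batch_transform_on_device(self) -> Callable:
        return ...

    def test_per_batch_transform_on_device(self) -> Callable:
        return ...

    def predict_per_batch_transform_on_device(self) -> Callable:
        return ...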
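To make the `<stage>_per_batch_transform_on_device` naming convention concrete, here is a minimal, framework-free sketch of how a stage-prefixed hook could be looked up with a fallback to the generic one. This is an illustration under assumptions, not Flash's actual dispatch code: `ToyInputTransform` and `resolve_hook` are hypothetical names that do not exist in the library.

```python
from typing import Callable


class ToyInputTransform:
    """Hypothetical stand-in for a transform class; not Flash's InputTransform."""

    def per_batch_transform_on_device(self) -> Callable:
        # Generic hook, used when no stage-specific variant is defined.
        return lambda batch: batch

    def train_per_batch_transform_on_device(self) -> Callable:
        # Stage-specific hook, only picked up for the "train" stage.
        return lambda batch: [x * 2 for x in batch]


def resolve_hook(transform: object, stage: str, name: str) -> Callable:
    """Prefer ``<stage>_<name>`` and fall back to the generic ``<name>``."""
    factory = getattr(transform, f"{stage}_{name}", None) or getattr(transform, name)
    return factory()


transform = ToyInputTransform()
train_fn = resolve_hook(transform, "train", "per_batch_transform_on_device")
val_fn = resolve_hook(transform, "val", "per_batch_transform_on_device")

train_out = train_fn([1, 2, 3])  # uses the train-specific hook
val_out = val_fn([1, 2, 3])      # falls back to the generic hook
```

In this sketch only the training stage doubles the batch; validation passes it through unchanged because no `val_` variant is defined.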
Object Detection in Flash is now servable 💁
If you aren't aware yet, Lightning Flash supports serving models. Starting with this release, Object Detection joins the tasks that can be served using Lightning Flash. Below is an example of what the inference server code for object detection looks like:
# Inference Server
from flash.image import ObjectDetector
model = ObjectDetector.load_from_checkpoint("https://flash-weights.s3.amazonaws.com/0.8.0/object_detection_model.pt")
model.serve()
For more details, check out the documentation here.
Added
- Added support for `from_tensors` for `VideoClassification` (#1389)
- Added fine tuning strategies for DeepSpeed (with parameter loading and storing omitted) (#1377)
- Added `torchvision` as a requirement to `datatype_audio.txt` as it's used for Audio Classification (#1425)
- Added `figsize` and `limit_nb_samples` for showing batch images (#1381)
- Added support for `from_lists` for Tabular Classification and Regression (#1337)
- Added support for `from_dicts` for Tabular Classification and Regression (#1331)
- Added support for using the `ImageEmbedder` SSL training for all image classifier backbones (#1264)
- Added support for audio file formats to `AudioClassificationData` (#1085)
- Added support for Flash serve to the `ObjectDetector` (#1370)
- Added support for loading `ImageClassificationData` from PIL images with `from_images` (#1372)
- Added support for loading `ObjectDetectionData` with `from_numpy`, `from_images`, and `from_tensors` (#1372)
- Added support for remote data loading with fsspec (#1387)
- Added support for TSV files to `from_csv` methods (#1387)
- Added support for more formats when loading audio files (#1387)
- Added support to use any task as an embedder by calling `as_embedder` (#1396)
- Added support for normalization of images in `SemanticSegmentationData` (#1399)
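As a rough illustration of what TSV support in a `from_csv`-style loader involves, the sketch below picks the delimiter from the file extension using only the standard library. It is an assumption about the general technique, not how Flash implements it; `read_table` is a hypothetical helper.

```python
import csv
import io
import os
from typing import Dict, List


def read_table(text: str, filename: str) -> List[Dict[str, str]]:
    """Parse delimited text, using tabs for ``.tsv`` files and commas otherwise."""
    extension = os.path.splitext(filename)[1].lower()
    delimiter = "\t" if extension == ".tsv" else ","
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


rows_csv = read_table("Fare,Survived\n7.25,0\n", "titanic.csv")
rows_tsv = read_table("Fare\tSurvived\n7.25\t0\n", "titanic.tsv")
```

Both calls yield the same rows, so downstream code does not need to know which format the file used.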
Changed
- Changed the `ImageEmbedder` dependency on VISSL to optional (#1276)
- Changed the transforms in `SemanticSegmentationData` to use albumentations instead of Kornia (#1313)
Removed
- Removed support for audio files with `sd2` extension, because SoundFile (for sd2 extension) doesn't accept fsspec objects. ([#1409](https://github.com/Lightning-AI/lightn...
Bi-Weekly Patch Release
[0.7.5] - 2022-05-11
Fixed
- Fixed image classification data show_train_batch for subplots with rows > 1. (#1315)
- Fixed support for all the versions (including the latest and older) of baal. (#1315)
- Fixed a bug where a loaded TabularClassifier or TabularRegressor checkpoint could not be served (#1324)
- Fixed a bug where the freeze_unfreeze and unfreeze_milestones finetuning strategies could not be used in tandem with a onecyclelr LR scheduler (#1329)
- Fixed a bug where the backbone learning rate would be divided by 10 when unfrozen if using the freeze_unfreeze or unfreeze_milestones strategies (#1329)
Contributors
@Borda @ethanwharris @kaushikb11 @krshrimali
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.4] - 2022-04-27
Fixed
- Fixed a bug where LR schedulers from HuggingFace could not be used with newer versions of PyTorch Lightning (#1307)
- Fixed a bug where the default Flash zero configurations for `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` would error with the latest version of some requirements (#1306)
- Fixed plain `LightningModule` support for Flash data modules. (#1281)
Contributors
@Borda @ethanwharris @krshrimali @rohitgr7
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.3] - 2022-04-13
Fixed
- Fixed a bug where some backbones were incorrectly listed as available for the `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` (#1267)
- Fixed a bug where the backbone would not be frozen when finetuning the `SpeechRecognition` task (#1275)
- Fixed a bug where the backbone would not be frozen when finetuning the `QuestionAnswering` task with certain model types (#1275)
Contributors
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.2] - 2022-03-30
Fixed
- Fixed examples (question answering), where NLTK's `punkt` module needs to be downloaded first. (#1215)
- Fixed normalizing inputs to video classification (#1213)
- Fixed a bug where `pretraining_transforms` in the `ImageEmbedder` was never called. (1196)
- Fixed a bug where `BASE_MODEL_NAME` was not in the dict for dino and moco strategies. (1196)
- Fixed support for `torch==1.11.0` (#1234)
- Fixed DDP spawn support for `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` (#1222)
- Fixed a bug where `InstanceSegmentation` would fail if samples had an inconsistent number of bboxes, labels, and masks (these will now be treated as negative samples) (#1222)
- Fixed a bug where collate functions were never called in the `ImageEmbedder` class. (#1217)
- Fixed a bug where `ObjectDetector`, `InstanceSegmentation`, and `KeypointDetector` would log train and validation metrics with the same name (#1252)
- Fixed a bug where using `ReduceLROnPlateau` would raise an error (#1251)
- Fixed GPU support for self-supervised training with the `ImageEmbedder` (#1256)
Contributors
@aisensiy @andife @aniketmaurya @Borda @dudeperf3ct @ethanwharris @krshrimali
If we forgot someone let us know 😃
Bi-Weekly Patch Release
[0.7.1] - 2022-03-01
Added
- Added the normalization parameters of `torchvision.transforms.Normalize` as `transform_kwargs` in the `ImageClassificationInputTransform` (#1178)
- Added `available_outputs` method to the `Task` (#1206)
Fixed
- Fixed a bug where DDP would not work with Flash tasks (#1182)
- Fixed DDP support for `VideoClassifier` (#1189)
- Fixed a bug where buffers in loss functions were not correctly registered in the `Task` (#1203)
- Fixed support for passing a sampler instance to `from_*` methods / the `DataModule` (#1204)
Contributors
@aisensiy @AndresAlgaba @Borda @ethanwharris
If we forgot someone due to not matching commit email with GitHub account, let us know :]
PyTorch Tabular, Enhanced Data Loading and Stability
[0.7.0] - 2022-02-15
Added
- Added support for multi-label, space delimited, targets (#1076)
- Added support for tabular classification / regression backbones from PyTorch Tabular (#1098)
- Added Flash zero support for tabular regression (#1098)
- Added support for COCO annotations with non-default keypoint labels to `KeypointDetectionData.from_coco` (#1102)
- Added support for `from_csv` and `from_data_frame` to `VideoClassificationData` (#1117)
- Added support for `SemanticSegmentationData.from_folders` where mask files have different extensions to the image files (#1130)
- Added `FlashRegistry` of Available Heads for `flash.image.ImageClassifier` (#1152)
- Added support for `ObjectDetectionData.from_files` (#1154)
- Added support for passing the `Output` object (or a string e.g. `"labels"`) to the `flash.Trainer.predict` method (#1157)
- Added support for passing the `TargetFormatter` object to `from_*` methods for classification to override target handling (#1171)
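For readers unfamiliar with multi-label, space-delimited targets (#1076), the sketch below shows one way such targets could be turned into multi-hot vectors. It is a standalone illustration, not Flash's target formatting code; `encode_space_delimited` is a hypothetical helper, and ordering labels alphabetically is an assumption made here.

```python
from typing import List, Tuple


def encode_space_delimited(targets: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Turn targets such as "cat dog" into multi-hot vectors over the sorted label set."""
    split = [target.split() for target in targets]
    labels = sorted({label for row in split for label in row})
    index = {label: i for i, label in enumerate(labels)}

    encoded = []
    for row in split:
        vector = [0] * len(labels)
        for label in row:
            vector[index[label]] = 1  # mark every label present in this sample
        encoded.append(vector)
    return labels, encoded


labels, encoded = encode_space_delimited(["cat dog", "dog", "bird cat"])
```

Each sample ends up with one entry per known label, set to 1 where the label appears in its target string.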
Changed
- Changed `Wav2Vec2Processor` to `AutoProcessor` and separated it from the backbone [optional] (#1075)
- Renamed `ClassificationInput` to `ClassificationInputMixin` (#1116)
- Changed the default `learning_rate` for all tasks to be `None`, corresponding to the default for your chosen optimizer (#1172)
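The `learning_rate=None` default (#1172) can be pictured as "only forward a learning rate when the user gave one". The sketch below shows that pattern with a toy optimizer; it is an assumption about the general idea rather than Flash's implementation, and `ToySGD` and `build_optimizer` are hypothetical names.

```python
from typing import Iterable, Optional, Type


class ToySGD:
    """Hypothetical optimizer with its own default learning rate."""

    def __init__(self, params: Iterable[float], lr: float = 0.01) -> None:
        self.params = list(params)
        self.lr = lr


def build_optimizer(
    optimizer_cls: Type[ToySGD],
    params: Iterable[float],
    learning_rate: Optional[float] = None,
) -> ToySGD:
    # Omit ``lr`` entirely when no value was given, so the optimizer's own default applies.
    kwargs = {} if learning_rate is None else {"lr": learning_rate}
    return optimizer_cls(params, **kwargs)


default_opt = build_optimizer(ToySGD, [0.0])
custom_opt = build_optimizer(ToySGD, [0.0], learning_rate=0.1)
```

Leaving `learning_rate` unset keeps the optimizer's default, while an explicit value overrides it.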
Fixed
- Fixed a bug when not explicitly passing `embedding_sizes` to the `TabularClassifier` and `TabularRegressor` tasks (#1067)
- Fixed a bug where under some circumstances transforms would not get called (#1072)
- Fixed a bug where prediction would sometimes give the wrong number of outputs (#1077)
- Fixed a bug where passing the `val_split` to the `DataModule` would not have the desired effect (#1079)
- Fixed a bug where passing `predict_data_frame` to `ImageClassificationData.from_data_frame` raised an error (#1088)
- Fixed a bug where segmentation files / masks were loaded with an inconsistent ordering (#1094)
- Fixed a bug with `AudioClassificationData.from_numpy` (#1096)
- Fixed a bug when using `SpeechRecognitionData.from_files` for training / validating / testing (#1097)
- Fixed a bug when using `SpeechRecognitionData.from_csv` or `from_json` when predicting without targets (#1097)
- Fixed a bug where `SpeechRecognitionData.from_datasets` did not work as expected (#1097)
- Fixed a bug where loading data for prediction with `SemanticSegmentationData.from_folders` raised an error (#1101)
- Fixed a bug when passing a `predict_folder` argument to `from_coco` / `from_voc` / `from_via` in IceVision tasks (#1102)
- Fixed `ObjectDetectionData.from_voc` and `ObjectDetectionData.from_via` (#1102)
- Fixed a bug where `InstanceSegmentationData.from_coco` would raise an error if not using file-based masks (#1102)
- Fixed `InstanceSegmentationData.from_voc` (#1102)
- Fixed a bug when loading tabular data for prediction without a target field / column (#1114)
- Fixed a bug when loading prediction data for graph classification without targets (#1121)
- Fixed a bug where loading Seq2Seq data for prediction would not work if the target field was not present (#1128)
- Fixed a bug where `from_fiftyone` classmethods did not work correctly with a `predict_dataset` (#1136)
- Fixed a bug where the `labels` property would return `None` when using `ObjectDetectionData.from_fiftyone` (#1136)
- Fixed a bug where `TabularData` would not work correctly with no categorical variables (#1144)
- Fixed a bug where loading `TabularForecastingData` for prediction would only yield a single sample per series (#1149)
- Fixed a bug where backbones for the `ObjectDetector`, `KeypointDetector`, and `InstanceSegmentation` tasks were not always frozen correctly when finetuning (#1163)
- Fixed a bug where `DataModule.multi_label` would sometimes be `None` when it had been inferred to be `False` (#1165)
Removed
- Removed the `Seq2SeqData` base class (use `TranslationData` or `SummarizationData` directly) (#1128)
- Removed the ability to attach the `Output` object directly to the model (#1157)
Contributors
@Actis92 @AjinkyaIndulkar @bartonp2 @Borda @daMichaelB @ethanwharris @flozi00 @karthikrangasai @MikeTrizna
If we forgot someone due to not matching commit email with GitHub account, let us know :]