🌟A collection of papers, datasets, code, and pre-trained weights for Remote Sensing Foundation Models (RSFMs).
🔥🔥🔥 Last Updated on 2023.12.11 🔥🔥🔥
Abbreviation | Title | Publication | Paper | Code & Weights |
---|---|---|---|---|
GASSL | Geography-Aware Self-Supervised Learning | ICCV2021 | GASSL | link |
SeCo | Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data | ICCV2021 | SeCo | link |
SatMAE | SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | NeurIPS2022 | SatMAE | link |
RingMo | RingMo: A remote sensing foundation model with masked image modeling | TGRS2022 | RingMo | null |
RVSA | Advancing plain vision transformer toward remote sensing foundation model | TGRS2022 | RVSA | link |
RSP | An Empirical Study of Remote Sensing Pretraining | TGRS2022 | RSP | link |
MATTER | Self-Supervised Material and Texture Representation Learning for Remote Sensing Tasks | CVPR2022 | MATTER | null |
CSPT | Consecutive Pre-Training: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain | RS2022 | CSPT | link |
- | Self-supervised Vision Transformers for Land-cover Segmentation and Classification | CVPRW2022 | Paper | link |
BFM | A billion-scale foundation model for remote sensing images | Arxiv2023 | BFM | null |
TOV | TOV: The original vision model for optical remote sensing image understanding via self-supervised learning | JSTARS2023 | TOV | link |
CMID | CMID: A Unified Self-Supervised Learning Framework for Remote Sensing Image Understanding | TGRS2023 | CMID | link |
RingMo-Sense | RingMo-Sense: Remote Sensing Foundation Model for Spatiotemporal Prediction via Spatiotemporal Evolution Disentangling | TGRS2023 | RingMo-Sense | null |
CACo | Change-Aware Sampling and Contrastive Learning for Satellite Images | CVPR2023 | CACo | link |
SatLas | SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding | ICCV2023 | SatLas | link |
GFM | Towards Geospatial Foundation Models via Continual Pretraining | ICCV2023 | GFM | link |
Scale-MAE | Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning | ICCV2023 | Scale-MAE | link |
SpectralGPT | SpectralGPT: Spectral Foundation Model | Arxiv2023 | SpectralGPT | null |
DINO-MC | DINO-MC: Self-supervised Contrastive Learning for Remote Sensing Imagery with Multi-sized Local Crops | Arxiv2023 | DINO-MC | link |
CROMA | CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders | NeurIPS2023 | CROMA | link |
Cross-Scale MAE | Cross-Scale MAE: A Tale of Multiscale Exploitation in Remote Sensing | NeurIPS2023 | Cross-Scale MAE | null |
DeCUR | DeCUR: decoupling common & unique representations for multimodal self-supervision | Arxiv2023 | DeCUR | link |
Presto | Lightweight, Pre-trained Transformers for Remote Sensing Timeseries | Arxiv2023 | Presto | link |
CtxMIM | CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding | Arxiv2023 | CtxMIM | null |
XGeo | Multisensory Geospatial Models via Cross-Sensor Pretraining | - | XGeo | null |
FG-MAE | Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing | Arxiv2023 | FG-MAE | link |
Prithiv | Foundation Models for Generalist Geospatial Artificial Intelligence | Arxiv2023 | Prithiv | link |
RingMo-lite | RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework | Arxiv2023 | RingMo-lite | null |
- | A Self-Supervised Cross-Modal Remote Sensing Foundation Model with Multi-Domain Representation and Cross-Domain Fusion | IGARSS2023 | Paper | null |
USat | USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery | Arxiv2023 | USat | link |
Abbreviation | Title | Publication | Paper | Code & Weights |
---|---|---|---|---|
RSGPT | RSGPT: A Remote Sensing Vision Language Model and Benchmark | Arxiv2023 | RSGPT | link |
RemoteCLIP | RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Arxiv2023 | RemoteCLIP | link |
GeoChat | GeoChat: Grounded Large Vision-Language Model for Remote Sensing | Arxiv2023 | GeoChat | link |
GRAFT | Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment | ICLR2024 | GRAFT | null |
- | Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs | Arxiv2023 | Paper | link |
Abbreviation | Title | Publication | Paper | Code & Weights |
---|---|---|---|---|
DiffusionSat | DiffusionSat: A Generative Foundation Model for Satellite Imagery | Arxiv2023 | DiffusionSat | null |
- | Non-Visible Light Data Synthesis and Application: A Case Study for Synthetic Aperture Radar Imagery | Arxiv2023 | Paper | link |
Abbreviation | Title | Publication | Paper | Code & Weights |
---|---|---|---|---|
CSP | CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations | ICML2023 | CSP | link |
GeoCLIP | GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization | NeurIPS2023 | GeoCLIP | link |
SatCLIP | SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery | Arxiv2023 | SatCLIP | Comming soon |
Abbreviation | Title | Publication | Paper | Attribute | Link |
---|---|---|---|---|---|
fMoW | Functional Map of the World | CVPR2018 | fMoW | Vision | link |
SEN12MS | SEN12MS -- A Curated Dataset of Georeferenced Multi-Spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion | - | SEN12MS | Vision | link |
BEN-MM | BigEarthNet-MM: A Large Scale Multi-Modal Multi-Label Benchmark Archive for Remote Sensing Image Classification and Retrieval | GRSM2021 | BEN-MM | Vision | link |
MillionAID | On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID | JSTARS2021 | MillionAID | Vision | link |
SeCo | Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data | ICCV2021 | SeCo | Vision | link |
fMoW-S2 | SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery | NeurIPS2022 | fMoW-S2 | Vision | link |
TOV-RS-Balanced | TOV: The original vision model for optical remote sensing image understanding via self-supervised learning | JSTARS2023 | TOV | Vision | link |
SSL4EO-S12 | SSL4EO-S12: A Large-Scale Multi-Modal, Multi-Temporal Dataset for Self-Supervised Learning in Earth Observation | GRSM2023 | SSL4EO-S12 | Vision | link |
SSL4EO-L | SSL4EO-L: Datasets and Foundation Models for Landsat Imagery | Arxiv2023 | SSL4EO-L | Vision | link |
SatlasPretrain | SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding | ICCV2023 | SatlasPretrain | Vision (Supervised) | link |
CACo | Change-Aware Sampling and Contrastive Learning for Satellite Images | CVPR2023 | CACo | Vision | Comming soon |
RSVG | RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data | TGRS2023 | RSVG | Vision-Language | link |
RS5M | RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model | Arxiv2023 | RS5M | Vision-Language | link |
GEO-Bench | GEO-Bench: Toward Foundation Models for Earth Monitoring | Arxiv2023 | GEO-Bench | Vision (Evaluation) | link |
RSICap & RSIEval | RSGPT: A Remote Sensing Vision Language Model and Benchmark | Arxiv2023 | RSGPT | Vision-Language | Comming soon |
Title | Publication | Paper | Attribute |
---|---|---|---|
Self-Supervised Remote Sensing Feature Learning: Learning Paradigms, Challenges, and Future Works | TGRS2023 | Paper | Vision & Vision-Language |
Vision-Language Models in Remote Sensing: Current Progress and Future Trends | Arxiv2023 | Paper | Vision-Language |
The Potential of Visual ChatGPT For Remote Sensing | Arxiv2023 | Paper | Vision-Language |
遥感大模型:进展与前瞻 | 武汉大学学报 (信息科学版) 2023 | Paper | Vision & Vision-Language |
地理人工智能样本:模型、质量与服务 | 武汉大学学报 (信息科学版) 2023 | Paper | - |
Brain-Inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey | JSTARS2023 | Paper | Vision & Vision-Language |
Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters | Arxiv2023 | Paper | Vision |
An Agenda for Multimodal Foundation Models for Earth Observation | IGARSS2023 | Paper | Vision |
Transfer learning in environmental remote sensing | RSE2024 | Paper | Transfer learning |
遥感基础模型发展综述与未来设想 | 遥感学报2023 | Paper | - |