🔥🔥🔥 Multimodal Large Language Models for Remote Sensing: A Survey
[Project Page] (this page)
School of Artificial Intelligence, OPtics, and ElectroNics (iOPEN), Northwestern Polytechnical University
✨✨✨ Behold our meticulously curated trove of RS-MLLMs resources!!!
🎉🚀💡 The website will be updated in real-time to track the latest state of RS-MLLMs!!!
📑📚🔍 Feast your eyes on an assortment of model architectures, training pipelines, datasets, comprehensive evaluation benchmarks, intelligent agents for remote sensing, instruction-tuning techniques, and much more.
🌟🔥📢 A collection of remote sensing multimodal large language model papers focusing on the vision-language domain.
In this repository, we collect and document researchers and their outstanding work on remote sensing multimodal large language models (vision-language).
- The list will be continuously updated 🔥🔥
- 📦 coming soon! 🚀
- May-22-2024: The first RS-MLLMs review manuscript has been submitted for review. 🔥🔥
Table of Contents
- Awesome Papers
- Awesome Datasets
- Latest Evaluation Benchmarks for Remote Sensing Vision-Language Tasks
Intelligent Agents for Remote Sensing

| Title | Venue | Date | Code | Note |
|---|---|---|---|---|
| RS-Agent: Automating Remote Sensing Tasks through Intelligent Agents<br>W. Xu, Z. Yu, Y. Wang, J. Wang, and M. Peng | arXiv | 2024-06-11 | - | - |
| GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots<br>S. Singh, M. Fore, and D. Stamoulis | arXiv | 2024-04-23 | - | - |
| Evaluating Tool-Augmented Agents in Remote Sensing Platforms<br>S. Singh, M. Fore, and D. Stamoulis | arXiv | 2024-04-23 | - | - |
| Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis<br>C. Liu, K. Chen, H. Zhang, Z. Qi, Z. Zou, and Z. Shi | arXiv | 2024-04-01 | Github | - |
| Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models<br>H. Guo, X. Su, C. Wu, B. Du, L. Zhang, and D. Li | arXiv | 2024-01-17 | Github | - |
| Tree-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis<br>S. Du, S. Tang, W. Wang, X. Li, and R. Guo | arXiv | 2023-10-07 | - | - |
Vision-Language Foundation Models for Remote Sensing

| Title | Venue | Date | Code | Note |
|---|---|---|---|---|
| RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing<br>Z. Zhang, T. Zhao, Y. Guo, and J. Yin | arXiv | 2024-01-02 | Github | accepted by IEEE-TGRS |
| RemoteCLIP: A Vision Language Foundation Model for Remote Sensing<br>F. Liu, D. Chen, Z. Guan, X. Zhou, J. Zhu, and J. Zhou | T-GRS | 2024-04-18 | Github | arXiv |
| Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment<br>U. Mall, C. P. Phoo, M. K. Liu, C. Vondrick, B. Hariharan, and K. Bala | ICLR | 2024-01-16 | Project | arXiv |
| RS-CLIP: Zero Shot Remote Sensing Scene Classification via Contrastive Vision-Language Supervision<br>X. Li, C. Wen, Y. Hu, and N. Zhou | JAG | 2023-09-18 | Github | - |
| Parameter-Efficient Transfer Learning for Remote Sensing Image–Text Retrieval<br>Y. Yuan, Y. Zhan, and Z. Xiong | T-GRS | 2023-08-28 | Github | arXiv |
Surveys of Remote Sensing Vision-Language Models

| Title | Venue | Date | Code | Note |
|---|---|---|---|---|
| Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey<br>C. Liu, J. Zhang, K. Chen, M. Wang, Z. Zou, and Z. Shi | arXiv | 2024-12-03 | Github | arXiv |
| From Pixels to Prose: Advancing Multi-Modal Language Models for Remote Sensing<br>X. Sun, B. Peng, C. Zhang, F. Jin, Q. Niu, J. Liu, K. Chen, M. Li, P. Feng, Z. Bi, M. Liu, and Y. Zhang | arXiv | 2024-11-05 | - | - |
| Foundation Models for Remote Sensing and Earth Observation: A Survey<br>A. Xiao, W. Xuan, J. Wang, J. Huang, D. Tao, S. Lu, and N. Yokoya | arXiv | 2024-10-22 | Github | arXiv |
| Advancements in Visual Language Models for Remote Sensing: Datasets, Capabilities, and Enhancement Techniques<br>L. Tao, H. Zhang, H. Jing, Y. Liu, K. Yao, C. Li, and X. Xue | arXiv | 2024-10-15 | Github | arXiv |
| Towards Vision-Language Geo-Foundation Model: A Survey<br>Y. Zhou, L. Feng, Y. Ke, X. Jiang, J. Yan, and W. Zhang | arXiv | 2024-06-13 | Github | arXiv |
| Vision-Language Models in Remote Sensing: Current Progress and Future Trends<br>X. Li, C. Wen, Y. Hu, Z. Yuan, and X. X. Zhu | MGRS | 2024-04-22 | - | - |
| Language Integration in Remote Sensing: Tasks, Datasets, and Future Directions<br>L. Bashmal, Y. Bazi, F. Melgani, M. M. Al Rahhal, and M. A. Al Zuair | MGRS | 2023-10-11 | - | - |
| Brain-Inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey<br>L. Jiao et al. | JSTARS | 2023-09-18 | - | - |
Perspective Papers

| Title | Venue | Date | Code | Note |
|---|---|---|---|---|
| On the Foundations of Earth and Climate Foundation Models<br>X. X. Zhu et al. | arXiv | 2024-05-07 | Github | - |
| On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications<br>C. Tan et al. | arXiv | 2023-12-23 | - | - |
| Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs<br>J. Roberts, T. Lüddecke, R. Sheikh, K. Han, and S. Albanie | arXiv | 2023-11-24 | Github | - |
| The Potential of Visual ChatGPT for Remote Sensing<br>L. P. Osco, E. L. de Lemos, W. N. Gonçalves, A. P. M. Ramos, and J. Marcato Junior | Remote Sensing | 2023-06-22 | - | - |
Awesome Datasets

| Title | Venue | Date | Code | Note |
|---|---|---|---|---|
| RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models<br>J. Ge, Y. Zheng, K. Guo, and J. Liang | arXiv | 2024-08-27 | Github | Link |
| ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing<br>Z. Yuan, Z. Xiong, L. Mou, and X. X. Zhu | arXiv | 2024-02-17 | Github | Link |
| RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing<br>Z. Zhang, T. Zhao, Y. Guo, and J. Yin | arXiv | 2024-01-02 | Github | - |
| SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing<br>Z. Wang, R. Prabha, T. Huang, J. Wu, and R. Rajagopal | AAAI | 2024-03-24 | Github | arXiv |
If you have any questions about this project, please feel free to contact [email protected].