This is the official reproduction of Qihoo-T2X, an efficient proxy-tokenized diffusion transformer (DiT) architecture designed for Text-to-Any tasks.
QIHOO-T2X: AN EFFICIENT PROXY-TOKENIZED DIFFUSION TRANSFORMER FOR TEXT-TO-ANY-TASK
Jing Wang*, Ao Ma*†, Jiasong Feng*, Dawei Leng‡, Yuhui Yin, Xiaodan Liang‡ (*Equal Contribution, †Project Lead, ‡Corresponding Authors)
- [2025.02.11] 🔥 We have open-sourced our model and inference code in Ascend/MindSpeed-MM.
- [2025.01.22] 🔥 Our paper has been accepted for presentation at ICLR 2025.
- [2024.09.12] We created a project homepage featuring galleries for Qihoo-T2X.
We are seeking academic interns in the AIGC field. If interested, please send your resume to [email protected].
@misc{wang2024qihoot2xefficiencyfocuseddiffusiontransformer,
  title={Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task},
  author={Jing Wang and Ao Ma and Jiasong Feng and Dawei Leng and Yuhui Yin and Xiaodan Liang},
  year={2024},
  eprint={2409.04005},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2409.04005},
}