Texera supports scalable data computation and enables advanced AI/ML techniques.
Collaboration is a key focus: Texera offers an experience similar to Google Docs, but for data science.
- Data science is labor-intensive and particularly challenging for non-IT users applying AI/ML.
- Many workflow-based data science platforms lack parallelism, limiting their ability to handle big datasets.
- Cloud services and technologies have advanced significantly over the past decade, enabling powerful browser-based interfaces supported by high-speed networks.
- Existing data science platforms offer limited interaction during long-running jobs, making them difficult to manage after execution begins.
- Provide data science as cloud services;
- Provide a browser-based GUI to form a workflow without writing code;
- Allow non-IT people to access data science;
- Support collaborative data science;
- Allow users to interact with the execution of a job;
- Support huge volumes of data efficiently.
The Texera interface supports real-time collaboration on data science projects, allowing seamless sharing of data and workflows with easy access to AI/ML techniques and efficient management of public and private resources. The workflow in the use case shown below includes data cleaning, ML model training, and validation.
- (11/2024) IcedTea: Efficient and Responsive Time-Travel Debugging in Dataflow Systems
  Shengquan Ni, Yicong Huang, Zuozhi Wang, and Chen Li
  To appear in VLDB 2025
- (8/2024) Pasta: A Cost-Based Optimizer for Generating Pipelining Schedules for Dataflow DAGs
  Xiaozhen Liu, Yicong Huang, Xinyuan Lin, Avinash Kumar, Sadeem Alsudais, and Chen Li
  To appear in SIGMOD 2025
- (7/2024) Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
  Zuozhi Wang, Yicong Huang, Shengquan Ni, Avinash Kumar, Sadeem Alsudais, Xiaozhen Liu, Xinyuan Lin, Yunyan Ding, and Chen Li
  In VLDB 2024, Scalable Data Science track | PDF | Slides
- (3/2024) Demonstration of Udon: Line-by-line Debugging of User-Defined Functions in Data Workflows
  Yicong Huang, Zuozhi Wang, and Chen Li
  In SIGMOD 2024 Best Demo Runner-Up Award🏆 | PDF
- (2/2024) Data Science Tasks Implemented with Scripts versus GUI-Based Workflows: The Good, the Bad, and the Ugly
  Alexander K Taylor, Yicong Huang, Junheng Hao, Xinyuan Lin, Xiusi Chen, Wei Wang, and Chen Li
  In DataPlat Workshop at ICDE 2024 | PDF | Slides
- (8/2023) Building a Collaborative Data Analytics System: Opportunities and Challenges
  Zuozhi Wang, Chen Li
  In Tutorial at VLDB 2023 | PDF | Slides
- (8/2023) Udon: Efficient Debugging of User-Defined Functions in Big Data Systems with Line-by-Line Control
  Yicong Huang, Zuozhi Wang, and Chen Li
  In SIGMOD 2024 | PDF | Slides
- (8/2023) Improving Iterative Analytics in GUI-Based Data-Processing Systems with Visualization, Version Control, and Result Reuse
  Sadeem Alsudais
  Ph.D. Thesis | PDF
- (7/2023) Using Texera to Characterize Climate Change Discussions on Twitter During Wildfires
  Shengquan Ni, Yicong Huang, Jessie W. Y. Ko, Alexander Taylor, Xiusi Chen, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Suellen Hopfer, and Chen Li
  In Data Science Day at KDD 2023
- (7/2023) Raven: Accelerating Execution of Iterative Data Analytics by Reusing Results of Previous Equivalent Versions
  Sadeem Alsudais, Avinash Kumar, and Chen Li
  In HILDA Workshop at SIGMOD 2023 | PDF
- (6/2023) Texera: A System for Collaborative and Interactive Data Analytics Using Workflows
  Zuozhi Wang
  Ph.D. Thesis | PDF
- (12/2022) Towards Interactive, Adaptive and Result-aware Big Data Analytics
  Avinash Kumar
  Ph.D. Thesis | PDF
- (9/2022) Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees
  Zuozhi Wang, Shengquan Ni, Avinash Kumar, and Chen Li
  In VLDB 2023 | PDF | Slides
- (7/2022) Drove: Tracking Execution Results of Workflows on Large Datasets
  Sadeem Alsudais
  In the Ph.D. Workshop at VLDB 2022 | PDF
- (6/2022) Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models
  Zhihui Yang, Yicong Huang, Zuozhi Wang, Feng Gao, Yao Lu, Chen Li, and X. Sean Wang
  In VLDB 2022 | PDF
- (6/2022) Demonstration of Collaborative and Interactive Workflow-Based Data Analytics in Texera
  Xiaozhen Liu, Zuozhi Wang, Shengquan Ni, Sadeem Alsudais, Yicong Huang, Avinash Kumar, and Chen Li
  In VLDB 2022 | PDF | Demo Video
- (4/2022) Optimizing Machine Learning Inference Queries with Correlative Proxy Models
  Zhihui Yang, Zuozhi Wang, Yicong Huang, Yao Lu, Chen Li, and X. Sean Wang
  In VLDB 2022 | PDF
- (7/2020) Demonstration of Interactive Runtime Debugging of Distributed Dataflows in Texera
  Zuozhi Wang, Avinash Kumar, Shengquan Ni, and Chen Li
  In VLDB 2020 | PDF | Video | Slides
- (1/2020) Amber: A Debuggable Dataflow system based on the Actor Model
  Avinash Kumar, Zuozhi Wang, Shengquan Ni, and Chen Li
  In VLDB 2020 | PDF | Video | Slides
- (4/2017) A Demonstration of TextDB: Declarative and Scalable Text Analytics on Large Data Sets
  Zuozhi Wang, Flavio Bayer, Seungjin Lee, Kishore Narendran, Xuxi Pan, Qing Tang, Jimmy Wang, and Chen Li
  In ICDE 2017 Best Demo award | PDF | Video
- (2/2025) DS4ALL: Teaching High-School Students Data Science and AI/ML Using the Texera Workflow Platform as a Service
  Jiadong Bai, Xiaozhen Liu, Anthony Cuturrufo, Alexander Kundu Taylor, Jeehyun Hwang, Mingyu Derek Ma, Xinyuan Lin, Yanqiao Zhu, Yicong Huang, Yunyan Ding, Wei Wang, and Chen Li
  To appear in Data Science Education K-12: Research to Practice Annual Conference 2025
- (7/2024) Brain Image Data Processing Using Collaborative Data Workflows on Texera
  Yunyan Ding, Yicong Huang, Pan Gao, Andy Thai, Atchuth Naveen Chilaparasetti, M. Gopi, Xiangmin Xu, and Chen Li
  In Frontiers Neural Circuits | PDF
- (1/2024) Wording Matters: The Effect of Linguistic Characteristics and Political Ideology on Resharing of COVID-19 Vaccine Tweets
  Judith Borghouts, Yicong Huang, Suellen Hopfer, Chen Li, and Gloria Mark
  In TOCHI 2024 | PDF
- (1/2024) How the Experience of California Wildfires Shape Twitter Climate Change Framings
  Jessie W. Y. Ko, Shengquan Ni, Alexander Taylor, Xiusi Chen, Yicong Huang, Avinash Kumar, Sadeem Alsudais, Zuozhi Wang, Xiaozhen Liu, Wei Wang, Chen Li, and Suellen Hopfer
  In Climatic Change 2024 | PDF
- (11/2023) The Marketing and Perceptions of Non-Tobacco Blunt Wraps on Twitter
  Joshua U. Rhee, Yicong Huang, Aurash J. Soroosh, Sadeem Alsudais, Shengquan Ni, Avinash Kumar, Jacob Paredes, Chen Li, and David S. Timberlake
  In Substance Use & Misuse 2023 | PDF
- (3/2023) Understanding Underlying Moral Values and Language Use of COVID-19 Vaccine Attitudes on Twitter
  Judith Borghouts, Yicong Huang, Sydney Gibbs, Suellen Hopfer, Chen Li, and Gloria Mark
  In PNAS Nexus 2023 | PDF
- (10/2022) Public Opinions Toward COVID-19 Vaccine Mandates: A Machine Learning-Based Analysis of U.S. Tweets
  Yawen Guo, Jun Zhu, Yicong Huang, Lu He, Changyang He, Chen Li, and Kai Zheng
  In AMIA 2022 | PDF
- (9/2021) The Social Amplification and Attenuation of COVID-19 Risk Perception Shaping Mask-Wearing Behavior: A Longitudinal Twitter Analysis
  Suellen Hopfer, Emilia J. Fields, Yuwen Lu, Ganesh Ramakrishnan, Ted Grover, Quishi Bai, Yicong Huang, Chen Li, and Gloria Mark
  In PLOS ONE 2021 | PDF
- (4/2021) Why Do People Oppose Mask Wearing? A Comprehensive Analysis of U.S. Tweets During the COVID-19 Pandemic
  Lu He, Changyang He, Tera Leigh Reynolds, Qiushi Bai, Yicong Huang, Chen Li, Kai Zheng, and Yunan Chen
  In JAMIA 2021 | PDF
- dkNET Webinar 04/26/2024
- Texera Demo @ VLDB'20
- Amber Presentation @ VLDB'20
- For users, visit Guide to Use Texera.
- For developers, visit Guide to Develop Texera.
Texera was formerly known as "TextDB" before August 28, 2017.
This project is supported by the National Science Foundation under awards IIS-1745673 and IIS-2107150, by AWS Research Credits, and by Google Cloud Platform Education Programs.
This project is supported by an NIH NIDDK award.
YourKit has granted an open-source license for the use of its profiler in this project.
Please cite Texera as:
@article{DBLP:journals/pvldb/WangHNKALLDL24,
author = {Zuozhi Wang and
Yicong Huang and
Shengquan Ni and
Avinash Kumar and
Sadeem Alsudais and
Xiaozhen Liu and
Xinyuan Lin and
Yunyan Ding and
Chen Li},
title = {Texera: {A} System for Collaborative and Interactive Data Analytics
Using Workflows},
journal = {Proc. {VLDB} Endow.},
volume = {17},
number = {11},
pages = {3580--3588},
year = {2024},
url = {https://www.vldb.org/pvldb/vol17/p3580-wang.pdf},
timestamp = {Thu, 19 Sep 2024 13:09:37 +0200},
biburl = {https://dblp.org/rec/journals/pvldb/WangHNKALLDL24.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}