diff --git a/.history/index_20231115221104.html b/.history/index_20231115221104.html new file mode 100644 index 0000000..fc882d8 --- /dev/null +++ b/.history/index_20231115221104.html @@ -0,0 +1,608 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + TableLlama: Towards Open Large Generalist Models for Tables + + + + + + + + + + + + + + + + + + + + + + +
+
+
+
+
+ +
+ Icon +

TableLlama: Towards Open Large Generalist Models for Tables

+
+ +
+ + + 1Tianshu Zhang*, + + 2Xiang Yue, + + 1Yifei Li, + + 1Huan Sun + +
+ +
+ + 1The Ohio State University, + 2IN.AI + +
+ zhang.11535@osu.edu + , sun.397@osu.edu + +
+ + +
+ +
+
+
+
+
+
+ + + +
+
+
+
+
+

Abstract

+
+

+ Semi-structured tables are ubiquitous. There has been a variety of tasks that aim to automatically interpret, augment, and query tables. Current methods often require pretraining on tables or special model architecture design, are restricted to specific table types, or make simplifying assumptions about tables and tasks. This paper makes the first step towards developing open-source large language models (LLMs) as generalists for a diversity of table-based tasks. Towards that end, we construct TableInstruct, a new dataset with a variety of realistic tables and tasks, for instruction tuning and evaluating LLMs. We further develop the first open-source generalist model for tables, TableLlama, by fine-tuning Llama 2 (7B) with LongLoRA to address the long context challenge. We experiment under both in-domain and out-of-domain settings. On 7 out of 8 in-domain tasks, TableLlama achieves comparable or better performance than the SOTA for each task, even though the latter often has a task-specific design. On 6 out-of-domain datasets, it achieves 6-48 absolute point gains compared with the base model, showing that training on TableInstruct enhances the model’s generalizability. We will open-source our dataset and trained model to boost future work on developing open generalist models for tables. +

+
+
+
+
+
+
+ + + +
+
+
+
+
+
+ + An overview of TableInstruct and 🦙TableLlama +

+ Figure 1: An overview of TableInstruct and TableLlama. TableInstruct includes a wide variety of realistic tables and tasks with instructions. We make the first step towards developing open-source generalist models for tables with TableInstruct and TableLlama. +

+
+
+
+
+
+
+ + + +
+
+
+
+
+
+ + Illustration of three exemplary tasks in TableInstruct +

+ Figure 2: Illustration of three exemplary tasks: (a) Column type annotation. This task is to annotate the selected column with the correct semantic types. (b) Row population. This task is to populate rows given table metadata and partial row entities. (c) Hierarchical table QA. For subfigures (a) and (b), we mark the candidates in red in the + "task instruction" part. The candidate set in TableInstruct can contain hundreds to thousands of entries. +

+
+
+
+
+
+
+ + +
+
+
+
+
+

Our Dataset: TableInstruct

+
+ +
+
+
+
+
+
+ +
+
+
+
+
+

In-domain Evaluation:

+
+ +
+
+
+
+
+
+ + +
+
+
+
+
+

Out-of-domain Evaluation:

+
+ +
+
+
+
+
+
+ + + + +
+
+

Reference

+ Please cite our paper if you use our code, data, models, or results: +

+
@misc{zhang2023tablellama,
+  title={TableLlama: Towards Open Large Generalist Models for Tables}, 
+  author={Tianshu Zhang and Xiang Yue and Yifei Li and Huan Sun},
+  year={2023},
+  eprint={2311.09206},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL}
+}
+      
+
+
+ + + + + + + + + + + + + diff --git a/README.md b/README.md index 69373d4..59f1e59 100644 --- a/README.md +++ b/README.md @@ -1,44 +1,2 @@ -# Academic Project Page Template -This is an academic paper project page template. - - -Example project pages built using this template are: -- https://www.vision.huji.ac.il/deepsim/ -- https://www.vision.huji.ac.il/3d_ads/ -- https://www.vision.huji.ac.il/ssrl_ad/ -- https://www.vision.huji.ac.il/conffusion/ - - -## Start using the template -To start using the template click on `Use this Template`. - -The template uses html for controlling the content and css for controlling the style. -To edit the websites contents edit the `index.html` file. It contains different HTML "building blocks", use whichever ones you need and comment out the rest. - -**IMPORTANT!** Make sure to replace the `favicon.ico` under `static/images/` with one of your own, otherwise your favicon is going to be a dreambooth image of me. - -## Components -- Teaser video -- Images Carousel -- Youtube embedding -- Video Carousel -- PDF Poster -- Bibtex citation - -## Tips: -- The `index.html` file contains comments instructing you what to replace, you should follow these comments. -- The `meta` tags in the `index.html` file are used to provide metadata about your paper -(e.g. helping search engine index the website, showing a preview image when sharing the website, etc.) -- The resolution of images and videos can usually be around 1920-2048, there rarely a need for better resolution that take longer to load. -- All the images and videos you use should be compressed to allow for fast loading of the website (and thus better indexing by search engines). For images, you can use [TinyPNG](https://tinypng.com), for videos you can need to find the tradeoff between size and quality. -- When using large video files (larger than 10MB), it's better to use youtube for hosting the video as serving the video from the website can take time. 
-- Using a tracker can help you analyze the traffic and see where users came from. [statcounter](https://statcounter.com) is a free, easy to use tracker that takes under 5 minutes to set up. -- This project page can also be made into a github pages website. -- Replace the favicon to one of your choosing (the default one is of the Hebrew University). -- Suggestions, improvements and comments are welcome, simply open an issue or contact me. You can find my contact information at [https://pages.cs.huji.ac.il/eliahu-horwitz/](https://pages.cs.huji.ac.il/eliahu-horwitz/) - -## Acknowledgments -Parts of this project page were adopted from the [Nerfies](https://nerfies.github.io/) page. - -## Website License -Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. +# TableLlama +This is the repo for ```TableLlama``` and ```TableInstruct```. We are tirelessly working on the library and will release it soon. Please stay tuned!! diff --git a/index.html b/index.html index 58c7191..fc882d8 100644 --- a/index.html +++ b/index.html @@ -128,7 +128,7 @@

TableLlama: Towards Open Large Generali -