Models used for "Investigating Large Language Models for Complex Word Identification in Multilingual and Multidomain Setups" accepted at EMNLP 2024 main conference.

razvanalex-phd/cwi_llm

Investigating Large Language Models for Complex Word Identification in Multilingual and Multidomain Setups

This repository lists the fine-tuned models used in "Investigating Large Language Models for Complex Word Identification in Multilingual and Multidomain Setups", accepted at the EMNLP 2024 main conference. The public fine-tuned Llama-2-based models can be found on HuggingFace.

🚀 Fine-Tuned Models

| Base model | Dataset | HuggingFace URL |
| --- | --- | --- |
| Llama-2-7b-chat | CWI Shared 2018 EN | unstpb-nlp/llama-2-7b-ft-cwi-2018-en |
| Llama-2-7b-chat | CWI Shared 2018 ES | unstpb-nlp/llama-2-7b-ft-cwi-2018-es |
| Llama-2-7b-chat | CWI Shared 2018 DE | unstpb-nlp/llama-2-7b-ft-cwi-2018-de |
| Llama-2-7b-chat | CompLex LCP 2021 | unstpb-nlp/llama-2-7b-ft-CompLex-2021 |
| Llama-2-13b-chat | CWI Shared 2018 EN | unstpb-nlp/llama-2-13b-ft-cwi-2018-en |
| Llama-2-13b-chat | CWI Shared 2018 ES | unstpb-nlp/llama-2-13b-ft-cwi-2018-es |
| Llama-2-13b-chat | CWI Shared 2018 DE | unstpb-nlp/llama-2-13b-ft-cwi-2018-de |
| Llama-2-13b-chat | CompLex LCP 2021 | unstpb-nlp/llama-2-13b-ft-CompLex-2021 |
| Vicuna-v1.5-7b | CWI Shared 2018 EN | unstpb-nlp/vicuna-v15-7b-ft-cwi-2018-en |
| Vicuna-v1.5-7b | CWI Shared 2018 ES | unstpb-nlp/vicuna-v15-7b-ft-cwi-2018-es |
| Vicuna-v1.5-7b | CWI Shared 2018 DE | unstpb-nlp/vicuna-v15-7b-ft-cwi-2018-de |
| Vicuna-v1.5-7b | CompLex LCP 2021 | unstpb-nlp/vicuna-v15-7b-ft-CompLex-2021 |
| Vicuna-v1.5-13b | CWI Shared 2018 EN | unstpb-nlp/vicuna-v15-13b-ft-cwi-2018-en |
| Vicuna-v1.5-13b | CWI Shared 2018 ES | unstpb-nlp/vicuna-v15-13b-ft-cwi-2018-es |
| Vicuna-v1.5-13b | CWI Shared 2018 DE | unstpb-nlp/vicuna-v15-13b-ft-cwi-2018-de |
| Vicuna-v1.5-13b | CompLex LCP 2021 | unstpb-nlp/vicuna-v15-13b-ft-CompLex-2021 |
| Llama-3-8b-chat | CWI Shared 2018 EN | unstpb-nlp/llama-3-8b-ft-cwi-2018-en |
| Llama-3-8b-chat | CWI Shared 2018 ES | unstpb-nlp/llama-3-8b-ft-cwi-2018-es |
| Llama-3-8b-chat | CWI Shared 2018 DE | unstpb-nlp/llama-3-8b-ft-cwi-2018-de |
| Llama-3-8b-chat | CompLex LCP 2021 | unstpb-nlp/llama-3-8b-ft-CompLex-2021 |
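
The repository names in the table follow a regular pattern, so they can be built programmatically. The snippet below is a minimal sketch, not part of the original repository: `repo_id` and `load_model` are hypothetical helper names, and `load_model` assumes the repositories host full causal-LM checkpoints that load with the `transformers` Auto classes. If they contain adapter weights instead, or if access is gated, the loading step would need to be adapted.

```python
from typing import Tuple

ORG = "unstpb-nlp"

# Slugs copied from the model table; the mapping itself is an illustration.
BASE_SLUGS = {
    "Llama-2-7b-chat": "llama-2-7b",
    "Llama-2-13b-chat": "llama-2-13b",
    "Vicuna-v1.5-7b": "vicuna-v15-7b",
    "Vicuna-v1.5-13b": "vicuna-v15-13b",
    "Llama-3-8b-chat": "llama-3-8b",
}
DATASET_SLUGS = {
    "CWI Shared 2018 EN": "cwi-2018-en",
    "CWI Shared 2018 ES": "cwi-2018-es",
    "CWI Shared 2018 DE": "cwi-2018-de",
    "CompLex LCP 2021": "CompLex-2021",
}


def repo_id(base_model: str, dataset: str) -> str:
    """Build the HuggingFace repository id for a (base model, dataset) pair."""
    return f"{ORG}/{BASE_SLUGS[base_model]}-ft-{DATASET_SLUGS[dataset]}"


def load_model(base_model: str, dataset: str) -> Tuple[object, object]:
    """Download tokenizer and model (assumes a full causal-LM checkpoint)."""
    # Imported lazily so that building repository ids does not need transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = repo_id(base_model, dataset)
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    return tokenizer, model
```

Calling `repo_id("Llama-2-7b-chat", "CWI Shared 2018 EN")` yields the first entry of the table; `load_model` downloads several gigabytes of weights, so it is best run on a machine with enough disk space and memory.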

📚 Meta-Learning Models

Note

Will be added soon.

🤖 Datasets for Fine-Tuning ChatGPT

These are the extracts from the CWI 2018 and LCP 2021 datasets that were used to fine-tune ChatGPT (gpt-3.5-turbo).

| Base model | Dataset | Training Data | Validation Data | Trained tokens | Epochs | Batch size | LR multiplier |
| --- | --- | --- | --- | --- | --- | --- | --- |
| gpt-3.5-turbo-1106 | CWI Shared 2018 EN | train set | validation set | 163,749 | 3 | 1 | 2 |
| gpt-3.5-turbo-1106 | CWI Shared 2018 ES | train set | validation set | 224,784 | 3 | 1 | 2 |
| gpt-3.5-turbo-1106 | CWI Shared 2018 DE | train set | validation set | 218,364 | 3 | 1 | 2 |
| gpt-3.5-turbo-1106 | CompLex LCP 2021 | train set | validation set | 185,613 | 3 | 1 | 2 |
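
For reference, the hyperparameters in the table map onto the arguments of a fine-tuning job roughly as follows. This is a sketch under assumptions rather than the authors' script: it assumes the v1 `openai` Python client, the helper names `fine_tuning_job_args` and `submit_job` are hypothetical, and the file ids are placeholders for files already uploaded with purpose `fine-tune`.

```python
from typing import Any, Dict


def fine_tuning_job_args(training_file_id: str, validation_file_id: str) -> Dict[str, Any]:
    """Collect the job settings listed in the table (3 epochs, batch size 1, LR multiplier 2)."""
    return {
        "model": "gpt-3.5-turbo-1106",
        "training_file": training_file_id,
        "validation_file": validation_file_id,
        "hyperparameters": {
            "n_epochs": 3,
            "batch_size": 1,
            "learning_rate_multiplier": 2,
        },
    }


def submit_job(training_file_id: str, validation_file_id: str):
    """Submit the job; requires the openai package and an OPENAI_API_KEY."""
    # Imported lazily so that building the arguments does not need the package.
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the environment
    return client.fine_tuning.jobs.create(
        **fine_tuning_job_args(training_file_id, validation_file_id)
    )
```

Keeping the settings in a plain dictionary makes it easy to check them against the table before any API call is made.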

⚖️ License

Llama 2-based models are available under the Llama 2 Community License.

📖 Citation

You can cite our work as follows:

@misc{smădu2024investigatinglargelanguagemodels,
      title={Investigating Large Language Models for Complex Word Identification in Multilingual and Multidomain Setups}, 
      author={Răzvan-Alexandru Smădu and David-Gabriel Ion and Dumitru-Clementin Cercel and Florin Pop and Mihaela-Claudia Cercel},
      year={2024},
      eprint={2411.01706},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.01706}, 
}
