Qwen2 -> Named Entity Recognition

Note

With the latest update, the pipeline supports high-level training control through a YAML file, so you no longer need to modify source code unless you are adding a new dataset. A new dataset must still be converted to the required format. For everything else, edit the configuration file (train_config.yaml).

Basic Information

Setup Details

Accelerator: NVIDIA RTX 4090D × 2
Platform: Linux
Internet: Enabled

Model and Resources

LLM: Qwen/Qwen2-1.5B-Instruct
Dataset: The Learning Agency Lab - PII Data Detection
Utils: transformers | trl

Additional Information

Important

Key components are implemented in the qwen2ner module; refer to it for technical details.

Download the dataset from Kaggle into the dataset folder.
Convert the dataset to .csv format.

python3 construct_text_data.py
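
To give a rough idea of what such a conversion involves, here is a minimal sketch. It is not the code in construct_text_data.py. It assumes each record in the Kaggle JSON file carries `document`, `full_text`, `tokens`, and `labels` fields, and the CSV column layout (`document`, `text`, `tokens`, `labels`) is a hypothetical choice made for illustration; the format the pipeline actually expects may differ.

```python
import csv
import io
import json


def records_to_csv(records, out_file):
    """Write one CSV row per document, with tokens and labels JSON-encoded.

    The column names are an assumption for this sketch, not the
    format produced by construct_text_data.py.
    """
    writer = csv.DictWriter(
        out_file, fieldnames=["document", "text", "tokens", "labels"]
    )
    writer.writeheader()
    for rec in records:
        writer.writerow(
            {
                "document": rec["document"],
                "text": rec["full_text"],
                # Lists are stored as JSON strings so they survive the CSV round trip.
                "tokens": json.dumps(rec["tokens"], ensure_ascii=False),
                "labels": json.dumps(rec["labels"]),
            }
        )


# Tiny in-memory record shaped like the assumed input format.
sample = [
    {
        "document": 7,
        "full_text": "My name is Ada Lovelace.",
        "tokens": ["My", "name", "is", "Ada", "Lovelace", "."],
        "labels": ["O", "O", "O", "B-NAME_STUDENT", "I-NAME_STUDENT", "O"],
    }
]

buffer = io.StringIO()
records_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real file, the records would come from `json.load` on the downloaded JSON, and `out_file` would be a file opened with `newline=""`.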

Train the model.

./train.sh

Run inference on a single text.

python3 inference.py \
    --model_name_or_path MODEL_NAME_OR_PATH
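
NER models of this kind typically emit one BIO tag per token, which then has to be merged into entity spans. The sketch below shows one way to do that merge. It assumes per-token labels such as `B-NAME_STUDENT` / `I-NAME_STUDENT` / `O`; the helper name `bio_to_spans` is hypothetical, and the actual output format of inference.py may differ.

```python
def bio_to_spans(tokens, labels):
    """Merge per-token BIO labels into (entity_type, text) spans.

    A stray I- tag whose type does not match the open entity is
    treated as the start of a new entity.
    """
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Close the open entity, if any, and record it.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        starts_new = label.startswith("B-") or (
            label.startswith("I-") and label[2:] != current_type
        )
        if starts_new:
            flush()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-"):
            # Continuation of the open entity.
            current_tokens.append(token)
        else:
            # "O" (or any other tag) ends the open entity.
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


tokens = ["My", "name", "is", "Ada", "Lovelace", "."]
labels = ["O", "O", "O", "B-NAME_STUDENT", "I-NAME_STUDENT", "O"]
spans = bio_to_spans(tokens, labels)
print(spans)
```

Joining tokens with a single space is a simplification; restoring the original spacing would need whitespace or character-offset information for each token.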

Blogs (in Chinese)

Zhihu: [LLM Fine-Tuning] Qwen SFT: QLoRA Fine-Tuning with the trl Framework
CSDN: [LLM Fine-Tuning] Qwen SFT: QLoRA Fine-Tuning with the trl Framework
