- Given the premise sentence, classify the hypothesis sentence as true (entailment), false (contradiction), or neutral.
premise: Ssireum is a representative men's game handed down from ancient times, in which boys or young men gather on a wide, flat sandy beach or yard and compete in strength and wit.
hypothesis: Ssireum is a game for women.
label: contradiction
- Accuracy
- Public: a random 60% sample of the test data
- Private: the full test data
- Train: 24998
- Test: 1666
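The accuracy metric and the public/private split described above can be sketched as follows. This is a minimal illustration, not the competition's scoring code: the function names, the seed, and the toy labels are all assumptions, and the organizers' actual sampling procedure is not documented here.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def public_private_scores(y_true, y_pred, public_ratio=0.6, seed=0):
    """Score a random `public_ratio` subset (public) and the full set (private)."""
    n = len(y_true)
    rng = random.Random(seed)  # hypothetical seed, chosen only for reproducibility
    public_idx = rng.sample(range(n), round(n * public_ratio))
    public = accuracy([y_true[i] for i in public_idx],
                      [y_pred[i] for i in public_idx])
    private = accuracy(y_true, y_pred)
    return public, private


LABELS = ["entailment", "contradiction", "neutral"]
# Toy gold labels and predictions, not real competition data.
gold = [LABELS[i % 3] for i in range(10)]
pred = gold[:8] + ["neutral", "neutral"]
public_acc, private_acc = public_private_scores(gold, pred)
print(f"public={public_acc:.3f} private={private_acc:.3f}")
```

With 1666 test examples, a 60% sample corresponds to roughly 1000 examples on the public leaderboard.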
GPU: Colab Pro P100
pip install -r requirements.txt
unzip -q './data/open.zip' -d './data'
# Train
python train.py --explain
# Inference
python inference.py --explain
+- data
| +- klue-nli-v1.1.tar.gz (used to build klue_dev)
| +- klue_dev.csv (KLUE OFFICIAL dev dataset)
| +- kor_nli_valid.csv (kakaobrain - kornli dataset)
| +- open.zip (Original Dataset)
+- EDA
| +- EDA.ipynb (Dataset EDA)
| +- aug_dataset.ipynb (Dataset augmentation => klue_dev.csv & kor_nli_valid.csv)
+- utils
| +- collate_functions.py
| +- loss.py
| +- mk_data.py
| +- nlpdata_eda.py
| +- random_seed.py
+- requirements.txt
+- dataset.py
+- model.py
+- train.py
+- train_kfold.py
+- inference.py
- KLUE/RoBERTa-large + Classifier Head with Hyperparameter Tuning (used as the baseline)
  - NLI model with KLUE/RoBERTa-large as the backbone
  - Performance improved through a range of hyperparameter tuning experiments
- Self-Explaining Structures Improve NLP Models (see the Paper Review)
  - KLUE/RoBERTa-large used as the backbone (intermediate layer)
  - SIC layer added (combines the backbone model's outputs with one another) => passes span information forward
  - Interpretation layer added => extracts a weight for each span
  - The extracted weights and the span information are combined by weighted sum to produce the final output
- Cleaning and use of external datasets
  - KLUE OFFICIAL Dev Dataset
  - Only the human-translated data from the KakaoBrain KorNLI Dataset (adds data similar to the original dataset)
- Out of Fold Ensemble
  - Ensemble of the Stratified KFold models
  - Baseline + Explaining Model Ensemble
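The self-explaining structure listed above (SIC layer, interpretation layer, weighted sum) can be sketched in plain Python so that the arithmetic is visible. This is a simplified illustration under stated assumptions, not the code in `model.py`: the span formula loosely follows the referenced paper, the weight matrices and token vectors are random toy values, and every function name here is hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sic_layer(hidden, W1, W2, W3, W4):
    """Build one vector per span (i, j), i <= j, from the token vectors."""
    spans, reps = [], []
    for i in range(len(hidden)):
        for j in range(i, len(hidden)):
            hi, hj = hidden[i], hidden[j]
            diff = [a - b for a, b in zip(hi, hj)]
            prod = [a * b for a, b in zip(hi, hj)]
            parts = [matvec(W1, hi), matvec(W2, hj),
                     matvec(W3, diff), matvec(W4, prod)]
            # Sum the four projections element-wise, then squash with tanh.
            reps.append([math.tanh(sum(vals)) for vals in zip(*parts)])
            spans.append((i, j))
    return spans, reps


def interpretation_layer(reps, scorer):
    """Score each span, softmax the scores, and return the weighted sum."""
    scores = [sum(s * r for s, r in zip(scorer, rep)) for rep in reps]
    weights = softmax(scores)
    dim = len(reps[0])
    pooled = [sum(w * rep[d] for w, rep in zip(weights, reps))
              for d in range(dim)]
    return weights, pooled


rng = random.Random(0)
dim, n_tokens = 4, 5


def rand_matrix():
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


# Toy stand-in for the backbone's token-level hidden states.
hidden = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_tokens)]
spans, reps = sic_layer(hidden, rand_matrix(), rand_matrix(),
                        rand_matrix(), rand_matrix())
weights, pooled = interpretation_layer(
    reps, [rng.uniform(-1, 1) for _ in range(dim)])
```

In a full model, `pooled` would feed the classification head, and `weights` is what makes the prediction inspectable: each entry says how much one span contributed.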
| | Single Baseline (train:valid = 8:2) | Single Self-Explaining (train:valid = 8:2) |
|---|---|---|
| Accuracy | 0.872 | 0.864 |

| | Baseline KFold | Self-Explaining KFold | Baseline + Self-Explaining (Public) | Baseline + Self-Explaining (Private) |
|---|---|---|---|---|
| Accuracy | 0.888 | 0.874 | 0.89 | 0.89015 |
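One common way to build ensembles like the KFold and Baseline + Self-Explaining results reported above is soft voting: average the class probabilities from each fold, then average across models, and take the argmax. The sketch below assumes soft voting, which this README does not confirm, and uses made-up probabilities and hypothetical function names.

```python
def average_probs(prob_sets):
    """Average class probabilities element-wise over several models or folds.

    prob_sets[m][n][c] is the probability that model m assigns to class c
    for example n.
    """
    n_models = len(prob_sets)
    return [
        [sum(col) / n_models for col in zip(*rows)]
        for rows in zip(*prob_sets)
    ]


def predict(probs):
    """Return the index of the most probable class for each example."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Toy per-fold probabilities for two examples and three classes.
baseline_folds = [
    [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]],
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
]
explaining_folds = [
    [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    [[0.6, 0.2, 0.2], [0.2, 0.2, 0.6]],
]

baseline = average_probs(baseline_folds)        # fold ensemble, model 1
explaining = average_probs(explaining_folds)    # fold ensemble, model 2
ensemble = average_probs([baseline, explaining])
labels = predict(ensemble)
```

In this toy data the two models disagree on the second example, and the averaged probabilities settle the disagreement in favor of the more confident model.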
- Error analysis for the Self-Explaining model
- Print which parts of the input the model weighted most when making each prediction
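A possible starting point for that analysis is to rank spans by their interpretation weight and print the top few as text. The helper below is a hypothetical sketch: the tokens, spans, and weights are toy values, and it assumes the model exposes one weight per `(start, end)` token span.

```python
def top_spans(tokens, spans, weights, k=3):
    """Return the k spans with the largest weights as (text, weight) pairs."""
    ranked = sorted(zip(weights, spans), key=lambda pair: pair[0], reverse=True)
    return [(" ".join(tokens[i:j + 1]), round(w, 3)) for w, (i, j) in ranked[:k]]


# Toy example: inclusive (start, end) token indices with made-up weights.
tokens = ["ssireum", "is", "a", "game", "for", "women"]
spans = [(0, 0), (3, 5), (5, 5), (1, 2)]
weights = [0.15, 0.55, 0.25, 0.05]

for text, weight in top_spans(tokens, spans, weights):
    print(f"{weight:.3f}  {text}")
```

Comparing the top spans for misclassified examples against correctly classified ones is one way to see where the model's attention goes wrong.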