An implementation of "Deep Biaffine Attention for Neural Dependency Parsing" (Dozat and Manning, 2017).
Details and hyperparameter choices almost exactly follow those described in the paper, except for some training settings. We also do not provide a decoding algorithm to guarantee the well-formedness of output trees; in practice this does not noticeably affect the results.
Another version of the implementation is available on the `char` branch; it replaces the tag embedding with a character-level LSTM and achieves better performance.
- `python == 3.7.0`
- `pytorch == 1.0.0`
The model is evaluated on the Stanford Dependency conversion (v3.3.0) of the English Penn Treebank, with POS tags predicted by the Stanford POS tagger.
We follow the conventional data splits:
- Train: 02-21 (39,832 sentences)
- Dev: 22 (1,700 sentences)
- Test: 23 (2,416 sentences)
|               | UAS   | LAS   |
|---------------|-------|-------|
| tag embedding | 95.87 | 94.19 |
| char lstm     | 96.17 | 94.53 |
Note that punctuation is excluded from all evaluation metrics.
Aside from using consistent hyperparameters, there are a few key points that significantly affect the performance:
- Dividing the pretrained embedding by its standard deviation
- Applying the same dropout mask at every recurrent timestep
- Jointly dropping the words and tags
For the above reasons, we have to give up some native PyTorch modules (e.g., `LSTM` and `Dropout`) and use self-implemented ones instead.
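As a minimal sketch of what such a self-implemented module might look like, here is a dropout layer that samples one mask per sequence and reuses it at every recurrent timestep (the class name `SharedDropout` and the tensor shapes are illustrative assumptions, not the repo's actual code):

```python
import torch
import torch.nn as nn

class SharedDropout(nn.Module):
    """Dropout that reuses a single mask across all timesteps.

    Standard nn.Dropout resamples the mask at every position; here one
    Bernoulli mask of shape [batch, 1, hidden] is sampled per sequence
    and broadcast over the time dimension.
    """

    def __init__(self, p=0.5):
        super().__init__()
        self.p = p

    def forward(self, x):
        # x: [batch, seq_len, hidden]
        if not self.training or self.p == 0:
            return x
        # one mask per sequence, shared by every timestep via broadcasting
        mask = x.new_empty(x.shape[0], 1, x.shape[2]).bernoulli_(1 - self.p)
        # rescale so the expected activation is unchanged (inverted dropout)
        return x * mask / (1 - self.p)
```

The first point above is then just a one-liner when loading the vectors, e.g. `embed /= torch.std(embed)`.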
As shown above, our results, especially those of the char lstm version, outperform the official implementation (95.74 UAS and 94.08 LAS).
You can start the training, evaluation, and prediction processes via the subcommands registered in `parser.commands`.
```sh
$ python run.py -h
usage: run.py [-h] {evaluate,predict,train} ...

Create the Biaffine Parser model.

optional arguments:
  -h, --help            show this help message and exit

Commands:
  {evaluate,predict,train}
    evaluate            Evaluate the specified model and dataset.
    predict             Use a trained model to make predictions.
    train               Train a model.
```
Before running any subcommand, please make sure that the data files are in CoNLL-X format. If some fields are missing, you can fill them with underscores as placeholders.
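For reference, CoNLL-X uses ten tab-separated columns per token (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL). A toy sentence with the unused fields filled by underscores looks like this:

```
1	She	_	_	PRP	_	2	nsubj	_	_
2	enjoys	_	_	VBZ	_	0	root	_	_
3	reading	_	_	VBG	_	2	xcomp	_	_
4	.	_	_	.	_	2	punct	_	_
```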
Optional arguments of the subparsers are as follows:
```sh
$ python run.py train -h
usage: run.py train [-h] [--ftrain FTRAIN] [--fdev FDEV] [--ftest FTEST]
                    [--fembed FEMBED] [--device DEVICE] [--seed SEED]
                    [--threads THREADS] [--file FILE] [--vocab VOCAB]

optional arguments:
  -h, --help            show this help message and exit
  --ftrain FTRAIN       path to train file
  --fdev FDEV           path to dev file
  --ftest FTEST         path to test file
  --fembed FEMBED       path to pretrained embedding file
  --device DEVICE, -d DEVICE
                        ID of GPU to use
  --seed SEED, -s SEED  seed for generating random numbers
  --threads THREADS, -t THREADS
                        max num of threads
  --file FILE, -f FILE  path to model file
  --vocab VOCAB, -v VOCAB
                        path to vocabulary file
```
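For example, a training run might look like the following (the data and output paths here are placeholders; substitute your own files):

```sh
$ python run.py train --ftrain data/train.conllx \
                      --fdev data/dev.conllx \
                      --ftest data/test.conllx \
                      --fembed data/glove.6B.100d.txt \
                      --file model.pt \
                      --vocab vocab.pt \
                      --device 0
```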
```sh
$ python run.py evaluate -h
usage: run.py evaluate [-h] [--batch-size BATCH_SIZE] [--include-punct]
                       [--fdata FDATA] [--device DEVICE] [--seed SEED]
                       [--threads THREADS] [--file FILE] [--vocab VOCAB]

optional arguments:
  -h, --help            show this help message and exit
  --batch-size BATCH_SIZE
                        batch size
  --include-punct       whether to include punctuation
  --fdata FDATA         path to dataset
  --device DEVICE, -d DEVICE
                        ID of GPU to use
  --seed SEED, -s SEED  seed for generating random numbers
  --threads THREADS, -t THREADS
                        max num of threads
  --file FILE, -f FILE  path to model file
  --vocab VOCAB, -v VOCAB
                        path to vocabulary file
```
```sh
$ python run.py predict -h
usage: run.py predict [-h] [--batch-size BATCH_SIZE] [--fdata FDATA]
                      [--fpred FPRED] [--device DEVICE] [--seed SEED]
                      [--threads THREADS] [--file FILE] [--vocab VOCAB]

optional arguments:
  -h, --help            show this help message and exit
  --batch-size BATCH_SIZE
                        batch size
  --fdata FDATA         path to dataset
  --fpred FPRED         path to predicted result
  --device DEVICE, -d DEVICE
                        ID of GPU to use
  --seed SEED, -s SEED  seed for generating random numbers
  --threads THREADS, -t THREADS
                        max num of threads
  --file FILE, -f FILE  path to model file
  --vocab VOCAB, -v VOCAB
                        path to vocabulary file
```