CYK Parser

This repository contains the work for TP2 of the Natural Language Processing course in the MVA master's program.

Dependencies

Requirements: Python>=3.6, nltk, numpy, pyevalb, pickle (part of the Python standard library).

Usage

The code can be run in two modes:

--mode dataset -d <path_to_dataset> evaluates the parser on a dataset in the same format as the sequoia corpus. By default, it runs on the last 10% of the sequoia corpus.
--mode sentence -s "Your sentences to parse" runs the parser on a single sentence. Tokens must be separated by a single whitespace. The input can also span several lines, with one sentence per line.
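The input convention for sentence mode (one sentence per line, tokens separated by a single whitespace) can be illustrated with a short sketch. The function name `read_sentences` is hypothetical and is not taken from the repository; it only shows how such input could be split into token lists.

```python
def read_sentences(text):
    """Split raw input into sentences (one per line) and tokens (single spaces).

    Hypothetical helper for illustration; the repository may read its input differently.
    """
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sentences.append(line.split(" "))
    return sentences


print(read_sentences("Un ours mange une pomme .\nLe chat dort ."))
# [['Un', 'ours', 'mange', 'une', 'pomme', '.'], ['Le', 'chat', 'dort', '.']]
```

Note that punctuation has to be written as its own token ("pomme ." rather than "pomme."), as in the example sentence used in this README.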

Other options:

-h : shows help for all the available options
-w True : if given, the output for a dataset is written to <datasetPath>\<dataset_name>_output.txt. Sentences from the dataset that cannot be parsed are written to non_parsable.txt.
-e <path_to_embedding> : if given, the file is used for out-of-vocabulary (OOV) word embeddings. The default embedding file is polyglot-fr.pkl.
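This README does not describe how the embedding file is used for out-of-vocabulary words. One common approach, shown here purely as an assumption, is to replace an unknown word with the known word whose embedding is most similar. The toy vectors, the `lexicon` set, and the function names below are all hypothetical; a real embedding file would supply the vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_known_word(oov_word, embeddings, lexicon):
    """Return the lexicon word whose embedding is closest to that of oov_word.

    Returns None when the word has no embedding or no lexicon word has one;
    a real system would need a fallback for that case.
    """
    if oov_word not in embeddings:
        return None
    target = embeddings[oov_word]
    candidates = [w for w in lexicon if w in embeddings]
    if not candidates:
        return None
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


# Toy 3-dimensional embeddings (made up for this example).
toy_embeddings = {
    "ours": [0.9, 0.1, 0.0],
    "mange": [0.0, 0.9, 0.3],
    "loup": [0.85, 0.15, 0.05],
}
lexicon = {"ours", "mange"}  # words assumed to be known to the grammar

print(nearest_known_word("loup", toy_embeddings, lexicon))  # ours
```

The parser could then treat "loup" as if it were "ours" when looking up part-of-speech tags.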

Output

If mode dataset : the output is a set of metrics measuring parsing accuracy on the dataset. If -w is given, see Usage for the files that are written.
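The exact metrics are not listed here. As an illustration of what parsing metrics typically look like, the sketch below computes labeled bracket precision, recall and F1 between a gold tree and a predicted tree, in the spirit of PARSEVAL. It is a simplified stand-in written for this README, not the evaluation code of the repository, and the function names are hypothetical.

```python
from collections import Counter


def labeled_spans(bracketed):
    """Return a multiset of (label, start, end) for each phrase in a bracketed tree.

    Part-of-speech nodes (those directly above a word) are skipped.
    Assumes that words themselves contain no parentheses.
    """
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []    # open nodes: [label, start index, contains a sub-node?]
    position = 0  # number of words read so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            if stack:
                stack[-1][2] = True  # the parent contains a sub-node
            stack.append([tokens[i + 1], position, False])
            i += 2  # skip the parenthesis and the label
        elif tok == ")":
            label, start, is_phrase = stack.pop()
            if is_phrase:
                spans[(label, start, position)] += 1
            i += 1
        else:
            position += 1  # a word
            i += 1
    return spans


def bracket_scores(gold, predicted):
    """Return (precision, recall, F1) over labeled phrase spans."""
    gold_spans, pred_spans = labeled_spans(gold), labeled_spans(predicted)
    matched = sum((gold_spans & pred_spans).values())
    precision = matched / sum(pred_spans.values()) if pred_spans else 0.0
    recall = matched / sum(gold_spans.values()) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = "(SENT (NP (DET Un) (NC ours)) (VN (V mange)) (NP (DET une) (NC pomme)) (PONCT .))"
pred = "(SENT (NP (DET Un) (NC ours)) (VN (V mange) (NP (DET une) (NC pomme))) (PONCT .))"
print(bracket_scores(gold, pred))  # (0.75, 0.75, 0.75)
```

In this example three of the four predicted phrases match the gold tree; the VN span differs because the prediction attaches the object inside it.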

If mode sentence : the output is the parse tree of the input sentence.
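To give an idea of how a CYK parser builds such a tree, here is a minimal probabilistic CYK sketch over a toy grammar in Chomsky normal form. The grammar, its probabilities, and the function name `cyk_parse` are made up for illustration; the actual parser presumably learns its grammar from the corpus, and its output format may differ from the bracketed string printed here.

```python
import math

# Toy grammar in Chomsky normal form (hypothetical, not learned from a corpus).
BINARY_RULES = {  # (left child, right child) -> [(parent, probability)]
    ("DET", "NC"): [("NP", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
    ("NP", "VP"): [("S", 1.0)],
    ("S", "PONCT"): [("SENT", 1.0)],
}
LEXICON = {  # lowercased word -> [(part-of-speech tag, probability)]
    "un": [("DET", 0.5)], "une": [("DET", 0.5)],
    "ours": [("NC", 0.5)], "pomme": [("NC", 0.5)],
    "mange": [("V", 1.0)], ".": [("PONCT", 1.0)],
}


def cyk_parse(tokens, root="SENT"):
    """Return the most probable bracketed parse of tokens, or None if there is none."""
    n = len(tokens)
    # chart[i][j] maps a symbol to (log-probability, bracketed subtree) for tokens[i:j]
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(tokens):
        for tag, prob in LEXICON.get(word.lower(), []):
            chart[i][i + 1][tag] = (math.log(prob), f"({tag} {word})")
    for width in range(2, n + 1):          # span length
        for start in range(n - width + 1):
            end = start + width
            cell = chart[start][end]
            for split in range(start + 1, end):
                for left, (lp, ltree) in chart[start][split].items():
                    for right, (rp, rtree) in chart[split][end].items():
                        for parent, prob in BINARY_RULES.get((left, right), []):
                            score = lp + rp + math.log(prob)
                            # keep only the best derivation per symbol
                            if parent not in cell or score > cell[parent][0]:
                                cell[parent] = (score, f"({parent} {ltree} {rtree})")
    best = chart[0][n].get(root)
    return best[1] if best else None


print(cyk_parse("Un ours mange une pomme .".split(" ")))
# (SENT (S (NP (DET Un) (NC ours)) (VP (V mange) (NP (DET une) (NC pomme)))) (PONCT .))
```

The intermediate symbols S and VP exist only because the toy grammar is binary; a grammar binarized from a treebank would normally be converted back to the original labels before printing. A sentence with no derivation from the root symbol returns None, which corresponds to the non-parsable case mentioned in Usage.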

Examples

Parse the sentence "Un ours mange une pomme .":

./run.sh -s "Un ours mange une pomme ."

Get accuracy on the validation dataset :

./run.sh --mode dataset

Repository: HugoSenetaire/CYK-Parser
