
Help with script inputs, training, predicting and evaluation #138

Closed
eedenong opened this issue Nov 10, 2021 · 7 comments

@eedenong

Hi! After following the training and prediction steps highlighted in the README, my evaluation scores are quite poor. I am not sure whether this is due to misformatting the training data, so I would like to seek some help here. These are the steps I took for training, prediction, and evaluation:

All of this was done on Google Colab.

Data preprocessing
I used the FCE dataset to generate the train and dev sets. Specifically:

  1. Use the error.py script from the PIE repo (https://github.com/awasthiabhijeet/PIE/tree/master/errorify) to generate the parallel text files correct.txt and incorrect.txt

  2. Use preprocess_data.py from the GECToR repo to generate the output files (train.txt and dev.txt respectively)
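Before running preprocess_data.py, a quick sanity check on the parallel files can catch misalignment early, since the script pairs the two files line by line. This is only a sketch; the file names follow step 1 above, and the "no empty lines" assumption is mine:

```python
# Sanity-check a parallel corpus: both files must have the same number of
# lines (they are paired line by line), and no line should be empty.
def check_parallel(src_path, tgt_path):
    with open(src_path, encoding="utf-8") as f:
        src = f.read().splitlines()
    with open(tgt_path, encoding="utf-8") as f:
        tgt = f.read().splitlines()
    assert len(src) == len(tgt), f"line counts differ: {len(src)} vs {len(tgt)}"
    for i, (s, t) in enumerate(zip(src, tgt), start=1):
        assert s.strip() and t.strip(), f"empty sentence at line {i}"
    return len(src)
```

Running this on incorrect.txt and correct.txt before preprocessing would have ruled out one common source of garbage training pairs.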

Model training
Then, I trained the model using the generated train.txt and dev.txt. My Google Colab runtime timed out partway through training.

Prediction and evaluation
Afterwards, I ran the prediction script on train_incorr_sentences.txt from the PIE repository (https://github.com/awasthiabhijeet/PIE/tree/master/scratch) to obtain the predictions as preds_output.m2. The model path specified pointed to the best.th file in the model_output folder.

Then, I used the two parallel text files from the same PIE folder (train_incorr_sentences.txt and train_corr_sentences.txt) to generate a reference file, ref_output.m2.

Then, I ran the m2scorer script with the SYSTEM argument set to preds_output.m2 and SOURCE_GOLD set to ref_output.m2.

These were the resulting scores (after a single stage of training, i.e. stage 1):
Precision: 0.0831
Recall: 0.0780
F0.5: 0.0820
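For context, F0.5 weights precision twice as heavily as recall, which is why the reported value sits closer to the precision figure. A minimal computation of the standard F-beta formula, plugged with the numbers reported above:

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

score = f_beta(0.0831, 0.0780)  # ~0.0820, matching the scores above
```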

I am not sure whether I am using the wrong datasets or passing them to the wrong scripts, as there isn't much documentation specifying exactly which files, and in what format, to pass in at each step. It would be a very big help if someone could point me to the specifics of what data I should be using for each step, and whether I am processing it correctly!

I also read that you did 3 stages of training. Is this expected behaviour after only the first stage?

@skurzhanskyi
Collaborator

Hi @eedenong

Please take a look at the corresponding README sections if you want to reproduce the results in the paper.

  1. The data for the first stage could be found here, as it was mentioned in the Dataset section.
  2. It looks like you're using default parameters for training. We explained in detail our parameters at each stage here.
  3. From what I see, you used preprocess_data.py correctly (errorful data as a source and error-free data as a target). You may also want to look at similar issues (Format of SOURCE and TARGET #136, Are source correct txt file and target incorrect txt file in prerprocessing? #104, What kind of data format do you use? #53).
  4. You can take a look at our scores after each stage in Table 4 in the paper.

@eedenong
Author

Thank you, I will take a look at them!

Just to clarify: for the model inference input file, should it be in m2 format or txt format, and should it be a dataset of incorrect sentences to be corrected? In that case, will it suffice to simply use a dataset of incorrect sentences, for example a1_train_incorr_sentences.txt from the PIE synthetic dataset? So far, the issues I have seen only discuss the formats of the text files with regard to preprocessing.

@skurzhanskyi
Collaborator

skurzhanskyi commented Nov 10, 2021

If you're talking about predict.py, its input is the incorrect sentences.
At the prediction stage, the model shouldn't require the correct output as part of the input, so the m2 and preprocess_data.py formats don't fit here.
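In other words, the inference input is plain text with one incorrect sentence per line, with no annotations and no target side. A hypothetical sketch of preparing such a file (the file name and the pre-tokenized, space-separated sentences are my assumptions, not part of the GECToR API):

```python
# Write a prediction input file: plain text, one tokenized incorrect
# sentence per line. No m2 annotations and no corrected side are included.
sentences = [
    "He go to school .",
    "She like cats .",
]
with open("predict_input.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(sentences) + "\n")
```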

@eedenong
Author

eedenong commented Nov 11, 2021

For the prediction stage, the model should require correct output as part of the input.

Regarding this, may I know which input you are referring to? Are you referring to the --output_file argument that is passed into predict.py, or do you mean that the correct output should be in the same text file as the --input_file for predict.py?

@skurzhanskyi
Collaborator

Oh, sorry. I meant
For the prediction stage, the model shouldn't require correct output as part of the input

@eedenong
Author

I see, thank you! I have another query:

The data for the first stage could be found here, as it was mentioned in the Dataset section.

May I clarify: am I supposed to generate the 98/2 train/dev split from the single file produced by preprocess_data.py? Or am I supposed to find separate train/dev sets and preprocess them separately to generate train.txt and dev.txt?

@skurzhanskyi
Collaborator

The results will be the same.
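Either way, a reproducible 98/2 split of the preprocessed file can be sketched as a shuffled line-level split. This is only an illustration; the seed, fraction, and function name are my own choices:

```python
import random

def split_lines(lines, dev_fraction=0.02, seed=42):
    """Shuffle and split lines into (train, dev) with the given dev fraction.

    A fixed seed makes the split reproducible across runs.
    """
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]
```

Writing the two returned lists to train.txt and dev.txt gives the 98/2 split either from one preprocessed file or from separately preprocessed sets.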
