Contains all of the equation images used to evaluate the CNN model
Contains the csv(s) of the evaluated images
- Column Names: path, gt, predicted output, % similarity, WER
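The last two columns can be derived from the gt and predicted output strings. The sketch below is only an illustration: it assumes each character of an equation counts as one token, computes WER as edit distance divided by ground-truth length, and uses difflib for the similarity percentage. The function names are hypothetical and the notebook may compute these values differently.

```python
from difflib import SequenceMatcher


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(gt, predicted):
    """Edits needed to turn predicted into gt, divided by len(gt).

    Assumption: every character (digit or symbol) is one token.
    """
    ref, hyp = list(gt), list(predicted)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def percent_similarity(gt, predicted):
    """Similarity in percent, based on difflib's matching ratio."""
    return 100.0 * SequenceMatcher(None, gt, predicted).ratio()
```

For example, `wer("1+2=3", "1-2=3")` counts one substitution over five ground-truth tokens.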
Contains the csv(s) of the ground truth for the equation images
- Column Names: path, gt
Contains the tsv(s) of the ground truth for the equation images
Contains the images used for comparison with the RCNN model
Contains the folders that have been processed by the bounding box & cropping function
- Note: each directory has its own ground_truth_csv
First step of pre-processing: processes the folder that stores the equation images
Note: change eqn_folder to a new name for each run so that older folders are not overwritten
- Creates a directory at processed_data/<eqn_folder>
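One way to guard against overwriting an older folder is to let directory creation fail when the target already exists. This is a minimal sketch, not the notebook's actual code: make_output_dir is a hypothetical helper, and only the processed_data/<eqn_folder> layout comes from this README.

```python
import os


def make_output_dir(eqn_folder, root="processed_data"):
    """Create <root>/<eqn_folder> and refuse to reuse an existing folder."""
    out_dir = os.path.join(root, eqn_folder)
    # exist_ok=False raises FileExistsError if the folder is already there,
    # so earlier processed results are never silently replaced.
    os.makedirs(out_dir, exist_ok=False)
    return out_dir
```

If the call raises FileExistsError, pick a new eqn_folder name and run again.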
Retrieves the ground truths from the folder containing the equation images and stores them in a .csv file
Contains the CNN model and the evaluation functions, which store the evaluated results in evaluation_csv
Contains all the individual digit and symbol images used to train the CNN model. The folder is too large to be uploaded to the GitHub repo
Splits all the individual digit and symbol images into 90% train and 10% test sets, which are stored in final_82
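A 90/10 split of this kind can be sketched as follows. The helper name, the fixed seed, and the idea of calling it once per class folder are assumptions for illustration; the actual script may split differently.

```python
import random


def split_train_test(paths, test_fraction=0.1, seed=0):
    """Shuffle the image paths and split them into (train, test) lists."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(paths)     # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Calling it separately on the image list of each class would keep the 90/10 ratio within every class rather than only overall.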
Please make sure you change the folder path in every function you use:
- pre_processing_from_dir: directory containing your training/test images
- pd.read_csv: csv file containing your ground truth
- pre_processing_from_test: folder under processed_data containing the pre-processed equation images
- filtered_data.to_csv: change the output file name as needed
- Run all the cells under the subheading "Global Variable". Edit these variables before running, as the defaults may overwrite a previously trained model.
- Run all the cells under the subheading "Running model on Training Data (Digits & Symbols Images only (17 classes))".
- Make sure you run the model before doing this (model can take a few hours depending on your hyperparameters)
- Images have to be pre-processed first using Draw bounding box.ipynb
- A folder is then created and stored under the processed_data folder
- Run all the cells under the subheading "Retrieve Ground Truth from csv".
- Note: Ground Truth has to be in .csv format