In this mini-project, we are given 3 binary classification datasets. All three represent the same
machine learning task (posed as a binary classification problem) and were generated from the same
raw dataset. The 3 datasets (described in Section 2) differ only in the features used to represent
each input from the original raw dataset. Each dataset consists of a training set, a validation
set, and a test set. The validation set can be used for selecting the best hyperparameters or for
any other analysis. The validation set labels are provided, but the test set labels are hidden.
A fuller description of the task is in the project description PDF.
A detailed explanation of how we arrived at the final models, along with statistics on the amount
of training data used and the resulting accuracy, is in group_no_61_project_report.pdf.
The final model for question 1 is in question1_feature_extraction_thenSVm.py. To train it on a
different percentage of the data, change the test_size parameter of train_test_split on line 102.
The final model for question 2 is in question2.ipynb. It can likewise be trained on a different
percentage of the data by changing test_size in its train_test_split call.
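As a rough illustration of how test_size controls the share of data used for training, here is a
minimal sketch. It is not the code from either file: the helper name split_with_train_fraction, the
synthetic data, and the fixed seed are all assumptions made for the example. The only thing taken
from the project is the idea of varying test_size in scikit-learn's train_test_split.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_with_train_fraction(X, y, train_fraction, seed=0):
    """Hypothetical helper: keep `train_fraction` of the rows for training.

    train_test_split holds out `test_size` of the rows, so training on
    20% of the data means passing test_size=0.8.
    """
    return train_test_split(
        X, y,
        test_size=1.0 - train_fraction,
        random_state=seed,  # fixed seed so repeated runs give the same split
    )


# Synthetic stand-in data (assumption): 100 rows, 4 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.array([0, 1] * 50)

for frac in (0.2, 0.5, 0.8):
    X_tr, X_te, y_tr, y_te = split_with_train_fraction(X, y, frac)
    print(f"train fraction {frac:.0%}: {len(X_tr)} train rows, {len(X_te)} held-out rows")
```

In the actual scripts, only the test_size value needs to change; the rest of the training code
stays the same.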
61.py contains all the final models and the combined model. Running it saves the predictions to
pred_text.txt, pred_feat.txt, pred_emoticon.txt, and pred_combined.txt, and also reports the
accuracy on the validation data. It requires the numpy, pandas, PyTorch, TensorFlow, and
scikit-learn (sklearn) libraries to be installed. For it to run correctly, each dataset must be
stored in the directory layout given in the project.
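For orientation, the save-and-score step could look roughly like the sketch below. This is not the
code in 61.py: the function names save_predictions and accuracy are hypothetical, the toy labels
are made up, and the one-label-per-line file format is an assumption. Only the output file name
pred_text.txt comes from the project.

```python
import tempfile
from pathlib import Path


def save_predictions(path, predictions):
    """Write one integer label per line (assumed format, not confirmed by the project)."""
    lines = "\n".join(str(int(p)) for p in predictions)
    Path(path).write_text(lines + "\n")


def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction lists must have the same length")
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy validation labels and predictions (made up for the example).
val_labels = [1, 0, 1, 1]
val_preds = [1, 0, 0, 1]
print("validation accuracy:", accuracy(val_labels, val_preds))

# Write to a temporary directory so the example does not overwrite real output files.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "pred_text.txt"
    save_predictions(out_path, val_preds)
    print(out_path.read_text())
```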
I have also included the prediction files, under the names above, as generated when run on my system.