clc
clear
%{
The following script was written as part of a university project and was
later extended into a fully working Optical Character Recognition (OCR)
engine for English capital letters.
IMPORTANT: It requires LIBSVM to be installed and added to the MATLAB path
in order to run properly. Otherwise the call to 'ovrpredict' will raise an
error.
Due to the simplistic nature of this project, the image must contain only
the letters, arranged side by side. It is highly advised that TestImage is
binary in its initial form.
Tests on the test set showed 93% prediction accuracy, which is optimistic
because the system suffers from variance. This may be patched in the
future.
An example image is available at: https://github.com/stavskal/OCR-English-Capital-Letters
%}
%The pretrained model is saved under the same directory in models93.mat
load('models93.mat')
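%A minimal sketch of an early sanity check (not part of the original
%script). It assumes that 'ovrpredict' and LIBSVM's 'svmpredict' are both
%expected on the MATLAB path and that models93.mat defines 'model'.
%The error identifiers are illustrative names.
if ~exist('ovrpredict','file') || ~exist('svmpredict','file')
    error('OCR:missingLIBSVM','ovrpredict/svmpredict not found on the MATLAB path.');
end
if ~exist('model','var')
    error('OCR:missingModel','models93.mat did not define a variable named ''model''.');
end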
%Image containing to-be-recognized letters
image=imread('TestImage.png');
imshow(image);
%First step of preprocessing
%Convert RGB -> grayscale (only if the image has three channels) -> binary
if size(image,3)==3
    image=rgb2gray(image);
end
image=im2double(image);
imageComp=im2bw(image,graythresh(image));
rec=1;
%The connected components algorithm labels nonzero (white) pixels, so the
%binary image is inverted to make the dark letters the foreground
temp=ones([size(image,1) size(image,2)]);
imageComp=temp-imageComp;
%cc is a struct describing the connected components (4-connectivity)
cc=bwconncomp(imageComp,4);
%The label matrix has the same size as the image: pixels of the first
%component are set to 1, pixels of the second to 2, etc.
Label=labelmatrix(cc);
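%A hedged alternative sketch (not used by the rest of this script): the
%per-letter bounding boxes can also be read directly from regionprops.
%Each BoundingBox is [x y width height], where (x,y) is the upper-left
%corner offset by half a pixel, so ceil recovers the first column and row.
%'stats' and 'boxes' are illustrative names, not part of the original code.
stats=regionprops(cc,'BoundingBox');
boxes=ceil(vertcat(stats.BoundingBox));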
k=1;
for i=1:cc.NumObjects
    %Preprocessing of each letter: find the bounding box of the i-th
    %component in two passes. Pass 1 finds its column range, pass 2
    %(on the rotated crop) finds its row range.
    for j=1:2
        if j==1
            %Linear indices are column-major, so the first and last
            %indices give the leftmost and rightmost columns of the letter
            ele=find(Label==i);
            x=size(Label,1);
            first1=ceil(ele(1)/x);
            last1=ceil(ele(end)/x);
            %Cut off everything left and right of the letter, then rotate
            %by 90 degrees so that the rows of the crop become columns
            letter=Label(:,first1:last1);
            letter=imrotate(letter,90);
        else
            %Same computation on the rotated crop: these columns are the
            %top and bottom rows of the letter in the original image
            ele=find(letter==i);
            if ~isempty(ele)
                x=size(letter,1);
                first=ceil(ele(1)/x);
                last=ceil(ele(end)/x);
            end
            letter=letter(:,first:last);
        end
    end
    %Rotate back to the original orientation
    letter=imrotate(letter,-90);
    %Crop the letter from the grayscale image, resize it to 25x25 and
    %store it as one row of the feature matrix X
    separatedLetter=image(first:last,first1:last1);
    test_image=double(imresize(separatedLetter,[25 25]));
    X(k,:)=test_image(:)';
    k=k+1;
end
%Dummy label vector with one entry per row of X
y=ones(size(X,1),1);
%Predict each letter using the pretrained 'model'
%The labels passed in (y==13) do not contribute to the predictions in any
%way; the argument is only there because ovrpredict requires it
[recc,a,qwd] = ovrpredict(double(y == 13), X, model);
%Convert class numbers to letters (ASCII): 1->A, 2->B, etc.
Handwritten_Text=char(recc+64)'