segfault at at ../ccstruct/matrix.h:256 #881

Shreeshrii · 2017-05-05T04:52:39Z

gdb --args lstmtraining -U ~/tess4training/eng/eng.unicharset \
>   --script_dir ../langdata   \
>   --append_index 5 --net_spec '[Lfx256 O1c105]' \
>   --continue_from ~/tess4training/englayer_from_eng/eng.lstm \
>   --model_output ~/tess4training/englayer_from_eng/englayer \
>   --train_listfile ~/tess4training/eng/eng.training_files.txt \
>   --target_error_rate 0.01 \
>   --debug_interval  -1
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from lstmtraining...done.
(gdb) run
Starting program: /usr/local/bin/lstmtraining -U /home/shree/tess4training/eng/eng.unicharset --script_dir ../langdata --append_index 5 --net_spec \[Lfx256\ O1c105\] --c
ontinue_from /home/shree/tess4training/englayer_from_eng/eng.lstm --model_output /home/shree/tess4training/englayer_from_eng/englayer --train_listfile /home/shree/tess4t
raining/eng/eng.training_files.txt --target_error_rate 0.01 --debug_interval -1
warning: Error disabling address space randomization: Success
warning: linux_ptrace_test_ret_to_nx: PTRACE_KILL waitpid returned -1: Interrupted system call
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Loaded file /home/shree/tess4training/englayer_from_eng/eng.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Continuing from /home/shree/tess4training/englayer_from_eng/eng.lstm
Other case É of é is not in unicharset
Appending a new network to an old one!!Setting unichar properties
Setting properties for script Common
Setting properties for script Latin
Num outputs,weights in serial:
  Lfx256:256, 394240
  Fc105:105, 26985
Total weights = 421225
Built network:[1,0,0,1[C5,5Ft16]Mp3,3Lfys64Lfx128Lrx128Lfx256Fc105] from request [Lfx256 O1c105]
Training parameters:
  Debug interval = -1, weights = 0.1, learning rate = 0.0001, momentum=0.9
[New Thread 0x7fea20f40700 (LWP 1158)]
Loaded 104/104 pages (1-104) of document /home/shree/tess4training/eng/eng.Arial.exp0.lstmf
[Thread 0x7fea20f40700 (LWP 1158) exited]
[New Thread 0x7fea1bff0700 (LWP 1159)]
Loaded 104/104 pages (1-104) of document /home/shree/tess4training/eng/eng.Times_New_Roman.exp0.lstmf
[Thread 0x7fea1bff0700 (LWP 1159) exited]
[New Thread 0x7fea1bff0700 (LWP 1160)]
[New Thread 0x7fea20f40700 (LWP 1161)]
[New Thread 0x7fea1acf0700 (LWP 1162)]
Iteration 0: ALIGNED TRUTH : used through € between NEW % J. should when High when We it
Iteration 0: BEST OCR TEXT : St$Zv#kv#qSZ]$vUZUyvHvkyEHkSZUZtvyvyvHvq$Rwq,qtHgE#gqZUSUZ#EqgqSqSqyvHvt$q£$R$RqtS$U,#,v$USUHZq,c$U$qSq,$#vyvHvSHZSZ=Z$[ [vESUSZUyvHvSqSqyHZ
vt,q€U€vlvyvySyHvqtSUZtyHvSqSqvqHq,U$U,UegqgqveZS
File /tmp/tmp.E8R3inud3B/eng/eng.Arial.exp0.lstmf page 6 :

Program received signal SIGSEGV, Segmentation fault.
operator+= (addend=..., this=0xb92048) at ../ccstruct/matrix.h:256
256               (*this)(x, y) += addend(x, y);
(gdb) backtrace
#0  operator+= (addend=..., this=0xb92048) at ../ccstruct/matrix.h:256
#1  tesseract::WeightMatrix::Update (this=0xb92048, learning_rate=<optimized out>, momentum=0.89999997615814209, num_samples=<optimized out>) at weightmatrix.cpp:261
#2  0x00007fea254b7083 in tesseract::Plumbing::Update (this=0xb91e80, learning_rate=1.24999988e-05, momentum=0.899999976, num_samples=1) at plumbing.cpp:219
#3  0x00007fea254b7083 in tesseract::Plumbing::Update (this=0xb91230, learning_rate=9.99999975e-05, momentum=0.899999976, num_samples=1) at plumbing.cpp:219
#4  0x00007fea254a8469 in tesseract::LSTMTrainer::TrainOnLine (this=this@entry=0x7ffff03ebaf0, trainingdata=<optimized out>, batch=batch@entry=false)
    at lstmtrainer.cpp:813
#5  0x0000000000406ff4 in TrainOnLine (batch=false, samples_trainer=0x7ffff03ebaf0, this=0x7ffff03ebaf0) at ../lstm/lstmtrainer.h:273
#6  main (argc=1, argv=0x7ffff03ec488) at lstmtraining.cpp:198
(gdb) up
#1  tesseract::WeightMatrix::Update (this=0xb92048, learning_rate=<optimized out>, momentum=0.89999997615814209, num_samples=<optimized out>) at weightmatrix.cpp:261
261       if (momentum > 0.0) wf_ += updates_;
(gdb) print num_samples
$1 = <optimized out>
(gdb) print momentum
$2 = 0.89999997615814209
(gdb) print wf_
$3 = {_vptr.GENERIC_2D_ARRAY = 0x7fea257bf230 <vtable for GENERIC_2D_ARRAY<double>+16>, array_ = 0xb92230, empty_ = 0, dim1_ = 16, dim2_ = 26, size_allocated_ = 416}
(gdb) print updates_
$4 = {_vptr.GENERIC_2D_ARRAY = 0x7fea257bf230 <vtable for GENERIC_2D_ARRAY<double>+16>, array_ = 0x0, empty_ = 0, dim1_ = 0, dim2_ = 0, size_allocated_ = 0}
(gdb) quit

The text was updated successfully, but these errors were encountered:

stweil · 2017-05-05T05:15:09Z

How can I reproduce this crash? updates_ looks strange. No wonder that the code crashes when it tries to add a 0 x 0 matrix.

Shreeshrii · 2017-05-05T05:49:39Z

This was the with the latest code from github, including Ray's latest commit regarding endian support.

You will need to change the fonts dir in the commands below.

rm -rf ~/tess4training/eng  
  
training/tesstrain.sh \
   --fonts_dir ~/.fonts \
   --lang eng  \
   --noextract_font_properties \
   --linedata_only \
   --exposures "0" \
   --langdata_dir ../langdata \
   --tessdata_dir ../tessdata \
   --fontlist "Arial"  "Times New Roman,"  \
   --output_dir ~/tess4training/eng
  
rm -rf ~/tess4training/englayer_from_eng    

mkdir -p ~/tess4training/englayer_from_eng 

combine_tessdata -e ../tessdata/eng.traineddata \
  ~/tess4training/englayer_from_eng/eng.lstm

lstmtraining -U ~/tess4training/eng/eng.unicharset \
  --script_dir ../langdata   \
  --append_index 5 --net_spec '[Lfx256 O1c105]' \
  --continue_from ~/tess4training/englayer_from_eng/eng.lstm \
  --model_output ~/tess4training/englayer_from_eng/englayer \
  --train_listfile ~/tess4training/eng/eng.training_files.txt \
  --target_error_rate 0.01 \
  --debug_interval  -1

amitdo · 2017-05-05T06:19:09Z

Ray said that he needs to fix training after his last commits.

stweil · 2017-05-05T06:41:55Z

When I run all but the last step with older executables, I get an error from lstmtraining:

Deserialize failed: tess4training/eng/eng.Arial.exp0.lstmf read 0/72 pages

That might confirm that the last commits introduced some incompatibility.

stweil · 2017-05-05T07:14:43Z

I could confirm that commit 8e79297 introduced a regression. Before that commit, lstmtraining did run the iterations.

amitdo · 2017-05-05T07:14:54Z

83588bc#commitcomment-22019959

Now back to finding out why training isn't working properly...

Shreeshrii · 2017-05-05T08:10:51Z

Thanks, Amit. I had seen that comment but thought that it was related to training of new language models.

stweil · 2017-05-05T08:27:46Z

Here language_ is added to the serialized data, so old and new data are incompatible (this is the reason why I get the above error message). The new data is correctly handled in ImageData::DeSerialize and in ImageData::SkipDeSerialize.

Does the training process use old files which contain old incompatible ImageData?

Shreeshrii · 2017-05-05T08:43:00Z

I am not sure what is included in ImageData.

The first part of the training commands (tesstrain.sh) runs text2image and creates box/tiff pairs and these are then used to create unicharset, dawg and lstmf files. I have not used any external box/tiff pairs.

These lstmf files are used along with the extracted lstm file from the traineddata from the repo.

combine_tessdata -e ../tessdata/eng.traineddata ~/tess4training/englayer_from_eng/eng.lstm

Possibly, this could be the cause of incompatibility, since these files were built with earlier code.

Shreeshrii · 2017-05-05T09:14:20Z

the lstmtraining command for this issue is for replacing a top layer.

Shreeshrii · 2017-05-06T04:24:03Z

lstmtraining command for replacing a top layer is working now after the changes by Ray.

 lstmtraining -U ~/tess4training/eng/eng.unicharset \
>   --script_dir ../langdata   \
>   --append_index 5 --net_spec '[Lfx256 O1c105]' \
>   --continue_from ~/tess4training/englayer_from_eng/eng.lstm \
 >   --model_output ~/tess4training/englayer_from_eng/englayer \
>   --train_listfile ~/tess4training/eng/eng.training_files.txt \
>   --target_error_rate 0.01 \
>   --debug_interval  -1
Loaded file /home/shree/tess4training/englayer_from_eng/eng.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Continuing from /home/shree/tess4training/englayer_from_eng/eng.lstm
Other case É of é is not in unicharset
Appending a new network to an old one!!Setting unichar properties
Setting properties for script Common
Setting properties for script Latin
Num outputs,weights in serial:
  Lfx256:256, 394240
  Fc105:105, 26985
Total weights = 421225
Built network:[1,0,0,1[C5,5Ft16]Mp3,3Lfys64Lfx128Lrx128Lfx256Fc105] from request [Lfx256 O1c105]
Training parameters:
  Debug interval = -1, weights = 0.1, learning rate = 0.0001, momentum=0.9
Loaded 104/104 pages (1-104) of document /home/shree/tess4training/eng/eng.Arial.exp0.lstmf
Loaded 104/104 pages (1-104) of document /home/shree/tess4training/eng/eng.Times_New_Roman.exp0.lstmf
Iteration 0: ALIGNED TRUTH : (GeneRIF) World and ABOUT time October the service through is Help
Iteration 0: BEST OCR TEXT : #yvtEktgqtHv$gqy$g$ScqESvq,UgU,UZHZtZ€U Zq,yqaqtqHt SUctéHS$Ht:HtS$Uu5,q,UvUtHSqgqtqH\t$gZUHZtZHZqZqZqEZUyHEqgq€EgtqZtZv#
v\EyZqgqEUyHvHEHZHSZ$ZtvyHZvUv#qtq€U€e#qevZH
File /home/shree/tess4training/eng/eng.Arial.exp0.lstmf page 0 :
Mean rms=8.666%, delta=100%, train=301.515%(100%), skip ratio=0%
Iteration 1: ALIGNED TRUTH : through they not this our English ° 14 ® Development of 8 What's an
Iteration 1: BEST OCR TEXT : kZyHZkEZkSZ$ZtqtyHZq]UtyHESqS#tHvZ:ZSZqUZyHZvk#t4SZSZqZqyq4,q,eHvZvtv€yveS#tyHZq$qSq4$SU,Uq#R#$SHtU,$,H\Sq7Zt#q¥SHtvtktaH
vHSZtHZSZqS:SZSv#$S,tU$U5UH\HvZSZUHvqEyqHZH
File /home/shree/tess4training/eng/eng.Times_New_Roman.exp0.lstmf page 0 :
Mean rms=8.647%, delta=100%, train=303.743%(100%), skip ratio=0%
Iteration 2: ALIGNED TRUTH : good 25 now this [J User INDEX had help 6 all iGoogle if in Ca¥ years
Iteration 2: BEST OCR TEXT : StvEHE#Zt]$v4#kqUklvtEHEHZSUZ#tEZUyHvZv#vq,uc$,4[$[e#v#ZqZqyq$Sq$u$H\R$q$w$,tyHv#qHSc ZUtHvHZgqZvZHq]UlvE4qZ[$vUvUv$vHg\E
H\tv\HZqgq4SvUvevHqt$,SqHS$,q,S#taqSqHqH$S
File /home/shree/tess4training/eng/eng.Arial.exp0.lstmf page 1 :
Mean rms=8.648%, delta=100%, train=301.046%(100%), skip ratio=0%
Iteration 3: ALIGNED TRUTH : were jobs BBC get is" University A 1998 fixed Search And under .
Iteration 3: BEST OCR TEXT : ,SkZ]EoZqyES$qEvyvtHZRHZHkvqt$u$vt/$He}$,$vt,tvytZqZHqUvkv#$q]oq4U$tetHvHvZtZtqZtZHv#veZSyS#tq#}éHSqU [$Uk$SCpH.$vqUyHv$H
#qS€ Sv,$évEqg€qZqyt$tyHSqU$téHtHZHc SU,SvZtvHc]$eSZqZqy{q,S
File /tmp/tmp.2xbvLE29x7/eng/eng.Times_New_Roman.exp0.lstmf page 25 :
Mean rms=8.696%, delta=100%, train=306.644%(100%), skip ratio=0%

amitdo mentioned this issue May 5, 2017

segfault : Assert failed:in file weightmatrix.cpp, line 227 #880

Closed

Shreeshrii closed this as completed May 6, 2017

Shreeshrii mentioned this issue May 7, 2017

LSTM: Training: Deserialize Failed #792

Closed

amitdo added bug training labels May 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

segfault at at ../ccstruct/matrix.h:256 #881

segfault at at ../ccstruct/matrix.h:256 #881

Shreeshrii commented May 5, 2017

stweil commented May 5, 2017 •

edited

Loading

Shreeshrii commented May 5, 2017 •

edited

Loading

amitdo commented May 5, 2017

stweil commented May 5, 2017

stweil commented May 5, 2017

amitdo commented May 5, 2017

Shreeshrii commented May 5, 2017

stweil commented May 5, 2017

Shreeshrii commented May 5, 2017 •

edited

Loading

Shreeshrii commented May 5, 2017

Shreeshrii commented May 6, 2017 •

edited

Loading

segfault at at ../ccstruct/matrix.h:256 #881

segfault at at ../ccstruct/matrix.h:256 #881

Comments

Shreeshrii commented May 5, 2017

stweil commented May 5, 2017 • edited Loading

Shreeshrii commented May 5, 2017 • edited Loading

amitdo commented May 5, 2017

stweil commented May 5, 2017

stweil commented May 5, 2017

amitdo commented May 5, 2017

Shreeshrii commented May 5, 2017

stweil commented May 5, 2017

Shreeshrii commented May 5, 2017 • edited Loading

Shreeshrii commented May 5, 2017

Shreeshrii commented May 6, 2017 • edited Loading

stweil commented May 5, 2017 •

edited

Loading

Shreeshrii commented May 5, 2017 •

edited

Loading

Shreeshrii commented May 5, 2017 •

edited

Loading

Shreeshrii commented May 6, 2017 •

edited

Loading