I have been trying to fine-tune the German model (deu.trainingdata) for diacritics, because I have to OCR documents containing names with diacritics. The problem is that German only has umlauts (ä, ö and ü), and no matter what I try, the model won't learn diacritics such as á or â.
Things that I tried:
First, I trained on examples taken from the documents themselves, but since there are not many of them, I assumed the data was simply insufficient.
Second, I used a name database and wrote a script that generates as many example names with diacritics as I like, but that does not seem to work either.
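A minimal sketch of the kind of generator script described in the second attempt, not the actual script. The name list `BASE_NAMES`, the choice of marks (acute, grave, circumflex), and the helper names `add_diacritic` and `generate_lines` are all hypothetical; a real run would load names from the name database instead.

```python
import random
import unicodedata

# Hypothetical sample names; a real run would read these from the name database.
BASE_NAMES = ["Andras", "Rene", "Joao", "Zoe", "Muller", "Francois"]

# Combining marks to attach to a vowel: acute, grave, circumflex (an assumption).
COMBINING_MARKS = ["\u0301", "\u0300", "\u0302"]
VOWELS = set("aeiouAEIOU")


def add_diacritic(name, rng):
    """Return the name with one randomly chosen vowel given a random diacritic."""
    positions = [i for i, ch in enumerate(name) if ch in VOWELS]
    if not positions:
        return name
    i = rng.choice(positions)
    mark = rng.choice(COMBINING_MARKS)
    # NFC composes the vowel and the combining mark into one precomposed
    # character (for example "a" + U+0301 becomes "á").
    return unicodedata.normalize("NFC", name[: i + 1] + mark + name[i + 1 :])


def generate_lines(n_lines, names_per_line=4, seed=0):
    """Build n_lines text lines, each holding names_per_line generated names."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n_lines):
        names = [
            add_diacritic(rng.choice(BASE_NAMES), rng)
            for _ in range(names_per_line)
        ]
        lines.append(" ".join(names))
    return lines


if __name__ == "__main__":
    for line in generate_lines(5):
        print(line)
```

Each generated line can be written to a text file, one line per training sample. Normalizing to NFC keeps every accented letter as a single code point, so the same character is always encoded the same way in the generated text.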
So my question is: am I doing something wrong, or how should I approach fine-tuning German for diacritics?
Thanks in advance for any response, highly appreciated :)
Sorry for the late response, I have been a little busy lately.
I tried Latin, but that brings other issues: it does not support 'ß' or 'ä', for example. I then tried combining German and Latin, which also did not yield the desired results.
That leaves me with my initial question: is it possible to fine-tune German for diacritics, and if so, what would be the best practice?