Urdu-Augmented-TextLines-Dataset

A dataset for Urdu text-line OCR. The dataset contains grayscale text images and their corresponding text in UTF-8. Each .rar file contains folders with nested folders of augmented images, plus a single text folder. The dataset contains three types of images:

Low Resolution text line images
High resolution text line images
Word images
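
After extracting one of the .rar archives, the layout described above (nested image folders plus a text folder) can be sanity-checked with a short script. This is a minimal sketch, not part of the dataset's tooling: the function names `inventory` and `read_label` are hypothetical, and the image extensions and the `.txt` extension for ground-truth files are assumptions, since the README does not state the file formats.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; the README does not state the image format.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def inventory(root):
    """Count image and text files under each top-level folder of `root`."""
    counts = {}
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        tally = Counter()
        # rglob walks the nested folders that hold the augmented images.
        for path in folder.rglob("*"):
            if not path.is_file():
                continue
            suffix = path.suffix.lower()
            if suffix in IMAGE_SUFFIXES:
                tally["images"] += 1
            elif suffix == ".txt":  # assumed extension for ground-truth text
                tally["texts"] += 1
        counts[folder.name] = dict(tally)
    return counts


def read_label(path):
    """Read one ground-truth file as UTF-8 and strip surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()
```

Calling `inventory("path/to/extracted/archive")` returns a dictionary of per-folder counts, which can be compared against the totals in the summary table. Reading with an explicit `encoding="utf-8"` avoids platform-dependent defaults when loading Urdu text.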

Summary of Dataset

Metric	Low Res Folder	High Res Folder	Words Folder
Unedited images	20787	23018	118013
Characters in unedited images	1602435	2234487	1080079
Words in unedited images	370381	515498
Total augmented images	119652	483378	1063772
Size	2.2 GB	8.7 GB	9.6 GB
Link	Low Res Dataset	High Res Dataset	Words Dataset
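
The figures in the table can be turned into per-image averages with a few lines of arithmetic. The sketch below copies the counts from the table; the dictionary layout and the helper names `augmentation_ratio` and `chars_per_image` are illustrative only. Whether the augmented totals include the unedited images is not stated in the README, so the ratio is simply total augmented images divided by unedited images.

```python
# Counts copied from the summary table above.
SUMMARY = {
    "Low Res": {"unedited": 20787, "augmented": 119652, "chars": 1602435},
    "High Res": {"unedited": 23018, "augmented": 483378, "chars": 2234487},
    "Words": {"unedited": 118013, "augmented": 1063772, "chars": 1080079},
}


def augmentation_ratio(row):
    """Total augmented images per unedited image."""
    return row["augmented"] / row["unedited"]


def chars_per_image(row):
    """Average number of characters per unedited image."""
    return row["chars"] / row["unedited"]


for name, row in SUMMARY.items():
    print(
        f"{name}: {augmentation_ratio(row):.2f} augmented per unedited image, "
        f"{chars_per_image(row):.1f} characters per unedited image"
    )
```

With these counts, the high-resolution set works out to exactly 21 augmented images per unedited image, while the low-resolution and word sets come to roughly 5.76 and 9.01 respectively.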

Examples

Low Res Unedited

Low Res Augmented

High Res Unedited

High Res Augmented

Word Unedited

Word Augmented

Project URL:

A trained model with minimal code is deployed here: https://github.com/HassamChundrigar/Urdu-Ocr
