Allow setting Tesseract path to executable and data #520

Merged 1 commit on Feb 19, 2018

dadoonet (Owner)

## OCR Path

If the Tesseract executable is not available on the default system `PATH`, you can define the path to use
by setting the `fs.ocr.path` property in your `~/.fscrawler/test/_settings.json` file:

```json
{
  "name" : "test",
  "fs" : {
    "url" : "/path/to/data/dir",
    "ocr" : {
      "path": "/path/to/tesseract/executable"
    }
  }
}
```

When you set `fs.ocr.path`, it is highly recommended that you also [set the data path for Tesseract](#ocr-data-path).
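To decide whether `fs.ocr.path` is needed at all, it can help to check whether a Tesseract executable already resolves on `PATH`. The following is a minimal Python sketch and not part of FSCrawler; the helper name `tesseract_on_path` and the default executable name `tesseract` are assumptions made for illustration.

```python
import shutil
from typing import Optional


def tesseract_on_path(executable: str = "tesseract") -> Optional[str]:
    """Return the resolved location of the executable, or None if PATH lookup fails."""
    # shutil.which performs the same kind of lookup a shell does on PATH.
    return shutil.which(executable)


if __name__ == "__main__":
    resolved = tesseract_on_path()
    if resolved is None:
        print("tesseract not found on PATH; consider setting fs.ocr.path")
    else:
        print(f"tesseract found at {resolved}")
```

If the lookup returns `None`, setting `fs.ocr.path` as described above is likely needed.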

## OCR Data Path

Set the path to the `tessdata` folder, which contains Tesseract's language files and config files, if it
cannot be detected automatically. You can define the path to use
by setting the `fs.ocr.data_path` property in your `~/.fscrawler/test/_settings.json` file:

```json
{
  "name" : "test",
  "fs" : {
    "url" : "/path/to/data/dir",
    "ocr" : {
      "path": "/path/to/tesseract/executable",
      "data_path": "/path/to/tesseract/tessdata"
    }
  }
}
```
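Since a wrong path here is easy to miss, a quick sanity check of the settings before starting a crawl may be useful. This is a hedged Python sketch, not something FSCrawler ships: the function `check_ocr_settings` is hypothetical, and it only looks at the `fs.ocr.path` and `fs.ocr.data_path` properties described in this PR.

```python
import json
import os
from typing import Any, Dict, List


def check_ocr_settings(settings: Dict[str, Any]) -> List[str]:
    """Return readable problems found in the fs.ocr section (empty list if none)."""
    ocr = settings.get("fs", {}).get("ocr", {})
    problems: List[str] = []
    for key in ("path", "data_path"):
        value = ocr.get(key)
        if value is None:
            continue  # property not set, nothing to verify
        if not os.path.exists(value):
            problems.append(f"fs.ocr.{key} points to a missing location: {value}")
    # The description recommends setting data_path whenever path is set.
    if "path" in ocr and "data_path" not in ocr:
        problems.append("fs.ocr.path is set without fs.ocr.data_path")
    return problems


if __name__ == "__main__":
    settings_file = os.path.expanduser("~/.fscrawler/test/_settings.json")
    if os.path.isfile(settings_file):
        with open(settings_file, encoding="utf-8") as handle:
            for problem in check_ocr_settings(json.load(handle)):
                print(problem)
    else:
        print(f"no settings file at {settings_file}")
```

The check only verifies that the configured locations exist; it does not confirm that Tesseract itself can run or that the language files are complete.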

Closes #495.
@dadoonet dadoonet added the "new" label (For new features or options) Feb 19, 2018
@dadoonet dadoonet added this to the 2.5 milestone Feb 19, 2018
@dadoonet dadoonet self-assigned this Feb 19, 2018
@dadoonet dadoonet merged commit eeae468 into master Feb 19, 2018
@dadoonet dadoonet deleted the pr/ocr_path branch February 19, 2018 20:21