Move fs.pdf_ocr setting to fs.ocr.pdf_strategy #693

Merged Mar 2, 2019 (1 commit)

Conversation

dadoonet (Owner) commented Mar 2, 2019

This change moves `fs.pdf_ocr: true/false` to `fs.ocr.pdf_strategy: ocr_and_text/no_ocr`.

```yml
name: "test"
fs:
  ocr:
    pdf_strategy: "no_ocr"
```

This puts all OCR settings under a single settings object, `fs.ocr`, and gives the end user more control by exposing all capabilities.

Supported strategies are:

* `no_ocr`: No OCR is performed on PDF documents. OCR might still be performed on images if OCR is not disabled.
* `ocr_only`: Only OCR is performed.
* `ocr_and_text` (default): Both OCR and text extraction are performed.
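
As a rough migration sketch: the first line of this description suggests that the old `fs.pdf_ocr: true` corresponds to `ocr_and_text` and `fs.pdf_ocr: false` to `no_ocr`. That one-to-one mapping is an assumption read from the description, not something verified against the code. Under that assumption, a job file that previously enabled PDF OCR would change roughly like this:

```yml
# Before this change (old setting, shown commented out for comparison):
# name: "test"
# fs:
#   pdf_ocr: true

# After this change (assumed equivalent of pdf_ocr: true):
name: "test"
fs:
  ocr:
    pdf_strategy: "ocr_and_text"
```

Because `ocr_and_text` is listed as the default, leaving `pdf_strategy` out entirely should behave the same way. `ocr_only` has no counterpart in the old boolean setting.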

We are also moving the OCR documentation into its own major section instead of leaving it hidden in the "tips" section.

dadoonet added the update label (When updating an existing feature) on Mar 2, 2019
dadoonet added this to the 2.7 milestone on Mar 2, 2019
dadoonet self-assigned this on Mar 2, 2019
dadoonet merged commit 82949d6 into master on Mar 2, 2019
dadoonet deleted the pr/move-ocr-settings branch on March 2, 2019 17:56