Add support for reading order in recognizing content [Formrecognizer] #20301

Merged 3 commits on Apr 2, 2021
1 change: 1 addition & 0 deletions sdk/formrecognizer/azure-ai-formrecognizer/CHANGELOG.md
@@ -6,6 +6,7 @@
and `RecognizeCustomFormOptions` to specify the page numbers to analyze.
- Added support for `FormContentType` `image/bmp` when analyzing custom forms.
- Added support for pre-built ID documents recognition.
- Added property `ReadingOrder` to `RecognizeContentOptions` to specify the order in which recognized text lines are returned.

## 3.0.6 (2021-03-10)
### Dependency updates
@@ -332,7 +332,10 @@ public PollerFlux<FormRecognizerOperationResult, List<FormPage>> beginRecognizeC
finalRecognizeContentOptions.getPages(),
finalRecognizeContentOptions.getLanguage() == null
? null : Language.fromString(finalRecognizeContentOptions.getLanguage().toString()),
null,
finalRecognizeContentOptions.getReadingOrder() != null
? com.azure.ai.formrecognizer.implementation.models.ReadingOrder.fromString(
finalRecognizeContentOptions.getReadingOrder().toString())
: null,
new SourcePath().setSource(formUrl),
context)
.map(response -> new FormRecognizerOperationResult(
@@ -422,7 +425,10 @@ PollerFlux<FormRecognizerOperationResult, List<FormPage>> beginRecognizeContent(
finalRecognizeContentOptions.getPages(),
finalRecognizeContentOptions.getLanguage() == null
? null : Language.fromString(finalRecognizeContentOptions.getLanguage().toString()),
null,
finalRecognizeContentOptions.getReadingOrder() != null
? com.azure.ai.formrecognizer.implementation.models.ReadingOrder.fromString(
finalRecognizeContentOptions.getReadingOrder().toString())
: null,
context)
.map(response -> new FormRecognizerOperationResult(
parseModelId(response.getDeserializedHeaders().getOperationLocation()))),
@@ -0,0 +1,53 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

package com.azure.ai.formrecognizer.models;

/**
* Defines values for ReadingOrder.
*/
public enum ReadingOrder {

/**
* Enum value basic.
* Sorts the lines top to bottom, left to right, although in certain cases
* proximity takes higher priority.
*/
BASIC("basic"),

/**
* Enum value natural.
* Uses positional information to keep nearby lines together.
*/
NATURAL("natural");

/**
* The actual serialized value for a ReadingOrder instance.
*/
private final String value;

ReadingOrder(String value) {
this.value = value;
}

/**
* Parses a serialized value to a ReadingOrder instance.
*
* @param value the serialized value to parse.
* @return the parsed ReadingOrder object, or null if unable to parse.
*/
public static ReadingOrder fromString(String value) {
ReadingOrder[] items = ReadingOrder.values();
for (ReadingOrder item : items) {
if (item.toString().equalsIgnoreCase(value)) {
return item;
}
}
return null;
}

@Override
public String toString() {
return this.value;
}
}
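
Reviewer note: the client hunks above convert the public `ReadingOrder` to the implementation-layer enum by round-tripping through its string value, guarding against `null`. The sketch below illustrates that pattern in isolation. `PublicReadingOrder`, `WireReadingOrder`, and `toWire` are hypothetical stand-ins so the example compiles without the SDK; the generated implementation enum is not shown in this diff, so its shape here is an assumption.

```java
public class ReadingOrderMappingSketch {

    // Stand-in for the public enum added in this diff.
    enum PublicReadingOrder {
        BASIC("basic"),
        NATURAL("natural");

        private final String value;

        PublicReadingOrder(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Hypothetical stand-in for the implementation-layer enum (assumed shape).
    enum WireReadingOrder {
        BASIC("basic"),
        NATURAL("natural");

        private final String value;

        WireReadingOrder(String value) {
            this.value = value;
        }

        // Case-insensitive lookup; returns null when nothing matches, as in the diff.
        static WireReadingOrder fromString(String value) {
            for (WireReadingOrder item : values()) {
                if (item.toString().equalsIgnoreCase(value)) {
                    return item;
                }
            }
            return null;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Null-safe conversion: an unset option stays unset rather than throwing.
    static WireReadingOrder toWire(PublicReadingOrder order) {
        return order == null ? null : WireReadingOrder.fromString(order.toString());
    }

    public static void main(String[] args) {
        System.out.println(toWire(PublicReadingOrder.NATURAL));
        System.out.println(toWire(null));
        System.out.println(WireReadingOrder.fromString("BASIC"));
    }
}
```

Because `fromString` returns `null` for unrecognized input instead of throwing, callers that parse user-supplied strings may want to check the result before use.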
@@ -19,6 +19,7 @@ public final class RecognizeContentOptions {
private Duration pollInterval = DEFAULT_POLL_INTERVAL;
private FormRecognizerLanguage language;
private List<String> pages;
private ReadingOrder readingOrder;

/**
* Get the type of the form. Supported media types include .pdf, .jpg, .png and .tiff file streams.
@@ -117,4 +118,26 @@ public RecognizeContentOptions setPages(List<String> pages) {
this.pages = pages;
return this;
}

/**
* Get the order in which recognized text lines are returned.
*
* @return the order in which the recognized lines are returned.
*/
public ReadingOrder getReadingOrder() {
return readingOrder;
}

/**
* Specifies the order in which recognized text lines are returned. As the sorting order
* depends on the detected text, it may change across images and OCR version updates. Thus,
* business logic should be built upon the actual line location instead of order.
*
* @param readingOrder the order in which recognized text lines are returned.
* @return the updated {@code RecognizeContentOptions} value.
*/
public RecognizeContentOptions setReadingOrder(ReadingOrder readingOrder) {
this.readingOrder = readingOrder;
return this;
}
}
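
Reviewer note: the `setReadingOrder` Javadoc advises building business logic on line location rather than on the returned order. A minimal sketch of what that could look like follows. `Line` is a hypothetical stand-in that holds the text and the top-left corner of a bounding box; it is not the SDK's line type, and the strict top-to-bottom, left-to-right sort is only one possible policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class LineLocationSketch {

    // Hypothetical stand-in for a recognized line: text plus top-left corner coordinates.
    static final class Line {
        final String text;
        final double x;
        final double y;

        Line(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Orders lines by coordinates only (top to bottom, then left to right),
    // ignoring whatever order the input list arrived in.
    static List<String> textByLocation(List<Line> lines) {
        List<Line> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparingDouble((Line l) -> l.y)
            .thenComparingDouble(l -> l.x));
        List<String> result = new ArrayList<>();
        for (Line line : sorted) {
            result.add(line.text);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Line> lines = Arrays.asList(
            new Line("Total", 4.0, 9.0),
            new Line("Invoice", 1.0, 1.0),
            new Line("Date", 5.0, 1.0));
        System.out.println(textByLocation(lines));
    }
}
```

Logic written this way gives the same result whether the service returns lines in `basic` or `natural` order.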
@@ -10,6 +10,7 @@
import com.azure.ai.formrecognizer.models.FormRecognizerLanguage;
import com.azure.ai.formrecognizer.models.FormRecognizerLocale;
import com.azure.ai.formrecognizer.models.FormRecognizerOperationResult;
import com.azure.ai.formrecognizer.models.ReadingOrder;
import com.azure.ai.formrecognizer.models.RecognizeBusinessCardsOptions;
import com.azure.ai.formrecognizer.models.RecognizeContentOptions;
import com.azure.ai.formrecognizer.models.RecognizeCustomFormsOptions;
@@ -1956,6 +1957,7 @@ public void recognizeInvoiceWithPage(HttpClient httpClient, FormRecognizerServic
assertEquals(1, recognizedForms.size());
}, INVOICE_PDF);
}

// ID Document Recognition

/**
@@ -2138,4 +2140,43 @@ public void recognizeIDDocumentFromUrlIncludeFieldElements(HttpClient httpClient
validatePrebuiltResultData(syncPoller.getFinalResult(), true, ID);
}, LICENSE_CARD_JPG);
}

/**
* Verify that the {@code basic} reading order is sent to the service when specified by the user.
*/
@ParameterizedTest(name = DISPLAY_NAME_WITH_ARGUMENTS)
@MethodSource("com.azure.ai.formrecognizer.TestUtils#getTestParameters")
public void recognizeContentWithReadingOrder(HttpClient httpClient, FormRecognizerServiceVersion serviceVersion) {
client = getFormRecognizerAsyncClient(httpClient, serviceVersion);
urlRunner(sourceUrl -> {
final SyncPoller<FormRecognizerOperationResult, List<FormPage>> syncPoller =
client.beginRecognizeContentFromUrl(sourceUrl,
new RecognizeContentOptions()
.setPollInterval(durationTestMode)
.setReadingOrder(ReadingOrder.BASIC))
.getSyncPoller();
syncPoller.getFinalResult();
validateNetworkCallRecord("readingOrder", "basic");
}, FORM_JPG);
}

/**
* Verify that the {@code natural} reading order is sent to the service when specified by the user.
*/
@ParameterizedTest(name = DISPLAY_NAME_WITH_ARGUMENTS)
@MethodSource("com.azure.ai.formrecognizer.TestUtils#getTestParameters")
public void recognizeContentWithReadingOrderNatural(HttpClient httpClient,
FormRecognizerServiceVersion serviceVersion) {
client = getFormRecognizerAsyncClient(httpClient, serviceVersion);
urlRunner(sourceUrl -> {
final SyncPoller<FormRecognizerOperationResult, List<FormPage>> syncPoller =
client.beginRecognizeContentFromUrl(sourceUrl,
new RecognizeContentOptions()
.setPollInterval(durationTestMode)
.setReadingOrder(ReadingOrder.NATURAL))
.getSyncPoller();
syncPoller.getFinalResult();
validateNetworkCallRecord("readingOrder", "natural");
}, FORM_JPG);
}
}
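
Reviewer note: both tests assert that the recorded network call carries a `readingOrder` query parameter with the expected value. The internals of `validateNetworkCallRecord` are not part of this diff, so the sketch below is only an illustration of that kind of check, not the test framework's implementation. `hasQueryParameter` and the example URL are hypothetical.

```java
import java.net.URI;

public class QueryParameterCheckSketch {

    // Returns true when the URL's query string contains the exact name=value pair.
    static boolean hasQueryParameter(String url, String name, String value) {
        String query = URI.create(url).getQuery();
        if (query == null) {
            return false;
        }
        for (String pair : query.split("&")) {
            int separator = pair.indexOf('=');
            if (separator < 0) {
                continue; // skip flags that carry no value
            }
            boolean sameName = pair.substring(0, separator).equals(name);
            boolean sameValue = pair.substring(separator + 1).equals(value);
            if (sameName && sameValue) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical request URL; the real service path is not shown in this diff.
        String recorded = "https://example.invalid/analyze?language=en&readingOrder=basic";
        System.out.println(hasQueryParameter(recorded, "readingOrder", "basic"));
        System.out.println(hasQueryParameter(recorded, "readingOrder", "natural"));
    }
}
```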

Large diffs are not rendered by default.