[form recognizer] Remove US receipt #11764

Merged
merged 12 commits into from
Jun 3, 2020
2 changes: 2 additions & 0 deletions sdk/formrecognizer/azure-ai-formrecognizer/CHANGELOG.md
@@ -22,6 +22,8 @@
`CustomFormModel` and `CustomFormModelInfo` models.
- `models` property of `CustomFormModel` is renamed to `submodels`
- `CustomFormSubModel` is renamed to `CustomFormSubmodel`
- Removed `USReceipt`. To see how to deal with the return value of `begin_recognize_receipts`, see [sample_recognize_receipts.py](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_recognize_receipts.py) or [sample_recognize_receipts_from_url.py](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_recognize_receipts_from_url.py) for details.
- Removed `USReceiptItem`. To see how to access the individual items on a receipt, see [sample_recognize_receipts.py](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_recognize_receipts.py) or [sample_recognize_receipts_from_url.py](https://github.com/Azure/azure-sdk-for-python/blob/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_recognize_receipts_from_url.py) for details.

**New features**

47 changes: 33 additions & 14 deletions sdk/formrecognizer/azure-ai-formrecognizer/README.md
@@ -227,21 +227,40 @@ with open("<path to your receipt>", "rb") as fd:
poller = form_recognizer_client.begin_recognize_receipts(receipt)
result = poller.result()

r = result[0]
print("Receipt contained the following values with confidences: ")
print("Receipt Type: {} has confidence: {}".format(r.receipt_type.type, r.receipt_type.confidence))
print("Merchant Name: {} has confidence: {}".format(r.merchant_name.value, r.merchant_name.confidence))
print("Transaction Date: {} has confidence: {}".format(r.transaction_date.value, r.transaction_date.confidence))
receipt = result[0]
print("Receipt Type: {} has confidence: {}".format(receipt.receipt_type.type, receipt.receipt_type.confidence))
merchant_name = receipt.fields.get("MerchantName")
if merchant_name:
print("Merchant Name: {} has confidence: {}".format(merchant_name.value, merchant_name.confidence))
transaction_date = receipt.fields.get("TransactionDate")
if transaction_date:
print("Transaction Date: {} has confidence: {}".format(transaction_date.value, transaction_date.confidence))
print("Receipt items:")
for item in r.receipt_items:
print("...Item Name: {} has confidence: {}".format(item.name.value, item.name.confidence))
print("...Item Quantity: {} has confidence: {}".format(item.quantity.value, item.quantity.confidence))
print("...Individual Item Price: {} has confidence: {}".format(item.price.value, item.price.confidence))
print("...Total Item Price: {} has confidence: {}".format(item.total_price.value, item.total_price.confidence))
print("Subtotal: {} has confidence: {}".format(r.subtotal.value, r.subtotal.confidence))
print("Tax: {} has confidence: {}".format(r.tax.value, r.tax.confidence))
print("Tip: {} has confidence: {}".format(r.tip.value, r.tip.confidence))
print("Total: {} has confidence: {}".format(r.total.value, r.total.confidence))
for item in receipt.fields.get("Items").value:
item_name = item.value.get("Name")
if item_name:
print("...Item Name: {} has confidence: {}".format(item_name.value, item_name.confidence))
item_quantity = item.value.get("Quantity")
if item_quantity:
print("...Item Quantity: {} has confidence: {}".format(item_quantity.value, item_quantity.confidence))
item_price = item.value.get("Price")
if item_price:
print("...Individual Item Price: {} has confidence: {}".format(item_price.value, item_price.confidence))
item_total_price = item.value.get("TotalPrice")
if item_total_price:
print("...Total Item Price: {} has confidence: {}".format(item_total_price.value, item_total_price.confidence))
subtotal = receipt.fields.get("Subtotal")
if subtotal:
print("Subtotal: {} has confidence: {}".format(subtotal.value, subtotal.confidence))
tax = receipt.fields.get("Tax")
if tax:
print("Tax: {} has confidence: {}".format(tax.value, tax.confidence))
tip = receipt.fields.get("Tip")
if tip:
print("Tip: {} has confidence: {}".format(tip.value, tip.confidence))
total = receipt.fields.get("Total")
if total:
print("Total: {} has confidence: {}".format(total.value, total.confidence))
```
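
Note on the `Items` loop in the sample above: `receipt.fields.get("Items").value` assumes the `Items` key is always present. Since `fields` is a dictionary, a missing key makes `.get` return `None`, and the `.value` access then raises `AttributeError`. A minimal sketch of a guarded traversal is shown below. `Field` and `describe_items` are hypothetical stand-ins introduced only for illustration (they are not part of the SDK), so the snippet runs without calling the service.

```python
from collections import namedtuple

# Hypothetical stand-in for FormField; only the two attributes used by the
# README sample (`value` and `confidence`) are modelled here.
Field = namedtuple("Field", ["value", "confidence"])


def describe_items(fields):
    """Return printable lines for the receipt items, tolerating missing fields."""
    lines = []
    items = fields.get("Items")
    # `Items` may be absent from the dictionary, so guard before using `.value`.
    if not items or not items.value:
        return lines
    for item in items.value:
        # Each item value is assumed to be a dict of sub-fields, as in the sample.
        for key in ("Name", "Quantity", "Price", "TotalPrice"):
            sub_field = item.value.get(key)
            if sub_field:
                lines.append("...Item {}: {} has confidence: {}".format(
                    key, sub_field.value, sub_field.confidence))
    return lines


# Made-up sample data shaped like the `fields` dictionary in the sample above.
fields = {
    "Items": Field(
        value=[
            Field(
                value={"Name": Field("Coffee", 0.98), "Quantity": Field(2, 0.91)},
                confidence=None,
            )
        ],
        confidence=None,
    )
}

for line in describe_items(fields):
    print(line)

# A receipt without an `Items` field yields no lines instead of raising.
print(describe_items({}))
```

The same guard could be inlined in the sample as `items = receipt.fields.get("Items")` followed by `if items:` before the loop.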

### Train a model
@@ -14,9 +14,7 @@
TrainingStatus,
CustomFormModelStatus,
FormContentType,
USReceipt,
ReceiptType,
USReceiptItem,
FormTable,
FormTableCell,
TrainingDocumentInfo,
@@ -45,9 +43,7 @@
'CustomFormModelStatus',
'FormContentType',
'FormContent',
'USReceipt',
'ReceiptType',
'USReceiptItem',
'FormTable',
'FormTableCell',
'TrainingDocumentInfo',
@@ -17,7 +17,7 @@
from azure.core.polling.base_polling import LROBasePolling
from ._generated._form_recognizer_client import FormRecognizerClient as FormRecognizer
from ._response_handlers import (
prepare_us_receipt,
prepare_receipt,
prepare_content_result,
prepare_form_result
)
@@ -74,7 +74,7 @@ def __init__(self, endpoint, credential, **kwargs):

def _receipt_callback(self, raw_response, _, headers): # pylint: disable=unused-argument
analyze_result = self._client._deserialize(AnalyzeOperationResult, raw_response)
return prepare_us_receipt(analyze_result)
return prepare_receipt(analyze_result)

@distributed_trace
def begin_recognize_receipts(self, receipt, **kwargs):
@@ -197,80 +197,18 @@ class RecognizedReceipt(RecognizedForm):
words, tables and page metadata.
:ivar ~azure.ai.formrecognizer.ReceiptType receipt_type:
The receipt type and confidence.
:ivar str receipt_locale: Defaults to "en-US".
"""
def __init__(self, **kwargs):
super(RecognizedReceipt, self).__init__(**kwargs)
self.receipt_type = kwargs.get("receipt_type", None)
self.receipt_locale = kwargs.get("receipt_locale", "en-US")

def __repr__(self):
return "RecognizedReceipt(form_type={}, fields={}, page_range={}, pages={}, " \
"receipt_type={}, receipt_locale={})".format(
"receipt_type={})".format(
self.form_type, repr(self.fields), repr(self.page_range), repr(self.pages),
repr(self.receipt_type), self.receipt_locale
repr(self.receipt_type)
)[:1024]

class USReceipt(RecognizedReceipt): # pylint: disable=too-many-instance-attributes
"""Extracted fields found on the US sales receipt. Provides
attributes for accessing common fields present in US sales receipts.

:ivar ~azure.ai.formrecognizer.FormField merchant_address:
The address of the merchant.
:ivar ~azure.ai.formrecognizer.FormField merchant_name:
The name of the merchant.
:ivar ~azure.ai.formrecognizer.FormField merchant_phone_number:
The phone number associated with the merchant.
:ivar list[~azure.ai.formrecognizer.USReceiptItem] receipt_items:
The purchased items found on the receipt.
:ivar ~azure.ai.formrecognizer.FormField subtotal:
The subtotal found on the receipt
:ivar ~azure.ai.formrecognizer.FormField tax:
The tax value found on the receipt.
:ivar ~azure.ai.formrecognizer.FormField tip:
The tip value found on the receipt.
:ivar ~azure.ai.formrecognizer.FormField total:
The total amount found on the receipt.
:ivar ~azure.ai.formrecognizer.FormField transaction_date:
The transaction date of the sale.
:ivar ~azure.ai.formrecognizer.FormField transaction_time:
The transaction time of the sale.
:ivar fields:
A dictionary of the fields found on the receipt.
:vartype fields: dict[str, ~azure.ai.formrecognizer.FormField]
:ivar ~azure.ai.formrecognizer.FormPageRange page_range:
The first and last page number of the input receipt.
:ivar list[~azure.ai.formrecognizer.FormPage] pages:
Contains page metadata such as page width, length, text angle, unit.
If `include_text_content=True` is passed, contains a list
of extracted text lines for each page in the input document.
:ivar str form_type: The type of form.
"""

def __init__(self, **kwargs):
super(USReceipt, self).__init__(**kwargs)
self.merchant_address = kwargs.get("merchant_address", None)
self.merchant_name = kwargs.get("merchant_name", None)
self.merchant_phone_number = kwargs.get("merchant_phone_number", None)
self.receipt_items = kwargs.get("receipt_items", None)
self.subtotal = kwargs.get("subtotal", None)
self.tax = kwargs.get("tax", None)
self.tip = kwargs.get("tip", None)
self.total = kwargs.get("total", None)
self.transaction_date = kwargs.get("transaction_date", None)
self.transaction_time = kwargs.get("transaction_time", None)

def __repr__(self):
return "USReceipt(merchant_address={}, merchant_name={}, merchant_phone_number={}, " \
"receipt_type={}, receipt_items={}, subtotal={}, tax={}, tip={}, total={}, "\
"transaction_date={}, transaction_time={}, fields={}, page_range={}, pages={}, " \
"form_type={}, receipt_locale={})".format(
repr(self.merchant_address), repr(self.merchant_name), repr(self.merchant_phone_number),
repr(self.receipt_type), repr(self.receipt_items), repr(self.subtotal), repr(self.tax),
repr(self.tip), repr(self.total), repr(self.transaction_date), repr(self.transaction_time),
repr(self.fields), repr(self.page_range), repr(self.pages), self.form_type, self.receipt_locale
)[:1024]


class FormField(object):
"""Represents a field recognized in an input form.
@@ -537,45 +475,6 @@ def __repr__(self):
return "ReceiptType(type={}, confidence={})".format(self.type, self.confidence)[:1024]


class USReceiptItem(object):
"""A receipt item on a US sales receipt.
Contains the item name, quantity, price, and total price.

:ivar ~azure.ai.formrecognizer.FormField name:
The name of the item.
:ivar ~azure.ai.formrecognizer.FormField quantity:
The quantity associated with this item.
:ivar ~azure.ai.formrecognizer.FormField price:
The price of a single unit of this item.
:ivar ~azure.ai.formrecognizer.FormField total_price:
The total price of this item, taking the quantity into account.
"""

def __init__(self, **kwargs):
self.name = kwargs.get("name", None)
self.quantity = kwargs.get("quantity", None)
self.price = kwargs.get("price", None)
self.total_price = kwargs.get("total_price", None)

@classmethod
def _from_generated(cls, items, read_result):
try:
receipt_item = items.value_array
return [cls(
name=FormField._from_generated("Name", item.value_object.get("Name"), read_result),
quantity=FormField._from_generated("Quantity", item.value_object.get("Quantity"), read_result),
price=FormField._from_generated("Price", item.value_object.get("Price"), read_result),
total_price=FormField._from_generated("TotalPrice", item.value_object.get("TotalPrice"), read_result),
) for item in receipt_item]
except AttributeError:
return []

def __repr__(self):
return "USReceiptItem(name={}, quantity={}, price={}, total_price={})".format(
repr(self.name), repr(self.quantity), repr(self.price), repr(self.total_price)
)[:1024]


class FormTable(object):
"""Information about the extracted table contained on a page.

@@ -7,64 +7,35 @@
# pylint: disable=protected-access

from ._models import (
USReceipt,
ReceiptType,
FormField,
USReceiptItem,
FormPage,
FormLine,
FormTable,
FormTableCell,
FormPageRange,
RecognizedForm
RecognizedForm,
RecognizedReceipt
)


def prepare_us_receipt(response):
def prepare_receipt(response):
receipts = []
read_result = response.analyze_result.read_results
document_result = response.analyze_result.document_results
form_page = FormPage._from_generated(read_result)

for page in document_result:
if page.fields is None:
receipt = USReceipt(
receipt = RecognizedReceipt(
page_range=FormPageRange(first_page_number=page.page_range[0], last_page_number=page.page_range[1]),
pages=form_page[page.page_range[0]-1:page.page_range[1]],
form_type=page.doc_type,
)
receipts.append(receipt)
continue
receipt = USReceipt(
merchant_address=FormField._from_generated(
"MerchantAddress", page.fields.get("MerchantAddress"), read_result
),
merchant_name=FormField._from_generated(
"MerchantName", page.fields.get("MerchantName"), read_result
),
merchant_phone_number=FormField._from_generated(
"MerchantPhoneNumber",
page.fields.get("MerchantPhoneNumber"),
read_result,
),
receipt = RecognizedReceipt(
receipt_type=ReceiptType._from_generated(page.fields.get("ReceiptType")),
receipt_items=USReceiptItem._from_generated(
page.fields.get("Items"), read_result
),
subtotal=FormField._from_generated(
"Subtotal", page.fields.get("Subtotal"), read_result
),
tax=FormField._from_generated("Tax", page.fields.get("Tax"), read_result),
tip=FormField._from_generated("Tip", page.fields.get("Tip"), read_result),
total=FormField._from_generated(
"Total", page.fields.get("Total"), read_result
),
transaction_date=FormField._from_generated(
"TransactionDate", page.fields.get("TransactionDate"), read_result
),
transaction_time=FormField._from_generated(
"TransactionTime", page.fields.get("TransactionTime"), read_result
),
page_range=FormPageRange(
first_page_number=page.page_range[0], last_page_number=page.page_range[1]
),
@@ -73,7 +44,7 @@ def prepare_us_receipt(response):
fields={
key: FormField._from_generated(key, value, read_result)
for key, value in page.fields.items()
},
}
)

receipts.append(receipt)
@@ -17,7 +17,7 @@
from azure.core.polling.async_base_polling import AsyncLROBasePolling
from .._generated.aio._form_recognizer_client_async import FormRecognizerClient as FormRecognizer
from .._response_handlers import (
prepare_us_receipt,
prepare_receipt,
prepare_content_result,
prepare_form_result
)
@@ -84,7 +84,7 @@ def __init__(

def _receipt_callback(self, raw_response, _, headers): # pylint: disable=unused-argument
analyze_result = self._client._deserialize(AnalyzeOperationResult, raw_response)
return prepare_us_receipt(analyze_result)
return prepare_receipt(analyze_result)

@distributed_trace_async
async def recognize_receipts(
@@ -46,18 +46,38 @@ async def recognize_receipts(self):
for idx, receipt in enumerate(receipts):
print("--------Recognizing receipt #{}--------".format(idx))
print("Receipt Type: {} has confidence: {}".format(receipt.receipt_type.type, receipt.receipt_type.confidence))
print("Merchant Name: {} has confidence: {}".format(receipt.merchant_name.value, receipt.merchant_name.confidence))
print("Transaction Date: {} has confidence: {}".format(receipt.transaction_date.value, receipt.transaction_date.confidence))
merchant_name = receipt.fields.get("MerchantName")
if merchant_name:
print("Merchant Name: {} has confidence: {}".format(merchant_name.value, merchant_name.confidence))
transaction_date = receipt.fields.get("TransactionDate")
if transaction_date:
print("Transaction Date: {} has confidence: {}".format(transaction_date.value, transaction_date.confidence))
print("Receipt items:")
for item in receipt.receipt_items:
print("...Item Name: {} has confidence: {}".format(item.name.value, item.name.confidence))
print("...Item Quantity: {} has confidence: {}".format(item.quantity.value, item.quantity.confidence))
print("...Individual Item Price: {} has confidence: {}".format(item.price.value, item.price.confidence))
print("...Total Item Price: {} has confidence: {}".format(item.total_price.value, item.total_price.confidence))
print("Subtotal: {} has confidence: {}".format(receipt.subtotal.value, receipt.subtotal.confidence))
print("Tax: {} has confidence: {}".format(receipt.tax.value, receipt.tax.confidence))
print("Tip: {} has confidence: {}".format(receipt.tip.value, receipt.tip.confidence))
print("Total: {} has confidence: {}".format(receipt.total.value, receipt.total.confidence))
for item in receipt.fields.get("Items").value:
item_name = item.value.get("Name")
if item_name:
print("...Item Name: {} has confidence: {}".format(item_name.value, item_name.confidence))
item_quantity = item.value.get("Quantity")
if item_quantity:
print("...Item Quantity: {} has confidence: {}".format(item_quantity.value, item_quantity.confidence))
item_price = item.value.get("Price")
if item_price:
print("...Individual Item Price: {} has confidence: {}".format(item_price.value, item_price.confidence))
item_total_price = item.value.get("TotalPrice")
if item_total_price:
print("...Total Item Price: {} has confidence: {}".format(item_total_price.value, item_total_price.confidence))
subtotal = receipt.fields.get("Subtotal")
if subtotal:
print("Subtotal: {} has confidence: {}".format(subtotal.value, subtotal.confidence))
tax = receipt.fields.get("Tax")
if tax:
print("Tax: {} has confidence: {}".format(tax.value, tax.confidence))
tip = receipt.fields.get("Tip")
if tip:
print("Tip: {} has confidence: {}".format(tip.value, tip.confidence))
total = receipt.fields.get("Total")
if total:
print("Total: {} has confidence: {}".format(total.value, total.confidence))
Member

I understand wanting to show the keys available, but wondering if we should do at least one sample or readme like this:

for receipt in result:
    for key, val in receipt.fields.items():
        print("{}: {} has confidence: {}".format(key, val.value, val.confidence))

Contributor Author

I see what you mean. I think I will leave the samples as they are, but update the README sample to traverse the fields this way.

Member

I really like how the README looks.
For this sample, I suggest adding a comment at the top with a link to the types the service currently supports, or otherwise guiding the user to that information.

Member

For the .NET sample, we're only showing a subset of illustrative values from the receipt.

Also, I'm not sure how it works in Python, but do you need to check the FieldValueType to cast these to the right type before a customer can use them in a scenario?
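
To make the two review suggestions concrete (iterating over `receipt.fields` generically, and the question about value types), here is a rough sketch. It assumes only what the diff shows: each field exposes `.value` and `.confidence`. `Field`, `summarize`, and the sample values are hypothetical stand-ins, not SDK names, and which Python types the library actually returns for each field is an assumption here. The point is that Python values carry their runtime type, so a caller can branch with `isinstance` rather than cast.

```python
import datetime
from collections import namedtuple

# Hypothetical stand-in for FormField (not the SDK class).
Field = namedtuple("Field", ["value", "confidence"])


def summarize(fields):
    """Format every field generically, branching on the runtime type of its value."""
    rows = []
    for key, field in sorted(fields.items()):
        value = field.value
        if isinstance(value, (datetime.date, datetime.time)):
            text = value.isoformat()
        elif isinstance(value, float):
            text = "{:.2f}".format(value)
        else:
            text = str(value)
        rows.append("{} ({}): {} has confidence: {}".format(
            key, type(value).__name__, text, field.confidence))
    return rows


# Made-up sample data; the value types are assumptions for illustration only.
sample = {
    "MerchantName": Field("Contoso", 0.95),
    "Total": Field(14.5, 0.99),
    "TransactionDate": Field(datetime.date(2020, 6, 3), 0.97),
}

for row in summarize(sample):
    print(row)
```

Because the loop never names a specific key, it keeps working if the service adds or omits fields, which is the trade-off against the explicit per-key sample in the diff.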

print("--------------------------------------")
# [END recognize_receipts_async]
