[formrecognizer] Adding sync and async samples for ID documents #17186

Merged
adding sync and async samples
catalinaperalta committed Mar 9, 2021

commit 6433a2eb851fa6fc9f967f77aa0e8e6532fda088
@@ -0,0 +1,90 @@
# coding: utf-8

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for
# license information.
# --------------------------------------------------------------------------

"""
FILE: sample_recognize_id_documents_async.py

DESCRIPTION:
This sample demonstrates how to recognize fields from an identity document.

See fields found on an ID document here:
https://aka.ms/formrecognizer/TODO

USAGE:
python sample_recognize_id_documents_async.py

Set the environment variables with your own values before running the sample:
1) AZURE_FORM_RECOGNIZER_ENDPOINT - the endpoint to your Cognitive Services resource.
2) AZURE_FORM_RECOGNIZER_KEY - your Form Recognizer API key
"""

import os
import asyncio


class RecognizeIdDocumentsSampleAsync(object):

    async def recognize_id_document(self):
        path_to_sample_forms = os.path.abspath(os.path.join(os.path.abspath(__file__),
                                                            "..", "./../sample_forms/id_documents/license.jpg"))

        # [START recognize_id_document]
Member

to match the end clause

Suggested change
- # [START recognize_id_document]
+ # [START recognize_id_documents]

Member Author

I'd been going back and forth about having the s at the end. Do you think I should rename the functions to have an s at the end?

Member Author

Fixed. I decided to add the s to the function as well.

        from azure.core.credentials import AzureKeyCredential
        from azure.ai.formrecognizer.aio import FormRecognizerClient

        endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
        key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]

        async with FormRecognizerClient(
            endpoint=endpoint, credential=AzureKeyCredential(key)
        ) as form_recognizer_client:

            with open(path_to_sample_forms, "rb") as f:
                poller = await form_recognizer_client.begin_recognize_id_documents(id_document=f)

            id_documents = await poller.result()

            for idx, id_document in enumerate(id_documents):
                print("--------Recognizing ID document #{}--------".format(idx+1))
                first_name = id_document.fields.get("FirstName")
                if first_name:
                    print("First Name: {} has confidence: {}".format(first_name.value, first_name.confidence))
                last_name = id_document.fields.get("LastName")
                if last_name:
                    print("Last Name: {} has confidence: {}".format(last_name.value, last_name.confidence))
                document_number = id_document.fields.get("DocumentNumber")
                if document_number:
                    print("Document Number: {} has confidence: {}".format(document_number.value, document_number.confidence))
                dob = id_document.fields.get("DateOfBirth")
                if dob:
                    print("Date of Birth: {} has confidence: {}".format(dob.value, dob.confidence))
                doe = id_document.fields.get("DateOfExpiration")
                if doe:
                    print("Date of Expiration: {} has confidence: {}".format(doe.value, doe.confidence))
                sex = id_document.fields.get("Sex")
                if sex:
                    print("Sex: {} has confidence: {}".format(sex.value_data.text, sex.confidence))
                address = id_document.fields.get("Address")
                if address:
                    print("Address: {} has confidence: {}".format(address.value, address.confidence))
                # FIXME: uncomment this
                # country = id_document.fields.get("Country")
                # if country:
                #     print("Country: {} has confidence: {}".format(country.value, country.confidence))
Member

Is it not finding "Country"?

Member Author

Not the last time I tried it :( It did work a couple of weeks ago, though, so it must be due to some changes on their end.

Member Author

So I decided to leave it commented out for now while they stabilize their endpoints.

Member

Do we need to comment it out? The code should still work if Country is not found, and it will automatically print when it is.

Member Author

Well, it's a little funky at the moment. The response includes the confidence but not the country, so the line does print, but it says Country is None (I think it's just a regression they had for this endpoint, since it was working before). The other reason I had it commented out is that in a previous iteration I could just use country.value to get the country name, but now I have to use country.value_data.text (although this is ALSO currently broken). So I commented it out because this field seems to be in flux. I don't have a problem uncommenting it if you think it's better that way; I'll just add a note for myself to check it closer to release.

Member

Oh okay, let's leave it commented out. I thought it wasn't finding the field at all.

                region = id_document.fields.get("Region")
                if region:
                    print("Region: {} has confidence: {}".format(region.value, region.confidence))
        # [END recognize_id_documents]

async def main():
    sample = RecognizeIdDocumentsSampleAsync()
    await sample.recognize_id_document()

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
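
Note on the entry point: the async sample drives its coroutine with `asyncio.get_event_loop()` and `run_until_complete()`. On Python 3.7 and later, `asyncio.run()` is a shorter equivalent that creates and closes the event loop itself. A minimal sketch follows. The `main` coroutine and the `run_sample` wrapper here are stand-ins that do not call the service, and the return value exists only to make the example observable. Whether the sample can adopt this depends on the Python versions it has to support, which this PR does not state.

```python
import asyncio


async def main():
    # Stand-in for the sample's main(); the real one awaits
    # RecognizeIdDocumentsSampleAsync().recognize_id_document().
    await asyncio.sleep(0)
    return "done"


def run_sample():
    # asyncio.run() creates a new event loop, runs the coroutine to
    # completion, and closes the loop afterwards.
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_sample())
```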
@@ -0,0 +1,83 @@
# coding: utf-8

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for
# license information.
# --------------------------------------------------------------------------

"""
FILE: sample_recognize_id_documents.py

DESCRIPTION:
This sample demonstrates how to recognize fields from an identity document.

See fields found on an ID document here:
https://aka.ms/formrecognizer/TODO

USAGE:
python sample_recognize_id_documents.py

Set the environment variables with your own values before running the sample:
1) AZURE_FORM_RECOGNIZER_ENDPOINT - the endpoint to your Cognitive Services resource.
2) AZURE_FORM_RECOGNIZER_KEY - your Form Recognizer API key
"""

import os


class RecognizeIdDocumentsSample(object):

    def recognize_id_document(self):
        path_to_sample_forms = os.path.abspath(os.path.join(os.path.abspath(__file__),
                                                            "..", "./sample_forms/id_documents/license.jpg"))

        # [START recognize_id_document]
        from azure.core.credentials import AzureKeyCredential
        from azure.ai.formrecognizer import FormRecognizerClient

        endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
        key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]

        form_recognizer_client = FormRecognizerClient(
            endpoint=endpoint, credential=AzureKeyCredential(key)
        )
        with open(path_to_sample_forms, "rb") as f:
            poller = form_recognizer_client.begin_recognize_id_documents(id_document=f)
        id_documents = poller.result()

        for idx, id_document in enumerate(id_documents):
            print("--------Recognizing ID document #{}--------".format(idx+1))
            first_name = id_document.fields.get("FirstName")
            if first_name:
                print("First Name: {} has confidence: {}".format(first_name.value, first_name.confidence))
            last_name = id_document.fields.get("LastName")
            if last_name:
                print("Last Name: {} has confidence: {}".format(last_name.value, last_name.confidence))
            document_number = id_document.fields.get("DocumentNumber")
            if document_number:
                print("Document Number: {} has confidence: {}".format(document_number.value, document_number.confidence))
            dob = id_document.fields.get("DateOfBirth")
            if dob:
                print("Date of Birth: {} has confidence: {}".format(dob.value, dob.confidence))
            doe = id_document.fields.get("DateOfExpiration")
            if doe:
                print("Date of Expiration: {} has confidence: {}".format(doe.value, doe.confidence))
            sex = id_document.fields.get("Sex")
            if sex:
                print("Sex: {} has confidence: {}".format(sex.value_data.text, sex.confidence))
            address = id_document.fields.get("Address")
            if address:
                print("Address: {} has confidence: {}".format(address.value, address.confidence))
            # FIXME: uncomment this
            # country = id_document.fields.get("Country")
            # if country:
            #     print("Country: {} has confidence: {}".format(country.value, country.confidence))
            region = id_document.fields.get("Region")
            if region:
                print("Region: {} has confidence: {}".format(region.value, region.confidence))
        # [END recognize_id_documents]

if __name__ == '__main__':
    sample = RecognizeIdDocumentsSample()
    sample.recognize_id_document()
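
Note on the Country field: the review thread on the async sample reports that a field can come back with a confidence but with `value` set to None, and that the text then has to be read from `value_data.text`. One way to make the samples tolerant of that is a small helper that prefers `value` and falls back to the raw text. The sketch below is illustrative only. `field_text`, `describe_fields`, and `fake_fields` are hypothetical names, and the field objects are `SimpleNamespace` stand-ins carrying only the three attributes the samples already use (`value`, `value_data.text`, `confidence`). It is not the SDK's own behavior, and the thread notes that `value_data.text` was itself unreliable for Country at the time.

```python
from types import SimpleNamespace


def field_text(field):
    """Return a printable value for a recognized field, or None if absent."""
    if field is None:
        return None
    if field.value is not None:
        return field.value
    # Fall back to the raw text when the parsed value is missing.
    value_data = getattr(field, "value_data", None)
    return getattr(value_data, "text", None)


def describe_fields(fields, names):
    """Build one output line per requested field that is present in `fields`."""
    lines = []
    for name in names:
        field = fields.get(name)
        if field is None:
            continue
        lines.append("{}: {} has confidence: {}".format(
            name, field_text(field), field.confidence))
    return lines


# Hypothetical stand-ins for the SDK's field objects; not real service output.
fake_fields = {
    "FirstName": SimpleNamespace(value="Jane", value_data=None, confidence=0.9),
    "Country": SimpleNamespace(
        value=None, value_data=SimpleNamespace(text="USA"), confidence=0.8),
}

for line in describe_fields(fake_fields, ["FirstName", "Country", "Region"]):
    print(line)
```

Driving the output from a list of field names also removes the repeated get/if/print triplets, at the cost of using the field key rather than a friendlier label such as "First Name" in the printed line.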