Demo: Azure Form Recognizer Without Labels

In this demo, we will look at the Form Recognizer API from Azure Cognitive Services and create a basic unlabelled solution using nothing more than a few documents, Postman and Azure Storage Explorer. If you’ve not used Postman before, there is a starter post here, and the same here for Azure Storage Explorer.

What is Azure Form Recognizer Without Labels?

Azure Form Recognizer is part of the Azure Cognitive Services suite. Cognitive Services are a set of APIs that developers can use to start building AI or Machine Learning into their applications, without actually needing to know too much about the data science behind it.

Form Recognizer itself is a service that allows you to ‘analyze’ a document, usually a .pdf, .jpg or .png: it applies OCR to read the text and then returns a set of key-value pairs so that you can work with the data.

In short, it’s a nice, easy process.

  • Train a model by putting 5+ documents in blob storage and calling the API.
  • Get the Model Id by calling the Get API
  • Analyze a new document by calling the API with the Model Id and document
  • Get the results!

Provision the Service in Azure

To start working with Cognitive Services, you’ll need to enable it in your Azure Subscription. In the Azure Portal, search for Form Recognizer and set up the Free Tier. The UI is fairly straightforward, so I won’t go over it here. When you’re done, you’ll need the endpoint and the key.

Train a new Model

To train a model, you need to put five or more examples of the completed document in a blob storage container. You must have a minimum of five, and the more examples the better, as they help create a model that can pick up as many variants as possible. Both text and handwriting work, but it’s best to have a standard structure to the form if possible.

Open Azure Storage Explorer and navigate to your container. From here, upload your example documents and create a new SAS token for the container with at least Read and List permissions.

In Postman, create a new POST request to {endpoint}/formrecognizer/v2.0-preview/custom/models. You’ll need two headers: Content-Type set to application/json, and Ocp-Apim-Subscription-Key set to your key.

In the body, pop in a JSON object with a single property, source, whose value is the container URL plus the SAS token you generated for your Storage Account.

{
	"source" : "https://{storageendpoint}/{container}?sv=2019-02-02&si=Read%20List&sr=c&sig=jTOKENy7I2lE3R0HEeIUjmTOKENc1%2FIX%2BI%3D"
}

Assuming the permissions on your accounts are OK, you will get a 201 Created back from the server, and the response will contain a ‘Location’ header giving you the URL to call in the next step!
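
If you’d rather script this step than use Postman, here is a minimal Python sketch of the same call using the requests library. The endpoint, key and SAS URL values are placeholders for your own resource, not real values.

import requests

# Placeholders - swap in your own endpoint, key and container SAS URL
endpoint = "https://{your-resource}.cognitiveservices.azure.com"
key = "{your-key}"
source = "https://{storageendpoint}/{container}?{sas-token}"

resp = requests.post(
    f"{endpoint}/formrecognizer/v2.0-preview/custom/models",
    headers={
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    },
    json={"source": source},
)
resp.raise_for_status()               # expect a 201 Created
model_url = resp.headers["Location"]  # the URL to call in the next step
print(model_url)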

Get your Model

In this step, you will use Postman to GET the URL returned in the Location Header of the previous step. You’ll need to keep the same headers you had before.

The response brings back a summary of your documents, which hopefully will all show a status of succeeded. If they don’t, you may need to address issues with the documents before you can continue. I personally found they were always successful so long as the SAS token had the correct privileges. If you get permission errors, just create a short-term token with increased privileges until you find what works for you; all I needed was List and Read.

The model’s ‘status’ property should say ‘ready’, so just grab your Model Id and go to the next step.
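
The same step in Python is a simple GET against that Location URL. Training runs asynchronously, so this sketch polls until the status moves past ‘creating’; the field names match the v2.0-preview response as I saw it, but treat them as something to verify against your own output.

import time
import requests

key = "{your-key}"
model_url = "{Location header value from the train step}"
headers = {"Ocp-Apim-Subscription-Key": key}

# Poll until training finishes one way or the other
while True:
    model = requests.get(model_url, headers=headers).json()
    status = model["modelInfo"]["status"]
    if status != "creating":
        break
    time.sleep(2)

print(status)                         # should be 'ready'
print(model["modelInfo"]["modelId"])  # keep this for the analyze call
for doc in model["trainResult"]["trainingDocuments"]:
    print(doc["documentName"], doc["status"])  # each should be 'succeeded'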

Analyze your document

To use your newly trained model against a document, you will need to pop it in blob storage. I personally just created a new container and added my document to it, so I could keep a separation between training and real documents. Get a new SAS token for your real document and, using Postman…

POST using the standard headers to {endpoint}/formrecognizer/v2.0-preview/custom/models/{modelId}/analyze

With a body of source pointing at your real document’s URL and SAS token. This time you’ll get a 202 Accepted back, with an Operation-Location header that contains the URL to the results of your analysis.
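
Scripted, the analyze call looks much the same as the training one. Again, a rough Python sketch with placeholder values; the Operation-Location header is the part you need to keep.

import requests

# Placeholders - your own endpoint, key, model id and document SAS URL
endpoint = "https://{your-resource}.cognitiveservices.azure.com"
key = "{your-key}"
model_id = "{modelId from the previous step}"
document_url = "https://{storageendpoint}/{container}/{document}?{sas-token}"

resp = requests.post(
    f"{endpoint}/formrecognizer/v2.0-preview/custom/models/{model_id}/analyze",
    headers={
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    },
    json={"source": document_url},
)
resp.raise_for_status()                           # expect a 202 Accepted
results_url = resp.headers["Operation-Location"]  # poll this for the results
print(results_url)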

Use your data

To get the results of your data, call a GET on {endpoint}/formrecognizer/v2.0-preview/custom/models/{modelId}/analyzeResults/{resultId} (the URL from the Operation-Location header) with the same headers as before.

This gives you a fairly hefty JSON object containing all the keys that were found on the document, their bounding boxes (i.e. where each value was found) and the values.

The result has a list called ‘pageResults’; each page then has a list of ‘keyValuePairs’ that contains your info under key.text and value.text. Just search the object for the keys you are expecting, et voilà.
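
Here’s a rough Python sketch of pulling those pairs out. Analysis is asynchronous too, so it polls first; the pageResults list sits under analyzeResult in the v2.0-preview response, but double-check the exact shape against your own output.

import time
import requests

key = "{your-key}"
results_url = "{Operation-Location header from the analyze step}"
headers = {"Ocp-Apim-Subscription-Key": key}

# Poll until the analysis has finished
while True:
    result = requests.get(results_url, headers=headers).json()
    if result["status"] not in ("notStarted", "running"):
        break
    time.sleep(2)

# Walk pageResults -> keyValuePairs and print what was found
for page in result["analyzeResult"]["pageResults"]:
    for pair in page.get("keyValuePairs", []):
        print(pair["key"]["text"], "=>", pair["value"]["text"])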

Summary

That’s it! You’ve got your data and can now do whatever you want with it.

It is surprisingly easy to do this, as it needs no labelling of any kind to mark up and teach the service your documents. For this reason it is better suited to simpler documents: one-pagers that are consistently structured. Think about your business documents; if you have customers filling things out and returning them, this might be a candidate for automation.

You could even go further with it and hook it up to Power Automate (Flow), as Form Recognizer is a supported Action. Think: a document comes in, gets scanned to the file server, Flow picks it up, analyzes it and creates a record in your database with the data. Super fun, right?! The possibilities here are endless.

If you need the machine learning to be smarter, labelled learning might be for you, which I will cover in a post next. For now, enjoy your new-found skill!
