AI Capabilities - deansilbert/Azure GitHub Wiki
Document Intelligence
Document intelligence describes AI capabilities that process text and make sense of the information it contains. As an extension of optical character recognition (OCR), document intelligence takes the next step a person might take after reading a form or document: it automates extracting, understanding, and saving the data in the text.
Consider an organization that needs to process large numbers of receipts for expense claims, project costs, and other accounting purposes. If someone has to enter the information into a database by hand, the process is relatively slow and potentially error-prone.
Using document intelligence, the company can take a scanned image of a receipt, digitize the text with OCR, and pair each extracted value with its field name in a database. Document intelligence can identify specific data such as the merchant's name, merchant's address, total value, and tax value.
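The last step of that workflow, pairing extracted values with field names in a database, can be sketched as follows. This is a minimal illustration only: the `extracted` dictionary stands in for the output of a document intelligence step, and the table name, column names, and sample values are all hypothetical.

```python
import sqlite3

# Hypothetical output of a document intelligence step: field names paired
# with the values read from one scanned receipt (sample values are made up).
extracted = {
    "merchant_name": "Contoso Cafe",
    "merchant_address": "123 Main Street, Redmond, WA",
    "total": 14.50,
    "tax": 1.32,
}


def save_receipt(conn: sqlite3.Connection, fields: dict) -> int:
    """Insert one receipt's fields into a receipts table and return the row id."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS receipts (
               id INTEGER PRIMARY KEY,
               merchant_name TEXT,
               merchant_address TEXT,
               total REAL,
               tax REAL
           )"""
    )
    # Named placeholders map each dictionary key to the matching column.
    cursor = conn.execute(
        "INSERT INTO receipts (merchant_name, merchant_address, total, tax) "
        "VALUES (:merchant_name, :merchant_address, :total, :tax)",
        fields,
    )
    conn.commit()
    return cursor.lastrowid


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
row_id = save_receipt(conn, extracted)
print(conn.execute(
    "SELECT merchant_name, total FROM receipts WHERE id = ?", (row_id,)
).fetchone())
```

In a real pipeline the dictionary would be built from the service's response rather than written by hand.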
Azure AI Document Intelligence
Azure AI Document Intelligence provides features that analyze documents and forms with prebuilt and custom models. In this module, you explore how Azure AI services provide access to document intelligence capabilities.
Azure AI Document Intelligence consists of features grouped by model type:
- Document Analysis - general document analysis that returns structured data representations, including regions of interest and their inter-relationships.
- Prebuilt Models - pretrained models that have been built to process common document types such as invoices, business cards, ID documents, and more. These models are designed to recognize and extract specific fields that are important for each document type.
- Custom Models - models that you train to identify specific fields that the existing pretrained models do not cover. They include custom classification models and document field extraction models, such as the custom generative AI model and the custom neural model.
Prebuilt Models
The prebuilt models apply advanced machine learning to accurately identify and extract text, key-value pairs, tables, and structures from forms and documents. The main types of documents prebuilt models can process are financial services and legal, US tax, US mortgage, and personal identification documents. Some examples of these capabilities include extracting:
- customer and vendor details from invoices
- sales and transaction details from receipts
- identification and verification details from identity documents
- health insurance details
- business contact details
- agreement and party details from contracts
- taxable compensation, mortgage interest, student loan details and more
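When the service is called from code, a prebuilt model is selected by a model ID string. The lookup below is a small sketch of that idea. The short document-type keys and the `model_id_for` helper are hypothetical, and the model IDs reflect the service's naming as commonly documented, so they should be checked against the current model list before use.

```python
# Hypothetical lookup from a short document-type name to a prebuilt model ID.
# The IDs follow the service's "prebuilt-..." naming; verify them against the
# current Azure AI Document Intelligence model list.
PREBUILT_MODEL_IDS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id document": "prebuilt-idDocument",
    "health insurance card": "prebuilt-healthInsuranceCard.us",
    "contract": "prebuilt-contract",
    "w-2": "prebuilt-tax.us.w2",
}


def model_id_for(document_type: str) -> str:
    """Return the prebuilt model ID for a document type, ignoring case and padding."""
    key = document_type.strip().lower()
    if key not in PREBUILT_MODEL_IDS:
        raise ValueError(f"No prebuilt model known for document type: {document_type!r}")
    return PREBUILT_MODEL_IDS[key]


print(model_id_for("Receipt"))
```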
For example, consider the prebuilt receipt model. It processes receipts by:
- Matching field names to values
- Identifying tables of data
- Identifying specific fields, such as dates, telephone numbers, addresses, totals, and others
The receipt model has been trained to recognize data on several different receipt types, such as thermal receipts (printed on heat-sensitive paper), hotel receipts, gas receipts, credit card receipts, and parking receipts.
Fields recognized include:
- Name, address, and telephone number of the merchant
- Date and time of the purchase
- Name, quantity, and price of each item purchased
- Total, subtotals, and tax values
Each field and its extracted value has a confidence score, indicating the likely level of accuracy. Data extracted with a high confidence score could be used to automatically verify information on a receipt. The receipt model has been trained to recognize several different languages, depending on the receipt type.
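One way to act on these confidence values is to accept high-confidence fields automatically and route the rest to a person for review. The sketch below assumes confidence is a number between 0 and 1. The `ExtractedField` class, the 0.9 threshold, and the sample values are illustrative assumptions, not settings recommended by the service.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical container for one field returned by a receipt model."""
    name: str
    value: object
    confidence: float  # assumed to range from 0.0 to 1.0


def split_by_confidence(fields, threshold=0.9):
    """Return (accepted, needs_review) lists based on a confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Made-up sample data for illustration.
fields = [
    ExtractedField("MerchantName", "Contoso Cafe", 0.98),
    ExtractedField("TransactionDate", "2024-03-15", 0.95),
    ExtractedField("Total", 14.50, 0.62),
]

accepted, needs_review = split_by_confidence(fields)
print("accepted:", [f.name for f in accepted])
print("needs review:", [f.name for f in needs_review])
```

The right threshold depends on how costly a wrong value is for the workload, so it is worth tuning against a sample of real documents.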
Using Azure AI Document Intelligence
To use Azure AI Document Intelligence, create either a Document Intelligence or Azure AI services resource in your Azure subscription. If you have not used Document Intelligence before, select the free tier when you create the resource. There are some restrictions with the free tier; for example, only the first two pages of a PDF or TIFF document are processed.
There are several ways you can use Azure AI Document Intelligence. After the resource has been created, you can use it in Document Intelligence Studio, a user interface for testing document analysis and prebuilt models and for creating custom models.
You can also try Azure AI Document Intelligence in the Azure AI Foundry portal, a unified platform for enterprise AI operations, model builders, and application development.
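Besides these portals, the resource can be called from code. The sketch below assumes the `azure-ai-formrecognizer` Python package and its `DocumentAnalysisClient`. Newer SDK packages use different client names and parameters, so treat it as one possible approach rather than the definitive one. The `analyze_receipt` and `fields_to_dict` helpers are hypothetical, and the endpoint and key would come from your own resource. The demonstration at the end uses a stand-in object, so it runs without a network call.

```python
from types import SimpleNamespace


def fields_to_dict(document) -> dict:
    """Map each field name on an analyzed document to a (value, confidence) pair."""
    return {
        name: (field.value, field.confidence)
        for name, field in document.fields.items()
    }


def analyze_receipt(path: str, endpoint: str, key: str) -> dict:
    """Send a local receipt file to the prebuilt receipt model and return its fields.

    Assumes the azure-ai-formrecognizer package is installed and that
    endpoint and key belong to an existing Document Intelligence resource.
    """
    # Imported here so the rest of the sketch runs without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    with open(path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-receipt", document=f)
    result = poller.result()  # blocks until the analysis finishes

    # A single receipt image is expected to yield one analyzed document.
    if not result.documents:
        return {}
    return fields_to_dict(result.documents[0])


# Stand-in object with the same shape as an analyzed document (made-up values),
# used to show the helper's output without calling the service.
fake_document = SimpleNamespace(
    fields={
        "MerchantName": SimpleNamespace(value="Contoso Cafe", confidence=0.98),
        "Total": SimpleNamespace(value=14.5, confidence=0.97),
    }
)
demo = fields_to_dict(fake_document)
print(demo)
```

With a real resource, a call such as `analyze_receipt("receipt.jpg", endpoint, key)` would return the same kind of dictionary, where the file name is a placeholder for your own document.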