Azure AI Document Intelligence

Azure AI Document Intelligence (formerly known as Azure Form Recognizer) is a machine-learning-based service that extracts text (including handwriting), tables, and key-value pairs from scanned documents and images.
Document Intelligence supports PDF, JPEG, PNG, BMP, and TIFF files.
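Because only the formats listed above are accepted, it can help to check a file before sending it to the service. The following is a minimal sketch, not part of the loader: the helper name `is_supported_file` is hypothetical, and the mapping from formats to file extensions (for example `.jpg` and `.tif`) is an assumption.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above
# (PDF, JPEG, PNG, BMP, TIFF). Adjust if your files use other suffixes.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def is_supported_file(file_path: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_file("invoice.PDF"))  # extension check is case-insensitive
print(is_supported_file("notes.txt"))
```

The check looks only at the file name, so a mislabeled file would still pass; the service itself remains the final authority on what it accepts.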

The current loader implementation uses Document Intelligence to ingest content page by page and convert it into LangChain documents.

%pip install langchain langchain-community azure-ai-documentintelligence -q

Example 1

The first example uses a local file, which is sent to Azure AI Document Intelligence for analysis.

With the endpoint and key of your Document Intelligence resource, create an instance of AzureAIDocumentIntelligenceLoader:

from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader

file_path = "<filepath>"
endpoint = "<endpoint>"
key = "<key>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint, api_key=key, file_path=file_path, api_model="prebuilt-layout"
)

documents = loader.load()

By default, the output is a single LangChain document whose content is formatted as Markdown:

documents
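Since the default output is one Markdown document, a common follow-up step is to break it into sections before indexing. The sketch below shows one way to do that with the standard library only. The helper `split_markdown_by_heading` is hypothetical and not part of LangChain, and it assumes headings use the `#` prefix style and that no `#` lines appear inside fenced code.

```python
import re
from typing import List, Tuple


def split_markdown_by_heading(markdown_text: str) -> List[Tuple[str, str]]:
    """Split Markdown text into (heading, body) pairs.

    Text that appears before the first heading gets an empty heading.
    """
    sections: List[Tuple[str, str]] = []
    heading = ""
    body_lines: List[str] = []

    def flush() -> None:
        # Store the section collected so far, skipping a fully empty one.
        if heading or any(line.strip() for line in body_lines):
            sections.append((heading, "\n".join(body_lines).strip()))

    for line in markdown_text.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(1).strip()
            body_lines = []
        else:
            body_lines.append(line)
    flush()
    return sections


# Stand-in for documents[0].page_content from the loader output.
sample = "# Invoice\nTotal: 42\n\n## Items\n- Widget\n- Gadget"
for title, body in split_markdown_by_heading(sample):
    print(repr(title), "->", repr(body))
```

With real loader output, `documents[0].page_content` would take the place of `sample`. LangChain's own `MarkdownHeaderTextSplitter` addresses the same need and is likely the better choice in a LangChain pipeline.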

Example 2

The input file can also be given as a URL path, passed through url_path instead of file_path.

url_path = "<url>"
loader = AzureAIDocumentIntelligenceLoader(
    api_endpoint=endpoint, api_key=key, url_path=url_path, api_model="prebuilt-layout"
)

documents = loader.load()

documents
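The two examples differ only in whether the source is passed as file_path or url_path. When the source arrives as a plain string, a small helper can pick the right keyword. This is a sketch under assumptions: `source_kwargs` is a hypothetical name, and treating only `http` and `https` schemes as URLs is a simplification.

```python
from urllib.parse import urlparse


def source_kwargs(source: str) -> dict:
    """Return {'url_path': ...} for http(s) URLs, otherwise {'file_path': ...}."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"url_path": source}
    # Anything else (relative paths, absolute paths, drive letters) is
    # treated as a local file.
    return {"file_path": source}


print(source_kwargs("https://example.com/sample.pdf"))
print(source_kwargs("./data/sample.pdf"))
```

The result can then be unpacked into the loader call, for example `AzureAIDocumentIntelligenceLoader(api_endpoint=endpoint, api_key=key, api_model="prebuilt-layout", **source_kwargs(source))`.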
