Integrating Mindee

Once you have your model set up, you'll want to start using it!

API key

Make sure you've created your API Key before continuing.

To create and manage your API keys, go to the "API Keys" section on the Mindee Platform.

Click "Create API Key", choose a name for the key, and confirm.

You're now ready to go!
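Rather than pasting the key into source code, a common approach is to read it from an environment variable. The sketch below assumes a variable named MINDEE_API_KEY; that name and the load_api_key helper are choices made for this example, not something the platform requires.

```python
import os


def load_api_key(env_var: str = "MINDEE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key
```

The returned string can then be passed wherever an API key is expected, which keeps the key out of version control.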

Sending a File

To process your document using Mindee, simply send the file using the REST API.

Make a note of your model's ID for use in the API.

When getting started, we recommend the polling method, which is the quickest to set up.

The example below is self-contained and can be run as-is once you fill in your file path, model ID, and API key:

Requires Python 3.9 or later and the requests library.

import json
import time
import requests
from pathlib import Path


def send_file_with_polling(
        file_path: str,
        model_id: str,
        api_key: str,
        max_retries: int = 30,
        polling_interval: int = 2,
) -> dict:
    """Enqueue a file, poll until the job completes, and return the result JSON."""
    file = Path(file_path)
    headers = {"Authorization": api_key}
    form_data = {"model_id": model_id, "rag": False}
    with open(file_path, "rb") as fh:
        files = {"file": (file.name, fh)}
        print(f"Enqueuing file: {file_path}")
        response = requests.post(
            url="https://api-v2.mindee.net/v2/inferences/enqueue",
            files=files,
            data=form_data,
            headers=headers,
        )
    # Stop here on an HTTP error, for example an invalid API key or model ID
    response.raise_for_status()
    job_data = response.json().get("job")
    polling_url = job_data.get("polling_url")

    # Important to wait before attempting to poll
    time.sleep(3)

    # Poll for completion
    for _ in range(max_retries):
        print(f"Polling on: {polling_url}")
        # Redirects are not followed, so a 302 response can be detected explicitly
        poll_response = requests.get(polling_url, headers=headers, allow_redirects=False)
        poll_response.raise_for_status()
        job = poll_response.json().get("job", {})
        job_status = job.get("status")
        if poll_response.status_code == 302 or job_status == "Processed":
            result_url = job.get("result_url")
            print(f"Get result from: {result_url}")
            result_response = requests.get(result_url, headers=headers)
            result_response.raise_for_status()
            return result_response.json()
        # Fail fast instead of polling until the timeout
        if job_status == "Failed":
            raise RuntimeError(f"Job failed: {job.get('error')}")
        # still processing, wait before next poll
        time.sleep(polling_interval)
    # If we've exhausted all retries
    raise TimeoutError(f"Polling timed out after {max_retries} attempts")


result = send_file_with_polling(
    file_path="/path/to/file.pdf",
    model_id="MY_MODEL_ID",
    api_key="MY_API_KEY",
)
print(json.dumps(result, indent=2))

Processing the Results

Once you've sent the file and retrieved the results, you can start extracting data from the JSON payload.

The model's fields will be in the result.fields object in the returned JSON.

Each key in the fields object corresponds to the field's name in your model configuration.

Adapt your processing to the type of each field, for example by looping over list fields.

Accessing a simple value, where my_simple_field is the name of the field in the Model.

my_simple_field = result["inference"]["result"]["fields"]["my_simple_field"]
field_value = my_simple_field["value"]
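A field may be absent from a response, in which case direct indexing raises KeyError. As a defensive sketch, the hypothetical helper below (get_simple_value is not part of any Mindee library) takes the fields object and returns a default instead. That a field can be present with a null value is an assumption made here.

```python
from typing import Any


def get_simple_value(fields: dict, name: str, default: Any = None) -> Any:
    """Return the value of a simple field, or `default` if it is missing or null."""
    field = fields.get(name)
    # A missing field, or one that is not a simple-field object, yields the default
    if not isinstance(field, dict):
        return default
    value = field.get("value")
    return default if value is None else value
```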

Accessing a list of values, where my_list_field is the name of the field in the Model.

my_list_field = result["inference"]["result"]["fields"]["my_list_field"]

# access a value at a given position
field_first_value = my_list_field[0]["value"]

# loop over all values in the list
for list_item in my_list_field:
    item_value = list_item["value"]

Accessing an object field, where my_object_field is the name of the field in the Model. In this hypothetical case, the object has a sub-field named sub_field.

object_field = result["inference"]["result"]["fields"]["my_object_field"]
sub_field_value = object_field["sub_field"]["value"]

Accessing a list of objects, where my_object_list_field is the name of the field in the Model.

object_list_field = result["inference"]["result"]["fields"]["my_object_list_field"]

# access an object at a given position
object_item_0 = object_list_field[0]
sub_field_0_value = object_item_0["sub_field"]["value"]

# loop over object lists
for object_item in object_list_field:
    sub_field_value = object_item["sub_field"]["value"]
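The four access patterns above can be folded into one recursive helper that turns the whole fields object into plain Python values. This is a minimal sketch that assumes fields have exactly the shapes shown in this section; flatten_field and the sample data are made up for illustration and are not part of any Mindee library.

```python
from typing import Any


def flatten_field(field: Any) -> Any:
    """Recursively convert a field into plain Python values."""
    # List field: flatten each item in turn
    if isinstance(field, list):
        return [flatten_field(item) for item in field]
    if isinstance(field, dict):
        # Simple field: keep only its value
        if "value" in field:
            return field["value"]
        # Object field: flatten every sub-field
        return {name: flatten_field(sub) for name, sub in field.items()}
    return field


# Hypothetical sample data mirroring the field names used above
sample_fields = {
    "my_simple_field": {"value": "ACME"},
    "my_list_field": [{"value": 1}, {"value": 2}],
    "my_object_field": {"sub_field": {"value": "x"}},
    "my_object_list_field": [
        {"sub_field": {"value": "a"}},
        {"sub_field": {"value": "b"}},
    ],
}
flat = {name: flatten_field(field) for name, field in sample_fields.items()}
print(flat)
```

One limitation of this sketch: an object with a sub-field literally named value would be mistaken for a simple field.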
