Invoice Splitter API Python

The Python SDK supports the Mindee V1 Invoice Splitter API.

Product Specifications

Specification                     Details
Endpoint Name                     invoice_splitter
Recommended Version               v1.4
Supports Polling/Webhooks         ✔️ Yes
Supports Synchronous HTTP Calls   ❌ No
Geography                         🌐 Global

Quick-Start

The sample below shows how to extract data from a document using the SDK.

Invoice Splitter Sample

Sample Code

#
# Install the Python client library by running:
# pip install mindee
#

from mindee import Client, product, AsyncPredictResponse

# Init a new client
mindee_client = Client(api_key="my-api-key")

# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")

# Enqueue the file and poll until parsing completes
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
    product.InvoiceSplitterV1,
    input_doc,
)

# Print a brief summary of the parsed data
print(result.document)

Sample Output (rST)

########
Document
########
:Mindee ID: 15ad7a19-7b75-43d0-b0c6-9a641a12b49b
:Filename: default_sample.pdf

Inference
#########
:Product: mindee/invoice_splitter v1.2
:Rotation applied: No

Prediction
==========
:Invoice Page Groups:
  +--------------------------------------------------------------------------+
  | Page Indexes                                                             |
  +==========================================================================+
  | 0                                                                        |
  +--------------------------------------------------------------------------+
  | 1                                                                        |
  +--------------------------------------------------------------------------+

Standard Fields

These fields are generic and used in several products.

BaseField

Each prediction object contains a set of fields that inherit from the generic BaseField class. A typical BaseField object will have the following attributes:

  • value (Union[float, str]): corresponds to the field value. Can be None if no value was extracted.

  • confidence (float): the confidence score of the field prediction.

  • bounding_box ([Point, Point, Point, Point]): exactly 4 relative vertices (Points) of a right rectangle containing the field in the document.

  • polygon (List[Point]): the relative vertices (Points) of a polygon containing the field in the image.

  • page_id (int): the ID of the page; always None for document-level fields.

  • reconstructed (bool): indicates whether the object was reconstructed rather than taken as-is from the API response.

A Point simply refers to a list of two numbers ([float, float]).
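Because the coordinates are relative, drawing a field on a rendered page requires scaling them first. The following is a minimal sketch, assuming that "relative" means a fraction of the page width and height in the 0.0 to 1.0 range. The to_pixels helper and the polygon values are hypothetical and are not part of the SDK.

```python
from typing import List, Tuple

# A Point is a list of two numbers: [x, y]
Point = List[float]


def to_pixels(polygon: List[Point], width: int, height: int) -> List[Tuple[int, int]]:
    """Scale relative vertices to pixel coordinates (hypothetical helper).

    Assumes each coordinate is a fraction of the page width or height.
    """
    return [(round(x * width), round(y * height)) for x, y in polygon]


# Hypothetical polygon values, shaped like a field's polygon attribute
polygon = [[0.1, 0.2], [0.5, 0.2], [0.5, 0.25], [0.1, 0.25]]
pixels = to_pixels(polygon, width=1000, height=2000)
print(pixels)
```

The same helper would apply to bounding_box, since it is also a list of four Points.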

Aside from the previous attributes, all basic fields have access to a custom __str__ method that can be used to print their value as a string.

Specific Fields

These fields are specific to this product and are not used in any other product.

Invoice Page Groups Field

List of page groups. Each group represents a single invoice within a multi-invoice document.

An InvoiceSplitterV1InvoicePageGroup implements the following attributes:

  • page_indexes (List[int]): List of page indexes that belong to the same invoice (group).

Attributes

The following fields are extracted for Invoice Splitter V1:

Invoice Page Groups

invoice_page_groups (List[InvoiceSplitterV1InvoicePageGroup]): List of page groups. Each group represents a single invoice within a multi-invoice document.

# Each group exposes the pages of one invoice through page_indexes
for invoice_page_groups_elem in result.document.inference.prediction.invoice_page_groups:
    print(invoice_page_groups_elem.page_indexes)
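Once the page groups are available, a common next step is to report which pages belong to which invoice. The sketch below runs without an API call by using a hypothetical PageGroup stand-in for InvoiceSplitterV1InvoicePageGroup; only the page_indexes attribute is taken from this page. It assumes page indexes are zero-based, which matches the sample output (indexes 0 and 1 for a two-page file). The describe_groups helper is illustrative and not part of the SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageGroup:
    """Hypothetical stand-in for InvoiceSplitterV1InvoicePageGroup."""

    page_indexes: List[int]


def describe_groups(groups: List[PageGroup]) -> List[str]:
    """Return one line per invoice, using 1-based page numbers (illustrative helper)."""
    lines = []
    for number, group in enumerate(groups, start=1):
        # Assumption: page_indexes are zero-based, so add 1 for display
        pages = ", ".join(str(index + 1) for index in group.page_indexes)
        lines.append(f"Invoice {number}: page(s) {pages}")
    return lines


# Mirrors the sample output: two single-page invoices
groups = [PageGroup(page_indexes=[0]), PageGroup(page_indexes=[1])]
summary = describe_groups(groups)
print("\n".join(summary))
```

With a real response, the list passed to describe_groups would be result.document.inference.prediction.invoice_page_groups.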
