# Receipt OCR

Keep track of the changes and updates for the Receipt OCR API.

## Version 5

#### ⚡️ Features and Changes (April 28th, 2025)

* :sparkles: :zap: Expand the granularity of the **`category`** and **`sub_category`** output.
* :sparkles: New categories: **`energy`**, **`shopping`**, **`software`**
* :sparkles: New sub-categories:
  * **`food`**
    * **`delivery`** (Deliveroo, UberEats...)
  * **`transport`**
    * **`public`** (tram, RER, metro, underground, bus...)
    * **`car_rental`**
    * **`micro_mobility`** (short-term rentals such as scooters and bikes; for further information, see <https://en.wikipedia.org/wiki/Micromobility>)
  * **`shopping`**
    * **`office_supplies`** (notebooks, pens, scissors, ...)
    * **`electronics`** (computer, printer, iPhone, cables, ...)
    * **`cultural`** (books, movies, music, ...)
    * **`groceries`** (food, cleaning, ...)
    * **`other`**
* :zap: **Significant performance improvement for supplier name extraction**\
  We significantly improved the accuracy of `supplier_name` extraction in the Receipt API. By developing and training a novel, domain-specific NLP model, we reduced the error rate by 30% to 50%.
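
The new category and sub-category pairs can be checked on the client side. The following is a minimal sketch, not part of the API: it assumes that the entries listed after `shopping` are its sub-categories, and the names `NEW_SUBCATEGORIES` and `is_known_pair` are hypothetical helpers introduced for illustration.

```python
from typing import Optional

# Sub-categories introduced in this release, keyed by parent category.
# Assumption: the entries listed after `shopping` belong to `shopping`.
# Values from earlier releases are intentionally not included.
NEW_SUBCATEGORIES = {
    "food": {"delivery"},
    "transport": {"public", "car_rental", "micro_mobility"},
    "shopping": {"office_supplies", "electronics", "cultural", "groceries", "other"},
}


def is_known_pair(category: str, subcategory: Optional[str]) -> bool:
    """Return True when the sub-category is absent or listed under the category."""
    if subcategory is None:
        return True
    return subcategory in NEW_SUBCATEGORIES.get(category, set())


print(is_known_pair("transport", "micro_mobility"))
print(is_known_pair("food", "car_rental"))
```

A check like this only covers the values added in this release; sub-categories from earlier versions would have to be added to the mapping before using it for validation.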

#### ⚡️ Features and Changes (October 10th, 2024)

* :bug: Fix `category` / `subcategory` consistency.

#### ⚡️ Features and Changes (June 12th, 2024)

* 🚀  **Extended Latin alphabet support**\
  We released new models for our generic text detection and recognition pipeline. This release improves overall performance on all fields and supports the following extended Latin alphabet characters:

  ```
  {'`', '¡', '¥', '¿', 'Á', 'Ã', 'Ä', 'Å', 'Æ', 'Ì', 'Í', 'Ð', 'Ñ', 'Ò', 'Ó', 'Õ', 'Ö', 'Ø', 'Ú', 'Ü', 'Ý', 'Þ', 'ß', 'á', 'ã', 'ä', 'å', 'æ', 'ì', 'í', 'ð', 'ñ', 'ò', 'ó', 'õ', 'ö', 'ø', 'ú', 'ü', 'ý', 'þ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ė', 'ė', 'Ę', 'ę', 'Ě', 'ě', 'Ğ', 'ğ', 'Ģ', 'ģ', 'Ī', 'ī', 'Į', 'į', 'İ', 'ı', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'Ō', 'ō', 'Ő', 'ő', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ş', 'ş', 'Š', 'š', 'Ť', 'ť', 'Ū', 'ū', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'Ș', 'ș', 'Ț', 'ț', 'ẞ', '₿'}
  ```
* 🔥 **Strong improvement on `taxes` for multi-tax extraction**\
  The main focus of this release was to drastically improve the extraction of multiple taxes.\
  We measured a **56%** decrease in the error rate on `taxes` for multi-tax extraction.
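
To make the figure concrete, here is a small worked example. It assumes the 56% is a relative reduction of the error rate; the baseline of 10% is purely hypothetical and is not a measured value.

```python
def error_rate_after(baseline: float, relative_reduction: float) -> float:
    """Apply a relative reduction (0.56 means 56%) to a baseline error rate."""
    return baseline * (1.0 - relative_reduction)


# Hypothetical baseline: 10% of multi-tax receipts had an incorrect `taxes` field.
hypothetical_baseline = 0.10
new_rate = error_rate_after(hypothetical_baseline, 0.56)
print(f"{new_rate:.3f}")  # prints 0.044
```

Under that assumption, a relative decrease of 56% would bring a 10% error rate down to roughly 4.4%, not to a rate 56 percentage points lower.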

#### ⚡️ Features and Changes (May 16th, 2024)

* 🔥 **Strong improvement on `supplier_name` and `supplier_address`**\
  The main focus of this release was to drastically improve supplier information extraction.\
  We measured a decrease in error rates of:
  * 27% for `supplier_name`
  * 14% for `supplier_address`
* ✨ **New field: `receipt_number`**\
  The API now extracts the receipt number, returned as a string. It is an identifier assigned to some receipts, analogous to an invoice number.

<figure><img src="https://126655343-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2al1MDqAP9Dg9iDRjkWg%2Fuploads%2Fgit-blob-9180c1b0760738e1e3d05a4b9dc30d10aa2bb4c7%2Fa5e8b93-Capture_decran_2024-05-21_a_10.37.00.png?alt=media" alt=""><figcaption></figcaption></figure>
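
Because only some receipts carry a receipt number, client code should treat the field as optional. The sketch below assumes the prediction is available as a dictionary in which each field is an object with a `value` key; that layout and the helper name `get_receipt_number` are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def get_receipt_number(prediction: Mapping[str, Any]) -> Optional[str]:
    """Return the receipt number as a string, or None when it is missing or empty."""
    field = prediction.get("receipt_number") or {}
    value = field.get("value")
    return value if isinstance(value, str) and value else None


# Hypothetical payloads, for illustration only.
with_number = {"receipt_number": {"value": "A-0042"}}
without_number = {"receipt_number": {"value": None}}

print(get_receipt_number(with_number))
print(get_receipt_number(without_number))
```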

#### ⚡️ Features and Changes (September 1st, 2023)

* New feature: Raw Value available for Supplier Name. The Raw Value is the name as extracted, without post-processing or formatting, so it can differ from the Value.

  <figure><img src="https://126655343-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2al1MDqAP9Dg9iDRjkWg%2Fuploads%2Fgit-blob-634bdf3e67260d8ddb20d5855f33a0f0f9c8a62a%2F6e53ade-image.png?alt=media" alt=""><figcaption></figcaption></figure>
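
The difference between the two values can be inspected as follows. This is a sketch under assumptions: the key names `value` and `raw_value` and the sample payload are hypothetical and should be checked against the actual response.

```python
from typing import Any, Mapping, Optional, Tuple


def supplier_name_variants(
    prediction: Mapping[str, Any],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (value, raw_value) for the supplier name; either may be None."""
    field = prediction.get("supplier_name") or {}
    return field.get("value"), field.get("raw_value")


# Hypothetical payload, for illustration only.
sample = {"supplier_name": {"value": "ACME", "raw_value": "Acme Store #12"}}
value, raw_value = supplier_name_variants(sample)
differs = value != raw_value
print(value, raw_value, differs)
```

Keeping both values lets a client display the formatted name while storing the raw one for auditing or for its own normalization.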

#### ⚡️ Features and Changes (April 06, 2023)

* New fields extracted in the response schema:
  * Supplier Address
  * Supplier Company Registrations
  * Supplier Phone Number

<figure><img src="https://126655343-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2al1MDqAP9Dg9iDRjkWg%2Fuploads%2Fgit-blob-a0049231532d8cf657f8b5ed64ad272fc534fb5a%2F77d532e-f994cad-Extraction_-_supplier_ovf.png?alt=media" alt=""><figcaption></figcaption></figure>

* Line items extraction. Line items are returned as a list in the JSON response. Each item includes:
  * a description
  * a quantity
  * a unit price
  * a total amount

<figure><img src="https://126655343-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2al1MDqAP9Dg9iDRjkWg%2Fuploads%2Fgit-blob-cc6c22ced0d013a2b5a623da9b96a9809610f9d2%2F6bed561-Extraction_-_line_items.png?alt=media" alt=""><figcaption></figcaption></figure>
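
A common use of line items is a sanity check that quantity times unit price matches the total amount of each line. The sketch below assumes each item is a dictionary with the keys `description`, `quantity`, `unit_price` and `total_amount`; these key names and the helper `inconsistent_line_items` are assumptions made for illustration.

```python
from typing import Any, List, Mapping, Sequence


def inconsistent_line_items(
    items: Sequence[Mapping[str, Any]], tolerance: float = 0.01
) -> List[int]:
    """Return the indexes of items where quantity * unit_price != total_amount."""
    bad = []
    for index, item in enumerate(items):
        quantity = item.get("quantity")
        unit_price = item.get("unit_price")
        total = item.get("total_amount")
        # Skip items with a missing component: nothing can be verified.
        if quantity is None or unit_price is None or total is None:
            continue
        if abs(quantity * unit_price - total) > tolerance:
            bad.append(index)
    return bad


# Hypothetical line items, for illustration only.
items = [
    {"description": "Coffee", "quantity": 2, "unit_price": 1.5, "total_amount": 3.0},
    {"description": "Bagel", "quantity": 1, "unit_price": 2.0, "total_amount": 2.5},
    {"description": "Discount", "quantity": None, "unit_price": None, "total_amount": -1.0},
]
print(inconsistent_line_items(items))  # prints [1]
```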

* Fields renamed in the JSON response for clarity:
  * `supplier` -> `supplier_name`
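
Clients that must handle responses from both before and after the rename can read either key. This is a minimal sketch that assumes each field is an object with a `value` key; the helper name `get_supplier_name` is hypothetical.

```python
from typing import Any, Mapping, Optional


def get_supplier_name(prediction: Mapping[str, Any]) -> Optional[str]:
    """Read the supplier name from the new key, falling back to the old one."""
    for key in ("supplier_name", "supplier"):
        field = prediction.get(key)
        if field is not None:
            return field.get("value")
    return None


# Hypothetical payloads, for illustration only.
old_style = {"supplier": {"value": "ACME"}}
new_style = {"supplier_name": {"value": "ACME"}}
print(get_supplier_name(old_style), get_supplier_name(new_style))
```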

## Version 4

#### ⚡️ Features and Changes (January 03, 2023)

* New fields extracted in the response schema:
  * `subcategory`, with the following options: `plane`, `train`, `taxi`, `restaurant`, `shopping`
  * `category` has a new option: `telecom`
  * `document_type`, with the following options: "EXPENSE RECEIPT" or "CREDIT CARD RECEIPT"
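
The two document type options can be modeled as an enumeration on the client side. This is a sketch only: the `DocumentType` class and `parse_document_type` helper are hypothetical, and only the two string values come from the release notes above.

```python
from enum import Enum
from typing import Optional


class DocumentType(Enum):
    EXPENSE_RECEIPT = "EXPENSE RECEIPT"
    CREDIT_CARD_RECEIPT = "CREDIT CARD RECEIPT"


def parse_document_type(raw: Optional[str]) -> Optional[DocumentType]:
    """Map the raw string to a DocumentType, or None for unknown or missing values."""
    try:
        return DocumentType(raw)
    except ValueError:
        return None


print(parse_document_type("CREDIT CARD RECEIPT"))
print(parse_document_type("INVOICE"))
```

Returning `None` for unknown values keeps the client tolerant of options added in later versions.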

#### ⚡️ Features and Changes (October 03, 2022)

<figure><img src="https://126655343-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2al1MDqAP9Dg9iDRjkWg%2Fuploads%2Fgit-blob-1fd8a0b2fb79e1042ca95c6b964cc1abef70529d%2F1dc5d42-image_40.png?alt=media" alt=""><figcaption></figcaption></figure>

* Handwritten text recognition for total amount and tips
* Improved extraction performance for each field
* Response schema updated with new extracted fields:
  * Total excluding taxes
  * Tip
  * Total amount (formerly total\_incl)
  * Tax base is now extracted on each tax item
* Detection of 44 currencies: EUR, GBP, CHF, USD, CAD, CZK, NOK, SEK, HUF, RON, PLN, RUB, DKK, XPF, TRY, MXN, COP, BRL, CLP, ARS, AED, SAR, QAR, ILS, OMR, CNY, PHP, SGD, HKD, JPY, MYR, KRW, TWD, THB, VND, INR, IDR, DZD, MAD, TND, XOF, ZAR, XAF, AUD
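
With the tax base available on each tax item, a client can verify that base times rate matches the extracted tax amount. The sketch below assumes each tax item is a dictionary with the keys `base`, `rate` (in percent) and `value`; those key names and the helper `tax_matches_base` are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def tax_matches_base(tax: Mapping[str, Any], tolerance: float = 0.01) -> Optional[bool]:
    """Check that base * rate / 100 matches the tax value.

    Returns None when a component is missing and nothing can be verified.
    """
    base, rate, value = tax.get("base"), tax.get("rate"), tax.get("value")
    if base is None or rate is None or value is None:
        return None
    return abs(base * rate / 100.0 - value) <= tolerance


# Hypothetical tax items, for illustration only.
print(tax_matches_base({"base": 100.0, "rate": 20.0, "value": 20.0}))  # prints True
print(tax_matches_base({"base": 50.0, "rate": 10.0, "value": 6.0}))    # prints False
print(tax_matches_base({"rate": 10.0, "value": 6.0}))                  # prints None
```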

## Version 3

#### ⚡️ Features and Changes (March 24, 2022)

* Response schema updated with new orientation information
* Polygon coordinates and format updated

#### ⚡️ Features and Changes (Oct 19, 2020)

* Supports the extraction and recognition of tax codes (VAT, HST, GST, City Tax, State Tax).

## Version 2

{% hint style="warning" %}
**Receipt V2 API Performance Update**

For better extraction performance on your fields, please upgrade to `v3`, as this version is no longer maintained.
{% endhint %}

#### ⚡️ Features and Changes (Jan 13, 2020)

* Improvement in the extraction of date field
* Addition of merchant name to extracted fields

## Version 1

{% hint style="warning" %}
**Receipt V1 API Deprecation**

The Receipt V1 API is deprecated and no longer supported. Please use `v3` instead.
{% endhint %}

#### ⚡️ Feature: First Release (August 6, 2019)

Extracted fields:

* Expense category
* Locale & currency
* Receipt date
* Receipt time
* Supplier
* Taxes details
* Total amount
