
# Extraction Configuration

Configuration parameters specific to Extraction models.

See also [Basic Model Configuration](/integrations/client-libraries-sdk/basic-model-configuration.md) for parameters that can be used with all models.

## Optional Features Configuration

Enable or disable [Optional Features](/extraction-models/optional-features.md).

{% hint style="warning" icon="money-check-dollar-pen" %}
Enabling a feature not in your plan will result in a Payment Required error (HTTP 402).

Check the [Plans](/account-management/plans.md#feature-comparison) section for more information.
{% endhint %}

The default activation states for Optional Features are set on the platform.\
Any values set here will override the defaults.

Leave empty or null to use the default platform values.

For example, if the Polygon feature is enabled on the platform and `polygon` is explicitly set to `false` in the parameters, the Polygon feature will **not** be enabled for that API call.
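The override rule can be pictured with a small sketch. This is only an illustration of the behavior described above, not the SDK's or the platform's actual code; the function name `effective_feature_state` is hypothetical.

```python
from typing import Optional


def effective_feature_state(platform_default: bool, override: Optional[bool]) -> bool:
    """Return the feature state used for one API call.

    `None` means "no override", so the platform default applies.
    (Hypothetical helper, for illustration only.)
    """
    if override is None:
        return platform_default
    return override


# Polygon is enabled on the platform but explicitly disabled for this call.
print(effective_feature_state(platform_default=True, override=False))  # False
# No override given: the platform default is used.
print(effective_feature_state(platform_default=True, override=None))  # True
```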

{% tabs %}
{% tab title="Python" %}
Only the `model_id` is required.

```python
model_params = ExtractionParameters(
    # ID of the model, required.
    model_id="MY_MODEL_ID",

    # Optional Features: set to `True` or `False` to override defaults

    # Enhance extraction accuracy with Retrieval-Augmented Generation.
    rag=None,
    # Extract the full text content from the document as strings.
    raw_text=None,
    # Calculate bounding box polygons for all fields.
    polygon=None,
    # Boost the precision and accuracy of all extractions.
    # Calculate confidence scores for all fields.
    confidence=None,
    
    # ... any other options ...
)
```

{% endtab %}

{% tab title="Node.js" %}
Only the `modelId` is required.

```typescript
const modelParams = {
  // ID of the model, required.
  modelId: "MY_MODEL_ID",

  // Optional Features: set to `true` or `false` to override defaults

  // Enhance extraction accuracy with Retrieval-Augmented Generation.
  rag: undefined,
  // Extract the full text content from the document as strings.
  rawText: undefined,
  // Calculate bounding box polygons for all fields.
  polygon: undefined,
  // Boost the precision and accuracy of all extractions.
  // Calculate confidence scores for all fields.
  confidence: undefined,
  
  // ... any other options ...
};
```

{% endtab %}

{% tab title="PHP" %}
Only the `modelId` is required.

```php
$modelParams = new ExtractionParameters(
    // ID of the model, required.
    "MY_MODEL_ID",

    // Optional Features: set to `true` or `false` to override defaults

    // Enhance extraction accuracy with Retrieval-Augmented Generation.
    rag: null,
    // Extract the full text content from the document as strings.
    rawText: null,
    // Calculate bounding box polygons for all fields.
    polygon: null,
    // Boost the precision and accuracy of all extractions.
    // Calculate confidence scores for all fields.
    confidence: null,
    
    // ... any other options ...
);
```

{% endtab %}

{% tab title="Ruby" %}
Only the `model_id` is required.

```ruby
model_params = {
    # ID of the model, required.
    model_id: 'MY_MODEL_ID',

    # Optional Features: set to `true` or `false` to override defaults

    # Enhance extraction accuracy with Retrieval-Augmented Generation.
    rag: nil,
    # Extract the full text content from the document as strings.
    raw_text: nil,
    # Calculate bounding box polygons for all fields.
    polygon: nil,
    # Boost the precision and accuracy of all extractions.
    # Calculate confidence scores for all fields.
    confidence: nil,
    
    # ... any other options ...
}
```

{% endtab %}

{% tab title="Java" %}
Only the `modelId` is required.

```java
var modelParams = ExtractionParameters
    // ID of the model, required.
    .builder("MY_MODEL_ID")

    // Optional Features: set to `true` or `false` to override defaults

    // Enhance extraction accuracy with Retrieval-Augmented Generation.
    .rag(null)
    // Extract the full text content from the document as strings.
    .rawText(null)
    // Calculate bounding box polygons for all fields.
    .polygon(null)
    // Boost the precision and accuracy of all extractions.
    // Calculate confidence scores for all fields.
    .confidence(null)
    
    // ... any other options ...

    // complete the builder
    .build();
```

{% endtab %}

{% tab title=".NET" %}
Only the `modelId` is required.

```csharp
var modelParams = new ExtractionParameters(
    // ID of the model, required.
    modelId: "MY_MODEL_ID"

    // Optional Features: set to `true` or `false` to override defaults

    // Enhance extraction accuracy with Retrieval-Augmented Generation.
    , rag: null
    // Extract the full text content from the document as strings.
    , rawText: null
    // Calculate bounding box polygons for all fields.
    , polygon: null
    // Boost the precision and accuracy of all extractions.
    // Calculate confidence scores for all fields.
    , confidence: null
    
    // ... any other options ...
);
```

{% endtab %}
{% endtabs %}

## Dynamic Model Options

These options allow changing how the model performs an inference on a **per-call basis**.

These features can **only** be used via the API.

{% hint style="info" %}
These advanced features are not meant for improving the model's **overall** accuracy.

Instead, make sure the Data Schema has been [properly optimized](/extraction-models/data-schema.md#performance-optimization).
{% endhint %}

### Text Context

Give additional guidelines to the model to help it better process a specific document.

Useful when you have important context about the document **and** the document itself does not contain enough information to convey that context to the model.

This is a free-form text format.

For example, if there is no address on the document, you could remove ambiguity about country or regional differences with:

"The parts supplier is in Canada, these amounts are in CAD"
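When the context comes from your own systems rather than from a person, it can be assembled just before the call. The sketch below is one possible way to do that; `build_text_context` and its parameters are hypothetical names and not part of the SDK. It returns `None` when there is nothing to add, so the text context can simply be left unset.

```python
from typing import Optional


def build_text_context(
    supplier_country: Optional[str], currency: Optional[str]
) -> Optional[str]:
    """Compose a free-form context sentence from facts known outside the document.

    Hypothetical helper: the wording of the sentence is entirely up to you.
    """
    parts = []
    if supplier_country:
        parts.append(f"the parts supplier is in {supplier_country}")
    if currency:
        parts.append(f"these amounts are in {currency}")
    if not parts:
        return None  # nothing to add; leave the text context unset
    sentence = ", ".join(parts) + "."
    # Capitalize only the first character, leaving the rest untouched.
    return sentence[0].upper() + sentence[1:]


print(build_text_context("Canada", "CAD"))
```

The resulting string is what would be passed as the text context parameter shown in the Code Sample section.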

### Data Schema

Change the Data Schema on a per-call basis by adding, removing, or changing fields.

The typical use case is when the data to extract changes based on internal business logic.

To download the JSON string appropriate for your model:

1. Go to your model's page
2. On the left-hand menu, click on "General Settings"
3. Scroll down to the "Actions" section
4. Click on the "Download Data Schema" button:<br>

   <figure><img src="/files/OXM8QOI0EXGc5KuXi0MI" alt="The &#x22;Download Data Schema&#x22; button" width="530"><figcaption></figcaption></figure>
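
Once downloaded, the file can be read into a string before being passed as the Data Schema parameter. The following is a minimal sketch using only the Python standard library; the helper name `load_data_schema` and the file name in the usage note are assumptions. It checks only that the file is syntactically valid JSON and does not inspect the schema's structure.

```python
import json
from pathlib import Path


def load_data_schema(path: str) -> str:
    """Read a downloaded Data Schema file and return it as a JSON string.

    Raises ValueError if the file does not contain valid JSON.
    (Hypothetical helper, not part of the SDK.)
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        # Syntax check only; the schema's fields are not examined here.
        json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"{path} is not valid JSON: {exc}") from exc
    return text
```

For instance, `load_data_schema("data_schema.json")` would return the file's content as a string, assuming the download was saved under that name.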

### Code Sample

The Data Schema can be passed as a JSON string or by instantiating the appropriate classes.

If passed as a JSON string, it will be validated in the client before being sent to the server.

{% tabs %}
{% tab title="Python" %}
Only the `model_id` is required.

```python
model_params = ExtractionParameters(
    # ID of the model, required.
    model_id="MY_MODEL_ID",

    # Text Context
    text_context="this is an invoice.",

    # Data Schema
    data_schema="{ ... JSON DATA ... }",

    # ... any other options ...
)
```

{% endtab %}

{% tab title="Node.js" %}
Only the `modelId` is required.

```typescript
const modelParams = {
  // ID of the model, required.
  modelId: "MY_MODEL_ID",

  // Text Context
  textContext: "this is an invoice.",

  // Data Schema
  dataSchema: "{ ... JSON DATA ... }",

  // ... any other options ...
};
```

{% endtab %}

{% tab title="PHP" %}
Only the `modelId` is required.

```php
$modelParams = new ExtractionParameters(
    // ID of the model, required.
    "MY_MODEL_ID",

    // Text Context
    textContext: "this is an invoice.",

    // Data Schema
    dataSchema: "{ ... JSON DATA ... }",

    // ... any other options ...
);
```

{% endtab %}

{% tab title="Ruby" %}
Only the `model_id` is required.

```ruby
model_params = {
    # ID of the model, required.
    model_id: 'MY_MODEL_ID',

    # Text Context
    text_context: "this is an invoice.",

    # Data Schema
    data_schema: "{ ... JSON DATA ... }",

    # ... any other options ...
}
```

{% endtab %}

{% tab title="Java" %}
Only the `modelId` is required.

```java
var modelParams = ExtractionParameters
    // ID of the model, required.
    .builder("MY_MODEL_ID")

    // Text Context
    .textContext("this is an invoice.")

    // Data Schema
    .dataSchema("{ ... JSON DATA ... }")

    // ... any other options ...

    // complete the builder
    .build();
```

{% endtab %}

{% tab title=".NET" %}
Only the `modelId` is required.

```csharp
var modelParams = new ExtractionParameters(
    // ID of the model, required.
    modelId: "MY_MODEL_ID"
    
    // Text Context
    , textContext: "this is an invoice."
    
    // Data Schema
    , dataSchema: "{ ... JSON DATA ... }"
    
    // ... any other options ...
);
```

{% endtab %}
{% endtabs %}

