Load and Adjust a File
Reference documentation on loading and manipulating files prior to sending, using Mindee client libraries.
This is reference documentation.
Code samples shown are only examples, and will not work as-is. You'll need to copy-paste and modify according to your requirements.
Looking for full code samples?
You can also use the "Ask" button at the top of any page in the documentation.
Requirements
In most cases you'll be loading a source file for use with the Mindee client. See the Client Configuration section for more info.
However, you don't actually need the client initialized to use these features, only the client library installed.
Overview
Overall, the steps to sending a file are:
Load a source file.
Optional: adjust the source file before sending.
Use the Mindee client instance to send the file.
Load a Source File
You can load a source file from a path, from raw bytes, from a bytes stream, or from a language-specific object. Choose the appropriate type based on your application requirements.
If you're unsure of which to use, we recommend loading from a path.
In Python, import the corresponding input class from the mindee module.
To load a path string or a Path instance, use PathInput.
To load raw bytes, use BytesInput. The filename parameter is required.
To load a base-64 string, use Base64Input. The filename parameter is required.
The string will be decoded into bytes internally.
To load a file handle, use FileInput.
It must be opened in binary mode, as a BinaryIO.
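For orientation, the sketch below uses only the Python standard library to show what each kind of source looks like before it is handed to an input class. It does not call the Mindee library, and the sample content and the `invoice.pdf` filename are made up for illustration.

```python
import base64
import io
import tempfile
from pathlib import Path

# Made-up sample content standing in for a real document.
raw_bytes = b"%PDF-1.7\n% sample content\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # A Path instance, and the equivalent path string.
    path_obj = Path(tmp_dir) / "invoice.pdf"
    path_obj.write_bytes(raw_bytes)
    path_str = str(path_obj)

    # A file handle opened in binary mode ("rb") is a binary stream.
    with open(path_str, "rb") as file_handle:
        is_binary = isinstance(file_handle, io.BufferedIOBase)
        handle_bytes = file_handle.read()

# A base-64 string decodes back to the same raw bytes.
b64_string = base64.b64encode(raw_bytes).decode("ascii")
decoded_bytes = base64.b64decode(b64_string)

print(is_binary, handle_bytes == raw_bytes, decoded_bytes == raw_bytes)
```

Raw bytes and base-64 strings carry no name of their own, which is presumably why a filename has to be supplied alongside them.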
In Node.js, import the corresponding input class and instantiate it.
To load a path string, use PathInput.
To load a Buffer instance, use BufferInput. The filename parameter is required.
To load raw bytes, use BytesInput. The filename parameter is required.
To load a Stream, use StreamInput. The filename parameter is required.
To load a base-64 string, use Base64Input. The filename parameter is required.
In PHP, import the corresponding input class from the Mindee\Input namespace.
To load a path string, use PathInput.
To load a file resource, use FileInput.
To load raw bytes, use BytesInput. The filename is required.
To load a base-64 string, use Base64Input. The filename is required.
The string will be decoded into bytes internally.
In Ruby, to load a path string, use the PathInputSource class.
To load raw bytes, use the BytesInputSource class.
To load a base-64 string, use the Base64InputSource class. The filename is required.
The string will be decoded into bytes internally.
To load a file handle, use the FileInputSource class. The filename is required.
It must be opened in binary mode.
In Java, load a file by initializing the LocalInputSource class.
This class has several constructors, one for each supported type of input:
a path string
a Path instance
a File instance
a byte array (the filename is required)
an InputStream instance (the filename is required)
a base-64 string (the filename is required)
In .NET, load a file by initializing the LocalInputSource class.
This class has several constructors, one for each supported type of input:
a path string
a FileInfo instance
a byte array (the filename is required)
a Stream instance (the filename is required)
Source File Metadata
Once a source file is loaded, various metadata can be accessed.
This can be useful for applying business rules based on the input file, for example:
Send PDFs to one model, images to another
Don't send PDFs with too many pages
Save the filename to a database
...
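As a rough illustration of such rules, here is a minimal sketch in plain Python. It does not use the Mindee library: `choose_route`, the model names, and the `MAX_PDF_PAGES` limit are all hypothetical, and the filename and page count are passed in directly rather than read from an input source.

```python
import mimetypes
from typing import Optional

MAX_PDF_PAGES = 20  # hypothetical business limit


def choose_route(filename: str, page_count: int) -> Optional[str]:
    """Return a hypothetical model name for the file, or None to skip sending it."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type == "application/pdf":
        # Don't send PDFs with too many pages.
        if page_count > MAX_PDF_PAGES:
            return None
        return "pdf-model"
    if mime_type is not None and mime_type.startswith("image/"):
        return "image-model"
    # Unknown or unsupported type: do not send.
    return None


for name, pages in [("invoice.pdf", 3), ("contract.pdf", 200), ("photo.jpg", 1)]:
    print(name, "->", choose_route(name, pages))
```

In a real application, the filename, file type, and page count would come from the loaded source file rather than from literals.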
Here are some code samples, using an input source instance.
Adjust the Source File
Optionally make changes and adjustments to the source file before sending.
All file adjustments are applied in-memory to the source file instance.
If loaded from disk, the original file is not modified.
Fix PDF Headers
In some cases, PDFs will have corrupt or invalid headers. Sending these files results in a 4xx HTTP error, as the server is unable to process them.
You can try to fix the headers using the provided functions.
Note: this feature is not yet available for all languages.
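To give an idea of what such a fix can involve, the sketch below handles one common kind of header corruption: stray bytes placed before the `%PDF-` marker. This is an assumption made for illustration. The helper `strip_leading_garbage` is hypothetical and is not the client library's function, which may work differently.

```python
PDF_MARKER = b"%PDF-"


def strip_leading_garbage(data: bytes) -> bytes:
    """Return data starting at the first %PDF- marker, or unchanged if none is found."""
    position = data.find(PDF_MARKER)
    if position <= 0:
        # Either the data already starts with the marker (0)
        # or there is no marker at all (-1): nothing to strip.
        return data
    return data[position:]


broken = b"\x00\x00junk%PDF-1.4\nrest of file"
fixed = strip_leading_garbage(broken)
print(fixed.startswith(PDF_MARKER))
```

The input bytes are never modified. A new bytes object is returned, in keeping with adjustments being applied in memory only.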
Here are some code samples, using an input source instance.
Compress Files
There is no need to send excessively large files to the Mindee API.
However, many modern smartphones take very high resolution images, which produces large files.
We provide a way to compress files before sending them to the API.
Here are some code samples, using an input source instance.
Basic usage is very simple, and can be applied to both images and PDFs:
For images, you can also set a maximum height and/or width. The aspect ratio will always be preserved.
For example to compress and resize to no greater than 1920x1920 pixels:
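The arithmetic behind an aspect-ratio-preserving resize can be sketched as follows. This is plain Python and not the library's implementation. `fit_within` is a hypothetical helper, and it assumes that images already within the limits are left at their original size.

```python
from typing import Optional, Tuple


def fit_within(
    width: int,
    height: int,
    max_width: Optional[int] = None,
    max_height: Optional[int] = None,
) -> Tuple[int, int]:
    """Scale (width, height) down to fit the given limits, keeping the aspect ratio."""
    scale = 1.0  # never upscale (assumption)
    if max_width is not None:
        scale = min(scale, max_width / width)
    if max_height is not None:
        scale = min(scale, max_height / height)
    # The same scale is applied to both sides, so the ratio is preserved.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4032, 3024, max_width=1920, max_height=1920))
```

With these limits, a 4032x3024 photo comes out at 1920x1440: the longer side hits the limit and the shorter side shrinks by the same factor.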
Manipulate PDF Pages
In some cases, PDFs will contain superfluous pages, for example a cover page or terms and conditions that are not useful for the desired data extraction.
These extra pages count towards your billing and slow down processing.
It is therefore in your best interest to remove them before sending.
Parameters:
"Page Indexes" is required and is a list of 0-based page indexes. Use negative values to specify indexes starting from the end, for example -1 for the last page.
"Operation" specifies whether to keep only the specified pages or to remove them. One of "Keep Only" or "Remove".
"On Min Pages" is optional and specifies the minimum number of pages a document must have for the operation to take place. A value of 0 means any number of pages.
Exact naming of parameters will depend on the language.
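How the three parameters interact can be sketched in plain Python. This is an illustration of the semantics described above, not the library's code. `select_pages` and the constant names are hypothetical, and two details are assumptions: indexes outside the document are ignored, and "Keep Only" returns pages in the order given.

```python
from typing import List, Sequence

KEEP_ONLY = "keep_only"
REMOVE = "remove"


def select_pages(
    page_count: int,
    page_indexes: Sequence[int],
    operation: str = KEEP_ONLY,
    on_min_pages: int = 0,
) -> List[int]:
    """Return the 0-based indexes of the pages that remain after the operation."""
    all_pages = list(range(page_count))
    # Below the minimum page count, the operation does not take place.
    if page_count < on_min_pages:
        return all_pages
    # Resolve negative indexes from the end; skip out-of-range and duplicate values.
    resolved: List[int] = []
    for index in page_indexes:
        actual = index + page_count if index < 0 else index
        if 0 <= actual < page_count and actual not in resolved:
            resolved.append(actual)
    if operation == KEEP_ONLY:
        return resolved
    if operation == REMOVE:
        return [page for page in all_pages if page not in resolved]
    raise ValueError(f"unknown operation: {operation!r}")


# Keep only the first and last pages of a 10-page document.
print(select_pages(10, [0, -1]))
# Remove the cover page, but only if the document has at least 3 pages.
print(select_pages(2, [0], operation=REMOVE, on_min_pages=3))
```

The second call leaves the 2-page document untouched, because it is below the "On Min Pages" threshold.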
Here are some code samples, using an input source instance.
Some other examples:
Note: the method is named apply_page_options rather than apply_page_options!, even though the operation is in-place. This is to harmonize with the other client libraries.

