You can customize the geai Ingestion Provider using the following parameters.
| Parameter | Description |
| --- | --- |
| strategy | Determines the processing approach. Options: auto (default): Globant Enterprise AI selects the best option based on the document; hi_res: high-resolution processing, which requires the model parameter. hi_res is more expensive but may yield better results for complex documents, especially PDF documents that contain images and tables. For an example, see Documents with images hi_res strategy. |
| model | Specifies the AI model used for image processing when OCR is not feasible. Default: openai/gpt-4o. Use the format provider/modelname. The model must support visual input. Examples: openai/gpt-4o, openai/gpt-4o-mini, anthropic/claude-3-5-sonnet-20240620, vertex_ai/gemini-1.5-flash. |
| imagePrompt | Custom prompt for image interpretation and text generation. If not provided, a default prompt is used. |
| scannedPrompt | Custom prompt for scanned documents where the whole page is an image. |
| tablePrompt | Custom prompt for table interpretation and text generation. If not provided, a default prompt is used. |
| logoProcess | Determines whether the visual model processes the logos within the document. Options: False (default): logos are not processed; True: logos are processed, that is, an explanation of each logo is extracted. |
| dpi | Defines the DPI (dots per inch) used when processing images. Default: 200. |
| structure | Specifies whether the document is assumed to have a table structure. Valid values: (empty) (default): assumes no table structure; table: assumes the document is in a tabular format, applicable only to csv and xls* formats. For an example, see table structure ingestion strategy. |
| password | Password used to process password-protected PDF files. |
| startPage | First page to process. Default: 1. Use this parameter when only a range of pages in the document needs to be processed, starting at startPage. |
| endPage | Last page to process. If set to 0, processing continues until the end of the document. Use this parameter when only a range of pages in the document needs to be processed, ending at endPage. |
| chunkStrategy(1) | Valid values: (empty) or byLayoutType. Use byLayoutType when processing images and tables: it keeps each image or table whole within a single chunk instead of splitting its content across chunks. |
| chunkSize | Overrides the default chunkSize. |
| chunkOverlap | Overrides the default chunkOverlap. |
| dialogue | Flag indicating whether the video or audio contains spoken dialogue to transcribe: 1 if it does, 0 if it does not. |
| mediaPrompt | Prompt used to describe or summarize visual frames in videos. It is used when the video does not contain any dialogue or when dialogue is 0. |
| frameSamplingRate | Time interval, in seconds, between frames extracted from the video. For example, a value of 2 captures one frame every 2 seconds. Default: 5. |
| merge | Number of consecutive transcript lines to merge. Set to 0 to merge all lines. |
| whisperModel | Specifies the Whisper model variant (small, medium, large) used to transcribe the audio from video and audio files into text. Default: small. |
| outputFormat | Specifies the output format. Options: plainText, html, markdown. Default: plainText. |
(1) Specific to RAG Assistants.
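
As a rough illustration of how these parameters combine, the sketch below builds a parameter set for a hi_res run over a page range and checks it against the options listed in the table. The `validate_ingestion_params` helper and the plain-dictionary layout are hypothetical and not part of Globant Enterprise AI; how the parameters are actually submitted depends on the API or console you use. The checks on `model` and `startPage` reflect a strict reading of the table and are assumptions.

```python
# Hypothetical helper: checks a parameter dict against the options documented
# in the table above. It is NOT part of Globant Enterprise AI.

ALLOWED = {
    "strategy": {"auto", "hi_res"},
    "structure": {"", "table"},
    "chunkStrategy": {"", "byLayoutType"},
    "whisperModel": {"small", "medium", "large"},
    "outputFormat": {"plainText", "html", "markdown"},
}


def validate_ingestion_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means none were detected."""
    problems = []

    # Enumerated parameters must use one of the documented values.
    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            problems.append(f"{name}: {params[name]!r} is not one of {sorted(allowed)}")

    # Assumption: hi_res is treated as needing an explicit model,
    # written in provider/modelname format.
    if params.get("strategy") == "hi_res" and "/" not in params.get("model", ""):
        problems.append("model: hi_res expects a model in provider/modelname format")

    # endPage == 0 means "until the end of the document"; otherwise it should
    # not precede startPage. Assumption: page numbers start at 1.
    start = params.get("startPage", 1)
    end = params.get("endPage", 0)
    if start < 1:
        problems.append("startPage: must be 1 or greater")
    if end != 0 and end < start:
        problems.append("endPage: must be 0 or not less than startPage")

    return problems


# Example: high-resolution processing of pages 3 to 10, keeping images and
# tables whole within their chunks, with markdown output.
params = {
    "strategy": "hi_res",
    "model": "openai/gpt-4o",
    "startPage": 3,
    "endPage": 10,
    "chunkStrategy": "byLayoutType",
    "outputFormat": "markdown",
}
print(validate_ingestion_params(params))
```

Validating locally like this is only a convenience for catching typos in option names before a document is submitted; the service's own validation remains authoritative.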