This API returns a textual representation of a file.
The supported file formats are listed in RAG File Formats.
Check the generic variables needed to use the API.
| Method | Path | Description |
| POST | /omni-parser/process | Processes a file in one of the RAG File Formats (such as pdf, docx, pptx) and returns a textual representation. |
- Method: POST
- Path: $BASE_URL/v1/omni-parser/process
- Parameters: The available options are listed in geai Ingestion Provider Parameters and apply to the default provider. Alternatively, you can use the llamaParse provider, which requires an API key.
The expected response uses the format below. Some elements may be optional, depending on the file type:
{
"status": "string", // success, failed
"parts": [ // list of elements
{
"page": integer, // available depending on the file type
"type": "string", // type of element: Title, UncategorizedText, NarrativeText, ListItem, Row
"text": "string", // the content of the element
"midTime": integer // only available with Video
}
...
],
"requestId": "guid",
"usage": { // (1)
"total_cost": decimal,
"total_tokens": integer,
"currency": "string" // USD
}
}
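The response format above can be consumed with a few lines of client code. The following is a minimal sketch rather than an official client: the helper name `text_by_page` is hypothetical, the sample payload is made up for illustration, and the response is assumed to be already available as JSON text. Because `page` is optional, parts without it are grouped under `None`.

```python
import json
from collections import defaultdict


def text_by_page(response: dict) -> dict:
    """Group the text of each part by its page number.

    Parts that carry no "page" key (it is optional) are grouped under None.
    """
    if response.get("status") != "success":
        raise ValueError(f"parsing failed: status={response.get('status')!r}")

    pages = defaultdict(list)
    for part in response.get("parts", []):
        pages[part.get("page")].append(part.get("text", ""))
    return dict(pages)


# Illustrative payload following the documented format (values are made up).
sample = json.loads("""
{
  "status": "success",
  "parts": [
    {"page": 4, "type": "Title", "text": "Article 1"},
    {"page": 4, "type": "NarrativeText", "text": "All human beings are born free."},
    {"type": "ListItem", "text": "No page information"}
  ],
  "requestId": "00000000-0000-0000-0000-000000000000"
}
""")

grouped = text_by_page(sample)
print(grouped[4])
```

Grouping by page is only one possible choice; for audio or video input, where `page` is absent, iterating over `parts` in order is probably more useful.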
# Extract pages 4 to 6 from a sample PDF
curl -X POST "$BASE_URL/v1/omni-parser/process" \
-H "Authorization: Bearer $GEAI_APITOKEN" \
-F 'file=@"/C:/temp/sample.pdf"' \
-F 'startPage="4"' \
-F 'endPage="6"'
# Get the transcript dialogue from the sample video
curl -X POST "$BASE_URL/v1/omni-parser/process" \
-H "Authorization: Bearer $GEAI_APITOKEN" \
-F 'file=@"/C:/temp/sample.mp4"' \
-F 'dialogue="true"'
# Use the small whisper model to process the audio
curl -X POST "$BASE_URL/v1/omni-parser/process" \
-H "Authorization: Bearer $GEAI_APITOKEN" \
-F 'file=@"/C:/temp/sample.mp3"' \
-F 'whisperModel="small"'
# Process a presentation, using gpt-4.1-mini to interpret images
curl -X POST "$BASE_URL/v1/omni-parser/process" \
-H "Authorization: Bearer $GEAI_APITOKEN" \
-F 'file=@"/C:/temp/sample.pptx"' \
-F 'model="openai/gpt-4.1-mini"' \
-F 'strategy="hi_res"'
# Process a spreadsheet with table format
curl -X POST "$BASE_URL/v1/omni-parser/process" \
-H "Authorization: Bearer $GEAI_APITOKEN" \
-F 'file=@"/C:/temp/sample.xlsx"' \
-F 'model="openai/gpt-4o-mini"' \
-F 'structure="table"'
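The same multipart upload can be prepared from Python. The sketch below uses only the standard library and mirrors the first curl example; the function names `build_multipart` and `build_request` are hypothetical, and it assumes `BASE_URL` and `GEAI_APITOKEN` are set as environment variables. The request is only built here; the commented lines at the end show how it might be sent.

```python
import os
import urllib.request
import uuid


def build_multipart(fields: dict, file_field: str, filename: str, content: bytes):
    """Encode form fields plus one file as multipart/form-data.

    Returns a tuple (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append(f"--{boundary}\r\n".encode())
        chunks.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        chunks.append(f"{value}\r\n".encode())
    chunks.append(f"--{boundary}\r\n".encode())
    chunks.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    chunks.append(b"Content-Type: application/octet-stream\r\n\r\n")
    chunks.append(content)
    chunks.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_request(path: str, fields: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /v1/omni-parser/process."""
    with open(path, "rb") as fh:
        content = fh.read()
    body, content_type = build_multipart(fields, "file", os.path.basename(path), content)
    return urllib.request.Request(
        url=os.environ["BASE_URL"] + "/v1/omni-parser/process",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + os.environ["GEAI_APITOKEN"],
            "Content-Type": content_type,
        },
    )


# Usage (needs a real file plus BASE_URL and GEAI_APITOKEN in the environment):
# req = build_request("sample.pdf", {"startPage": "4", "endPage": "6"})
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read()[:200])
```

A third-party HTTP library would shorten this considerably; the hand-built body is used here only to keep the sketch free of dependencies.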
{
"status": "success",
"parts": [
{
"page": 4,
"type": "NarrativeText",
"text": "All human beings are born free and equal in dignity and rights..."
},
{
"page": 5,
"type": "NarrativeText",
"text": "They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood..."
}
],
"requestId": "a9fda40a-7931-4c7e-ac20-cb3531b7d30a",
"usage": {
"total_tokens": 4134,
"completion_tokens": 166,
"prompt_tokens": 3968,
"prompt_cost": 0.0003968,
"completion_cost": 6.64E-05,
"total_cost": 0.0004632,
"currency": "USD",
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}
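In the sample response above, `prompt_tokens` plus `completion_tokens` equals `total_tokens`, and `prompt_cost` plus `completion_cost` equals `total_cost`. The sketch below checks that relationship for a `usage` object. It is an illustration only: the helper name `summarize_usage` is hypothetical, and whether the sums always hold is an assumption based on this one sample. `Decimal` is used so that small costs such as 6.64E-05 are added without floating-point drift.

```python
from decimal import Decimal


def summarize_usage(usage: dict) -> dict:
    """Compare the token and cost components of a usage object with its totals."""
    tokens = usage.get("prompt_tokens", 0) + usage.get("completion_tokens", 0)
    cost = Decimal(str(usage.get("prompt_cost", 0))) + Decimal(
        str(usage.get("completion_cost", 0))
    )
    return {
        "tokens_match": tokens == usage["total_tokens"],
        "cost_match": cost == Decimal(str(usage["total_cost"])),
        "currency": usage.get("currency"),
    }


# The usage object from the sample response above.
usage = {
    "total_tokens": 4134,
    "completion_tokens": 166,
    "prompt_tokens": 3968,
    "prompt_cost": 0.0003968,
    "completion_cost": 6.64e-05,
    "total_cost": 0.0004632,
    "currency": "USD",
}

result = summarize_usage(usage)
print(result)
```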
{
"status": "success",
"parts": [
{
"text": "A view of Earth from space at night...",
"midTime": 0
},
{
"text": "Visible city lights across Africa, Europe, and the Middle East...",
"midTime": 8000
},
{
"text": "Dark surrounding background highlighting the planet's curvature and atmospheric glow.",
"midTime": 16000
}
],
"requestId": "a9fda40a-7931-4c7e-ac20-cb3531b7d30b",
"usage": {
"total_tokens": 4132,
"completion_tokens": 156,
"prompt_tokens": 3976,
"prompt_cost": 0.0003976,
"completion_cost": 6.24E-05,
"total_cost": 0.00046,
"currency": "USD",
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}
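For video input, each part carries a `midTime` value that can be turned into a readable timeline. The sketch below is a hedged illustration: the helper name `format_timeline` is hypothetical, and it assumes `midTime` is expressed in milliseconds, which the sample values (0, 8000, 16000) suggest but the reference does not state.

```python
def format_timeline(parts: list) -> list:
    """Render parts that have a midTime as 'MM:SS text' lines.

    Assumption: midTime is in milliseconds. Parts without midTime are skipped.
    """
    lines = []
    for part in parts:
        mid = part.get("midTime")
        if mid is None:
            continue
        minutes, seconds = divmod(mid // 1000, 60)
        lines.append(f"{minutes:02d}:{seconds:02d} {part.get('text', '')}")
    return lines


# Parts shaped like the video sample above (texts shortened for illustration).
parts = [
    {"text": "A view of Earth from space at night...", "midTime": 0},
    {"text": "Visible city lights...", "midTime": 8000},
    {"text": "Dark surrounding background...", "midTime": 16000},
]

timeline = format_timeline(parts)
for line in timeline:
    print(line)
```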
- The API is synchronous.
- Check the HTTP status code of the response and handle any value other than 200.
- vLLM requests are cached when the request body exactly matches a previously processed request, which prevents redundant processing and saves costs. To disable caching for a specific request, set the X-Saia-Cache-Enabled header to false (1).
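The notes above can be reflected in client code roughly as follows. This is a sketch under stated assumptions, not an official client: the helper names `build_headers` and `parse_or_raise` are hypothetical, and the error handling assumes that any status code other than 200, or a `status` other than `success`, should be treated as a failure. Because the API is synchronous, the parsed result is available in the same response and no polling is needed.

```python
import json


def build_headers(token: str, cache_enabled: bool = True) -> dict:
    """Build request headers, optionally disabling the cache for this request."""
    headers = {"Authorization": f"Bearer {token}"}
    if not cache_enabled:
        # Header name taken from the note above; the value is sent as text.
        headers["X-Saia-Cache-Enabled"] = "false"
    return headers


def parse_or_raise(status_code: int, body: str) -> dict:
    """Return the decoded response, or raise if the call did not succeed."""
    if status_code != 200:
        raise RuntimeError(f"omni-parser request failed with HTTP {status_code}: {body[:200]}")
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"omni-parser reported status {payload.get('status')!r}")
    return payload


headers = build_headers("example-token", cache_enabled=False)
print(headers)
```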