This API allows you to retrieve any document used by a RAG Assistant, either the original document or an internal processed version.
The original document is any binary file (such as a PDF, image, audio clip, or plain text) stored exactly as it was uploaded.
The processed document is a JSON representation of the original file, generated after the platform has extracted and normalized its content. The API returns this version as UTF-8 plain text.
Check the API Reference for generic variables needed to use the API.
| Method | Path | Description |
|---|---|---|
| GET | /document/{documentId} | Returns the original document identified by documentId. |
| GET | /document/{documentId}/processed | Returns the processed document identified by documentId as plain UTF-8 text. |
All endpoints require authentication using one of the following:
- Authorization: Bearer $GEAI_APITOKEN
- Authorization: Bearer $OAuth_accesstoken
For $OAuth_accesstoken, you must also include the header: ProjectId: $GEAI_PROJECT_ID.
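As a rough illustration of the two authentication options, the Python sketch below builds the request headers for either case. The helper name `build_auth_headers` and its parameters are hypothetical and not part of the API; only the header names `Authorization` and `ProjectId` come from the description above.

```python
from typing import Dict, Optional


def build_auth_headers(token: str, project_id: Optional[str] = None) -> Dict[str, str]:
    """Return the headers for an API token or an OAuth access token.

    Hypothetical helper: pass project_id only when `token` is an OAuth
    access token, since that case also requires the ProjectId header.
    """
    headers = {"Authorization": f"Bearer {token}"}
    if project_id is not None:
        headers["ProjectId"] = project_id
    return headers
```

With an API token, call it with the token alone; with an OAuth access token, pass the project identifier as the second argument.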
This endpoint returns the original document exactly as it was uploaded to the RAG Assistant. The file is delivered in its native binary format (for example, PDF, PNG, or DOCX).
| Name | Type | Description |
|---|---|---|
| documentId | string | GUID that uniquely identifies the document. Required. |
- Method: GET
- Path: $BASE_URL/v1/document/{documentId}
- Body: Empty
The response body is the file in its native binary format.
curl -X GET "$BASE_URL/v1/document/{documentId}" \
  -H "Authorization: Bearer $GEAI_APITOKEN"
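The same request can be sketched in Python using only the standard library. The function names, the `destination` argument, and the defensive gzip handling are illustrative assumptions; the path and the Authorization header follow the request described above.

```python
import gzip
import urllib.request


def build_document_request(base_url: str, document_id: str, api_token: str,
                           processed: bool = False) -> urllib.request.Request:
    """Build (but do not send) a GET request for a document."""
    path = f"/v1/document/{document_id}"
    if processed:
        path += "/processed"
    url = base_url.rstrip("/") + path
    headers = {"Authorization": f"Bearer {api_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def download_to_file(request: urllib.request.Request, destination: str) -> int:
    """Send the request and write the body to `destination`; return bytes written."""
    with urllib.request.urlopen(request) as response:
        body = response.read()
        # urllib does not decompress automatically, so undo gzip if the
        # server reports it in Content-Encoding (assumed possible here).
        if response.headers.get("Content-Encoding", "").lower() == "gzip":
            body = gzip.decompress(body)
    with open(destination, "wb") as out:
        out.write(body)
    return len(body)
```

Writing in binary mode keeps files such as PDFs or images byte-for-byte identical to what the server sent.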
This endpoint returns the processed document, a UTF-8 encoded JSON version of the file, produced after the RAG Assistant has extracted and normalized its content.
| Name | Type | Description |
|---|---|---|
| documentId | string | GUID that uniquely identifies the document. Required. |
- Method: GET
- Path: $BASE_URL/v1/document/{documentId}/processed
- Body: Empty
The file format is as follows:
[
  {
    "pageContent": "...plain text...",
    "metadata": {
      "source": "string", // internal use
      "id": "guid", // internal use
      "doc_id": "guid", // internal use
      "name": "string",
      "extension": "string",
      "description": "string",
      ... // metadata extra items
    }
    ...
  }
]
curl -X GET "$BASE_URL/v1/document/{documentId}/processed" \
-H "Authorization: Bearer $GEAI_APITOKEN"
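Once the processed document has been fetched, its array of chunks can be consumed along these lines. This is a minimal sketch that assumes the real response is valid JSON (the `//` comments in the format description being annotations only); the function names are hypothetical, and only the `pageContent` and `metadata` keys come from the format shown above.

```python
import json
from typing import Any, Dict, List, Optional


def parse_processed_document(text: str) -> List[Dict[str, Any]]:
    """Parse the response text and check that it is a JSON array."""
    chunks = json.loads(text)
    if not isinstance(chunks, list):
        raise ValueError("expected a JSON array of chunks")
    return chunks


def join_page_content(chunks: List[Dict[str, Any]], separator: str = "\n\n") -> str:
    """Concatenate the pageContent of every chunk, in order."""
    return separator.join(chunk.get("pageContent", "") for chunk in chunks)


def first_metadata_value(chunks: List[Dict[str, Any]], key: str) -> Optional[Any]:
    """Return the first non-missing metadata value for `key`, or None."""
    for chunk in chunks:
        value = chunk.get("metadata", {}).get(key)
        if value is not None:
            return value
    return None
```

For example, `first_metadata_value(chunks, "name")` reads the document name from the first chunk that carries it, without relying on the fields marked as internal use.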
The following response headers are returned:
| Key | Value |
|---|---|
| Content-Encoding | gzip |
| Content-Disposition | follows the format: filename*=UTF-8''filename.extension;filename="filename.extension" |
| X-Frame-Options | deny |
| X-Content-Type-Options | nosniff |
| Transfer-Encoding | chunked |
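A client that wants to save the response under its original name can read it from the Content-Disposition value. The sketch below is one possible way to do that, assuming the header follows the format listed above and that the `filename*` part is percent-encoded UTF-8; the function name is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import unquote


def filename_from_content_disposition(value: str) -> Optional[str]:
    """Extract the file name from a Content-Disposition header value.

    Prefers the extended form (filename*=UTF-8''...), which is
    percent-decoded, and falls back to the quoted filename="..." form.
    """
    extended = re.search(r"filename\*\s*=\s*UTF-8''([^;]+)", value, flags=re.IGNORECASE)
    if extended:
        return unquote(extended.group(1).strip(), encoding="utf-8")
    quoted = re.search(r'filename\s*=\s*"([^"]*)"', value, flags=re.IGNORECASE)
    if quoted:
        return quoted.group(1)
    return None
```

Preferring the extended form means non-ASCII names survive intact, while the quoted form remains available as a fallback.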
Available since version 2025-07.