This API allows you to retrieve any document used by a RAG Assistant, either the original document or an internal processed version.

The original document is any binary file (such as a PDF, image, audio clip, or plain text) stored exactly as it was uploaded.

The processed document is a JSON text representation of the original file, generated after the platform has extracted and normalized its content. The API returns this version as UTF-8 plain text.

See the API Reference for the generic variables needed to use the API.

Endpoints

Method | Path                             | Description
GET    | /document/{documentId}           | Returns the original document identified by documentId.
GET    | /document/{documentId}/processed | Returns the processed document identified by documentId as plain UTF-8 text.

Authentication

All endpoints require authentication using one of the following:

  • Authorization: Bearer $GEAI_APITOKEN
  • Authorization: Bearer $OAuth_accesstoken

For $OAuth_accesstoken, you must also include the header: ProjectId: $GEAI_PROJECT_ID.
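
The two authentication options above can be sketched as a small header builder. This is a minimal Python sketch rather than an official client: the function name `build_auth_headers` and its parameters are hypothetical, and only the header names `Authorization` and `ProjectId` come from this page.

```python
from typing import Dict, Optional


def build_auth_headers(token: str, project_id: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for either authentication option.

    Hypothetical helper: pass only `token` when using an API token, or pass
    `token` plus `project_id` when using an OAuth access token, which also
    requires the ProjectId header.
    """
    if not token:
        raise ValueError("a token is required")
    headers = {"Authorization": f"Bearer {token}"}
    if project_id is not None:
        # Only the OAuth option needs the project identifier.
        headers["ProjectId"] = project_id
    return headers
```

Keeping the project identifier optional lets the same helper serve both options without the caller branching on the token type.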

GET /document/{documentId}

This endpoint returns the original document exactly as it was uploaded to the RAG Assistant. The file is delivered in its native binary format (for example, PDF, PNG, or DOCX).

Parameters

Name       | Type   | Description
documentId | string | GUID that uniquely identifies the document. Required.

Request

  • Method: GET
  • Path: $BASE_URL/v1/document/{documentId}
  • Body: Empty

Response

The document in the binary format associated with the file.

cURL Sample

curl -X GET "$BASE_URL/v1/document/{documentId}" \
  -H "Authorization: Bearer $GEAI_APITOKEN"
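
For callers who prefer a script over cURL, the same request can be sketched in Python with the standard library only. The function names, the destination path argument, and the decision to decompress gzip bodies on the client are assumptions made for illustration; the path and the Authorization header come from this page, and the gzip handling reflects the Content-Encoding header listed under Considerations.

```python
import gzip
import urllib.request
from urllib.parse import quote


def build_document_request(base_url: str, document_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the original document."""
    url = f"{base_url.rstrip('/')}/v1/document/{quote(document_id, safe='')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_token}"},
        method="GET",
    )


def download_document(base_url: str, document_id: str, api_token: str, dest_path: str) -> int:
    """Send the request, write the document bytes to dest_path, return the byte count."""
    request = build_document_request(base_url, document_id, api_token)
    with urllib.request.urlopen(request) as response:
        data = response.read()
        # urllib does not undo Content-Encoding on its own, so do it here.
        if response.headers.get("Content-Encoding", "").lower() == "gzip":
            data = gzip.decompress(data)
    with open(dest_path, "wb") as out:
        out.write(data)
    return len(data)
```

A call might look like `download_document(base_url, document_id, api_token, "original.bin")`, where all four values are supplied by the caller. The whole body is read into memory, which is simple but may not suit very large files.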

GET /document/{documentId}/processed

This endpoint returns the processed document: a UTF-8 JSON representation of the file, produced after the RAG Assistant has extracted and normalized its content.

Parameters

Name       | Type   | Description
documentId | string | GUID that uniquely identifies the document. Required.

Request

  • Method: GET
  • Path: $BASE_URL/v1/document/{documentId}/processed
  • Body: Empty

Response

The response body is a JSON array with the following structure (the // comments are annotations, not part of the response):

[
  {
    "pageContent": "...plain text...",
    "metadata": {
      "source": "string", // internal use
      "id": "guid", // internal use
      "doc_id": "guid", // internal use
      "name": "string",
      "extension": "string",
      "description": "string",
      ... // metadata extra items
    }
...
  }
]

cURL Sample

curl -X GET "$BASE_URL/v1/document/{documentId}/processed" \
  -H "Authorization: Bearer $GEAI_APITOKEN"
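
Once the processed document has been retrieved, its text can be reassembled from the array entries. The sketch below assumes the body is valid JSON with the structure shown in the Response section, without the annotation comments; the function names and the blank-line separator are hypothetical choices, and only the `pageContent` and `metadata` keys come from this page.

```python
import json
from typing import Any, Dict, List


def parse_processed_document(body: str) -> List[Dict[str, Any]]:
    """Parse the response body into a list of entries."""
    entries = json.loads(body)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of entries")
    return entries


def full_text(entries: List[Dict[str, Any]], separator: str = "\n\n") -> str:
    """Join the pageContent of every entry, in order, into one string."""
    return separator.join(entry.get("pageContent", "") for entry in entries)
```

For example, `full_text(parse_processed_document(body))` yields the extracted text as a single string, while `entries[0]["metadata"]` gives access to fields such as `name` and `extension`.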

Considerations

The following response headers are returned:

Key                    | Value
Content-Encoding       | gzip
Content-Disposition    | Follows the format: filename*=UTF-8''filename.extension;filename="filename.extension"
X-Frame-Options        | deny
X-Content-Type-Options | nosniff
Transfer-Encoding      | chunked
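
A client that saves the download under its original name needs to read the file name from Content-Disposition. The following Python sketch parses the format shown above, preferring the percent-encoded `filename*` parameter over the quoted `filename` fallback. The function name is hypothetical, and the parser is deliberately simple: it splits on semicolons, so it assumes the file name itself contains none.

```python
from typing import Optional
from urllib.parse import unquote


def filename_from_content_disposition(header_value: str) -> Optional[str]:
    """Extract the file name from a Content-Disposition header value."""
    extended = None
    plain = None
    for part in header_value.split(";"):
        key, sep, value = part.strip().partition("=")
        if not sep:
            continue  # a bare token such as a disposition type
        key = key.strip().lower()
        value = value.strip()
        if key == "filename*":
            # Extended form: charset'language'percent-encoded-name
            charset, _, rest = value.partition("'")
            _language, _, encoded = rest.partition("'")
            extended = unquote(encoded, encoding=charset or "utf-8")
        elif key == "filename":
            plain = value.strip('"')
    return extended if extended is not None else plain
```

The extended parameter is preferred because it can carry non-ASCII characters, which the quoted fallback may not preserve.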

Availability

Available since version 2025-07.

Last update: December 2025