Version 2025-05 was officially released on May 12, 2025.

Below are the most important fixes and features introduced in this version.

  • Support for Model Context Protocol (MCP) to integrate external Tools.
    • The GEAI proxy is a Python-based component that enables dynamic integration of external Tools into Globant Enterprise AI via MCP. It acts as a bridge between Globant Enterprise AI and one or more MCP-compliant Tool servers.
    • Once the MCP servers are properly configured and connected through the GEAI proxy, the Tools they expose become automatically available in The Lab > Tools of Globant Enterprise AI, ready for use by any Agent without additional setup.
    • For more information about this protocol, see https://modelcontextprotocol.io/introduction.
    • See how to import Tools using MCP Tool servers.
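
    For illustration, the following is a minimal MCP-compliant Tool server built with the official MCP Python SDK (the mcp package); the server name and tool are hypothetical examples, not part of the product. Once exposed through the GEAI proxy, a server like this would surface its tools in The Lab > Tools.

      # Minimal MCP Tool server sketch (pip install mcp).
      # The server name and tool are hypothetical examples.
      from mcp.server.fastmcp import FastMCP

      mcp = FastMCP("demo-tools")

      @mcp.tool()
      def word_count(text: str) -> int:
          """Count the words in a piece of text."""
          return len(text.split())

      if __name__ == "__main__":
          mcp.run()  # serves the tool over stdio to an MCP client or proxy
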
  • New /responses endpoint for AI Interactions
    • The /responses endpoint was introduced in the Responses API, which is fully compatible with the OpenAI Responses API. It lets you submit prompts as plain text, invoke functions, or pass files such as PDFs and images. By supporting a familiar request/response structure, the endpoint simplifies AI integration and eases the transition for teams already using OpenAI-based workflows.
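
    As a minimal sketch of this compatibility, the standard OpenAI Python client can be pointed at a Globant Enterprise AI instance; the base URL and API key below are placeholders, not documented values.

      # Sketch: calling the /responses endpoint with the OpenAI Python client.
      from openai import OpenAI

      client = OpenAI(
          base_url="https://your-geai-instance/v1",  # placeholder base URL
          api_key="YOUR_GEAI_API_KEY",               # placeholder credential
      )

      response = client.responses.create(
          model="gpt-4.1",
          input="Summarize the key changes in release 2025-05.",
      )
      print(response.output_text)
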
  • New Images API
    • A new API is available that lets you generate images from text prompts. Supported providers: OpenAI, Vertex AI and xAI.
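
    Assuming the Images API mirrors the OpenAI image-generation schema (an assumption based on the OpenAI-compatible Responses API above), a request could look like this sketch; the base URL, key, and model name are placeholders.

      # Sketch: generating an image from a text prompt (placeholder values).
      from openai import OpenAI

      client = OpenAI(base_url="https://your-geai-instance/v1",
                      api_key="YOUR_GEAI_API_KEY")

      image = client.images.generate(
          model="dall-e-3",  # or an image model from Vertex AI / xAI
          prompt="A watercolor skyline at dusk",
          size="1024x1024",
      )
      print(image.data[0].url)
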
  • LLMs:
    • New Gemini models:
      • Gemini 2.5 Pro Preview 'I/O edition': Builds on its predecessor with significantly enhanced coding abilities and improved reasoning for complex tasks. Designed for developers and advanced users, this edition refines performance across benchmarks and expands its problem-solving reach. Release date: May 6th, 2025.
      • Gemini 2.5 Flash: Google's latest model built for complex problem-solving. It allows users to activate thinking and set a thinking budget (1–24k tokens). Designed to balance reasoning and speed, it delivers better performance and accuracy by reasoning before responding.
    • Updates in OpenAI's "o" series:
      • o3: The most powerful reasoning model in the "o" family; it pushes the frontier across coding, math, science, visual perception, and more.
      • o4-mini: A smaller model optimized for fast, cost-efficient reasoning; it achieves remarkable performance for its size and cost, particularly in math, coding, and visual tasks.
      • o1-pro: Available through our Responses API, offering a faster, more flexible, and easier way to create agentic experiences.
      • Over the next few weeks, the o1-preview model will be migrated to the new o3 model, while o1-mini will move to o4-mini. More info in Deprecated Models.
    • Refer to the LLMs with Reasoning Capabilities article for step-by-step guidance on using reasoning-enabled models through the API (a minimal request sketch appears after this list).
    • The new GPT-4.1 model series by OpenAI is now available in the production environment, featuring significant improvements in coding, instruction following, and long-context handling—along with their first-ever nano model.
    • Grok 3 Model Family added, including two pairs of models:
      • Lightweight Variants:
        • grok-3-mini-beta and grok-3-mini-fast-beta support function calling and enhanced reasoning (with configurable effort levels) for tasks like meeting scheduling and basic customer support. Both variants deliver identical response quality; the difference lies in response latency, with the "fast" version optimized for quicker responses.
      • Flagship Variants:
        • grok-3-beta and grok-3-fast-beta are designed for enterprise use cases such as data extraction, coding, and text summarization. They bring deep domain expertise in fields like finance, healthcare, law, and science. Similar to the mini variants, these models have identical capabilities, with the "fast" version offering reduced response times at a higher cost.
    • Llama 4 collection by Meta: We continue to expand our coverage of this model family. Llama 4 Scout and Maverick were recently added through Vertex AI's serverless API. Also available in Beta: Llama 4 Maverick via Groq and SambaNova, and Llama 4 Scout through the Cerebras provider, which serves this model at inference speeds of up to 2,600 tokens per second.
    • Llama Nemotron Collection: The Llama Nemotron Ultra and Super models are now available in Beta as Nvidia NIM microservices. These are advanced reasoning models, post-trained to optimize performance on tasks such as RAG, tool calling, and alignment with human chat preferences. Both models support a context window of up to 128K tokens.
    • Introducing the OpenRouter Provider (Beta):
      • OpenRouter joins the GEAI model suite with its Auto Router meta-model, which analyzes each user query and dynamically routes it to the most suitable LLM, maximizing response quality while minimizing cost and latency.
      • Qwen3 Family recently added: The latest generation in the Qwen large language model series features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its ability to switch seamlessly between a thinking mode for complex reasoning and a non-thinking mode for efficient dialogue ensures versatile, high-quality performance. Qwen3 significantly outperforms prior models such as QwQ and Qwen2.5 in mathematics, coding, commonsense reasoning, creative writing, and interactive dialogue.
    • Better handling of error messages, for example in cases where the LLMs return specific errors.
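
    As referenced above, a minimal sketch of calling a reasoning-enabled model. It assumes the OpenAI-compatible /responses endpoint; the reasoning parameter follows the OpenAI Responses API, and the base URL and key are placeholders.

      # Sketch: setting a configurable reasoning effort level.
      from openai import OpenAI

      client = OpenAI(base_url="https://your-geai-instance/v1",
                      api_key="YOUR_GEAI_API_KEY")

      response = client.responses.create(
          model="o4-mini",
          reasoning={"effort": "high"},  # low / medium / high
          input="Plan a three-step rollout for a new feature flag.",
      )
      print(response.output_text)
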
  • New Python SDK - PyGEAI: a set of libraries, tools, code samples, and documentation that lets developers interact with the platform more easily.
  • New omni-parser API to extract the content of different file types.
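
    A minimal sketch of what a call might look like, assuming the API accepts a file upload over HTTP; the endpoint path and auth header are placeholders, not documented values.

      # Sketch: extracting the content of a PDF via the omni-parser API
      # (hypothetical endpoint path and credentials).
      import requests

      with open("report.pdf", "rb") as f:
          resp = requests.post(
              "https://your-geai-instance/omni-parser",
              headers={"Authorization": "Bearer YOUR_GEAI_API_KEY"},
              files={"file": ("report.pdf", f, "application/pdf")},
          )
      resp.raise_for_status()
      print(resp.json())  # parsed content of the uploaded file
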
  • RAG
    • Support for new audio and video formats.
    • New endpoints to reindex documents in the RAG Assistants API.
    • New parameters available when using the geai Ingestion Provider (see the request sketch after this list):
      • startPage and endPage to selectively process only the pages you need.
      • media parameters such as mediaPrompt, dialogue, frameSamplingRate, and so on.
    • Fixed an issue where the truncate parameter was not supported when calling the cohere-rerank-3.5 model in the Rerank API.
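
    As noted above, a request sketch for the new ingestion parameters. Only the parameter names come from these notes; the endpoint and payload shape are assumptions.

      # Sketch: ingesting a document with the new geai Ingestion Provider
      # parameters (hypothetical endpoint and payload shape).
      import requests

      payload = {
          "startPage": 2,            # begin processing at page 2...
          "endPage": 10,             # ...and stop after page 10
          "mediaPrompt": "Transcribe and briefly describe each scene.",
          "dialogue": True,          # request speaker-separated dialogue
          "frameSamplingRate": 1,    # sample one video frame per second
      }
      resp = requests.post(
          "https://your-geai-instance/rag/documents",  # hypothetical endpoint
          headers={"Authorization": "Bearer YOUR_GEAI_API_KEY"},
          json=payload,
      )
      resp.raise_for_status()
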
  • Flows
    • File support for Teams & Slack: You can now easily send documents, images, audio, and video files through Teams and Slack when you integrate a Flow into these conversational channels.
  • Evaluation Module Enhancements
    • New Metrics Introduced:
      • Faithfulness: Assesses how factually consistent a response is with the retrieved context.
      • Hallucination: Calculated as 1 - Faithfulness, indicating the level of fabricated information.
      • Context Precision: Measures the proportion of relevant information within the retrieved contexts, compared against a reference answer for a given user input. (Note: Current calculation does not yet consider the position of retrieved chunks.)
      • Noise Sensitivity: Analyzes the relationship between Assistant Accuracy and Context Precision across successive runs of an evaluation plan while varying the number of retrieved chunks. It examines how much, and in what way, the quality of the generated response changes when irrelevant content is added to the retrieved context.
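
      A small sketch of the stated relationship between Faithfulness and Hallucination, using illustrative scores rather than values produced by the Evaluation Module.

        # Sketch: Hallucination is defined above as 1 - Faithfulness.
        def hallucination(faithfulness: float) -> float:
            """Level of fabricated information in a response."""
            return 1.0 - faithfulness

        # A response that is 75% consistent with the retrieved context
        # fabricates the remaining 25%.
        assert hallucination(0.75) == 0.25
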
  • The Lab Enhancements
    • Flows Integration: The definition and management of Flows are now fully integrated into The Lab.
    • Agentic Processes:
      • New Conditional Gateway: Introduces the ability to define branching paths based on natural language prompts, enabling dynamic decision-making within processes.
      • New Synchronization Gateway: Allows synchronization of multiple parallel paths. The process automatically waits at this point until all incoming paths are completed.
      • Enhanced Task Flexibility: Tasks now support multiple inputs and outputs, significantly expanding the complexity and richness of the processes you can model.
    • Meta-Agent Iris Improvements
      • Enhanced LLM Selection Experience: When creating or editing an agent with Iris, users now benefit from a refined LLM selection flow, improving usability and model configuration accuracy.
  • The Lab - Custom SSO is not supported in this release.
  • Deployment Guide
  • Components Version Update