OpenAI's New Agent Building Platform: A Comprehensive Overview

OpenAI has released a suite of new tools designed to help developers and enterprises build useful and reliable AI agents. Agents are systems that independently accomplish tasks on behalf of users, and OpenAI presents them as the next frontier in AI application development.

The Challenge and OpenAI's Solution

Over the past year, OpenAI has introduced advanced reasoning, multimodal interactions, and enhanced safety techniques that laid the groundwork for models capable of handling complex, multi-step tasks. However, developers faced significant challenges in transforming these capabilities into production-ready agents, often struggling with extensive prompt iteration and custom orchestration logic without sufficient visibility or built-in support.

To address these pain points, OpenAI has launched a set of APIs and tools intended to simplify agent development: the Responses API, built-in tools (web search, file search, and computer use), and the Agents SDK.

The Core Components

1. Responses API

The Responses API is OpenAI's new API primitive for building agents with built-in tools. It offers:

A fusion of Chat Completions simplicity with Assistants API tool-use capabilities
A unified item-based design and simpler polymorphism
Intuitive streaming events and SDK helpers like response.output_text
Support for multiple tools and model turns in a single API call

const response = await openai.responses.create({
    model: "gpt-4o",
    tools: [ { type: "web_search_preview" } ],
    input: "What was a positive news story that happened today?",
});

console.log(response.output_text);

The Responses API is particularly valuable for developers who want to easily integrate OpenAI models and built-in tools without the complexity of combining multiple APIs or external vendors.
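
To make the "multiple tools in a single API call" point concrete, here is a minimal Python sketch that only assembles the request arguments and sends nothing over the network. The helper name build_response_request and the placeholder ID "vs_example" are hypothetical; the tool type strings are the ones used in this article's own examples.

```python
from typing import Any


def build_response_request(
    question: str,
    vector_store_ids: list[str],
    model: str = "gpt-4o",
) -> dict[str, Any]:
    """Assemble keyword arguments for one Responses API call.

    Nothing is sent here; the caller would pass the result on,
    for example as client.responses.create(**request).
    """
    tools: list[dict[str, Any]] = [{"type": "web_search_preview"}]
    # Only offer file search when there is a vector store to search.
    if vector_store_ids:
        tools.append(
            {"type": "file_search", "vector_store_ids": list(vector_store_ids)}
        )
    return {"model": model, "tools": tools, "input": question}


request = build_response_request(
    "What was a positive news story that happened today?",
    vector_store_ids=["vs_example"],  # hypothetical placeholder ID
)
print([tool["type"] for tool in request["tools"]])
# prints ['web_search_preview', 'file_search']
```

Keeping request construction separate from the network call makes the tool list easy to inspect and unit test before any tokens are spent.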

2. Built-in Tools

Web Search

This tool lets developers return fast, up-to-date answers with clear citations from the web. It is available with GPT-4o and GPT-4o mini, and has already proven useful in applications such as:

Hebbia: Leveraging web search to help asset managers, private equity firms, and law practices extract actionable insights from extensive datasets, delivering richer market intelligence and improving precision in their analyses.

On the SimpleQA benchmark, GPT-4o search preview and GPT-4o mini search preview score 90% and 88% respectively, outperforming models without search capabilities.

File Search

The enhanced file search tool helps retrieve relevant information from large document collections, supporting:

Multiple file types
Query optimization
Metadata filtering
Custom reranking

Real-world applications include:

Navan: Using file search in its AI-powered travel agent to quickly provide users with precise answers from knowledge-base articles (like company travel policies), tailoring answers to individual account settings and user roles.

const productDocs = await openai.vectorStores.create({
    name: "Product Documentation",
    file_ids: [file1.id, file2.id, file3.id],
});

const response = await openai.responses.create({
    model: "gpt-4o-mini",
    tools: [{
        type: "file_search",
        vector_store_ids: [productDocs.id],
    }],
    input: "What is deep research by OpenAI?",
});

console.log(response.output_text);
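
The article names metadata filtering and custom reranking but does not show how they are configured. The Python sketch below builds a file_search tool entry with those options. The field names filters, ranking_options, score_threshold, and max_num_results reflect my reading of the API reference and should be checked against the current documentation; the metadata key "audience" and the ID "vs_example" are hypothetical.

```python
from typing import Any, Optional


def file_search_tool(
    vector_store_ids: list[str],
    metadata_filter: Optional[dict[str, Any]] = None,
    score_threshold: Optional[float] = None,
    max_num_results: Optional[int] = None,
) -> dict[str, Any]:
    """Build one file_search tool entry; optional settings are omitted when unset."""
    tool: dict[str, Any] = {
        "type": "file_search",
        "vector_store_ids": list(vector_store_ids),
    }
    if metadata_filter is not None:
        # Assumed field name: restricts search to files whose attributes match.
        tool["filters"] = metadata_filter
    if score_threshold is not None:
        if not 0.0 <= score_threshold <= 1.0:
            raise ValueError("score_threshold must be between 0 and 1")
        # Assumed field name: drops results ranked below the threshold.
        tool["ranking_options"] = {"score_threshold": score_threshold}
    if max_num_results is not None:
        tool["max_num_results"] = max_num_results
    return tool


tool = file_search_tool(
    ["vs_example"],  # hypothetical placeholder ID
    metadata_filter={"type": "eq", "key": "audience", "value": "employees"},
    score_threshold=0.5,
    max_num_results=5,
)
print(sorted(tool))
```

The resulting dict would go in the tools list of a Responses API request, in place of the plain file_search entry shown above.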

Computer Use

This research preview tool is powered by the Computer-Using Agent (CUA) model, the same model that enables Operator. The tool captures the mouse and keyboard actions the model generates, so developers can automate computer tasks by translating those actions into executable commands in their own environments.

The tool has achieved impressive benchmark results:

38.1% success on OSWorld for full computer use tasks
58.1% on WebArena
87% on WebVoyager for web-based interactions

Companies already utilizing this capability include:

Unify: Using the computer use tool to access information previously unreachable via APIs, such as verifying through online maps if a business has expanded its real estate footprint.
Luminai: Integrating the tool to automate complex operational workflows for enterprises with legacy systems lacking API availability and standardized data, successfully automating application processing and user enrollment for a community service organization in days rather than months.

const response = await openai.responses.create({
    model: "computer-use-preview",
    tools: [{
        type: "computer_use_preview",
        display_width: 1024,
        display_height: 768,
        environment: "browser",
    }],
    truncation: "auto",
    input: "I'm looking for a new camera. Help me find the best one.",
});

console.log(response.output);
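
The request above only starts the process: the developer's code still has to carry out each action the model suggests and report back what the screen looks like. The Python sketch below illustrates that dispatch step against a fake environment. The FakeBrowser class and the action dictionaries are illustrative assumptions, not the actual response schema, and no API call is made.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Stand-in for a real browser or VM; it only records what it was asked to do."""

    log: list[str] = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")

    def screenshot(self) -> bytes:
        # A real environment would return image bytes of the current screen.
        return f"screenshot after {len(self.log)} actions".encode()


def execute_action(env: FakeBrowser, action: dict) -> bytes:
    """Run one suggested action, then return a fresh screenshot for the model."""
    kind = action.get("type")
    if kind == "click":
        env.click(action["x"], action["y"])
    elif kind == "type":
        env.type_text(action["text"])
    elif kind == "screenshot":
        pass  # nothing to do; the screenshot below is the result
    else:
        # Refuse unknown actions rather than guessing at their meaning.
        raise ValueError(f"unsupported action: {kind!r}")
    return env.screenshot()


env = FakeBrowser()
suggested = [  # hypothetical actions a model might propose
    {"type": "click", "x": 512, "y": 40},
    {"type": "type", "text": "mirrorless camera"},
]
for action in suggested:
    latest_screenshot = execute_action(env, action)
print(env.log)
# prints ['click(512, 40)', "type('mirrorless camera')"]
```

In a real integration the screenshot would be sent back to the model in a follow-up request, and the loop would continue until the model stops proposing actions.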

3. Agents SDK

The open-source Agents SDK simplifies orchestrating multi-agent workflows, offering significant improvements over OpenAI's experimental "Swarm" SDK released last year. Key features include:

Agents: Easily configurable LLMs with clear instructions and built-in tools
Handoffs: Intelligent transfer of control between agents
Guardrails: Configurable safety checks for input and output validation
Tracing & Observability: Visualization of agent execution traces for debugging and performance optimization

from agents import Agent, Runner, WebSearchTool, function_tool

@function_tool
def submit_refund_request(item_id: str, reason: str) -> str:
    """Submit a refund request for an item."""
    # Your refund logic goes here
    return "success"

support_agent = Agent(
    name="Support & Returns",
    instructions="You are a support agent who can submit refunds [...]",
    tools=[submit_refund_request],
)

shopping_agent = Agent(
    name="Shopping Assistant",
    instructions="You are a shopping assistant who can search the web [...]",
    tools=[WebSearchTool()],
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="Route the user to the correct agent.",
    handoffs=[shopping_agent, support_agent],
)

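output = Runner.run_sync(
    starting_agent=triage_agent,
    input="What shoes might work best with my outfit so far?",
)

# The final reply comes from whichever agent ended up handling the request
print(output.final_output)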
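
To illustrate what handoffs and guardrails mean conceptually, here is a small SDK-independent sketch in plain Python. It is not how the Agents SDK is implemented: in the SDK a model decides when to hand off, whereas here keyword checks stand in for that decision. ToyAgent, GuardrailTripped, input_guardrail, and triage are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyAgent:
    name: str
    can_handle: Callable[[str], bool]  # keyword check standing in for an LLM decision


class GuardrailTripped(Exception):
    """Raised when input fails validation before any agent sees it."""


def input_guardrail(text: str, max_chars: int = 500) -> None:
    """Reject empty or oversized input up front."""
    if not text.strip():
        raise GuardrailTripped("empty input")
    if len(text) > max_chars:
        raise GuardrailTripped("input too long")


def triage(text: str, specialists: list[ToyAgent], fallback: ToyAgent) -> ToyAgent:
    """Validate the input, then hand off to the first specialist that accepts it."""
    input_guardrail(text)
    for agent in specialists:
        if agent.can_handle(text):
            return agent
    return fallback


support = ToyAgent("Support & Returns", lambda t: "refund" in t.lower())
shopping = ToyAgent("Shopping Assistant", lambda t: "shoes" in t.lower())
front_desk = ToyAgent("Triage Agent", lambda t: True)

chosen = triage(
    "What shoes might work best with my outfit so far?",
    [support, shopping],
    fallback=front_desk,
)
print(chosen.name)
# prints Shopping Assistant
```

Running the guardrail before routing means invalid input is rejected once, at the entry point, instead of separately in every specialist.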

The SDK has already been successfully adopted by:

Coinbase: Quickly prototyping and deploying AgentKit, a toolkit enabling AI agents to interact with crypto wallets and various on-chain activities, integrating custom actions from their Developer Platform SDK into a fully functional agent in just hours.
Box: Creating agents that leverage web search and the Agents SDK to enable enterprises to search, query, and extract insights from unstructured data stored within Box and public internet sources, respecting internal permissions and security policies.

Future Plans and API Strategy

OpenAI has outlined a roadmap for its APIs:

Chat Completions API: Will continue to be supported with new models and capabilities, particularly for developers who don't require built-in tools.
Assistants API: Will be deprecated, with a target sunset date in mid-2026. Before then, OpenAI is working toward full feature parity between the Assistants API and the Responses API, including support for Assistant-like and Thread-like objects and the Code Interpreter tool.

Safety and Reliability Considerations

OpenAI acknowledges the importance of safety in agent development, particularly for tools like Computer Use. It has conducted extensive safety testing and red teaming, and has implemented mitigations including:

Safety checks against prompt injections
Confirmation prompts for sensitive tasks
Tools to help isolate environments
Enhanced detection of potential policy violations

However, OpenAI notes that human oversight is still recommended, particularly in non-browser environments, where the model's reliability is still improving.
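
One way an application could add its own confirmation step for sensitive tasks is sketched below in Python. This is an application-level pattern under stated assumptions, not OpenAI's built-in mitigation; the action labels in SENSITIVE_ACTIONS and the helper run_with_confirmation are hypothetical.

```python
from typing import Callable

# Hypothetical labels for actions that should never run unattended.
SENSITIVE_ACTIONS = {"submit_payment", "delete_account", "send_email"}


def run_with_confirmation(
    action: str,
    perform: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run perform() only if the action is harmless or a human approved it."""
    if action in SENSITIVE_ACTIONS and not confirm(action):
        return f"skipped {action}: not confirmed"
    return perform()


# In a real application, confirm would prompt a person; here it always declines.
result = run_with_confirmation(
    "submit_payment",
    perform=lambda: "payment submitted",
    confirm=lambda action: False,
)
print(result)
# prints skipped submit_payment: not confirmed
```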

Conclusion

With these new building blocks, OpenAI is setting the stage for agents to become integral to the workforce, enhancing productivity across industries. The company plans to continue investing in deeper integrations across its APIs and in new tools to help deploy, evaluate, and optimize agents in production.

As model capabilities become increasingly agentic, OpenAI aims to provide developers with a seamless platform for building agents that can assist with a wide variety of tasks across any industry. It promises more updates in the coming weeks and months to further simplify and accelerate the development of agentic applications.
