Speeding up agentic workflows with WebSockets in the Responses API
Summary
To address latency bottlenecks in agentic workflows caused by repeated per-request API overhead, OpenAI implemented persistent WebSocket connections for its Responses API. By caching conversation state server-side and cutting redundant network round trips, the update allows models like GPT-5.3-Codex-Spark to reach speeds of over 1,000 tokens per second. This architectural shift eliminates the need to rebuild context for every follow-up request, yielding significant performance gains for developers and platforms such as Vercel, Cline, and Cursor.
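The saving described above has two parts: the connection handshake is paid once per WebSocket rather than once per request, and cached conversation state means each follow-up carries only the new input instead of the whole history. The second effect can be sketched conceptually; all names below are hypothetical illustrations, not the real Responses API:

```python
# Conceptual sketch: request payload sizes with and without server-side
# conversation-state caching. Hypothetical helpers, not real API calls.

def stateless_payloads(turns):
    """Without cached state, each request resends the entire conversation."""
    history = []
    sizes = []
    for msg in turns:
        history.append(msg)
        sizes.append(sum(len(m) for m in history))  # full context every time
    return sizes

def stateful_payloads(turns):
    """With cached state, each request sends only the new message."""
    return [len(msg) for msg in turns]

turns = ["plan the task", "run step one", "run step two", "summarize"]
print(stateless_payloads(turns))  # grows with conversation length
print(stateful_payloads(turns))   # proportional to the new input only
```

The stateless payloads grow roughly quadratically over a session, while the cached-state payloads stay flat, which is where the per-follow-up latency win comes from.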
(Source: OpenAI)