Running Codex safely at OpenAI
Summary
OpenAI has implemented several measures to ensure the safe deployment and operation of its Codex coding agent. These include establishing clear technical boundaries, allowing quick execution of low-risk actions, and requiring explicit approval for higher-risk operations. The key controls are managed configuration, constrained execution, network policies, and agent-native logs.

Codex operates within a bounded environment: approvals and sandboxing work in tandem to define execution limits and to trigger review for actions outside those boundaries. Auto-review mode streamlines routine requests, while network policies restrict outbound access to known destinations. Authentication is handled securely through OS keyrings, and login is forced through ChatGPT, tying usage to enterprise workspace controls.

Rules govern shell commands: benign commands run without approval, while dangerous commands are blocked or require review. These controls are enforced via cloud-managed requirements, macOS managed preferences, and local files.

For visibility, Codex supports OpenTelemetry log export, providing agent-aware telemetry that complements traditional security logs by explaining the "why" behind actions. An AI security triage agent uses this telemetry to distinguish expected behavior from mistakes and malicious activity, and it also yields operational insight into adoption and system performance. Together, these capabilities aim to balance developer productivity with enterprise security needs.
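To make the layered-enforcement idea concrete, a locally managed policy file might look something like the sketch below. The keys and values here are hypothetical, shown only to illustrate the shape of managed configuration (approval policy, sandboxing, network restrictions); consult the Codex documentation for the actual schema.

```toml
# Hypothetical policy file; key names are illustrative, not Codex's actual schema.
approval_policy = "on-request"    # escalate higher-risk actions for explicit review
sandbox_mode   = "workspace-write" # constrain writes to the project workspace

[network]
# Restrict outbound access to known destinations (example hosts only).
allowed_hosts = ["github.com", "pypi.org"]
```

In a managed deployment, the same settings would typically be pinned centrally (e.g., via cloud-managed requirements or macOS managed preferences) so local edits cannot weaken them.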
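The allow/review/block rules for shell commands can be sketched as a simple policy matcher. This is a hypothetical illustration of the pattern, not Codex's actual implementation; the rule lists and the first-token matching strategy are assumptions made for brevity.

```python
import shlex
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"    # benign command, runs without approval
    REVIEW = "review"  # requires explicit user approval
    BLOCK = "block"    # never executed

# Hypothetical rule lists; a real deployment would manage these centrally.
ALLOWED = {"ls", "cat", "grep", "git"}
BLOCKED = {"rm", "dd", "mkfs"}

def classify(command: str) -> Decision:
    """Classify a shell command by its first token."""
    tokens = shlex.split(command)
    if not tokens:
        return Decision.BLOCK
    prog = tokens[0]
    if prog in BLOCKED:
        return Decision.BLOCK
    if prog in ALLOWED:
        return Decision.ALLOW
    # Anything unrecognized escalates for approval rather than running silently.
    return Decision.REVIEW
```

Defaulting unknown commands to review rather than allow mirrors the approval-first posture described above: low-risk actions stay fast, and everything else triggers a human decision.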
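Agent-aware telemetry can be imagined as structured log records that capture not only the action taken but the agent's stated rationale, which is what lets downstream triage separate expected behavior from mistakes or abuse. The sketch below uses only Python's standard library; the field names are hypothetical, not an actual Codex or OpenTelemetry schema.

```python
import json
import datetime

def agent_log_record(action: str, command: str, rationale: str, decision: str) -> str:
    """Build a structured, agent-aware log line: what ran, and why."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,          # e.g. "shell", "file_write", "network"
        "command": command,        # the concrete operation attempted
        "rationale": rationale,    # the agent's "why" that traditional logs lack
        "decision": decision,      # allow / review / block outcome
    }
    return json.dumps(record)
```

In practice such records would be exported through an OpenTelemetry log pipeline rather than printed locally, so a security triage system can correlate the rationale field with the observed action.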
(Source: OpenAI)