OpenAI Codex - The AI Coding Agent Powered by GPT-5.5

[Figure: OpenAI Codex AI coding agent architecture diagram]

OpenAI Codex is no longer the experimental code-completion API from 2021. It has evolved into a full-blown AI coding agent - a cloud-based system powered by GPT-5.5 that can read your entire codebase, write multi-file changes, generate and run tests, create pull requests, and execute parallel tasks inside a secure sandbox. With over 4 million weekly active users as of April 2026, it has become the most widely deployed AI coding agent in the world.

This guide covers everything you need to know about the modern Codex platform: the model evolution from codex-1 through GPT-5.5, the cloud sandbox architecture, pricing across all tiers, the open-source Codex CLI, the AGENTS.md configuration standard, competitive benchmarks against Claude Code and Copilot, the Codex Security vulnerability scanner, and real-world adoption patterns. Whether you are evaluating Codex for your team or already using it and want to go deeper, this is the definitive reference.

1. What Is OpenAI Codex (2025+)?

If you remember the original Codex from 2021, forget everything about it. That was a code-completion API built on GPT-3 that powered GitHub Copilot's early autocomplete features. OpenAI deprecated it in March 2023. The Codex of 2025 and beyond is an entirely different product - an autonomous coding agent that lives inside ChatGPT and operates in a cloud-based sandboxed environment.

The modern Codex launched in May 2025 as a dedicated panel within the ChatGPT interface. Instead of completing single lines of code, it accepts high-level tasks like "refactor the authentication module to use JWT tokens" or "add pagination to the /users API endpoint and write integration tests." It then clones your repository into a cloud sandbox, reads the relevant files, formulates a plan, writes the code, runs the tests, and presents you with a complete diff or pull request.

Key distinction: The 2021 Codex was a code-completion model (think autocomplete). The 2025+ Codex is a coding agent (think junior developer who reads your codebase, writes code, runs tests, and submits PRs).

Core Identity

At its core, Codex is three things:

  • An agent, not a model. While it is powered by GPT-5.5 (and previously codex-1 and o3), the product is the agent layer - the orchestration, tool use, sandbox execution, and task management that wraps the model.
  • Cloud-native. Every task runs in an isolated cloud sandbox with its own filesystem, package manager, and optional internet access. Your local machine is never touched.
  • Repository-aware. Codex connects to your GitHub repositories (with GitLab and Bitbucket support in beta) and understands your project structure, dependencies, test suites, and CI/CD configuration.

What It Can Do Today

As of April 2026, Codex handles a wide range of software engineering tasks:

  • Write new features across multiple files with correct imports and dependencies
  • Fix bugs by reading stack traces, reproducing the issue, and verifying the fix
  • Refactor code - rename symbols, extract functions, restructure modules
  • Generate unit tests, integration tests, and end-to-end tests
  • Create pull requests with descriptive titles, summaries, and linked issues
  • Answer questions about your codebase by reading and analyzing the source
  • Run up to 8 tasks in parallel on different parts of your codebase
  • Delegate subtasks to specialized subagents for complex multi-step workflows
  • Interact with web browsers and desktop applications via Computer Use
  • Scan codebases for security vulnerabilities with Codex Security

2. The Model Evolution - codex-1 Through GPT-5.5

Understanding Codex requires understanding the models that power it. The agent has gone through a rapid evolution in under a year, with each generation bringing significant capability improvements.

codex-1 (May 2025)

The first model purpose-built for the Codex agent was codex-1, a fine-tuned variant of OpenAI's o3 reasoning model. Unlike general-purpose models, codex-1 was specifically optimized for software engineering tasks: reading large codebases, following coding conventions, writing idiomatic code, and operating within the constraints of a sandboxed environment.

codex-1 achieved a 72.1% score on SWE-Bench Verified, a benchmark that measures a model's ability to resolve real GitHub issues from popular open-source projects. For context, the base o3 model scored 69.1% on the same benchmark - a meaningful gap that demonstrated the value of task-specific fine-tuning.

Model               SWE-Bench Verified   Release      Notes
codex-1             72.1%                May 2025     Fine-tuned o3 for coding tasks
o3 (base)           69.1%                April 2025   General reasoning model
GPT-4.1             54.6%                April 2025   Non-reasoning baseline
Claude 3.5 Sonnet   49.0%                June 2024    Anthropic's coding model

codex-mini (June 2025)

OpenAI followed up with codex-mini, a smaller and faster variant optimized for latency-sensitive tasks. While it scored lower on SWE-Bench, it was 3-4x faster for common operations like code review, simple bug fixes, and test generation. This became the default model for Codex tasks that didn't require deep reasoning.

GPT-5.0 Integration (September 2025)

When GPT-5.0 launched, Codex was among the first products to integrate it. The jump was substantial - GPT-5.0 brought a 200K native context window (up from codex-1's 128K), dramatically better instruction following, and improved ability to maintain consistency across large multi-file changes. The SWE-Bench score climbed to approximately 78%.

GPT-5.5 - The Current Engine (February 2026)

The current Codex agent runs on GPT-5.5, which represents the most capable coding model OpenAI has shipped. Key improvements over GPT-5.0 include:

  • 256K context window - enough to hold entire medium-sized codebases in a single context
  • Improved agentic behavior - better at decomposing complex tasks, recovering from errors, and knowing when to ask for clarification
  • Native tool use - the model was trained with tool-use data from the start, making sandbox operations, file I/O, and shell commands more reliable
  • Reduced hallucination - significantly fewer invented APIs, non-existent functions, or fabricated library features
  • Multi-language fluency - strong performance across Python, TypeScript, Rust, Go, Java, C++, C#, Ruby, PHP, and Swift

Model selection: Codex automatically selects the appropriate model based on task complexity. Simple tasks use a fast variant for quick turnaround, while complex multi-file refactors use the full GPT-5.5 reasoning model. You can override this in settings.

3. How It Works - Cloud Sandbox Architecture

The architecture behind Codex is what separates it from simple code-generation tools. Every task runs inside an isolated cloud sandbox - a lightweight virtual environment that provides a complete development setup without touching your local machine.

The Sandbox Environment

When you assign a task to Codex, the following happens:

  1. Repository clone: Codex clones your connected GitHub repository into the sandbox. For large repos, it uses sparse checkout to pull only the relevant directories.
  2. Environment setup: The sandbox installs dependencies based on your project's configuration files (package.json, requirements.txt, Cargo.toml, go.mod, etc.) - see the sketch after this list.
  3. Task execution: The agent reads relevant files, formulates a plan, writes code, and executes commands (build, test, lint) inside the sandbox.
  4. Result delivery: Once complete, Codex presents a diff of all changes, test results, and optionally creates a pull request directly on GitHub.
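
To make step 2 concrete, here is a minimal Python sketch of how a setup routine might map manifest files to install commands. The mapping, function name, and behavior are illustrative assumptions, not Codex's actual implementation.

# Hypothetical sketch: detect which install commands a repository implies,
# based on the manifest files present. The mapping is an assumption.
from pathlib import Path

INSTALL_COMMANDS = {
    "package.json": "npm install",
    "requirements.txt": "pip install -r requirements.txt",
    "Cargo.toml": "cargo fetch",
    "go.mod": "go mod download",
}

def detect_install_commands(repo_root: str) -> list[str]:
    """Return the install commands implied by manifests found in the repo."""
    root = Path(repo_root)
    return [cmd for manifest, cmd in INSTALL_COMMANDS.items()
            if (root / manifest).exists()]

print(detect_install_commands("."))  # e.g. ['npm install'] for a Node project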

Isolation and Security

Each sandbox is a microVM - a lightweight virtual machine that provides hardware-level isolation. This means:

  • Tasks cannot access your local filesystem, environment variables, or credentials
  • Each task gets a fresh environment - no state leaks between tasks
  • The sandbox has its own network namespace with configurable internet access
  • All sandbox data is destroyed after task completion (configurable retention for debugging)

Internet Access Modes

Codex offers three network modes for sandboxes:

Mode                 Network Access                                      Use Case
Isolated (default)   No internet access                                  Maximum security, internal codebases
Package-only         Access to package registries (npm, PyPI, crates.io) Tasks that need to install dependencies
Full access          Unrestricted internet                               Tasks that need to fetch APIs, documentation, or external resources

Security note: Full internet access means the agent can make outbound HTTP requests. If your codebase contains API keys or secrets in environment variables, use the isolated or package-only mode to prevent accidental exfiltration. Codex strips common secret patterns from sandbox environments, but defense in depth is always recommended.
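
As an illustration of these modes, the sketch below shows how a sandbox egress policy could be enforced. The registry hostnames and mode names are assumptions for the example, not OpenAI's published configuration.

# Hypothetical egress policy for the three sandbox network modes.
PACKAGE_REGISTRIES = {
    "registry.npmjs.org", "pypi.org", "files.pythonhosted.org",
    "crates.io", "static.crates.io",
}

def is_egress_allowed(mode: str, host: str) -> bool:
    if mode == "isolated":
        return False                       # no outbound traffic at all
    if mode == "package-only":
        return host in PACKAGE_REGISTRIES  # registries only
    if mode == "full":
        return True                        # unrestricted internet
    raise ValueError(f"unknown network mode: {mode}")

assert not is_egress_allowed("isolated", "pypi.org")
assert is_egress_allowed("package-only", "registry.npmjs.org")
assert not is_egress_allowed("package-only", "api.example.com")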

Architecture Diagram

The high-level flow looks like this:

User (ChatGPT)
    |
    v
Codex Orchestrator
    |
    +-- Task Queue (up to 8 parallel tasks)
    |       |
    |       v
    +-- Sandbox Pool
            |
            +-- microVM 1: [clone repo] -> [install deps] -> [agent loop] -> [diff/PR]
            +-- microVM 2: [clone repo] -> [install deps] -> [agent loop] -> [diff/PR]
            +-- ...
            |
            v
        GitHub API (PR creation, branch push)

The Agent Loop

Inside each sandbox, the agent operates in a classic observe-think-act loop:

# Simplified pseudocode of the Codex agent loop
MAX_ITERATIONS = 10  # complex tasks typically converge in 3-7 iterations

for iteration in range(MAX_ITERATIONS):
    # 1. Observe: Read files, check test output, review errors
    context = read_relevant_files(task, codebase)

    # 2. Think: Reason about what to do next
    plan = model.reason(task, context, previous_actions)

    # 3. Act: Write code, run commands, create files
    for action in plan.actions:
        if action.type == "write_file":
            write_file(action.path, action.content)
        elif action.type == "run_command":
            output = shell(action.command)
        elif action.type == "read_file":
            context.add(read_file(action.path))

    # 4. Verify: Run the project's test suite inside the sandbox
    test_results = shell("npm test")  # or pytest, cargo test, etc.

    if test_results.all_passed:
        break  # task complete

    # Otherwise loop back with the failure output as new context
    context.add(test_results.errors)

The key insight is that Codex doesn't just generate code and hope for the best. It verifies its own work by running your test suite inside the sandbox. If tests fail, it reads the error output, reasons about the cause, and iterates. This loop typically runs 3-7 iterations for complex tasks.

4. Pricing and Availability

One of the most significant changes in early 2026 was OpenAI making Codex available across all ChatGPT tiers, including the free plan. Here is the complete pricing breakdown as of April 2026.

Plan         Monthly Price   Codex Access                    Parallel Tasks   Notes
Free         $0              Limited (approx. 5 tasks/day)   1                GPT-5.5 mini model, no internet access
Plus         $20             Standard quota                  2                Full GPT-5.5, package-only network
Pro          $100 / $200     High quota / Unlimited          4 / 8            Full internet access, priority queue
Business     $25/user        Team quota pool                 4 per user       Admin controls, audit logs, SSO
Enterprise   Custom          Custom quota                    Custom           VPC deployment, data residency, SLA

Codex-Only Seats

A notable addition in Q1 2026 was the introduction of Codex-only seats for Business and Enterprise plans. These are discounted seats ($15/user/month on Business) for team members who only need Codex access without the full ChatGPT feature set. This is targeted at development teams where not every engineer needs GPT-5.5 for general conversation but everyone needs the coding agent.

API Pricing

For teams building on top of Codex programmatically, the API pricing follows the standard OpenAI token-based model:

GPT-5.5 (Codex tasks):
  Input:   $2.50 / 1M tokens
  Output:  $10.00 / 1M tokens
  Cached:  $1.25 / 1M tokens (50% discount)

GPT-5.5 mini (fast tasks):
  Input:   $0.30 / 1M tokens
  Output:  $1.20 / 1M tokens
  Cached:  $0.15 / 1M tokens

Cost tip: Codex aggressively uses prompt caching for repository context. If you run multiple tasks against the same repo, subsequent tasks benefit from cached file contents, reducing input token costs by up to 50%. The AGENTS.md file (covered in Section 6) is always cached.
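
A worked example of what those rates mean in practice, using the GPT-5.5 prices above. The token counts are made up for illustration; only the per-million rates come from the table.

# Estimate the cost of a Codex task at the listed GPT-5.5 API prices.
PRICES_PER_1M = {"input": 2.50, "cached": 1.25, "output": 10.00}  # USD

def task_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    uncached = input_tokens - cached_tokens
    cost = (uncached * PRICES_PER_1M["input"]
            + cached_tokens * PRICES_PER_1M["cached"]
            + output_tokens * PRICES_PER_1M["output"]) / 1_000_000
    return round(cost, 4)

# First task against a repo: nothing cached yet.
print(task_cost(120_000, 0, 8_000))        # $0.38
# Follow-up task: most repository context is served from the cache.
print(task_cost(120_000, 100_000, 8_000))  # $0.255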

5. Key Capabilities

Codex's capabilities have expanded rapidly since launch. Here is a detailed breakdown of what the agent can do as of April 2026.

Multi-File Edits

Unlike simple code generators that produce isolated snippets, Codex understands project structure. When you ask it to add a new API endpoint, it will:

  • Create the route handler file
  • Update the router configuration to register the new route
  • Add the corresponding data model or schema if needed
  • Update TypeScript types or interfaces across the project
  • Modify the OpenAPI/Swagger spec if one exists
  • Add the route to any middleware chains (auth, validation, rate limiting)

This cross-file awareness is powered by the agent's ability to read and index your entire repository before making changes. It builds an internal map of imports, exports, type definitions, and call graphs.
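
The sketch below illustrates the kind of import indexing this implies, using Python's standard-library ast module to build a file-to-imports map. Codex's actual indexer is internal; this only demonstrates the idea.

# Build a file -> imported-modules map for a Python project.
import ast
from pathlib import Path

def build_import_graph(root: str) -> dict[str, set[str]]:
    graph: dict[str, set[str]] = {}
    for path in Path(root).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        imports: set[str] = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                imports.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                imports.add(node.module)
        graph[str(path)] = imports
    return graph

for module, deps in sorted(build_import_graph("src").items()):
    print(module, "->", sorted(deps))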

Test Generation

Codex generates tests that match your existing test patterns. If your project uses Jest with React Testing Library, it writes Jest tests. If you use pytest with fixtures, it writes pytest tests with fixtures. It reads your existing test files to learn your conventions:

// Codex-generated test matching existing project conventions
describe('UserService', () => {
  let service: UserService;
  let mockRepo: jest.Mocked<UserRepository>;

  beforeEach(() => {
    mockRepo = createMockRepository();
    service = new UserService(mockRepo);
  });

  it('should return paginated users with correct metadata', async () => {
    mockRepo.findAll.mockResolvedValue({
      data: [mockUser({ id: '1' }), mockUser({ id: '2' })],
      total: 15,
    });

    const result = await service.getUsers({ page: 1, pageSize: 2 });

    expect(result.data).toHaveLength(2);
    expect(result.pagination).toEqual({
      page: 1,
      pageSize: 2,
      totalPages: 8,
      totalItems: 15,
    });
    expect(mockRepo.findAll).toHaveBeenCalledWith({
      skip: 0,
      take: 2,
    });
  });

  it('should throw NotFoundError for non-existent user', async () => {
    mockRepo.findById.mockResolvedValue(null);

    await expect(service.getUser('999')).rejects.toThrow(NotFoundError);
  });
});

Pull Request Creation

Codex can create pull requests directly on GitHub. When a task completes, you can choose to:

  • Review the diff in the ChatGPT interface and apply changes manually
  • Create a PR with an auto-generated title, description, and linked issue
  • Push to a branch without creating a PR (for further local work)

The PR descriptions are surprisingly good - they include a summary of changes, the reasoning behind design decisions, a list of files modified, and test results. Codex also adds inline comments on complex changes to explain its approach.

Parallel Task Execution

Pro users can run up to 8 tasks simultaneously, each in its own sandbox. This is transformative for large-scale refactoring. For example, you could run these tasks in parallel:

Task 1: "Migrate all API routes from Express to Fastify"
Task 2: "Update all test files to use the new Fastify test helpers"
Task 3: "Update the Docker configuration for Fastify"
Task 4: "Update the CI/CD pipeline for the new build process"

Each task runs independently, and Codex is smart enough to detect potential conflicts between parallel tasks. If Task 1 and Task 2 both modify the same file, Codex will flag the conflict and suggest a merge strategy.
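
A minimal sketch of that conflict check: two parallel tasks conflict if their changesets touch the same file. The Task shape and names are assumptions for illustration.

# Flag pairs of parallel tasks whose changes overlap on the same files.
from dataclasses import dataclass
from itertools import combinations

@dataclass
class Task:
    name: str
    modified_files: set[str]

def find_conflicts(tasks: list[Task]) -> list[tuple[str, str, set[str]]]:
    conflicts = []
    for a, b in combinations(tasks, 2):
        shared = a.modified_files & b.modified_files
        if shared:
            conflicts.append((a.name, b.name, shared))
    return conflicts

tasks = [
    Task("migrate-fastify", {"src/app.ts", "src/routes/users.ts"}),
    Task("update-tests", {"test/users.test.ts", "src/routes/users.ts"}),
]
for a, b, files in find_conflicts(tasks):
    print(f"conflict between {a} and {b}: {sorted(files)}")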

Subagents (GA March 2026)

Subagents allow Codex to decompose complex tasks into smaller subtasks and delegate them to specialized child agents. This went generally available in March 2026 and is one of the most powerful features for complex engineering work.

When you give Codex a broad task like "set up a complete authentication system with JWT, refresh tokens, role-based access control, and password reset via email," it might spawn subagents for:

  • Subagent 1: JWT token generation and validation middleware
  • Subagent 2: Refresh token rotation and storage
  • Subagent 3: RBAC permission model and decorators
  • Subagent 4: Password reset email flow with templates
  • Subagent 5: Integration tests for the complete auth flow

The parent agent coordinates the subagents, resolves dependencies between their outputs, and merges the results into a coherent changeset. Subagents share the same repository context but operate in isolated execution environments.
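
One way to picture the coordination step is as dependency-ordered scheduling: subtasks that consume another subtask's output must run after it. The sketch below reuses the auth example; the dependency edges are assumptions for illustration.

# Run subagent tasks in dependency order using the standard library.
from graphlib import TopologicalSorter

# Each subtask maps to the subtasks whose output it depends on.
deps = {
    "jwt-middleware": set(),
    "refresh-tokens": {"jwt-middleware"},
    "rbac": {"jwt-middleware"},
    "password-reset": set(),
    "integration-tests": {"jwt-middleware", "refresh-tokens",
                          "rbac", "password-reset"},
}

for subtask in TopologicalSorter(deps).static_order():
    print("running subagent:", subtask)
# jwt-middleware and password-reset have no prerequisites and can start first;
# integration-tests runs last, once every other subtask has merged.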

Computer Use (April 2026)

The newest capability, launched in April 2026, is Computer Use - the ability for Codex to interact with graphical interfaces. This extends the agent beyond code editing into:

  • Browser testing: Codex can open your web application in a headless browser, navigate through user flows, and verify that UI changes render correctly
  • Visual regression: Compare screenshots before and after changes to detect unintended visual side effects
  • Documentation: Navigate your deployed application and generate screenshots for documentation
  • Form filling and testing: Interact with forms, buttons, and dynamic UI elements to test user workflows end-to-end

Computer Use is in early access. It works well for straightforward web applications but can struggle with complex SPAs that rely heavily on client-side state, WebSocket connections, or canvas-based rendering. Expect rapid improvements through Q2-Q3 2026.

6. AGENTS.md Configuration

One of Codex's most influential contributions to the broader AI tooling ecosystem is AGENTS.md - a configuration file that tells AI coding agents how to work with your repository. What started as a Codex-specific feature has become an industry standard governed by the Linux Foundation.

What Is AGENTS.md?

AGENTS.md is a Markdown file placed in the root of your repository (or in subdirectories for module-specific instructions). It provides structured guidance to AI agents about:

  • Project architecture and conventions
  • Build, test, and lint commands
  • Code style preferences and patterns to follow
  • Files and directories the agent should not modify
  • Security-sensitive areas that require human review
  • Dependency management rules
  • PR and commit message conventions

Example AGENTS.md

# AGENTS.md

## Project Overview
This is a TypeScript monorepo using Turborepo with three packages:
- `packages/api` - Express REST API
- `packages/web` - Next.js 15 frontend
- `packages/shared` - Shared types and utilities

## Build & Test
- Build: `turbo build`
- Test: `turbo test`
- Lint: `turbo lint`
- Type check: `turbo typecheck`

## Code Conventions
- Use functional components with hooks (no class components)
- Use `zod` for all runtime validation
- Use `drizzle-orm` for database queries (not raw SQL)
- Error handling: use `Result<T, E>` pattern from `packages/shared/result.ts`
- All API endpoints must have OpenAPI annotations

## Do Not Modify
- `packages/shared/generated/` - auto-generated from OpenAPI spec
- `*.migration.ts` files - managed by drizzle-kit
- `.github/workflows/` - CI/CD managed by platform team

## Security Review Required
- Any changes to `packages/api/src/middleware/auth.ts`
- Any changes to `packages/api/src/middleware/rbac.ts`
- Any new environment variable usage

## PR Conventions
- Branch naming: `codex/{issue-number}-{short-description}`
- Commit messages: conventional commits (feat:, fix:, chore:, etc.)
- PR description must reference the GitHub issue number

The Linux Foundation Standard

In Q1 2026, the Linux Foundation adopted AGENTS.md as a formal open standard under its AI tooling working group. This means:

  • Vendor-neutral: AGENTS.md works with Codex, Claude Code, Cursor, Copilot, Amazon Q, and any other agent that supports the spec
  • Versioned schema: The spec has a formal versioning system (currently v1.2) with backward compatibility guarantees
  • Validation tooling: A CLI validator (agents-md-lint) checks your AGENTS.md for correctness and completeness
  • Community governance: Changes to the spec go through an RFC process with input from all major AI tooling vendors

Adoption tip: Even if you don't use Codex, adding an AGENTS.md to your repository improves the behavior of every AI coding tool your team uses. It takes 15 minutes to write and pays dividends across all AI-assisted development workflows.

Hierarchical Configuration

AGENTS.md supports hierarchical configuration. You can place files at multiple levels:

repo-root/
  AGENTS.md              # Global project rules
  packages/
    api/
      AGENTS.md          # API-specific rules (inherits from root)
    web/
      AGENTS.md          # Frontend-specific rules (inherits from root)
    shared/
      AGENTS.md          # Shared library rules (inherits from root)

Child AGENTS.md files inherit from parent files and can override specific sections. This is particularly useful in monorepos where different packages have different conventions.
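
A minimal sketch of that inheritance rule, assuming each AGENTS.md is parsed into a {section heading: body} map (a simplification of the actual spec): child sections override parent sections of the same name, and everything else is inherited.

# Merge a child AGENTS.md over its parent, section by section.
def merge_agents_config(parent: dict[str, str], child: dict[str, str]) -> dict[str, str]:
    merged = dict(parent)
    merged.update(child)  # child sections win on name conflicts
    return merged

root_cfg = {
    "Build & Test": "turbo build / turbo test / turbo lint",
    "Code Conventions": "zod for validation, drizzle-orm for queries",
}
api_cfg = {"Code Conventions": "All API endpoints must have OpenAPI annotations"}

print(merge_agents_config(root_cfg, api_cfg))
# Build & Test is inherited from the root; Code Conventions is overridden.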

7. Codex CLI - Open Source Terminal Agent

While the cloud-based Codex agent lives inside ChatGPT, Codex CLI is its open-source counterpart - a terminal-based coding agent that runs locally on your machine. It has become one of the most popular developer tools on GitHub, with 72,000+ stars and a thriving contributor community.

Key Facts

Attribute         Detail
Repository        github.com/openai/codex
Language          Rust
License           Apache 2.0
GitHub Stars      72,000+
First Release     April 2025
Current Version   1.x (stable)

Installation

# macOS / Linux
brew install openai/tap/codex

# Or via cargo (Rust toolchain required)
cargo install codex-cli

# Or download pre-built binary
curl -fsSL https://cli.codex.openai.com/install.sh | sh

# Verify installation
codex --version

How It Differs from Cloud Codex

Codex CLI and cloud Codex share the same underlying models but differ in execution:

Feature          Cloud Codex (ChatGPT)         Codex CLI (Terminal)
Execution        Remote cloud sandbox          Local machine
File access      Cloned repo in sandbox        Direct filesystem access
Internet         Configurable per task         Uses your local network
Parallel tasks   Up to 8                       1 (sequential)
PR creation      Built-in GitHub integration   Via git commands
Cost             Included in ChatGPT plan      API token usage (pay-per-token)
Approval modes   Automatic                     suggest / auto-edit / full-auto

Approval Modes

Codex CLI has three approval modes that control how much autonomy the agent has:

# Suggest mode (default) - agent proposes changes, you approve each one
codex "add input validation to the user registration endpoint"

# Auto-edit mode - agent can edit files but asks before running commands
codex --auto-edit "refactor the database layer to use connection pooling"

# Full-auto mode - agent has full autonomy (use with caution)
codex --full-auto "fix all ESLint errors in the project"

Why Rust?

OpenAI chose Rust for Codex CLI for several practical reasons:

  • Startup time: The CLI launches in under 50ms, compared to 500ms+ for Node.js-based alternatives
  • Memory efficiency: Handles large codebases without excessive memory usage
  • Single binary: No runtime dependencies - download one binary and it works
  • Cross-platform: Compiles natively for macOS (ARM/x86), Linux, and Windows
  • Safety: Rust's ownership model prevents the memory bugs that plague long-running agent processes

Community note: Codex CLI's Apache 2.0 license means you can fork it, modify it, and use it in commercial products. Several companies have built internal tooling on top of the Codex CLI codebase, adding custom tool integrations and enterprise authentication.

8. Competitive Landscape

The AI coding agent market in 2026 is fiercely competitive. Codex is the most widely used, but it faces strong competition from several directions. Here is an honest comparison.

Codex vs. Claude Code (Anthropic)

Claude Code is Codex's most direct competitor - a terminal-based coding agent powered by Claude 3.5 Opus and Claude 4 Sonnet. The comparison is nuanced:

Dimension                      OpenAI Codex                Claude Code
Token efficiency               3x more efficient           Higher token consumption per task
Blind code review preference   33% preferred               67% preferred
Execution model                Cloud sandbox + local CLI   Local terminal only
Parallel tasks                 Up to 8                     1 (sequential)
PR creation                    Built-in (cloud)            Via git commands
Subagents                      GA (March 2026)             GA (February 2026)
Open source                    CLI only (Apache 2.0)       Not open source
IDE integration                ChatGPT web + CLI           Terminal + VS Code extension

The headline stat is striking: Codex is 3x more token-efficient than Claude Code for equivalent tasks, meaning it costs significantly less per task at API pricing. However, in blind code review studies where developers evaluated the output quality without knowing which tool produced it, Claude Code was preferred 67% of the time. This suggests Claude Code produces more idiomatic, readable, and well-structured code, even if it uses more tokens to get there.

The practical takeaway: Codex excels at high-throughput, parallel task execution and CI/CD integration. Claude Code excels at interactive pair-programming where code quality and developer experience matter most. Many teams use both.

Codex vs. GitHub Copilot

Copilot and Codex are siblings - both from the OpenAI/Microsoft ecosystem - but they serve different roles:

  • Copilot is an IDE-integrated assistant focused on real-time code completion, inline suggestions, and chat within VS Code/JetBrains. It is reactive - you write code, it suggests completions.
  • Codex is an autonomous agent focused on task completion. You describe what you want, and it does the work independently.

They are complementary, not competitive. Many developers use Copilot for moment-to-moment coding and Codex for larger tasks like feature implementation, refactoring, and test generation. GitHub has been integrating Codex capabilities into Copilot Workspace, blurring the line between the two products.

Codex vs. Cursor

Cursor is an AI-native IDE (a VS Code fork) that embeds AI deeply into the editing experience. Its strengths are:

  • Inline editing: Select code, describe a change, and Cursor modifies it in place
  • Multi-model support: Use GPT-5.5, Claude, Gemini, or local models
  • Codebase indexing: Cursor indexes your entire project for context-aware suggestions
  • Composer: Cursor's agent mode for multi-file changes

Cursor's advantage is the tight IDE integration - changes happen in your editor with full undo/redo support. Codex's advantage is the cloud sandbox model, which means tasks run in the background without blocking your editor, and you can run multiple tasks in parallel.

Codex vs. Amazon Q Developer

Amazon Q Developer is AWS's AI coding assistant, deeply integrated with the AWS ecosystem:

  • AWS expertise: Q excels at AWS-specific tasks - CloudFormation, CDK, Lambda, IAM policies
  • Code transformation: Q can migrate Java 8 to Java 17, .NET Framework to .NET Core
  • Security scanning: Built-in vulnerability detection tuned for AWS services
  • IDE integration: VS Code, JetBrains, and the AWS Console

Q is the best choice for AWS-heavy workloads. Codex is more general-purpose and stronger at non-cloud coding tasks. For teams building on AWS, using Q for infrastructure code and Codex for application code is a common pattern.

Market Share (April 2026)

Tool                 Weekly Active Users   Primary Use Case
GitHub Copilot       15M+                  IDE code completion
OpenAI Codex         4M+                   Autonomous coding agent
Cursor               3M+                   AI-native IDE
Claude Code          1.5M+                 Terminal coding agent
Amazon Q Developer   1M+                   AWS-integrated assistant

9. Codex Security

In Q1 2026, OpenAI launched Codex Security - a specialized mode that uses the Codex agent to scan codebases for security vulnerabilities. This is not a traditional static analysis tool. It uses GPT-5.5's reasoning capabilities to understand code semantics and identify vulnerabilities that pattern-matching tools miss.

How It Works

Codex Security operates in the same cloud sandbox as regular Codex tasks. When you trigger a security scan, the agent:

  1. Clones your repository into an isolated sandbox
  2. Builds a semantic understanding of the codebase - data flows, trust boundaries, authentication paths
  3. Identifies potential vulnerabilities using a combination of pattern matching and reasoning
  4. Verifies each finding by tracing the data flow from source to sink (sketched below)
  5. Generates a report with severity ratings, affected code paths, and suggested fixes
  6. Optionally creates PRs with the fixes applied
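
Step 4 is essentially taint tracking. The toy sketch below shows the shape of that check: a finding is confirmed only if untrusted input can reach a dangerous sink without passing through a sanitizer. The source, sanitizer, and sink names are illustrative, not Codex Security's actual rule set.

# Confirm a finding by walking an ordered call flow from source to sink.
SOURCES = {"request.args", "request.form"}
SANITIZERS = {"escape_sql"}
SINKS = {"cursor.execute"}

def is_exploitable(flow: list[str]) -> bool:
    tainted = flow[0] in SOURCES
    for step in flow[1:]:
        if step in SANITIZERS:
            tainted = False      # data was cleaned before reaching the sink
        if step in SINKS and tainted:
            return True          # untrusted data reaches a dangerous call
    return False

print(is_exploitable(["request.args", "cursor.execute"]))                # True
print(is_exploitable(["request.args", "escape_sql", "cursor.execute"]))  # False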

The Chromium/OpenSSL/PHP Audit

The most impressive demonstration of Codex Security came when OpenAI ran it against three of the most security-critical open-source projects: Chromium, OpenSSL, and PHP. The results were remarkable:

Project    Critical Issues Found   High Issues Found   False Positive Rate
Chromium   340+                    1,200+              ~12%
OpenSSL    280+                    450+                ~8%
PHP        180+                    600+                ~15%
Total      800+                    2,250+              ~12% avg

Finding 800+ critical issues across these heavily audited codebases - projects that have been reviewed by thousands of security researchers over decades - demonstrated that AI-powered security scanning can find vulnerabilities that traditional tools and human reviewers miss. The ~12% false positive rate is competitive with commercial SAST tools like Snyk, Semgrep, and SonarQube.

Vulnerability Categories

Codex Security is particularly strong at finding:

  • Memory safety issues: Buffer overflows, use-after-free, double-free (in C/C++ codebases)
  • Injection vulnerabilities: SQL injection, command injection, XSS, template injection
  • Authentication bypasses: Logic errors in auth flows, missing authorization checks
  • Cryptographic weaknesses: Weak algorithms, improper key management, timing attacks
  • Race conditions: TOCTOU bugs, concurrent access without proper locking
  • Supply chain risks: Suspicious dependencies, typosquatting packages, outdated libraries with known CVEs

Integration: Codex Security can run as a GitHub Action on every PR, providing security review alongside your existing CI/CD pipeline. It adds comments directly to the PR with findings and suggested fixes.

10. Limitations and Known Issues

Codex is powerful, but it is not magic. Understanding its limitations is essential for using it effectively and setting appropriate expectations.

Context Window Constraints

Even with GPT-5.5's 256K context window, very large codebases can exceed the agent's ability to hold all relevant context simultaneously. For monorepos with millions of lines of code, Codex uses heuristics to select the most relevant files, which means it can miss cross-module dependencies or subtle interactions between distant parts of the codebase.
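
The sketch below shows one plausible shape for such a heuristic - rank files by overlap with the task description and keep them until a token budget is spent. The scoring and the chars-per-token estimate are assumptions, not Codex's actual selection logic.

# Pick the most task-relevant files that fit within a context budget.
def select_files(task: str, files: dict[str, str], budget_tokens: int) -> list[str]:
    task_words = set(task.lower().split())

    def score(body: str) -> int:
        return len(task_words & set(body.lower().split()))

    ranked = sorted(files, key=lambda name: score(files[name]), reverse=True)
    selected, used = [], 0
    for name in ranked:
        cost = len(files[name]) // 4  # rough chars-per-token estimate
        if used + cost <= budget_tokens:
            selected.append(name)
            used += cost
    return selected

files = {"auth.py": "jwt token refresh middleware", "billing.py": "stripe invoices"}
print(select_files("refactor jwt auth middleware", files, budget_tokens=50_000))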

Hallucination (Reduced but Not Eliminated)

GPT-5.5 hallucinates significantly less than earlier models, but it still happens. Common hallucination patterns include:

  • Inventing API methods that don't exist in a library (especially for less popular packages)
  • Generating import paths that don't match the actual project structure
  • Assuming configuration options that aren't available in the version you're using
  • Creating test assertions based on assumed behavior rather than actual behavior

The sandbox's ability to run tests catches many of these issues, but not all. Always review Codex's output before merging.

Language and Framework Coverage

Codex performs best with popular languages and frameworks that have extensive training data. Performance degrades for:

  • Niche languages (Elixir, Haskell, OCaml, Zig) - output is functional but less idiomatic
  • Internal/proprietary frameworks - Codex can't know about your company's custom framework unless AGENTS.md provides detailed guidance
  • Very new libraries - anything released after the model's training cutoff may not be well-represented
  • Domain-specific languages (DSLs) - Terraform HCL and SQL are well-supported, but custom DSLs are hit-or-miss

Sandbox Limitations

  • No GPU access: The sandbox doesn't have GPU support, so ML training tasks or CUDA code can't be tested
  • Limited system services: No Docker-in-Docker, no systemd, no database servers (unless you use the full-access network mode to connect to external services)
  • Timeout: Tasks have a maximum execution time (15 minutes on Free/Plus, 30 minutes on Pro, configurable on Enterprise)
  • Filesystem size: Sandbox storage is limited to 10GB, which can be insufficient for projects with large binary assets

Non-Determinism

Like all LLM-based tools, Codex is non-deterministic. Running the same task twice may produce different code. The code will be functionally equivalent in most cases, but the exact implementation details - variable names, code structure, algorithm choices - can vary. This makes it unsuitable for tasks that require exact reproducibility.

Critical reminder: Codex is a tool, not a replacement for engineering judgment. Always review generated code, especially for security-sensitive paths, data handling, and business logic. The agent is excellent at mechanical tasks but can make subtle logical errors that only a human with domain knowledge would catch.

11. Real-World Adoption

Codex's growth from launch to 4 million+ weekly active users in under a year makes it one of the fastest-adopted developer tools in history. Here is how organizations are using it in practice.

Adoption by the Numbers

Metric                                     Value (April 2026)
Weekly active users                        4M+
Tasks completed per day                    12M+
PRs created per week                       2.5M+
Enterprise customers                       500+
Codex CLI daily downloads                  50K+
AGENTS.md adoption (top 1K GitHub repos)   ~40%

Common Use Patterns

Based on OpenAI's published usage data and community reports, the most common Codex workflows are:

1. Test Generation (28% of tasks)

The single most popular use case. Teams point Codex at untested code and ask it to generate comprehensive test suites. This is particularly valuable for legacy codebases that were built without tests - Codex can read the implementation, understand the expected behavior, and generate tests that serve as both documentation and regression protection.

2. Bug Fixes (22% of tasks)

Developers paste error messages, stack traces, or bug reports and let Codex trace the issue through the codebase. The agent's ability to read multiple files, understand data flow, and verify fixes by running tests makes it highly effective for debugging.

3. Feature Implementation (18% of tasks)

New feature development - adding endpoints, building UI components, implementing business logic. This is where Codex's multi-file editing and test generation capabilities shine.

4. Refactoring (15% of tasks)

Code modernization, dependency upgrades, pattern migrations (e.g., callbacks to async/await, class components to hooks), and structural reorganization.

5. Code Review and Documentation (12% of tasks)

Using Codex to review PRs, explain complex code, generate documentation, and add inline comments to poorly documented codebases.

6. Security Scanning (5% of tasks)

Running Codex Security scans as part of CI/CD pipelines or ad-hoc security audits.

Enterprise Case Studies

Several large organizations have shared their Codex adoption results:

  • A Fortune 500 fintech company reported a 40% reduction in time-to-merge for feature PRs after deploying Codex across their 200-person engineering team. Test coverage increased from 62% to 84% in three months.
  • A mid-size SaaS startup (50 engineers) uses Codex for all test generation and achieved 90%+ coverage across their TypeScript monorepo. They estimate Codex saves each developer 6-8 hours per week.
  • An open-source project maintainer uses Codex to triage and fix issues from community contributors, reducing the average issue resolution time from 12 days to 3 days.

Getting Started

If you are new to Codex, start with the free tier to explore its capabilities. Connect a GitHub repository, try a few test generation tasks, and review the output quality. Once you are comfortable, upgrade to Plus or Pro for higher quotas and parallel task execution. Add an AGENTS.md to your repository for the best results.

12. Frequently Asked Questions

Is OpenAI Codex the same as the old Codex API from 2021?

No. The original Codex was a code-completion API based on GPT-3 that powered early GitHub Copilot. It was deprecated in March 2023. The current Codex (2025+) is a completely different product - an autonomous coding agent powered by GPT-5.5 that runs in a cloud sandbox and can perform complex multi-file engineering tasks.

Does Codex have access to my source code?

Yes, when you connect a GitHub repository, Codex clones it into an isolated cloud sandbox to read and modify files. The sandbox is destroyed after task completion. OpenAI states that code processed by Codex is not used to train models for Business and Enterprise plans. For Free, Plus, and Pro plans, you can opt out of training data usage in your settings.

Can Codex work with private repositories?

Yes. Codex supports private GitHub repositories through OAuth integration. You grant Codex read/write access to specific repositories - it does not require access to your entire GitHub account. GitLab and Bitbucket support is in beta as of April 2026.

How does Codex compare to Claude Code?

Codex is approximately 3x more token-efficient than Claude Code, making it cheaper per task. However, Claude Code is preferred by 67% of developers in blind code review studies for output quality. Codex excels at parallel task execution and CI/CD integration, while Claude Code is favored for interactive pair-programming. Many teams use both tools for different workflows.

What languages does Codex support?

Codex supports all major programming languages. It performs best with Python, TypeScript/JavaScript, Rust, Go, Java, C#, C++, Ruby, PHP, and Swift. It can work with less common languages but may produce less idiomatic code. The AGENTS.md file can provide language-specific guidance to improve output quality.

Can I use Codex CLI without a ChatGPT subscription?

Yes. Codex CLI uses the OpenAI API directly, so you only need an API key with credits. You pay per token used. This is separate from ChatGPT subscription pricing. Many developers prefer this model for predictable, usage-based costs.

Is Codex CLI really open source?

Yes. Codex CLI is fully open source under the Apache 2.0 license. The source code is available on GitHub with 72,000+ stars. You can fork it, modify it, and use it in commercial products. The agent logic, tool integrations, and approval modes are all open. The only proprietary component is the GPT-5.5 model itself, which is accessed via the OpenAI API.

What is AGENTS.md and do I need one?

AGENTS.md is a configuration file that tells AI coding agents how to work with your repository. It includes project conventions, build commands, code style rules, and areas that require human review. While not strictly required, adding one significantly improves Codex's output quality. It is now a Linux Foundation standard supported by all major AI coding tools.

Can Codex deploy my code to production?

Codex can create PRs and trigger CI/CD pipelines, but it does not directly deploy to production. The recommended workflow is: Codex creates a PR, your CI/CD pipeline runs automated checks, a human reviews and approves, and your existing deployment process handles the rest. This keeps a human in the loop for production changes.