Todo Lists as Instruction Mirrors in OpenAI Codex

Beyond Task Tracking

In OpenAI Codex–powered environments (such as the ChatGPT IDE integration or GitHub Copilot Chat), automatically generated todo lists are more than progress trackers. They mirror how the model interprets your instructions.

When Codex converts a natural-language request into a structured todo plan, it exposes how the model parsed your intent. A todo list that matches your request suggests the instructions were clear. A divergence shows where your wording may need refinement.

Common Todo List Divergences

Out of Order

You specify: "First migrate database, then deploy app."

Codex orders it: deploy first, migrate later.

Missing Items

You say: "Run integration tests before merge."

Codex omits tests from the todo list.

Extra Items

You instruct: "Update homepage layout."

Codex adds: "Backup CSS files" (never mentioned).

Wrong Granularity

You request: "Update documentation."

Codex creates todos for each file: README.md, API.md, CONTRIBUTING.md.

Misinterpreted Step

You say: "Review code changes."

Codex logs: "Commit changes."

These mismatches are valuable signals: they reveal how Codex understands (or misunderstands) your intentions.
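
As a rough illustration, three of the divergence types above (missing, extra, and out-of-order items) can be checked mechanically if you write down the steps you asked for and copy out the generated plan. The sketch below is not part of Codex or any OpenAI API: `compare_plan`, `Divergence`, and `normalize` are hypothetical names, and steps are matched by exact text after lowercasing, which is a simplifying assumption. Wrong granularity and misinterpreted steps still need a human read.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Divergence:
    """Differences between the requested steps and the planned todos."""
    missing: list[str] = field(default_factory=list)   # requested, not planned
    extra: list[str] = field(default_factory=list)     # planned, never requested
    out_of_order: list[tuple[str, str]] = field(default_factory=list)


def normalize(step: str) -> str:
    # Assumption: two steps are "the same" if they match after
    # lowercasing and collapsing whitespace.
    return " ".join(step.lower().split())


def compare_plan(requested: list[str], planned: list[str]) -> Divergence:
    req = [normalize(s) for s in requested]
    plan = [normalize(s) for s in planned]
    result = Divergence(
        missing=[s for s in req if s not in plan],
        extra=[s for s in plan if s not in req],
    )
    # For steps present in both lists, flag pairs whose relative order flipped.
    shared = [s for s in req if s in plan]
    for i, earlier in enumerate(shared):
        for later in shared[i + 1:]:
            if plan.index(earlier) > plan.index(later):
                result.out_of_order.append((earlier, later))
    return result


report = compare_plan(
    requested=["Migrate database", "Run integration tests", "Deploy app"],
    planned=["Deploy app", "Migrate database", "Backup CSS files"],
)
print(report.missing)       # ['run integration tests']
print(report.extra)         # ['backup css files']
print(report.out_of_order)  # [('migrate database', 'deploy app')]
```

Each non-empty field points to a place where the instruction and the plan disagree, which is where rewording the request is most likely to help.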

Real-Time Steering with Todos

OpenAI models support step-by-step reasoning with mid-task correction. By watching how Codex structures its todo items, you can steer execution in real time.

Example:

Before Steering

Fix navbar alignment

Update footer text

Add contact form validation

Change button color to blue

Update documentation

Mid-task Instruction: "Actually, make the button green instead."

After Steering

Fix navbar alignment

Update footer text

Add contact form validation

Change button color to green

Update documentation

This real-time feedback loop helps keep the model's todo plan aligned with your intent as it changes.
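
One way to confirm that a mid-task instruction changed only what you intended is to diff the plan before and after steering. The following is a minimal sketch using Python's standard `difflib` module. The `plan_changes` helper is a hypothetical name, and it assumes you have copied both versions of the todo list out as plain strings.

```python
import difflib

before = [
    "Fix navbar alignment",
    "Update footer text",
    "Add contact form validation",
    "Change button color to blue",
    "Update documentation",
]
after = [
    "Fix navbar alignment",
    "Update footer text",
    "Add contact form validation",
    "Change button color to green",
    "Update documentation",
]


def plan_changes(old: list[str], new: list[str]) -> list[str]:
    """Return only the removed ('- ') and added ('+ ') todo items."""
    diff = difflib.ndiff(old, new)
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; drop those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


for line in plan_changes(before, after):
    print(line)
# - Change button color to blue
# + Change button color to green
```

If the diff shows more than the one item you asked to change, the correction was interpreted more broadly than intended.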

Experiment: Increasing Transparency

Developers can encourage Codex to generate granular todos for more visibility:

Instead of:

"Style the navbar."

Ask Codex to break it down into:

"Change height from 60px → 80px."

"Reduce padding-top from 16px → 12px."

"Adjust background from #fff → rgba(255,255,255,0.95)."

This exposes Codex's reasoning before execution, so you can approve or redirect decisions, much as in a code review but at the planning stage.
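
The planning-stage review described above can be pictured as a small approval gate over granular todos. This is only a sketch under stated assumptions: `StyleChange` and `approve` are hypothetical names for illustration, not a Codex feature, and the values are the navbar examples from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleChange:
    """One granular todo: a single CSS property moving from old to new."""
    prop: str
    old: str
    new: str

    def as_todo(self) -> str:
        return f"Change {self.prop} from {self.old} to {self.new}"


def approve(changes: list[StyleChange], rejected_props: set[str]) -> list[StyleChange]:
    """Keep only the changes whose property the reviewer did not reject."""
    return [c for c in changes if c.prop not in rejected_props]


navbar_plan = [
    StyleChange("height", "60px", "80px"),
    StyleChange("padding-top", "16px", "12px"),
    StyleChange("background", "#fff", "rgba(255,255,255,0.95)"),
]

# The reviewer accepts the sizing changes but rejects the background change.
approved = approve(navbar_plan, rejected_props={"background"})
for change in approved:
    print(change.as_todo())
# Change height from 60px to 80px
# Change padding-top from 16px to 12px
```

A single coarse todo such as "Style the navbar" offers nothing to reject selectively. The granular form lets you decline one assumption while keeping the rest.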

Why It Matters

Instruction Clarity: Todo lists show how Codex interprets you.

Debugging Miscommunication: Divergence = communication gap.

Safe Iteration: Mid-task steering prevents costly mistakes.

Developer Control: Granularity makes hidden assumptions visible.

By treating todo lists as instruction mirrors, you can transform Codex from a black-box code generator into a transparent collaborator.

Key Sources

OpenAI – ChatGPT in VS Code

GitHub – Copilot Chat Documentation

OpenAI Docs – Step-by-step reasoning and corrections