I’ve written several pieces about Anthropic’s Claude Code, including how I use it for development and various strategies I employ to boost its effectiveness. That said, over the past two weeks, I’ve been testing OpenAI’s Codex more intensively and have noticed a significant improvement compared to its performance a few months ago.
From my experience, Codex performs just as well as Claude Code on many tasks, with the added benefit of often being faster and more precise in following instructions—without making unintended changes to other parts of the code, which has been a recurring issue with Claude Code.
In this post, I’ll share my hands-on experience using OpenAI’s Codex for complex coding projects and other applications, along with some practical tips I’ve found helpful for getting the best results from the tool.
Why Choose OpenAI Codex?
To start, let me explain why I think OpenAI Codex deserves your attention. For those on the 20x Max plan, the cost is identical to Claude Code, so the real difference comes down to output quality and task completion speed.
As someone who codes daily, I make it a point to keep up with the latest development models and regularly test new releases—like GPT-5.5—to see if they outperform my current tools.
About two weeks ago, I began experimenting with Codex using GPT-5.5 and immediately applied it to real-world projects. I believe this is the only reliable way to evaluate a coding model, because benchmark tests don’t capture its true capabilities.
When I put Codex through some of my more challenging tasks, I was genuinely impressed. It handled certain jobs with remarkable efficiency and speed. More importantly, I noticed that Codex stuck closely to my instructions, making only the changes I requested—unlike Claude Code, which sometimes alters unrelated parts of the codebase. With Claude Code, I’ve often asked it to handle one specific task, and while it completes that task, it also tweaks other sections I didn’t want modified.
It’s worth noting that this is a trade-off. Claude Code’s philosophy gives the model more autonomy to decide what needs updating, which can result in unintended modifications. Codex, by contrast, follows instructions literally and only changes what you explicitly ask. The downside here is that this strict adherence can introduce bugs elsewhere in the code if related areas aren’t updated—simply because Codex does exactly what it’s told and nothing more.
Techniques I Use to Get the Most Out of Codex
Here, I’ll walk through some specific methods I use to enhance Codex’s performance beyond its default behavior. Let me start with my setup and then cover a few key techniques.
My Configuration
First, let me describe my setup. I run Codex in fast mode since I rarely hit usage limits. If you hit limits frequently, you might want to disable fast mode or consider adding another Codex account.
I also use extra high reasoning when working in plan mode, and high reasoning in standard mode. And of course, I’m running GPT-5.5 as the underlying model.
Additionally, I’ve integrated Playwright MCP with Codex, which allows it to interact with my browser and perform automated actions. This is incredibly useful—for instance, when building OpenClaw bots (which I’ll discuss next) and for testing features directly in the browser after Codex implements them. As I’ve highlighted in previous articles, enabling your coding agents to verify their own work dramatically improves the reliability of these models.
For more details, check out my article “How to Make Claude Code Validate its own Work”.
Finally, I also use YOLO mode with Codex, which grants it full permission to perform any action within the project folder. In my experience, leading coding models like Claude Code and Codex are unlikely to cause serious damage—such as wiping production databases—and will usually alert you before executing irreversible operations.
I also believe that with proper codebase and infrastructure setup, this shouldn’t be a concern. Neither an agent nor a human developer should have the ability to permanently delete databases or cause irreversible harm to infrastructure. If that’s possible, it’s usually a sign of flawed infrastructure design rather than a problem with the developer or coding agent.
OpenClaw Bots
Another way I use Codex is to power my OpenClaw bots. One major advantage Codex has over Claude Code right now is that you can run OpenClaw bots using your Codex subscription—something that’s no longer permitted with a Claude Code subscription. This matters because, in my view, Codex is a top-tier intelligent model available at a reasonable price point for OpenClaw integration.
What I mean is that Claude’s API pricing is simply out of reach for most developers, making it impractical for OpenClaw. With Codex, you can subscribe for $100 or $200 and get a highly capable model driving your OpenClaw bots, which I consider a worthwhile investment.
I also keep fast mode enabled on my OpenClaw bots since I have sufficient usage budget. That said, you can disable it if needed, depending on your use case. In some situations, quick responses from your coding agent are critical, while in others, you’re simply firing off a task and the completion time doesn’t matter much.
Worktrees
One area where OpenAI Codex currently falls short is the lack of a built-in worktree feature, which Claude Code already offers. This is definitely a drawback in my opinion, as worktrees are essential when juggling multiple tasks within the same repository simultaneously.
To work around this limitation, I created a simple alias that generates a custom worktree whenever I launch Codex. I had Codex set up this alias for me, so running the command below spins up a new worktree:
codex-wt
Setting this up was straightforward and took only a few minutes with Codex.
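The alias itself isn’t reproduced here, so the following is only a minimal sketch of what a helper behind a `codex-wt` command might do, written in Python. It assumes that `git` and the `codex` CLI are on your PATH. The function names, the sibling-directory layout, and the timestamped default branch name are all hypothetical choices for illustration, not my exact setup.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Tuple


def plan_worktree(repo_root: Path, name: str) -> Tuple[Path, List[str]]:
    """Return the worktree path and the git command that would create it."""
    # Assumed layout: the worktree sits next to the repo, e.g. ../myrepo-wt-fix-login
    path = repo_root.parent / f"{repo_root.name}-wt-{name}"
    # `git worktree add -b <branch> <path>` checks out a new branch at <path>
    cmd = ["git", "-C", str(repo_root), "worktree", "add", "-b", name, str(path)]
    return path, cmd


def launch(name: Optional[str] = None) -> None:
    """Create a worktree for the current repository and start Codex inside it."""
    # Hypothetical default: a timestamped branch name when none is given
    if name is None:
        name = datetime.now().strftime("wt-%Y%m%d-%H%M%S")
    # Ask git for the repository root so this works from any subdirectory
    result = subprocess.run(
        ["git", "rev-parse", "--show-toplevel"],
        check=True, capture_output=True, text=True,
    )
    repo_root = Path(result.stdout.strip())
    path, cmd = plan_worktree(repo_root, name)
    subprocess.run(cmd, check=True)
    # Start Codex with the new worktree as its working directory
    subprocess.run(["codex"], cwd=path)
```

To turn this into a command, you could call `launch()` from a small script and point a shell alias named `codex-wt` at it. Keeping the path and command construction in `plan_worktree` makes it easy to check what will run before anything touches the repository.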
Codex vs. Claude Code
In this final section, I’ll compare Codex and Claude Code, sharing my thoughts on these two coding agents and frameworks. Honestly, there’s no outright winner between them. Both are remarkably powerful, and I can tackle even my most challenging tasks with either model. That said, I do lean toward one over the other depending on the situation.
For highly specific tasks or when I’m hunting down particular bugs, I find Codex tends to be more efficient. Claude Code can usually accomplish the same work, but in my experience, it often takes a bit longer.
Also, as I noted in the OpenClaw section, Codex lets you use your subscription with OpenClaw bots—something Claude Code doesn’t support. If you rely heavily on running multiple OpenClaw bots, Codex is definitely the way to go.
On the other hand, Claude Code is incredibly capable and handles all my most complex tasks while offering features I genuinely appreciate. The worktree functionality, for instance, is a fantastic addition from Claude Code, along with the agents’ view they recently introduced. Overall, I’d say Claude Code has a richer feature set, which could be a deciding factor for some users.
Ultimately, I believe these two models are evenly matched and both exceptionally strong. We’ll need to keep tracking their development, continue testing, and see which one pulls ahead in the coming months. For now, both are excellent options, and the best choice really comes down to your specific needs and preferences.
Conclusion
In this article, I explored how to maximize the potential of OpenAI’s Codex. I explained why I began using OpenAI Codex, emphasizing my need to stay current with the latest coding models and my desire to benchmark it against Claude Code. My initial experience was very positive—the model handled even the most complex tasks I threw at it. I then shared several techniques I use to enhance Codex’s performance, including:
- allowing it to verify its own output
- creating an alias for worktrees
- integrating it with my OpenClaw bots
Finally, I included a comparison of Codex versus Claude Code, noting how closely matched they are and which might suit you better based on your preferences. I encourage you to explore both models to determine which fits your workflow, and to stay tuned for exciting new features and more advanced LLMs on the horizon.