aiwithbrandon
I'm in the middle of stress testing o3-mini and DeepSeek-r1.
o3-mini is the winner - it's not even a competition!
I'm testing each model against the most common developer tasks:
1️⃣ Build a new project from scratch
2️⃣ Build a new feature in an existing app
3️⃣ Refactor existing code and generate tests
-
With o3-mini and Cursor, I was able to build a ChatGPT replica in a single shot. It lets me chat with local LLMs, and everything works like a charm!
On the other hand, DeepSeek-r1 got stuck and only generated a single JavaScript file, far from a fully functional website.
Here's the GitHub repo with the prompts so you can replicate the results on your own:
github.com/bhancockio/o3-mini-vs-deepseek-r1
The screenshot below shows the functional app that o3-mini was able to create in a single shot!
I'm still working on the other two tasks, but right now there is a clear winner.
I'll keep you posted as I keep testing.
Also, I'll be putting all of this into a YouTube video that will hopefully come out on Monday!
If you have any questions about o3-mini or DeepSeek-r1, let me know!
I feel like I have a PhD in these models after all the testing I've done over the past 24 hours.
1 month ago | [YT] | 49