Discussion
Simon Willison
@simon@fedi.simonwillison.net  ·  4 days ago

Pelicans for Opus 4.6 and Codex 5.3. I don't have much interesting to say about these models yet, to be honest: they're both incremental improvements on their predecessors and very capable. https://simonwillison.net/2026/Feb/5/two-new-models/

Simon Willison’s Weblog

Opus 4.6 and Codex 5.3

Two major new model releases today, within about 15 minutes of each other. Anthropic released Opus 4.6. Here's its pelican: OpenAI released GPT-5.3-Codex, albeit only via their Codex app, not …
datenschatz
@datenschatz@norden.social replied  ·  4 days ago

@simon "I've had a bit of preview access to both of these models and to be honest I'm finding it hard to find a good angle to write about them"

How about rating their own and each other's completed code with different instances?

In my experience, Claude was still much worse at overall planning. Web search on Claude also seemed much worse than on GPT 5.2.

Mendel
@mendel@techhub.social replied  ·  4 days ago

@simon I haven't tried Opus 4.6 yet, but 4.5 couldn't generate emails in the Beefree Simple Schema JSON format with a design that actually looked good:
https://docs.beefree.io/beefree-sdk/data-structures/simple-schema

trademark
@trademark@fosstodon.org replied  ·  4 days ago

@simon "I've been having trouble finding tasks that those previous models couldn't handle but the new ones are able to ace." Ask it to write assembly. Gemini 3 and Opus 4.5 were the first I could get to write non-trivial assembly programs, though they both failed to write "life" with sixel graphics.

0×4a6f4672
@jofr@ruby.social replied  ·  4 days ago

@simon It could be that the people at #Anthropic know you and your "pelican on a bicycle" test, since you are a well-known AI blogger.

bonfire.cafe

A space for Bonfire maintainers and contributors to communicate