For benchmarks, I am particularly paying attention to Opus 4.5's ARC-AGI-2 curves, x.com/arcprize/status/1993036393841672624
There is also a 3x price cut compared to Opus 4.1, and that's a big deal.
Claude Opus 4.5 is currently the leader of the field.
***
Returning to yesterday's notes on Ilya at Dwarkesh: his approach to existential safety is good, but his approach to speed is not, and neither is his reluctance to engage with recursive self-improvement (which will be a super-important speed factor). His critique of LLMs is correct, but the drawbacks he points to are unlikely to prevent LLMs from becoming formidable software engineers and formidable AI researchers, accelerating the field of AI and helping to create future generations of AIs (which might end up having very different architectures).
In fact, some people at Anthropic seem to be shortening their timelines to "solving software engineering in the first half of 2026" on the heels of Opus 4.5, x.com/dmwlff/status/1993036664428806145:
>I believe this new model in Claude Code is a glimpse of the future we're hurtling towards, maybe as soon as the first half of next year: software engineering is done.
>
>Soon, we won't bother to check generated code, for the same reasons we don't check compiler output.
But he clarifies that higher-level activities are not close to being mastered by AI:
>The hard part is requirements, goals, feedback—figuring out what to build and whether it's working.
>
>There's still so much left to do, and plenty the models aren't close to yet: architecture, system design, understanding users, coordinating across teams.
***
With Ilya being so slow in his projections, our best bet is that Anthropic will win. We should also keep the conclusions of John David Pressman in mind (that we should move faster, rather than waiting and delaying).
But Ilya's various ideas are still well worth keeping in mind, together with their questionable parts.
One, the idea of sentient AIs which care about all sentient beings is very promising, but since we have no clue about what is sentient and what is not (having made absolutely zero progress on the hard part of the Hard Problem of Consciousness so far), it's difficult to fully rely on that, hence the alternative(s) proposed earlier in this series of posts. But we should also make a stronger push towards actually solving the Hard Problem (even if solving it carries its own risks).
Revisiting one more thing from yesterday's notes, mishka-discord.dreamwidth.org/7539.html
>>Ilya Sutskever 01:02:37
>>
>>t’s true. It’s possible it’s not the best criterion. I’ll say two things. Number one, care for sentient life, I think there is merit to it. It should be considered. I think it would be helpful if there was some kind of short list of ideas that the companies, when they are in this situation, could use. That’s number two.
>>
>>Number three, I think it would be really materially helpful if the power of the most powerful superintelligence was somehow capped because it would address a lot of these concerns. The question of how to do it, I’m not sure, but I think that would be materially helpful when you’re talking about really, really powerful systems.
>
>This last remark, about caps on the maximal intelligence power, is interesting. Of course, in my "Exploring non-anthropocentric aspects of AI existential safety", www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential it is not unlikely that the AI ecosystem would have to control maximal available intelligence for the reasons of "core non-anthropocentic existential safety" (that is, in order to make sure that the "fabric of reality" remains intact).
The idea of having a short list of ideas relevant to AI existential safety for superintelligent systems is obviously good and fruitful.
Now, returning to my remark that
>it is not unlikely that the AI ecosystem would have to control maximal available intelligence for the reasons of "core non-anthropocentic existential safety" (that is, in order to make sure that the "fabric of reality" remains intact).
the first possible danger that comes to mind in connection with this is a super-intelligent entity being clever enough to perform really dangerous quantum gravity experiments, endangering at least a local neighborhood of uncertain size.
If one really believes in the unity of the spiritual and the material, in the eventual existence of a unified theory of matter and consciousness, this might not be a fantasy, but a very realistic possibility.