
Gemini 3 is the new clear leader, with a considerable gap between it and all other models.

Yet it is clearly a "jagged intelligence", with sparks of novel brilliance alongside failures at simple things.

Its new ARC-AGI-2 results are spectacular: 
 * Gemini 3 Pro: 31.11%, $0.81/task
 * Gemini 3 Deep Think (Preview): 45.14%, $77.16/task
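
To put those two results side by side, here is a back-of-the-envelope comparison using only the figures quoted above (the "cost per point" framing is my own informal metric, not anything ARC Prize publishes):

```python
# Back-of-the-envelope comparison of the two reported ARC-AGI-2 configurations.
# Scores and per-task costs are the figures quoted above; "cost per point"
# is an informal metric of my own, not an official ARC Prize statistic.
configs = {
    "Gemini 3 Pro": {"score_pct": 31.11, "cost_per_task": 0.81},
    "Gemini 3 Deep Think (Preview)": {"score_pct": 45.14, "cost_per_task": 77.16},
}

for name, c in configs.items():
    per_point = c["cost_per_task"] / c["score_pct"]
    print(f"{name}: {c['score_pct']}% at ${c['cost_per_task']}/task "
          f"(~${per_point:.2f} per percentage point)")

ratio = configs["Gemini 3 Deep Think (Preview)"]["cost_per_task"] / \
        configs["Gemini 3 Pro"]["cost_per_task"]
gain = configs["Gemini 3 Deep Think (Preview)"]["score_pct"] - \
       configs["Gemini 3 Pro"]["score_pct"]
print(f"Deep Think pays ~{ratio:.0f}x the per-task cost "
      f"for ~{gain:.0f} extra percentage points.")
```

In other words, the extra reasoning budget buys roughly 14 more points at roughly 95x the per-task cost.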


Even 31% exceeds carefully engineered manual efforts on the semi-private leaderboard (and in the official competition), and 45% is rather mind-blowing. At the same time, the system is very jagged: it can succeed at complex things and fail at some relatively simple things from ARC-AGI-1. The ARC Prize people are asking for help investigating why (calling those ARC-AGI-1 mistakes "obvious"); see the thread: x.com/arcprize/status/1990820655411909018

See also this report by Victor Taelin. The system can succeed brilliantly and innovatively at very difficult things ("Gemini's solution is 2x simpler than my own's") and fail at simple things: www.lesswrong.com/posts/N7oRkcz3PrNQSNyw9/victor-taelin-s-notes-on-gemini-3

Also from Victor Taelin's Twitter thread x.com/VictorTaelin/status/1990844923994886282:

> And obviously this is first day so take this with a grain of salt, particularly on the parts I tested less. People are saying it is great at creative writing and health too. It might be? Inferring intent issues are 100% real though!

So this is a continuation of transition number 6, "Revolution of Competent Agents": mishka-discord.dreamwidth.org/4806.html

And it's probably not the end of it; there are another few weeks to go. Hopefully, Anthropic and, especially, OpenAI will come up with some answer to that. xAI has just released Grok-4.1, which seems to be on par with Anthropic's and OpenAI's current models.

But none of this makes much of a dent in the reliability problem, judging by these reports.

This does not move us to transition number 7, "Trustworthy autonomy stage": mishka-discord.dreamwidth.org/5293.html

Google has also released a new agentic framework, "Google Antigravity", alongside Gemini 3, and someone has already extracted its system prompt: x.com/p1njc70r/status/1990919996265148701. We'll see how much of a breakthrough this new agentic framework is. It does seem to support third-party models, not just Gemini, and it seems to have some new self-learning and verification-related capabilities, but I am only going by the Google AI summary for "Google Antigravity" vs. other agentic frameworks at the moment; I have not looked more closely yet.

***

An important new post by John David Pressman: www.lesswrong.com/posts/apHWSGDiydv3ivmg6/varieties-of-doom

See also x.com/jd_pressman/status/1990537576881742178

JDP is a remarkable thinker and a super-strong independent AI practitioner and researcher; it would take far more than one short post to do justice to his output, or even to this single new post, "Varieties of Doom", which is a must-read.

I will only look at the last few paragraphs, where he focuses on our rapidly deteriorating situation and observes, among other things, the following: if one compares two catastrophic scenarios, 5% of humans surviving a global nuclear war and giving rise to a new civilization versus a superintelligent AI takeover going badly and leaving no human survivors, then the AI successors are likely to be closer to our current civilization in their makeup and values than a new post-apocalyptic human-based civilization would be. So he thinks we should accelerate the transcendence (while, of course, trying to ensure that it goes well, and that humans do survive and flourish).

Boldface is mine (and I agree with what is written in boldface):

>Humanism is dead, humanism remains dead, and it will continue to decompose.

>The "inevitable arc of moral progress" over the past 300 or so years is actually the inevitable moral arc of the gun. With drones set to displace bullets that arc is ending. Even setting aside superintelligence it's difficult to imagine our military peers in Asia won't automate their weapons and the factories necessary to produce them. At some point there will be a flashpoint, perhaps in Taiwan, and it will become obvious to everyone (if it hasn't already) that to make war primarily with human labor is to doom yourself to obsolescence and death.

>I talked of the latter day secular humanist madman as a hypothetical but he exists, he is Eliezer Yudkowsky! Watch the crowd snicker at Yudkowsky pleading with China to pick up the ideology America has seemingly abandoned. Yudkowsky has sincere faith that to do so would be to China's advantage and the audience laughs.

>I have no God to appeal to, only you dear reader so listen closely: There is no further natural "moral progress" from here because "moral progress" was simply Is disguised as Ought. What is so striking about Harry Potter And The Methods Of Rationality is that it's obvious sapience is sacred to its author. Implicit in the narrative's sympathy even for people who have hurt others is the idea that almost nobody is capable of committing an unforgivable crime for which they deserve death. Perhaps if I take the narrative of HPMOR completely literally it is not humanly possible. But it is transhumanly possible. I think right now we still live in a world something like the one HPMOR is written for, a place where a very thin sliver of humanity (if anyone at all) has ever done something so awful that their rightful fate is death or damnation. As this century continues and humanism declines at the same time our awesome technological powers expand I expect that to become less and less true. We will increasingly find it within ourselves to visit unforgivable atrocities on each other, and by the time circumstance is done making us its victims I'm not sure we won't deserve whatever ultimately happens to us.
>
>But if we ascend soon, it might not happen.
>
>Even at this late hour, where it might seem like things are caving in and our societal situation grows increasingly desperate, it could still end up not mattering if we transcend in the near future. I think we're an unusually good roll in the values department, and even if humanity did find some alternative tech tree to climb back up the ladder after nuclear armageddon it's not obvious to me that new civilization would ascend with values as benevolent and egalitarian as those brought about by industrialization and firearms. I worry if we let the sun set on us now for a brighter future tomorrow, it's unlikely to rise for us again. I've seen some advocates of AI pause criticize their opponents for being 'cucks' who want to hand over the universe to a stronger and better AI. Yet they become completely casual about the risks of handing over our lightcone to whatever future civilization rises from the ashes of WW3. If this is you I have to ask: Why are you so eager to inseminate the universe with some other civilization's cultural code? I suspect but cannot prove that much of it comes down to the goodness of this deed being too good for us, that we are too cowardly to seize our destiny. If this hemisphere of puritans does not grab its chance it will be because we lack the necessary sangfroid, the ability to stay calm in the face of unavoidable danger and make rational decisions. If we cannot bear to lock in our good values perhaps we will cede the task to a different people less paralyzed by scrupulosity and neurosis. Perhaps even humanity as a whole is too fearful and the remaining hope lies with some other species on some distant star.
>
>That, too, is natural selection at work.

 