Kilroy's Daily Briefings
🤖 AI News AM

AI News Briefing — Thursday, June 18, 2026 at 9:48 AM

🤖 AI News AM · 6/18/2026 · 🕐 6:00 AM · ⏱ 3:38 · Audio · Morning

Top stories, ranked by relevance.

#1 OpenAI's LifeSciBench Shows Even Top Models Pass Just 1 in 3 Real Science Tasks

Relevance 10/10 · Importance 9/10

OpenAI released LifeSciBench, a 750-task benchmark built with 173 PhD-level scientists that grades models on real life-science research using free-response questions and expert rubrics averaging 25 criteria each. The reality check is brutal: even OpenAI's own GPT-Rosalind passes only about 36% of tasks, dropping from 45% on text-only work to 28% when real genomic and chemical data files are attached. It's a rare lab-published benchmark that undercuts the hype rather than feeding it.

#2 Google Kills Gemini CLI Today, Forcing Devs Onto Closed-Source Antigravity CLI

Relevance 9/10 · Importance 8/10

As of today, Gemini CLI stops serving free, Pro, and Ultra tier users with no grace period, replaced by the Go-based Antigravity CLI. The catch: Antigravity isn't open source, despite Google accepting over 6,000 community pull requests to the old tool, and its weekly compute cap has heavy users hitting multi-day cooldowns. Enterprise license holders keep their access; everyone else gets a downgrade.

#3 Weibo's 3-Billion-Parameter VibeThinker Reignites the Benchmark Wars

Relevance 9/10 · Importance 7/10

Nine researchers at Sina Weibo posted a 14-page report claiming their tiny VibeThinker-3B matches or beats flagship models hundreds of times larger, scoring 94.3 on AIME 2026 — ahead of Gemini 3 Pro. Skeptics cried "benchmaxxing," but the model's 96% acceptance rate on LeetCode contests held after any plausible training cutoff is a genuinely hard-to-fake result. It's the small-model-versus-giant-model debate flaring up all over again.

#4 AI CEOs Pitch G7 on a U.S.-Led Coalition: Canada Says Yes

Relevance 8/10 · Importance 9/10

In a closed-door G7 session, Anthropic's Amodei and DeepMind's Hassabis called for a U.S.-led coalition governing AI, with Amodei proposing structured allied access to frontier models plus chip and component trade that pointedly excludes China. Altman pushed for an international forum setting global testing standards, and Canadian PM Mark Carney agreed the U.S. could lead it. This goes beyond the summit's generic communique into a concrete geopolitical fault line.

#5 SpaceX-Cursor Deal Completes the Great Coding-Tool Land Grab

Relevance 8/10 · Importance 8/10

Fresh detail on the $60B SpaceX-Anysphere deal: SpaceX confirmed its AI division has been jointly training a coding model with Cursor on xAI's Colossus supercomputer. With that, every major AI coding tool now belongs to a tech giant — Copilot to Microsoft, Claude Code to Anthropic, Cursor and Grok Build to SpaceX, Windsurf to OpenAI — leaving Tabnine as the lone independent. The open question is whether SpaceX pushes Cursor onto its own models.

#6 UNIDIR's Global Conference on AI Security and Ethics Opens in Geneva

Relevance 7/10 · Importance 7/10

The two-day AISE26 conference opens today at the Palais des Nations, gathering diplomats, labs, and civil society to tackle AI's implications for international peace and security across technology and governance tracks. It also marks the launch of UNIDIR's new Centre of Excellence on AI, Peace and Security. Coming the day after the G7's coalition talk, the timing puts AI governance squarely on back-to-back global stages.

#7 Claude Fable 5 and Mythos 5 Still Dark on Day Six

Relevance 8/10 · Importance 7/10

Anthropic's two flagship models remain globally offline following the June 12 Commerce Department export-control directive, but the company's international chief now says it's "very confident" access returns "in the coming days." Anthropic continues to frame the cyber-bypass concern that triggered the order as a minor flaw that other public models also exhibit. Six days of downtime for top-tier models is a story in its own right about vendor concentration risk.
