🎬 AI Video Intel

🎬 AI Video Intel — Monday, May 18, 2026 at 6:45 AM

Topics: Video models, Visual AI

Top stories, ranked by relevance.


#1 Google I/O Kicks Off Tomorrow: Veo 4 and Gemini Omni Expected on Stage

Google I/O 2026 starts May 19 at Shoreline Amphitheatre with keynotes from Pichai and Hassabis. A leaked UI string inside Gemini's video tab references "Omni" — likely a unified text/image/video model running alongside or replacing the Veo pipeline. Veo 4 rumors point to native 4K, 20-30 second clips, character-reference embedding, and multi-track audio with separated dialogue, ambience, and SFX. If even half of that ships, it resets the competitive field overnight. Clear your Tuesday.
Source: WaveSpeed Blog — https://wavespeed.ai/blog/posts/google-omni-video-model-leak-i-o-2026/


#2 ComfyUI v0.21.1 Ships New Partner Nodes: Flux2, Grok Edit, ByteDance Seedream, Claude LLM

Dropped May 13. The new Flux2ImageNode and GrokImageEditNodeV2 partner nodes bring Flux2 image support and first-party Grok image editing into ComfyUI workflows. ByteDanceSeedreamNodeV2 adds DynamicCombo and Autogrow for Seedream pipelines. A Claude LLM node now enables in-workflow text generation for prompt chaining. Also in this release: full HiDream-O1-Image support with dtype fixes, Anima TE LoRA in Kohya format, 4K resolution for ByteDance and Veo partner nodes, and Veo 3 Lite as a lightweight video option.
Source: ComfyUI Changelog — https://docs.comfy.org/changelog

#3 Kling 3.0 Native 4K + Unified Audio Pipeline Now Live

Kuaishou's Kling 3.0 ships native 4K generation from text — not upscaled 1080p. The unified multimodal framework merges video, audio, and image into one pipeline with character-driven dialogue and lip-sync. Multi-shot mode handles up to 6 camera cuts in a single 15-second generation, effectively enabling 30+ second narrative sequences. Signups doubled within 10 hours of launch. Side-by-side comparisons show Kling's skin texture and lighting beating upscaled Sora output at broadcast grade.
Source: VO3 AI Blog — https://www.vo3ai.com/blog/kling-30-just-launched-native-4k-video3-ways-it-changes-ai-filmmaking-2026-04-24


#4 Runway Gen-4.5 Holds #1 Text-to-Video Benchmark: Image-to-Video Now Public

Gen-4.5 leads the Artificial Analysis text-to-video leaderboard at 1,247 Elo. The A2D architecture (autoregressive-to-diffusion) combines diffusion visual clarity with autoregressive scene understanding. Image-to-video is now available across all subscription tiers with 2-10 second durations and reference-image camera control. For controlled i2v workflows, this is currently the most precise commercial option.
Source: Runway Research — https://runwayml.com/research/introducing-runway-gen-4.5

#5 Wan 2.6 Open-Source Ecosystem Matures: Wan 2.7 on the Horizon

Alibaba's Wan 2.6 (Apache 2.0, commercially free) is now the most complete open-source video stack: text-to-video, image-to-video, multi-shot generation, character consistency, and synchronized audio — all running locally on a 14B MoE architecture. Wan 2.7 is confirmed in development with improved photorealism, better physics simulation, and sharper detail. For anyone building a local pipeline that doesn't depend on API pricing, Wan is the backbone to bet on.
Source: MindStudio — https://www.mindstudio.ai/blog/what-is-wan-2-6-video-open-source

#6 TikTok's AI Detector Now at 94.7% Accuracy on Synthetic Faces

TikTok's C2PA Content Credentials integration plus proprietary classifiers now catch AI-generated faces at 94.7% accuracy and AI backgrounds at 87.3%. Enforcement removals of unlabeled AI content jumped 340% in 2025, with 51,000+ synthetic videos pulled in H2 alone. Unlabeled content gets auto-labeled, distribution-throttled, or removed. If you're posting AI video to TikTok without the built-in label, you're gambling with reach. Use the disclosure toggle in the posting interface.
Source: Storrito — https://storrito.com/resources/tiktoks-2026-ai-labeling-rules-and-what-they-signal-for-platform-governance/

#7 Sora Is Dead, Disney Deal Collapsed: OpenAI Exits Consumer Video

OpenAI moved to shut Sora down in March 2026 after active users fell below 500,000. The $1B Disney partnership, announced in December 2025 with 200+ characters licensed, collapsed before any money changed hands. The Sora web/app shut down April 26; the API sunsets September 24. If you had Sora in your pipeline, it's time to migrate: Kling for commercial API, Runway for controlled i2v, Wan for open-source local.
Source: Variety — https://variety.com/2026/digital/news/why-openai-disney-ended-sora-deal-bob-iger-1236698901/


#8 FramePack Unlocks 60-Second Video on 6GB VRAM via ComfyUI

Stanford's FramePack (from ControlNet creator Lvmin Zhang) compresses context frames dynamically — key frames keep 1,536 features, transitional frames drop to 192. Result: an RTX 3060 laptop with 6GB VRAM can generate up to 60-second coherent video, where previously 12GB+ was the floor. Kijai's ComfyUI wrapper integrates it with HunyuanVideo and FramePack models. If you've been hardware-limited, this is the unlock.
Source: FramePack.net — https://framepack.net/blog/comfyui-framepack-guide
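
To get a feel for why mixed per-frame compression matters, here is a rough back-of-the-envelope sketch in Python. It takes the two figures quoted above (1,536 features for key frames, 192 for transitional frames) at face value; the function names and the 16-frame, 2-key-frame split are hypothetical and chosen only for illustration. This is not FramePack's actual scheduling logic.

```python
# Per-frame context sizes as quoted in the briefing (not verified against FramePack's code).
KEY_FRAME_FEATURES = 1536
TRANSITIONAL_FRAME_FEATURES = 192


def context_features(total_frames: int, key_frames: int) -> int:
    """Total context features when only `key_frames` frames stay at full size.

    Hypothetical helper: assumes every non-key frame is compressed to the
    transitional size, which is a simplification for illustration.
    """
    if total_frames <= 0 or not 0 <= key_frames <= total_frames:
        raise ValueError("need total_frames > 0 and 0 <= key_frames <= total_frames")
    transitional_frames = total_frames - key_frames
    return (key_frames * KEY_FRAME_FEATURES
            + transitional_frames * TRANSITIONAL_FRAME_FEATURES)


def savings_vs_uniform(total_frames: int, key_frames: int) -> float:
    """Fraction of context saved compared with keeping every frame at full size."""
    uniform = total_frames * KEY_FRAME_FEATURES
    return 1 - context_features(total_frames, key_frames) / uniform


if __name__ == "__main__":
    # Illustrative split only: 16 context frames, 2 of them treated as key frames.
    print(context_features(16, 2))
    print(f"{savings_vs_uniform(16, 2):.1%}")
```

Under those assumptions the context shrinks from 24,576 features to 5,760, roughly a 77% reduction, which points in the same direction as the lower VRAM floor the story describes.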


#9 YouTube Shorts Hits 200B Daily Views: AI Video Monetization Numbers Hold

YouTube Shorts now reaches 200 billion daily views (up from 70B in 2023). AI-generated channels that disclose properly remain eligible for full YPP monetization. Current RPMs: long-form $1-$9/1K views depending on niche, Shorts $0.03-$0.13/1K views. TikTok's Creator Rewards Program pays $0.50-$1.00/1K qualified views for videos over 60 seconds. The economics are real but thin on short-form — the play remains long-form with AI as the production accelerator, not the entire product.
Source: Fliki — https://fliki.ai/blog/do-ai-generated-videos-monetize
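
The gap between those rates is easier to see with a quick calculation. The sketch below is a minimal Python example that turns the RPM ranges quoted above into an earnings range for a given view count. The dictionary keys and function name are made up for illustration, and the TikTok figure applies to qualified views only, so treating raw views as qualified is an optimistic assumption.

```python
from typing import NamedTuple


class RpmRange(NamedTuple):
    low: float   # dollars per 1,000 views, low end
    high: float  # dollars per 1,000 views, high end


# RPM ranges as quoted in the briefing; the keys are hypothetical labels.
RPM = {
    "youtube_long_form": RpmRange(1.00, 9.00),
    "youtube_shorts": RpmRange(0.03, 0.13),
    # Applies to *qualified* views on videos over 60 seconds.
    "tiktok_creator_rewards": RpmRange(0.50, 1.00),
}


def estimated_earnings(views: int, fmt: str) -> tuple[float, float]:
    """Return the (low, high) dollar estimate for `views` in the given format."""
    rpm = RPM[fmt]
    thousands = views / 1000
    return (round(thousands * rpm.low, 2), round(thousands * rpm.high, 2))


if __name__ == "__main__":
    for fmt in RPM:
        low, high = estimated_earnings(1_000_000, fmt)
        print(f"{fmt}: ${low:,.2f} to ${high:,.2f} per 1M views")
```

At one million views this works out to roughly $30 to $130 for Shorts against $1,000 to $9,000 for long-form, which is the "real but thin" short-form picture described above.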
