🎬 AI Video Intel

🎬 AI Video Intel — Tuesday, May 19, 2026 at 6:45 AM

Video models, Visual AI

Top stories, ranked by relevance.

#1 ComfyUI v0.21.1 Ships Flux2 Partner Nodes, Claude LLM Node, and Latent Previews

ComfyUI dropped v0.21.0 on May 11 and v0.21.1 two days later with native Flux2ImageNode, GrokImageEditNodeV2, ByteDance SeedreamNodeV2, an OpenAI Image node, and a Claude LLM node — all as first-class partner nodes with DynamicCombo and Autogrow UX. The release also adds high-quality Flux2 latent previews and support for Anima TE LoRA in Kohya format plus HiDream-O1-Image with fp8 dtype fixes.

#2 Alibaba's Wan 2.7 Is the Most Complete Open-Source Video Stack Under Apache 2.0

Alibaba shipped the full Wan 2.7 suite between April 1 and April 6: four models covering text-to-video, image-to-video, reference-to-video with voice cloning, and instruction-based video editing, all under Apache 2.0. The 27B-parameter MoE architecture (14B active) is available via API from $0.10/sec and also runs locally. Wan 3.0, targeting 60B parameters, 4K output, and 30-second generation, is expected mid-2026.
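
For budgeting, the quoted starting rate can be turned into a rough lower bound on clip cost. The sketch below assumes the $0.10/sec figure is billed per second of generated video, which the brief does not confirm; `min_clip_cost` is a hypothetical helper, not part of any Wan API.

```python
from decimal import Decimal

# "from $0.10/sec" in the brief; assumed to mean per second of generated video.
STARTING_RATE_USD_PER_SEC = Decimal("0.10")


def min_clip_cost(seconds: int, rate: Decimal = STARTING_RATE_USD_PER_SEC) -> Decimal:
    """Lower-bound cost in USD for a clip of the given length at the starting rate."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return rate * seconds


if __name__ == "__main__":
    for length in (5, 10, 15):
        print(f"{length:>2}s clip: at least ${min_clip_cost(length)}")
```

Because the rate is a "from" price, higher resolutions or other model variants may cost more than this estimate.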

#3 YouTube's Largest AI Demonetization Wave: 4.7B Views Erased, $10M Revenue Gone

YouTube's enforcement against low-quality AI content is accelerating hard. Over 4.7 billion lifetime views wiped, 35 million subscribers affected, and an estimated $10M in annual creator revenue vanished. The key: YouTube isn't banning AI — properly disclosed, quality AI content is still fully monetizable. It's mass-produced, repetitive slop that's getting hit. Disclosure via the "altered or synthetic content" toggle in Studio is now mandatory.

#4 Runway Gen-4.5 Takes #1 on Artificial Analysis Video Benchmark

Runway's Gen-4.5 now holds the top spot on the Artificial Analysis Text-to-Video benchmark at 1,247 Elo points. It supports text-to-video and image-to-video generation of 2 to 10 seconds, with improved stylistic control and visual consistency. It is available across all paid plans at comparable pricing, and the API accepts a first-frame image input alongside text prompts.
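
To read an Elo figure like 1,247, it helps to convert a rating gap into an expected head-to-head preference rate. The sketch below uses the standard logistic Elo formula with a 400-point scale; whether Artificial Analysis uses exactly this scale is an assumption, and the 1,147 opponent rating is hypothetical, not a benchmark result.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise comparisons that A wins against B under standard Elo."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    gen45 = 1247          # rating quoted in the brief
    hypothetical = 1147   # made-up opponent rating, 100 points lower
    share = expected_score(gen45, hypothetical)
    print(f"Expected preference rate at a 100-point gap: {share:.1%}")
```

Under these assumptions, a 100-point lead corresponds to being preferred in roughly 64% of pairwise comparisons.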

#5 Kling 3.0 Adds Multi-Shot Storyboards, Native Multilingual Audio, and 4K Output

Kuaishou's Kling 3.0 (launched Feb 5) brings multi-shot storyboarding — up to 6 shots in a single 15-second clip with per-shot control over duration, framing, camera movement, and narrative content. Native audio generation covers English, Chinese, Japanese, Korean, and Spanish with accent control. The Omni variant supports reference-based character consistency across scenes with voice cloning. Images now go up to 4K.
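
When planning a multi-shot prompt, it can help to check a storyboard against the limits quoted above before submitting it. The following is a minimal sketch: the `Shot` fields and `validate_storyboard` are hypothetical names for illustration, not Kling's API, and it assumes the 15 seconds is a cap on total clip length.

```python
from dataclasses import dataclass

MAX_SHOTS = 6          # "up to 6 shots" per the brief
MAX_CLIP_SECONDS = 15  # assumed cap on the total clip length


@dataclass
class Shot:
    duration_s: float
    framing: str
    camera_movement: str
    description: str


def validate_storyboard(shots: list[Shot]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the plan fits."""
    problems = []
    if not shots:
        problems.append("storyboard has no shots")
    if len(shots) > MAX_SHOTS:
        problems.append(f"{len(shots)} shots exceeds the limit of {MAX_SHOTS}")
    for index, shot in enumerate(shots, start=1):
        if shot.duration_s <= 0:
            problems.append(f"shot {index} has a non-positive duration")
    total = sum(shot.duration_s for shot in shots)
    if total > MAX_CLIP_SECONDS:
        problems.append(f"total {total}s exceeds the {MAX_CLIP_SECONDS}s clip length")
    return problems


if __name__ == "__main__":
    plan = [
        Shot(5, "wide", "slow push-in", "Harbor at dawn"),
        Shot(4, "close-up", "static", "Captain checks the map"),
        Shot(6, "medium", "pan left", "Ship leaves the dock"),
    ]
    print(validate_storyboard(plan) or "storyboard fits the quoted limits")
```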

#6 Sora Is Officially Dead — API Sunset September 24

OpenAI's Sora app shut down April 26 and the API follows on September 24, 2026. The model was burning roughly $1M/day in compute with active users dropping from 1M to under 500K. Disney's rumored $1B investment never materialized — they reportedly learned of the shutdown less than an hour before the public announcement. If you still have Sora assets, download them now before permanent deletion.

#7 SANA-Video Runs Minute-Long 720p Generation on a Single RTX 5090

NVIDIA's SANA-Video, which scored an ICLR 2026 Oral, uses linear attention with a constant-memory KV cache to generate minute-long 720p video without VRAM use growing as the clip gets longer. With NVFP4 precision on an RTX 5090, inference time for a 5-second 720p clip drops from 71s to 29s. It supports both text-to-video and text+image-to-video. A real option for creators who want to keep generation local and fast.
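
The reason linear attention can keep memory flat is that the whole history folds into a fixed-size running state instead of a cache that grows with every step. The sketch below is a generic pure-Python illustration of causal linear attention, not NVIDIA's implementation; the `elu(x) + 1` feature map and the class name are assumptions chosen for clarity.

```python
import math


def feature_map(x):
    """Positive feature map phi(x) = elu(x) + 1 (an assumed, commonly used choice)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


class LinearAttentionState:
    """Running state for causal linear attention; its size never depends on history length."""

    def __init__(self, dim_k, dim_v):
        self.kv = [[0.0] * dim_v for _ in range(dim_k)]  # sum over steps of phi(k) v^T
        self.k_sum = [0.0] * dim_k                       # sum over steps of phi(k)

    def step(self, q, k, v):
        """Fold one (k, v) pair into the state, then return the attention output for q."""
        phi_k = feature_map(k)
        for i, ki in enumerate(phi_k):
            self.k_sum[i] += ki
            for j, vj in enumerate(v):
                self.kv[i][j] += ki * vj
        phi_q = feature_map(q)
        norm = sum(qi * si for qi, si in zip(phi_q, self.k_sum))
        return [
            sum(phi_q[i] * self.kv[i][j] for i in range(len(phi_q))) / norm
            for j in range(len(v))
        ]

    def num_floats(self):
        """Number of floats held in the state."""
        return len(self.kv) * len(self.kv[0]) + len(self.k_sum)


if __name__ == "__main__":
    state = LinearAttentionState(dim_k=2, dim_v=2)
    for t in range(1000):
        state.step([0.1, -0.2], [0.3, 0.4], [float(t), 1.0])
    print("floats in state after 1000 steps:", state.num_floats())
```

A standard softmax KV cache would store every past key and value, so its footprint grows linearly with the number of steps; here the state stays at `dim_k * dim_v + dim_k` floats regardless of how many steps have been processed.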

#8 Seedance 2.0: First Unified Audio-Video Joint Generation with Phoneme-Level Lip-Sync

ByteDance's Seedance 2.0 (released Feb 12) accepts text, image, audio, and video inputs in combination — up to 9 images, 3 video clips, and 3 audio clips as reference — and generates multi-shot videos up to 15 seconds with dual-channel synchronized audio. The standout feature is phoneme-level lip-sync in 8+ languages, making it the go-to for multilingual talking-head content.

#9 Pika 2.5 Pikaffects Bring Physics-Aware VFX to Any Frame

Pika's 2.5 engine introduces Pikaffects — pre-set physics simulations you can apply to any object in frame. The update also adds automatic sound-effect generation matched to on-screen action (a car crash generates the crunch of metal) and near-zero flicker with professional-grade temporal consistency. Separately, Pika launched PikaStream 1.0 for real-time video chat with AI agents.
