Our Research
Building general-purpose multimodal simulators of the world.
We believe models that use video as their main input/output modality, when supplemented by other modalities like text and audio, will form the next paradigm of computing.
Research from Runway
March 31, 2025
StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting
by Shakiba Kheradmand, Delio Vicini, George Kopanas, Dmitry Lagun, Kwang Moo Yi, Mark Matthews, Andrea Tagliasacchi
3D Gaussian splatting (3DGS) is a popular radiance field method, with many application-specific extensions. Most variants rely on the same core algorithm: depth-sorting of Gaussian splats then rasterizing in primitive order. This ensures correct alpha compositing, but can cause rendering artifacts due to built-in approximations. Moreover, for a fixed representation, sorted rendering offers little control over render cost and visual fidelity. For example, and counter-intuitively, rendering a lower-resolution image is not necessarily faster. In this work, we address the above limitations by combining 3D Gaussian splatting with stochastic rasterization. Concretely, we leverage an unbiased Monte Carlo estimator of the volume rendering equation. This removes the need for sorting, and allows for accurate 3D blending of overlapping Gaussians. The number of Monte Carlo samples further imbues 3DG...
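The sorting-free principle can be illustrated with a toy stochastic-transparency estimator (a minimal 1-D sketch of the general idea, not the paper's GPU rasterizer; all names and the scalar colors are illustrative): each primitive survives a sample with probability equal to its opacity, the nearest survivor wins a plain depth test rather than a full sort, and averaging samples converges to the sorted front-to-back compositing result in expectation.

```python
import random

def composite_reference(prims):
    """Ground truth: sorted front-to-back alpha compositing."""
    color, transmittance = 0.0, 1.0
    for depth, alpha, c in sorted(prims):
        color += transmittance * alpha * c
        transmittance *= 1.0 - alpha
    return color

def stochastic_sample(prims, rng):
    """One unsorted Monte Carlo sample: each primitive survives with
    probability alpha; the nearest survivor wins (a depth test, no sort)."""
    survivors = [(d, c) for d, a, c in prims if rng.random() < a]
    return min(survivors)[1] if survivors else 0.0  # 0.0 = black background

def stochastic_composite(prims, n_samples=20000, seed=0):
    """Average many samples; the estimator is unbiased, so this converges
    to the sorted compositing result without ever sorting the primitives."""
    rng = random.Random(seed)
    return sum(stochastic_sample(prims, rng) for _ in range(n_samples)) / n_samples

# Toy scene: (depth, alpha, scalar color) per splat, in arbitrary order.
prims = [(2.0, 0.5, 0.2), (1.0, 0.4, 0.9), (3.0, 0.8, 0.6)]
```

Increasing `n_samples` trades render cost for lower variance, which mirrors the paper's point that the sample count gives direct control over cost versus fidelity.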
March 22, 2025
Progressive Prompt Detailing for Improved Alignment in Text-to-Image Generative Models
by Ketan Suhaas Saichandran, Xavier Thomas, Prakhar Kaushik, Deepti Ghadiyaram
Text-to-image generative models often struggle with long prompts detailing complex scenes, diverse objects with distinct visual characteristics and spatial relationships. In this work, we propose SCoPE (Scheduled interpolation of Coarse-to-fine Prompt Embeddings), a training-free method to improve text-to-image alignment by progressively refining the input prompt in a coarse-to-fine-grained manner. Given a detailed input prompt, we first decompose it into multiple sub-prompts which evolve from describing broad scene layout to highly intricate details. During inference, we interpolate between these sub-prompts and thus progressively introduce finer-grained details into the generated image. Our training-free plug-and-play approach significantly enhances prompt alignment, achieves an average improvement of up to +4% in Visual Question Answering (VQA) scores over the Stable Diffusion baselin...
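The coarse-to-fine interpolation can be sketched as a schedule over denoising steps (a minimal sketch under assumed details: a linear schedule and plain list-of-floats embeddings; the paper's actual schedule and embedding handling may differ):

```python
def scoped_embedding(sub_embeds, t, T):
    """Blend between coarse-to-fine sub-prompt embeddings at step t of T.
    sub_embeds is ordered coarse -> fine; early steps use the broad scene
    layout, later steps introduce finer-grained detail. Linear schedule
    is an illustrative assumption."""
    pos = (t / max(T - 1, 1)) * (len(sub_embeds) - 1)  # fractional index
    i = min(int(pos), len(sub_embeds) - 2)             # lower sub-prompt
    w = pos - i                                        # interpolation weight
    coarse, fine = sub_embeds[i], sub_embeds[i + 1]
    return [(1 - w) * a + w * b for a, b in zip(coarse, fine)]
```

At each denoising step the returned embedding would replace the fixed prompt embedding fed to the model, which is what makes the approach training-free and plug-and-play.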
March 9, 2025
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization
by Xavier Thomas, Deepti Ghadiyaram
Domain Generalization aims to develop models that can generalize to novel and unseen data distributions. In this work, we study how model architectures and pre-training objectives impact feature richness and propose a method to effectively leverage them for domain generalization. Specifically, given a pre-trained feature space, we first discover latent domain structures, referred to as pseudo-domains, that capture domain-specific variations in an unsupervised manner. Next, we augment existing classifiers with these complementary pseudo-domain representations making them more amenable to diverse unseen test domains. We analyze how different pre-training feature spaces differ in the domain-specific variances they capture. Our empirical studies reveal that features from diffusion models excel at separating domains in the absence of explicit domain labels and capture nuanced domain-specific ...
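One plausible reading of the pipeline can be sketched as unsupervised clustering of pre-trained features followed by feature augmentation (k-means here is a stand-in assumption, as are all names; the paper's exact discovery procedure and augmentation may differ):

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(feats, k, iters=50, seed=0):
    """Toy k-means: discover k pseudo-domains in a pre-trained feature
    space without any explicit domain labels."""
    rng = random.Random(seed)
    centroids = rng.sample(feats, k)
    assign = [0] * len(feats)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: dist2(f, centroids[j]))
                  for f in feats]
        for j in range(k):
            members = [f for f, a in zip(feats, assign) if a == j]
            if members:  # keep old centroid if the cluster emptied out
                centroids[j] = [sum(xs) / len(members) for xs in zip(*members)]
    return centroids, assign

def augment(feat, centroids):
    """Concatenate the nearest pseudo-domain centroid onto the feature,
    giving a downstream classifier an explicit domain cue."""
    j = min(range(len(centroids)), key=lambda j: dist2(feat, centroids[j]))
    return feat + centroids[j]
```

A classifier trained on the augmented features sees a representation of which latent domain each sample came from, which is one way to make it more amenable to unseen test domains.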
We're advancing research in AI systems that can understand and simulate the world and its dynamics.

RNA Sessions
An ongoing series of talks about frontier research in AI and art, hosted by Runway.
Introducing Act-One
Runway /
October 22, 2024

Foundations for Safe Generative Media
Research /
October 7, 2024

Pioneering New Interfaces in the Age of Generative Media
Research /
September 10, 2024

Introducing Gen-3 Alpha: A New Frontier for Video Generation
Research /
June 17, 2024

Introducing General World Models
Research /
December 11, 2023

More control, fidelity and expressibility
Research /
November 23, 2023

Mitigating stereotypical biases in text to image generative systems
Research /
October 10, 2023

Scale, Speed and Stepping Stones: The path to Gen-2
Research /
September 28, 2023

Gen-2: Generate novel videos with text, images or video clips
Research /
March 11, 2023

Gen-1: The next step forward for generative AI
Research /
December 11, 2022

Towards unified keyframe propagation models
Research /
May 25, 2022

High-Resolution Image Synthesis with Latent Diffusion Models
Research /
April 13, 2022

Soundify: Matching sound effects to video
Research /
January 11, 2022

Distributing Work: Adventures in queuing
Research /
November 1, 2021

Creative AI Conversations 2021 - 2022
Research /
September 1, 2021