Building new worlds responsibly
Our approach
Safety is built into our products from the ground up, not bolted on after the fact. We design multi-layered defenses so that our safeguards remain resilient even when any single layer falls short.
As our models grow more capable, our safety systems must grow with them. We continually reassess risks, update our tooling, and invest in new detection methods to stay ahead of emerging threats.
We focus on preventing the creation of content that is inherently harmful, while preserving the creative freedom that makes our tools valuable to creatives, brands, and frontier builders everywhere.
Our safeguards
Prevent
Usage policy
This policy sets the foundation for all of our safety work.
Model-level safeguards
We build safety directly into model behavior, filtering data before training and applying post-training techniques to teach our models to avoid generating certain types of harmful content.
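As a rough, hypothetical sketch of what filtering data before training can look like (the safety_score scorer and the cutoff value below are assumptions for illustration, not a description of our actual pipeline):

```python
from typing import Callable, Iterable, Iterator

def filter_training_data(
    examples: Iterable[str],
    safety_score: Callable[[str], float],  # hypothetical scorer: 0.0 safe -> 1.0 harmful
    max_score: float = 0.2,                # illustrative cutoff, not a production value
) -> Iterator[str]:
    """Yield only the examples a safety scorer considers acceptable for training."""
    for example in examples:
        if safety_score(example) <= max_score:
            yield example
```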
Red teaming
We subject our models to rigorous adversarial testing—both automated and manual—before launch to identify risks and understand how products could be misused.
Product-level safeguards
We build safety mitigations directly into the product.
Detect
Input & output detection
We use AI-based classifiers to analyze user inputs and generated outputs to catch potentially harmful content before it reaches users.
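To illustrate the general shape of a classifier gate like this, here is a minimal sketch; the categories, thresholds, and classify scorer are hypothetical placeholders rather than our production system:

```python
from dataclasses import dataclass

# Hypothetical category scores returned by a content-safety classifier.
# Real systems use their own taxonomies and calibrated thresholds.
@dataclass
class ModerationScores:
    violence: float
    sexual_minors: float
    self_harm: float

BLOCK_THRESHOLDS = {
    "violence": 0.92,
    "sexual_minors": 0.50,   # stricter threshold for the most severe category
    "self_harm": 0.85,
}

def should_block(scores: ModerationScores) -> bool:
    """Return True if any category score exceeds its blocking threshold."""
    return any(
        getattr(scores, category) >= threshold
        for category, threshold in BLOCK_THRESHOLDS.items()
    )

def moderate(text: str, classify) -> str:
    """Run the gate on an input prompt and, later, on the generated output.

    `classify` is a stand-in for whatever model scores the text or media.
    """
    return "blocked" if should_block(classify(text)) else "allowed"
```

The same check can run twice: once on the user's prompt before generation, and again on the generated output before it is shown.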
Human review
We review user reports and suspension appeals, catching what automated systems may miss or get wrong, and use what we learn to continuously improve our filters.
Continuous monitoring
We actively track usage trends, industry developments, and shifts in the risk landscape to ensure our safeguards keep pace.
Enforce
Account enforcement
We use systematic tooling to identify and remove bad actors, including automatic suspensions for users who repeatedly trigger moderation systems or upload unlawful content.
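As an illustration of how repeat-violation enforcement can be automated, here is a simplified strike-counting sketch; the strike limit, rolling window, and in-memory storage are assumptions made for the example, not our internal tooling:

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

STRIKE_LIMIT = 3                  # illustrative: violations allowed before suspension
STRIKE_WINDOW = timedelta(days=30)

# In-memory store for illustration; a real system would persist this.
_strikes: dict[str, deque] = defaultdict(deque)

def record_violation(user_id: str, now: datetime | None = None) -> str:
    """Record a moderation strike and return the resulting enforcement action."""
    now = now or datetime.utcnow()
    strikes = _strikes[user_id]

    # Drop strikes that have aged out of the rolling window.
    while strikes and now - strikes[0] > STRIKE_WINDOW:
        strikes.popleft()

    strikes.append(now)
    return "suspend_account" if len(strikes) >= STRIKE_LIMIT else "warn_user"
```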
Transparency
Provenance signals
We implement C2PA so that AI-generated content can be traced back to its source.
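C2PA binds signed, certificate-backed manifests to the asset itself; the snippet below is only a simplified stand-in that shows the underlying idea of tying a signed provenance claim to a content hash (the HMAC key and manifest fields are illustrative and do not follow the actual C2PA format):

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # illustrative only; C2PA uses certificate-based signing

def build_provenance_record(media_bytes: bytes, generator: str) -> dict:
    """Bind a signed provenance claim to the exact bytes of a piece of media."""
    content_hash = hashlib.sha256(media_bytes).hexdigest()
    claim = {"generator": generator, "content_sha256": content_hash}
    payload = json.dumps(claim, sort_keys=True).encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}

def verify_provenance_record(media_bytes: bytes, record: dict) -> bool:
    """Check that the claim matches the bytes and the signature is intact."""
    expected_hash = hashlib.sha256(media_bytes).hexdigest()
    payload = json.dumps(record["claim"], sort_keys=True).encode()
    expected_sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return (record["claim"]["content_sha256"] == expected_hash
            and hmac.compare_digest(record["signature"], expected_sig))
```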
Report content
If you find content that raises concerns and you believe it was created with our tools, please report it here.