Foundations for Safe Generative Media
October 7, 2024
by Deepti Ghadiyaram

As we continue to build General World Models that advance human creativity, support artists, and augment the media and entertainment industries, our sense of responsibility to build tools with a net positive impact on the world has only deepened. Today, we are sharing the guardrails for Safety, Fairness, and Integrity that we have developed and implemented to prevent misuse of our generative models. This is our philosophy on how to build generative AI responsibly, and it reflects our commitment to doing so.

Safety

A crucial goal that Runway strives for is to detect and shut down bad actors who generate and perpetuate harmful, hurtful, and inappropriate content. We have built an in-house visual moderation system that generalizes well to both AI-generated and real-world images and videos, balancing creative and artistic liberty with harm prevention. Our system has been trained to automatically detect and block actors who repeatedly attempt to generate inappropriate content.
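
To make this blocking behavior concrete, here is a minimal Python sketch of a moderation gateway that scores each generation request and blocks an account after repeated violations. The classifier interface, the 0.5 decision boundary, and the three-strike threshold are all illustrative assumptions, not Runway's actual system.

```python
from collections import defaultdict

class ModerationGateway:
    """Minimal sketch of a moderation gateway: score each generation
    request, count violations per account, and block repeat offenders."""

    def __init__(self, classifier, threshold=3):
        self.classifier = classifier        # maps media -> harm score in [0, 1]
        self.threshold = threshold          # illustrative "strikes" before a block
        self.violations = defaultdict(int)  # per-user violation counts
        self.blocked = set()

    def allow(self, user_id: str, media) -> bool:
        """Return True if the generation request may proceed."""
        if user_id in self.blocked:
            return False
        if self.classifier(media) >= 0.5:   # 0.5 is an illustrative boundary
            self.violations[user_id] += 1
            if self.violations[user_id] >= self.threshold:
                self.blocked.add(user_id)   # repeated attempts: block the actor
            return False
        return True
```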

Throughout the process of honing our safety guardrails, we have learned that this balance between artistic liberty and harm prevention is incredibly difficult to strike. We continue to refine our perspectives as our generative models improve and popular culture evolves. Overall, when evaluated on data that neither model has seen before, our in-house model performs significantly better than the best-performing third-party API we have tested, achieving an F1-score¹ of 83% and recall of 88% (vs. 70% and 79%) along with a lower false-positive rate (2.8% vs. 5.6%).

¹ The F1 score is a metric used in machine learning to evaluate the performance of classification models, combining precision and recall into a single value.
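For readers unfamiliar with these metrics, the sketch below shows how F1-score, recall, and false-positive rate are computed from binary labels on a held-out evaluation set. This is standard metric bookkeeping, not Runway's evaluation code, and it assumes non-degenerate counts (no zero denominators).

```python
def moderation_metrics(y_true, y_pred):
    """Compute F1-score, recall, and false-positive rate for binary
    moderation labels, where 1 = 'inappropriate content'."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # share of harmful content caught
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)                 # share of benign content wrongly flagged
    return f1, recall, fpr
```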

Equally important is protecting the safety of children, a vital part of our engineering and research safety efforts. We have policies and procedures in place to ensure that we report any child sexual abuse material—as well as attempts to generate such material—that we identify, to the National Center for Missing and Exploited Children.

Though we test and research extensively to foresee how bad actors might misuse our tools, we acknowledge that we cannot predict every way, or every social circumstance, in which bad actors might abuse them. Thus, we will continue to monitor our moderation system, constantly iterating on, testing, and strengthening its guardrails.

[Figure: F1-score and recall of our in-house moderation model compared to the best-performing third-party API]

Fairness

An equally important principle at Runway is to build creative tools for everyone. Given the global availability of our tools, we want all users, irrespective of their demographics or geographic location, to find their experience on Runway enriching. This means our generations should not skew towards a particular demographic; they should represent all genders, skin tones, and cultures appropriately and fairly rather than perpetuating societal biases. We have trained, built, and deployed solutions to reduce the likelihood that prompts for certain professions (e.g., doctor, CEO, janitor, nurse) will default to gender or racial stereotypes. Upon deploying our diversity fine-tuned model, we found that it improves the group fairness metric by 150% for perceived skin tone and 97.7% for perceived gender. We believe this is an important, yet initial, step towards building General World Models that represent everyone, and we will continue to invest our research efforts in this space.
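
The post does not specify which group fairness metric is used, so the sketch below illustrates one common choice: the total variation distance between the observed distribution of a perceived attribute across generations and a uniform reference. The labels and counts in the example are hypothetical.

```python
from collections import Counter

def fairness_gap(attributes):
    """Total variation distance between the observed distribution of a
    perceived attribute and a uniform reference: 0.0 means perfectly
    balanced, higher means more skew. One of many possible group
    fairness definitions; Runway's exact metric is not specified."""
    counts = Counter(attributes)
    n, k = len(attributes), len(counts)
    return 0.5 * sum(abs(c / n - 1 / k) for c in counts.values())

# Hypothetical perceived-gender labels from 100 generations for "a CEO".
before = ["man"] * 90 + ["woman"] * 10
after = ["man"] * 55 + ["woman"] * 45
print(fairness_gap(before))  # ~0.40: strongly skewed
print(fairness_gap(after))   # ~0.05: far more balanced
```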

Additionally, we acknowledge that most generative tools work best when input text prompts are in English, given the inherent design of the models. However, we believe the generative tools we build should support creatives of diverse cultural backgrounds, regardless of the language they communicate best in. We want to give our multilingual users the ability to express themselves in any language of their choice and still have the best possible experience.

Integrity

The significant improvements in the visual quality and realism of generated content have introduced a new set of social and ethical challenges. We acknowledge that generative tools could be misused to create media that perpetuates misinformation, for example by impersonating people without their consent. It is of utmost importance to us that Runway continues to be a force for good and supports artists. Part of our safety effort has been centered around building robust technological safeguards that prevent users from generating content depicting known personalities.
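
As a rough illustration of what a prompt-level safeguard of this kind could look like, the sketch below checks a text prompt against a denylist of protected names. A production system would be far more robust (alias handling, fuzzy matching, inspection of the generated frames themselves); the names and helper here are hypothetical, not Runway's actual safeguard.

```python
import re

# Illustrative denylist; the actual safeguard and its coverage are not public.
PROTECTED_NAMES = {"example public figure", "another public figure"}

def references_personality(prompt: str) -> bool:
    """Return True if a text prompt appears to reference a protected
    personality. A substring check is only a sketch of the idea."""
    normalized = re.sub(r"\s+", " ", prompt.lower()).strip()
    return any(name in normalized for name in PROTECTED_NAMES)
```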

Relatedly, we have also adopted the provenance standards outlined by the Coalition for Content Provenance and Authenticity (C2PA) to trace whether a given media item is generated or authentic. We do this by adding invisible watermarks to every Runway generation, thereby retaining evidence in the metadata that it is AI-generated. This technology both helps detect misuse and enables users to distinguish AI-generated content from authentic media.
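
The C2PA standard itself defines certificate-signed manifests embedded in the media file; the sketch below only illustrates the underlying idea of cryptographically binding a provenance claim to a content hash, using a placeholder HMAC key and a hypothetical generator name rather than C2PA's actual signing scheme.

```python
import hashlib, hmac, json

SIGNING_KEY = b"placeholder-key"  # C2PA actually uses certificate-based signing

def attach_manifest(media_bytes: bytes) -> dict:
    """Bind a provenance claim to a content hash and sign it."""
    claim = {
        "generator": "example-model",  # hypothetical generator name
        "content_hash": hashlib.sha256(media_bytes).hexdigest(),
    }
    payload = json.dumps(claim, sort_keys=True).encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": sig}

def verify_manifest(media_bytes: bytes, manifest: dict) -> bool:
    """Check that the media is unmodified and the claim is authentic."""
    claim = manifest["claim"]
    if hashlib.sha256(media_bytes).hexdigest() != claim["content_hash"]:
        return False  # media was altered after signing
    payload = json.dumps(claim, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, manifest["signature"])
```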


At Runway, we believe that building safer generative tools goes hand in hand with offering more creative capabilities. As our tools expand into every artistic and creative corner, we will remain vigilant about potential misuses, rigorous in our safety evaluations, and continually proactive in enhancing our existing safeguards and building new ones as the technology evolves.