Generative media — the field of research that relates to "deep fakes" and other forms of synthesized audio and video — is advancing rapidly. In many use cases, the results are already indistinguishable from real media. This technology has exciting applications, such as Descript's Overdub feature, but it also holds the potential for misuse.
While Descript is among the first products available with generative media features, it won't be the last. We are therefore committed to modeling a responsible implementation of these technologies, unlocking the benefits of generative media while safeguarding against malicious use.
We believe you should own and control the use of your digital voice. Descript trains speech models through a process that depends on verbal consent verification, ensuring that our customers can create only text-to-speech models that have been authorized by the voice’s owner. Once a model is created, the voice owner controls when and how it is used.
As the applications of this technology continue to evolve, we will remain in conversation with leading machine learning researchers, ethics professors, and the broader public about how to best develop and implement this technology.
Today, our technology is unique, but the foundational research is already widely available. Other generative media products will exist soon, and there's no reason to assume they will have the same constraints we've added to Descript.
It's unclear whether generated media will remain detectable. While compelling research is underway, the quality of generative media could increase at a rate that outpaces the technology designed to detect it. We cannot predict what the future holds for media, but we do believe it will remain important for each of us to be critical consumers of everything we see, hear, and read.