Top AI companies commit to child safety principles as industry grapples with deepfake scandals

Several of the companies that made the commitment in partnership with Thorn have faced scandals related to child sex abuse material and AI.
OpenAI was one of the companies that signed onto Thorn's new standards. Jonathan Raa / NurPhoto via Getty Images file

After a series of highly publicized scandals related to deepfakes and child sexual abuse material (CSAM) plagued the artificial intelligence industry, top AI companies have come together and pledged to combat the spread of AI-generated CSAM.

Thorn, a nonprofit that creates technology to fight child sexual abuse, announced Tuesday that Meta, Google, Microsoft, Civitai, Stability AI, Amazon, OpenAI and several other companies have signed onto new standards created by the group in an attempt to address the issue. At least five of the companies have previously responded to reports that their products and services have been used to facilitate the creation and spread of sexually explicit deepfakes featuring children. 

AI-generated CSAM and deepfakes have become a hot-button issue in Congress and beyond, with reports detailing stories of teenage girls victimized at school with AI-generated sexually explicit images that feature their likenesses. 

NBC News previously reported that sexually explicit deepfakes with real children’s faces were among top search results for terms like “fake nudes” on Microsoft’s Bing, as well as in Google search results for specific female celebrities and the word “deepfakes.” NBC News also identified an ad campaign running on Meta platforms in March 2024 for a deepfake app that offered to “undress” a picture of a 16-year-old actress.

The new “Safety by Design” principles, which the companies pledged to integrate into their technologies and products, include measures that a number of the companies have already struggled with.

One principle is the development of technology that will allow companies to detect whether an image was generated by AI. Many early iterations of this technology come in the form of watermarks, which are often easy to remove.

Another principle is that CSAM will not be included in training datasets for AI models.

In December 2023, Stanford researchers discovered more than 1,000 child sexual abuse images in a popular open-source database of images used to train Stability AI’s Stable Diffusion 1.5, a version of one of the most popular AI image generators. The dataset, which was not created or managed by Stability AI, was taken down at the time.

In a statement to NBC News, Stability AI said its models were trained on a “filtered subset” of the dataset in which the child sexual abuse images were found.

“In addition, we subsequently fine-tuned these models to mitigate residual behaviors,” the statement said.

Thorn’s new principles also say that companies should release models only after they’ve been checked for child safety, that companies should host their models responsibly and that companies should make assurances that their models aren’t used for abuse.

It’s not clear how the various companies will apply such standards, and some have drawn significant criticism for how they have applied similar policies in the past and for the communities surrounding their platforms.

Civitai, for instance, offers a marketplace where anyone can commission “bounties,” or deepfakes, of real or fake people.

At the time of publication, there were numerous requests on the “bounties” website for deepfakes of celebrity women, some appearing to seek sexually explicit results.

Civitai says “content depicting or intended to depict real individuals or minors (under 18) in a mature context” is prohibited.

In one bounty reviewed by NBC News, someone asked for a deepfake made of an adult film actress, specifying that it "should ideally be trained on images from her adult film era."

Civitai said a moderator removed the post after this article’s publication. According to the company, the post did not violate its terms of service or safety policies because it did not explicitly ask for “NSFW” results, but it was removed “out of an abundance of caution.”

Some of the AI models, AI-generated images and AI-generated videos displayed on Civitai’s pages featured sexually suggestive depictions of what appeared to be young females.

In its release about the new “Safety by Design” principles, Thorn also nodded to the systemic stress that AI puts on an already-struggling sector of law enforcement. A report released Monday by the Stanford Internet Observatory found that only between 5% and 8% of reports to the National Center for Missing and Exploited Children regarding child sexual abuse imagery lead to arrests, and generative tools unlock the potential for a flood of new, AI-generated child sexual abuse content.

Thorn creates technology, used by tech companies and law enforcement, that is meant to detect child exploitation and sex trafficking. Tech companies have praised Thorn’s work, and many have partnered with the group to implement its technologies on their platforms.

Thorn has attracted scrutiny, however, for its work with law enforcement. One of its primary products collects online solicitations for sex and makes them available to police, a practice that Forbes reported has fallen under scrutiny from counter-trafficking experts and sex worker advocates.