How does content moderation work?

We want AI Dungeon to be a safe place for people to discover exciting content as well as freely explore their own ideas and creativity in a private way.

Here’s how that works:

Unpublished, single-player content is never moderated: We don’t issue flags, suspensions, or bans for anything users do in private, single-player play.
Targeted AI boundaries: The AI is subject to boundaries designed to prevent it from generating unsafe content, specifically content that promotes or glorifies the sexual exploitation of children. There are no consequences for players if the AI hits these boundaries, and humans never review cases where this happens. Stories are always private.
Seamless response filtering: We generate several possible responses for each user input so that if one option doesn’t pass the filter, one of the other options can be delivered to the player.
Transparent error handling: In rare instances, the AI may be unable to generate a response within its boundaries. When this occurs, players see a message that explains what happened and offers ways to continue their story.
Story encryption: Stories are encrypted for additional privacy for players. Stories are decrypted and sent to users’ devices when requested. The AI cannot process encrypted text, so decrypted elements of the story are passed to the AI to generate the next response.
Published stories are subject to our Community Guidelines: Unpublished content is never moderated. However, interactions with other users and content published within Latitude’s community are subject to our Community Guidelines. You can read those here.
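
The response filtering and error handling steps described above can be pictured with a small sketch. This is an illustration only, not Latitude's actual implementation: the function names, the fallback message text, and the toy filter are all hypothetical, and the real filter is not described on this page.

```python
from typing import Callable, Iterable, Optional

# Hypothetical message text; the real wording shown to players may differ.
FALLBACK_MESSAGE = (
    "The AI was unable to generate a response within its boundaries. "
    "You can retry or edit your last input to continue your story."
)


def select_response(
    candidates: Iterable[str],
    passes_filter: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate that passes the filter, or None if none do."""
    for candidate in candidates:
        if passes_filter(candidate):
            return candidate
    return None


def respond(
    candidates: Iterable[str],
    passes_filter: Callable[[str], bool],
) -> str:
    """Deliver a passing candidate, or a transparent message if every one fails."""
    chosen = select_response(candidates, passes_filter)
    return chosen if chosen is not None else FALLBACK_MESSAGE


def toy_filter(text: str) -> bool:
    """Stand-in filter for the example: rejects text containing 'forbidden'."""
    return "forbidden" not in text.lower()


if __name__ == "__main__":
    # Several candidates for one user input; the first passing one is delivered.
    print(respond(["a forbidden line", "a safe line"], toy_filter))
    # No candidate passes, so the player sees the explanatory message instead.
    print(respond(["forbidden"], toy_filter))
```

In this sketch nothing is logged or reported when a candidate is rejected, which mirrors the statement above that hitting a boundary has no consequences for the player.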
