Discord Platform Exploration
Method of Controlled Environment Creation: Private Discord Server - invite only
Cool things to note:
- Safety setup - Automated Explicit Image filter in Discord's AutoMod
- Use of external moderators for 'custom server needs'
- Claims about AutoMod's detection for links to malware
Update on machine learning for Malware links and Open Source Tools:
Links to files uploaded on Discord appear to have been made temporary, expiring after 24 hours: https://www.bleepingcomputer.com/news/security/discord-will-switch-to-temporary-file-links-to-block-malware-delivery/. Discord's claim to flag malicious links appears to have been implemented; however, many users are skeptical of its precision:
- https://www.reddit.com/r/discordapp/comments/16cgdao/i_got_a_warning_for_sending_a_malicious_link/
- https://support.discord.com/hc/en-us/community/posts/360058233471-Believe-I-have-been-banned-unjustly-any-chance-for-me-to-appeal-explain-it-to-someone (4yrs ago)
- ! However, there is not much, if any, documentation from Discord describing this functionality.
- Discord also has a link to top.gg for external Discord bots: https://top.gg/bot/726955339710332958
External Discord bots: not open source, but some leaked source code and GitHub links are available:
- Giselle bot : https://github.com/cycloptux/GiselleBot-Documentation
- MEE6 : https://github.com/Mee6 (Mainly Documentation though)
- Dyno : https://github.com/aididan20/dyno (Leaked source code, though likely out of date since the bot has most likely changed since)
- Bulbbot : https://github.com/TeamBulbbot/bulbbot
- All of the policies detailed highlight user reports but do not describe any automated approaches to detecting violations in areas such as User Safety, Platform Integrity, Regulated or Illegal Activity, and Off-Platform policies.
In fact, when creating a closed environment of private users, it seems that no platform-level moderation is applied to prevent violations of these policies within a private server. Discord approaches the user experience from a good-Samaritan perspective.
"We invest heavily in proactive efforts to detect and remove abuse before it is viewed or experienced by others. We do this by leveraging advanced tooling,machine learning, and specialized teams, and partnering with external industry experts." - Transparency Report Q4 (2023)
- "Image hashing and machine-learning powered technologies that help us identify known and unknown child sexual abuse material... Machine learning models that use metadata and network patterns to identify bad actors or spaces with harmful content and activity. Automated moderation (AutoMod), a server-level feature that empowers community moderation through features like keyword and spam filters that can automatically trigger moderation actions." !! - AutoMod is an automated CM tool that Discord provides at a server level for keyword filters. However, Discord provides documentation for implementing either human moderator teams or implementing autoMods that are not immediately provided by Discord, especially in public and verified / partnered servers.
AutoMod features include:
- Blocking messages that contain specific keywords / spam -> 1 'Commonly Flagged Words' filter + 3 custom filters with max 1000 keywords each
- Logging flagged messages for human review
- Spam content detection, described as: "This filter identifies spam at large by using a model that has been trained by messages that users have reported as spam to Discord."
- Explicit Media Content filter - allowing different levels of moderation intensity
- Coverage includes messages within threads, text-in-voice channels, and forum channels
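As a rough sketch of how these server-level filters map onto Discord's public Auto Moderation API, the example below creates a custom keyword rule over HTTP. The bot token, guild ID, alert channel ID, and keyword list are all placeholders; the numeric trigger/action codes follow the values in Discord's API reference.

```python
import requests

# Placeholders: a bot token with the Manage Server permission and the target server ID.
BOT_TOKEN = "YOUR_BOT_TOKEN"
GUILD_ID = "123456789012345678"

url = f"https://discord.com/api/v10/guilds/{GUILD_ID}/auto-moderation/rules"
headers = {"Authorization": f"Bot {BOT_TOKEN}"}

# A custom keyword filter (trigger_type 1 = KEYWORD); '*' acts as the
# wildcard prefix/suffix flag mentioned in the AutoMod docs.
rule = {
    "name": "Example keyword filter",
    "event_type": 1,       # 1 = MESSAGE_SEND
    "trigger_type": 1,     # 1 = KEYWORD (4 = KEYWORD_PRESET, 3 = SPAM)
    "trigger_metadata": {"keyword_filter": ["badword", "badprefix*"]},
    "actions": [
        {"type": 1},       # 1 = BLOCK_MESSAGE
        {"type": 2, "metadata": {"channel_id": "987654321098765432"}},  # 2 = SEND_ALERT_MESSAGE (log channel)
    ],
    "enabled": True,
}

resp = requests.post(url, headers=headers, json=rule)
print(resp.status_code, resp.json())
```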
Mee6: https://mee6.xyz/
Dyno: https://dyno.gg/
Giselle: https://docs.gisellebot.com/bot-invite.html
AutoModerator: https://automoderator.app/
Fire: https://getfire.bot/
Bulbbot: https://bulbbot.rocks/
Gearbot: https://gearbot.rocks/
However, enforcement of most Community Guidelines violations seems to rely primarily on user reports as well as the Safety Reporting Network. Discord's Policy Hub provides in-depth documentation of what each policy entails as well as support for user reports.
Requirement for implementing customization: Developer Mode must be enabled in the user's settings.
Safety Setup: There is an explicit image filter: "When a user has this feature on, every message sent to them will be automatically checked to see if it contains potentially sensitive media. Discord will then blur or block potentially sensitive media depending on the user's settings. That detection process does not report any message content or other information to Discord for enforcement purposes."
Server Settings offers Integrations such as:
Webhooks allow you to post automated messages to the server based on output from an external website. Compatible platforms include GitHub, CircleCI, and DataDog. Webhooks are described as a "low-effort way to post messages to channels" that "do not require a bot user or authentication". This could be used to automate large data inputs; Discohook, or sending webhooks from Python via the Svix API, is another option for data input.
Referred Doc: https://support.discord.com/hc/en-us/articles/228383668-Intro-to-Webhooks
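As a point of reference for the data-input idea above, posting to a webhook only requires an HTTP POST with a JSON body; the sketch below assumes a webhook URL copied from Server Settings > Integrations > Webhooks.

```python
import requests

# Placeholder: copy the real URL from Server Settings > Integrations > Webhooks.
WEBHOOK_URL = "https://discord.com/api/webhooks/<id>/<token>"

def post_to_channel(content: str) -> None:
    """Send a plain-text message to the channel the webhook is bound to."""
    resp = requests.post(WEBHOOK_URL, json={"content": content})
    resp.raise_for_status()

# Example: push a batch of test messages into the controlled environment.
for line in ["test message 1", "test message 2"]:
    post_to_channel(line)
```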
Integrated Apps is a customization feature that allows external content moderation tools to be integrated into the server. There is a wide variety of content moderation tools that many servers use.
Perhaps Discord does not need to provide such a comprehensive and customizable in-house content moderation tool due to the external plug-ins.
The most popular apps include MEE6, Dyno, and ProBot. Another option in use is YAGPDB.xyz.
MEE6 includes a series of customizable content moderation tools, such as keyword filters and excessive-emoji detection, as well as automated actions.
Setup of this bot in a Discord server is linked here.
Moderator Options:
YAGPDB seems to provide more customization and broader access to custom rulesets for the server. From the moderation section, we have access to a basic automoderator bot and an advanced bot (Docs). Most of the moderation tools focus on user verification, such as user age, account age, and email and phone number. More content-related rules include blacklists and whitelists for nicknames, keywords, and links. There is GitHub access but no direct access to LLMs.
Moderation Tab under Settings
- Safety Setup: Verification Level (ensures members are verified) and more, but most interestingly an Explicit Image filter that offers preset levels of moderation but is not customizable or backend-accessible.
- AutoMod (Docs): Includes a set of custom rules-based moderation and built-in machine-learning detection, such as for links to malware. The customizations include keyword detection (plus a wildcard flag for prefixes and suffixes) based on the categories of Commonly Flagged Words:
  - Insults and Slurs - Protect members of your Community from personally insulting material targeted at them, including terms that may be considered slurs or hate speech.
  - Sexual Content - Keep sexually explicit language out of your server to keep your Community family-friendly.
  - Severe Profanity - Block the more egregious forms of profanity, while still allowing for mild forms of cursing or swearing.
as well as spam content and mention spam rules:
- Block Spam Content Rule - Enable this rule to detect messages containing unwanted spammy text content that disrupts your experience on Discord, such as unsolicited messages or advertisements (free Nitro) and invite spam.
- Block Mention Spam Rule - Set a limit on the number of mentions a message may contain. Once configured, AutoMod can detect and block messages containing excessive user or role mentions and help prevent your members from receiving unnecessary notifications and pings.
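For comparison with the keyword sketch earlier, a mention spam rule uses a different trigger type and a mention limit instead of a keyword list. The payload below is an assumed example against the same Auto Moderation endpoint, again with a placeholder token and guild ID.

```python
import requests

BOT_TOKEN = "YOUR_BOT_TOKEN"      # placeholder bot token
GUILD_ID = "123456789012345678"   # placeholder server ID

mention_spam_rule = {
    "name": "Example mention spam limit",
    "event_type": 1,                                  # MESSAGE_SEND
    "trigger_type": 5,                                # MENTION_SPAM
    "trigger_metadata": {"mention_total_limit": 5},   # block messages with more than 5 user/role mentions
    "actions": [{"type": 1}],                         # BLOCK_MESSAGE
    "enabled": True,
}

resp = requests.post(
    f"https://discord.com/api/v10/guilds/{GUILD_ID}/auto-moderation/rules",
    headers={"Authorization": f"Bot {BOT_TOKEN}"},
    json=mention_spam_rule,
)
print(resp.status_code, resp.json())
```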
2. Manual Integration of External Moderation Tools - Bot Doc
- Under the Developer Portal, Discord allows developer users to build their own applications and integrate their own bots by utilizing a bot token. There is also an option to make the bot private.
There is a framework called discord.js that allows users to build their own bots and deploy them to their Discord servers. Using this Node.js framework, bots can be customized with slash commands, event handling, and more. Discord.py is a Python API wrapper that enables communication with a Discord server as a client and configuration of bots.
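A minimal sketch of the Discord.py route, assuming a placeholder bot token and a made-up blocklist, could look like the following keyword-deleting bot (the message content intent must also be enabled for the bot in the Developer Portal):

```python
import discord

BLOCKED_WORDS = {"badword1", "badword2"}   # placeholder blocklist for illustration
BOT_TOKEN = "YOUR_BOT_TOKEN"               # placeholder token from the Developer Portal

# The message_content intent is required to read message text.
intents = discord.Intents.default()
intents.message_content = True

client = discord.Client(intents=intents)

@client.event
async def on_ready():
    print(f"Logged in as {client.user}")

@client.event
async def on_message(message: discord.Message):
    if message.author.bot:
        return
    # Naive keyword check; a real moderation bot would rely on AutoMod rules or a trained classifier.
    if any(word in message.content.lower() for word in BLOCKED_WORDS):
        await message.delete()
        await message.channel.send(f"{message.author.mention}, your message was removed.")

client.run(BOT_TOKEN)
```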
The Audit Logs resource provides endpoints to monitor administrative actions and events on Discord servers.
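A hedged example of reading those logs with Discord.py, again with a placeholder token and guild ID, and assuming the bot holds the View Audit Log permission:

```python
import discord

BOT_TOKEN = "YOUR_BOT_TOKEN"        # placeholder
GUILD_ID = 123456789012345678       # placeholder server ID

intents = discord.Intents.default()
client = discord.Client(intents=intents)

@client.event
async def on_ready():
    guild = client.get_guild(GUILD_ID)  # None if the bot is not in this server
    # Print the 10 most recent administrative actions.
    async for entry in guild.audit_logs(limit=10):
        print(f"{entry.created_at}: {entry.user} performed {entry.action} on {entry.target}")
    await client.close()

client.run(BOT_TOKEN)
```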