Sometimes posts or pictures get hidden by @system even when they contain no inappropriate content. How can we use the AI tool better?
Toxicity Detection
The toxicity module scans new posts and chat messages and scores them for toxicity across a variety of labels. Those scores are all available in reports, where community moderators can identify content that may not be appropriate for your instance.
And, if you want to go one step further, you can enable automatic flagging of content that crosses a customizable toxicity threshold, which will put the potentially problematic content into the Discourse Review Queue, where it can be manually analyzed by your mod team.
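To make the threshold idea concrete, here is a minimal sketch of how per-label scores could be compared against configurable limits to decide what lands in a review queue. It is not Discourse's actual implementation: the label names, the 0-100 score scale, the threshold values, and the `Post`, `labels_over_threshold`, and `build_review_queue` names are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels and limits on an assumed 0-100 scale (not real site settings).
THRESHOLDS = {"toxicity": 80, "insult": 85, "threat": 70}


@dataclass
class Post:
    post_id: int
    scores: dict[str, float]  # label -> score produced by some classifier


def labels_over_threshold(post, thresholds=THRESHOLDS):
    """Return the labels whose score meets or exceeds the configured limit."""
    return [
        label
        for label, limit in thresholds.items()
        if post.scores.get(label, 0) >= limit
    ]


def build_review_queue(posts, thresholds=THRESHOLDS):
    """Collect (post_id, labels) for every post that crosses at least one limit."""
    queue = []
    for post in posts:
        hits = labels_over_threshold(post, thresholds)
        if hits:
            queue.append((post.post_id, hits))
    return queue


# Made-up example scores.
posts = [
    Post(1, {"toxicity": 12, "insult": 5, "threat": 1}),
    Post(2, {"toxicity": 91, "insult": 88, "threat": 3}),
    Post(3, {"toxicity": 40, "insult": 20, "threat": 75}),
]

review_queue = build_review_queue(posts)
print(review_queue)
```

In this sketch a flagged post is only queued for a human to look at, which mirrors the point above that the mod team still makes the final call.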
NSFW Image Detection
The NSFW module automatically scans every new upload in user posts and classifies each image it finds against what’s usually considered inappropriate content. The classification results are available via reports to your moderator team and, optionally, you can enable automatic flagging of content that crosses a certain threshold.
The “flag” feature on this site works to maintain the quality and relevance of the content. The automatic flagging system helps identify and moderate content that may not be appropriate or relevant to the platform.
Using the artificial intelligence tool better would involve understanding the principles & guidelines of the platform and adhering to them. This can reduce the instances of false flagging. Additionally, knowing the thresholds for various toxic elements in messages can help control the content being posted.
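The trade-off behind threshold tuning can be shown with a tiny sketch: a lower threshold flags more harmless content, while a higher one lets more real violations through. The scores and labels below are invented purely for illustration, and `flag_stats` is a hypothetical helper, not part of any moderation tool.

```python
# Hypothetical (score, actually_inappropriate) pairs; invented for illustration only.
samples = [
    (0.15, False),
    (0.35, False),
    (0.55, False),
    (0.62, True),
    (0.78, False),
    (0.85, True),
    (0.95, True),
]


def flag_stats(samples, threshold):
    """Count harmless items that get flagged and violations that slip through."""
    false_flags = sum(1 for score, bad in samples if score >= threshold and not bad)
    missed = sum(1 for score, bad in samples if score < threshold and bad)
    return false_flags, missed


for threshold in (0.5, 0.7, 0.9):
    false_flags, missed = flag_stats(samples, threshold)
    print(f"threshold={threshold}: false flags={false_flags}, missed={missed}")
```

With this toy data, no single threshold gets both counts to zero, which is why manual review and user reports remain useful alongside automatic flagging.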
Also, understanding how the NSFW module works will help you judge which images are acceptable to upload and which might be flagged. In Discourse communities, it’s best to report any content that you deem inappropriate, even if the AI has not flagged it. The AI is not perfect and might miss some violations. Keeping our community safe is a shared responsibility.
That's because we use so many automation tools here, incl. Google Perspective API, Automattic Akismet API, Discourse’s built-in anti-spam mechanism, and a custom NSFW & sentiment discriminator (from Hugging Face, I guess).