Our product team is continually delivering new features and enhancements for ActiveOS and ActiveScore, including new AI models, enhanced media format coverage, and more.
Here are this month's new releases; check out the details of each feature below:
Detecting and stopping communication related to drug solicitation is becoming more challenging as bad actors constantly find new ways to evade detection. Not to mention, the prevalence of illegal content poses legal risks.
To tackle this issue, our team has developed ActiveScore’s new drug solicitation contextual AI model. It adds a signal alongside our other ActiveScore models to detect illegal drug activity, including violative usernames, keywords, images, or text. The model has been trained on intelligence collected by our in-house domain experts. By analyzing the context of conversations and understanding the use of euphemisms, slang, emoji, and code words, it can automatically detect content that often slips past standard detection methods.
You can access the model, like any of our other ActiveScore models, with one API integration. Each analyzed item will return a risk score from 0 to 100, indicating the likelihood of it containing drug solicitation. The results will also include associated indicators and descriptions.
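To make the response format concrete, here is a minimal sketch of interpreting a result like the one described above: a 0–100 risk score plus associated indicators. The payload field names (`risk_score`, `indicators`, `description`) are assumptions for illustration, not the documented ActiveScore schema.

```python
# Illustrative only: field names are assumed, not the official API schema.
def summarize_result(result: dict, threshold: int = 70) -> str:
    """Turn a risk-score payload into a short verdict string."""
    score = result["risk_score"]  # 0-100 likelihood of drug solicitation
    indicators = ", ".join(i["description"] for i in result.get("indicators", []))
    verdict = "violative" if score >= threshold else "review"
    return f"{verdict} (score={score}; indicators: {indicators or 'none'})"

sample = {
    "risk_score": 86,
    "indicators": [{"description": "drug slang detected"}],
}
print(summarize_result(sample))
```

A consumer of the real API would apply the same pattern to each analyzed item returned by the integration.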
The drug solicitation model can also be combined with our other models, such as drug images, violative usernames, and keywords, using the ActiveOS policy management tool. This allows you to customize your coverage based on relevant policies, addressing illegal drug activities from multiple angles.
Our base models improve daily thanks to feedback loops from moderator decisions. By retraining to account for your unique policies, real-world drift, and the latest findings from our intelligence team, we increase model accuracy over time.
For benchmark information, please see below:
Engaging in illegal activities such as drug solicitation can have serious legal ramifications. Legislation such as the US Senate’s proposed Combating Cartels on Social Media Act of 2023 and the UK’s Online Safety Act 2023 targets this activity directly, and non-compliance can result in legal fines.
To ensure compliance and mitigate the risks, you can use ActiveOS codeless workflows to enforce against illegal drug activities at scale. For example, you can build a workflow that automatically removes any item with a risk score over 70 and promptly reports it to the relevant authorities. Items with lower risk scores can be routed to a high-priority queue for moderator review.
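The example workflow above reduces to a simple threshold rule. The sketch below expresses that rule in code purely to make the routing logic explicit; the actual ActiveOS workflows are codeless, and the action names here are hypothetical.

```python
def route_item(risk_score: int, removal_threshold: int = 70) -> str:
    """Mirror of the example workflow: auto-remove high-risk items,
    queue everything else for moderator review.
    Action names are illustrative, not ActiveOS identifiers."""
    if risk_score > removal_threshold:
        return "remove_and_report"    # auto-remove and report to authorities
    return "high_priority_queue"      # send to moderators for review
```

For instance, an item scoring 80 would be removed and reported, while an item scoring 50 lands in the review queue.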
Usernames are a representation of a user’s identity or brand. But while most select non-offensive names, a troubling minority deliberately choose offensive, abusive, or toxic usernames. They may employ evasive terminology to convey hate speech, illegal content, or profane language.
To address this, we developed AI models specifically designed to detect violative usernames. They are trained to overcome the unusual structure of usernames, which frequently combine l33tspeak, misspellings, letters, symbols, numbers, and unique phrases that may seem benign out of context. With these models, clients can weed out users with violative usernames, which often indicate involvement in illegal activities on the platform. This helps maintain a safe environment by stopping violative users at first touch.
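To illustrate why l33tspeak defeats naive string matching, here is a toy normalization step that folds common character substitutions before comparison. This is an assumption-laden sketch for intuition only; the actual username models are contextual AI, not a substitution table.

```python
# Toy example: fold common l33tspeak substitutions before matching.
# The real detection models are contextual AI; this table is illustrative.
L33T_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

def normalize_username(name: str) -> str:
    """Lowercase a username and undo simple character substitutions."""
    return name.lower().translate(L33T_MAP)

print(normalize_username("h4t3_Sp34k"))  # -> "hate_speak"
```

A plain blocklist would miss "h4t3_Sp34k" entirely, which is exactly the evasion pattern the models are trained to handle.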
The violative username models cover the following violations:
Similar to all our ActiveScore models, we constantly improve accuracy over time through feedback loops, ensuring that they remain effective and up-to-date. Here you can see our general benchmarks for more information:
On average, a small minority of users, just 2-3%, is responsible for creating 40-50% of toxic content. With ActiveOS user-level views, you can easily see any flagged activity and take bulk actions for greater impact with fewer clicks.
Our keywords tool now offers more flexibility and can detect variations in language usage. A new “all languages” option lets you search keywords across every language rather than a specific one, so you won’t miss relevant content in undefined or mismatched languages. For example, evasive keywords promoting terrorism or drugs may not always appear in the language you expect. By enabling “all languages,” you can catch those hard-to-find keywords and take appropriate action.
This feature is particularly useful in the following scenarios:
Below, you can see a sample of the detection changes made for exact, fuzzy, and partial matches, switching between a specified language and all languages:
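The exact, fuzzy, and partial match modes mentioned above can be sketched as follows. This is a toy approximation to convey the distinction between the modes; the actual ActiveOS matching logic and thresholds are not public, and the 0.8 similarity cutoff here is arbitrary.

```python
import difflib

def match(keyword: str, text: str, mode: str = "exact") -> bool:
    """Toy illustration of exact / partial / fuzzy keyword matching.
    Thresholds and tokenization are assumptions, not ActiveOS internals."""
    kw, words = keyword.lower(), text.lower().split()
    if mode == "exact":
        # Whole-word match only.
        return kw in words
    if mode == "partial":
        # Keyword appears inside a longer word.
        return any(kw in w for w in words)
    if mode == "fuzzy":
        # Approximate match tolerates small character swaps.
        return any(
            difflib.SequenceMatcher(None, kw, w).ratio() >= 0.8
            for w in words
        )
    raise ValueError(f"unknown mode: {mode}")
```

For example, the keyword "drugs" fails an exact match against "buy drugz here" but passes a fuzzy match, which is the kind of evasive variation the "all languages" and fuzzy options help surface.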
By extracting and analyzing transcripts from audio or video files, you can now catch potential violations that might be missed in audio- or video-only analyses. This approach offers greater accuracy and speed in detecting violations. Moreover, it recognizes the context and subtleties of language use, further enhancing accuracy.
This works the same as our current APIs: you send the content, we transcribe the text for analysis, and we provide a combined score. The combined risk score is the maximum score received across analyses. For example, if the transcribed text yields a risk score of 80 but the video frames only score 50, we still report a combined score of 80.
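The max-score combination rule from the example above is easy to state in code. The per-modality names below are illustrative, not the API's actual field names.

```python
def combined_score(modality_scores: dict) -> int:
    """Combined risk score is the maximum across modalities,
    e.g. transcript=80 and video_frames=50 combine to 80.
    Modality names are illustrative, not the API schema."""
    return max(modality_scores.values())

print(combined_score({"transcript": 80, "video_frames": 50}))  # -> 80
```

Taking the maximum rather than an average ensures that a strong signal in any single modality is never diluted by weaker signals in the others.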
Stay tuned, as we are continuing to work on many more exciting features and enhancements for ActiveOS.
If you’re interested in learning more or seeing these features in action, we invite you to our ongoing demo series – Demo Tuesdays. It’s a great opportunity to see the product in action, meet with our team, and ask any questions you may have! Alternatively, you can also schedule a 1-1 demo session with us.
Thanks,
The ActiveFence Team