Over the past month, our product team has introduced substantial updates to our models, enhancing their ability to ensure user safety and build trust. Among these updates are advanced features like granular PII (Personally Identifiable Information) detection, which is designed to provide more precise protection for users.
Let’s dive in:
The exposure of Personally Identifiable Information (PII) is a serious threat to user trust and safety, and it is often linked to illegal activities such as sextortion and drug solicitation. As privacy concerns grow and regulatory requirements like CCPA (California Consumer Privacy Act) and GDPR (General Data Protection Regulation) become more stringent, the stakes for failing to address PII violations are higher than ever, with significant legal liability on the line.
To protect user data and ensure compliance, our newly upgraded PII detection model offers adaptable, automatic scanning for personal information across textual communications. The model can detect PII broadly or zero in on specific indicators such as email addresses, phone numbers, or credit card details. This granularity provides more tailored coverage with fewer false positives, ensuring that only the most critical information is flagged for action.
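To make the idea of per-indicator granularity concrete, here is a minimal Python sketch of how such results might be consumed downstream. The response shape, indicator names, and thresholds are assumptions made for illustration only, not the actual ActiveScore API.

```python
# Hypothetical example: applying per-indicator thresholds to granular PII
# detection results. The response shape and indicator names are assumptions,
# not the actual ActiveScore API.

# Per-indicator thresholds let you tune coverage: trigger earlier on high-risk
# indicators (e.g. credit cards) and demand more confidence elsewhere.
THRESHOLDS = {
    "credit_card": 0.70,
    "phone_number": 0.85,
    "email_address": 0.90,
}

def flag_pii(detections):
    """Return only the detections that meet their indicator's threshold."""
    flagged = []
    for d in detections:
        threshold = THRESHOLDS.get(d["indicator"], 0.95)  # unknown indicators: be strict
        if d["confidence"] >= threshold:
            flagged.append(d)
    return flagged

if __name__ == "__main__":
    sample = [
        {"indicator": "email_address", "confidence": 0.97, "span": "jane@example.com"},
        {"indicator": "phone_number", "confidence": 0.62, "span": "555-0100"},
    ]
    print(flag_pii(sample))  # only the email detection clears its threshold
```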
On any platform, shared personal information is often linked to other illegal activities, such as CSAM (Child Sexual Abuse Material), drug solicitation, sextortion, and the exposure of minors’ sensitive data. With ActiveOS’ policy management tool, you can defend your platform against a broader range of illegal activities with precision and efficiency.
For example, by combining ActiveScore’s PII model with our underage detection model, you can ensure that no minors are sharing sensitive personal data. This integrated approach not only safeguards young users but also strengthens your platform’s overall security posture and maintains compliance with regulations like the CCPA.
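As an illustration of this integrated approach, the sketch below combines a hypothetical PII signal with a hypothetical underage signal to escalate the riskiest cases. The score names and thresholds are assumptions, not the actual ActiveScore schema.

```python
# Hypothetical example: escalating content where both a PII signal and an
# underage signal are high. Score names and thresholds are assumptions made
# for illustration, not the actual ActiveScore schema.

def should_escalate(scores, pii_threshold=0.8, underage_threshold=0.7):
    """Escalate when a likely minor appears to be sharing personal data."""
    return (scores.get("pii", 0.0) >= pii_threshold
            and scores.get("underage", 0.0) >= underage_threshold)

if __name__ == "__main__":
    print(should_escalate({"pii": 0.92, "underage": 0.81}))  # True -> escalate
    print(should_escalate({"pii": 0.92, "underage": 0.10}))  # False -> no action
```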
You can also implement codeless workflows to automate responses and streamline your moderation process. These workflows can automatically redact or remove shared personal information without manual review, freeing your moderation team to focus on more complex issues. By automating actions such as banning users or removing content, your platform can maintain compliance and safety standards efficiently and at scale.
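To show the kind of automated redaction such a workflow might perform, here is a minimal Python sketch using simple regular expressions. In ActiveOS these workflows are configured codelessly; the patterns and placeholders below are purely illustrative.

```python
import re

# Hypothetical example: automatic redaction of common PII patterns before a
# message is published. ActiveOS workflows are configured without code; this
# sketch only illustrates the redaction behavior itself.

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_pii(text):
    """Replace detected email addresses and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[REDACTED EMAIL]", text)
    text = PHONE_RE.sub("[REDACTED PHONE]", text)
    return text

if __name__ == "__main__":
    msg = "Reach me at jane@example.com or +1 (555) 010-0100."
    print(redact_pii(msg))
    # Reach me at [REDACTED EMAIL] or [REDACTED PHONE].
```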
Platform leakage, where users shift transactions off-platform, can lead to major risks, including revenue loss, increased chances of fraudulent or illegal activities, and a diminished ability to enforce platform policies. Preventing this behavior is crucial to ensuring user safety, engagement, and retention.
ActiveScore’s PII model is a powerful tool for combating platform leakage. It can detect patterns of off-platform transactions, such as repeated instances of users sharing personally identifiable information like phone numbers or email addresses. By identifying these patterns, you can proactively enforce platform policies using user-level views and bulk actioning.
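The pattern-spotting described above can be approximated with a simple counting heuristic. The sketch below is an illustration only: the event format, time window, and threshold are assumptions, not how ActiveScore works internally.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical example: flagging users who repeatedly share contact details,
# a common precursor to off-platform transactions. The event format, time
# window, and threshold are assumptions for illustration.

def repeat_offenders(events, window=timedelta(days=7), min_hits=3):
    """Return user IDs with at least `min_hits` PII-sharing events in `window`."""
    now = datetime.now()
    counts = defaultdict(int)
    for event in events:
        if event["pii_detected"] and now - event["timestamp"] <= window:
            counts[event["user_id"]] += 1
    return {user for user, n in counts.items() if n >= min_hits}

if __name__ == "__main__":
    now = datetime.now()
    sample = [
        {"user_id": "u1", "pii_detected": True, "timestamp": now - timedelta(days=i)}
        for i in range(4)
    ]
    print(repeat_offenders(sample))  # {'u1'}
```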
ActiveFence models are designed to deliver robust protection against both known and emerging threats, supporting over 17 policies across multiple languages and media formats. Yet, new threats constantly arise, requiring continuous updates and refinements to keep our AI models accurate and effective.
At ActiveFence, our commitment to responsibility ensures our models are continually optimized: correcting biases, enhancing performance, and addressing ethical integrity and security. Through regular audits and feedback loops, we consistently update and fine-tune our models to stay ahead in detecting and mitigating potential threats.
This month’s key model updates include reduced false positives in our drug solicitation, underage (18 and below), and nudity detection models. We’ve also expanded support for ‘prompt style’ texts in our CSAM model. These updates improve model accuracy and strengthen detection and prevention of harmful content.
Stay tuned, as we are always working on more exciting features and enhancements for ActiveOS and ActiveScore. If you’d like to see these new features in action, or learn more about how they can benefit your platform, feel free to schedule a one-on-one demo session with us.
Thanks,
The ActiveFence Team