Discover How to Build a Successful Trust and Safety Strategy with ActiveFence
Trust and Safety has become integral to online businesses and communities, especially as we continue to rely on the internet. In today’s connected world, where millions of people interact daily on online platforms, creating secure and trustworthy spaces is more important than ever.
But the concept of Trust and Safety isn’t just about protecting companies. It’s about making sure that every person interacting, doing business, and living their life online feels safe, respected, and valued, which makes Trust and Safety online a fundamental human right.
At its core, Trust and Safety involves a set of practices, policies, and technologies designed to create an environment where users can safely engage with products, services, and other people—and trust them. This trust is key to building lasting relationships between users and platforms. A strong Trust and Safety framework and digital ethics help keep would-be harms and risks at bay—harms like data breaches, exploitation, fraud, harassment, and misinformation.
These threats have serious effects on both companies and customers. For users, a lack of Trust and Safety measures can lead to emotional distress, financial loss, and a breakdown of trust in online platforms. For businesses, neglecting Trust and Safety can result in legal issues, damaged brand reputation, loss of users and revenue, and regulatory fines and penalties.
Want to better understand the key terms and basics of Trust & Safety? Dive into our detailed Trust & Safety Glossary and get up to speed with the essentials.
The heart of Trust and Safety. Shielding users from harmful behavior and content involves reactive measures like detecting, actioning, and reporting abuse, as well as proactive strategies such as user verification, dedicated rules for new users, and proactive threat detection to prevent threats before they reach users. Platforms often use machine learning to spot abuse patterns and provide tools for users to report, block, and manage interactions. Education is also important so users know how to recognize and avoid potential threats. This empowers them to protect themselves wherever they buy, chat, or play.
Accountability means that both users and platforms are responsible for their actions, with clear and enforceable consequences for violations. This requires setting community guidelines or terms of service that users must follow and platforms must enforce. There should also be ways to appeal decisions or report violations. For platforms, accountability means responding to users and regulators when issues arise, whether through internal investigations, public apologies, or making systemic changes to prevent future problems. Ultimately, accountability strengthens trust by demonstrating that the company takes its responsibilities seriously and is committed to maintaining a safe and fair environment.
Safety by Design is a foundational principle in Trust and Safety that means building safety features into the architecture of digital products from the start. Instead of treating safety as an afterthought, implementing Safety by Design ensures that user protection, privacy, and compliance are prioritized throughout development and deployment. This approach is also crucial for adhering to safety compliance standards: by proactively embedding safety measures, platforms can stay compliant with global regulations as they evolve.
Each platform faces unique risks based on its industry, platform type, and target audience. For example, social networks, gaming platforms, GenAI chatbots, dating apps, and e-commerce sites all encounter distinct safety concerns. The target audience further influences these risks—platforms designed for children face different challenges than those aimed at adults. Risk mapping is essential for identifying these threats and should be customized to each platform’s specific needs. This is a foundational step in developing an effective safety policy that addresses particular vulnerabilities.
Once risks are identified, the next step is creating clear, enforceable policies to manage threats. Effective Trust and Safety policies define acceptable behavior, outline consequences for violations, and adapt to new threats and abuse tactics. Staying ahead of emerging risks is crucial. Policies should target specific abuse areas, like child safety, to address unique challenges effectively. They should also reflect company values, be clear to users, and be enforceable by moderators. Collaboration across legal, security, and user experience teams ensures policies remain comprehensive, legally sound, and user-friendly.
An all-important function that includes filtering, reviewing, and managing user-generated content. Content moderation requires balancing automated tools, which can quickly scan and flag inappropriate content, with human moderators, who handle more nuanced decisions, like cases involving hate speech or misinformation. Platforms must also continually update their moderation policies to reflect company values, social norms, novel abuse tactics, and evolving legal frameworks so that the content hosted on their sites promotes a safe and respectful environment.
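To make this concrete, here is a minimal sketch, written in Python with a placeholder classifier and made-up thresholds, of how a platform might split work between automated enforcement and human review. It illustrates the general pattern only, not any specific vendor’s implementation.

```python
# Minimal sketch of a hybrid moderation flow: clear-cut cases are handled
# automatically, while ambiguous content is routed to human reviewers.
# The classifier, blocklist, and thresholds are illustrative placeholders.

from dataclasses import dataclass

@dataclass
class ModerationResult:
    decision: str   # "remove", "human_review", or "allow"
    reason: str

BLOCKLIST = {"example-slur", "example-scam-link"}  # hypothetical denylist terms

def classify_risk(text: str) -> float:
    """Stand-in for an ML model; returns a risk score between 0 and 1."""
    return 0.9 if any(term in text.lower() for term in BLOCKLIST) else 0.1

def moderate(text: str) -> ModerationResult:
    score = classify_risk(text)
    if score >= 0.85:
        return ModerationResult("remove", f"high-confidence violation (score={score:.2f})")
    if score >= 0.40:
        return ModerationResult("human_review", f"ambiguous content (score={score:.2f})")
    return ModerationResult("allow", f"low risk (score={score:.2f})")

if __name__ == "__main__":
    print(moderate("check out this example-scam-link"))
    print(moderate("great stream tonight, thanks everyone"))
```

In practice the thresholds are tuned per policy area; the middle band is what keeps human judgment in the loop for nuanced cases.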
Another critical aspect of Trust and Safety is effectively managing incidents when harmful content slips through a platform’s defenses. Trust and Safety teams must have a well-defined response plan to react swiftly and smoothly. This includes identifying and assessing the severity of the content, deciding the appropriate enforcement action (such as removal, user warnings, or bans), and, when necessary, escalating to legal authorities or specialized teams. A rapid, coordinated response helps mitigate harm and reinforces platform safety.
A pillar of Trust and Safety closely linked to Cybersecurity. It protects user data from breaches, unauthorized access, and misuse. This includes implementing cyber threat management systems with robust encryption, secure storage solutions, anti-malware tools, fraud prevention, and stringent access controls to keep sensitive information safe. Data security also involves regular audits, vulnerability checks, and compliance with laws like the General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA). Educating users on best practices for protecting data, like using strong passwords and two-factor authentication, can also minimize risks.
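As a small illustration of these basics, the sketch below shows salted password hashing with constant-time verification using only the Python standard library. The work factor and structure are illustrative, not a recommended security policy.

```python
# Minimal sketch of safe credential storage: never keep raw passwords,
# derive a salted key instead, and compare in constant time on login.

import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # illustrative work factor, not a recommendation

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key

def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_key)  # constant-time comparison

if __name__ == "__main__":
    salt, key = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, key))  # True
    print(verify_password("wrong password", salt, key))                # False
```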
Transparency means clearly communicating a platform’s policies, practices, and decisions. This includes explaining how data is collected, used, and protected, and how content moderation and user behavior are managed. Transparency reports, which showcase efforts to improve user safety, have been around for a while. Initially voluntary, they often focused on specific issues like child abuse and terrorism. However, recent regulations like the EU’s Digital Services Act, the UK’s Online Safety Bill and others now require regular reporting of moderation activity and enforcement actions, with fines and penalties for non-compliance. By being transparent, platforms build trust with their users and reinforce accountability.
In the last decade, new regulations and legal frameworks have emerged to ensure platforms operate responsibly while safeguarding users’ rights. These rules vary by region and industry, reflecting differing needs and concerns as the regulatory climate intensifies worldwide. As platforms face increased scrutiny, following these laws is essential for maintaining trust and safety online. Here are some of the most prominent global regulations:
The Digital Services Act (DSA), introduced by the European Union (EU), was launched to establish a unified set of rules for digital service providers operating in the EUʼs 27 member states. It builds on existing e-commerce rules and adds new responsibilities for platforms with users in the EU, especially for Very Large Online Platforms (VLOPs). Key requirements of the DSA include detecting and swiftly removing illegal content, enhancing user control over content visibility, and providing transparency reports. It also requires VLOPs to conduct mandatory risk assessments to combat disinformation and other harmful content. Starting in 2024, non-compliance with the DSA could lead to fines of up to 6% of a company’s annual global revenue.
The United Kingdom’s Online Safety Bill aims to protect users from harmful content by enforcing strict online safety rules. It targets 13 categories of illegal content, including child sexual abuse material (CSAM), drugs, and terrorism-related content. The bill also addresses issues like disinformation and cyberbullying, with significant fines for non-compliance.
COPPA is a U.S. law from 1998 that protects the online privacy of children under 13. It requires websites and services targeting minors, or collecting their data, to receive parental consent before collecting, using, or disclosing personal information. COPPA also requires companies to provide clear privacy policies, limits data collection to only what’s necessary, and allows parents to review and delete their child’s data. The Federal Trade Commission (FTC) enforces COPPA, and violations can lead to major penalties. COPPA is especially important for platforms and apps in the education and entertainment industries.
In July 2024, the U.S. Senate passed the Kids Online Safety and Privacy Act (KOSPA) with overwhelming support, voting 91-3 in favor. This landmark legislation aims to protect children online by requiring platforms to limit addictive features, ban targeted ads to minors, restrict data collection, and enforce stricter default privacy and safety settings. KOSPA also addresses harmful content related to suicide and self-harm, eating disorders, and substance abuse. Although it still needs approval from the House, KOSPA represents a significant step forward in child online safety.
These regulations and others create a complex legal framework that companies must navigate to maintain Trust and Safety on their platforms. Compliance not only helps avoid legal issues but also builds credibility and trust with users by showing a commitment to protecting their rights and personal data.
In the Trust and Safety ecosystem, people who spread harmful material and abuse online platforms are called bad actors or threat actors. This illicit group includes pedophiles, extremists, state agents, and other cybercriminals, each with different goals, tactics, and abusive behaviors. Many of them use malicious bot accounts as a cheap, automated way to spread misinformation, CSAM, and terrorist propaganda.
These actors often operate across different platforms for various purposes: mainstream platforms to lure unsuspecting victims, more secluded spaces like the dark web to interact with each other and share advice, and a mix of environments to evade detection. This varied and overlapping activity creates major challenges for platforms trying to stop harmful activities and, when necessary, involve legal authorities.
Over the short course of its existence, the Trust and Safety industry has made significant strides in ensuring digital safety and protecting users online. Initially, its focus was on basic content moderation, removing illegal or unsafe content from social media platforms. Over time, it has expanded to encompass various online platforms where users create content. Today, the industry not only meets legal requirements but also reflects the values and standards that platforms want to uphold. As user behaviors, regulations, and expectations have changed, the industry has adapted, becoming more critical as digital interactions grow more complex. Several pivotal moments have shaped its evolution, leading to where it stands today. For a deeper look at these milestones, check out our history infographic.
While all industries benefit from Trust and Safety, certain sectors rely more heavily on robust safety measures due to the nature of their services and the sensitive information they handle.
Social media platforms like TikTok, Facebook, Instagram and others depend heavily on Trust and Safety due to their vast user bases and the fast pace of user interactions. These user-generated content (UGC) platforms must constantly monitor and moderate content to prevent the spread of illegal or harmful material, such as hate speech, misinformation, and CSAM. They also face the challenge of balancing free expression with user protection, using both algorithms and human moderators to remove abusive content while respecting free speech. Additionally, they must navigate the complexities of global regulations, as definitions of harmful content differ across countries.
Online marketplaces like Amazon and eBay need safety policies in place to reduce buyer-side and seller-side risk, and to prevent fraud. This means using safety measures like secure payment systems, verifying the authenticity of products, and giving users tools to report scams or counterfeit goods. For customers, knowing their financial information is safe and that they can trust sellers is vital for building loyalty and encouraging repeat business. E-commerce platforms also must protect users from privacy breaches and data misuse, which can lead to major legal and financial consequences.
The video game industry requires real-time action to combat in-game toxicity and keep players safe. Moderators use tools that analyze comments, chats, usernames, and accompanying metadata to detect and block inappropriate conversations, making the platform a safe space for everyone. Gaming platforms also address toxic behavior by moderating players in real time and taking action against repeat offenders, and they must detect and stop account takeovers (ATOs) to combat cheating, prevent revenue loss, and protect the integrity of the gaming experience.
Dating apps and platforms need to create a safe environment to build user trust for those seeking romantic relationships. They must prevent a wide range of online abuses, including romance scams, fraud, human trafficking, sexual exploitation, harassment, and non-consensual intimate imagery (NCII). To do this, platforms often use advanced AI moderation to help make accurate and efficient decisions. This helps dating platforms maintain user trust while minimizing risks and costs linked to harmful activity.
A Trust and Safety team is made up of diverse professionals who work together to create and maintain a secure, respectful, and compliant online environment. Here’s a breakdown of the key roles:
Policy experts are essential as they develop and update the rules that govern user behavior on platforms. They make sure these policies align with the platform’s core values, and stay ahead of new threats and changing regulations. They also collaborate with other team members to align policies with legal requirements and industry standards.
Content moderators are on the front lines, reviewing and managing user-generated content to ensure it meets community guidelines. They handle high volumes of UGC that require quick, accurate decisions and work to filter out harmful material like extremism, CSAM, hate speech, and misinformation. Because the job involves frequent exposure to distressing content, it often causes emotional trauma and leads to high turnover, so moderators need support systems that build resilience. Another way to minimize moderator harm is to reduce workloads and exposure with automated tools that handle clearly harmful content without moderator review, which also improves productivity and efficiency. Moderators’ insights are essential for improving moderation tools and shaping policies, ensuring platforms remain safe and welcoming for all users.
Security analysts protect the platform’s digital infrastructure by monitoring for cyber threats like hacking and phishing. They implement and maintain security measures to safeguard the platform and its users. Their work involves identifying vulnerabilities, analyzing security incidents, and responding to threats in real-time. Security analysts ensure that strong security practices are integrated throughout the platform’s operations, from login processes to data storage.
In the realm of Trust and Safety, User Experience (UX) professionals are particularly prominent in the gaming industry, where dedicated T&S teams are less common. UX teams ensure a positive user experience by integrating Trust and Safety measures without disrupting the natural flow of gameplay. Their goal is to create a safe environment where players can express themselves freely. By implementing features like intuitive reporting tools and privacy settings, they strike a balance between user safety and an engaging gaming experience.
Legal advisors and in-house legal teams guide the Trust and Safety team through complex regulations, ensuring that platforms comply with local and international laws on data protection, content moderation, and user privacy. They also handle legal disputes and challenges, protecting the platform from liabilities while ensuring that all practices are ethical and legal.
Data scientists play a critical role by analyzing large datasets to detect patterns of abuse and improve Trust and Safety measures. They spot trends, like spikes in harmful content or changes in user behavior, that could signal new threats or enforcement gaps. They also develop and refine algorithms for content moderation, fraud detection, and security, helping the team make informed, data-driven decisions.
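For example, one simple way to surface a spike in harmful-content reports is to compare each day’s count with its trailing average. The sketch below, with invented numbers and thresholds, illustrates the idea; real pipelines would use far richer signals.

```python
# Illustrative spike detection: flag days whose report count is a
# statistical outlier relative to the preceding window. Data, window
# size, and threshold are made up for the example.

from statistics import mean, stdev

def find_spikes(daily_counts: list[int], window: int = 7, z_threshold: float = 3.0) -> list[int]:
    """Return indices of days that stand out against the prior window."""
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and (daily_counts[i] - mu) / sigma > z_threshold:
            spikes.append(i)
    return spikes

if __name__ == "__main__":
    reports = [40, 38, 45, 41, 39, 44, 42, 43, 40, 160, 44, 41]  # hypothetical daily report counts
    print(find_spikes(reports))  # [9] -- the jump to 160 is flagged
```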
Product managers and engineers design and build the systems that platforms use for Trust and Safety. Engineers develop machine learning models to automate and scale enforcement against policy violations, creating tools for user reporting, internal reviews, and content moderation. Product managers partner with engineering, policy, and other teams to drive the strategy, vision, and execution for reducing harmful content. Both groups ensure these systems are scalable, efficient, and aligned with safety guidelines, continuously improving them to meet changing needs.
Professional communities are vital for advancing Trust and Safety initiatives by fostering collaboration, knowledge sharing, and continuous improvement among experts. One such community is the Trust & Safety Professional Association (TSPA), which provides resources, networking, and education on best practices. TSPA helps Trust and Safety teams stay informed on emerging challenges and develop effective strategies.
Industry events like TrustCon, the TSPA Summit, and the Marketplace Risk Global Summit provide opportunities for professionals to meet, share insights, and address new issues in Trust and Safety. These events encourage collaboration and innovation across the field. Following top Trust & Safety influencers also helps professionals stay informed on the latest trends.
ActiveFence contributes to this community by establishing the Trust and Safety Collective, a platform where Trust and Safety professionals can exchange knowledge and experiences. This initiative fosters connections among those committed to creating safer online environments. Apply to become a member here!
For more on the role of Trust & Safety teams in creating a safer online ecosystem, check out our Trust & Safety Industry eBook.
In Trust and Safety, strategies can be broadly categorized as proactive or reactive. Reactive strategies address harmful behavior after it occurs, relying on moderation processes where the platform’s community or others flag content that violates policies. Because these methods depend on users reporting issues, harmful content may be seen by others before action is taken. Key measures here include user reporting or flagging systems, which should be easy to use and offer feedback on report statuses, helping build trust and encouraging user participation.
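The sketch below shows that reporting flow in miniature: a report is filed, given an identifier, and its status can be queried later so the reporter receives feedback. The field names and statuses are hypothetical, chosen only to illustrate the pattern.

```python
# Minimal sketch of a user reporting flow with status feedback.
# Statuses and fields are placeholders, not a real product's schema.

import itertools
from dataclasses import dataclass

@dataclass
class Report:
    report_id: int
    reporter_id: str
    content_id: str
    reason: str
    status: str = "received"   # received -> under_review -> resolved

class ReportCenter:
    def __init__(self):
        self._ids = itertools.count(1)
        self._reports: dict[int, Report] = {}

    def submit(self, reporter_id: str, content_id: str, reason: str) -> int:
        report = Report(next(self._ids), reporter_id, content_id, reason)
        self._reports[report.report_id] = report
        return report.report_id   # returned so the reporter can check progress later

    def update_status(self, report_id: int, status: str) -> None:
        self._reports[report_id].status = status

    def status_for(self, report_id: int) -> str:
        return self._reports[report_id].status

if __name__ == "__main__":
    center = ReportCenter()
    rid = center.submit("user_123", "post_987", "harassment")
    center.update_status(rid, "under_review")
    print(rid, center.status_for(rid))  # 1 under_review
```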
Proactive strategies aim to prevent harmful behavior before it happens. This includes using advanced tools like automated detection and proactive threat intelligence systems to identify and reduce risks early, ideally before they cause real-world harm. Staying ahead of emerging threats, understanding new abuse trends, and preemptively countering potential risks before they escalate is essential. For instance, comprehensive systems that combine technology and human intelligence, such as ActiveFence’s Threat Intelligence Solutions, enable platforms to detect and analyze threats from various sources, including the clear, deep, and dark web. These systems support agile threat intelligence operations across multiple languages, helping platforms manage inauthentic activity and moderate harmful content more effectively.
A well-rounded Trust and Safety strategy typically combines proactive and reactive approaches to provide comprehensive protection for users and the platform.
Like many areas of life, technology plays a crucial role in Trust and Safety efforts. Automation and AI are essential for digital platforms, enabling real-time monitoring and swift responses to threats, making content moderation more scalable and efficient.
AI-driven moderation systems can automatically detect and flag harmful content, directing only ambiguous cases to human moderators. This reduces moderator workloads and limits exposure to distressing material, helping prevent burnout and lowering turnover rates. These technologies analyze large datasets to spot patterns of abuse and emerging threats, ensuring consistent enforcement of community guidelines and reducing the risk of harmful content slipping through.
AI detection tools like ActiveScore use machine learning to automatically detect harmful content, assessing and ranking it based on risk. This helps platforms identify and prioritize urgent issues more efficiently. AI moderation tools like ActiveOS go a step further by using predefined rules to take automatic action on harmful content without the need for moderator intervention. Regular updates to these systems are crucial for adapting to new threats and ensuring ongoing effectiveness in maintaining a safe online environment.
While automated tools offer efficiency and scalability, human moderators remain vital for handling complex situations that require judgment and context. The optimal approach combines both—using automated detection tools for routine moderation and human expertise for more nuanced cases.
Trust and Safety tools have become indispensable for platforms looking to effectively moderate content and protect users. Until recently, there were few dedicated Trust and Safety solutions on the market, leaving teams to rely on a patchwork of tools not designed for their unique needs. This often involved using a mix of messaging tools, spreadsheets, backend management systems, and modified case management software—an approach that quickly becomes complicated and inefficient.
Dedicated tools now provide a comprehensive solution, offering seamless integration with multiple data sources and APIs to deliver a holistic view of potential risks. Features like queue management help Trust and Safety teams prioritize content based on risk level and type of abuse, ensuring efficient use of resources. Additionally, automated actions allow platforms to take immediate action based on predefined rules, minimizing the burden on human moderators.
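As a rough illustration of risk-based queue management, the following sketch orders items for review by a combined score of model risk and abuse-type priority. The weights and categories are invented for the example and are not any product’s actual rules.

```python
# Sketch of a review queue that surfaces the highest-risk items first,
# with certain abuse types (e.g. child safety) boosted ahead of others.

import heapq
import itertools

CATEGORY_BOOST = {"child_safety": 0.3, "terrorism": 0.2, "spam": 0.0}  # hypothetical weights

class ReviewQueue:
    def __init__(self):
        self._heap = []
        self._tiebreak = itertools.count()

    def add(self, item_id: str, risk_score: float, category: str) -> None:
        priority = risk_score + CATEGORY_BOOST.get(category, 0.0)
        # heapq is a min-heap, so negate the priority to pop the highest risk first
        heapq.heappush(self._heap, (-priority, next(self._tiebreak), item_id))

    def next_item(self) -> str:
        return heapq.heappop(self._heap)[2]

if __name__ == "__main__":
    queue = ReviewQueue()
    queue.add("post_1", 0.55, "spam")
    queue.add("image_2", 0.50, "child_safety")
    queue.add("video_3", 0.90, "terrorism")
    print(queue.next_item())  # video_3 first, then image_2, then post_1
```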
Given the complexities of building such tools in-house, many platforms turn to specialized Trust & Safety vendors.
A critical decision for companies is whether to develop Trust and Safety capabilities in-house or to outsource them by buying off-the-shelf solutions. This Build vs. Buy dilemma involves weighing factors like cost, control, expertise, and scalability against the company’s own resources and experience.
Building an in-house solution from scratch may offer complete control and customization, but it requires significant investment in technology, expertise, and ongoing maintenance, which may not be feasible for many organizations. For businesses without experience in this area, creating such systems can be prohibitively costly and resource-intensive.
Alternatively, buying off-the-shelf tools provides quick access to specialized capabilities, often at a lower cost. These tools are regularly updated and stay compliant with regulations, allowing companies to focus on their core operations. This is particularly advantageous for smaller platforms that lack the resources to build their own Trust and Safety tools. However, it may limit control and customization.
Some organizations opt for a hybrid approach, blending custom-built features with outsourced solutions to balance customization and cost-efficiency. This allows them to leverage the strengths of both approaches, integrating tailored features where needed while relying on ready-made tools for standard tasks. This flexible model is scalable and adapts to evolving threats and operational needs.
For more on making this decision, check out our Buyer’s Guide to Trust & Safety Tools and a checklist for choosing the right vendor for your needs.
Since the 2022 launch of ChatGPT, Generative AI (GenAI) has seen a meteoric rise, with platforms like Bard and Copilot forever changing the internet and how we communicate. These advanced technologies enable the easy creation of text, video, and audio content. However, as with any new technology, bad actors are early adopters, quickly learning to misuse and abuse these tools for malicious purposes. This creates new challenges for Trust and Safety teams trying to safeguard online spaces.
The rapid adoption of Generative AI tools has introduced unprecedented risks for the Trust and Safety community. Offenders like child predators, terrorists, and racists now have easier access to create highly realistic, harmful content. Unlike established UGC platforms, new GenAI platforms face a complex threat landscape: they offer advanced tools that can be exploited by bad actors while lacking the experience and infrastructure to effectively mitigate safety risks.
Generative AI’s multimodal capabilities—including text, image, and audio generation—allow bad actors to create realistic yet fake content, like deepfakes and synthetic media. These can be used to disseminate misinformation, impersonate real people to facilitate scams, and manipulate public opinion.
Some of the top GenAI dangers include audio impersonation scams and the spread of election disinformation. Child predators also exploit these tools to produce and distribute realistic CSAM, including novel CSAM videos, to create non-consensual intimate images (NCII), and to get advice on sextortion and online grooming.
Additionally, large language models (LLMs) and AI applications like chatbots further complicate matters, as they can be weaponized for phishing, harassment, and misinformation. There are also legal challenges related to IP and copyright infringement, as AI-generated content can mimic copyrighted material, raising liability concerns. Companies deploying enterprise GenAI applications also face risks like internal data leaks and regulatory compliance issues.
As AI grows more prominent, the focus on AI safety intensifies, making it essential to integrate Trust and Safety expertise into AI development and deployment. Terms like GenAI safety, AI risk management, Responsible AI, and system integrity are not just trending buzzwords but top priorities for companies using LLMs. This shift requires Trust and Safety teams to develop new skills and strategies to manage this evolving landscape.
Despite the advancements in AI, the principle of Safety by Design remains fundamental, just as it is for other digital platforms. By integrating safety measures from the beginning—during training, deployment, and every new version—companies can reduce the risks associated with AI tools. This proactive approach embeds safety at every stage of development, helping to minimize potential harm.
To build a practical AI Safety framework, companies should adopt several key mitigation strategies, such as safety guardrails that screen both user prompts and model outputs before they reach users, as illustrated in the sketch below.
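The example below shows that guardrail pattern in its simplest form: checks run on the prompt before generation and on the response before display. The keyword lists and the `call_model` function are placeholders standing in for real policy models and a real LLM call; this is not a production safety filter.

```python
# Minimal sketch of an input/output guardrail around a GenAI application.
# Blocklists and the model call are hypothetical placeholders.

BLOCKED_PROMPT_TERMS = {"how to make a weapon"}        # hypothetical policy terms
BLOCKED_RESPONSE_TERMS = {"here is the exploit code"}  # hypothetical policy terms

def call_model(prompt: str) -> str:
    """Placeholder for the actual LLM call."""
    return f"(model response to: {prompt})"

def guarded_generate(prompt: str) -> str:
    if any(term in prompt.lower() for term in BLOCKED_PROMPT_TERMS):
        return "Sorry, I can't help with that request."                    # input guardrail
    response = call_model(prompt)
    if any(term in response.lower() for term in BLOCKED_RESPONSE_TERMS):
        return "The generated response was withheld for safety review."    # output guardrail
    return response

if __name__ == "__main__":
    print(guarded_generate("Write a friendly welcome message for new users"))
    print(guarded_generate("How to make a weapon at home"))
```

Real deployments replace the keyword checks with dedicated safety classifiers and policy models, but the placement of the checks, before and after generation, is the core of the approach.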
Looking ahead, Trust and Safety in the age of GenAI will require continuous adaptation to new technologies and threats. Platforms need to evolve their strategies to keep pace with rapid advancements in AI.
Trust and Safety are vital for the long-term success and growth of online platforms, far beyond merely meeting regulatory requirements. With users sharing personal information and interacting regularly, robust Trust and Safety measures are essential to build user confidence. This confidence fosters greater engagement, loyalty, and a willingness to share personal data—especially important for platforms handling sensitive information.
A safe, respectful online environment not only improves user retention but also attracts advertisers, which is vital for the profitability of most digital platforms. Platforms with strong Trust and Safety practices also appeal to investors and partners, have reduced legal risks, and can better protect their brand reputation. This creates a positive cycle that enhances user satisfaction and gives the platform a competitive edge.
The importance of Trust and Safety cannot be overstated. As discussed in the State of Trust and Safety 2024 report, maintaining user trust is crucial for platform longevity and engagement. By making Trust and Safety a core focus, organizations can create a reliable, supportive online environment that encourages community growth and secures their future in the rapidly evolving digital world.
Looking forward, the digital landscape will continue to change, necessitating continuous adaptation to new threats and technologies. Trust and Safety strategies must evolve alongside these changes to ensure that online communities remain safe and trustworthy for all users.
ActiveFence is the leading provider of Trust and Safety solutions, safeguarding online platforms and their users from the widest range of harms and abuses. Trusted by Trust and Safety teams of all sizes, we help keep over three billion users safe from threats like child abuse, exploitation, disinformation, hate speech, terror, fraud, and more.
Our comprehensive solutions combine deep intelligence research, AI-driven harmful content detection, generative AI safety and a robust moderation platform, empowering global platforms to operate safely and responsibly in over 100 languages.
Backed by leading Silicon Valley investors, ActiveFence has raised $100M to date and employs over 300 people worldwide. By providing one complete solution for Trust & Safety and AI safety, we enable safe, productive online interactions and help platforms thrive.
Learn More about ActiveFence’s Trust and Safety Solutions