Building Trust & Safety Leveraging Both Humans and AI

A Trust and Safety (T&S) strategy is now essential for any business that supports any form of user-generated content (UGC). In previous years, there was an assumption that T&S applied only to social media platforms, where most of the visible content is generated entirely by users. However, this has changed as many companies now encourage users and customers to upload reviews, comments, photos, and video.

A T&S strategy focuses on several specific initiatives:

Protecting Users

This is your number one priority. A trust and safety policy aims to create a safe and positive user experience by mitigating risks such as fraud, scams, abuse, and other forms of harm. It also covers content uploaded by users that would be offensive to others, and may involve measures to prevent and address bullying, harassment, hate speech, or inappropriate content.

Data Security

T&S may not be seen as directly focused on data security, but data security matters here because of the implications inherent in asking users to contribute content directly. Your trust and safety policy should aim to secure personal data and privacy, comply with data protection laws, and establish practices for data collection, storage, and sharing.

T&S policies also help ensure the platform operates within the boundaries of local and international laws and regulations. This includes compliance with age restrictions, digital rights and intellectual property regulations, trade laws, and privacy laws (such as GDPR in Europe).

Platform Integrity

Policies are also designed to uphold the reputation and operational integrity of the platform. They set out guidelines for user behavior to maintain a respectful, inclusive, and productive community. This can be especially important in games, where trolls can wreak such havoc for other players that those players choose to spend their time in an alternative game. In other words, they become lost customers.

Each service will use different tools and strategies to implement a robust T&S strategy, but it is important to note that almost all services now require much more than just user-reporting systems. It’s no longer enough to allow communities to police each other.

A comprehensive T&S strategy will now include artificial intelligence (AI) and machine learning for content moderation, robust user-reporting systems, identity verification processes, and user education, and potentially more depending on the specific context, user base, and type of interactions the service facilitates. All of this supplements the significant amount of human moderation that still needs to be done, and clearly lessens the load for frontline moderators.

The Need for Trust & Safety Online

Most of us have been exposed to offensive content online at some point. Our usual reaction is to hit the block or delete button, but it is becoming increasingly important to protect users. Parents do not want their children exposed to anything that could be viewed as harmful, and the potential for damage to corporate reputation has increased as UGC has moved from social media to almost all online platforms.

The challenge is volume. Consider the most popular video service online today, YouTube. Users upload around 720,000 hours of new content daily. Think about that. Every hour that passes in real time sees 30,000 new hours of content uploaded on one single platform.

To watch and approve every video uploaded to YouTube manually would require a team of content moderators the size of a large stadium rock concert crowd, all constantly watching and rating videos. A single day sees more new video content uploaded to YouTube than a single person could watch in their entire life (720,000 hours is roughly 82 years of non-stop viewing).
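
The scale is easier to grasp with the arithmetic written out. The following is a rough sketch using the 720,000-hour daily figure quoted above; the eight-hour moderator shift is an assumption added purely for illustration and does not describe how any platform actually staffs its teams.

```python
# Figure quoted in the article: roughly 720,000 hours of video uploaded per day.
HOURS_UPLOADED_PER_DAY = 720_000

# Assumption (not from the article): one moderator shift covers 8 hours of viewing.
HOURS_PER_SHIFT = 8

# Hours of new video arriving during each hour of real time.
hours_uploaded_per_hour = HOURS_UPLOADED_PER_DAY / 24

# Shifts needed per day if every uploaded hour is watched once at normal speed.
moderator_shifts_per_day = HOURS_UPLOADED_PER_DAY / HOURS_PER_SHIFT

# How long one person would need to watch a single day's uploads without stopping.
years_nonstop = HOURS_UPLOADED_PER_DAY / (24 * 365)

print(f"Uploaded per real-time hour: {hours_uploaded_per_hour:,.0f} hours")
print(f"8-hour moderator shifts needed per day: {moderator_shifts_per_day:,.0f}")
print(f"Years of non-stop viewing per day of uploads: {years_nonstop:.1f}")
```

On these assumptions, keeping pace would take about 90,000 moderator shifts every day, and one day of uploads represents around 82 years of round-the-clock viewing.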

Now add all the other popular social networks and then all the popular platforms that encourage UGC. Think how many millions of products are available on Amazon and almost all of them feature reviews and product photos from customers. Every online retailer asks for reviews. Every travel site asks for photos of hotels or reviews of airlines. Everyone is asking customers to create and upload content.

The challenge is enormous. Tools such as generative AI are therefore not deployed in this environment simply to reduce the cost of keeping users safe. AI is a critically important component of a T&S strategy. But it’s not a simple choice between humans or AI. AI is, of course, a more efficient way to automate elements of keeping users safe, for example a simple search for banned words or phrases, or automated checks for skin tones in images. It is also likely to be more accurate in identifying possible issues.
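
To make the simplest of these automated checks concrete, here is a minimal sketch of a banned word and phrase search. The phrase list and the function name `find_banned_phrases` are hypothetical, chosen only for illustration; a real deployment would use a much larger list maintained by the policy team.

```python
import re

# Hypothetical block list for illustration only.
BANNED_PHRASES = ["buy followers", "free crypto", "idiot"]

# One case-insensitive pattern with word boundaries, so "idiot" matches as a
# whole word but does not match inside a longer word such as "idiotic".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in BANNED_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_banned_phrases(text: str) -> list[str]:
    """Return the banned phrases found in text, lower-cased, in order of appearance."""
    return [match.group(0).lower() for match in _PATTERN.finditer(text)]


if __name__ == "__main__":
    print(find_banned_phrases("Get FREE crypto now, you idiot!"))
    print(find_banned_phrases("Great hotel, lovely staff."))
```

A check like this is fast and cheap to run on every upload, but it only matches exact wording, which is why it is one layer among several rather than a complete solution.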

In summary, AI is required to take on this firehose of constant content updates and uploads. Without AI, this task would be almost impossible and the freedom for users to safely post their content online would be severely curtailed.

How AI Helps to Facilitate Trust & Safety

AI can be used in multiple ways to build a safer environment for your users. In particular, it can be used to identify dangerous or offensive content that is uploaded—so it can be isolated before publication.

Machine learning algorithms can analyze and filter content in real-time to flag inappropriate content, such as nudity, violence, hate speech, misinformation, or spam. This can also help detect deepfakes, with many AI engines working to quickly identify content that has been created using AI models.

Natural Language Processing (NLP) can be used to understand the context of texts or messages and identify harmful language or bullying. This automated oversight can dramatically reduce the amount of content that requires human checks.

AI can also recognize patterns and anomalies in large datasets to identify fraudulent behavior or malicious activities. This is especially important in areas such as financial transactions, where detecting irregularities quickly can prevent large-scale fraud. It also matters in community situations where customers can engage directly with other customers via a corporate platform. If one customer defrauds another on your platform, do you want to argue with the victim in court over who was responsible?
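
As a very simplified illustration of anomaly detection, the sketch below flags transaction amounts that sit far from the typical value. It uses a plain statistical rule (a modified z-score based on the median) as a stand-in for the machine learning models described above; the function name, the sample amounts, and the 3.5 threshold are assumptions for the example.

```python
from statistics import median


def flag_unusual_amounts(amounts: list[float], threshold: float = 3.5) -> list[float]:
    """Return the amounts whose modified z-score exceeds the threshold."""
    centre = median(amounts)
    # Median absolute deviation: the typical distance from the median.
    mad = median([abs(a - centre) for a in amounts])
    if mad == 0:
        return []  # No spread to measure against, so nothing is flagged.
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [a for a in amounts if 0.6745 * abs(a - centre) / mad > threshold]


if __name__ == "__main__":
    # Hypothetical transaction amounts for one customer.
    history = [20, 22, 19, 21, 23, 20, 950]
    print(flag_unusual_amounts(history))
```

A production fraud system would look at many more signals than the amount alone, but the principle is the same: learn what normal looks like, then surface what deviates from it.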

By using predictive analytics, AI can assess the risk associated with a particular user or action, and alert the appropriate teams or take automatic preventative measures. Game players can benefit from these predictions: players who are being mildly abusive, such as using offensive language, can be placed under closer watch for further trust and safety policy violations.
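
One way to picture the "closer watch" idea is a running risk score per player. The sketch below is a rule-based simplification rather than a predictive model, and every name, weight, and threshold in it is hypothetical.

```python
# Hypothetical weights and threshold, chosen only for illustration.
VIOLATION_WEIGHTS = {"offensive_language": 1, "harassment": 3, "scam_attempt": 5}
WATCH_THRESHOLD = 3


class RiskTracker:
    """Accumulates a per-player risk score from detected violations."""

    def __init__(self) -> None:
        self._scores = {}

    def record(self, player_id: str, violation: str) -> int:
        """Add the violation's weight to the player's score and return the new score."""
        weight = VIOLATION_WEIGHTS.get(violation, 1)  # Unknown types count as mild.
        self._scores[player_id] = self._scores.get(player_id, 0) + weight
        return self._scores[player_id]

    def under_watch(self, player_id: str) -> bool:
        """True once the player's score reaches the watch threshold."""
        return self._scores.get(player_id, 0) >= WATCH_THRESHOLD


if __name__ == "__main__":
    tracker = RiskTracker()
    for _ in range(3):
        tracker.record("player_42", "offensive_language")
    print(tracker.under_watch("player_42"))
```

A real system would likely decay scores over time and feed in many more signals, but the sketch shows how mild, repeated behavior can move a player onto a watch list without any human report.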

Once rules are broken, generative AI can also assist in enforcing policies, such as flagging content for review, issuing warnings to users, or, in severe cases, suspending or banning accounts. Automated responses like these were largely scripted in the past, but generative AI can now create them for each situation.

For content moderation this process will be automatic—the offensive content will never even be published. For environments such as games, this allows an automated response to offensive player behavior, rather than waiting for another player to report someone and then waiting again for a human moderator to investigate. The player can be locked out of the game immediately.
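
The enforcement steps described above (flag, warn, suspend, ban) can be sketched as a simple escalation ladder. The severity scale and the cut-offs for repeat offenders are assumptions made for this example, not a recommended policy.

```python
from enum import Enum


class Action(Enum):
    FLAG_FOR_REVIEW = "flag_for_review"
    WARN = "warn"
    SUSPEND = "suspend"
    BAN = "ban"


def choose_action(severity: int, prior_violations: int) -> Action:
    """Pick an enforcement action.

    severity: hypothetical rating from 1 (mild) to 3 (severe) from the detection step.
    prior_violations: number of earlier confirmed violations for this user.
    """
    if severity >= 3:
        # Severe cases: suspend on a first offense, ban repeat offenders.
        return Action.BAN if prior_violations > 0 else Action.SUSPEND
    if severity == 2:
        # Moderate cases: warn first, suspend after two earlier violations.
        return Action.SUSPEND if prior_violations >= 2 else Action.WARN
    # Mild cases: warn repeat offenders, otherwise queue the item for review.
    return Action.WARN if prior_violations > 0 else Action.FLAG_FOR_REVIEW


if __name__ == "__main__":
    print(choose_action(severity=1, prior_violations=0))
    print(choose_action(severity=3, prior_violations=1))
```

Keeping the ladder in one small function makes the policy easy to audit and adjust, which matters when the rules themselves are reviewed by legal and policy teams.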

It is worth stating that AI still cannot protect users in all situations. It needs to be deployed as a first-line solution that can remove offensive content or users, but the AI requires human support for those instances where it is unsure about a situation.
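
This handover between AI and human moderators is often pictured as confidence-based routing: the system acts alone only when it is very sure, and passes everything else to a person. The sketch below assumes a model that returns a violation probability between 0 and 1; the threshold values and the `route` function are hypothetical.

```python
# Hypothetical thresholds; in practice these would be tuned per policy and content type.
REMOVE_THRESHOLD = 0.95
APPROVE_THRESHOLD = 0.10


def route(violation_score: float) -> str:
    """Route an item based on a model's estimated probability of a violation."""
    if violation_score >= REMOVE_THRESHOLD:
        return "remove"  # The model is confident the content breaks policy.
    if violation_score <= APPROVE_THRESHOLD:
        return "publish"  # The model is confident the content is acceptable.
    return "human_review"  # Everything in between goes to a moderator.


if __name__ == "__main__":
    for score in (0.99, 0.50, 0.02):
        print(score, route(score))
```

Widening the gap between the two thresholds sends more items to human reviewers; narrowing it automates more decisions at the cost of more mistakes.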

A good example is the complexity of language. Slang, colloquialisms, irony, sarcasm, and new vocabulary are all likely to be misunderstood, and can be difficult even for a human moderator to decipher. Natural language processing is improving dramatically, but if slang and new vocabulary can confuse humans, as all parents know, then they can certainly confuse and outwit an AI system.

AI can be trained to enforce very specific controls, such as how much bare leg may be visible above the knee without restriction. However, it can be confused by innocent images taken on a beach or of breastfeeding, and the acceptability of these depends on cultural context. There is scope for confusion.

AI and the feedback created by generative AI can be a strong frontline, but it requires a robust human support team. It is just one component of a strong trust and safety strategy.

Building a Strong Trust & Safety Solution

As described throughout this article, the use of AI is not focused on cutting the cost of protecting users—it is now an essential tool to allow brands to protect their users from offensive material. This blend of human moderators and AI on the frontline is essential for a strong solution that can cope with the volume of content that needs to be checked.

There is another very important factor when designing Trust & Safety for your organization—the cost. Some executives may feel that the cost of a comprehensive content moderation strategy is prohibitive, but the real question should be: what is the cost of not protecting your users and customers?

Content moderation, and a broader focus on T&S, are important. This is not a process that should be designed as an afterthought or for the lowest possible cost. Providing robust measures to protect your customers is now like a city authority providing effective first responder services for emergency situations: it’s an important service that needs to be valued. Many companies are now developing software with “trust and safety” built into its core, leveraging not only AI but also robust policies and processes.

Employees working in this area need to be trained and protected, during and after their employment. They are highly trained professionals and are usually only called on when the AI needs help.

AI can help by sitting on the frontline of this torrent of content—scanning and protecting customers 24/7—with generative AI creating immediate feedback or responses. But you still need a human team, robust policies, effective escalation, and more. This means that humans need the tools, training, and processes that allow them to keep your customers safe.

A robust Trust & Safety strategy will involve humans and it will involve AI. They augment each other and protect your customers and users from fraud and offensive content that you really do not want to be displayed on your platform.
