
Expert’s Corner with Subject Matter Expert Carolina Christofoletti

Carolina Christofoletti is a Child Sexual Abuse Material (CSAM) subject matter expert, with vast experience both in CSAM Intelligence and CSAM Trust & Safety. She currently leads the CSAM Blockchain Intelligence efforts at TRM Labs as part of the Threat Intelligence Team.

Her career includes an extensive CSAM academic background starting in 2017 at the University of São Paulo, where she received an Alumni Excellence Award and a Law Degree. She holds an LL.M. in International Crime, Transitional Justice and Peace Procedures from the Universidad Católica de Colombia (Colombia), a master's degree in Criminal Compliance from the Universidad de Castilla-La Mancha (Spain) and a master's degree in Cybercrime from the University of Antonio Nebrija (Spain), and is an LL.M. candidate in Digital Forensics at IPOG (Brazil).

Carolina Christofoletti is currently assigned as a CSAM researcher at the IGCAC Working Group, where she researches CSAM networks in depth and in partnership with national and international law enforcement agencies, as well as at the Institute for Advanced Studies of the University of São Paulo (USP), where she researches CSAM Trust & Safety. She also serves as a consultant on CSAM Trust & Safety matters.

1. In a recent LinkedIn post, you said “Image databases will not solve the CSAM problem.” Could you explain more about what you mean by that?

CSAM Image Databases are bound by a logical contradiction.

Let’s suppose that:

  • ID1 is a CSAM image database (‘known’ is defined as existing in ID1).
  • Image A is a CSAM file that is part of ID1 (a ‘known’ CSAM image).
  • Image B and Image C are CSAM files that are not part of ID1 (‘unknown’ CSAM images).
  • Video A is a CSAM video (CSAM media we know nothing about).

(1) In order for Image A to be added to ID1, it must first be found, and found in a manner that meets certain requirements: Image A must be found (either directly or indirectly) by someone who has access to ID1 and who is willing to add it to the database. This someone is either a Law Enforcement Agency (Image A being thus part of a seizure) or a CSAM Hotline (Image A being often found by someone else and reported to the organization).

(2) It rapidly becomes quite clear that ID1 can only be useful if Image A is ever found and if the number of Image As does not exceed the scalability of ID1. And here lies the bottleneck:

The mere existence of ID1 has the harmful, unintended effect of obfuscating its main dependency, which is exactly a proactive approach for CSAM detection.

Instead of focusing on Image A (CSAM files flagged by ID1), we should be focusing on Image B and Image C (CSAM files that easily bypass it).

(3) Without constant updates, ID1 rotates in circles: without further additions, a CSAM image database serves only to identify known CSAM files (Image A) and has no power over unknown CSAM files (Image B and C).

Image A can only empower the discovery of Image B and C if:

  1. Image B and C appear before Image A, in the same digital environment, and are still discoverable at the time Image A is detected (linear condition); and
  2. Image A is used as a review flagger, rather than as a ‘tool’ for simply locating and taking down known CSAM files.
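To make both conditions concrete, here is a minimal Python sketch of the two ways a database hit can be handled. The `KNOWN_HASHES` set, the `REVIEW_QUEUE` and the upload handler are hypothetical stand-ins for ID1 and a platform's moderation pipeline, and production systems match perceptual hashes rather than plain digests; only the control flow matters here.

```python
import hashlib
from pathlib import Path

# Hypothetical stand-in for ID1: digests of already-known files ("Image A").
# Production systems use perceptual hashes (PhotoDNA-style) rather than SHA-256,
# which a single re-encoded pixel would defeat; the control flow is the point.
KNOWN_HASHES: set[str] = set()

# Review queue of (uploader, channel) pairs flagged for human inspection.
REVIEW_QUEUE: list[tuple[str, str]] = []


def sha256_of(path: Path) -> str:
    """Digest of an uploaded file, used as the lookup key against the database."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def handle_upload(path: Path, uploader_id: str, channel_id: str) -> None:
    if sha256_of(path) in KNOWN_HASHES:
        # "Image A": the database recognises the file. Besides removal (not shown),
        # treating the hit as a review flagger sends the surrounding uploader and
        # channel to human reviewers, who may surface "Image B" and "Image C".
        REVIEW_QUEUE.append((uploader_id, channel_id))
    # No match tells the system nothing: unknown files pass through silently.
```

Only the review-flagger path can ever lead from Image A to Image B and C; the takedown path, by itself, ends where the database ends.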

(4) Because CSAM criminals do not report their sources and because the research around proactive detection is still poorly explored, Image B and C will hardly ever enter ID1.

Moreover, Image B and C might not be reported to or be found by a CSAM Hotline or a Law Enforcement Agency with access to ID1.

Even if Image B and C are actively circulating on the Internet, several factors might often eliminate the chances of having them ever inserted into ID1: they circulate only among closed groups of offenders who never report them; they may be so severe that whoever finds them fears their own prosecution for reporting them; and CSAM hotlines lack a ‘proactive’ approach of really ‘looking for those files’.

(5) Even if Image A, Image B and Image C have the potential to be added to ID1, the same possibility isn’t available for Video A, since ID1s are image databases, which do not include CSAM videos.

A possible solution in this case would be extracting Image A (a CSAM scene) from Video A and enabling detection by similarity. Although the technology for this process already exists, it unfortunately remains poorly explored up to this point.
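A minimal sketch of that idea, assuming the OpenCV and ImageHash libraries: frames are sampled from the video, hashed perceptually, and compared against an illustrative list of known scene hashes. The sampling step and Hamming-distance threshold are arbitrary choices for illustration, not the tooling referred to above.

```python
import cv2                      # pip install opencv-python
import imagehash                # pip install ImageHash
from PIL import Image

# Illustrative list of perceptual hashes of known still images ("Image A" scenes).
KNOWN_PHASHES: list[imagehash.ImageHash] = []
HAMMING_THRESHOLD = 8           # illustrative tolerance for "similar enough"


def frames_every_n_seconds(video_path: str, step_s: float = 1.0):
    """Yield roughly one frame per `step_s` seconds of video."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    step = max(int(fps * step_s), 1)
    index = 0
    while True:
        cap.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = cap.read()
        if not ok:
            break
        yield frame
        index += step
    cap.release()


def video_matches_known_scene(video_path: str) -> bool:
    """True if any sampled frame is perceptually close to a known image."""
    for frame in frames_every_n_seconds(video_path):
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame_hash = imagehash.phash(Image.fromarray(rgb))
        if any(frame_hash - known <= HAMMING_THRESHOLD for known in KNOWN_PHASHES):
            return True
    return False
```

The same loop also works in the other direction: hashing representative frames extracted from a video and adding them to the image database would let still-image scanners flag re-uploads of those scenes.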

(6) Even though online platforms often expect Image A to be directly uploaded into their platforms (in order to trigger ID1), this hardly ever happens.

A quick look into CSAM networks will show that Image A commonly enters online platforms through an external link. Finding Image A therefore depends on the opening AND scanning of those links.

And here the problem begins: CSAM criminals tend to look for file hosting platforms with poor or no integration with ID1, and CSAM Trust & Safety teams are not equipped with ID1-based CSAM detection tools, or with tools that enable automatic detection, for this case.

At this point, we realize how dependent on ‘opening the links’ and ‘human reviewers’ CSAM Trust & Safety still is. A possible solution for this would be working on ‘risk scoring’ metrics for unknown links.
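A hedged sketch of what such a ‘risk scoring’ metric could look like; every list, weight and threshold below is an illustrative assumption rather than a vetted indicator, and a real deployment would tune or learn these weights from labelled reports.

```python
from urllib.parse import urlparse

# All lists and weights below are illustrative assumptions, not vetted indicators.
LOW_INTEGRATION_HOSTS = {"example-filehost.tld"}   # hypothetical watch list
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}
ARCHIVE_SUFFIXES = (".zip", ".rar", ".7z")
REVIEW_THRESHOLD = 0.5


def link_risk_score(url: str) -> float:
    """Crude additive score in [0, 1] for an unknown outbound link."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    path = parsed.path.lower()
    score = 0.0
    if host in LOW_INTEGRATION_HOSTS:
        score += 0.5   # file host with poor or no hash-database integration
    if host in SHORTENERS:
        score += 0.2   # shorteners hide the real destination
    if path.endswith(ARCHIVE_SUFFIXES):
        score += 0.2   # archives defeat image scanning at upload time
    if len(path) > 40 and path.count("/") <= 2:
        score += 0.1   # long opaque tokens typical of one-off share links
    return min(score, 1.0)


def needs_human_review(url: str) -> bool:
    """Links above the threshold go to reviewers before anyone opens them blindly."""
    return link_risk_score(url) >= REVIEW_THRESHOLD
```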

2. The EU recently beefed up its regulations to ensure online child safety. How important do you think government regulations are, or are not, to online child safety?

The main risk of any regulation is getting the “what to regulate” problem wrong. This is also true for the upcoming EU CSAM regulation.

Once an industry standard that online platforms must comply with is set by regulation, CSAM Trust & Safety managers become extremely resistant to greenlighting a different approach to tackling CSAM threats by the teams they coordinate. And with this, the antilogic rotates in circles, revealing the real origin of Trust & Safety NDAs.

Since online platforms risk, with punitive CSAM regulations, having to pay highly expensive fines if new CSAM content is actually found on their sites (as ‘unknown CSAM files’ detection always happens after the CSAM files have been uploaded to the platform), efficient CSAM approaches are sacrificed for the sake of compliance with the given regulation.

And this highlights how the proposed EU CSAM regulation can fail miserably, and how it merges itself with the ID1 paradox.

After all, as long as ID1 remains a CSAM Trust & Safety Industry Standard, proactive CSAM detection will have no place in CSAM Trust & Safety teams and CSAM numbers will keep being artificially generated (“Hey, that’s what’s in the database”).

Even though CSAM networks are highly specialized (for example, Image A may never appear together with Image B, even though this is on the surface “imaginable”), Image A is hardly ever analyzed in its CSAM network context; moreover, the context is also not captured by ID1.

Since Image A, Image B and Image C are simply and independently added to ID1, a second state of blindness regarding CSAM databases is created — forcing CSAM Trust & Safety teams to require even higher levels of training, or CSAM Threat Actors will easily bypass human eyes too.

In particular, when it comes to something as complex as CSAM & Online Platforms interaction, we must be extremely careful with any ‘harmonized’ regulation. CSAM networks are not organized in the same way inside Instagram as they are in Twitter. TikTok and Snapchat present CSAM threats that are very different from each other, and so on and so forth. And this creates an issue for any regulation aimed at establishing “common Trust & Safety practices” anywhere.

Additionally, CSAM networks have shown a great ability to change faster than laws do. This means that, faced with a mandatory practice, Trust & Safety Teams risk being forced to comply with regulations that, soon, will not make sense anymore for detecting the latest violating content.

Personally, I think that ID1 solutions (the very core of the new EU proposal) belong in the Trust & Safety tool chest: even though they are helpful, they don’t deserve to sit on the Unicorn’s Throne (the “magical solution”) anymore.

Even though platforms such as Facebook, Twitter, TikTok and others constantly scan their platforms against ID1, the fact that CSAM networks exist within them under the radar shows us how ID1 has become outdated for its purpose as a “magical solution”.

By saying that the CSAM problem can be solved by scanning all other non-Big Tech platforms and matching them against a database of known CSAM files, the EU indirectly argues that Big Tech is not the main target of this regulation, but rather some small “Platform X”, whatever that may be.

The specific CSAM threat posed by a “Platform X” that MUST BE under EU jurisdiction is yet to be demonstrated. In parallel, a quick analysis of CSAM networks points to Asia, Oceania and Arabia as the preferred ‘jurisdictions’ for CSAM hosting; despite this, CSAM hotlines refer primarily to the EU and the United States.

This is only an example of the contradictory data we need to resolve, highlighting once more the urgent need to really research the data before proposing any solutions to solve the problem.

From a compliance perspective, the new EU proposal gives us the weird impression of a compliance program written without any risk assessment — a logical contradiction through which we derive conclusions without having ever established premises. See, for example, that the creation of the EU Center (premise generator) and the CSAM scanners are meant to be simultaneous, while the first should have been used mainly to validate the adequacy of the second.

Until now, the risk assessment documentation, the document that says ‘what is really going on’ with the targets the EU intends to regulate (e.g. what the real problem with Instagram is), is missing. The fact that we keep supporting CSAM regulations with numbers we know nothing about should cause an overall discomfort.

3. If Web3 is the next big thing, what do you think of the metaverse with regard to online child safety? Is there a greater risk around child grooming?

The metaverse will only surface the fact that child safety is an issue much bigger than just ‘known CSAM files’.

The hyperrealistic metaverse avenues give us the false impression that we are facing a new, or even an increased, child safety threat. Instead, what we are facing is a “new wine in an old bottle.” Because the wine label is now more colorful, more visual, we tend to pay more attention to it. But that’s all it is: a new label.

The bad news is, the prototype for the metaverse child safety threat already exists on online platforms. It’s only the lack of research and proper detection tools that keeps this threat in “silent mode”.

A quick look into online platforms’ channels will rapidly show us that CSAM Threat Actors are already explicitly organized, managing their CSAM networks together, near child profiles on some of these platforms. This is exactly what child safety advocates should be expecting in the metaverse.

What causes some degree of discomfort in the metaverse is not the graphical world it brings with it, but the fact that “CSAM networks moving in tandem” have been and are (with the ID1 as the only industry standard) well below CSAM Trust & Safety teams’ radars.

The fact that metaverse worlds will allow us to actually see CSAM perpetrators walking their avatars around children’s accounts once, twice, twenty times a day doesn’t make CSAM Threat Actors more threatening — it actually makes children safer.

What was an internal log visible only to Trust & Safety teams (e.g. the threat actors’ movements) now becomes a phenomenon easily seen by whoever is around. The question now depends on the efficiency of ‘reporting buttons’ and ‘review metrics’.
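As an illustration of how such movement logs could feed ‘review metrics’, here is a small Python sketch that aggregates proximity events into a daily repeat-approach flag. The event fields and the threshold are assumptions about what a platform might log, not any specific metaverse API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime

# Field names and the threshold are illustrative assumptions about a platform's
# movement log; no specific metaverse API is implied.

@dataclass
class ProximityEvent:
    actor_id: str       # avatar that approached
    minor_id: str       # child account that was approached
    at: datetime        # when the approach was logged


APPROACHES_PER_DAY_THRESHOLD = 5


def actors_to_review(events: list[ProximityEvent]) -> set[str]:
    """Flag avatars that repeatedly approach the same child account on the same day."""
    counts: dict[tuple[str, str, str], int] = defaultdict(int)
    for e in events:
        counts[(e.actor_id, e.minor_id, e.at.date().isoformat())] += 1
    return {actor for (actor, _minor, _day), n in counts.items()
            if n >= APPROACHES_PER_DAY_THRESHOLD}
```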

Contrary to common opinion, I personally believe that it is precisely the metaverse — because of the level of transparency afforded by its visual nature — that will take the reins of child safety for the next decade. Metaverse visuals might be the thing that instigates the “minimal duty of care” standards, and I would be very happy if that finally happened.

4. Given your experience as a CSAM researcher, what advice would you give to online platforms?

Child Sexual Abuse Material (CSAM) networks must be understood as what they are: a social science phenomenon.

From a CSAM research perspective, this highlights the hidden potential of qualitative methods (“hows” instead of “how many”) such as network analysis for Trust & Safety governance.

Blindly reviewing numbers hasn’t been of any help up to this point. For this reason, a partnership with CSAM researchers (external and internal ones), in order to understand what the proper research questions are, is always advisable.

If I could provide a single CSAM research insight to online platforms, I would say that the way CSAM Threat Actors interact with each other tends to provide more useful insights and a better detection system than their isolated pieces of CSAM content.

My advice would be, in this sense, to better research the birth, development, and death of CSAM networks that operate inside online platforms’ own channels, thus moving towards a proactive CSAM Trust & Safety detection approach based on predictive metrics derived precisely from these social science scenarios.
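A minimal sketch of that kind of network analysis, using the networkx library over an illustrative interaction list; in practice the edges would come from moderation logs. The point is that clusters of interacting accounts, and the hubs inside them, become the unit of review rather than isolated files.

```python
import networkx as nx   # pip install networkx

# Illustrative edges: pairs of accounts that interacted inside a flagged channel.
interactions = [
    ("acct_1", "acct_2"), ("acct_2", "acct_3"), ("acct_1", "acct_3"),
    ("acct_4", "acct_5"),
]

G = nx.Graph()
G.add_edges_from(interactions)

# Rank each connected cluster and its most connected account for human review:
# who brokers the network matters more than any single file it shares.
for component in nx.connected_components(G):
    sub = G.subgraph(component)
    centrality = nx.degree_centrality(sub)
    hub = max(centrality, key=centrality.get)
    print(f"cluster of {len(sub)} accounts; most connected: {hub}")
```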

5. Organizations like IWF and Thorn have been created to protect children from online sexual abuse. However, for a cause that could be a global initiative, these organizations can sometimes take a proprietary posture towards their tools and data. With the elimination of online child abuse being such an important problem, do you think we can get to more collaborative and distributed solutions?

Proprietary postures towards CSAM tools and data are better explained, in the case of NGOs, by a proprietary interest towards financial sponsors (here named “Data Sensitivity”) than by a wise, thoughtful decision to protect children.

Proprietary postures towards CSAM tools and data are better explained, in the case of online platforms, by the legal interest of compliance (here named “Data Privacy”) than by a wise, thoughtful decision to protect children.

Similar to the point about the proposed EU regulation, “proprietary” tools subjected to no external review also risk getting the problem wrong.

For example, known identifiers used by CSAM offenders and CSAM victims would be extremely helpful for content moderators trying to proactively identify CSAM: Threat Actors can bypass online platforms’ traditional “image-based CSAM detectors”, yet “CSAM context” is hardly ever the focus of those proprietary tools.

And here we can easily see how the issue spins, once again, in circles. Because CSAM databases are such a sensitive topic, those tools become not only proprietary, but “Law Enforcement only”. But where and how does Law Enforcement’s work start? We absolutely need this context, and we absolutely need to integrate content moderators in this discussion. You might agree with me, as mentioned above, that with ID1 alone, law enforcement does not go much further, and here is how the second bottleneck is created:

Content Moderators cannot be properly trained to recognize a CSAM threat (as the tools are ‘highly sensitive’), and Law Enforcement cannot proactively navigate online platforms for leads, as they do not know how to find the networks. It is a lose-lose scenario, in which we clearly see the harm caused by three parallel worlds (NGOs, Law Enforcement Agencies and Trust & Safety groups) that should never have started working against each other.

Having Trust & Safety teams interact with CSAM NGOs only as ‘anonymous readers of public (thus filtered) research reports’ has the unfortunate effect of turning what could have been a highly impactful CSAM “Trust & Safety Trends & Issues” research report into one that hardly ever meets actual Trust & Safety research needs.

Even though I agree that CSAM is a sensitive topic, I disagree that Trust & Safety teams must be trained, for example, with the help of blind hash databases whose context they know nothing about, or with the help of redacted, filtered, blurred CSAM research reports.

A more transparent approach to CSAM Trust & Safety training — benefiting from the qualitative findings of those NGOs — is, as such, advisable.

In the era of artificial intelligence, we’ll learn how to manage and further develop synthetic datasets. A cooperative approach to training and a distributed, synthetic approach to CSAM data is, as such, my added “how” comment to a “yes” answer.
