
Misinformation Expert’s Corner: Preslav Nakov, AI and Fake News

Preslav Nakov has established himself as one of the leading experts on the use of AI against propaganda and disinformation. He has been highly influential in natural language processing and text mining, publishing hundreds of peer-reviewed research papers. He spoke to us about his work on the ongoing problem of online misinformation.

FAQ on Misinformation with Preslav Nakov


1. What do you think about the ongoing infodemic? With your extensive work on fake news and misinformation, do you think there will be a point where we can see a decrease in such content?


Indeed, the global COVID-19 pandemic has also brought us the first global social media infodemic. At the beginning of the pandemic, the World Health Organization already recognized the importance of the problem and ranked fighting the infodemic second on its list of top five priorities. The infodemic represents an interesting blend of political and medical misinformation and disinformation. Now, a year and a half later, both the pandemic and the infodemic persist.

Yet, I am an optimist. What initially fueled the infodemic was that so little was known about COVID-19, so there was a lot of void to be filled. Later on, with the emergence of the vaccines, the infodemic got a new boost from the re-emerging anti-vaxxer movement, which has grown much more powerful than before. However, the severity of the pandemic has now started to decrease, to a large extent thanks to the vaccines; we see full stadiums at EURO 2020, for example, with no masks and little social distancing. I expect that the infodemic will soon follow a similar downward trajectory. It will not die out completely, but it will decline.


2. What drove you to pursue research in the fake news and misinformation domain?


As part of a collaboration between the Qatar Computing Research Institute (QCRI) and MIT, I was working on question answering in community forums, where the goal was to detect which answers in the forum were good, in the sense of trying to answer the question directly, as opposed to giving indirectly related information, discussing other topics, or talking to other users. We developed a strong system, which we deployed in production on a local forum, Qatar Living, where it remains operational to date, but we soon realized that not all good answers were factually true.

This got me interested in the factuality of user-generated content. Soon, along came the 2016 US Presidential election, and fake news and factuality became a global concern. Thus, I started the Tanbih mega-project, which we are developing at QCRI in collaboration with MIT and other partners. The aim of the project is to help fight fake news, propaganda, and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking. At Checkstep, we are currently building AI-first tools to tackle hate speech, spam, and misinformation.



3. What do you think about the upcoming regulations, the EU’s Digital Services Act (DSA) and the UK’s Online Safety Bill (OSB)?

These upcoming EU and UK regulations (and related proposals being discussed in the USA and other countries) have the potential to become as transformative as the GDPR was. Platforms would suddenly become responsible for their content and would have a legal obligation to enforce their own terms of service, as well as to comply with legislation on certain kinds of malicious content. They would also be obliged to explain their moderation decisions to their users as well as to external regulatory authorities. I see this as a hugely positive development.

Legislators should be careful though to keep a good balance between trying to limit the spread of malicious content and protecting free speech. Moreover, we all should be cautious and remember that fake news and hate speech are complex problems and that legislation is only part of the overall solution. We would still need human moderators, research and development of tools that can help automate the process of content moderation at scale, fact-checking initiatives, high-quality journalism, teaching media literacy, and cooperation with platforms where user-generated content spreads.

4. How should platforms better prepare themselves?


Big Tech companies are already taking this seriously and have been developing in-house solutions for years. However, complying with the new legislation would be a challenge for small and mid-size companies (though it is also true that it affects them less), as well as for large ones for which user-generated content is important, but is not their core business.

For example, a small fitness club that also runs a forum on its website cannot afford to hire and train its own content moderators. Such companies face two main options: (a) shut down their fora to avoid any issues, or (b) outsource content moderation, partially or completely. When it comes to content moderation at scale, there is a clear need for automation, which can take care of a large number of easy cases, but the final decision in hard cases should be taken by humans, not machines.
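To make this concrete, here is a minimal sketch of such a routing scheme, assuming a hypothetical classifier that returns a probability that a post violates policy; the function name and thresholds are illustrative, not a real system, and anything between the two thresholds goes to a human moderator.

```python
# Minimal sketch of routing between automation and human review.
# `violation_probability` stands in for any trained moderation model;
# the thresholds are illustrative, not prescriptive.

AUTO_REMOVE = 0.95  # confident enough to act without a person
AUTO_ALLOW = 0.05   # confident enough to leave the content up

def violation_probability(post: str) -> float:
    # Placeholder for a real model call, e.g. a fine-tuned text classifier.
    raise NotImplementedError

def route(post: str) -> str:
    """Return 'remove', 'allow', or 'human_review' for one post."""
    p = violation_probability(post)
    if p >= AUTO_REMOVE:
        return "remove"        # easy case: clearly violating
    if p <= AUTO_ALLOW:
        return "allow"         # easy case: clearly benign
    return "human_review"      # hard case: a human takes the final decision
```

The design choice here is deliberate: the automated path only ever acts on the cases where the model is most confident, so the human queue shrinks without machines making the contested calls.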


5. Any recent talks or research you’d like to tell us about? Feel free to mention upcoming work as well.

Fighting the infodemic and misinformation is typically thought of in terms of factuality, but it is a much broader problem. In February 2020, MIT Technology Review published an article pointing out characteristics of the infodemic that go beyond factuality, such as fueling panic and racism.
Indeed, if the 2016 U.S. Presidential election gave us the term “fake news”, the 2020 election got the USA and the world concerned about a range of other types of malicious content online. The infodemic has demonstrated that all of this is part of the same problem, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading racism, xenophobia, and panic.

Addressing these issues requires solving a number of challenging problems: identifying messages that make claims, determining their check-worthiness and factuality, and assessing their potential to do harm as well as the nature of that harm, to mention just a few. Thus, as part of Tanbih, we have been working on a system that analyzes user-generated content in Arabic, Bulgarian, English, and Dutch, covering all these aspects and combining the perspectives and interests of journalists, fact-checkers, social media platforms, policy makers, and society as a whole. A preliminary version of this work appeared at ICWSM-2021 last week.
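As a rough illustration of this multi-question analysis, the sketch below profiles a message along several independent dimensions at once; the question names are paraphrased rather than taken from the system, and each per-question classifier is a placeholder.

```python
# Sketch of profiling one message along several dimensions at once.
# The question list is a paraphrase; each score() call stands in for a
# separately fine-tuned binary classifier.

QUESTIONS = [
    "contains_verifiable_claim",
    "worth_fact_checking",
    "harmful_to_society",
    "useful_for_policy_makers",
]

def score(question: str, message: str) -> float:
    # Placeholder for a per-question model (one classifier per dimension).
    raise NotImplementedError

def analyze(message: str) -> dict:
    """Return a {question: probability} profile for a social media post."""
    return {q: score(q, message) for q in QUESTIONS}
```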

We have also been looking into supporting fact-checkers and journalists by developing tools for predicting which claims are check-worthy and which ones have been previously fact-checked. We have an upcoming survey paper at IJCAI-2021 on the AI technology that can help fact-checkers. This has also been the mission of the CLEF CheckThat! lab, which we have been organizing for four years now; see also our recent ECIR-2021 paper about the lab.
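One common way to detect previously fact-checked claims is embedding-based retrieval: embed the new claim, compare it to a database of already-verified claims, and return the closest match if it is similar enough. The sketch below uses the sentence-transformers library with an off-the-shelf model, a toy two-claim database, and an arbitrary threshold, purely for illustration; it is not the system from the papers above.

```python
# Sketch of retrieving previously fact-checked claims via embeddings.
# Model name, threshold, and the two-claim "database" are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

fact_checked = [
    "Drinking bleach cures COVID-19.",    # verdict: false
    "5G towers spread the coronavirus.",  # verdict: false
]
db = model.encode(fact_checked, convert_to_tensor=True)

def find_prior_check(claim: str, threshold: float = 0.7):
    """Return the closest previously fact-checked claim, if similar enough."""
    query = model.encode(claim, convert_to_tensor=True)
    scores = util.cos_sim(query, db)[0]
    best = int(scores.argmax())
    return fact_checked[best] if float(scores[best]) >= threshold else None

print(find_prior_check("Can gargling bleach protect you from COVID?"))
```

The threshold trades recall for precision: too low, and unrelated claims get matched; too high, and paraphrases of known falsehoods slip through to be fact-checked all over again.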

Another research line I have been involved in aims to detect the use of propaganda techniques in memes. Memes matter because a large fraction of propaganda in social media is multimodal, mixing textual with visual content. Moreover, by focusing on the specific techniques (e.g., name calling, loaded language, flag-waving, whataboutism, black-and-white fallacy, etc.), we can train people to recognize how they are being manipulated. Recognizing twenty-two such techniques in memes was the subject of a recent SemEval-2021 shared task; there is also an upcoming paper at ACL-2021.
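For a flavor of the modeling setup, here is a toy multi-label head in PyTorch: each technique gets an independent sigmoid, since a single meme can use several techniques at once. The feature extractor is omitted, only five of the twenty-two techniques are listed, and the dimensions are made up; this is a sketch, not the shared-task system.

```python
# Toy multi-label head: independent sigmoids, one per technique, because a
# single meme can exhibit several techniques at once. Features only sketched
# here; the real task also fuses visual features from the meme image.
import torch
import torch.nn as nn

TECHNIQUES = ["name_calling", "loaded_language", "flag_waving",
              "whataboutism", "black_and_white_fallacy"]  # 5 of the 22

class TechniqueDetector(nn.Module):
    def __init__(self, feature_dim: int = 768):
        super().__init__()
        self.head = nn.Linear(feature_dim, len(TECHNIQUES))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # One independent probability per technique (multi-label).
        return torch.sigmoid(self.head(features))

# Trained with nn.BCELoss against multi-hot targets (one bit per technique).
detector = TechniqueDetector()
probs = detector(torch.randn(1, 768))
flagged = [t for t, p in zip(TECHNIQUES, probs[0]) if p > 0.5]
print(flagged)
```

Using sigmoids instead of a softmax is what makes this multi-label: the probabilities do not compete, so "loaded language" and "flag-waving" can both fire on the same meme.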

In terms of content moderation, we recently wrote a survey studying the dichotomy between the types of abusive language that online platforms seek to curb and the research efforts to detect such language automatically.

6. Any personal anecdotes where you fell prey to fake news and misinformation?

I have fallen prey to fake news many times, and I keep being fooled from time to time. Many friends and relatives send me articles asking: is this fake news? In most cases, it is easy to tell: perhaps the article is just two or three sentences long and gives little support to the claim in its title; perhaps the website is a known fake-news or satirical outlet; perhaps a simple reverse image search reveals that the photo in the article comes from a different event; or perhaps the claim was previously fact-checked and is known to be true or false. Yet, in many cases it is very hard, and my answer is: I am sorry, but I do not have a crystal ball. In fact, several studies in different countries have shown the same thing: most people cannot distinguish fake news from real news; in the EU, this is true for 75% of young people.

Yet, with proper training, people can improve quite a bit. Indeed, two years ago Finland declared that it had won the war on fake news and misinformation thanks to its massive media literacy program targeting all levels of society, but primarily the schools. It took five years, which shows that real results are achievable in a realistic time frame. We should be careful when setting our expectations, though: the goal should not be to eradicate all fake news online; it should rather be to limit its impact, thus making it irrelevant.

This has already happened to spam, which is still around but is not the same kind of problem that it used to be some 15–20 years ago; now Finland has shown that we can achieve the same with fake news as well. Thus, while the short-term solution should focus on content moderation and on limiting the spread of malicious content, the mid-term and long-term solutions would do better to look at explainability and at training users: this is fake news because …, this is hateful/offensive language because …, etc.

An edited version of this story originally appeared in The Checkstep Round-up newsletter: https://checkstep.substack.com/p/calls-for-more-transparency-and-safety
