Subproject #2

Critical Digital Literacy

Promoting digital literacy combined with critical thinking (Subproject #2) is arguably the most efficient way to tackle “fake news” and disinformation online. The approach has shown results in Finland, which declared victory in its fight against “fake news” five years after introducing such programs. Inspired by Finland’s example, this sub-project aims to promote critical digital literacy in Qatar. We plan to achieve this through a general media literacy platform that teaches citizens and residents of Qatar how to recognize “fake news” and propaganda techniques. The platform will offer lessons and exploration capabilities, featuring tools that analyze news, social media posts, or any custom text in Arabic and English and make explicit the propaganda and persuasion techniques at play.

The tool will look for persuasion techniques such as appeals to emotion (e.g., fear, prejudice, smears) as well as logical fallacies (e.g., the black-and-white fallacy, bandwagon). By interacting with the platform, users will become aware of the ways “fake news” can manipulate them, making them less likely to act on it and less likely to share it further, which is critical for limiting the potential impact of organized disinformation campaigns online. We will further study the role of critical digital literacy in people’s resilience to online manipulation and influence, whether legitimate (e.g., in e-commerce) or malicious (e.g., in social engineering and phishing). This literacy will also cover the influence of the algorithms and designs used in digital media; that is, we will go beyond teaching how to recognize and respond to threats, toward understanding the underlying mechanics of influence and deception online.
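To give a flavor of what “making techniques explicit” can mean at the simplest level, the toy sketch below flags persuasion cues in a text with keyword lists. The cue lists and technique names here are illustrative assumptions, not the project’s actual lexicons or models (which rely on trained classifiers rather than keywords):

```python
# Toy sketch: cue-based flagging of persuasion techniques.
# The cue lists below are illustrative examples only.
CUES = {
    "appeal_to_fear": ["catastrophe", "threat", "destroy"],
    "loaded_language": ["disgraceful", "heroic", "shameful"],
    "bandwagon": ["everyone knows", "join the millions"],
}

def flag_techniques(text: str) -> dict:
    """Return, per technique, the cue phrases found in the text."""
    lowered = text.lower()
    hits = {}
    for technique, cues in CUES.items():
        found = [c for c in cues if c in lowered]
        if found:
            hits[technique] = found
    return hits
```

A real system replaces the keyword lookup with a model trained on annotated corpora (see the Objectives below), but the output shape is similar: technique labels attached to concrete evidence in the text.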

Social media companies often argue, with some validity, that people should be primarily responsible for managing their own traits, weaknesses, worries, stress, and jealousy, whether in the physical or the online world; self-regulation is generally expected of social media users. However, we argue that social media design can become too immersive and, at times, addictive, and that platforms should therefore reduce the triggers that lead to a loss of control over usage. Digital addiction is associated with reduced productivity and disrupted sleep. Fear of missing out (FoMO) is one manifestation of how users become overly preoccupied with online spaces. We have argued that a thoughtful design process should equip users with tools to manage it, e.g., creative versions of auto-reply, coloring schemes, and filters. Such a design can benefit those who are highly susceptible to peer pressure or have low impulse control.

Objectives

  1. To build a high-quality corpus annotated with propaganda and its techniques.

  2. To develop a system for detecting the use of propaganda and its techniques in text in Arabic and English with a focus on Qatar and social media.

  3. To develop an online platform for teaching critical digital literacy and then use the platform to study the role of critical digital literacy on people’s resilience to online manipulation and influence.

Meet the Critical Digital Literacy team members...

FIROJ ALAM

Lead Principal Investigator:
Critical Digital Literacy

WAJDI ZAGHOUANI

Principal Investigator

GEORGE MIKROS

Principal Investigator

GIOVANNI DA SAN MARTINO

Principal Investigator

MARAM HASANAIN

Postdoctoral Researcher

FATEMA AHMAD

Research Assistant

ELISA SARTORI

Research Assistant
University of Padova

MUAADH NOMAN

PhD Researcher

Publications

21 entries (page 1 of 2)
1.

Alam, Firoj; Hasnat, Abul; Ahmad, Fatema; Hasan, Md. Arid; Hasanain, Maram

ArMeme: Propagandistic Content in Arabic Memes Proceedings Article

In: Al-Onaizan, Yaser; Bansal, Mohit; Chen, Yun-Nung (Ed.): Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pp. 21071–21090, Association for Computational Linguistics, Miami, Florida, USA, 2024.

Abstract | Links | BibTeX

2.

Hasanain, Maram; Ahmad, Fatema; Alam, Firoj

Large Language Models for Propaganda Span Annotation Proceedings Article

In: Al-Onaizan, Yaser; Bansal, Mohit; Chen, Yun-Nung (Ed.): Findings of the Association for Computational Linguistics: EMNLP 2024, pp. 14522–14532, Association for Computational Linguistics, Miami, Florida, USA, 2024.

Abstract | Links | BibTeX

3.

Hasanain, Maram; Hasan, Md. Arid; Ahmad, Fatema; Suwaileh, Reem; Biswas, Md. Rafiul; Zaghouani, Wajdi; Alam, Firoj

ArAIEval Shared Task: Propagandistic Techniques Detection in Unimodal and Multimodal Arabic Content Proceedings Article

In: Habash, Nizar; Bouamor, Houda; Eskander, Ramy; Tomeh, Nadi; Farha, Ibrahim Abu; Abdelali, Ahmed; Touileb, Samia; Hamed, Injy; Onaizan, Yaser; Alhafni, Bashar; Antoun, Wissam; Khalifa, Salam; Haddad, Hatem; Zitouni, Imed; AlKhamissi, Badr; Almatham, Rawan; Mrini, Khalil (Ed.): Proceedings of The Second Arabic Natural Language Processing Conference, pp. 456–466, Association for Computational Linguistics, Bangkok, Thailand, 2024.

Abstract | Links | BibTeX

4.

Dimitrov, Dimitar; Alam, Firoj; Hasanain, Maram; Hasnat, Abul; Silvestri, Fabrizio; Nakov, Preslav; Martino, Giovanni Da San

SemEval-2024 Task 4: Multilingual Detection of Persuasion Techniques in Memes Proceedings Article

In: Ojha, Atul Kr.; Doğruöz, A. Seza; Madabushi, Harish Tayyar; Martino, Giovanni Da San; Rosenthal, Sara; Rosá, Aiala (Ed.): Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pp. 2009–2026, Association for Computational Linguistics, Mexico City, Mexico, 2024.

Links | BibTeX


6.

Gurgun, Selin; Cemiloglu, Deniz; Close, Emily Arden; Phalp, Keith; Nakov, Preslav; Ali, Raian

Why do we not stand up to misinformation? Factors influencing the likelihood of challenging misinformation on social media and the role of demographics Journal Article

In: Technology in Society, vol. 76, pp. 102444, 2024.

Abstract | Links | BibTeX

7.

Gurgun, Selin; Noman, Muaadh; Arden-Close, Emily; Phalp, Keith; Ali, Raian

How Would I Be Perceived If I Challenge Individuals Sharing Misinformation? Exploring Misperceptions in the UK and Arab Samples and the Potential for the Social Norms Approach Proceedings Article

In: International Conference on Persuasive Technology, pp. 133–150, Springer 2024.

Abstract | Links | BibTeX

8.

Hasanain, Maram; Ahmed, Fatema; Alam, Firoj

Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles Journal Article

In: arXiv preprint arXiv:2402.17478, 2024.

Abstract | Links | BibTeX

9.

Hasanain, Maram; Suwaileh, Reem; Weering, Sanne; Li, Chengkai; Caselli, Tommaso; Zaghouani, Wajdi; Barrón-Cedeño, Alberto; Nakov, Preslav; Alam, Firoj

Overview of the CLEF-2024 CheckThat! Lab Task 1 on Check-Worthiness Estimation of Multigenre Content Proceedings Article

In: Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, Grenoble, France, 2024.

Abstract | Links | BibTeX

10.

Struß, Julia Maria; Ruggeri, Federico; Barrón-Cedeño, Alberto; Alam, Firoj; Dimitrov, Dimitar; Galassi, Andrea; Pachov, Georgi; Koychev, Ivan; Nakov, Preslav; Siegel, Melanie; Wiegand, Michael; Hasanain, Maram; Suwaileh, Reem; Zaghouani, Wajdi

Overview of the CLEF-2024 CheckThat! Lab Task 2 on Subjectivity in News Articles Proceedings Article

In: Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, Grenoble, France, 2024.

Abstract | Links | BibTeX

11.

Alam, Firoj; Biswas, Md. Rafiul; Shah, Uzair; Zaghouani, Wajdi; Mikros, Georgios

Propaganda to Hate: A Multimodal Analysis of Arabic Memes with Multi-Agent LLMs Journal Article

In: 2024.

Links | BibTeX

12.

Barrón-Cedeño, Alberto; Alam, Firoj; Struß, Julia Maria; Nakov, Preslav; Chakraborty, Tanmoy; Elsayed, Tamer; Przybyła, Piotr; Caselli, Tommaso; Martino, Giovanni Da San; Haouari, Fatima; Li, Chengkai; Piskorski, Jakub; Ruggeri, Federico; Song, Xingyi; Suwaileh, Reem

Overview of the CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities and Adversarial Robustness Proceedings Article

In: Goeuriot, Lorraine; Mulhem, Philippe; Quénot, Georges; Schwab, Didier; Soulier, Laure; Nunzio, Giorgio Maria Di; Galuščáková, Petra; de Herrera, Alba García Seco; Faggioli, Guglielmo; Ferro, Nicola (Ed.): Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024), 2024.

Abstract | Links | BibTeX

13.

Piskorski, Jakub; Stefanovitch, Nicolas; Alam, Firoj; Campos, Ricardo; Dimitrov, Dimitar; Jorge, Alípio; Pollak, Senja; Ribin, Nikolay; Fijavž, Zoran; Hasanain, Maram; Guimarães, Nuno; Pacheco, Ana Filipa; Sartori, Elisa; Silvano, Purificação; Zwitter, Ana Vitez; Koychev, Ivan; Yu, Nana; Nakov, Preslav; Martino, Giovanni Da San

Overview of the CLEF-2024 CheckThat! Lab Task 3 on Persuasion Techniques Proceedings Article

In: Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, Grenoble, France, 2024.

Abstract | Links | BibTeX

14.

Gurgun, Selin; Cemiloglu, Deniz; Arden-Close, Emily; Phalp, Keith; Ali, Raian; Nakov, Preslav

Challenging Misinformation on Social Media: Users’ Perceptions and Misperceptions and Their Impact on the Likelihood to Challenge Journal Article

In: Available at SSRN 4600006, 2023.

Abstract | Links | BibTeX

15.

Hasanain, Maram; El-Shangiti, Ahmed; Nandi, Rabindra Nath; Nakov, Preslav; Alam, Firoj

QCRI at SemEval-2023 Task 3: News Genre, Framing and Persuasion Techniques Detection Using Multilingual Models Proceedings Article

In: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pp. 1237–1244, Association for Computational Linguistics, Toronto, Canada, 2023.

Abstract | Links | BibTeX

16.

Galassi, Andrea; Ruggeri, Federico; Barrón-Cedeño, Alberto; Alam, Firoj; Caselli, Tommaso; Kutlu, Mucahid; Struss, Julia Maria; Antici, Francesco; Hasanain, Maram; Köhler, Juliane; Korre, Katerina; Leistra, Folkert; Muti, Arianna; Siegel, Melanie; Turkmen, Mehmet Deniz; Wiegand, Michael; Zaghouani, Wajdi

Overview of the CLEF-2023 CheckThat! Lab Task 2 on Subjectivity in News Articles Proceedings Article

In: Aliannejadi, Mohammad; Faggioli, Guglielmo; Ferro, Nicola; Vlachos, Michalis (Ed.): Working Notes of CLEF 2023 – Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 2023.

Links | BibTeX

17.

Martino, Giovanni Da San; Alam, Firoj; Hasanain, Maram; Nandi, Rabindra Nath; Azizov, Dilshod; Nakov, Preslav

Overview of the CLEF-2023 CheckThat! Lab Task 3 on Political Bias of News Articles and News Media Proceedings Article

In: Aliannejadi, Mohammad; Faggioli, Guglielmo; Ferro, Nicola; Vlachos, Michalis (Ed.): Working Notes of CLEF 2023 – Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 2023.

Links | BibTeX

18.

Nakov, Preslav; Alam, Firoj; Martino, Giovanni Da San; Hasanain, Maram; Nandi, Rabindra Nath; Azizov, Dilshod; Panayotov, Panayot

Overview of the CLEF-2023 CheckThat! Lab Task 4 on Factuality of Reporting of News Media Proceedings Article

In: Aliannejadi, Mohammad; Faggioli, Guglielmo; Ferro, Nicola; Vlachos, Michalis (Ed.): Working Notes of CLEF 2023 – Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, 2023.

Links | BibTeX

19.

Abdelali, Ahmed; Mubarak, Hamdy; Chowdhury, Shammur Absar; Hasanain, Maram; Mousi, Basel; Boughorbel, Sabri; Kheir, Yassine El; Izham, Daniel; Dalvi, Fahim; Hawasly, Majd; others,

Benchmarking Arabic AI with Large Language Models Journal Article

In: arXiv preprint arXiv:2305.14982, 2023.

Abstract | Links | BibTeX

20.

Dalvi, Fahim; Hasanain, Maram; Boughorbel, Sabri; Mousi, Basel; Abdaljalil, Samir; Nazar, Nizi; Abdelali, Ahmed; Chowdhury, Shammur Absar; Mubarak, Hamdy; Ali, Ahmed; others,

LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking Journal Article

In: arXiv preprint arXiv:2308.04945, 2023.

Links | BibTeX


Conferences

ArMeme: Propagandistic Content in Arabic Memes

Read the full paper here

 Find the presentation here

Download the dataset from here

Abstract: With the rise of digital communication, memes have become a significant medium for cultural and political expression that is often used to mislead audiences. Identifying such misleading and persuasive multimodal content has become more important for various stakeholders, including social media platforms, policymakers, and the broader society, as such content often causes harm to individuals, organizations, and/or society. While there have been efforts to develop AI-based automatic systems for resource-rich languages (e.g., English), relatively little to no work exists for medium- to low-resource languages. In this study, we focused on developing an Arabic memes dataset with manual annotations of propagandistic content. We annotated ∼6K Arabic memes collected from various social media platforms, a first resource for Arabic multimodal research, and we provide a comprehensive analysis aimed at developing computational tools for their detection. We made the dataset publicly available for the community.

Large Language Models for Propaganda Span Annotation

Read the full paper here

 Find the poster here

Download the dataset from here

Abstract: The use of propagandistic techniques in online content has increased in recent years, aiming to manipulate online audiences. Fine-grained propaganda detection and extraction of the textual spans where propaganda techniques are used are essential for more informed content consumption. Automatic systems targeting the task in lower-resourced languages are limited, usually obstructed by the lack of large-scale training datasets. Our study investigates whether large language models (LLMs), such as GPT-4, can effectively extract propagandistic spans. We further study the potential of employing the model to collect more cost-effective annotations. Finally, we examine the effectiveness of labels provided by GPT-4 in training smaller language models for the task. The experiments are performed over a large-scale in-house manually annotated dataset. The results suggest that providing more annotation context to GPT-4 within prompts improves its performance compared to human annotators. Moreover, when serving as an expert annotator (consolidator), the model provides labels that have higher agreement with expert annotators and lead to specialized models that achieve state-of-the-art results on an unseen Arabic test set. Finally, our work is the first to show the potential of utilizing LLMs to develop annotated datasets for the propagandistic span detection task by prompting them with annotations from human annotators with limited expertise. All scripts and annotations will be shared with the community.
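One practical step in LLM-based span annotation is mapping the model’s textual answers back to character offsets in the source paragraph. The sketch below assumes a hypothetical JSON output format (a list of `{"technique", "text"}` objects); the paper’s actual prompts and output schema may differ:

```python
import json

def spans_from_llm_output(paragraph: str, llm_json: str):
    """Map each annotated snippet back to (technique, start, end)
    character offsets in the paragraph; snippets that do not occur
    verbatim in the paragraph are skipped."""
    spans = []
    for item in json.loads(llm_json):
        start = paragraph.find(item["text"])
        if start != -1:
            spans.append((item["technique"], start, start + len(item["text"])))
    return spans
```

Grounding spans by exact substring search, as above, is one simple way to guard against LLM outputs that paraphrase rather than quote the source text.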

Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles

Read the full paper here

 Find the poster here

Download the dataset from here

Abstract: The use of propaganda has spiked on mainstream and social media, aiming to manipulate or mislead users. While efforts to automatically detect propaganda techniques in textual, visual, or multimodal content have increased, most of them primarily focus on English content. The majority of the recent initiatives targeting medium- to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprising 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques. Furthermore, our work offers the first attempt to understand the performance of large language models (LLMs), using GPT-4, for fine-grained propaganda detection from text. Results showed that GPT-4’s performance degrades as the task moves from simply classifying a paragraph as propagandistic or not to the fine-grained task of detecting propaganda techniques and their manifestation in text. Compared to models fine-tuned on the dataset for propaganda detection at different classification granularities, GPT-4 is still far behind. Finally, we evaluate GPT-4 on a dataset consisting of six other languages for span detection, and results suggest that the model struggles with the task across languages. We made the dataset publicly available for the community.
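Fine-grained span detection like this is often scored by character-overlap precision, recall, and F1 rather than exact span match, so partially correct spans receive partial credit. A minimal sketch of this style of metric (illustrative; the papers’ shared-task scorers additionally weight overlaps by technique label):

```python
def span_chars(spans):
    """Expand (start, end) spans into the set of character offsets they cover."""
    return {i for s, e in spans for i in range(s, e)}

def char_f1(gold, pred):
    """Character-level F1 between gold and predicted offset sets."""
    if not gold and not pred:
        return 1.0
    if not gold or not pred:
        return 0.0
    overlap = len(gold & pred)
    if overlap == 0:
        return 0.0
    p = overlap / len(pred)
    r = overlap / len(gold)
    return 2 * p * r / (p + r)
```

For example, a predicted span shifted by half its length against the gold span scores 0.5 rather than 0, which makes the metric far less brittle than exact matching.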

Workshops

Critique What You Read!

Find the presentation here

On September 8, 2024, the Critical Digital Literacy team and the MARSAD team (Subproject #1), in collaboration with Qatar National Library (QNL), held a public workshop to empower people to critique what they read. The workshop focused on ways to improve our consumption of news and online content, and equipped attendees with methods and tools to verify news and identify possible use of propagandistic techniques.

ArMeme

Download the dataset here

Read the paper here

ArMeme is the first multimodal Arabic memes dataset that includes both text and images, collected from various social media platforms. It serves as the first resource dedicated to Arabic multimodal research. While the dataset has been annotated to identify propaganda in memes, it is versatile and can be utilized for a wide range of other research purposes, including sentiment analysis, hate speech detection, cultural studies, meme generation, and cross-lingual transfer learning. The dataset opens new avenues for exploring the intersection of language, culture, and visual communication.

ArMPro

Download the dataset here

Read the paper here

This dataset represents the largest one to date for fine-grained propaganda detection. It includes 8,000 paragraphs extracted from over 2,800 Arabic news articles, covering a large variety of news domains.

LLM_Propaganda Annotation

Download the dataset here

Read the paper here

Our study investigates whether large language models (LLMs), such as GPT-4, can effectively extract propagandistic spans. We further study the potential of employing the model to collect more cost-effective annotations. Finally, we examine the effectiveness of labels provided by GPT-4 in training smaller language models for the task. In this repo we release full human annotations, consolidated gold labels, and annotations provided by GPT-4 in different annotator roles.
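Comparing GPT-4’s annotations against human annotators, as this dataset enables, typically starts from an inter-annotator agreement statistic. The sketch below computes Cohen’s kappa between two annotators’ paragraph-level labels; it is a generic illustration, not the consolidation procedure used in the paper:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators,
    corrected for agreement expected by chance."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)
```

Kappa near 0 indicates chance-level agreement and values near 1 indicate near-perfect agreement, which gives a scale for claims such as “GPT-4’s labels have higher agreement with expert annotators.”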

The Future of Digital Citizenship in Qatar: A Socio-Technical Framework

Partners

Copyright © Digital Citizenship in Qatar. All Rights Reserved

Research Sponsored by Qatar National Research Fund, NPRP14C-0916-210015