The Fourth Ukrainian NLP Workshop (UNLP 2025) organizes a Shared Task on Detecting Social Media Manipulation. The Shared Task challenges and assesses the ability of AI systems to detect and classify manipulation, laying the groundwork for progress in cybersecurity and in identifying disinformation in the Ukrainian context.
Join the discussions in Discord via https://discord.gg/DYNnWaZD4a.
Registration Form
Please register by March 23, 2025.
In this shared task, your goal is to build a model that identifies manipulation techniques in Ukrainian social media content (specifically, Telegram posts). In this context, “manipulation” refers to the presence of specific rhetorical or stylistic techniques intended to influence the audience without providing clear factual support.
There are two tracks in this shared task:
- Subtask 1 (Technique classification): given the text of a post, identify which manipulation techniques are used, if any. This is a multilabel classification problem; a single post can contain multiple techniques.
- Subtask 2 (Span identification): given the text of a post, identify the specific spans of manipulative text, regardless of the manipulation technique. This is a binary token-level classification task, focusing on pinpointing exactly where the manipulative content occurs.
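To make the two target formats concrete, here is a minimal sketch of how the labels for each subtask could be encoded. The technique names, the `(start, end)` character-offset span format, and the whitespace tokenization are all assumptions made for illustration; the actual data files may use a different layout.

```python
import re

# Hypothetical placeholder names; the real ten techniques are described
# in data/techniques-en.md.
TECHNIQUES = ["technique_a", "technique_b", "technique_c"]


def multi_hot(labels, vocabulary=TECHNIQUES):
    """Subtask 1 target: one 0/1 flag per technique (multilabel)."""
    present = set(labels)
    return [1 if name in present else 0 for name in vocabulary]


def token_labels(text, spans):
    """Subtask 2 target: 1 for every whitespace token that overlaps a span.

    `spans` is assumed to be a list of (start, end) character offsets,
    with `end` exclusive.
    """
    labels = []
    for match in re.finditer(r"\S+", text):
        overlaps = any(
            match.start() < end and start < match.end() for start, end in spans
        )
        labels.append(1 if overlaps else 0)
    return labels


post = "They always lie to you"
print(multi_hot(["technique_b"]))     # [0, 1, 0]
print(token_labels(post, [(5, 15)]))  # [0, 1, 1, 0, 0]
```

A post with no manipulation maps to an all-zero vector in Subtask 1 and to all-zero token labels in Subtask 2.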
The dataset was provided by the Texty.org.ua team. It consists of Ukrainian Telegram posts annotated for the presence of ten manipulation techniques. The annotation was performed by experienced journalists, analysts, and media professionals. Detailed explanations and examples of manipulation techniques are available in data/techniques-en.md.
The two main dirs for the two tracks are:
To ensure fair and reproducible results:
- You may not use any Telegram data.
- You are allowed and encouraged to use other external data, but you must verify that the data license permits research use.
- You should use only open-source models in your solution. Proprietary models are allowed only for data generation.
- All code must be openly published for reproducibility (on GitHub, HF, etc.).
- Subtask 1 (Technique Classification):
  - Metric: Macro-F1 (for multilabel classification)
- Subtask 2 (Span Identification):
  - Metric: Token-level F1 (for binary span detection)
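For local validation, the two metrics can be approximated as in the sketch below. This is not the official scorer: the function names are hypothetical, F1 is assumed to be 0.0 when a label has no true or predicted positives, and the token-level score is assumed to be pooled over all tokens of all posts. The organizers' evaluation may differ in these details.

```python
def f1(tp, fp, fn):
    """F1 from counts; returns 0.0 when there are no positives at all (assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-technique F1 over multi-hot label rows."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        scores.append(f1(tp, fp, fn))
    return sum(scores) / n_labels


def token_f1(true_seqs, pred_seqs):
    """Binary F1 over tokens, pooled across all posts (assumption)."""
    tp = fp = fn = 0
    for t_seq, p_seq in zip(true_seqs, pred_seqs):
        for t, p in zip(t_seq, p_seq):
            tp += t == 1 and p == 1
            fp += t == 0 and p == 1
            fn += t == 1 and p == 0
    return f1(tp, fp, fn)


# Two posts, two techniques: per-label F1 is 2/3 and 1.0.
print(macro_f1([[1, 0], [1, 1]], [[1, 0], [0, 1]]))  # 0.8333...
# One post, four tokens: one hit, one false alarm, one miss.
print(token_f1([[0, 1, 1, 0]], [[0, 1, 0, 1]]))      # 0.5
```

Because macro averaging weights every technique equally, rare techniques affect the Subtask 1 score as much as frequent ones.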
TBD
Participants in the shared task are invited to submit a paper to the UNLP 2025 workshop. Please see the UNLP website for details. Accepted papers will appear in the ACL Anthology and will be presented at a UNLP 2025 session dedicated to the Shared Task.
Submitting a paper is not mandatory for participating in the Shared Task.
- December 20, 2024: Shared task announcement
- January 15, 2025: Release of train data
- January 27, 2025: Second call for participation
- March 23, 2025: Registration deadline
- March 31, 2025 (11:58 PM, GMT+02:00): Final submission of system responses
- April 4, 2025: Results of the Shared Task announced and release of test data
- April 14, 2025: Shared Task paper due
- May 12, 2025: Notification of acceptance
- June 2, 2025: Camera-ready Shared Task papers due
- July 31 or August 1, 2025: Workshop date
Email: [email protected], [email protected], [email protected], [email protected]