Abstract
We ask whether state-of-the-art large language models can provide a viable alternative to human annotators for detecting and explaining behavioural influence online. Working with a large corpus of online interactions retrieved from the social media platform Mastodon, we cross-examine a dataset containing 11,000 influence labels and explanations generated by nine state-of-the-art large language models for 312 scenarios. We use a range of resolution categories and four stages of shot prompting to measure the importance of context to language model performance. We also consider the impact of model architecture, and how social media content and features of the explanation affect model labelling accuracy. Our experiment shows that whilst most large language models struggle to identify the correct framing of influence from an interaction, at lower label resolutions models like Flan and GPT-4 Turbo achieve an accuracy of 70-80%. This demonstrates encouraging potential for future social influence identification and explanation, and contributes to our understanding of the general social reasoning capabilities of large language models.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2024 AAAI Fall Symposia |
Publisher | AAAI Press |
Pages | 40-47 |
Number of pages | 8 |
ISBN (Electronic) | 9781577358947 |
Publication status | Published - 8 Nov 2024 |
Event | AI Trustworthiness and Risk Assessment for Challenged Contexts: AAAI 2024 Fall Symposium, Westin Arlington Gateway, Arlington, United States; Duration: 7 Nov 2024 → 9 Nov 2024; https://sites.google.com/view/aaai-atracc |
Publication series
Name | Proceedings of the AAAI Symposium Series |
---|---|
Publisher | AAAI |
Number | 1 |
Volume | 4 |
ISSN (Electronic) | 2994-4317 |
Conference
Conference | AI Trustworthiness and Risk Assessment for Challenged Contexts |
---|---|
Abbreviated title | ATRACC |
Country/Territory | United States |
City | Arlington |
Period | 7/11/24 → 9/11/24 |
Research Groups and Themes
- Cyber Security