TY - GEN
T1 - Addressing contingency in algorithmic (mis)information classification
T2 - Toward a responsible machine learning agenda
AU - Domínguez Hernández, Andrés
AU - Owen, Richard
AU - Nielsen, Dan Saattrup
AU - McConville, Ryan
PY - 2022/10/5
Y1 - 2022/10/5
N2 - Machine learning (ML)-enabled classification models are becoming increasingly popular for tackling the sheer volume and speed of online misinformation and other content that could be identified as harmful. In building these models, data scientists need to take a stance on the legitimacy, authoritativeness and objectivity of the sources of "truth" used for model training and testing. This has political, ethical and epistemic implications that are rarely addressed in technical papers. Despite (and due to) their reported high accuracy and performance, ML-driven moderation systems have the potential to shape online public debate and create downstream negative impacts such as undue censorship and the reinforcement of false beliefs. Using collaborative ethnography and theoretical insights from social studies of science and expertise, we offer a critical analysis of the process of building ML models for (mis)information classification: we identify a series of algorithmic contingencies--key moments during model development that could lead to different future outcomes, uncertainty and harmful effects as these tools are deployed by social media platforms. We conclude by offering a tentative path toward reflexive and responsible development of ML tools for moderating misinformation and other harmful content online.
AB - Machine learning (ML)-enabled classification models are becoming increasingly popular for tackling the sheer volume and speed of online misinformation and other content that could be identified as harmful. In building these models, data scientists need to take a stance on the legitimacy, authoritativeness and objectivity of the sources of "truth" used for model training and testing. This has political, ethical and epistemic implications that are rarely addressed in technical papers. Despite (and due to) their reported high accuracy and performance, ML-driven moderation systems have the potential to shape online public debate and create downstream negative impacts such as undue censorship and the reinforcement of false beliefs. Using collaborative ethnography and theoretical insights from social studies of science and expertise, we offer a critical analysis of the process of building ML models for (mis)information classification: we identify a series of algorithmic contingencies--key moments during model development that could lead to different future outcomes, uncertainty and harmful effects as these tools are deployed by social media platforms. We conclude by offering a tentative path toward reflexive and responsible development of ML tools for moderating misinformation and other harmful content online.
KW - misinformation
KW - reflexivity
KW - content moderation
KW - fact-checking
KW - machine learning
KW - responsible innovation
U2 - 10.48550/arXiv.2210.09014
DO - 10.48550/arXiv.2210.09014
M3 - Conference Contribution (Conference Proceeding)
BT - ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT)
ER -