Title Are Text Classifiers Xenophobic? A Country-Oriented Bias Detection Method with Least Confounding Variables
Authors Valentín Barriere, Sebastian Cifuentes
Publication date 2024
Abstract Classical bias detection methods used in Machine Learning are
themselves biased because of the different confounding variables involved in
the assessment of the initial biases. First, they use templates that
are syntactically simple and distant from the target data on which the model
will be deployed. Second, current methods assess biases in pre-trained
language models or in datasets, but not directly in the fine-tuned classifier
that can actually produce harms. We propose a simple method to detect the
biases of a specific fine-tuned classifier on any type of unlabeled data.
The idea is to study the classifier behavior by creating counterfactual
examples directly on the target data distribution and quantify the amount of
changes. In this work, we focus on named entity perturbations: we apply
Named Entity Recognition to target-domain data and replace the detected
entities with the most common names or locations of a target group (gender
and country), across several morphosyntactically different languages
associated with the countries of the target groups. We apply our
method to two open-source models that are likely to be deployed by
industry, covering two tasks and domains. We first assess the bias of a
multilingual sentiment analysis model trained on multilingual tweets,
and then that of a multilingual stance recognition model
trained on several languages and evaluated on English. Finally,
we propose to link the perplexity of each example with the bias of the
model, by looking at the change in label distribution with respect to the
language of the target group. Our work offers a fine-grained analysis of the
interactions between names and languages, revealing significant biases in
multilingual models.
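
The counterfactual idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the classifier is a deliberately biased toy function, the entity mentions are assumed to be already detected (the Named Entity Recognition step is omitted), and every name here (`perturb`, `label_change_rate`, `toy_classifier`, the placeholder group names) is hypothetical.

```python
from typing import Callable, Dict, List


def perturb(text: str, entity: str, replacement: str) -> str:
    """Build a counterfactual by swapping one entity mention for another name."""
    return text.replace(entity, replacement)


def label_change_rate(
    classify: Callable[[str], str],
    texts: List[str],
    entities: List[str],
    names_by_group: Dict[str, List[str]],
) -> Dict[str, float]:
    """For each group, return the fraction of counterfactuals whose predicted
    label differs from the prediction on the original text."""
    rates = {}
    for group, names in names_by_group.items():
        changed = 0
        total = 0
        for text, entity in zip(texts, entities):
            original_label = classify(text)
            for name in names:
                total += 1
                if classify(perturb(text, entity, name)) != original_label:
                    changed += 1
        rates[group] = changed / total if total else 0.0
    return rates


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a fine-tuned classifier; biased on purpose."""
    return "negative" if "Name_B1" in text else "positive"


# Unlabeled target-domain texts with their (assumed pre-detected) person entity.
texts = ["Alex went to the market.", "I met Alex yesterday."]
entities = ["Alex", "Alex"]

# Placeholder name lists standing in for "most common names" of each group.
names_by_group = {
    "group_a": ["Name_A1", "Name_A2"],
    "group_b": ["Name_B1", "Name_B2"],
}

rates = label_change_rate(toy_classifier, texts, entities, names_by_group)
print(rates)
```

A higher change rate for one group than another suggests that the classifier's output depends on the substituted names. The abstract speaks more broadly of quantifying changes in the label distribution, of which this flip rate is only the simplest variant.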
Pages 1511-1518
Conference name International Conference on Computational Linguistics
Publisher Association for Computational Linguistics