Title Deep Natural Language Feature Learning for Interpretable Prediction
Authors Felipe Urrutia, Cristian Buc, Valentín Barriere
Publication date 2023
Abstract We propose a general method to break down a main complex task into a set of easier intermediary sub-tasks, which are formulated in natural language as binary questions related to the final target task. Our method allows each example to be represented by a vector consisting of the answers to these questions. We call this representation Natural Language Learned Features (NLLF). NLLF is generated by a small transformer language model (e.g., BERT) that has been trained in a Natural Language Inference (NLI) fashion, using weak labels automatically obtained from a Large Language Model (LLM). We show that the LLM normally struggles with the main task when using in-context learning, but can handle these easier sub-tasks and produce useful weak labels to train a BERT. The NLI-style training of the BERT allows it to tackle zero-shot inference with any binary question, not only those seen during training. We show that this NLLF vector not only helps to reach better performance by enhancing any classifier, but can also be used as input to an easy-to-interpret machine learning model such as a decision tree. This decision tree is interpretable and still reaches high performance, surpassing that of a pre-trained transformer in some cases. We have successfully applied this method to two completely different tasks: detecting incoherence in students' answers to open-ended mathematics exam questions, and screening abstracts for a systematic literature review of scientific papers on climate change and agroecology.
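
A minimal sketch of the pipeline the abstract describes, for illustration only and not the authors' code: an off-the-shelf zero-shot NLI model (facebook/bart-large-mnli) stands in for the paper's weakly supervised BERT, the two binary sub-questions and the toy screening texts and labels are made-up placeholders, and the resulting NLLF vectors feed an interpretable decision tree. Assumes the transformers and scikit-learn packages are installed.

import numpy as np
from transformers import pipeline
from sklearn.tree import DecisionTreeClassifier

# Off-the-shelf NLI model used here in place of the paper's BERT trained on
# LLM-generated weak labels.
nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical binary sub-questions, phrased as NLI hypotheses.
SUB_QUESTIONS = [
    "This abstract mentions an agricultural practice.",
    "This abstract discusses the effects of climate change.",
]

def nllf_vector(text: str) -> np.ndarray:
    """Represent a text by the entailment score of each binary sub-question."""
    out = nli(text, candidate_labels=SUB_QUESTIONS,
              hypothesis_template="{}", multi_label=True)
    score = dict(zip(out["labels"], out["scores"]))
    return np.array([score[q] for q in SUB_QUESTIONS])

# Toy abstract-screening data: texts and binary relevance labels (placeholders).
texts = [
    "We study cover crops as a mitigation strategy under a warming climate.",
    "We prove a new bound for sorting networks.",
]
labels = [1, 0]

X = np.vstack([nllf_vector(t) for t in texts])
tree = DecisionTreeClassifier(max_depth=2).fit(X, labels)
print(tree.predict(X))

The entailment probabilities are used directly as features, and the shallow depth limit keeps the resulting tree readable, mirroring the interpretability argument made in the abstract.
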
Article pages 229
Conference name Empirical Methods in Natural Language Processing
Publisher Association for Computational Linguistics