Detailed Information on Publication Record
2024
Self-Training Language Models in Arithmetic Reasoning
KADLČÍK, Marek, Michal ŠTEFÁNIK, Ondřej SOTOLÁŘ and Vlastimil MARTINEK
Basic information
Original name
Self-Training Language Models in Arithmetic Reasoning
Authors
KADLČÍK, Marek, Michal ŠTEFÁNIK, Ondřej SOTOLÁŘ and Vlastimil MARTINEK
Edition
ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024
Other information
Language
English
Type of outcome
Conference presentation
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
Czech Republic
Confidentiality degree
is not subject to a state or trade secret
Organization unit
Faculty of Informatics
Keywords in English
language models, arithmetical reasoning, self-training, preference optimisation
Changed: 12/6/2024 14:45, Mgr. Michal Štefánik
Abstract
In the original
Recent works show the impressive effectiveness of an agent framework in solving problems with language models. In this work, we apply two key features from the framework, interaction with tools and goal-oriented training, to improve models' arithmetical reasoning. First, we curate and transform existing datasets to create Calc-X, a standardized collection with over 300,000 problems with step-by-step solutions. We use Calc-X to train models we call Calcformers that interact with a calculator during inference. Calcformers achieve twice the accuracy of standard baselines. Finally, we optimize Calcformers via self-training using preference optimization and supervised loss by checking the model's predicted results. We find that self-training can achieve substantial improvements on out-of-domain problems and that traditional supervised loss is a strong baseline for preference optimization. Our results show that preference optimization converges faster and isn't prone to forgetting pre-trained abilities.
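The sketch below illustrates the self-training idea described in the abstract: sample several step-by-step candidate solutions, check each predicted final result against the reference answer, and use the verified solutions either as a supervised training set or as (chosen, rejected) pairs for preference optimisation. It is a minimal illustration only; the function names, the "Answer:" output format, and the data structures are assumptions for this sketch, not the authors' actual code.

# Minimal sketch of result-checked self-training data collection.
# All names and the answer format are illustrative assumptions.
import random
from dataclasses import dataclass

@dataclass
class Problem:
    question: str
    answer: str  # gold final result, used only to check model predictions

def generate_candidates(model, question, n=4):
    """Sample n candidate step-by-step solutions from a callable model."""
    return [model(question) for _ in range(n)]

def final_result(solution: str) -> str:
    """Extract the predicted final result, assumed to follow 'Answer:'."""
    return solution.rsplit("Answer:", 1)[-1].strip()

def build_self_training_data(model, problems):
    supervised, preference_pairs = [], []
    for p in problems:
        candidates = generate_candidates(model, p.question)
        correct = [c for c in candidates if final_result(c) == p.answer]
        wrong = [c for c in candidates if final_result(c) != p.answer]
        # Supervised-loss variant: keep only verified correct solutions.
        supervised.extend((p.question, c) for c in correct)
        # Preference-optimisation variant: pair a correct with an incorrect one.
        if correct and wrong:
            preference_pairs.append(
                (p.question, random.choice(correct), random.choice(wrong))
            )
    return supervised, preference_pairs

Either output can then drive a further training round: the supervised set with a standard cross-entropy loss, or the preference pairs with a preference-optimisation objective, mirroring the two self-training variants compared in the paper.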
Links
MUNI/A/1590/2023, internal MU code