Adaptive Regularization Techniques for Mitigating Overfitting in Large-Scale Language Model Tuning

Authors

  • Derek McAuley, School of Computer Science, University of Nottingham, UK
  • Tanja Mayer, Department of Computer Science, University of Luxembourg, Luxembourg

Abstract

As language models grow in scale and complexity, overfitting becomes a significant challenge when fine-tuning them for specific tasks. This paper explores adaptive regularization techniques for mitigating overfitting in large-scale language model tuning. We examine several approaches, including dropout, weight decay, and more advanced methods such as adaptive weight noise and differential privacy. By analyzing the impact of these techniques on model performance, we provide insights into how well they preserve generalization while maintaining task-specific accuracy.
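
The page does not include code, so the following is only a minimal PyTorch sketch of the general idea described in the abstract: fine-tuning with dropout and weight decay, with the weight-decay strength adapted from the train/validation gap as a rough overfitting signal. The module name TinyLMHead, the adapt_weight_decay helper, and all hyperparameter values are illustrative assumptions, not taken from the paper.

    import torch
    import torch.nn as nn

    # Illustrative hyperparameters (assumptions, not values from the paper).
    DROPOUT_P = 0.1
    BASE_WEIGHT_DECAY = 0.01

    class TinyLMHead(nn.Module):
        """Stand-in for the task-specific head of a fine-tuned language model,
        regularized with dropout."""
        def __init__(self, hidden=768, vocab=50257):
            super().__init__()
            self.dropout = nn.Dropout(DROPOUT_P)  # dropout regularization
            self.proj = nn.Linear(hidden, vocab)

        def forward(self, h):
            return self.proj(self.dropout(h))

    model = TinyLMHead()
    # AdamW applies decoupled weight decay, the second regularizer named above.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5,
                                  weight_decay=BASE_WEIGHT_DECAY)

    def adapt_weight_decay(optimizer, train_loss, val_loss,
                           factor=1.5, min_wd=1e-4, max_wd=0.1, gap_threshold=0.1):
        """Naive adaptive rule: strengthen weight decay when the gap between
        validation and training loss widens (overfitting), relax it otherwise."""
        gap = val_loss - train_loss
        for group in optimizer.param_groups:
            wd = group["weight_decay"]
            if gap > gap_threshold:
                group["weight_decay"] = min(wd * factor, max_wd)
            else:
                group["weight_decay"] = max(wd / factor, min_wd)

    # Example call at the end of an epoch, with losses measured elsewhere.
    adapt_weight_decay(optimizer, train_loss=1.20, val_loss=1.45)

Adaptive weight noise and differentially private training would replace or augment the adapt_weight_decay step; the gap-based rule above is just one simple way to make the regularization strength adaptive.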

Published

2024-06-12
