Adaptive Regularization Techniques for Mitigating Overfitting in Large-Scale Language Model Tuning

Authors

  • Derek McAuley, School of Computer Science, University of Nottingham, UK
  • Tanja Mayer, Department of Computer Science, University of Luxembourg, Luxembourg

Abstract

As language models grow in scale and complexity, overfitting becomes a significant challenge when fine-tuning them for specific tasks. This paper explores adaptive regularization techniques for mitigating overfitting in large-scale language model tuning. We examine several approaches, including dropout, weight decay, and more advanced methods such as adaptive weight noise and differential privacy. By analyzing the impact of these techniques on model performance, we provide insights into how well they preserve generalization while maintaining task-specific accuracy.
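
The page does not include code, so the following is only a minimal PyTorch sketch of the general idea described in the abstract: fine-tuning with dropout and weight decay, with the weight-decay strength adapted from the train/validation gap as a rough overfitting signal. The module name TinyLMHead, the adapt_weight_decay helper, and all hyperparameter values are illustrative assumptions, not taken from the paper.

    import torch
    import torch.nn as nn

    # Illustrative hyperparameters (assumptions, not values from the paper).
    DROPOUT_P = 0.1
    BASE_WEIGHT_DECAY = 0.01

    class TinyLMHead(nn.Module):
        """Stand-in for the task-specific head of a fine-tuned language model,
        regularized with dropout."""
        def __init__(self, hidden=768, vocab=50257):
            super().__init__()
            self.dropout = nn.Dropout(DROPOUT_P)  # dropout regularization
            self.proj = nn.Linear(hidden, vocab)

        def forward(self, h):
            return self.proj(self.dropout(h))

    model = TinyLMHead()
    # AdamW applies decoupled weight decay, the second regularizer named above.
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5,
                                  weight_decay=BASE_WEIGHT_DECAY)

    def adapt_weight_decay(optimizer, train_loss, val_loss,
                           factor=1.5, min_wd=1e-4, max_wd=0.1, gap_threshold=0.1):
        """Naive adaptive rule: strengthen weight decay when the gap between
        validation and training loss widens (overfitting), relax it otherwise."""
        gap = val_loss - train_loss
        for group in optimizer.param_groups:
            wd = group["weight_decay"]
            if gap > gap_threshold:
                group["weight_decay"] = min(wd * factor, max_wd)
            else:
                group["weight_decay"] = max(wd / factor, min_wd)

    # Example call at the end of an epoch, with losses measured elsewhere.
    adapt_weight_decay(optimizer, train_loss=1.20, val_loss=1.45)

Adaptive weight noise and differentially private training would replace or augment the adapt_weight_decay step; the gap-based rule above is just one simple way to make the regularization strength adaptive.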

Published

2024-06-12
