TY - JOUR
T1 - AdaFlood: Adaptive Flood Regularization
AU - Bae, Wonho
AU - Ren, Yi
AU - Ahmed, Mohamad Osama
AU - Tung, Frederick
AU - Sutherland, Danica J
AU - Leivas Oliveira, Gabriel
PY - 2024/8/30
Y1 - 2024/8/30
N2 - Although neural networks are conventionally optimized towards zero training loss, it has been recently learned that targeting a non-zero training loss threshold, referred to as a flood level, often enables better test time generalization. Current approaches, however, apply the same constant flood level to all training samples, which inherently assumes all the samples have the same difficulty. We present AdaFlood, a novel flood regularization method that adapts the flood level of each training sample according to the difficulty of the sample. Intuitively, since training samples are not equal in difficulty, the target training loss should be conditioned on the instance. Experiments on datasets covering four diverse input modalities—text, images, asynchronous event sequences, and tabular—demonstrate the versatility of AdaFlood across data domains and noise levels.
M3 - Article (Academic Journal)
SN - 2835-8856
JO - Transactions on Machine Learning Research (TMLR)
JF - Transactions on Machine Learning Research (TMLR)
ER -