jslade


Total Posts: 1246 
Joined: Feb 2007 


https://dsp.rice.edu/2021/09/07/afarewelltothebiasvariancetradeoffanoverviewofthetheoryofoverparameterizedmachinelearning/
Huge if true 
"Learning, n. The kind of ignorance distinguishing the studious." 


nikol


Total Posts: 1403 
Joined: Jun 2005 


Thanks.
You embrace relativistic principles and say farewell to Newtonian mechanics, while I am still living in the village, believing in a flat world resting on three elephants, with a whale underneath. 
... What is a man
If his chief good and market of his time
Be but to sleep and feed? (c) 

doomanx


Total Posts: 115 
Joined: Jul 2018 


I saw a talk on this a while ago by Hastie, who presented this paper: https://arxiv.org/pdf/1903.08560.pdf (it shows the same double-descent phenomenon in least squares). It's not surprising that a large amount of this analysis is based on random matrix theory (RMT), which deals with the case of p features >> n data points.
The intuition is this: if you have a linear system Ax = b and you're solving for the minimum-l2-norm solution, then as p grows A has more columns, so we can generally decrease the components of x (distributing the value over the coefficients of the extra columns). What this does is regularise the coefficient vector more, hence the better generalisation. What's interesting is that you have to go through the classic underfit -> plateau -> overfit cycle to reach this regime. 
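The shrinking-norm intuition above is easy to check numerically. A quick sketch (my own toy setup, not from the paper): fix n equations, grow the number of columns p, and take the minimum-l2-norm solution via the pseudoinverse. The norm of x drops as p increases.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                          # fixed number of equations (data points)
b = rng.normal(size=n)

# For p > n the system Ax = b is underdetermined, and np.linalg.pinv
# returns the solution with the smallest ||x||_2. As p grows, the mass
# of x spreads over more coordinates and the norm shrinks.
norms = []
for p in [50, 100, 400, 1600]:
    A = rng.normal(size=(n, p))
    x = np.linalg.pinv(A) @ b   # minimum-norm least-squares solution
    norms.append(np.linalg.norm(x))
    print(p, norms[-1])
```

For Gaussian A the min-norm solution is x = A^T (A A^T)^{-1} b, and since A A^T concentrates around p * I, the norm decays roughly like 1/sqrt(p).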
did you use VWAP or triple-reinforced GAN execution?



sloppy


Total Posts: 9 
Joined: May 2011 


Have a look at Section 7.1: overparameterized (interpolating) solutions are dominated by optimally regularized lasso or ridge regression.
"These results suggest that optimally tuned regularization will always dominate interpolation. From this perspective, the recent flurry of results show that interpolation is relatively harmless, rather than being relatively beneficial." 

