STAT Colloquia - Michael Mahoney - Shared screen with speaker view
B.J. Fregly
32:04
Out of curiosity, what happens if you start lambda (the penalty term weight) out small and gradually increase it? Is the result similar to curriculum regularization?
Zhenwei Dai
44:47
What about increasing the hidden dimension when you cut delta t?
d b
59:51
So injecting random noise helps to generalize better?