Why isn’t continual fine-tuning of transformers on individual user corpora more common?
Two reasons: 1) it loses a lot of the scale advantages you get from ML systems right now, since per-user training is much more expensive than inference with all the serving tricks people have figured out, and 2) it has the "infinite data" issue mentioned in the article: either you need an insane amount of storage per user (effectively a separate copy of the weights for everyone), or you run into catastrophic forgetting.
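To make the storage point concrete, here's a rough back-of-envelope sketch. The model size, precision, and user count are illustrative assumptions, not numbers from the article:

```python
# Back-of-envelope: cost of storing a fully fine-tuned checkpoint per user.
# All numbers below are illustrative assumptions.
params = 7e9              # assume a 7B-parameter model
bytes_per_param = 2       # fp16/bf16 weights
users = 1_000_000         # assume a million users

per_user_gb = params * bytes_per_param / 1e9
total_pb = per_user_gb * users / 1e6

print(f"{per_user_gb:.0f} GB per user, ~{total_pb:.0f} PB for {users:,} users")
# -> 14 GB per user, ~14 PB total, and that's before any optimizer state
```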
The analogy to how the human brain handles memory at different timescales is really striking here. Most attempts at lifelong learning feel forced, but this nested optimization approach seems much more natural. I wonder whether the trade-off in computational cost will be worth it, though, since transformers are so efficient at inference right now.
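Roughly what I take the "different timescales" part to mean, as a minimal toy sketch. The two-level fast/slow split, the consolidation interval, and the learning rates are my own illustrative assumptions, not the article's actual algorithm:

```python
import numpy as np

# Toy nested optimization at two timescales:
# "fast" weights update every step (working memory),
# "slow" weights consolidate only every K steps (long-term memory).
rng = np.random.default_rng(0)
dim = 8
fast = np.zeros(dim)          # fast weights, adapt quickly
slow = np.zeros(dim)          # slow weights, change on a longer clock
K = 50                        # consolidation interval (outer-loop timescale)
lr_fast, lr_slow = 0.02, 0.05

def loss_grad(w, x, y):
    # gradient of squared error for a linear model y ~ w.x
    return 2 * (w @ x - y) * x

for step in range(1, 1001):
    x = rng.normal(size=dim)
    y = x.sum()                              # toy target: true weights are all ones
    # inner loop: fast weights adapt on top of the slow ones at every step
    g = loss_grad(slow + fast, x, y)
    fast -= lr_fast * g
    # outer loop: every K steps, fold part of the fast adaptation into the slow weights
    if step % K == 0:
        slow += lr_slow * fast               # slow consolidation of what was learned
        fast *= 0.5                          # partial decay of the working memory

print("combined weights:", np.round(slow + fast, 2))
print("slow (consolidated) weights:", np.round(slow, 2))
```

The point is just the two-clock structure: the fast weights track recent experience while the slow weights only absorb it gradually, which is (very loosely) where the working-memory vs. long-term-memory analogy comes from. The real methods are obviously far more involved.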
Oh yeah, I agree, it's really cool and compelling. I could even imagine tweaking the learning rates lower as your system "ages", which almost certainly happens in humans... but it's very new, and it will probably take a long time to compete with transformers, if ever.
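Something like this toy schedule is what I have in mind; the functional form and constants are purely illustrative:

```python
# Toy age-dependent learning-rate schedule: plasticity starts high and
# decays as the system accumulates experience. Not from the article.
def lr_at_age(step: int, lr0: float = 0.1, tau: float = 10_000) -> float:
    """Inverse-time decay: high plasticity early, increasingly conservative later."""
    return lr0 / (1.0 + step / tau)

for step in (0, 1_000, 10_000, 100_000, 1_000_000):
    print(f"step {step:>9,}: lr = {lr_at_age(step):.5f}")
# lr falls from 0.10 toward ~0.001 as the system "ages",
# so later experience perturbs consolidated memory less and less.
```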