New research papers about self-improving AI systems come at the same time as agentic AI is growing more capable, and predictions from AI leaders grow more dire. What does it all mean?
In the near term, self-improvement really just means having AI make us better at making AI. That puts self-improvement another major step away from current reasoning models. It's abstract, and also something I think many of the AGI faithful overestimate.
Yeah, it seems very far from recursive self-improvement to me as well. I could see something like Sakana rewriting its own code helping -- at the very least it provides a mechanism. But it seems like it would be prone to all the same issues we see with reinforcement learning, and the results -- as with most clever ideas -- just don't impress yet.
In the end, it seems like truly self-improving AI will still take a long time to arrive.
How does Google's AlphaEvolve fit in here? It seems maybe similar to Sakana's system? I'm not sure if there are any important differences, though I guess we might not know enough about AlphaEvolve.
I don't know much about AlphaEvolve in particular. But most of the Alpha* systems have been solving noticeably different problems: they're neurosymbolic, they have a good amount of domain knowledge baked in, and they're solving very concretely defined problems. That doesn't fit the nebulous definition of an AI research agent, in my opinion; while such a system can drive up metrics, those metrics are disconnected from actual utility to a larger degree (and honestly, none of the work I reviewed was doing anything massive yet).
I think in the end we just need to see if it scales.
AlphaEvolve is different in its generality: it can optimize any algorithm that has some measure of "success". In that respect it's the same as the methods you mentioned, since an eval is always necessary. Google has already used it to improve Gemini (to speed up inference).
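The core loop behind that kind of system -- propose a modified candidate, score it with an eval, keep the best -- is just evolutionary search. Here's a minimal sketch of that pattern; the `propose` step is a random-mutation stub standing in for an LLM proposing code changes, and the toy `evaluate` function stands in for a real benchmark (all names here are made up for illustration, not AlphaEvolve's actual API):

```python
import random

def evaluate(params):
    # Toy "success" measure the search maximizes. In an AlphaEvolve-style
    # system this would run the candidate program and benchmark it
    # (e.g. inference speed); here it's a simple function peaking at (3, -1).
    x, y = params
    return -(x - 3) ** 2 - (y + 1) ** 2

def propose(parent):
    # Stand-in for the LLM proposing a modified candidate.
    # A real system mutates source code; here we just jitter numbers.
    return tuple(p + random.gauss(0, 0.5) for p in parent)

def evolve(seed, generations=200, pool_size=8):
    random.seed(0)  # fixed seed so the sketch is reproducible
    population = [(evaluate(seed), seed)]
    for _ in range(generations):
        _, parent = max(population)          # pick the current best
        child = propose(parent)              # mutate it
        population.append((evaluate(child), child))
        population = sorted(population, reverse=True)[:pool_size]  # cull
    return max(population)

best_score, best_params = evolve((0.0, 0.0))
print(best_score, best_params)
```

The point of the sketch is the generality you mention: nothing in the loop cares what the candidate is, only that `evaluate` returns a number -- which is why the whole approach hinges on having a good eval.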
I think Google is worth watching closely here, because they have both a frontier model and an agent scaffold that can optimize it. As you say, Sakana's system might benefit a lot from a better base LLM. That is basically what AlphaEvolve is (it's built on Gemini).
I asked ChatGPT to weigh in on this question the day before you posted this article.
https://open.substack.com/pub/stevenscesa/p/what-does-chatgpt-v41-think-about?r=28v6pr&utm_medium=ios
The blog post is here: https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
Thanks, I really should use the LLM to edit more.
They’re quite good. Mostly better than Grammarly.