Can you explain how 'recursive self-improvement' functions without 'endless benchmark chasing'? I mean, RSI is literally that.
What do you think they're improving on? How would a model self-improve without some kind of metric or data to check against? When you have metrics plus data, that is a benchmark. And yes, simulations and/or soft verification like LLM judges are still a kind of benchmarking. Maybe it's not a static benchmark they can easily hack.
Folks -- RSI does not mean the self-improvement is them going to therapy and seeking inner peace to overcome trauma.
This is a failure to engage with the clear arguments and claims made. If you don't want to debate something, why bother commenting at me? Was I supposed to just cede ground and accept your framing wholesale? I'm putting forth very clear, open-to-rebuttal assertions, which is what you should do.
Edit: Never mind, realized he's just an uneducated troll looking for his kicks. Comment flagged.