Hacker News

Where is this spelled out formally and proven logically?


LLM backtracking is an active area of research, see e.g.

https://arxiv.org/html/2502.04404v1

https://arxiv.org/abs/2306.05426

And I was wrong that nobody has implemented it; these papers show that people have. It's just that the results haven't been impressive enough to support the transition from the research lab to industrial use, at least not yet.
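One way to picture inference-time backtracking is as a depth-first search over continuations: extend the most promising candidate, and when every continuation of a prefix scores below some confidence threshold, abandon it and return to an earlier alternative. Below is a minimal sketch of that idea. The `STEP_SCORES` table is a hand-built stand-in for a model's token log-probabilities, and all names (`backtracking_decode`, the threshold value, the toy vocabulary) are hypothetical, purely for illustration; the papers above use far more sophisticated machinery.

```python
import math

# Toy "model": scores next-token candidates for each prefix.
# In a real system these would come from an LLM's log-probs;
# here it is a hand-built table (hypothetical, for illustration).
STEP_SCORES = {
    (): {"A": -0.1, "B": -0.3},
    ("A",): {"x": -2.5},            # tempting prefix that dead-ends
    ("B",): {"x": -0.4, "y": -0.6},
    ("B", "x"): {"end": -0.2},
    ("B", "y"): {"end": -0.9},
}

def backtracking_decode(max_len=4, threshold=-1.0):
    """Depth-first decode: extend the best-scoring candidate first,
    and backtrack to earlier alternatives when every continuation
    falls below the threshold (a stand-in for low model confidence)."""
    stack = [((), 0.0)]  # (prefix of tokens, cumulative score)
    while stack:
        prefix, score = stack.pop()
        if prefix and prefix[-1] == "end":
            return list(prefix), score
        if len(prefix) >= max_len:
            continue  # depth limit reached: backtrack
        candidates = STEP_SCORES.get(prefix, {})
        # Push worse candidates first so the best is popped next (LIFO).
        for tok, s in sorted(candidates.items(), key=lambda kv: kv[1]):
            if s >= threshold:  # prune clearly bad continuations
                stack.append((prefix + (tok,), score + s))
    return None, float("-inf")

seq, score = backtracking_decode()
print(seq, score)  # greedy would pick "A" first, then backtrack to "B"
```

Note that a purely greedy decoder would commit to the locally best token "A" and get stuck, since its only continuation is pruned; the stack lets the search recover the "B" branch instead.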


> Empirical evaluations demonstrate that our proposal significantly enhances the reasoning capabilities of LLMs, achieving a performance gain of over 40% compared to the optimal-path supervised fine-tuning method.


I would expect to see something like this fairly soon: right now we are seeing the end of training-time scaling and the beginning of inference-time scaling.


This is a neat observation: training has been optimized to hell, while inference optimization is just beginning.



