This is not what I said. Sure, you can use such tools, or even just classical boilerplate scripts (like we have used for a decade now), to get started with React fast. But building out a system that fails well when the underlying LLM starts behaving erratically, or stops responding at all, is a completely different league of engineering than executing a boilerplate script.
Sorry for misinterpreting you. So the underlying LLM starts misbehaving, and the difficulty you see is that the system as a whole should fail gracefully. What would that look like, in your eyes? A proctor LLM (or whatever) that observes the output, decides that it has gone awry, and takes over?