How does it do on performance/efficiency? (Lua is pretty low overhead, but I guess for something embedded like this, latency in responding to an input would make more sense than a "normal" benchmark?)
I haven't run any benchmarks yet, and TBH performance isn't a high priority, but event dispatch latency is currently fairly bad: between 100 and 150 µs with the Pico overclocked to 250 MHz. It feels like it should be possible to bring it down to <30 µs. I suspect the bad latency is due to my very naive implementation of thread scheduling and timers.
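
For a rough sense of scale, those latencies can be converted into CPU cycles. This is only a back-of-envelope sketch: `us_to_cycles` is a made-up helper, and it assumes the quoted figures were all measured at the 250 MHz overclock.

```python
def us_to_cycles(latency_us: float, clock_mhz: float) -> int:
    """Convert a latency in microseconds to CPU cycles at the given clock."""
    # 1 MHz is one cycle per microsecond, so the product is a cycle count.
    return round(latency_us * clock_mhz)


CLOCK_MHZ = 250  # overclocked Pico, as quoted above (assumed for all figures)

for label, latency_us in [("current (low)", 100), ("current (high)", 150), ("target", 30)]:
    cycles = us_to_cycles(latency_us, CLOCK_MHZ)
    print(f"{label}: {latency_us} us is about {cycles} cycles")
```

So the current dispatch path costs somewhere around 25,000 to 37,500 cycles per event, and the <30 µs goal corresponds to a budget of under 7,500 cycles.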