Just tried qwen3.6-35b-a3b-bf16 + omlx, running a pi session that uses my HN cli to do a sentiment analysis on this story and the opus4.7 story. I'm getting ~40 tok/s on an M3 Ultra Mac Studio, and the tool use consistency has held up well. Even past 100k tokens, the session was still going strong. Here is the full sentiment analysis report it produced:
https://gist.github.com/duh17/2db5351da026cec4bd4f46e169e75e...
Here is the full session:
https://pi.dev/session/#c3d003becb1bfcc7ffbca04e89e1adf8
This is by far my smoothest agentic session using a local model of any size. The output quality and speed have really struck the right balance. Very impressive release.