Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Finite State Machines with Python Coroutines (arpitbhayani.me)
80 points by arpitbbhayani on April 19, 2020 | hide | past | favorite | 15 comments


Perhaps I'm missing it, but the only reason I can find in TFA for using coroutines here is that it's the most intuitive way of writing this FSM class.

Is it really more intuitive than something like:

    class FSM:
        def __init__(self):
            self.current_state = self._start
            self.stopped = False
        
        def send(self, char):
            self.current_state(char)
        
        def does_match(self):
            if self.stopped:
                return False
            return self.current_state == self._q3

        ...

        def _q2(self, char):
            if char == 'b':
                return

            if char == 'c'
                self.current_state = self._q3

            self.stopped = True

        ...
(Using names 'send' etc. from the article just for easier comparison.)

This is basically the same except not using co-routines, and I don't see what advantage using them brings?

Not using them has a clear advantage (IMO) that we don't have to initialise each state (`_create_q2()`) - which just seems strange to read, and a recipe for forgetting ('I just added a state for this, why are we still seeing this error...') since it essentially means a state is 'declared' in init, and defined in a method. But this is Python, and a compiler won't complain. Neither will your linter know^ that you should have called that method in `__init__`.

^(OK, you _might_ have a check that it's used anywhere ever, but there's a lot of good reasons for not having that - library code, abstractions, linter not clever enough to work out you have called things in a non-straightforward way, etc.)


An even better example is with the coroutine FSM by the same author here https://github.com/arpitbbhayani/fsm/blob/master/divisibilit... . It is cleaned up a lot by using functions instead of coroutines. Here each 'state' function takes a digit as input and returns the next state:

    class Divisibility3FSM:
        def __init__(self):
            self.current_state = self.init
            
        def send(self, digit):
            self.current_state = self.current_state(digit)
            
        def is_divisible(self):
            return self.current_state == self.q0

        def init(self, digit):
            if digit in [0, 3, 6, 9]:
                return self.q0
            elif digit in [1, 4, 7]:
                return self.q1
            elif digit in [2, 5, 8]:
                return self.q2

        def q0(self, digit):
            if digit in [0, 3, 6, 9]:
                return self.q0
            elif digit in [1, 4, 7]:
                return self.q1
            elif digit in [2, 5, 8]:
                return self.q2

        def q1(self, digit):
            if digit in [0, 3, 6, 9]:
                return self.q1
            elif digit in [1, 4, 7]:
                return self.q2
            elif digit in [2, 5, 8]:
                return self.q0

        def q2(self, digit):
            if digit in [0, 3, 6, 9]:
                return self.q2
            elif digit in [1, 4, 7]:
                return self.q0
            elif digit in [2, 5, 8]:
                return self.q1


I don't get it either. Normally you'd use coroutines to have an implicit state machine, without writing out explicit states.


Yes the regex-parsing FSM class in the article looks cleaner with a single coroutine:

    class FSM:
        def __init__(self):
            self.match = False
            self.current_state = self.create_state_machine()
            next(self.current_state)

        def does_match(self):
            return self.match  

        def send(self, char):
            try:
                self.current_state.send(char)
            except StopIteration:
                self.match = False
                pass

        def create_state_machine(self):
            char = yield
            if char == 'a':
                while True:
                    char = yield
                    if char == 'b':
                        pass
                    elif char == 'c':
                        self.match = True
                        yield
                        break
                    else:
                        break


That makes more sense. And we could still break out states into functions without being quite so explicit as in the article (and without the __init__ problem I described):

    def create_state_machine(self):
        if yield == 'a':
            yield from self.q2()

    def q2(self):
        while (char := yield) == 'b':
            pass

        if char == 'c':
            self.match = True
Or something like that anyway, untested of course.

(NB that actually fixes a bug in yours in that it would continue processing 'b's after the 'c' - of course that's still a match for the regex, but the FSM should have stopped.)


Thanks for pointing that out, I think I fixed it. Not well tested ;)

I'm not sure if the way that you're using yield within the if and while statements works correctly. I like the idea of replacing state transitions with method calls but you'll run into recursion depth problems with a state machine that alternates between two states.


> I'm not sure if the way that you're using yield within the if and while statements works correctly.

You're right...

Perhaps I'm not so surprised that `if yield == val:` is a SyntaxError, but that `yield` can't be used as an rvalue to a walrus operator (that the `while` example essentially boils down to) surprises me:

    >>> char = yield
      File "<stdin>", line 1
    SyntaxError: 'yield' outside function
    >>>
    >>> # i.e. OK, that's the error we want if it works
    >>>
    >>> got_b = (char := yield) == 'b'
      File "<stdin>", line 1
        got_b = (char := yield) == 'b'
                         ^
    SyntaxError: invalid syntax


You can use Brzozowski's derivatives of regular expressions[1] to lazily get a state machine from a RE. I made a simple FSM using a trampoline function[2] to imitate a kind of table-drive GOTO design.[3]

[1] "Derivatives of Regular Expressions", Janusz A. Brzozowski https://dl.acm.org/doi/10.1145/321239.321249

[2] https://en.wikipedia.org/wiki/Trampoline_(computing)#High-le...

[3] http://joypy.osdn.io/notebooks/Derivatives_of_Regular_Expres...


@arpitbbhayani - this is very interesting. you are one step away from building "reliable functions" - the basis of Azure Durable Functions and Uber Cadence.

https://docs.microsoft.com/en-us/azure/azure-functions/durab...

A durable function is a state machine that can be replayed.


In newer version of Python, you can send values to Generators with generator.send(), which makes python Generators effectively coroutines.


Generator send() was actually added way back in Python 2.5! [1]

[1] https://docs.python.org/2.5/whatsnew/pep-342.html


For even more fun, use async def + await, and you save yourself the while loop since that what the asyncio.run() does for you.


asyncio will get confused as to why youre yielding random things and it won't work


No, you just wrap it in a Future.


So now you have even more confusing code. Good work!




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: