Shared publicly  - 
 
Nice lexing approach. I wonder if it's possible to do in vanilla Python, or I'd need something like Stackless for it
3
1
Elazar Leibovich's profile photoPiet Delport's profile photoEli Bendersky's profile photoShane Jensen's profile photo
13 comments
 
Everything possible with channels (which does not require the scheduler), is possible with python's yield. And yes, it's an excellent talk. Look at "concurrent is not parallel" if you liked this, also have a look at my humble slides: http://www.lambda.org.il/meetings/past/mpgsms22-gw interesting part from slide 40.
 
+Elazar Leibovich the problem is that his state functions both put stuff in channels (~ yield) and return the next state function. Maybe one can cook something up with PEP 380
 
I'll give it a look this evening. I can only tell, that some examples in my slides, are definitely not possible with single thread + coroutines ;-)
 
I don't think you need generators to do this in Python, nor something like Stackless.

The first version of the talk's lexer, using goroutines, should correspond to a generator-based coroutine sending parsed items over a Queue, in Python.

The second version of the lexer, modified to work without goroutines, would involve the same modifications in Python: removing the use of generators, adding a length to the Queue (turning it into a ring buffer), and using the modified nextItem() state machine runner / trampoline.
 
You can do without "yield from": it's a convenient syntax, but doesn't provide fundamentally new functionality.

(For Python < 3.3, there are a bunch of libraries that offer helpers to write code that looks essentially the same as with "field from"; a large part of PEP 380's motivation is to obviate them, by making it built-in syntax.)
 
+Piet Delport thinking about this a bit more, I'm not sure you right. A size-limited Queue will actually block, forcing you to do the pulling (i.e. a parser) in another thread.

Therefore the "channel" passing stuff to the caller should be via Python's generator interface - caller pulls a new token by calling .next() (or .send()), which advances the lexing coroutine somewhere.

Do you have code where you tried to implement this?
 
As far as i understand, a Go channel will block in the same way as a Queue. The second version of the algorithm works by emptying and returning the pending items from the channel/queue before each state function invocation: the only assurance you need is that the channel/queue is at least as large as the maximum number of items that any given state function will enqueue at one time. (That is the magic number 2 that Rob Pike refers to, for this lexer.)

As Rob Pike explains in the Q&A, the second version does not require that the channel/queue be size-limited, and in fact does not require it to be a channel/queue: you could replace it with a plain list, at this point.

I haven't tried implementing this, though. Caveat emptor. :)
 
+Piet Delport OK, enough babbling, time to code ;-)

Here is a first attempt - partial, but addresses the core issues: https://gist.github.com/3293777

It needs Python 3.3 (for PEP 380 "yield from"). All in all it looks quite similar to Pike's Go code. PEP 380 allows to cleanly mix the two paths (returning next state function on one hand, and yielding tokens on the other hand).

I'll write a blog post once I have a fuller implementation.
 
+Eli Bendersky: Cool. :)

I forked and added some variations to your gist (using generators, threads, and no concurrency), to show what i mean.

In the original version, the use of yielding (instead of queues/channels) to convey tokens is a red herring, i think: it bears no relation to the Go code. The way to think about it is that for generator-based tasks / coroutines, `yield` is only a way for generators to signal their scheduler, not a way to convey data in general.
 
+Piet Delport Nicely done. In the meantime I've completed more functionality based on my approach and I like the way it came out. I agree it's not really parallel to Pike's Go code, but it is Pythonic which is more important. I'll mention your alternatives when I write about it.
Add a comment...