- I've also noticed the issue you raise in this article. It's not unlike the need for a user to determine how many workers and input partitions to use for a given MapReduce job. It might be possible to choose these parameters automatically based on heuristics or historical measurement (or even dynamically at runtime), but it's not obvious how to do this in general.
Alternate versions of your example to consider:
- start a static number of goroutines, then feed them indexes of x to process via a single shared channel, then close the channel. The goroutines range over the channel, processing elements of x, then exit when the channel is closed, signaling the WaitGroup.
- alter your original version, or the one just described, so that each goroutine processes a block of several indexes of x at a time, instead of just one. The block size is another magic parameter that would be nice to have set automatically.
Building a parallel data processing system (like MapReduce) atop Go might work, because then the structure of the computation is constrained enough that the system or library can set these parameters automatically.
Feb 10, 2013
- Good catch. Fixed it. To be precise: the second version is intended to spawn as many goroutines as there are elements in x, but the third one is not. The third also had a minor bug: it spawned len(x) goroutines and then had most of them block waiting for a CPU. So I fixed the third one.
Feb 13, 2013
- I feel like your example and Rob's talk still lead with concurrency too much. They separate concurrency and parallelism, but then make the mistake of using concurrency as the only tool to enable parallelism. This is an outcome of Go's parallelism support, which is limited to goroutines.
There are other ways to expose parallelism that are arguably better, since they avoid the dangers of concurrency and instead simply communicate the intended parallelism to the compiler. Data parallelism is one example. Haskell is probably the language that best supports these alternate ways of expressing parallelism without resorting to concurrency.
I.e., for a similar point to Rob's but expanding on the above see:
Or a (long -- 1 hour) video: http://yow.eventer.com/events/1004/talks/1055
May 30, 2013