How to design software and APIs for parallelism on modern hardware?
http://www.meetup.com/de-DE/ThoughtWorksKoeln/events/228082419/
By coincidence, Richard A. Parker is in Cologne and is available for a meetup talk. He is a mathematician and freelance computer programmer in Cambridge, England. He invented many of the algorithms for computing the modular character tables of finite simple groups. He discovered the relation between Niemeier lattices and deep holes of the Leech lattice, and constructed Parker's Moufang loop of order 2^13 (which was used by John Horton Conway in his construction of the monster group).
I see only three techniques capable of speeding up well-written but old software by a factor of ten or more, namely:
* Using multiple cores
* Making good use of cache memory
* Using the full power of a single processor
So far, commercial software seems to have ignored recent changes in hardware. Often, performance does not really matter, but when it does, achieving it requires changes to the design and coding of the software.
I have been looking into the changes needed for several years now, and one aspect is gradually emerging as more important than all the others.
If functions, subroutines, methods, etc. operate on a large number of items at once, rather than just one, it becomes possible to write clever implementations that run much faster. Several examples of this will be discussed.
I suggest that software designed and written today should take this into account, making it easier to speed it up if and when it becomes necessary to do so.