Aargh. Tempted by what I thought looked like an opportunity for a huge optimisation in the Hoopl library, I spent a day or so coding up the solution and making it work, only to discover that it was worth a paulty 10% or so. I still need to find a factor of 3 from somewhere to get the Hoopl-based backend in GHC into the ballpark.
3 plus ones
Shared publicly•View activity
View 5 previous comments
- you stand accused of over-abstraction. How do you plead? (I'll accept "insanity, your honour")Jan 13, 2012
- +Simon Marlow, I plead that I fell in with a bad crowd. Something about "wholemeal programming" and "dictionaries are first-class values."
Seriously, a lot of compromise went into the Hoopl library, and I'd love to see where there might be opportunities to improve the performance without making too many compromises around reuse. Our term starts next week and I have to get ready for POPL, but maybe we can get together post Jan 30? Are you coming to POPL? If not, perhaps you can prime my unindicted co-conspirator with all the details?
take note.Jan 13, 2012
- Is the major offender the cps or the abstraction over node types? One of those has implications for potential clients; the other is "just" a matter of implementation.Jan 13, 2012
- I'm a bit stumped actually. The current situation is that the Hoopl-based code generator in GHC takes more than 10x as long as the old code generator, which translates into a factor of 2 or so overhead in total compilation time. And that's the current state after I've been beating on it for the last couple of days. I expect to take a hit, but that's too much. Profiling tells me it's all in hoopl's dataflow engine. I'm eager to get working on the backend itself, but this performance problem is really a blocker.
So here's what I've done so far. I rewrote the fixpoint algorithm, the idea being to re-analyse fewer blocks. A quick measurement on a GHC compile shows my version anaylsing about 20% fewer blocks. I'm not sure my version is overall better, it may have bigger overheads elsewhere, but it is shorter than the original. I'll send you this when I've tidied it up.
I made the representation change to Blocks that I sent you a while back, aiming to make entryLabel and successors O(1). It doesn't make any measurable difference to performance in GHC, but I like it.
I made a few minor tweaks to Dataflow.hs, but ran out of ideas. Then out of desperation I copied Dataflow.hs into GHC and specialised it to the FuelUniqSM monad. This helped, but not enough.
Do you have any ideas for things I might try? At this stage I'm a bit stumped.
Sadly I won't be at POPL. Are you going to WG2.8?Jan 13, 2012
- http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/HooplPerformance. If we keep track of what we've done, perhaps we can publish something.Our term starts in a week and I'm a bit blocked at the moment. I did put up some memories at
Do you have any kind of infrastructure one can use to probe these performance problems? I have zero experience.
Also, it's possible that +Sorin Lerner, +Craig Chambers, or +David Grove would have something to say about the performance of their algorithm in the Vortex/Whirlwind compilers.... (none of them seem to be on G+).Jan 14, 2012
- Thanks for putting together the wiki page! I shall keep trying, and keep track of what changes I make. I haven't built any infrastructure for measuring performance yet, but I think it would be worthwhile (also some more tests would be good).Jan 14, 2012