Thursday, 22 July 2010

Haskell's hash tables revisited: part 2

Our previous blog post contained several benchmark results comparing the new GHC 6.12.3 with F#. We have since discovered some other points of interest regarding this benchmark.

Firstly, the results for Haskell rely on the use of a garbage collector that prevents parallelism. If the more modern multicore-friendly GC is used (by compiling with -threaded and running with +RTS -N8) then the time taken increases from 4.5s to 10.6s. This is over 2× slower than before and now over 13× slower than F#. Naturally, the F# was already using the multicore-capable .NET garbage collector so this was an unfair bias in favor of Haskell.
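For readers wishing to reproduce this comparison, the Haskell side can be run roughly as follows. This is a sketch only: the file name `hashtable.hs`, the element count, and the use of `Int` keys are assumptions standing in for the benchmark from the previous post, which used GHC 6.12's `Data.HashTable` module.

```haskell
-- hashtable.hs: fill a mutable hash table with Int keys.
-- Default (single-threaded) GC:
--   ghc -O2 --make hashtable.hs && ./hashtable
-- Multicore-friendly runtime, as discussed above:
--   ghc -O2 -threaded --make hashtable.hs && ./hashtable +RTS -N8
import Control.Monad (forM_)
import qualified Data.HashTable as H

main :: IO ()
main = do
  -- new takes an equality test and a hash function; hashInt is
  -- provided by Data.HashTable for Int keys.
  ht <- H.new (==) H.hashInt
  forM_ [1 .. 10000000 :: Int] $ \i -> H.insert ht i (i + 1)
  v <- H.lookup ht 100
  print v
```

The only difference between the two configurations is the compile-time `-threaded` flag and the run-time `+RTS -N8` options; the program text is identical.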

Secondly, the Haskell code exploits an algorithmic optimization that assumes the keys are unique. This is often not the case in practice and, again, the F# code did not exploit such an assumption, so this was another unfair bias in favor of Haskell. A fairer comparison may be obtained by changing from the insert function to the update function in the Haskell code. Again, this incurs a substantial performance penalty, increasing the time taken to 20.6s.
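The change described above amounts to replacing `insert`, which assumes the key is absent and can leave duplicate entries behind, with `update`, which overwrites any existing binding. A sketch against GHC 6.12's `Data.HashTable` (the helper names `fillWithInsert` and `fillWithUpdate` are hypothetical):

```haskell
import qualified Data.HashTable as H

-- insert assumes the key is not already present; inserting an
-- existing key leaves a duplicate entry in the table.
fillWithInsert :: H.HashTable Int Int -> [Int] -> IO ()
fillWithInsert ht = mapM_ (\k -> H.insert ht k (k + 1))

-- update replaces any existing binding for the key (returning True
-- in IO if one was found), so it is safe for duplicate keys but
-- does extra work on every operation.
fillWithUpdate :: H.HashTable Int Int -> [Int] -> IO ()
fillWithUpdate ht = mapM_ (\k -> H.update ht k (k + 1))
```

The extra lookup performed by `update` on every call is what accounts for the slowdown reported above.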

With these alterations, Haskell becomes 26× slower than F# on this benchmark even with the latest GHC. Although these latest improvements are a step in the right direction, it seems that Haskell still has a long way to go before attaining competitive performance with mutable data structures.

Comments:

Paolo Giarrusso said...

I agree that Haskell is slower here, and I agree somewhat on the second point (it can be useful to be able to exploit key uniqueness when present - does F# allow it?).

But I don't see the point of enabling a parallel garbage collector for a single-threaded program if it slows down execution.

You seem to imply that the F# garbage collector is slower because it supports multiple cores, but I doubt it. Parallel GCs exist to be faster, and a production-quality implementation such as .NET's is supposed to be tuned to enable parallelism only when its benefits outweigh the overhead, with default settings - otherwise it would be a .NET performance misbehavior/bug.

For Haskell, the defaults are better than the parameters you force, and the runtime is not supposed to prevent you from changing settings even if the default is worse. Additionally, the help for the -qg option suggests that the parallel GC is automatically enabled for parallel programs. See here:
http://haskell.org/ghc/docs/6.12.2/html/users_guide/runtime-control.html

(I can't find docs for 6.12.3 so easily, and yes, that's not good).

Flying Frog Consultancy Ltd. said...

@Paolo: "But I don't see the point of enabling a parallel garbage collector for a single-threaded program if it slows down execution". People using hash tables in parallel Haskell programs will be interested in this performance characteristic.

"Parallel GCs exist to be faster". There seems to be some confusion here. I was referring to GCs that allow mutators to run in parallel and not to GCs that run GC threads in parallel in order to accelerate GC cycles.
