In general, combining multiple packages that each attempt to perform concurrent operations is error prone, especially if they use locking. The STM papers make an interesting argument for why using transactions rather than locks as the typical concurrency primitive leads to concurrent programs that can be more easily composed.
That's why it's really nice to see that Arc has threads and an atomic-execution mechanism. GHC-style Software Transactional Memory would be a natural fit, and it seems to scale very well.
On that note, you might take a glance at the lightweight concurrency paper; it proposes a new concurrency runtime for GHC, basically moving most of the implementation into Haskell, with just a thin layer of primitives in the RTS. Interestingly, those primitives don't include locks, but instead a very minimal transactional memory.