A short exploration of GHC’s instance resolution hiding mistakes from the type checker.

I was recently working on some Haskell code (for research, with Jack Hughes) that used a monoid (via the Monoid type class), and I was rushing. I accidentally wrote x `mempty` y instead of x `mappend` y. The code with mempty type checked and compiled, but I quickly noticed some tests giving unexpected results. After a pause, I checked the recent diff and spotted the mistake, but I had to think for a moment about why it was not a type error. I thought this was an interesting little example of how type class instance resolution can sometimes trip you up in Haskell, and of how to uncover what is going on. It also points to a need for GHC to be able to explain its instance resolution, something others have thought about; I briefly mention some links at the end.

The nub of the problem

In isolation, my mistake was essentially this:

whoops :: Monoid d => d -> d -> d
whoops x y = mempty x y  -- Bug here. Should be "mappend x y"

So, given two parameters of type d, where d has a monoid structure, we use both parameters as arguments to mempty.

Recall the Monoid type class is essentially:

class Monoid d where
  mempty :: d
  mappend :: d -> d -> d

(note that Monoid also has a derived operation mconcat, and that it is nowadays decomposed into Semigroup and Monoid, but I elide those details here; see https://hackage.haskell.org/package/base-4.14.0.0/docs/Prelude.html#t:Monoid).
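For reference, the modern decomposition in base looks roughly like this (mappend now defaults to the Semigroup operation <>):

class Semigroup a where
  (<>) :: a -> a -> a

class Semigroup a => Monoid a where
  mempty :: a
  mappend :: a -> a -> a
  mappend = (<>)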

We might naively think that whoops would therefore not type check, since we do not know that d is a function type. However, whoops is well typed, and evaluating whoops [1,2,3] [4,5,6] returns []. If the code had been as I intended (using mappend instead of mempty), we would expect [1,2,3,4,5,6], following the usual monoid on lists.

The reason this is not a type error is GHC’s instance resolution and the following instance of Monoid provided by base:

instance Monoid b => Monoid (a -> b) where
  mempty = \_ -> mempty
  mappend f g = \x -> f x `mappend` g x

That is, functions are monoids whenever their codomain is a monoid, with mempty as the constant function returning the mempty element of b, and mappend as the pointwise lifting of the monoid on b to the function space.
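To get a feel for this instance on its own, here are a couple of GHCi probes (the choice of Int and lists here is mine, purely for illustration):

*Main> (mempty :: Int -> String) 42
""
*Main> mappend (take 2) (drop 3) [1..5]
[1,2,4,5]

The first is the constant function returning the empty string; the second appends the results of the two functions pointwise.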

In this case, we can dig into what is happening by compiling[1] with -ddump-ds-preopt and looking at GHC’s desugared output before optimisation, in which all the type class instances have been resolved. I’ve cleaned up the output a little (mostly renaming):

whoops :: forall d. Monoid d => d -> d -> d
whoops = \ (@ d) ($dMonoid :: Monoid d) ->
  let
    $dMonoid_f :: Monoid (d -> d)
    $dMonoid_f = GHC.Base.$fMonoid-> @ d @ d $dMonoid

    $dMonoid_ff :: Monoid (d -> d -> d)
    $dMonoid_ff = GHC.Base.$fMonoid-> @ (d -> d) @ d $dMonoid_f
  in
    \ (x :: d) (y :: d) -> mempty @ (d -> d -> d) $dMonoid_ff x y

The second line shows whoops has a type parameter (written @ d) and the incoming dictionary $dMonoid representing the Monoid d type class instance (type classes are implemented as data types called dictionaries, and I use the names $dMonoidX for these here).

Via the explicit type application (of the form @ t for a type term t) we can see in the last line that mempty is being resolved at the type d -> d -> d with the monoid instance Monoid (d -> d -> d) given here by the $dMonoid_ff construction just above. This is in turn derived from the Monoid (d -> d) given by the dictionary construction $dMonoid_f just above that. Thus we have gone twice through the lifting of a monoid to a function space, and so our use of mempty here is:

mempty @ (d -> d -> d) $dMonoid_ff = \_ -> (\_ -> mempty @ d $dMonoid)

thus

mempty @ (d -> d -> d) $dMonoid_ff x y = mempty @ d $dMonoid

That’s why the program type checks and we see the mempty element of the original intended monoid on d when applying whoops to some arguments and evaluating.
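Indeed, one can reproduce this in GHCi by instantiating mempty at the doubly lifted function type directly (the annotation at [Int] is mine, for illustration):

*Main> (mempty :: [Int] -> [Int] -> [Int]) [1,2,3] [4,5,6]
[]
*Main> whoops [1,2,3] [4,5,6]
[]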

Epilogue

I luckily spotted my mistake quite quickly, but this kind of bug can be a confounding experience for beginners. There has been some discussion about extending GHCi with a feature allowing users to ask GHC to explain its instance resolution. Michael Sloan has a nice write-up discussing the idea, and there is a GHC ticket by Icelandjack proposing something similar, which seems like it would work well in this context, where you want to ask what the instance resolution was for a particular expression. There are many far more confusing situations that can be hidden by the implicit nature of instance resolution, so I think this would be a very useful feature for beginner and expert Haskell programmers alike. It certainly would have explained this particular error to me quickly, without me having to scribble on paper and then check -ddump-ds-preopt to confirm my suspicion.

Additional: I should also point out that this kind of situation could be avoided if there were ways to scope, import, and even name type class instances. The monoid instance Monoid b => Monoid (a -> b) is very useful, but having it in scope by default as part of base was really the main issue here.


[1] To compile, put this in a file with a main stub, e.g.

main :: IO ()
main = return ()

Considering the order of results when computing Cartesian product [short]

Sometimes in programming you need to do a pairwise comparison of some elements coming from two collections, for example, checking possible collisions between particles (which may be embedded inside a quadtree representation for efficiency). A handy operation is then the Cartesian product of the two sets of elements, to get the set of all pairs, which can then be traversed.

Whenever I need the Cartesian product of two lists in Haskell, I whip out the list monad:

cartesianProduct :: [a] -> [b] -> [(a, b)]
cartesianProduct as bs = as >>= (\a -> bs >>= (\b -> [(a, b)]))

Thus, for every element a in as and every element b in bs, we form the singleton list of the pair (a, b), and all of these singletons get concatenated to produce the result. For example:

*Main> cartesianProduct [1..4] [1..4]
[(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(2,4),(3,1),(3,2),(3,3),(3,4),(4,1),(4,2),(4,3),(4,4)]

This traverses the Cartesian space in row order. That is, if we imagine the square grid of possibilities here (where the element at the ith row and jth column corresponds to element (i, j)), then cartesianProduct generates the pairs in the following order:

[Figure: row-major traversal order of the 4×4 grid]
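As an aside, the same ordering falls out of the equivalent list-comprehension and liftA2 spellings; the primed names below are mine, just to distinguish them:

cartesianProduct' :: [a] -> [b] -> [(a, b)]
cartesianProduct' as bs = [(a, b) | a <- as, b <- bs]

cartesianProduct'' :: [a] -> [b] -> [(a, b)]
cartesianProduct'' as bs = liftA2 (,) as bs  -- liftA2 from Control.Applicative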

In mathematics, the Cartesian product is defined on sets, so the order of these pairs is irrelevant. Indeed, if we simply want to examine all pairs, the order may not matter. However, if we learn something as we look at the pairs (i.e., we accumulate some state), then the order can be important.

In some recent research, I was building an automated algebra tool to find axioms for some algebraic structure, based on an interpretation. This involved generating all pairs of syntax trees t and t' of terms over the algebraic structure, up to a particular depth, and evaluating whether t = t'. I also had some automated proof-search machinery to check that each equality wasn’t already derivable from the previously generated axioms. I could have done this derivability check as a filtering step afterwards, but I was exploring a very large space and did not expect to be able to generate all the possibilities; I just wanted to let the thing run for a few hours and see how far it got. I therefore needed the Cartesian product to get all pairs, but the order in which the pairs were generated became important for the effectiveness of my approach. The above ordering (row-major order) was not so useful, as I was unlikely to find interesting axioms quickly by traversing the span of a (very long) row; I needed to unpack the space in a more balanced way.

My first attempt at a more balanced Cartesian product was the following:

cartesianProductBalanced :: [a] -> [b] -> [(a, b)]
cartesianProductBalanced as bs =
     concatMap (zip as) (tails bs)
  ++ concatMap (flip zip bs) (tail (tails as))

tails (from Data.List) gives the results of successively applying tail, e.g., tails [1..4] = [[1,2,3,4],[2,3,4],[3,4],[4],[]]

This definition for cartesianProductBalanced traverses the diagonal of the space first, then the triangle above the diagonal, progressing away from the diagonal towards the top-right corner, and finally the triangle below the diagonal (the last line of the code), progressing away from the diagonal towards the bottom-left corner. The ordering for a 4×4 space is then:

*Main> cartesianProductBalanced [1..4] [1..4]
[(1,1),(2,2),(3,3),(4,4),(1,2),(2,3),(3,4),(1,3),(2,4),(1,4),(2,1),(3,2),(4,3),(3,1),(4,2),(4,1)]

[Figure: balanced traversal order for the 4×4 grid: diagonal first, then the two triangles]
This was more balanced, giving me a better view of the space, but it does not scale well: traversing the elements of this Cartesian product linearly means first traversing all the way down the diagonal, which could be very long! So we have ended up with a problem similar to that of the row-major traversal.

Instead, I finally settled on a “tiling” Cartesian product:

cartesianProductTiling :: [a] -> [b] -> [(a, b)]
cartesianProductTiling [] _ = []
cartesianProductTiling _ [] = []
cartesianProductTiling [a] [b] = [(a, b)]
cartesianProductTiling as bs =
       cartesianProductTiling as1 bs1  -- upper-left quadrant
    ++ cartesianProductTiling as2 bs1  -- lower-left quadrant
    ++ cartesianProductTiling as1 bs2  -- upper-right quadrant
    ++ cartesianProductTiling as2 bs2  -- lower-right quadrant
  where
    (as1, as2) = splitAt (length as `div` 2) as
    (bs1, bs2) = splitAt (length bs `div` 2) bs

This splits the space into four quadrants and recursively applies itself to the upper-left, then lower-left, then upper-right, then lower-right, e.g.,

*Main> cartesianProductTiling [1..4] [1..4]
[(1,1),(2,1),(1,2),(2,2),(3,1),(4,1),(3,2),(4,2),(1,3),(2,3),(1,4),(2,4),(3,3),(4,3),(3,4),(4,4)]

[Figure: tiled (Z-order style) traversal of the 4×4 grid]

Thus, we explore the upper-left portion first, without having to traverse down a full row, column, or diagonal. This turned out to be much better for my purposes of searching the space of possible axioms for an algebraic structure.

Note that this visits the elements in Z-order (also known as the Lebesgue space-filling curve) [Actually, it is a reflected Z-order curve, but that doesn’t matter here].
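A quick sanity check (my own throwaway test, not part of the original code) is that the tiling version enumerates exactly the same pairs as the row-major version, just in a different order:

*Main> import Data.List (sort)
*Main Data.List> sort (cartesianProductTiling [1..4] [1..4]) == sort (cartesianProduct [1..4] [1..4])
True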

The only downside to this tiling approach is that it does not work for infinite spaces (as it calculates the length of as and bs), so we cannot exploit Haskell’s laziness here to help us in the infinite case. I’ll leave it as an exercise for the reader to define a tiling version that works for infinite lists.

Of course, there are many other possibilities depending on your application domain!


Later edit: In the case of the example I mentioned, I do not actually need the full Cartesian product, since the order within a pair is not relevant to me (equality is symmetric), and neither do I need the pairs of identical elements (equality is reflexive). So for my program, computing just one triangle of the Cartesian product square, excluding the diagonal, is much more efficient, i.e.:

cartesianProductLowerTriangle :: [a] -> [b] -> [(a, b)]
cartesianProductLowerTriangle as bs = concatMap (zip as) (tail (tails bs))
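For the running 4×4 example this yields just the six pairs on one side of the diagonal:

*Main> cartesianProductLowerTriangle [1..4] [1..4]
[(1,2),(2,3),(3,4),(1,3),(2,4),(1,4)]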

However, this does not have the nice property of the tiling product, where we visit the elements in a more balanced fashion. I’ll leave that one for another day (I got what I needed from my program in the end anyway).