The standard definition of a monoid goes something like this: a monoid is any set (if you’re a mathematician) or type (if you’re a Haskell programmer), together with an associative binary operation ∗ (if you’re a mathematician) or
<> (if you’re a Haskeller) that has an identity. That’s nice and abstract. Unfortunately, those without much math background can find it difficult to follow or think of examples.
So here’s a different definition that I’ve used when talking to programmers in particular. I’m sure this isn’t new, but I don’t know of a reference I can point to for it. If you know of a good beginner-accessible reference for this, I’d like to know. It goes something like this: a monoid is any way to summarize a list so that you can combine just the summaries of two lists to get a summary of their concatenation.
This second definition is interesting to programmers because it suggests all sorts of strategies for decomposition, parallelism, and distributed computing. Anything that I want to compute from a list is much easier to do if I can take this approach of (1) splitting up the list into parts, (2) computing on each of the parts, then (3) merging the results together. I can do this to build distributed systems where each node computes on only the part of the data it has. I can do it to build parallel code where I effectively use multiple CPUs or even GPUs or SIMD instructions to perform operations on different subsets of data. And it’s useful just for making it easier to express and reason about a program even if there’s nothing fancy going on. (For example,
foldMap in Haskell’s
Foldable class uses monoids in this way.)
Let’s look at some examples:
- Counts: I can summarize a finite list by counting its elements. If I have the counts of two lists, I can compute the count of their concatenation by adding the two counts together.
- Sums: I can summarize a finite list by its sum. If I have the sum of two lists, I can compute the sum of their concatenation, again by adding.
- Minima and Maxima: I can summarize a finite list by its minimum and maximum elements. (Lists may be empty, but we can define the minimum and maximum of an empty list to be ∞ and -∞, respectively). And again, if I have the minima and maxima of two lists, I can compute the minimum and maximum of their concatenation, by taking the least of the two minima and the greater of the two maxima.
- GCDs: If I’m doing something with number theory, I might summarize a finite list with its greatest common divisor. If I have the greatest common divisors of two lists, I can compute the greatest common divisor of their concatenation, by computing the gcd of the gcds.
- Heads: I can summarize a list by its head (i.e., first element), if it exists! If the list is empty, I will just note that it has no head. Given the heads of two lists, I can tell you the first element of their concatenation: it’s the head of the left-hand list if it exists, and the head of the right-hand list otherwise.
- The Whole List: This one may seem a bit silly, but I’ll reference it later. You can trivially “summarize” a list by just keeping the whole list. Your summarizer here is the identity function! That obviously gives you enough information to know the concatenation, as well, so it’s a perfectly good monoid.
Counter-examples and how to fix them
At this point, our examples of monoids include counts, sums, minima, maxima, gcds, and heads. With so many examples, one might start to suspect that any function on lists is a monoid. But of course, that’s not the case. We can look at some instructive counterexamples.
- Means: If you’re thinking of ways to summarize a list of numbers, one that probably comes to mind is the mean. The mean is not quite a monoid, because knowing the means of two lists isn’t enough information to compute the mean of their concatenation. Just averaging the averages doesn’t work, unless the lists happen to be the same length!
While means themselves aren’t monoids, I can quite easily deal with means using a monoid structure: Instead of keeping the mean itself, I can keep an ordered pair of sum and count, which are monoids, and actually do the division only when I need the final result.
Imagine you are designing parallel or distributed algorithm for computing the mean of some data set. By finding an actual monoid — pairs of counts and sums — I can use it as a sort of halfway point. The monoid structure on counts and sums means you can use a divide-and-conquer approach to take advantage of data parallelism when computing the sums and counts. Split up the list, sum and count each piece, and then combine them. Or maybe the data already lives on different compute nodes, so you can have each node compute the count and sum locally before combining their results. Once you have the overall monoid value (sum and count), it still remains to compute the mean. But luckily, that last step of computation is pretty easy: just one division!
- Medians: Going down the list of averages, you might also want to know the median of a list. Now things get more dicey, but also more intriguing! Once again, the median itself is not a monoid: knowing the medians of two lists still doesn’t tell you the median of the concatenation.
This leads to the same question as before. Is there some other monoid I can use as my halfway point? There’s one answer that always works here, but isn’t useful: remember that the whole list is a monoid. But that answer corresponds to just concatenating the whole list together and only running the summarizer function on the final concatenated list. That avoids interesting uses of monoids entirely.
For the median, can we do better? Honestly, it looks difficult. In general, any element of a list could wind up being the median of some concatenated list containing it. So it seems that any monoid we choose here will need to contain information about each element of the list. That’s already looking a lot like the trivial “the whole list” monoid above.
There’s one improvement we can make, though. We don’t care about the order of elements in the list! We can then choose the sorted list as our monoid.
- First, is that even a monoid? Well, (a) given a list I can always sort it. And (b) If I have the sorted versions of two lists, I can merge them to get the sorted concatenation of those lists. So yes, it’s a monoid.
- Not only is it a monoid, but the monoid structure is helpful. It’s actually easier to merge sorted lists than it is to sort the concatenation of the originals — this is the whole basis of merge sort, after all.
- Finally, this monoid helps with computing a median. Although we don’t save on the size of data, it’s trivial to find the median of a list after sorting it.
So we do find an approach to computing medians that can take advantage of monoid structure after all. Ultimately, it amounts to this: sort the list using merge sort, then pick the middle element after sorting. Indeed, merge sort is often used because it has many of the properties we’re interested in: it’s easy to parallelize, distribute over multiple nodes, etc. That’s exactly because of the monoid structure it exploits.
It’s worth mentioning, though, that sorting and then picking the middle element is not the classically optimal way to find a median. The median-of-medians algorithm can famously do the computation in O(n) time. Perhaps there is some suitably clever monoid that takes advantage of these ideas, but I don’t see how that would happen. In the end, I suppose not all classical algorithms on lists can be expressed as a monoid in this way.
Why is this equivalent to abstract monoids?
I haven’t yet justified calling these list-summarizers monoids. Are they really the same thing as monoids in their traditional mathematical definition? They are, indeed. The short justification is that finite lists are free monoids.
Here’s the more complete justification. Let’s call the typical algebra meaning for monoids an abstract monoid, and this definition as list-summarizers a concrete monoid. We want to prove that abstract monoids and concrete monoids are equivalent, so the proof has two directions.
Part I: Every concrete monoid gives rise to an abstract monoid.
Proof: The concrete monoid consists of a 4-tuple (S, ∗, E, f) so that:
- f maps finite lists with elements in E onto S.
- If ⧺ denotes concatenation of finite lists, then f(a ⧺ b) = f(a) ∗ f(b).
The claim is that (S, ∗) is an abstract monoid. This means we have to prove:
- The ∗ operation is associative. That is, for any x, y, and z in S, (x ∗ y) ∗ z = x ∗ (y ∗ z). To show this, we use the fact that f is onto to find lists a, b, and c so that x = f(a), y = f(b), and z = f(c). Then we have this chain of equalities: (x ∗ y) ∗ z = (f(a) ∗ f(b)) ∗ f(c) = f(a ⧺ b) ∗ f(c) = f((a ⧺ b) ⧺ c) = f(a ⧺ (b ⧺ c)) = f(a) ∗ f(b ⧺ c) = f(a) ∗ (f(b) ∗ f(c)) = x ∗ (y ∗ z).
- The ∗ operation has an identity. For any x in S, we again use the fact that f is onto to find x with x = f(a). Then if e is the empty list, we have x = f(a) = f(a ⧺ e) = f(a) ∗ f(e) = x ∗ f(e), and similarly on the other side. So f(e) is the identity.
Part II: Every abstract monoid gives rise to a concrete monoid.
Proof: Let (S, ∗) be an abstract monoid, so ∗ is associative and has an identity. The claim is that we can find an E and f so that (S, ∗, E, f) is a concrete monoid by the definition above. To put this a more conventional way, we want to prove that every monoid arises as the homomorphic image of a free monoid.
The trick is going to be to let E = S, and f be the canonical homomorphism from the free monoid generated by S back to S itself. That canonical homomorphism is defined by just mapping the empty list to the identity, and any other finite list like (x, y, …, z) to the result of the chained monoid operation x ∗ y ∗ … ∗ z. We then need to check:
- That this f is onto. It is, because you can get any x in S by applying f to the one-element list containing x.
- That f is a monoid homomorphism, so f(a ⧺ b) = f(a) ∗ f(b). We need only expand the definition of f to get f(a) and f(b) as chains of the monoid operation, and then appeal to the associative property to arrange the parentheses as needed.
Note that a concrete monoid is still just a little more information than an abstract monoid. An abstract monoid might arise as homomorphic images of more than one free monoid, whereas our concrete monoids come with a baked-in choice of which free monoid they are the image of, and by which homomorphism. In this sense, we’ve got something roughly analogous to the concrete category, which comes equipped with a faithful functor to another category (often Set). But I think this justifies thinking of these things as capturing the essence of monoids.
This line of thought leads more places, as well. If monoids are composable summarizers of finite lists, then what about some other structures? In general, any algebraic structure should be describable as homomorphic images of the free structure. But what makes this interesting is that free monoids have a particularly simple meaning. So this is the same as asking which other structures have particularly simple free constructions.
- Semigroups are summarizers of nonempty finite lists.
- Commutative monoids are summarizers of bags (aka multisets), where elements can occur more than once, but the order is not significant.
- Magmas (that is, sets with a binary operation but no axioms at all) are summarizers of binary trees.
There are probably more, as well!
Finally, this Reddit comment ties it all together within the Haskell programming language, by astonishingly defining both the notion of a free object and defining the List data type itself in terms of free monoids, in only a few lines of code:
type Free :: (Type -> Constraint) -> (Type -> Type)
type Free cls a = (forall x. cls x => (a -> x) -> x)type List :: Type -> Type
type List a = Free Monoid atoList :: Foldable f => f a -> List a
toList = flip foldMap