Nascent GHC Proposal: Source Rewrite Rules and Optional Constraints
Lately, I’ve been thinking more and more about something that would be nice to have in GHC.
Motivation #1: HMock Predicates
Consider the `Predicate` type from HMock:

```haskell
data Predicate a = Predicate
  { showPredicate :: String,
    showNegation :: String,
    accept :: a -> Bool,
    explain :: a -> String
  }
```

The purpose of this type is to be a function `a -> Bool` that can also explain itself. That is, you can get `String`s to explain what it’s testing, and why a particular argument matches or not. There are three important things to know about `Predicate`:
- It’s important that `Predicate` works on arbitrary types, including those without `Show` instances. Indeed, using predicates is the only way to match function calls in HMock whose arguments don’t have `Eq` and `Show` instances.
- The exact strings produced by `showPredicate`, `showNegation`, and `explain` are entirely unspecified, by design. It’s nice for users if they are as descriptive as possible, but they only get used in error messages and diagnostics, so no kind of real correctness depends upon their exact value.
- `Predicate`s are often polymorphic. For instance, there is `eq :: (Show a, Eq a) => a -> Predicate a`, which checks for equality with a specific value. There are even more complex predicates, too, like `unorderedElemsAre :: MonoFoldable t => [Predicate (Element t)] -> Predicate t`, which uses a Ford-Fulkerson max-flow algorithm to match elements to child predicates, then reports on the best available matching, and whether there were either elements or child predicates that couldn’t be matched.
Here’s the fundamental problem: what do I do with `Show` instances? Even in the description above, there are some unsatisfying choices:
- The `eq` predicate requires a `Show` constraint. This is used to implement `showPredicate`, `showNegation`, and `explain`. However, if it weren’t available, I could still provide the core functionality of `eq` at the cost of somewhat poorer descriptive text. Leaving out the `Show` constraint would have doomed all users to receive poor descriptions. In the end, I left it in, making `eq` less useful. I could also define a separate `nonShowableEq` predicate, which is identical to `eq` in all ways except that it lacks `Show`. This is ugly, and doing this everywhere would double the exposed symbols in this module (from 50 to 100 exports!)
- On the other hand, `unorderedElemsAre` doesn’t have a `Show` constraint on `Element t`. If it did, I could provide better explanatory text by listing which specific element values in the collection were unexpected. Yet this seems too limiting on such a useful combinator, which is often useful at higher-order types. Therefore, I left the constraint out. This is a real loss of functionality. Again, I could define `showableUnorderedElemsAre`, but at the cost of exploding the API if I did this everywhere.
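To make the trade-off concrete, here is a hedged sketch of what `eq` and the hypothetical `nonShowableEq` might look like. This is not HMock’s actual source; it just reuses the `Predicate` type quoted above, with illustrative description strings:

```haskell
data Predicate a = Predicate
  { showPredicate :: String
  , showNegation  :: String
  , accept        :: a -> Bool
  , explain       :: a -> String
  }

-- With Show available, the descriptions can mention the actual value.
eq :: (Show a, Eq a) => a -> Predicate a
eq x = Predicate
  { showPredicate = "== " ++ show x
  , showNegation  = "/= " ++ show x
  , accept        = (== x)
  , explain       = \y ->
      show y ++ (if y == x then " == " else " /= ") ++ show x
  }

-- Without Show: the same behavior, but vaguer descriptions.
nonShowableEq :: Eq a => a -> Predicate a
nonShowableEq x = Predicate
  { showPredicate = "equal to the expected value"
  , showNegation  = "not equal to the expected value"
  , accept        = (== x)
  , explain       = \y ->
      if y == x then "value matched" else "value did not match"
  }
```

The two definitions agree in everything except their explanatory strings, which is exactly the sense in which picking between them automatically would be safe.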
What I really want is for GHC to check the types looking for a `Show` constraint, then use one or the other implementation depending on whether a `Show` constraint exists. This might seem suspect. Indeed, it is! It’s only reasonable because the results of the two implementations are equivalent up to the unspecified explanatory text.
Motivation #2: Improving efficiency
This isn’t the only time something like this comes up. A similar situation often arises with polymorphic algorithms. With only an `Eq` constraint, the best you can do for Haskell’s `nub` function (which removes duplicates from a list) is O(n²). Add an `Ord` constraint, and you can do the same thing in O(n log n). It requires a bit of care to do it in a way that preserves the original order, but it’s still possible. Add a `Hashable` constraint, and you can do it in O(n) expected time.
In practice, Haskell offers `nub`, and common libraries offer `ordNub` to do the same thing more efficiently. That’s again an extra API for the same function, just with better performance at the cost of more assumptions about its input. It’s ugly that this is necessary.
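The O(n log n), order-preserving version can be written with `Data.Set`. Here is a sketch of the usual `ordNub` trick (not any particular library’s implementation):

```haskell
import qualified Data.Set as Set

-- Remove duplicates, keeping the first occurrence of each element.
-- The Ord constraint lets us track already-seen elements in a Set,
-- so each membership test is O(log n) instead of nub's O(n) scan.
ordNub :: Ord a => [a] -> [a]
ordNub = go Set.empty
  where
    go _ [] = []
    go seen (x : xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

For example, `ordNub [3, 1, 3, 2, 1]` yields `[3, 1, 2]`, just as `nub` would, but in O(n log n) time.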
Existing (and partial) solutions
It’s not surprising that people have thought about this in the past. There are several bits of prior work worth mentioning.
- In C++, there’s an approach known as SFINAE: “Substitution Failure Is Not An Error”. What this means is that you can just write two implementations of the same template, and if the more specific one doesn’t compile, you don’t get an error! Instead, the compiler ignores it and substitutes the more general implementation. Since C++ monomorphizes all templates at compile time, it all comes down to trying the substitution, and dropping the specialization if it fails. This means C++ programs can make these trade-offs without exposing different names and APIs.
- Mike Izbicki has written a Haskell library called ifcxt, which partially solves the problem. The good news is that you can use the library to define optional constraints, and different implementations based on whether the constraint is satisfied or not. The bad news is that it requires Template Haskell to write a bunch of boilerplate. Unfortunately, new boilerplate is needed for every new type, so this Template Haskell cannot be internal to the implementation and needs to be invoked from the client code.
- GHC already has a mechanism called rewrite rules, which allows certain expressions to be matched and rewritten by the compiler into a more efficient form. However, this happens during the Core stage of GHC, which is after type class resolution has occurred. This means that rewrite rules are not able to add additional type class constraints.
It seems that what’s needed here is a GHC plugin or extension.
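To show what the ifcxt approach looks like in practice, here is a self-contained sketch of its core trick. The real library generates the per-type instances with Template Haskell; the names below follow its API, but this is a simplified reconstruction, not the library’s actual source:

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Kind (Constraint)
import Data.Proxy (Proxy (..))

-- ifCxt runs its second argument when the constraint cxt is known to
-- hold, and falls back to the third argument otherwise.
class IfCxt (cxt :: Constraint) where
  ifCxt :: proxy cxt -> (cxt => a) -> a -> a

-- Catch-all: by default, assume the constraint is unavailable.
instance {-# OVERLAPPABLE #-} IfCxt cxt where
  ifCxt _ _ fallback = fallback

-- Per-type boilerplate recording that Show Int really holds.
-- (This is the part the library's Template Haskell generates, and
-- what every client module must ask for.)
instance IfCxt (Show Int) where
  ifCxt _ thenCase _ = thenCase

describe :: forall a. IfCxt (Show a) => a -> String
describe x = ifCxt (Proxy :: Proxy (Show a)) (show x) "<no Show instance>"
```

Here `describe (42 :: Int)` uses the real `Show` instance, while `describe` applied to, say, a function falls back to the placeholder string. The pain point is visible above: every type needs its own `IfCxt` instance.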
Proposed solution #1: Source rewrite rules
This seems very similar to a rewrite rule, but just needs to fire at a different stage of the compilation. Therefore, if I were proposing this, I would propose a new kind of rewrite rule. Something like this:
```haskell
{-# SOURCE_RULES
"nub/Ord" forall x. nub x = ordNub x
  #-}
```
When encountering a rule like this during type checking, GHC would attempt to perform the given substitution. If the resulting program type checks, fine. Otherwise, it would fall back to not applying the rule rather than failing the compile. This is very like the SFINAE rule in C++.
Proposed solution #2: Optional constraints
There’s a slight problem with the solution above: it’s not very compositional when we have separate compilation. If I write this:

```haskell
foo :: a -> a
foo = id

fooNum :: Num a => a -> a
fooNum a = a + 1

{-# SOURCE_RULES
"fooNum" forall x. foo x = fooNum x
  #-}

bar x = foo x
```
What is the type of `bar`? If `bar` has type `a -> a`, then the `fooNum` rule never fires. On the other hand, if `bar` has type `Num a => a -> a`, then it is insufficiently general.
I propose that it should infer `a -> a`, but with an explicit type signature, one could write `bar :: ?Num a => a -> a`. This would inform GHC that it should generate two versions of `bar`: one specialized implementation with a `Num` constraint, and one general implementation without. It should then automatically generate a new source rule, `forall x. bar x = barNum x`, that tries to rewrite the more general implementation into the specialized one when possible. (If there are no source rules that depend on the constraint, GHC could realize that both implementations are the same, and decline to generate a specialization at all, possibly with a warning about a redundant optional constraint.)
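Concretely, the single `?Num` signature would stand in for something like the following pair of hand-written definitions. The names `bar` and `barNum` are the hypothetical generated ones from the example above:

```haskell
foo :: a -> a
foo = id

fooNum :: Num a => a -> a
fooNum a = a + 1

-- General version: what callers without a Num dictionary get.
bar :: a -> a
bar x = foo x

-- Specialized version: what the generated source rule would rewrite
-- calls into whenever a Num constraint is available at the call site.
barNum :: Num a => a -> a
barNum x = fooNum x
```

Under the proposal, GHC would emit both of these from the one `?Num a => a -> a` signature, and route each call site to whichever version typechecks.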
Conclusion
What do you think? Is this a direction you’d like to see GHC move? Is `ifCxt` enough?