In previous instalments of this series, I’ve described F#’s support for first-class functions in terms of relatively simple, isolated examples. In this final part, I want to draw on a real-world example to show how F# has made it easy for me to use higher-order functions to address a genuine business requirement.
The project I’ve been working on recently includes a little expression evaluator. Mini-languages of course are meat and potatoes stuff for F#, thanks to the fslex and fsyacc parser tools, but the evaluator has a bunch of behaviours which make life rather more complicated.
For a start, values in the expression language are variants, which have to be coerced to the type of the operation at hand. For example, “2″ + 3 should evaluate to 5, because + means numeric addition; whereas “2″ & 3 should evaluate to “23″, because & means string concatenation.
For another thing, variants can contain arrays, and the rule is that if you apply an operation or function to an array, it should be performed element-wise across the array, returning a new array. For example, 1 + [2, 3] returns [3, 4], and AVG([1, 2], [3, 4]) returns [2, 3].
Finally, variants can contain errors, and operations and functions on errors should propagate the error. For example, 1 + [1, 2/0, "Bob"] should return [2, Error.DivisionByZero, Error.ExpectedNumeric].
(The rules in the real evaluator are actually rather more complicated than this, but life’s too short.)
Now obviously I could write the coercion, array application and error propagation logic into every operation and function, but that sounds like a really bad plan. Instead, what I’d like to do is write each operation and function as if it worked on scalars of the correct type, and have another function that ‘lifts’ that naive operation or function into the variant world.
Let’s take the numeric addition operation as an example. F#, you’ll be gobsmacked to know, has built-in support for addition via the surprisingly named + operator. Unfortunately, + only operates on good old-fashioned numbers. Let’s see if we can lift it to work on variants according to the rules above.
The first thing we need to do is write a method for coercing variants to numbers. Coercion can fail, so this also means we need to define a “maybe” type (equivalent to Haskell’s Either type):
type Result<'a> = | ValidResult of 'a | ErrorResult of Error // Error is a discriminated union -- definition omitted let coerceToNumeric variant = // implementation omitted but returns a Result<float>
Now we can envisage writing the numeric addition operator for scalar, non-error variants something like this:
let addv variant1 variant2 = match coerceToNumeric variant1, coerceToNumeric variant2 with | ValidResult n1, ValidResult n2 -> NumericValue (n1 + n2) | ErrorResult err, _ -> ErrorValue err | _, ErrorResult err -> ErrorValue err
(Here we’re using pattern matching to switch between the various different possible outcomes of the type coercion — see here if you’re not familiar with pattern matching. NumericValue and ErrorValue are the type constructors for the variant discriminated union.)
Once I’m confident this works, it’s easy for me to write the variant subtraction function too:
let subv variant1 variant2 = match coerceToNumeric variant1, coerceToNumeric variant2 with | ValidResult n1, ValidResult n2 -> NumericValue (n1 - n2) | ErrorResult err, _ -> ErrorValue err | _, ErrorResult err -> ErrorValue err
At this point, of course, all your “Ctrl+V reuse” alarm bells will be going off. But before we spring into action, let’s also write the string concatenation function:
let catv variant1 variant2 = match coerceToString variant1, coerceToString variant2 with | ValidResult s1, ValidResult s2 -> StringValue (s1 + s2) | ErrorResult err, _ -> ErrorValue err | _, ErrorResult err -> ErrorValue err
Hmm, this is still eerily similar, but we’re coercing the variants into strings instead of numbers, and wrapping the result as a StringValue instead of a NumericValue. So let’s try to pull out what’s common and what’s varying and create a function that can transform any function into its variant equivalent.
let coerced f coerce uncoerce val1 val2 = match coerce val1, coerce val2 with | (ValidResult x, ValidResult y) -> uncoerce (f x y) | (ErrorResult err, _) -> ErrorValue err | (_, ErrorResult err) -> ErrorValue err let addv = coerced (+) coerceToNumeric NumericValue let subv = coerced (-) coerceToNumeric NumericValue let catv = coerced (+) coerceToString StringValue
(Handily, F# allows me to treat an operator as a function by putting parentheses around it.)
We’re not quite there yet, though, because we’re relying on the function we’re lifting always producing a valid result. But the evaluator spec requires functions to be able to report errors: for example, division by zero should return an error variant rather than a numeric NaN the way the F# division operator does. We can easily write a scalar function that provides the behaviour we want by having it return our “maybe” type:
// divide: float -> float -> Result<float> let divide val1 val2 = match val1, val2 with | _, 0.0 -> ErrorResult DivisionByZero | _ -> ValidResult (val1 / val2)
But now we can’t write this:
let divv = coerced divide coerceToNumeric NumericValue
because NumericValue expects a float, not a Result<float>. We could fix this by writing a custom uncoerce function for divide, but it’s a fair guess that divide won’t be the last function we come across that has to return errors! Instead, it’s better to update the coerced function to make it error-aware:
let coerced f coerce uncoerce val1 val2 = match coerce val1, coerce val2 with | (ValidResult x, ValidResult y) -> match (f x y) with | ValidResult r -> uncoerce r | ErrorResult err -> ErrorValue err | (ErrorResult err, _) -> ErrorValue err | (_, ErrorResult err) -> ErrorValue err
Now divv works, but addv and the other existing functions complain, because coerced expects f to return a Result<’a>, and (+) doesn’t. To make coerced happy, we need to give it a version of + that returns a Result, even though that will always be a ValidResult rather than an ErrorResult. But that’s easy with another higher-level function:
let liftToMaybe f x y = ValidResult (f x y) let addv = coerced (liftToMaybe (+)) coerceToNumeric NumericValue let subv = coerced (liftToMaybe (-)) coerceToNumeric NumericValue let catv = coerced (liftToMaybe (+)) coerceToString StringValue let divv = coerced divide coerceToNumeric NumericValue
We can tighten this up a little more, but you probably get the idea. It’s time to move onto error propagation and array operations.
Again, let’s start by looking at how we might transform addv into something that handles errors and arrays:
let addeav variant1 variant2 = match variant1, variant2 with | (ErrorValue x, _) -> ErrorValue x | (_, ErrorValue y) -> ErrorValue y | (MultiValue xs, MultiValue ys) -> MultiValue (List.map2 addv x y) | (MultiValue xs, _) -> MultiValue (List.map (fun x -> addv x variant2) xs) | (_, MultiValue ys) -> MultiValue (List.map (fun y -> addv variant1 y) ys) | _ -> addv variant1 variant2
(MultiValue is the case of the variant discriminated union that contains an array of values, represented as a List<Variant>. Representing the contents as a list rather than an array makes it easier for us to use F# list operations and pattern matching to work with the contents. List.map2 is the F# list function for performing a two-argument function element-wise across two lists.)
Looking at the addeav code, it’s completely generic. It deals only with functions between variants, so it doesn’t care about the different types of variants or the fact that the function might fail: that was already handled by when we used coerced to transform the primitive function into a scalar variant one. So here’s our lifting function:
let lifted f val1 val2 = match (val1, val2) with | (ErrorValue x, _) -> ErrorValue x | (_, ErrorValue y) -> ErrorValue y | (MultiValue xs, MultiValue ys) -> MultiValue (List.map2 f xs ys) | (MultiValue xs, _) -> MultiValue (List.map (fun x -> f x val2) xs) | (_, MultiValue ys) -> MultiValue (List.map (fun y -> f val1 y) ys) | _ -> f val1 val2 let addv = lifted (coerced (liftToMaybe (+)) coerceToNumeric NumericValue) let subv = lifted (coerced (liftToMaybe (-)) coerceToNumeric NumericValue) let divv = lifted (coerced divide coerceToNumeric NumericValue) let catv = lifted (coerced (liftToMaybe (+)) coerceToString StringValue)
addv, subv, divv and catv will now all propagate errors in a consistent way, and can be applied to arrays. So when we add multiplication support, we barely have to lift a finger:
let mulv = lifted (coerced (liftToMaybe (*)) coerceToNumeric NumericValue)
Looking at these raises an interesting design question: should we combine coerced and lifted? After all, they always seem to be used together. However, it turns out that it’s useful to be able to lift a scalar function of variants to work on arrays too, even if that scalar function didn’t originally come from coercing a primitive operation. An example is bitwise operations. We can’t use coerced for these, because they have to work on both boolean and numeric variants, and they use different operators and different conversions depending on which they received. So I pretty much have to write a new function for bitwise operations on scalar variants:
let bitand_scalar val1 val2 = let b1, b2 = coerceToBoolean val1, coerceToBoolean val2 let n1, n2 = coerceToNumeric val1, coerceToNumeric val2 match b1, b2, n1, n2 with // deep breath | ValidResult b1, ValidResult b2, _, _ -> BooleanValue (b1 && b2) | _, _, ValidResult n1, ValidResult n2 -> NumericValue (float ((int n1) &&& (int n2))) | ValidResult true, _, _, ValidResult n2 -> NumericValue n2 | ValidResult false, _, _, ValidResult n2 -> NumericValue 0.0 | _, ValidResult true, ValidResult n1, _ -> NumericValue n1 | _, ValidResult false, ValidResult n1, _ -> NumericValue 0.0 | _, _, ErrorResult err, _ -> ErrorValue err | _, _, _, ErrorResult err -> ErrorValue err
But once I’ve got the scalar version written, I can get the array version for free:
let bitandv = lifted bitand_scalar
So what have we seen? We’ve been able to write two short functions, coerced and lifted, which respectively turn a function on primitive types like floats or strings into a function on scalar variants, and turn a function on scalar variants into a function which works on both scalar and array variants. With those two functions in place, I can write my core evaluation logic in terms of primitive values, or in terms of single-value variants if I really have to. This keeps my core evaluation logic as simple as possible: the logic of the divide function, for example, isn’t swamped by coercion, error propagation and array application concerns. Having written the simple evaluators, I can then create full-fat evaluators with all the behaviour required by the spec just by feeding them through the coerced and/or lifted functions.
As always, there’s nothing here you can’t do in C#, if you’re sufficiently determined or if multiply nested angle brackets fill you with a great joy. What I hope you can see from the above is that F# enables you to build higher-order functions like coerced and lifted, and to work with the results of those functions, just that bit more easily. I used F# for this project because of its parser tools, but over the course of the project I’ve found the simplicity of defining and working with higher-level functions has encouraged and enabled me to tackle repetition and boilerplate that would have been a lot tougher to tackle in C#. Give it a try — you might be surprised!
Previous entries in this series: