In a previous post, I showed how to represent mathematical expression trees using discriminated unions and gave an example of using them to compute derivitives. In this post, I endeavor to add pretty-printing capabilites to my expression trees.
One of the most difficult aspects of formatting mathematical expressions is dealing with precedence and associativity. However, it's simple enough to write a basic formatting function that simply parenthesizes every operation:
let rec print expr = match expr with | Add (x, y) -> sprintf "(%s + %s)" (print x) (print y) | Sub (x, y) -> sprintf "(%s - %s)" (print x) (print y) | Mult (x, y) -> sprintf "(%s * %s)" (print x) (print y) | Div (x, y) -> sprintf "(%s / %s)" (print x) (print y) | Power (x, y) -> sprintf "(%s ** %s)" (print x) (print y) | Neg x -> sprintf "-(%s)" (print x) | Var x -> x | Con x -> string x
Here is an example of using that function to format an expression:
print <| (-x + 2) * (x ** y) ** z / -2;; val it : string = "(((-(x) + 2) * ((x ** y) ** z)) / -2)"
This output, though technically correct, is not really what we want. We would like to include only the minimum number of parentheses, as in the original input expression.
Before we attempt to rectify this, it would be a good idea to try and remove all the duplication from our function. For example, it would be nice if we could handle all binary expressions in a single case, factoring out the parts that vary (such as the operator symbol). This will become more important as the function increases in complexity.
To this end, let's add the following union types to represent each type of unary and binary expression:
[<RequireQualifiedAccess>] type BinaryOp = Add | Sub | Mult | Div | Power [<RequireQualifiedAccess>] type UnaryOp = Neg
These differ from our main
Expr type in that they only represent the type of expression and nothing else. The
RequireQualifiedAccess attribute is used to avoid naming conflicts with the cases from the
We can then augment our our union types with informational properties, such as for retrieving the operator symbol:
type BinaryOp with member this.Symbol = match this with | Add -> "+" | Sub -> "-" | Mult -> "*" | Div -> "/" | Power -> "**" type UnaryOp with member this.Symbol = match this with | Neg -> "-"
Note that because these are members on the union types themselves, we are able to omit the type prefixes that would normally be required on the union cases due to the
Now let's define a multi-case active pattern to classify expressions into one of four categories:
Constant. This will enable us to match binary and unary expressions as a group, as opposed to needing to match each type of expression individually.
let (|Binary|Unary|Variable|Constant|) expr = match expr with | Add (x, y) -> Binary (BinaryOp.Add, x, y) | Sub (x, y) -> Binary (BinaryOp.Sub, x, y) | Mult (x, y) -> Binary (BinaryOp.Mult, x, y) | Div (x, y) -> Binary (BinaryOp.Div, x, y) | Power (x, y) -> Binary (BinaryOp.Power, x, y) | Neg x -> Unary (UnaryOp.Neg, x) | Var x -> Variable x | Con x -> Constant x
Putting it all together, we can now simplify our
let rec print expr = match expr with | Binary (op, x, y) -> sprintf "(%s) %s (%s)" (print x) op.Symbol (print y) | Unary (op, x) -> sprintf "%s(%s)" op.Symbol (print x) | Variable x -> x | Constant x -> string x
This is much more concise than our original version, and it keeps the function focused on the single responsibility of formatting the expression rather than worrying about which operator goes with what expression type.
Now we're ready to modify our function to drop some of the parentheses. However, we'll want to retain the parentheses around the following types of expressions:
- A binary expression nested in either side of another binary expression with higher precedence.
- A binary expression nested in the left-hand side of a right-associative binary expression with equal precedence.
- A binary expression nested in the right-hand side of a left-associative binary expression with equal precedence.
- The operand of a unary expression, except for negation of a variable or non-negative constant.
With all this in mind, let's first add some additional properties to the
BinaryOp type, as well as an
type Associativity = Left | Right type BinaryOp with member this.Precedence = match this with | Add | Sub -> 1 | Mult | Div -> 2 | Power -> 3 member this.Associativity = match this with | Add | Mult -> None | Sub | Div -> Some Left | Power -> Some Right
We can now rewrite our
let rec print expr = let parensPrint innerExpr = sprintf "(%s)" (print innerExpr) match expr with | Binary (op, left, right) -> let printInner defAssoc innerExpr = let opAssoc = defaultArg op.Associativity defAssoc match innerExpr with | Binary (innerOp, _, _) when innerOp.Precedence < op.Precedence -> parensPrint innerExpr | Binary (innerOp, _, _) when innerOp.Precedence = op.Precedence && opAssoc <> defAssoc -> parensPrint innerExpr | _ -> print innerExpr sprintf "%s %s %s" (printInner Left left) op.Symbol (printInner Right right) | Unary (op, operand) -> match expr with | Neg (Var _) -> print operand | Neg (Con x) when x >= 0 -> print operand | _ -> parensPrint operand |> sprintf "%s%s" op.Symbol | Variable x -> x | Constant x -> string x
Now when we invoke the function with our test expression we get the expected result:
print <| (-x + 2) * (x ** y) ** z / -2;; val it : string = "(-x + 2) * (x ** y) ** z / -2"
That's all there is to formatting expressions. As you can see, discriminated unions and pattern matching allow for a very readable and elegant solution.