Learning Elm, part 4

Property Based Testing And Better Modelling

As I said in the conclusion of part 1 of this series, the function I wrote in that post felt very reliable, in a way that's difficult to achieve with any JavaScript code.

As a reminder, here is the card type used in the code:

type Value = Jack | Queen | King | Ace | Num Int
type Suit = Club | Diamond | Spade | Heart
type Card = OrdinaryCard Value Suit | Joker

Looking at these types, a question arises: how can I guarantee that I never end up with an invalid card? By invalid card, I mean something like a thirteen of Clubs, or a minus five of Hearts.

Unit Testing

The "unsafe" part of the types is the Num Int case of the Value type, because it accepts any integer. Those values are created by the function parseNumValue, which has the type:

parseNumValue : String -> Maybe Value

By testing the parseNumValue function, we'll be able to raise the reliability of the code as a whole.
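The body of parseNumValue is not repeated in this post. As a rough sketch only, assuming Elm 0.18 (where String.toInt returns a Result), a function with this signature might look like the following. The body is a plausible shape for illustration, not necessarily the code from the earlier post.

```elm
module Cards exposing (..)

-- Sketch only: assumes Elm 0.18, where String.toInt returns a Result.
-- The body of parseNumValue is an assumption, not the original code.


type Value
    = Jack
    | Queen
    | King
    | Ace
    | Num Int


-- Accept only the numbers printed on numbered cards: 2 through 10.
parseNumValue : String -> Maybe Value
parseNumValue str =
    case String.toInt str of
        Ok n ->
            if n >= 2 && n <= 10 then
                Just (Num n)
            else
                Nothing

        Err _ ->
            Nothing
```

The range check is the only thing standing between the input string and an invalid Num, which is why this function is the one worth testing.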

Unit testing pure functions is very simple: we define some example cases of the function, and then define the expected return values. Let's use the library elm-test for our tests.

Installing Elm Test is easy, as described here:

  1. Run npm install -g elm-test if you haven't already.
  2. cd into the project's root directory that has your elm-package.json.
  3. Run elm-test init. It will create a tests directory inside this one, with some files in it.
  4. Copy all the dependencies from elm-package.json into tests/elm-package.json. These dependencies need to stay in sync, so make sure whenever you change your dependencies in your current elm-package.json, you make the same change to tests/elm-package.json.
  5. Run elm-test.
  6. Edit tests/Tests.elm to introduce new tests.

After writing the unit tests, this is how my Tests.elm file looks:

module Tests exposing (..)

import Test exposing (..)
import Expect
import Cards exposing (..)

all : Test
all =
    describe "parseNumValue"
        [ test "cannot be less than 2"
            <| \() -> Expect.equal (parseNumValue "1") Nothing
        , test "minimum of 2"
            <| \() -> Expect.equal (parseNumValue "2") (Just (Num 2))
        , test "maximum of 10"
            <| \() -> Expect.equal (parseNumValue "10") (Just (Num 10))
        , test "cannot be more than 10"
            <| \() -> Expect.equal (parseNumValue "11") Nothing
        ]

The syntax is direct: you describe a test suite, and then define the tests inside a list. I tested the "corner cases" of the function, to make sure that any number less than 2 or greater than 10 will not be parsed to a Value. I also like that tests work as documentation, showing how the function is supposed to behave.

So, unit tests raise reliability, but could they do better? What would happen if we call the parseNumValue function with the string "100"? Or the string "-22"? Is it possible to write more general tests that answer the more powerful question: can I guarantee that only integers between 2 and 10 get converted to a Value, and no others?

Property Based Testing

Property Based Testing is very interesting because it allows you to test a whole set of values. For instance, let's pretend that we have at our disposal the set of all integers. If we transform them into strings, we have the perfect inputs for testing parseNumValue.

Elm Test has an easy way of doing property based tests. Instead of using the test function, you use the fuzz function, specify a "fuzzer", and write your test using the generated value as a parameter:


import Fuzz exposing (..)


, fuzz int "parseNumValue"
    <| \number ->
        let
            -- Turn the generated integer into a string and parse it.
            parsed =
                number
                    |> toString
                    |> parseNumValue
        in
            case parsed of
                Just (Num v) ->
                    Expect.true
                        "Number should be >= 2 and <= 10 when Just Num v"
                        (v >= 2 && v <= 10)

                _ ->
                    Expect.false
                        "Number should not be >= 2 and <= 10 when Nothing"
                        (number >= 2 && number <= 10)

This test is direct: it generates an int and passes it as a parameter to the testing function - that's why we're using \number -> .... Then we convert the number to a string and parse it with our parseNumValue function.

We are testing for the following: if the result of the parse is a Just (Num v), then the number was something between two and ten. And, if the result is Nothing, the number was either smaller than 2 or greater than 10. That is what we are asserting in the pattern matching section of the test.

How does it work? It's simple: fuzz int generates a bunch of random integers, and runs a test for each integer generated. That way, it's almost the same as writing a lot of test functions for a lot of integer values.

Observation: during these tests, I found a little problem: the key values of 1, 2, 10 and 11 were not tested on every run. That means the test could pass when it should not: the fuzz test would say everything is OK, but my function could have an error and I would end up with a Just (Num 11). The solution to this could be raising the number of random integers tested, but I could not find a way to do it. If you have an idea of how to deal with this situation, please leave a comment below!
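One possible direction, offered as a sketch rather than a confirmed fix: elm-test versions that expose Test.fuzzWith and Fuzz.intRange let you raise the number of runs and narrow the generated range, so the boundary values are very likely to be hit on every run. Whether both functions are available depends on the installed elm-test version, so treat their use here as an assumption. The module name FuzzSketch and the test name boundaries are hypothetical, and the code relies on the Cards module from this post.

```elm
module FuzzSketch exposing (..)

import Cards exposing (..)
import Expect
import Fuzz exposing (intRange)
import Test exposing (..)


-- Sketch only: assumes an elm-test version that exposes
-- Test.fuzzWith and Fuzz.intRange.
-- With 1000 runs over the 13 integers from 0 to 12, the boundary
-- values 1, 2, 10 and 11 are very likely to be generated every time.
boundaries : Test
boundaries =
    fuzzWith { runs = 1000 } (intRange 0 12) "parseNumValue near the boundaries" <|
        \number ->
            case parseNumValue (toString number) of
                Just (Num v) ->
                    Expect.true
                        "accepted values must be between 2 and 10"
                        (v >= 2 && v <= 10)

                _ ->
                    Expect.false
                        "values between 2 and 10 must be accepted"
                        (number >= 2 && number <= 10)
```

Narrowing the range trades breadth for boundary coverage, so it complements the plain fuzz int test and the four explicit unit tests rather than replacing them.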

In the end, I maintained the four unit tests that I knew were important cases, and added the fuzz test. Here's how the final test file looks.

I believe the solution feels much more reliable with the addition of the property based tests. But one thing still bothers me: if I do not use the parsers to build a card, I can still construct an invalid value like Num 11 directly.

Can We Do Better?

Our cards have a small finite domain. Instead of having a Num Int case for the Value type, we could be explicit about every value possible:

type Value
    = Jack
    | Queen
    | King
    | Ace
    | Two
    | Three
    | Four
    | Five
    | Six
    | Seven
    | Eight
    | Nine
    | Ten

type Suit
    = Club
    | Diamond
    | Spade
    | Heart

type Card
    = OrdinaryCard Value Suit
    | Joker

This modelling is very simple and direct, but it's also powerful. It is literally impossible to represent an invalid card. This is what Yaron Minsky, Mark Seemann, Scott Wlaschin and Richard Feldman mean when they say "make illegal states unrepresentable". All these talks are amazing, and illustrate very well the benefits of having types that simply do not allow invalid models to be represented, and also cover techniques to achieve that.

And now we have a much more robust set of functions to parse and "pretty print" cards. (The whole final code with the new types is here).
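To give an idea of what parsing looks like with the enumerated type, here is a minimal sketch for the numbered values. The module name CardsSketch and the exact strings accepted are assumptions made for illustration, and the linked final code may differ.

```elm
module CardsSketch exposing (..)

-- Sketch only: the accepted strings are assumptions for illustration.


type Value
    = Jack
    | Queen
    | King
    | Ace
    | Two
    | Three
    | Four
    | Five
    | Six
    | Seven
    | Eight
    | Nine
    | Ten


-- Every branch returns a constructor that exists, so there is no
-- range check to get wrong: anything unexpected falls to Nothing.
parseNumValue : String -> Maybe Value
parseNumValue str =
    case str of
        "2" -> Just Two
        "3" -> Just Three
        "4" -> Just Four
        "5" -> Just Five
        "6" -> Just Six
        "7" -> Just Seven
        "8" -> Just Eight
        "9" -> Just Nine
        "10" -> Just Ten
        _ -> Nothing
```

With this version there is no way to write a branch that produces an eleven, because no such constructor exists.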

Is It Always A Possibility?

I tend to think that we were kind of "lucky" here, in the sense that a normal deck of cards has only nine numbered values (2 through 10). That makes it easy to enumerate every case, but I do not know if it would be practical to do that if the numbered cards were in the 2 - 100 range, for example.

I think that the simple way of dealing with it is to always try first to express your modeling constraints through types. For example, imagine that the only way to have a user name is through logging in. That means that instead of:

type alias User =
    { isLogged : Bool
    , name : String
    }

You should have:

type User
    = NotLogged
    | Logged String

With the second modeling, you never run the risk of having a NotLogged user with a name. That means you don't need a test to ensure that a constructed user is valid. This is what I mean by powerful! :)

And what do I do if I can't model my domain that way? For example, what do I do if I have a deck of cards with numbered cards ranging from 2 to 1000? In this case, I think that testing your constructors with property based tests is the way to go.
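As a sketch of what that could look like for the 2 to 1000 deck: the constructor function numValue below is hypothetical, defined only for this example, and the test follows the same pattern as the fuzz test earlier in this post.

```elm
module BigDeckTests exposing (..)

import Expect
import Fuzz exposing (int)
import Test exposing (..)


type Value
    = Num Int


-- Hypothetical constructor for a deck numbered 2 to 1000.
numValue : Int -> Maybe Value
numValue n =
    if n >= 2 && n <= 1000 then
        Just (Num n)
    else
        Nothing


constructorTest : Test
constructorTest =
    fuzz int "numValue only builds values from 2 to 1000" <|
        \n ->
            case numValue n of
                Just (Num v) ->
                    Expect.true
                        "built values must be in range"
                        (v >= 2 && v <= 1000)

                Nothing ->
                    Expect.false
                        "in-range numbers must not be rejected"
                        (n >= 2 && n <= 1000)
```

The type still allows Num 5000 to be written by hand, so the test only protects code that goes through numValue.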

Observation: even when your types make illegal states unrepresentable, unit and property based tests are still useful for testing state transitions. In our User example, it's useful to test that "the logout function results in a NotLogged User". So, even though good type modeling lowers the need for tests, tests are still useful for making your code reliable.
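A minimal sketch of such a transition test follows. The logout function is hypothetical and defined inline only so the example stands on its own. A real one would live in the application code.

```elm
module UserTests exposing (..)

import Expect
import Fuzz exposing (string)
import Test exposing (..)


type User
    = NotLogged
    | Logged String


-- Hypothetical transition, defined here only for illustration.
logout : User -> User
logout _ =
    NotLogged


all : Test
all =
    describe "logout"
        [ fuzz string "a logged-in user becomes NotLogged" <|
            \name ->
                Expect.equal (logout (Logged name)) NotLogged
        , test "a logged-out user stays NotLogged" <|
            \() ->
                Expect.equal (logout NotLogged) NotLogged
        ]
```

The types guarantee that a NotLogged user has no name. The tests check the part the types cannot express, namely which state a transition ends in.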

One last remark: we could represent our 1000 card deck using only types if we could have a "bounded integer" type, such as "this is an integer larger than X and smaller than Y". This would be a type that depends on values, and it's not possible to do in Elm. Actually, it seems it's not possible to do in any mainstream language. :(

This Stack Overflow question explains dependent typing very directly, and here's a list of languages with dependent typing so we can research more about it. Idris looks particularly nice!

November 1, 2016.