Monday, August 04, 2008

Lazy Evaluation(there be dragons and basement cats)

Lazy Evaluation and "undefined"


I am on the road to being a haskell programmer, and it still is a long way to go. Yesterday I had some nice guys from #haskell explain to me lazy evaluation.

Take a look at this code:

Prelude> let x = undefined in "hello world"
"hello world"
Prelude>

Because of Haskells lazyness, x will not be evaluated because it is not used, hence undefined will not be evaluated and no exception will occur.

The evaluation of "undefined" will result in a runtime exception:

Prelude> undefined
*** Exception: Prelude.undefined
Prelude>


Strictness


Strictness means that the result of a function is undefined, if one of the arguments, the function is applied to, is undefined.
Classical programming languages are strict. The following example in Java will demonstrate this. When the programm is run, it will throw a RuntimeException, although the variable "evilX" is never actually used, strictness requires that all
arguments of a function are evaluated before that function is applied to(called with) these values:



public class Strictness {
public static void main(String[] args) {
int evilX = 1/0;
report_result("sven", evilX);
}

private static void report_result(String name, int x) {
System.out.println("None cares about what "+name + " actually calculated...");
}

}

When executed, the output of the program is:

Exception in thread "main" java.lang.ArithmeticException: / by zero
at com.blogspot.sheyll.strictness.Strictness.main(Strictness.java:10)


A non-strict Haskell equivalent:

module StrictnessFP where

report_results name x = putStrLn $ "None cares about what " ++ name ++ " actually calculated..."

main = let evilX = 1/0 in
report_results "Sven" evilX


When executed the following output appears:

[1 of 1] Compiling StrictnessFP ( C:/Users/sheyll/workspace/StrictnessFP/src/StrictnessFP.hs, interpreted )
Ok, modules loaded: StrictnessFP.
*StrictnessFP> main
None cares about what Sven actually calculated...




Enforcing strictness


Sometimes it makes sense to enforce strictness, in order to optimize the runtime performance. Lazy evaluation means that every value is encapsulated in a thunk, a
function without paramaters, which is passed around instead of the actual value, and which is evaluated if needed. This might cause some overhead, compared to passing around the value directly.
Another point to note is that evaluation order is undetermined, values are evaluated as needed, and therefore only as consequence to the interaction with the real world aka the IO Monad.
In Haskell there is a standard operator to enforce strictness:

seq:: a -> t -> t

With the semantics:

seq a t = if a /= undefined then t else undefined

THIS CODE WILL NOT RUN because the result of comparing something with undefined is always undefined, still this illustrates the idea, that seq will check(evaluate and throw away the result) its first parameter and only return the second parameter if the first did not evaluate to undefined.
Why is this construct usefull?
In case where we have the following structure, and want to optimize the performance and/or space requirements:

complex_calculation x = seq x (... x ....)

"seq" therefore ensures the order of evaluation, that's why it is called "seq".

The following code will output the string "None cares about what Sven actually calculated...":

report_results name x = putStrLn ("None cares about what " ++ name ++ " actually calculated...")

main = report_results "Sven" undefined

Whereas the next version, that incorporates "seq", will throw an Exception at runtime:

report_results name x = seq x (putStrLn ("None cares about what " ++ name ++ " actually calculated..."))

main = report_results "Sven" undefined




Evaluation means reduction to weak head normal form


Simon Peyton Jones, one of the greatest Haskell-Hackers out there, coined the term
"Weak Head Normal Form"(WHNF).
In contrast, the "Head Normal Form"(HNF) requires, that during execution an expression must be reduced until it cannot be further reduced, i.e. all applyable lambda expressions must be applied.
Whereas the WHNF is reached as soon as a lambda expression is achieved, even if it could be reduced further till finally the HNF is reached.
Requirering reduction only to WHNF allows the program to stop the process of reduction, which actually is the process of executing the program, when a value is passed to a function, also if the value ist yet to be calculated by the application of another function.
Requirering reduction to WHNF instead of HNF also allows the reuse of computation results and also helps a lot when implementing "lazy evaluation" in the compiler, because an algorithm called graph reduction can be used.
Please search the net for weak head normal form.
This concludes the theoretical part of this post. The next chapter will deal with
a very practical issue:

There be Dragons: Stackoverflow!



Everyone loves these two higher order functions foldr, foldl.
Here is a possible definition of foldl that's close enough to reality:

my_foldl f a (b:bs) = my_foldl f (f a b) bs
my_foldl _ a [] = a

If that definiton is used in the function below, we get a stack overflow exception:

stack_overflow = my_foldl (\x y -> x + y) 0 [1..1000000]

Because the x+y operations are contained in thunks, that pile up until the stack is full, because they are not evaluated! If they were, all that needs to be on the stack is a single integer, the result of each addition.

For exactly that purpose a special version of foldl exists called " foldl' ", it's basically ....

A Strict "foldl"


The Data.List module of the Glasgow Haskell compiler contains a function called
foldl' that has (roughly) the following definition:


my_foldl' f a l = mfl a l
where mfl a [] = a
mfl a (x:xs) = let a' = f a x in a' `seq` mfl a' xs


The variable a' contains the thunk for the computation f a x, and seq forces a' to be evaluated which means that the thunk called a' will be evaluated - which forces the application of f to a and x.

In this example f is a lambda expression containing the addition of x and y, in fact
it is similar to the expression that caused the stackoverflow:

no_stack_overflow = my_foldl' (\x y -> x + y) 0 [1..1000000]

- only this time there won't be a stackoverflow:

*StrictnessFP> no_stack_overflow
500000500000


So you think you're save now? Be aware, cause

basement cat is in yar toople!!11oneeleven



What do you expect to be the outcome of the evaluation of basement_cat, as defined
here:

basement_cat = my_foldl' (\(x, y) i -> (i, y + i)) (0,0) [1..10000000]


The surprising result is a stackoverflow, although we are using the strict version of foldl.
The stuff piles up on the stack again, because the tuple construction (i, y+i) adds another layer of laziness ontop of the computation. The tuple itself is evaluated because we use the strict foldl, but the contents of the tuple are laziely chilling in thunks, that will not be evaluated until explicitely forced!

The output of the basement_cat:

*StrictnessFP> basement_cat
(10000000,*** Exception: stack overflow


Note how the first element of the tuple can still be evaluated, but not the second.

So how can one get the desired result? The solution is to use the strictness operator "seq":

ceiling_cat = my_foldl' (\(x, y) i -> let y' = y + i in y' `seq` (i, y')) (0,0) [1..10000000]

When executed the interpreter outputs:

*StrictnessFP> ceiling_cat
(10000000,50000005000000)


Write unit tests. Always. Even in Haskell.


So the conclusion for me is that lazy evaluation offers greate benefits:
* optimization potential through graph reduction
* declarative programming style i.e. thtough infinite lists
* recursive functions and datastructures

On the other hand one has to watch out for stack overflows and strictness issues when using i.e. tuples and lists, and all data constructors, as well as newtype contructors which contain tuples. It is essential to pay special attention to these
issues, because they might be not very obvious, at least for a beginner.

Further more the use of the strictness operator should be clear; it should be noted that it is not easiely possible to implement seq in haskell itself. I think it must somehow be built-in.
As the name already suggests seq fixes the order of evaluation:

a `seq` b

Means first eval a then b.

I am convinced, that especially the beginner needs to write unit tests; even in Haskell there are some non-trivial aspects that might cause unexpected behaviour.

No comments: