Sunday, November 30, 2008

my personal profiling guide

This is suited for functional programming.

Blind optimizations: just say NO.

1. Write some unit tests that ensure your code stays correct.
2. DO NOT ASSUME ANY KNOWLEDGE ON WHAT SHOULD BE OPTIMIZED- but try to "see"
possible bottlenecks.
3. Write a the smallest possible program that uses the code to optimize, such that
the program exhibits the functionality that seems to be the bottleneck(s).
Design this program to run for a few seconds without user interaction.
4. Use a decent profiler to profile a complete run of this program
5. Optimize the bad code in this order
- estimate the complexity of the code around hotspots
- try diffrent data structures and algorithms
- try to involve background knowledge
- try to combine several functions into one
- records should fit into the cache of the target platform
- add more strictness(dangerous)
- move common subexpressions or constant subexpressions to a higher scope(generalisation of precalculation)
- use more efficient primitives(do I really need an arbitrary precission floating point)

Following the steps from above I was able to optimize two new programms each by a factor of about 50(!).

Monday, September 15, 2008

ArrowLoop - finally seeing the light at the end of a long tunnel

Today I achieved something I long lusted to acheive, I made the step from fix to loop.
Basically by going from the applicative order Y combinator:


(define (y f)
((lambda (ff)
(lambda (xx)
(f (ff ff) xx)))
(lambda (ff)
(lambda (xx)
(f (ff ff) xx)))))

(define (fak f n)
(if (= 0 n)
1
(* (f (- n 1)) n)))

((y fak) 5)



to writing the same in lazy haskell:



y f = f $ y f

fak r 0 = 1
fak r n = n * r (n - 1)

-- ghci> y fak 5 ----> 120


someone told me that
 fix = y 

and that fix and loop are closely related.
I found an aspiring blogpost and thought it would be fun to
do the fibonacci recursion instead of the simple factorial.

That resulted in:


data SF a b = SF {runSF :: [a] -> [b]}

instance Arrow SF where
arr f = SF (map f)
(SF f) >>> (SF g) = SF (f >>> g)
first (SF f) = SF $ unzip >>> (first f) >>> zip'

instance ArrowLoop SF where
loop (SF f) = SF (\ins ->
let (outs, rs) = (zip' >>> f >>> unzip) (ins, lazy rs)
in outs)


lazy ~(z:zs) = z : lazy zs
zip' = uncurry zip
delay n = SF (n:)

{-

The Fibonacci device
1 2 3 4

o----x ,->(delay b)-. ,-->(id)------->o
| |=>(uncurry (+))-|
o---------'---(id)-----' `-->(delay a)-->o

(the first input is merely a clock source...)

with the loop around:



-}
fib n = take n (runSF (loop (fib' 0 1)) [1,1..])
fib' :: (Num a) => a -> a -> SF (a, a) (a, a)
fib' a b = arr snd >>> -- 1
(delay b &&& arr id) >>> -- 2
arr (uncurry (+)) >>> -- 3
(arr id &&& delay a) -- 4

{-


The experimental Fibonacci device
1 2 3 4
o-------------------------------------------. ,------(id)----->o
,->(delay b)-. |===> (uncurry (*))-|
| |=>(uncurry (+))-----' `--->(delay a)-->o
o---------'---(id)-----'


-}
fibex' :: (Num a) => a -> a -> SF (a, a) (a, a)
fibex' a b = (second
((delay b &&& arr id) >>> -- 1
arr (uncurry (+)))) -- 2
>>>
(arr (uncurry (*))) >>> -- 3
((arr id) &&& (delay a)) -- 4


fibex n = take n (runSF (loop (fib' 0 1)) [1..])



That was/is really fun. I like haskell.

Friday, August 08, 2008

Got Religious

[this is the only religious post you will ever encounter on this blog]

After an excessive amount of thinking, reading and
weighting pros and cons I came to the final conclusion that
actually and deep down in me, I beleive that
1. God exists
2. Christianity makes sense

Many, many arguments exits against 1 and also 2.

But as soon as you accept 1. arguing against 2 becomes
really hard. All objections brought into field against 2. failed to
convince me finally; ok some of them convinced me at first,
and it took years to find out that they were errornous.
Many objections seem clever at first, but really contain
discrepancies or stem from pure ignorance.
Most of those uttering them, do not really reflect
these objections very well. A nice example is
the following "objection" against 1. and partly 2.:
"If god was almighty, could he create a
stone so heavy, that he can't lift it?"
It takes a bit of thinking, but is actually easy to debilitate. I will
leave that as an exercise to the reader, I really don't want to
take the fun out of this; I am sure, that also atheists will
see the inherent logical problems in this objection :)

I think that most religions can be unified to a common theme,
I therefore believe, that some religions contain an abstract
common base, that is below a thick tangible layer of
"stories" - I prefer christianity, as it is the only one
I know just good enough, to see how deeply consistent and
consistent everything is. Although some peices seem odd or
paradox at first, they make sense if you dig deeper and
think about them...

Regarding 1. How could a reasonable scientist exclude
the existence of god, without leaving many
questions open, and without entangling themselves in inconsistencies?
Well even if my ignorance/stupidity is not a
convincing argument that 1. is true, and you believe that
something in this world *whatever that is* exists, you end up
beleiving in some immaterial properties - one way or the other.
I beleive that these properties exist (and might even be corner stones of a
creation done be an intentional, allmighty being, which some call god) and are
basic expressions of the Holy Ghost - as are the brain brain activities,
that ask themselves now, wether this make sense or not.

So ... please do not assume that I will bother anyone with my personal beleivesystem
anymore, I just wanted share these thoughts, that's all I'll ever write about that
topic.

Thursday, August 07, 2008

Erlang code snippet....

Code like that:

blah(X) ->
X1 = f(X),
X2 = g(X1),
X3 = h(X2),
i(X3).

is very error prone, and worse very hard to extend. Just adding another
function application between f(X) and g(X1) requires you to alter all lines
upto the end of the function:

blah(X) ->
X1 = f(X),
X2 = j(X1)
X3 = g(X2),
X4 = h(X3),
i(X4).

This causes 2*numberoflinestoendoffunction potential error sources;

And this is also not at all a very functional style.
Why not rewriting it like this:

blah(X) ->
(chain([
f,
g,
h,
i
]))(X).

where one chains together the functions. Adding a function invocation in between is as easy as pie.



chain(FunList) ->
lists:foldl(
fun combine/2,
fun id/1,
FunList).

id(X) -> X.

combine(F,G) ->
RF = to_local_function(F),
RG = to_local_function(G),
fun(X) ->
RF(RG(X))
end.

to_local_function(F) ->
case F of
{FM, FF} -> F;
_ when is_atom(F) -> {?MODULE, F}
end.

Another solution might be using the state monad, wich was
built for dealing with this kind of functional composition.

Monday, August 04, 2008

Lazy Evaluation(there be dragons and basement cats)

Lazy Evaluation and "undefined"


I am on the road to being a haskell programmer, and it still is a long way to go. Yesterday I had some nice guys from #haskell explain to me lazy evaluation.

Take a look at this code:

Prelude> let x = undefined in "hello world"
"hello world"
Prelude>

Because of Haskells lazyness, x will not be evaluated because it is not used, hence undefined will not be evaluated and no exception will occur.

The evaluation of "undefined" will result in a runtime exception:

Prelude> undefined
*** Exception: Prelude.undefined
Prelude>


Strictness


Strictness means that the result of a function is undefined, if one of the arguments, the function is applied to, is undefined.
Classical programming languages are strict. The following example in Java will demonstrate this. When the programm is run, it will throw a RuntimeException, although the variable "evilX" is never actually used, strictness requires that all
arguments of a function are evaluated before that function is applied to(called with) these values:



public class Strictness {
public static void main(String[] args) {
int evilX = 1/0;
report_result("sven", evilX);
}

private static void report_result(String name, int x) {
System.out.println("None cares about what "+name + " actually calculated...");
}

}

When executed, the output of the program is:

Exception in thread "main" java.lang.ArithmeticException: / by zero
at com.blogspot.sheyll.strictness.Strictness.main(Strictness.java:10)


A non-strict Haskell equivalent:

module StrictnessFP where

report_results name x = putStrLn $ "None cares about what " ++ name ++ " actually calculated..."

main = let evilX = 1/0 in
report_results "Sven" evilX


When executed the following output appears:

[1 of 1] Compiling StrictnessFP ( C:/Users/sheyll/workspace/StrictnessFP/src/StrictnessFP.hs, interpreted )
Ok, modules loaded: StrictnessFP.
*StrictnessFP> main
None cares about what Sven actually calculated...




Enforcing strictness


Sometimes it makes sense to enforce strictness, in order to optimize the runtime performance. Lazy evaluation means that every value is encapsulated in a thunk, a
function without paramaters, which is passed around instead of the actual value, and which is evaluated if needed. This might cause some overhead, compared to passing around the value directly.
Another point to note is that evaluation order is undetermined, values are evaluated as needed, and therefore only as consequence to the interaction with the real world aka the IO Monad.
In Haskell there is a standard operator to enforce strictness:

seq:: a -> t -> t

With the semantics:

seq a t = if a /= undefined then t else undefined

THIS CODE WILL NOT RUN because the result of comparing something with undefined is always undefined, still this illustrates the idea, that seq will check(evaluate and throw away the result) its first parameter and only return the second parameter if the first did not evaluate to undefined.
Why is this construct usefull?
In case where we have the following structure, and want to optimize the performance and/or space requirements:

complex_calculation x = seq x (... x ....)

"seq" therefore ensures the order of evaluation, that's why it is called "seq".

The following code will output the string "None cares about what Sven actually calculated...":

report_results name x = putStrLn ("None cares about what " ++ name ++ " actually calculated...")

main = report_results "Sven" undefined

Whereas the next version, that incorporates "seq", will throw an Exception at runtime:

report_results name x = seq x (putStrLn ("None cares about what " ++ name ++ " actually calculated..."))

main = report_results "Sven" undefined




Evaluation means reduction to weak head normal form


Simon Peyton Jones, one of the greatest Haskell-Hackers out there, coined the term
"Weak Head Normal Form"(WHNF).
In contrast, the "Head Normal Form"(HNF) requires, that during execution an expression must be reduced until it cannot be further reduced, i.e. all applyable lambda expressions must be applied.
Whereas the WHNF is reached as soon as a lambda expression is achieved, even if it could be reduced further till finally the HNF is reached.
Requirering reduction only to WHNF allows the program to stop the process of reduction, which actually is the process of executing the program, when a value is passed to a function, also if the value ist yet to be calculated by the application of another function.
Requirering reduction to WHNF instead of HNF also allows the reuse of computation results and also helps a lot when implementing "lazy evaluation" in the compiler, because an algorithm called graph reduction can be used.
Please search the net for weak head normal form.
This concludes the theoretical part of this post. The next chapter will deal with
a very practical issue:

There be Dragons: Stackoverflow!



Everyone loves these two higher order functions foldr, foldl.
Here is a possible definition of foldl that's close enough to reality:

my_foldl f a (b:bs) = my_foldl f (f a b) bs
my_foldl _ a [] = a

If that definiton is used in the function below, we get a stack overflow exception:

stack_overflow = my_foldl (\x y -> x + y) 0 [1..1000000]

Because the x+y operations are contained in thunks, that pile up until the stack is full, because they are not evaluated! If they were, all that needs to be on the stack is a single integer, the result of each addition.

For exactly that purpose a special version of foldl exists called " foldl' ", it's basically ....

A Strict "foldl"


The Data.List module of the Glasgow Haskell compiler contains a function called
foldl' that has (roughly) the following definition:


my_foldl' f a l = mfl a l
where mfl a [] = a
mfl a (x:xs) = let a' = f a x in a' `seq` mfl a' xs


The variable a' contains the thunk for the computation f a x, and seq forces a' to be evaluated which means that the thunk called a' will be evaluated - which forces the application of f to a and x.

In this example f is a lambda expression containing the addition of x and y, in fact
it is similar to the expression that caused the stackoverflow:

no_stack_overflow = my_foldl' (\x y -> x + y) 0 [1..1000000]

- only this time there won't be a stackoverflow:

*StrictnessFP> no_stack_overflow
500000500000


So you think you're save now? Be aware, cause

basement cat is in yar toople!!11oneeleven



What do you expect to be the outcome of the evaluation of basement_cat, as defined
here:

basement_cat = my_foldl' (\(x, y) i -> (i, y + i)) (0,0) [1..10000000]


The surprising result is a stackoverflow, although we are using the strict version of foldl.
The stuff piles up on the stack again, because the tuple construction (i, y+i) adds another layer of laziness ontop of the computation. The tuple itself is evaluated because we use the strict foldl, but the contents of the tuple are laziely chilling in thunks, that will not be evaluated until explicitely forced!

The output of the basement_cat:

*StrictnessFP> basement_cat
(10000000,*** Exception: stack overflow


Note how the first element of the tuple can still be evaluated, but not the second.

So how can one get the desired result? The solution is to use the strictness operator "seq":

ceiling_cat = my_foldl' (\(x, y) i -> let y' = y + i in y' `seq` (i, y')) (0,0) [1..10000000]

When executed the interpreter outputs:

*StrictnessFP> ceiling_cat
(10000000,50000005000000)


Write unit tests. Always. Even in Haskell.


So the conclusion for me is that lazy evaluation offers greate benefits:
* optimization potential through graph reduction
* declarative programming style i.e. thtough infinite lists
* recursive functions and datastructures

On the other hand one has to watch out for stack overflows and strictness issues when using i.e. tuples and lists, and all data constructors, as well as newtype contructors which contain tuples. It is essential to pay special attention to these
issues, because they might be not very obvious, at least for a beginner.

Further more the use of the strictness operator should be clear; it should be noted that it is not easiely possible to implement seq in haskell itself. I think it must somehow be built-in.
As the name already suggests seq fixes the order of evaluation:

a `seq` b

Means first eval a then b.

I am convinced, that especially the beginner needs to write unit tests; even in Haskell there are some non-trivial aspects that might cause unexpected behaviour.

Friday, August 01, 2008

BlogPostAnswer.hs

{-

Intro


-}
module BlogPostAnswer where

import System.Environment

{-

This is haskell source code, which relates
to a nice blog post of someone comparing
java with python.

Original post



I picked some of the python-java comparisons
from that post, which I would like to
extend in order to contrast java and python
with haskell.

You can actually use the complete post as a
haskell source code file. All prose is
embedded in "{-" and "-}", wich are the block
comment tokens in haskell syntax.

I am a beginner haskell-hacker, and I dont have
a clue about python, so enlighten me, if you spot
a mistake.

Like python, haskell allows you to use indenting
as part of the syntax, unlike python however, you
also have the choice of using "{", "}" and ";"
instead of, or combined with, indentation.

Also, like in python, and unlike java you don't need
to put type declarations into you source code.

The step from java to haskell is
not the step from a statically typed language
to a dynamically typed language. Neither is java
really statically typed, nor is haskell
dynamic.
The opposite is true, haskell has a far stricter static
typesystem than java has, but it uses a variation
of the Hindley-Milner typesystem wich includes
type inference - This effectivly frees one from
writing type specs in most cases.

Haskell has a rich typesystem with all kinds of
good stuff. For someone who knows
C++ template programming, learning haskell might be
easier than learning java 5 generics.

I will try to give code examples in haskell, that
roughly correspond to the java and python code examples
from the blog post.

Plase aquire ghc and load this blogpost as haskell file.
Every piece of code shown here, is complete and can be
executed in the interactice haskell shell called ghci.

Assuming ghci is started in the director where textfile
with this source is located, the following commands
should be typed at the ghci prompt in order to compile and
load the code:

> :l (name of file)

(this loads the code)

Now you can call every function in this source code from the prompt
by writing the name and the arguments of that function seperated by
spaces.
i.e.

> test_matches_2

to run the "main" function, you must compile the file outside the interactive shell and
start the programm from the commandline like this:

bash $ ghc --make (nameofsourcefile)
bash $ ./(nameofsourcefile) "hallo welt"

Again, let me remind the reader, that I am not
a haskell expert, and that there might be far
more elegant ways to do it.

ok, let's get started:

Observation 1: Haskell is a unique mixture of readability and brevity



Java:

public class Test
{
public static void main(String[] args)
{
for (String s : args[0].split("\\s+"))
System.out.println(s);

System.exit(0);
}
}

Haskell:

-}

main = getArgs >>= mapM putStrLn . words . head

{-

This examples pipes a list of command line parameters into the
functional combination of three functions that do the following in the
order they are listed:

  1. takes the first element and ignores the rest: "head"

  2. seperates this element into a list of words of wich the first element consists

  3. write each word seperated by a newline to the stdout


As you may have noticed:

  1. ">>=" pipes data

  2. "." combines two functions to a new function

  3. "mapM f" call the function "mapM" with the parameter "f"


Haskell function application syntax is very similar to the
typical unix shell function invocation syntax.
The name of the function and the arguments are put in a line
next to each other seperated by spaces instead of brackets.


Observation 2: No need for set/get methods in Haskell



One purpose of the 2nd code example from
the above mentioned blogpost is to show off
a nice python feature needed in a language
with mutable state and oo features like instance
variables.

Haskell has neither mutable variables nor objects with variables.
Haskell is a pure funcitonal programming language,
meaning that like erlang, haskell has no notion of
variables that may be changed after being bound.
The typical OO-Class system, having classes with
i.e. setters for properties is not possible in
haskell without rolling your own little object
system (wich isn't hard at all, but also not
necessary).

Haskell provides other means for grouping data.
One way is to use algebraic data types:

Observation 3: Haskell has useful constructs Java lacks


algebraic datatypes


Algeraic data types are those constructs that start with the reserved keyword "data".
They create symbols associated with a place to create new structural type info associated
with that symbol; with the hint that "|" means "or" and that all words starting with capitals
are symbols wich we associate witch structure and meaning, and that everything starting with a
lowercase character is a (type-) parameter, have a look at the definiton of the datatype definitons in the follwoing;


-}

data Point2D = Point2D Double Double deriving (Show)

{-

this defines a data type wich
contains to floats: a 2-D point...
before we go on, let's define another data type,
but this time one that is
parameterised with another type.
Comparable with C++ templates:

-}

data Triangle p = Triangle p p p deriving (Show)

{-

now we create a simple alias
for 2D triangles (typedef in C++):

-}

type Triangle2D = Triangle Point2D

{-

Confused now? good, we can move on...

Our Point2Ds can be created and read.
Their contents (the two floats) are
readonly. This of course doesn't bother
us at all, it turns out in the long run
to be a really nice feature
concerning readability of code.

Assume the following function that generates
a "circle" made of n points with a radius of r:

-}

generate_circle r n = [Point2D (calc sin i) (calc cos i) | i <- [0 .. (n - 1)]]
where calc op i = r * op ((i/n) * 2 * pi)

{-

which looks like this in Java:

class Point2d {
private final double x;
private final double y;

public Point2d(double x, double y) {
this.x = x;
this.y = y;
}
/* ... some getters... */
}

public final class GenerateCircle {

final Point2d[] points;
final double r;
final int n;

public GenerateCircle(double r, int n) {
points=new Point2d[n];
this.r = r;
this.n = n;
for ( int i = 0; i < n; ++i) {
points[i] = new Point2d(calc_sin(i), calc_cos(i));
}
}
private float calc_sin(int i) {
return r * Math.sin((double)i/n * 2.0 * Math.PI);
}
private float calc_cos(int i) {
return r * Math.cos((double)i/n * 2.0 * Math.PI);
}
/* ... some getters ... */
}


pattern matching and guards



Pattern matching is syntactic sugar that frees one
from writing complicated if and switch constructs.
Combined with algebraic data types, a feature that java lacks for sure -
it is, as we have seen above, a way to simulate generic methods.

Another useful construct is the function guard;
let's look at a practical example:
the programmer of the display function
may use a simple constraints(a guard) to protect
the precious display function:
(the guard is the boolean expression
between the "|" and the "=")
Now if in that example x or y is less than 0 a runtime exception
is thrown, with detailed information where and why the error was caused.(Try it already!)

-}

display_list primitives = [display p | p <- primitives]
where display (Point2D x y)
| x >= 0 && y >= 0 = show (Point2D x y)

{-

All we do now is add a super type for primitives
and pimp our "display_list function a little to make
use of pattern matching in conjunction with
algebraic data types(see below):

-}

data Primitive = Point Point2D | Tri Triangle2D

display_primitives :: [Primitive] -> [String]
display_primitives primitives = [display p | p <- primitives]
where
display (Point (Point2D x y))
| x >= 0 && y >= 0 = show (Point2D x y)
display (Tri (Triangle p1 p2 p3))
= show (Triangle p1 p2 p3)

{-

higher order functions, lambdas, currying and sections



When it comes to functions, lambdas and closures haskell really shines.
In haskell everything is done with functions. Let's have
a look at a small example.
Note that this example also makes heavy use of algebraic datatypes and pattern
matching.

-}

data MyTree t = Leaf t | Node [MyTree t]

{-


In Java this would look like this:

public interface MyTree<T> {
public class Leaf<S> implements MyTree<S> {
public S data;
}
public class Node<S> implements MyTree<S> {
public MyTree<S> childs;
}
}


This results in a tree with data in the leafs. Let's define a matches function,
that returns a list of "ts" that match a user defined predicate(or matcher function):

-}

matches matcher (Leaf t) = if matcher t then [t] else []
matches matcher (Node childs) = [e | child <- childs, e <- matches matcher child]

{-

well that's it! The first clause of the function "matches" is applied only to Leafs
and the second clause uses list comprehension to call "matches" recursivly
for every child node.

In java is considerably harder to write (and to read).

Now let's leave this planet, and let's head off into a galaxy far far away...
Lets write a matcher that checks that the first letter is a capital "A", but let's
do this in two steps, lets first create a "startsWith" function that you should be
familiar with from other languages, and after the a function using the first one,
that checks for the capital "A".

-}
startsWith str = (( == str) . (take (length str)))

{-

this can actually be translated to english:
"combine two functions: (this is the "." operator)


  1. A function that compares something for equality with str, this is "(== str)".
    I used a fancy haskell feature here... (==str) is actually function call
    to the function "==" which needs to arguments and returns True or False.
    We gave the function only one argument "str". Of course the "==" function
    cannot give us a Boolean value for that! But instead it gives us a new function,
    a closure, that takes only one argument and returns a Bool. Isn't that cool?!
    I mean once you get used to that, it's really readable and you will never want
    to go back!


  2. A function that returns a sublist of a list containing only the the first
    "(length str)" elements: "(take (length str))"
    BTW, a string in haskell is a list of characters, and "length" returns the number
    of elements in a list.


Observe that startsWith takes only one argument not two! You may wonder how startsWith
can work, because it needs the string to search for and a string to search in.

Recall the weired (== str) construct. What happend here happens again: we created a new
anonymous function by calling a function named "==" with not enough arguments, btw.
this is called currying(- invented by the logician "Haskell Brooks Curry")
We also called "take" wich actually needs two arguments with only one argument
(the string length). The result is therefore a function the takes a string and returns
the desired sub-string with the same length that "str" has.
Now the combine operator (".") combines the two functions that each take one argument
and creates a new one such that the new function takes the input string for (take ...)
and returns the result of "(==str)" called with the result of "(take ...)";

Let's forget about (==str) and (take...) for a while and let's look at the definition of
".":

(f . g) x = f(g(x))

this might look familiar...
Now it should become clearer what the result of calling the startsWith function with
ONE parameter is: a new function that takes a string and returns a Bool indicating
wether or not the string given to that anonymous function starts with the string!

Once a programmer has learned very few core concepts about the function handling,
haskell source code is very readable, although it is often less terse than java code.

Also note that our startsWith function is perfectly generic, it can be applied to
ANY type of lists not just strings wich happen to be a synonym for lists of "Char"!

Now I do not want to write down equivilant code for that in java, I think it is
possible. Also note that I do not want to judge the qualities of languages,
I just want to study them.

Ok, back to topic, let's create a test function that roughly corresponds to the one in
the original blogpost.

Remeber that we wanted to define a second function in terms of the first function,
that checks if a string starts with a capital "A"?
Here we use currying again, and simlpy and clearly write:

-}
startsWithA = startsWith "A"

keywordTree = Node [ Node [ Leaf "Hallo"], Leaf "lieber", Leaf "Allerwertester", Node [Node [], Node [Leaf "Wie"], Leaf "gehts"], Leaf "dir"]

test_matches_1 = matches startsWithA keywordTree

{-

oh, let's not forget to introduce a lambda!
the greek symbol for a small lambda actually looks alot like the "\" (backslash) charakter,
so in haskell a lambda is defined like this:

\x -> x + 2

in python:

lambda x: x + 2

another way in haskell:

(2 +)

A lambda expression with two parameters can be written in haskell like this:

\x y -> x + y

or simply:

(+)

another way to write (==str) using a lambda:

\x -> x == str


Ok, back to the tree example,
let's use a lambda expression as a parameter for "matches"
to filter all words shorter than 5 characters.
The then the complete expression looks like this:
  
-}

test_matches_2 = matches (\x -> length x <= 5) keywordTree

{-

of course we can assign the lamda expression to a name, and use that instead.

-}

shorter_than_5_chars = \x -> length x <=5
test_matches_3 = matches shorter_than_5_chars keywordTree

{-

This can of course be written with haskells usual function definition syntax:

-}

shorter_than_4_chars x = length x <= 4
test_matches_4 = matches shorter_than_4_chars keywordTree

{-

And even more compact using functional compisition and sections:

-}

shorter_than_6_chars = (<=6) . length
test_matches_5 = matches shorter_than_6_chars keywordTree

{-

shorter_that_6_chars is written in point-free style, that means
that in contrast to the previous definitons we didn't explicitly name
the argument wupped around in the function, instead we just combined
the the functions.
This is very smilar to the pipe syntax in the bash:

$ cat /proc/cpuinfo | grep bogo | awk '{print $3}' | sort -n

Here too no variable is mentioned, that actually contains the current line
that is passed around.
A functions stdin is its parameter and a functions stdout is its return value.
That's also why it is so important in haskell to have currying automatically
everywhere, because it allows me to use a function with more than one
parameters as a function with one parameter wich returns a new function,
wich takes one parameter and returns a new functions wich takes one parameter ... etc
until final we get the actual result of the computation.

Factories?


Writing factories in haskell might by now seem not very interesting; still
let's write some RPC class definitons in haskell:

-}

data RPC = XMLRPC String | JSONRPC String

data RPCResult = Void | Str String | Strs [String] | Numbers [Int] -- define some results here...
deriving (Show)

callRemote :: RPC -> String -> [String] -> RPCResult
callRemote (XMLRPC conn) function args =
Str ("nyi, xmlrpc to "++ conn ++" func: " ++ function ++ " args: " ++ (show args))
callRemote (JSONRPC conn) function args =
Str ("nyi, jsonrpc to "++ conn ++" func: " ++ function ++ " args: " ++ (show args))


{-

Here is a little testfunction so you know how to use this;
it will call the jsonrpc service "helloworld" at "http://..." with the arguments 1,2

-}

test_json_rpc = callRemote (JSONRPC "http://test.org/service") "helloworld" ["1", "2"]

--
now to a the factory:


rpc_factory :: String -> String -> RPC
rpc_factory impl arg = case impl of
"JSON" -> JSONRPC arg
"XML" -> XMLRPC arg

test_rpc_factory name = callRemote conn "trigger" []
where conn = rpc_factory name "http://alarm.org/sirene"
{-

Call the test_rpc_factory function with either "XML" or "JSON" from the
ghci shell.

A factory also makes sense when one wants to extend the programm at runtime.
This can be done in haskell. Code can be dynamically loaded and recompiled
in running haskell programms (see "hs-plugin"), with type checking and dependency
resolution.

It is possible to create a complete OO-Class infrastructure in haskell,
similar to what exists in python, java and CLOS.

Lesser need for DI and friends



A nice application of factories is dependency injection.
This extends the idea of factories, it allows programmers to wire all class
instance dependencies and configuration values in one function in a declarative way.
In powerfull functional languages this might be pointless in situation where this approach makes
perfect sense in programs written in python or java, as the means of functional abstraction
and combination, especially when used in combination with a powerfull type system, and pattern
matching, render dependency injection frameworks simply useless - if the whole point
of the dependency injection was merely the declarative wireing of class dependencies and
configuration and not the ability to change all that at runtime.
On the other hand, in systems like the erlang runtime complicated mechanisms exist to actually
exchange code and the the dependencies of the code at runtime,
while maintaining the serveral versions of the same code when it is still required,
with automatically restarting services in the correct order when necessary, etc.

Well I hope I have shown, that haskell can actually be fun to work with, and that
diffrent approaches in the design of programming languages are always fun to look at.

That's all folks


The original blog post author compared python with java to show the things he likes in python and
he misses in java.
This comparison has shown how similar phyton and java actually are when compared with haskell ;)
My relation to haskell and java is similar to the blogpost authors relation to python and java,
I think that the author should definitely have a look at haskell, for the
relation to haskell and python might just develop to be of a similar nature,
and the learning experience might also be pretty enlightening.

-}

Tuesday, May 27, 2008

NumberCheck in haskell


import System.Environment
import Data.Char

main = do
arg:_ <- getArgs
putStrLn (arg ++ " is " ++ (result arg))
where result arg = if check arg then "correct" else "incorrect"
check arg = dotOrDigit `all` arg && oneOrZeroDots arg
dotOrDigit = (`elem` ['0'..'9'] ++ ['.'])
oneOrZeroDots = (<=1) . length . (filter ('.'==))

Wednesday, May 21, 2008

logs display, little experiment

this code ...


module LogDiffMarker where

type ReferenceWords = [String]
type Lines = [Line]
type Line = String

mark_diffrent_words :: (ReferenceWords, Lines) -> Line -> (ReferenceWords, Lines)
mark_diffrent_words (ref,i) line = (next_ref, i ++ [unwords line_with_em])
where (line_with_em, next_ref) = mark_diffrences (words line) ref ([],[])
mark_diffrences l1 [] (line_acc, ref_acc) = (line_acc ++ ["<b>"] ++ l1 ++ ["</b>"], ref_acc ++ l1)
mark_diffrences [] r (line_acc, ref_acc) = (line_acc, ref_acc)
mark_diffrences (h1:t1) (h2:t2) (line_acc, ref_acc)
| h1 == h2 = mark_diffrences t1 t2 (line_acc ++ [h1] ,ref_acc ++ [h2])
| otherwise = mark_diffrences t1 t2 (line_acc ++ ["<b>" ++ h1 ++ "</b>"],ref_acc ++ [h1])

mark_diffrent_lines :: Lines -> Lines
mark_diffrent_lines = (["<pre>"]++) . (++["</pre>"]) . snd . (foldl mark_diffrent_words ([], []))

main = interact (unlines . mark_diffrent_lines . lines)


...is a proof of concept of a highlighter algorithm for a logdisplayer that - if proofed useful - will be rewritten in javascript/html

from:

May 10 05:52:19 ubuntu syslogd 1.5.0#1ubuntu1: restart.
May 10 06:22:02 ubuntu -- MARK --
May 10 06:42:02 ubuntu -- MARK --
May 10 06:51:49 ubuntu kernel: [ 2145.672201] nautilus[6101]: segfault at e1000020 eip b773f2d6 esp bfe4d9d0 error 5
May 10 06:51:51 ubuntu kernel: [ 2147.588620] gdm[5658]: segfault at 10c03f90 eip b7801635 esp bf8100e0 error 4
May 10 06:51:53 ubuntu kernel: [ 2149.171081] ip6_tables: (C) 2000-2006 Netfilter Core Team
May 10 06:51:54 ubuntu exiting on signal 15
May 12 19:07:04 ubuntu syslogd 1.5.0#1ubuntu1: restart.
May 12 19:07:04 ubuntu kernel: Inspecting /boot/System.map-2.6.24-16-generic
May 12 19:07:04 ubuntu kernel: Loaded 27704 symbols from /boot/System.map-2.6.24-16-generic.
May 12 19:07:04 ubuntu kernel: Symbols match kernel version 2.6.24.
May 12 19:07:05 ubuntu kernel: Loaded 32155 symbols from 83 modules.
May 12 19:07:05 ubuntu kernel: [ 0.000000] Linux version 2.6.24-16-generic (buildd@palmer) (gcc version 4.2.3 (Ubuntu 4.2.3
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-provided physical RAM map:
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 0000000000100000 - 000000007fff0000 (usable)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 000000007fff0000 - 000000007fff3000 (ACPI NVS)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 000000007fff3000 - 0000000080000000 (ACPI data)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 00000000d0000000 - 00000000e0000000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] 1151MB HIGHMEM available.
May 12 19:07:05 ubuntu kernel: [ 0.000000] 896MB LOWMEM available.
May 12 19:07:05 ubuntu kernel: [ 0.000000] found SMP MP-table at 000f52c0
May 12 19:07:05 ubuntu kernel: [ 0.000000] Zone PFN ranges:
May 12 19:07:05 ubuntu kernel: [ 0.000000] DMA 0 -> 4096
May 12 19:07:05 ubuntu kernel: [ 0.000000] Normal 4096 -> 229376
May 12 19:07:05 ubuntu kernel: [ 0.000000] HighMem 229376 -> 524272
May 12 19:07:05 ubuntu kernel: [ 0.000000] Movable zone start PFN for each node
May 12 19:07:05 ubuntu kernel: [ 0.000000] early_node_map[1] active PFN ranges
May 12 19:07:05 ubuntu kernel: [ 0.000000] 0: 0 -> 524272
May 12 19:07:05 ubuntu kernel: [ 0.000000] DMI 2.3 present.
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: RSDP signature @ 0xC00F6C50 checksum 0
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: RSDP 000F6C50, 0014 (r0 Nvidia)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: RSDT 7FFF3000, 0030 (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: FACP 7FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: DSDT 7FFF30C0, 4BF2 (r1 NVIDIA AWRDACPI 1000 MSFT 100000C)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: FACS 7FFF0000, 0040
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: MCFG 7FFF7D40, 003C (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: APIC 7FFF7CC0, 007C (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
May 12 19:07:05 ubuntu kernel: [ 0.000000] If you got timer trouble try acpi_use_timer_override
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: PM-Timer IO Port: 0x1008


to


May 10 05:52:19 ubuntu syslogd 1.5.0#1ubuntu1: restart.
May 10 06:22:02 ubuntu -- MARK --
May 10 06:42:02 ubuntu -- MARK --
May 10 06:51:49 ubuntu kernel: [ 2145.672201] nautilus[6101]: segfault at e1000020 eip b773f2d6 esp bfe4d9d0 error 5
May 10 06:51:51 ubuntu kernel: [ 2147.588620] gdm[5658]: segfault at 10c03f90 eip b7801635 esp bf8100e0 error 4
May 10 06:51:53 ubuntu kernel: [ 2149.171081] ip6_tables: (C) 2000-2006 Netfilter Core Team
May 10 06:51:54 ubuntu exiting on signal 15
May 12 19:07:04 ubuntu syslogd 1.5.0#1ubuntu1: restart.
May 12 19:07:04 ubuntu kernel: Inspecting /boot/System.map-2.6.24-16-generic
May 12 19:07:04 ubuntu kernel: Loaded 27704 symbols from /boot/System.map-2.6.24-16-generic.
May 12 19:07:04 ubuntu kernel: Symbols match kernel version 2.6.24.
May 12 19:07:05 ubuntu kernel: Loaded 32155 symbols from 83 modules.
May 12 19:07:05 ubuntu kernel: [ 0.000000] Linux version 2.6.24-16-generic (buildd@palmer) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Thu Apr 10 13:23:42 UTC 2008 (Ubuntu 2.6.24-16.30-generic)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-provided physical RAM map:
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 0000000000100000 - 000000007fff0000 (usable)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 000000007fff0000 - 000000007fff3000 (ACPI NVS)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 000000007fff3000 - 0000000080000000 (ACPI data)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 00000000d0000000 - 00000000e0000000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
May 12 19:07:05 ubuntu kernel: [ 0.000000] 1151MB HIGHMEM available.
May 12 19:07:05 ubuntu kernel: [ 0.000000] 896MB LOWMEM available.
May 12 19:07:05 ubuntu kernel: [ 0.000000] found SMP MP-table at 000f52c0
May 12 19:07:05 ubuntu kernel: [ 0.000000] Zone PFN ranges:
May 12 19:07:05 ubuntu kernel: [ 0.000000] DMA 0 -> 4096
May 12 19:07:05 ubuntu kernel: [ 0.000000] Normal 4096 -> 229376
May 12 19:07:05 ubuntu kernel: [ 0.000000] HighMem 229376 -> 524272
May 12 19:07:05 ubuntu kernel: [ 0.000000] Movable zone start PFN for each node
May 12 19:07:05 ubuntu kernel: [ 0.000000] early_node_map[1] active PFN ranges
May 12 19:07:05 ubuntu kernel: [ 0.000000] 0: 0 -> 524272
May 12 19:07:05 ubuntu kernel: [ 0.000000] DMI 2.3 present.
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: RSDP signature @ 0xC00F6C50 checksum 0
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: RSDP 000F6C50, 0014 (r0 Nvidia)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: RSDT 7FFF3000, 0030 (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: FACP 7FFF3040, 0074 (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: DSDT 7FFF30C0, 4BF2 (r1 NVIDIA AWRDACPI 1000 MSFT 100000C)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: FACS 7FFF0000, 0040
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: MCFG 7FFF7D40, 003C (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: APIC 7FFF7CC0, 007C (r1 Nvidia AWRDACPI 42302E31 AWRD 1010101)
May 12 19:07:05 ubuntu kernel: [ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
May 12 19:07:05 ubuntu kernel: [ 0.000000] If you got timer trouble try acpi_use_timer_override
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: PM-Timer IO Port: 0x1008
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
May 12 19:07:05 ubuntu kernel: [ 0.000000] Processor #0 15:3 APIC version 16
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
May 12 19:07:05 ubuntu kernel: [ 0.000000] Processor #1 15:3 APIC version 16
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
May 12 19:07:05 ubuntu kernel: [ 0.000000] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
May 12 19:07:05 ubuntu kernel: [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)

Wednesday, April 30, 2008

functional music, honorable mention of new composer C.W. Boese

just after the previous post a talented composer (Boese) used the small framework for functional music in haskell for one of his compisitions. His work balances harmony and disharmony in an unbalanced way and thereby balances the unbalancing and balancing forces driving the excellent piece of modern algorithmic music. with his silent permission the source as well the interpretation is attached to this post.

Here's the interpretation using an artifical Clarinet: play.

Here's the source:

boeseboese = Rest (2/4) :+: (hochlauf (-2) 3 boese) :=: (hochlauf (2) 3 boese) :+: boese
boese = rpt 3 freak :=: rpt 3 ghoul :=: rpt 3 funk
grund = (Note (Cis,5) 1)
freak = rptm (Trans (round (abs (duration ghoul)))) 3 funk
ghoul = akkord grund 0
funk = Trans terz grund :=: rptm (Trans quinte) 4 ghoul

just append that to the previous program and adapt the main function.


Maybe it's also worth mentioning, that this was the first piece of haskell code written by him.
Haskell syntax helps to create frameworks with very little effort, that reassemble DSLs. In this case parts of the DSL for functional music from the School Of Expression was used to aid composers in creating algorithmic music without the syntactic burdens of a low-level language like (Ihh.. pfui... bahh) "java", that lack many, even basic features, like being NullPointer free, pattern matching, currying, higher order functions, type inference, laziness, ...etc.

Tuesday, April 29, 2008

Learning Haskell, functional music

As you might have realized, I started to learn Haskell. One of the most fun things to do in any programming language is creating some kind of audible side effects with a program.
Already back in the days when I started programming, I always played around with audio when toying around with a new language.

I have found a wonderful set of lecture slides about haskell and multimedia programming, called school of expression. Inspired by the slides about functional music I implemented a little song. Ahh ... and yes it is intended to sound slightly strange.

I used the synthesis toolkit to transform the music to real noise, simply by piping skini message to std-out.

I used this command line to achieve the results audible in the table:

sven@hhi1214a:~/Mukke$ ghc -o test1 test1.hs && ./test1 | stk-demo Plucked -n 16 -or -ip

Sound samples:




Pluckedplay
Clarinetplay
Whistle(attention very crazy!)play


As always the source...

stueck = anfang :+: mitte :+: ende

anfang = groovy :+: (Trans (-5) trickie)
mitte = seq_overlap (1/3) mitte1 mitte2
mitte1 = (3/4) `schneller` (Trans (-2) groovy)
mitte2 = (3/4) `schneller` groovy
ende = Rest (1/4) :+: (moll_akkord (Note (A,4) 1))

groovy = dumschingbadumschingsching 3 (2/3) -- fast schon swing artiger rythmischer versatz
trickie = dumschingbadumschingsching 2 (3/4)

blim_sching af = ((Note (A,3) (1/4)) :+: (af (Note (A,4) (1/4))))
dumdideldum af = quinten_hochlauf 4 (rpt 2 $ blim_sching af) -- 4 klingt leicht schief
dumschingbadumschingsching r versatz = interleave r versatz (dumdideldum moll_akkord) (Trans 3 (dumdideldum dur_akkord))

main :: IO ()
main = do
mapM_ putStrLn $ music_to_skini stueck 33

-- functional music composition
hochlauf tr n m = rep (Trans quinte) (delay (duration m)) n m
quinten_hochlauf = hochlauf quinte
interleave n dist m1 m2 = (rpt n m1) :=: (Rest dist :+: (rpt n m2))
schneller spdup m = (Tempo ((duration m)*(spdup)) m)

akkord grundton moll_dur = grundton :=: (Trans (3 + moll_dur) grundton) :=: (Trans 7 grundton)
moll_akkord grundton = akkord grundton 0
dur_akkord grundton = akkord grundton 1

terz = 3
gr_terz = 4
quarte = 5
quinte = 7

seq_overlap overlap m1 m2 = m1 :=: (delay ((duration m2) - overlap) m2)

rpt 0 m = Rest 0
rpt n m = m :+: rpt (n-1) m

rptm f 0 m = Rest 0
rptm f n m = m :+: rptm f (n - 1) (f m)

rep f g 0 m = Rest 0
rep f g n m = m :=: g (rep f g (n-1) (f m))

-- functional music basics, based on the school of expression lecture slides
data Music = Note Pitch Dur
| Rest Dur
| Music :+: Music
| Music :=: Music
| Tempo Dur Music
| Trans Int Music
| Instr Int Music
| Control ControlMessage Music
deriving (Show, Eq)

type Dur = Rational

type Pitch = (Tone, Octave)
data Tone = C | Cis | D | Dis | E | F | Fis | G | Gis | A | B | H
deriving (Show, Eq)
type Octave = Int

type ControlMessage = (Int, Int)

-- absolute pitches
type AbsPitch = Int

trans :: Int -> Pitch -> Pitch
trans i p = pitch (absPitch p + i)

pitch :: AbsPitch -> Pitch
pitch p = ([C, Cis, D, Dis, E, F, Fis, G, Gis, A, B, H] !! mod p 12, p `quot` 12)

absPitch :: Pitch -> AbsPitch
absPitch (n, o) = (o * 12) + (case n of C -> 0; Cis -> 1; D -> 2; Dis -> 3; E -> 4; F -> 5; Fis -> 6; G -> 7; Gis -> 8; A -> 9; B -> 10; H -> 11)

-- delay
delay :: Dur -> Music -> Music
delay d = ((Rest d) :+:)

-- absolute durations
abs_dur :: BeatPM -> Dur -> AbsDur
abs_dur bpm d = (fromRational d) / (bpm / (60.0))

type BeatPM = Float
type AbsDur = Float

-- calculate the duration of a piece of music
duration :: Music -> Dur
duration m = case m of
Rest d -> d
Note _ d -> d
m1 :+: m2 -> duration m1 + duration m2
m1 :=: m2 -> max (duration m1) (duration m2)
Tempo d _ -> d
Trans _ m1 -> duration m1
Instr _ m1 -> duration m1
Control _ m1 -> duration m1

-- convert to a simple event stream of musical events with absolute values
-- that soley consist of NoteOn and NoteOff events with midi notes and timestamps
data MusicEvent = MusicEv MusicEventType AbsPitch AbsDur
deriving (Show, Eq)

data MusicEventType = NoteOn | NoteOff
deriving (Show, Eq)

to_event_stream :: BeatPM -> Music -> AbsDur -> (AbsDur, [MusicEvent])
to_event_stream bpm um start_time =
case m of
Note p d ->
let dur = start_time + (abs_dur bpm d)
in (dur, [(MusicEv NoteOn (absPitch p) start_time), (MusicEv NoteOff (absPitch p) dur)])

Rest d ->
(start_time + (abs_dur bpm d), [])

Tempo d m1 ->
to_event_stream (bpm * (fromRational (duration m1)) * 1.0/(fromRational d)) m1 start_time

m1 :+: m2 ->
(total_dur, evt_str1 ++ evt_str2)
where (st1, evt_str1) = to_event_stream bpm m1 start_time
(total_dur, evt_str2) = to_event_stream bpm m2 st1

m1 :=: m2 ->
(max dur1 dur2, merge evt_str1 evt_str2)
where (dur1, evt_str1) = to_event_stream bpm m1 start_time
(dur2, evt_str2) = to_event_stream bpm m2 start_time

_ -> (start_time, [])

where
m = transpose_all um 0
transpose_all :: Music -> Int -> Music
transpose_all m tr = case m of
Trans tr2 m1 -> transpose_all m1 (tr + tr2)
Note p d -> Note (trans tr p) d
m1 :+: m2 -> (transpose_all m1 tr) :+: (transpose_all m2 tr)
m1 :=: m2 -> (transpose_all m1 tr) :=: (transpose_all m2 tr)
other -> other
merge :: [MusicEvent] -> [MusicEvent] -> [MusicEvent]
merge l1 l2 = mergea l1 l2 []
mergea :: [MusicEvent] -> [MusicEvent] -> [MusicEvent] -> [MusicEvent]
mergea [] l2 acc = acc ++ l2
mergea l1 [] acc = acc ++ l1
mergea l1@(e1@(MusicEv _ _ dur1):r1) l2@(e2@(MusicEv _ _ dur2):r2) acc
| dur1 <= dur2 = mergea r1 l2 (acc ++ [e1]) | dur1 > dur2 = mergea l1 r2 (acc ++ [e2])

-- convert to skinni
{-

Friday, April 11, 2008

Haskell Thread Ring Benchmark

Not as fast as the erlang version coming soon:




import Control.Monad
import Control.Concurrent
import System.Environment
import System.CPUTime

fork_ring_elem prev_mvar _ = do
next_mvar <- newEmptyMVar
forkIO (ring_elem prev_mvar next_mvar)
return next_mvar

ring_elem :: MVar Int -> MVar Int -> IO ()
ring_elem prev_mv next_mv = run
where run = do
token <- takeMVar prev_mv
putMVar next_mv (token - 1)
when ( token > 0 ) run

first_ring_elem :: MVar Int -> MVar Int -> IO ()
first_ring_elem prev_mv next_mv = run
where run = do
token <- takeMVar prev_mv
putMVar next_mv (token - 1)
putStrLn "."
when ( token > 0 ) run

main = do
(procsArg:roundsArg:_) <- getArgs
first_mvar <- newMVar ((read procsArg) * (read roundsArg))
t1 <- getCPUTime
last_mvar <- foldM fork_ring_elem first_mvar [2..(read procsArg)]
t2 <- getCPUTime
putStrLn ("forked processes in " ++ (show (t2 - t1)) ++ " ps")
first_ring_elem last_mvar first_mvar
t3 <- getCPUTime
putStrLn "\n***finished***"
putStrLn ("total time: " ++ (show (t3 - t1)) ++ " ps")

Saturday, March 08, 2008

coping emacs #1

After having written a lot of erlang code with emacs(uhhm ... no, erlide, the erlang eclipse plugin, is not yet a viable alternative...), I decided to start (back-)porting eclipse features to emacs. Yeahhhh! I am now among those 1337 |-|4><025 that have their own ".emacs" in ~.

Well one feature, that I can't live without, is moving the current line with Alt - up/down, this is really usefull together with anotherthe other feature: duplicating the current line(Ctrl-Alt-up/down).

Below is the elisp code that provides these features and binds them to the keys these are bound to in eclipse. Note: In contrast to the eclipse version of this feature, this implementation does not(yet) work with regions, but with lines only.

Ok, put this in your ~/.emacs

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Written by Sven Heyll in 2008
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun sh-wind-col (thunk)
(let ((col (- (point) (line-beginning-position))))
(apply thunk '())
(beginning-of-line)
(forward-char col)))

(defun sh-with-current-line (func)
(let ((cline (buffer-substring (line-beginning-position) (line-end-position))))
(apply func `(,cline))))

(defun sh-dup-line-down ()
(interactive)
(sh-wind-col
(lambda ()
(sh-with-current-line
(lambda (cline)
(end-of-line)
(insert ?\n)
(insert cline))))))

(defun sh-dup-line-up ()
(interactive)
(sh-wind-col
(lambda ()
(sh-with-current-line
(lambda (cline)
(beginning-of-line)
(insert cline)
(insert ?\n)
(forward-line -1))))))

(global-set-key [C-M-down] 'sh-dup-line-down)
(global-set-key [C-M-up] 'sh-dup-line-up)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun sh-move-line (dir)
(defun extract-a-line (dir)
(let ((line-start (line-beginning-position (+ 1 dir)))
(line-end (line-end-position (+ 1 dir))))
(setq res (delete-and-extract-region line-start line-end))
(delete-region line-start (+ 1 line-start))
res))
(if (not
(or
(and (< dir 0) (= (line-beginning-position) (point-min)))
(and (> dir 0) (= (line-end-position) (- (point-max) 1)))))
(let
((col (- (point) (line-beginning-position)))
(line (extract-a-line dir)))
(if (< dir 0)
(progn
(end-of-line)
(insert ?\n)
(insert line))
(progn
(beginning-of-line)
(insert line)
(insert ?\n)))
(beginning-of-line (truncate (+ 0.5 (* 0.5 dir))))
(forward-char col))))

(defun sh-line-down ()
(interactive)
(sh-move-line 1))

(defun sh-line-up ()
(interactive)
(sh-move-line -1))

(global-set-key [M-down] 'sh-line-down)
(global-set-key [M-up] 'sh-line-up)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Saturday, February 16, 2008

Visualisation of DNA

After some attention has been caought by the wonderfull Rainbow DNA project, I have decided to join the club! Here is a very simplistic, far more useless way of visualising DNA: turtle graphics.
I really cannot put up a website with comlete renderings of the DNA using turtle graphics, but I uploaded two sample images to my flikr account. I wanted to find out if turtle graphics could reveal diffrent sets of patterns as those perceivable with a color plot of basepairs.

Attached is source code so you can see what the program does. You can also reuse the part that proceses the contents of "gbk" files containing genome data.

The code is just a hack done at night while I was waiting in a starbucks for a friend to pick me up, so if you think this code is a mess - you earned your degree, I was just curios after I found out about the rainbow dna project.

Well the idea of the program is very simple:
1. Initialise the turtle to be in center of the screen
2. read the next basepair, for each base encountered look up the turtle rotation
3. rotate the turtle
4. draw 5 pixels
5. goto step 2 until finished with a gene

So here are the results, not surprisingly very unspectacular. If you want a good representation for the contents of the human genome, well ... look into a mirror. All other reps. just look ridiculous in comparison.

I think it's funny, that one could argue that the dna seems to be like a multi-quine: not only that it conatins the code that creates organisms to reproduce itself through regular biological reproduction, it also encodes a brain with the ability to create turtle renderings of itself...

In the following image rendering the following rules were applied. Whenever an "a" (as of a, c, t, g) is encountered turn the turle by -180 degree, on "c" -60 degrees, "t" 60 deg and "g" 180 deg.

turtle dna rendering heyll-mode

In the next rendering the following rules apply. Whenever an "a" turns the turle by 23 degree, on "c" 42 degrees, "t" 128 deg and "g" 15 deg.

turtle dna rendering boese-mode

Ok now here is the code.
You might want to get the gbk files. Have a look at my delicious account, I have stored a link to an ftp server where a gbk file for every chromosome of the human genome can be found.

For my experiments I used parts of the X chromosome.


(require (lib "turtles.ss" "graphics"))


(load "boyer-moore.scm")
(load "lazy-streams.scm")
(load "list-utils.scm")

;; read a line
(define LS #\newline)

(define (read-line port)
(let loop ((line '())
(c (read-char port)))
(if (eof-object? c)
(reverse (cons c line))
(if (eqv? c LS)
(reverse line)
(loop (cons c line) (read-char port)) ) )))

(define (lazy-line-stream port)
(define current-line (read-line port))
(define result-stream
(cons-lazy-stream current-line (lazy-line-stream port)))
(if (eof-object? (car current-line))
the-empty-lazy-stream
result-stream))

(define (filter-bases char-list)
(filter
(lambda (c) (or (eqv? c #\a) (eqv? c #\c) (eqv? c #\g) (eqv? c #\t)))
char-list))

(define (process-gbk port info-block-func base-pair-func post-draw)
(define (loop-base-pairs line-stream)
(if (empty-lazy-stream? line-stream)
(display "stream exhausted(while basepair parsing).")
(if (and (eqv? (car (lazy-head line-stream)) #\/) (eqv? (cadr (lazy-head line-stream)) #\/))
(begin
(post-draw)
(loop-header (lazy-tail line-stream) '()))
(begin
(base-pair-func (filter-bases (lazy-head line-stream)))
(loop-base-pairs (lazy-tail line-stream))))))
(define (loop-header line-stream info-block-A)
(if (empty-lazy-stream? line-stream)
(begin (newline)
(display "stream exhausted(while parsing header).")
(newline))
;; ok there's more stuff to read so. Find the ORIGIN string indicating the start of a DNA string
(if (equal? #f (>>boyer-moore (string->list "ORIGIN") (lazy-head line-stream)))
(begin
(loop-header (lazy-tail line-stream) `(,@info-block-A ,(lazy-head line-stream))))
;; ok found the ORIGIN string
(begin
(newline)
(display "found beginning of base pair sequence.")
(newline)
(info-block-func info-block-A)
(loop-base-pairs (lazy-tail line-stream))))))
;; well the file is always assumed to start with a header
(loop-header (lazy-line-stream port) '()))


;; simple function that just displays the base pairs
(define (simple-base-pair-displayer L)
(display L)
(newline))

;; now some turtle functions
;; simple turtle moving and turning

;
;(define base-table ;; pun intended
; '((#\a 23)
; (#\c 42)
; (#\t 128)
; (#\g 15)))
;(define angle-factor 1)
;(define step-len 5)

;(define base-table ;; pun intended
; '((#\a -3)
; (#\c -1)
; (#\t 1)
; (#\g 3)))
;(define angle-factor 60)
;(define step-len 5)

(define base-table ;; pun intended
'((#\a -2)
(#\c -1)
(#\t 1)
(#\g 2)))
(define angle-factor 60)
(define step-len 4)

;(define base-table ;; pun intended
; '((#\a 0)
; (#\c 1)
; (#\t 2)
; (#\g 3)))
;(define angle-factor 90)
;(define step-len 4)


(turtles #t)

;;simple turtle func that will draw a line
(define (basepair-drawer L)
(define (draw-loop L)
(if (not (equal? L '()))
(begin
(turn (* angle-factor (cadr (assoc (car L) base-table))))
(draw step-len))))
(draw-loop L))

(define (info-block-func info-block)
(display "GOT AN INFO BLOCK, STARTING NEW RENDERING")
;(display info-block)
(newline)
(clear))

(define (post-draw)
(display "FINISHED DRAWING")
(newline)
(sleep 5))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; main prog
(define (start)
(call-with-input-file "ref_chrX.gbk"
(lambda (port)
(process-gbk port info-block-func basepair-drawer post-draw))))
(start)

Boyer Moore Search Algorithm

I needed to find a String in a text file, so I wrote(rather hacked) a scheme imlementation of the boyer moore string search algoritm.

This is just a hack. But it is commented. What do you think?
(I decided to use this blog also as my cut-paste-source from now on.)


;; searches for string using boyer moore algorithm
(define (>>boyer-moore needle haystack)
(define needle-len (string-length needle))
(define hs-len (string-length haystack))
(define r-needle-list (reverse (string->list needle)))
;; two tables are build
;; compute the bad character shift table
;; it contains the number of chars to skip, if a character is encountered that is not the last of the search string.
;; (this table is only used after the search cursor was replaced)
(define bad-char-shift-table
(let loop
((shift 0)
(nlist r-needle-list)
(table '()))
(if (eq? nlist '())
table
(if (assv (car nlist) table)
(loop (+ 1 shift) (cdr nlist) table)
(loop (+ 1 shift) (cdr nlist) `((,(car nlist) ,shift) ,@table))))))
;; the good char table contains the number of chars to skip forwar, if a substring starting from the
;; end of a needle was matched befor a mismatch occurs
;; it contains the next possible position from the current search position where a
;; string match might end...
(define good-suffix-shift-table
;; for every reverse substring define a shift value
(let char-pattern-loop
((pattern '())
(pattern-len 0)
(nlist r-needle-list)
(table '()))
(if (eq? nlist '())
table
(char-pattern-loop
`( ,@pattern ,(car nlist))
(+ 1 pattern-len)
(cdr nlist)
`(,@table
(,pattern ,(let loop
((shift 0)
(unmatched (car nlist)))
(if (equal? (ncar pattern-len (ncdr shift r-needle-list)) (ncar (- needle-len shift) pattern))
(if (eqv? shift needle-len)
shift
(if (>= (+ shift pattern-len) needle-len)
shift
(if (eqv? (car (ncdr shift nlist)) unmatched)
;; ok nicht gefunden weiter schieben/suchen
(loop (+ 1 shift) unmatched)
shift)))
(loop (+ 1 shift) unmatched)))))))))
;; searching at a position
(define (search-needle-at index)
(letrec ((sub-hs (reverse (string->list (substring haystack index (+ needle-len index))))) ;; den kaefer erstma aufn ruecken drehen...
(first-char (car sub-hs)))
(if (eqv? first-char (car r-needle-list)) ;; first time is special
;; if the fist char matches, proceed with subpattern search
(let ((common (common-sublist sub-hs r-needle-list)))
(if (= (car common) needle-len)
0 ;; found
(cadr (assoc (cdr common) good-suffix-shift-table))))
;;if the first char did not match, look up shift in bad-char-shift table
(let ((shift (assv first-char bad-char-shift-table)))
(if (eq? shift #f)
needle-len ;; return the needle length if nothing better could be found in the bad-char jump table
(cadr shift)))))) ;; ...otherwise return the value obtained from the table

;; search mainloop
(let main-loop ((current-index 0))
(if (> (+ needle-len current-index) hs-len)
#f
(let ((minimum-chars-to-skip (search-needle-at current-index)))
(if (= 0 minimum-chars-to-skip)
current-index ;; juhu found string!
(main-loop (+ current-index minimum-chars-to-skip)))))))


(>>boyer-moore "ANPANMAN" "NNNNNAXPANPANMANANMAN")


Some utility definitions are missing from the above code, these are:


;; returns the rest of the list after removing n elements
(define (ncdr n list)
(if (eqv? n 0)
list
(if (eq? list '())
list
(ncdr (- n 1) (cdr list)))))

;; returns the fist n items of the list
(define (ncar n list)
(let loop ((result '())
(rest list)
(c n))
(if (eqv? c 0)
result
(if (eq? rest '())
result
(loop `(,(car rest) ,@result) (cdr rest) (- c 1) )))))

;; return the common begining sublist of two lists
(define (common-sublist listA listB)
(let loop
((listC '())
(restA listA)
(restB listB)
(count 0))
(if (or (eq? restA '()) (eq? restB '()))
(cons count listC)
(if (eqv? (car restA) (car restB))
(loop `(,@listC ,(car restA)) (cdr restA) (cdr restB) (+ 1 count))
(cons count listC)))))



The above code might be complete bullsh*t, I dont know I just hacked it down while reading the wikipedia article of the algorithm. I didn't bother to lookup a reference implementation...
Also it was like 4:00 am when I hacked it...(apologies accepted?)