Saturday, February 16, 2008

Visualisation of DNA

Now that the wonderful Rainbow DNA project has attracted some attention, I have decided to join the club! Here is a very simplistic and far more useless way of visualising DNA: turtle graphics.
I really cannot put up a website with complete renderings of the DNA using turtle graphics, but I uploaded two sample images to my Flickr account. I wanted to find out whether turtle graphics could reveal different patterns than those perceivable in a color plot of base pairs.

Attached is the source code so you can see what the program does. You can also reuse the part that processes the contents of "gbk" files containing genome data.

The code is just a hack done at night while I was waiting in a Starbucks for a friend to pick me up, so if you think this code is a mess: you earned your degree. I was just curious after I found out about the Rainbow DNA project.

Well, the idea of the program is very simple:
1. Initialise the turtle in the center of the screen
2. Read the next base and look up its turtle rotation
3. Rotate the turtle
4. Draw 5 pixels
5. Go to step 2 until the gene is finished
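
The steps above can be sketched in a few lines of Python (not the Scheme of the attached program). This is only an illustration under some assumptions: the rotation table is the first rule set quoted further down in this post, positive angles are taken to mean counter-clockwise turns, and the names turtle_path, ROTATIONS and STEP are made up for this example. Instead of drawing, it returns the points the turtle would visit.

```python
import math

# Rotation per base in degrees (first rule set quoted in this post).
ROTATIONS = {"a": -180, "c": -60, "t": 60, "g": 180}
STEP = 5  # pixels drawn after each turn, as in step 4


def turtle_path(bases, rotations=ROTATIONS, step=STEP):
    """Return the list of (x, y) points the turtle visits, starting at (0, 0)."""
    x, y, heading = 0.0, 0.0, 0.0  # heading in degrees, 0 = along +x (assumption)
    points = [(x, y)]
    for base in bases:
        if base not in rotations:
            continue  # skip anything that is not a, c, g or t
        # steps 2 and 3: look up the rotation and turn
        heading = (heading + rotations[base]) % 360
        # step 4: move forward by one step in the new direction
        x += step * math.cos(math.radians(heading))
        y += step * math.sin(math.radians(heading))
        points.append((x, y))
    return points


path = turtle_path("acgt")
print(len(path))  # one starting point plus one point per base
```

The returned points could be fed to any plotting or turtle library; keeping the geometry separate from the drawing makes it easy to try other rotation tables.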

So here are the results: not surprisingly, very unspectacular. If you want a good representation of the contents of the human genome, well ... look into a mirror. All other reps. just look ridiculous in comparison.

I think it's funny that one could argue that DNA is something like a multi-quine: not only does it contain the code that creates organisms which reproduce it through regular biological reproduction, it also encodes a brain with the ability to create turtle renderings of itself...

The following image was rendered with these rules: whenever an "a" (of a, c, t, g) is encountered, turn the turtle by -180 degrees; on "c" by -60 degrees, on "t" by 60 degrees and on "g" by 180 degrees.

[Image: turtle dna rendering heyll-mode]

The next rendering uses these rules: an "a" turns the turtle by 23 degrees, a "c" by 42 degrees, a "t" by 128 degrees and a "g" by 15 degrees.

[Image: turtle dna rendering boese-mode]
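
One way to see why the two rule sets give such different pictures is to count how many distinct directions the turtle can ever face. The Python sketch below does that for both tables. It assumes the first rule set belongs to the "heyll-mode" image and the second to the "boese-mode" image (the order they appear in this post); the function name distinct_headings and the sample input are made up for illustration and are not real genome data.

```python
HEYLL = {"a": -180, "c": -60, "t": 60, "g": 180}
BOESE = {"a": 23, "c": 42, "t": 128, "g": 15}


def distinct_headings(bases, rotations):
    """Return the set of headings (whole degrees, 0-359) the turtle takes on."""
    heading = 0
    seen = set()
    for base in bases:
        heading = (heading + rotations[base]) % 360
        seen.add(heading)
    return seen


sample = "acgt" * 50  # made-up input, not real genome data
print(len(distinct_headings(sample, HEYLL)))
print(len(distinct_headings(sample, BOESE)))
```

With the first table every rotation is a multiple of 60 degrees, so the turtle can never face more than six directions and the drawing stays on a triangular grid. Note also that -180 and 180 leave the turtle facing the same way, so "a" and "g" are indistinguishable there. The second table has no such common divisor, so far more headings show up.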

OK, now here is the code.
You might want to get the gbk files. Have a look at my delicious account; I have stored a link there to an FTP server where a gbk file for every chromosome of the human genome can be found.

For my experiments I used parts of the X chromosome.
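
If you only want the bases out of a gbk file without running the Scheme program below, here is a rough Python sketch of the same idea: skip header lines until one contains ORIGIN, then keep only the characters a, c, g and t from each line until a line starting with "//" ends the record. The function name read_sequences and the tiny sample record are invented for this example; real files are far larger and carry much more header information.

```python
import io

BASES = set("acgt")


def read_sequences(lines):
    """Yield one string of bases per ORIGIN ... // block."""
    in_sequence = False
    chunks = []
    for line in lines:
        if not in_sequence:
            # still in the header: wait for the ORIGIN marker
            if "ORIGIN" in line:
                in_sequence = True
                chunks = []
        elif line.startswith("//"):
            # end of record: hand back the collected bases
            yield "".join(chunks)
            in_sequence = False
        else:
            # drop position numbers and spaces, keep only the bases
            chunks.append("".join(c for c in line if c in BASES))


# Tiny made-up record, only to show the shape of the input.
sample = io.StringIO(
    "LOCUS       EXAMPLE\n"
    "ORIGIN\n"
    "        1 acgtacgtac gtacgt\n"
    "       17 ttga\n"
    "//\n"
)
print(list(read_sequences(sample)))
```

Because it is a generator, the sketch works line by line and never holds more than one record in memory, which matters for chromosome-sized files.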


(require (lib "turtles.ss" "graphics"))


(load "boyer-moore.scm")
(load "lazy-streams.scm")
(load "list-utils.scm")

;; read a line
(define LS #\newline)

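;; Read characters up to the next newline and return them as a list.
;; At end of file the eof object is kept as the last element, so a
;; list whose first element is eof means there is nothing left to read.
(define (read-line port)
  (let loop ((line '())
             (c (read-char port)))
    (if (eof-object? c)
        (reverse (cons c line))
        (if (eqv? c LS)
            (reverse line)
            (loop (cons c line) (read-char port))))))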

(define (lazy-line-stream port)
  (define current-line (read-line port))
  (define result-stream
    (cons-lazy-stream current-line (lazy-line-stream port)))
  ;; a blank line is '(), so check for a pair before taking the car
  (if (and (pair? current-line) (eof-object? (car current-line)))
      the-empty-lazy-stream
      result-stream))

(define (filter-bases char-list)
  (filter
   (lambda (c) (or (eqv? c #\a) (eqv? c #\c) (eqv? c #\g) (eqv? c #\t)))
   char-list))

(define (process-gbk port info-block-func base-pair-func post-draw)
  ;; #t if the line starts with "//", the end-of-record marker;
  ;; the pair? checks keep blank or one-character lines from raising an error
  (define (end-of-record? line)
    (and (pair? line)
         (pair? (cdr line))
         (eqv? (car line) #\/)
         (eqv? (cadr line) #\/)))
  (define (loop-base-pairs line-stream)
    (if (empty-lazy-stream? line-stream)
        (display "stream exhausted(while basepair parsing).")
        (if (end-of-record? (lazy-head line-stream))
            (begin
              (post-draw)
              (loop-header (lazy-tail line-stream) '()))
            (begin
              (base-pair-func (filter-bases (lazy-head line-stream)))
              (loop-base-pairs (lazy-tail line-stream))))))
  (define (loop-header line-stream info-block-A)
    (if (empty-lazy-stream? line-stream)
        (begin
          (newline)
          (display "stream exhausted(while parsing header).")
          (newline))
        ;; there is more to read: look for the ORIGIN string that
        ;; marks the start of a DNA sequence
        (if (equal? #f (>>boyer-moore (string->list "ORIGIN") (lazy-head line-stream)))
            ;; not found yet: remember the header line and keep looking
            (loop-header (lazy-tail line-stream)
                         `(,@info-block-A ,(lazy-head line-stream)))
            ;; found the ORIGIN string
            (begin
              (newline)
              (display "found beginning of base pair sequence.")
              (newline)
              (info-block-func info-block-A)
              (loop-base-pairs (lazy-tail line-stream))))))
  ;; the file is always assumed to start with a header
  (loop-header (lazy-line-stream port) '()))


;; simple function that just displays the base pairs
(define (simple-base-pair-displayer L)
  (display L)
  (newline))

;; now some turtle functions
;; simple turtle moving and turning

;
;(define base-table ;; pun intended
; '((#\a 23)
; (#\c 42)
; (#\t 128)
; (#\g 15)))
;(define angle-factor 1)
;(define step-len 5)

;(define base-table ;; pun intended
; '((#\a -3)
; (#\c -1)
; (#\t 1)
; (#\g 3)))
;(define angle-factor 60)
;(define step-len 5)

(define base-table ;; pun intended
  '((#\a -2)
    (#\c -1)
    (#\t 1)
    (#\g 2)))
(define angle-factor 60)
(define step-len 4)

;(define base-table ;; pun intended
; '((#\a 0)
; (#\c 1)
; (#\t 2)
; (#\g 3)))
;(define angle-factor 90)
;(define step-len 4)


(turtles #t)

;; simple turtle func that turns and draws one step for every base in L
(define (basepair-drawer L)
  (define (draw-loop L)
    (if (not (null? L))
        (begin
          (turn (* angle-factor (cadr (assoc (car L) base-table))))
          (draw step-len)
          ;; continue with the remaining bases of this line
          (draw-loop (cdr L)))))
  (draw-loop L))

(define (info-block-func info-block)
  (display "GOT AN INFO BLOCK, STARTING NEW RENDERING")
  ;(display info-block)
  (newline)
  (clear))

(define (post-draw)
  (display "FINISHED DRAWING")
  (newline)
  (sleep 5))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; main prog
(define (start)
  (call-with-input-file "ref_chrX.gbk"
    (lambda (port)
      (process-gbk port info-block-func basepair-drawer post-draw))))
(start)
