DNA for functional programmers


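module DNA where

-- This file defines its own replicate further down, so hide the Prelude's.
import Prelude hiding (replicate)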

DNA consists of two long polymers of simple units called nucleotides. In living organisms, DNA does not usually exist as a single molecule but as a pair of molecules held tightly together. These two long strands entwine like vines, in the shape of a double helix.

type DNA = [(Nucleotide, Nucleotide)]
type Strand = [Nucleotide]

There are four types of molecules called nucleobases (informally, bases). The four bases found in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T).

data Base = A | T | C | G deriving (Show, Eq)

A nucleobase linked to a sugar is called a nucleoside, and a base linked to a sugar and one or more phosphate groups is called a nucleotide.

I’ll ignore the distinction between nucleotides and bases in this file.

type Nucleotide = Base

In a DNA double helix, each type of nucleobase on one strand normally interacts with just one type of nucleobase on the other strand. This is called complementary base pairing. Here, purines form hydrogen bonds to pyrimidines, with A bonding only to T, and C bonding only to G.

pair :: Base -> Base
pair A = T
pair T = A
pair C = G
pair G = C
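
One property worth noticing: pairing a base twice gives back the original base, so pair is its own inverse. The sketch below repeats the definitions above so it can be loaded on its own; complement and pairIsInvolution are names I made up for illustration and are not part of the module.

```haskell
data Base = A | T | C | G deriving (Show, Eq)

pair :: Base -> Base
pair A = T
pair T = A
pair C = G
pair G = C

-- Hypothetical helper (not in the original): complement a whole strand.
complement :: [Base] -> [Base]
complement = map pair

-- Hypothetical check: pair undoes itself on every base.
pairIsInvolution :: Bool
pairIsInvolution = all (\b -> pair (pair b) == b) [A, T, C, G]
```

In GHCi, complement [A, C, G, T] evaluates to [T,G,C,A], and pairIsInvolution is True.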

DNA replication

The double helix is unwound and each strand acts as a template for a new partner strand.

unwind :: DNA -> (Strand, Strand)
unwind = unzip

Bases are matched to synthesize the new partner strands.

synthesize :: Strand -> DNA
synthesize strand =
  let partnerStrand = map pair strand
  in  zip strand partnerStrand
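
A small worked example may help here. The sketch below repeats the definitions so it stands alone; wellPaired and example are hypothetical names added only for illustration. It builds a double helix from one strand and then unwinds it again.

```haskell
data Base = A | T | C | G deriving (Show, Eq)

type Strand = [Base]
type DNA = [(Base, Base)]

pair :: Base -> Base
pair A = T
pair T = A
pair C = G
pair G = C

unwind :: DNA -> (Strand, Strand)
unwind = unzip

synthesize :: Strand -> DNA
synthesize strand = zip strand (map pair strand)

-- Hypothetical check: every position holds a complementary pair.
wellPaired :: DNA -> Bool
wellPaired = all (\(x, y) -> pair x == y)

-- Hypothetical sample helix built from a single template strand.
example :: DNA
example = synthesize [G, A, T, T, A, C, A]
```

Here unwind example gives back the template as its first component and the complement [C,T,A,A,T,G,T] as its second, and wellPaired example is True.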

In more detail: the double helix is unwound by a helicase and a topoisomerase. Next, one DNA polymerase produces the leading strand copy. Another DNA polymerase binds to the lagging strand and makes discontinuous segments (called Okazaki fragments), which DNA ligase then joins together.

Read: helicase and topoisomerase do the unzip. DNA polymerase is a (parallel) map. DNA ligase zips the pieces together again.
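
If you want to model the Okazaki fragments explicitly, one possible reading (my assumption, not something the module does) is that the lagging strand is cut into chunks, polymerase maps over each chunk, and ligase concatenates the results. The names fragments and laggingCopy are hypothetical, and the sketch ignores strand direction entirely.

```haskell
data Base = A | T | C | G deriving (Show, Eq)

pair :: Base -> Base
pair A = T
pair T = A
pair C = G
pair G = C

-- Hypothetical: split a strand into fragments of at most n bases.
-- A non-positive n yields the whole strand as a single fragment.
fragments :: Int -> [Base] -> [[Base]]
fragments _ [] = []
fragments n xs
  | n <= 0    = [xs]
  | otherwise = let (chunk, rest) = splitAt n xs
                in  chunk : fragments n rest

-- Hypothetical: polymerase copies each fragment, ligase (concat) joins them.
laggingCopy :: Int -> [Base] -> [Base]
laggingCopy n = concatMap (map pair) . fragments n
```

For any fragment size, laggingCopy n strand equals map pair strand, which is why the simpler model can skip the fragments altogether.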

replicate :: DNA -> (DNA, DNA)
replicate dna = let (s1,s2) = unwind dna
                in  (synthesize s1, synthesize s2)
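
To see replication at work, here is a self-contained sketch. It hides the Prelude's replicate so the name can be used unqualified; original, copy1 and copy2 are hypothetical names introduced only for this example.

```haskell
import Prelude hiding (replicate)

data Base = A | T | C | G deriving (Show, Eq)

type Strand = [Base]
type DNA = [(Base, Base)]

pair :: Base -> Base
pair A = T
pair T = A
pair C = G
pair G = C

unwind :: DNA -> (Strand, Strand)
unwind = unzip

synthesize :: Strand -> DNA
synthesize strand = zip strand (map pair strand)

replicate :: DNA -> (DNA, DNA)
replicate dna = let (s1, s2) = unwind dna
                in  (synthesize s1, synthesize s2)

-- Hypothetical sample helix: [(A,T),(C,G),(G,C)]
original :: DNA
original = synthesize [A, C, G]

-- Each copy keeps one old strand and gains one new strand.
copy1, copy2 :: DNA
(copy1, copy2) = replicate original
```

In this model copy1 is identical to original, while copy2 holds the same pairs with the two strands swapped, because it was templated from the second strand: [(T,A),(G,C),(C,G)].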


DNA replication is unzip, map, and zip.