All that junk (DNA) inside your trunk

By Barney Wharam

It was billed to be as inspiring as the space program. The sequencing of the human genome, figuring out our DNA code letter for letter, is one of our greatest scientific achievements.

The human brain is thought to be the most complex thing in the known universe. Following that logic, scientists thought that once we sequenced our DNA, we would find that humans have an incredibly large number of genes compared to other living things. Many educated guesses were well above 100,000 genes. However, what they mainly found was trash.

A gene is a unit of information in our DNA that codes for a protein. Proteins are the physical building blocks of living things. The pigments in our eyes, the enzymes that break down our food and the keratin that strengthens our skin, hair and nails are all proteins. Your genome can be thought of as a collection of instructions or recipes, where each gene is the recipe for an individual type of protein, and your proteins make up you.

You can therefore imagine the surprise when it was found that we have only 20,000 to 25,000 genes, far fewer than expected! Of the more than three billion base pairs (letters in the DNA code) that were sequenced, only around two per cent actually code for proteins, with a further five per cent involved in regulating when those genes are used.

DNA that does not have a known function is known as “junk DNA”. Most of this junk DNA consists of repetitive sequences. These sequences often propagate themselves in a “selfish” way; they are basically parasites of the genome that copy themselves, rarely benefitting their host.

There is a theory that repetitive DNA sequences can play important roles in the body’s processes. A recent project called ENCODE estimated that 80 per cent of all human DNA plays at least some role in biological processes, implying that most of our DNA isn’t junk after all.

However, different species carry massively varying amounts of “junk DNA”. The highly poisonous but apparently delicious pufferfish has the tiniest of all known vertebrate genomes, whereas the lungfish has a genome that is 350 times bigger, at a whopping 133 billion base pairs. Despite what the ENCODE project concluded, the fact that complex organisms such as pufferfish and lungfish get by with such wildly different amounts of junk DNA suggests that most of it probably is junk.

The cost of sequencing a human genome has dropped from around $3 billion for the initial project to below $1,000 per genome. It is very likely that you will one day have your own genome sequenced. It is also pretty certain that most of it will be junk.

Image: ‘DNA representation’ by Andy Leppard under Creative Commons license