Welcome to Matt Ridley's Blog
Matt Ridley is the author of provocative books on evolution, genetics and society. His books have sold over a million copies, been translated into thirty languages, and have won several awards.

How new words and new genes are coined

In the evolution of a language, the same principles apply to DNA as to English

My latest Mind and Matter column in the Wall Street Journal, with added links:

Don't look for the soul in the language of DNA

Back in the genomic bronze age (the 1990s), scientists used to think that there would prove to be lots of unique human genes found in no other animal. They assumed that different species would have many different genes. One of the big shocks of sequencing genomes was not just the humiliating news that human beings have the same number of genes as a mouse, but that we have the same genes, give or take a handful.

This humiliation deepened recently when David Knowles and Aoife McLysaght at Trinity College, Dublin, tracked down, at last, some uniquely human genes: just three of them. They estimate that there are, altogether, probably no more than 18 of this wholly unique kind, out of 22,568 genes in total. Over the span of our history, human beings seem to have acquired a brand-new gene only about once every third of a million years.
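The rate quoted above can be checked with back-of-the-envelope arithmetic. The sketch below takes the 18-gene estimate and the 22,568 total from the text; the figure of roughly 6 million years since the human lineage split from that of chimpanzees is an assumption added here for illustration, not a number given in the column, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the gene-birth rate quoted in the text.
UNIQUE_GENES_ESTIMATE = 18      # upper estimate quoted in the column
TOTAL_GENES = 22_568            # total gene count quoted in the column
YEARS_SINCE_SPLIT = 6_000_000   # ASSUMPTION: approximate time since the human-chimp split


def years_per_new_gene(new_genes: int, years: int) -> float:
    """Average number of years between the appearance of brand-new genes."""
    return years / new_genes


def unique_fraction(new_genes: int, total_genes: int) -> float:
    """Share of the genome's genes that are wholly unique."""
    return new_genes / total_genes


years_per_gene = years_per_new_gene(UNIQUE_GENES_ESTIMATE, YEARS_SINCE_SPLIT)
fraction = unique_fraction(UNIQUE_GENES_ESTIMATE, TOTAL_GENES)

print(f"One new gene roughly every {years_per_gene:,.0f} years")
print(f"Unique genes as a share of the total: {fraction:.4%}")
```

Under that assumed time span, 18 genes works out to about one every third of a million years, and to well under a tenth of one percent of all genes.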

The three that Drs. Knowles and McLysaght identified lie in stretches of DNA that are gobbledegook in chimpanzees, gorillas, gibbons and macaques, so the chances are they have sprung to life as protein-coding sequences in human beings uniquely. (A gene is the digital recipe for making a protein molecule.)

The functions of these three genes are not yet known (since they don't exist in mice, experiments are tricky), though one seems to be slightly more active in people with a form of leukemia. They are small and simple genes, however, and so unlikely candidates to hold the recipe for the human soul.

This might seem to leave a small hook upon which philosophers could hang the uniqueness of the human race. But we have long known that our uniqueness lies in the order and combination of our genes, not in the ingredients themselves. DNA is not only like a language; it is a language, a linear sequence of recombinable digital characters of infinite variety.

There are close parallels between DNA and a language like English. Just as evolution uses the same 22,000 genes in a different order to make a rhinoceros or a rabbit, so Shakespeare used many of the same 18,000 words in each of his plays. The 10 most common words in "Othello" and "King Lear" are the same (I, and, the, to, you, of, my, that, a, in), yet the plays are very different. The English language, like the human genome, contains very few brand new words that were invented recently from scratch.
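The comparison of most common words can be sketched with a simple word counter. This is a minimal illustration, not the method behind the figures in the column: `top_words` and `shared_top_words` are hypothetical helper names, and the two short strings are placeholders standing in for the full text of each play.

```python
import re
from collections import Counter


def top_words(text: str, n: int = 10) -> list[str]:
    """Return the n most frequent words in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word, _ in Counter(words).most_common(n)]


def shared_top_words(text_a: str, text_b: str, n: int = 10) -> set[str]:
    """Words that appear in the top-n list of both texts."""
    return set(top_words(text_a, n)) & set(top_words(text_b, n))


# Placeholder snippets (NOT lines from the plays); substitute the full texts.
text_a = "the cat and the dog and the bird"
text_b = "the king and the fool and the storm"

print(top_words(text_a, n=2))
print(shared_top_words(text_a, text_b, n=2))
```

Run on two complete plays with `n=10`, the size of the shared set shows how much of each play's most-used vocabulary overlaps, even though the order in which the words are combined differs entirely.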

Most new English words arise by different means: by borrowing from foreign languages (Schadenfreude, pajama); by recombining existing words (blogosphere, download); or by the addition of second meanings to existing words (green, mouse).

All of these habits are common in the genome, too. Lateral gene transfer brings genes from one species into another, especially among bacteria (less often among mammals like us). This is just like borrowing a foreign word. Genes recombine by fusing in whole or in part, by a process known as exon shuffling (exons are the separate stretches of code that are used to make one protein in split genes). This is just like recombining existing words. And genes often duplicate themselves and then diverge into different functions, just as old words acquire new meanings.

About 800 million years ago, a gene for a simple pigment protein enabling worm-like creatures to see was duplicated, and the daughter genes diverged to give the different proteins used in the rods and cones of the eye. About 500 million years ago, in a lamprey-like fish, the gene used in cones duplicated and diverged again to give us blue versus yellow color vision. About 30 million years ago, in a tree-climbing primate, the yellow gene duplicated and diverged again to give us green-red color vision.

Genes also die out, just as words do. When they are still recognizable but no longer in use, they are called pseudogenes. The word "theatrophone" is a forgotten linguistic pseudogene, and the word "minidisc" is becoming one. The words "trebuchet" and "cenotaph" are examples of extinct words that sprang back to life, something that pseudogenes sometimes do as well.