COSMIC ANCESTRY | Quick Guide | Site Search | Next | by Brig Klyce | All Rights Reserved

Three New Human Genes
and De Novo Genes in General and What'sNEW

Entirely novel human-specific protein-coding genes originating from ancestrally noncoding sequences have been reported by two geneticists at the University of Dublin (1). Analyzing available data, they identified genes that are expressed in the human species but not in chimps. They then looked for similar sequences in other primates, finding three. The chimp and macaque (unexpressed) sequences are nearly identical to the human one, but are interrupted by frameshifting insertions and stop codons.

Although several lines of evidence show that the three human genes are expressed, their functions are not definitively characterized. However, one of them, chronic lymphocytic leukemia upregulated gene 1 (CLLU1), appears to have a role in that human disease. Its sequence in humans, compared to the matching one in chimps and macaques, is illustrated below.

[Figure: gene sequences]
"Multiple sequence alignment of the gene sequence of the human gene CLLU1 and similar nucleotide sequences from the syntenic location in chimp and macaque. The start codon is located immediately following the first alignment gap, which was inserted for clarity. Stop codons are indicated by red boxes. The sequenced peptide identified from this locus is indicated in orange. The critical mutation that allows the production of a protein is the deletion of an A nucleotide, which is present in both chimp and macaque (indicated by an arrow). This causes a frameshift in human that results in a much longer ORF capable of producing a 121-amino acids-long protein. Both the chimp and macaque sequences have a stop codon after only 42 potential codons." © Genome Research 2009

CLLU1 is also disabled by a matching point insertion in the gorilla and gibbon, but not orangutan, genomes. The geneticists reason, "If the ancestral primate sequence was coding, then we would need to infer that an identical 1-bp insertion occurred in four lineages independently, whereas if we infer the presence of the disabler in the ancestral sequence, then we must infer two independent 1-bp deletions. The inference that the ancestral sequence was noncoding is a more parsimonious explanation of the data, even without considering that the parallel insertion of a specific base into an identical location is probably less likely than the parallel deletion of one base. ...We hypothesize that these genes have originated de novo in the human lineage, since the divergence with chimp from ancestrally noncoding sequence."
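
The event counting in this argument can be sketched in a few lines of Python. This is only an illustration, not the geneticists' method: the names `has_disabler` and `events_needed` are hypothetical, the presence or absence of the extra nucleotide in each lineage is taken from the text above, and each lineage is treated as independent, ignoring the shape of the primate tree.

```python
# Presence of the extra 1-bp "disabler" in each lineage, as stated in the text.
has_disabler = {
    "human": False,
    "chimp": True,
    "gorilla": True,
    "orangutan": False,
    "gibbon": True,
    "macaque": True,
}

def events_needed(ancestor_has_disabler, states):
    """Count lineages whose state differs from the assumed ancestral state.

    Simplifying assumption: every differing lineage needs one independent
    insertion or deletion; shared ancestry among lineages is ignored.
    """
    return sum(1 for present in states.values()
               if present != ancestor_has_disabler)

# Hypothesis A: the ancestral sequence was coding (no disabler),
# so every lineage that has the disabler gained it by insertion.
insertions_if_ancestor_coding = events_needed(False, has_disabler)

# Hypothesis B: the ancestral sequence was noncoding (disabler present),
# so every lineage that lacks the disabler lost it by deletion.
deletions_if_ancestor_noncoding = events_needed(True, has_disabler)

print(insertions_if_ancestor_coding, deletions_if_ancestor_noncoding)
```

Under these assumptions the count is four insertions against two deletions, the same comparison the geneticists draw.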

Consider the human nucleotide sequence designated CLLU1, 121 codons in length. A codon, three nucleotides, may encode any of 20 amino acids, or a stop. (But this sequence is a gene, an open reading frame with no stops.)

Assume that the protein encoded by this nucleotide sequence needs ~25%, or 30, of its codons exactly right. In other words, only 1 of the 21 possible codon meanings (20 amino acids or a stop) is acceptable at each of those 30 positions. The chance that 30 random codons will match this sequence in one trial can be estimated as

(1/21)^30 = ~10^-40

Assume that the remaining 91 codons in this sequence may vary widely, encoding any of 10 of life's 20 amino acids, but no stops. In other words, 10 of the 21 possible codon meanings are acceptable at each of those 91 positions. The chance that 91 random codons will satisfy these criteria in one trial is approximately

(10/21)^91 = ~10^-30

Combining these assumptions, the chance that a given sequence of 121 random codons will constitute a working version of this gene is on the order of

10^(-40-30) = 10^-70
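
The arithmetic above can be checked with a short Python sketch. It uses only the assumptions already stated (21 equally likely meanings per codon, 30 positions that must be exact, 91 positions accepting 10 of the 21 meanings); the function name `log10_chance` is hypothetical. Working in base-10 logarithms keeps the exponents readable.

```python
import math

CODON_MEANINGS = 21  # 20 amino acids plus stop, as assumed in the text

def log10_chance(allowed, positions, total=CODON_MEANINGS):
    """Return log10 of (allowed / total) ** positions."""
    return positions * math.log10(allowed / total)

# 30 positions where only 1 of the 21 meanings is acceptable.
exact = log10_chance(1, 30)

# 91 positions where 10 of the 21 meanings are acceptable.
loose = log10_chance(10, 91)

# Independent requirements multiply, so their logarithms add.
combined = exact + loose

print(f"exact: 10^{exact:.1f}")
print(f"loose: 10^{loose:.1f}")
print(f"combined: 10^{combined:.1f}")
```

Unrounded, the three exponents come to about -39.7, -29.3 and -69.0, which the estimate above rounds to -40, -30 and -70.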

(This method copies Chandra Wickramasinghe's in The Legacy of Fred Hoyle, reviewed 2005.)

The claim that these genes "originated de novo in the human lineage" is baffling. Sequences virtually identical to them already exist in species considered ancestral to humans, and even in mice. To say that the genes "were activated" is a more accurate description.

And the geneticists' use of math is interesting: At one location, two deletions are more likely than four identical insertions, so the ancestral sequence must have been noncoding? Why not ask how likely it is that a sequence of 121 codons, apparently unaffected by natural selection, happens to encode a useful protein? By our analysis (see the boxed calculation above), it is forbiddingly unlikely, even if this relatively small protein could vary widely. It's the monkeys writing Shakespeare again.

The darwinian explanation of this phenomenon is unclear to us. But the discovery neatly aligns with cosmic ancestry, which predicts:

  • ...Genes precede the phenotypic expression of themselves (2).
  • If a new genetic program arrives by the strong panspermia process, intervening (ancestral) species should possess either nearly identical versions of it ...or nothing similar... (3).
  • ...At least some of the silent DNA is for future use (4).
  • Point mutations and other simple mechanisms can switch existing programs off and on (5).
  • ...This process would ...depend on sophisticated software management that can recognize an installed program (6).
  • New genetic programs will be continually offered for testing (7).

What'sNEW since Jul 2009

"Orphans and new gene origination, a structural and evolutionary perspective," by Sara Light, Walter Basile and Arne Elofsson, doi:10.1016/j.sbi.2014.05.006,
v 26 Current Opinion in Structural Biology, online 13 Jun 2014. ...at some time in history the first protein coding sequence within a protein family must have been created from non-coding genetic material [if evolution is strictly darwinian.]
The Evolution of Venom by Co-option of Single-Copy Genes by Ellen O. Martinson, Mrinalini, et al., doi:10.1016/j.cub.2017.05.032, Current Biology, 10 Jul 2017. We propose that co-option of single-copy genes may be a common but relatively understudied mechanism of evolution for new gene functions, particularly under conditions of rapid evolutionary change.
22 May 2017: It has become clear that protein-coding genes can originate de novo from non-coding sequences.
Aoife McLysaght and Daniele Guerzoni, "New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation" [pdf], doi:10.1098/rstb.2014.0332, Philosophical Transactions B, 2015. ...It has now become clear that de novo origin of protein-coding genes from non-coding DNA is a consistent feature of eukaryotic genomes....
21 Aug 2016: Can antagonistic evolution compose de novo genes?
4 Sep 2015: ...Thousands of transcripts ...which are likely to have originated de novo.... NEW 4 Jan 2016.
8 Oct 2014: 24 hominoid-specific de novo protein-coding genes were identified.
Daniele Guerzoni and Aoife McLysaght, "De Novo Origins of Human Genes" [html], doi:10.1371/journal.pgen.1002381, 7(11): e1002381, PLoS Genet, 10 Nov 2011.
Diethard Tautz & Tomislav Domazet-Lošo, "The evolutionary origin of orphan genes" [html], doi:10.1038/nrg3053, p 692-702 v 12, Nature Reviews Genetics, Oct 2011. "...de novo evolution ...appears to provide raw material continuously for the evolution of new gene functions...."
24 Jan 2014: The earliest steps in de novo gene origination remain mysterious.
Bétermier M, Bertrand P, Lopez BS, "Is Non-Homologous End-Joining Really an Inherently Error-Prone Process?" [html], doi:10.1371/journal.pgen.1004086, 10(1): e1004086, PLoS Genet, 16 Jan 2014. "...Recent data have pointed to the intrinsic precision of NHEJ."
Dong-Dong Wu and Ya-Ping Zhang, "Evolution and Function of De Novo Originated Genes" [abstract], Molecular Phylogenetics and Evolution, online 27 Feb 2013. Wu and Zhang think sequences evolve as proteins, but in most cases, the nucleotides are already ordered before the first translation.
25 Jan 2013: Many of our genes have no obvious relatives or evolutionary history. So where did they come from?
23 Oct 2012: Evolution by subfunctionalization
A-R Carvunis et al., "Proto-genes and de novo gene birth," doi:10.1038/nature11184, Nature, online 14 Jun 2012.
19 Nov 2011: Where do new genes come from?
25 Oct 2010: Genes are either very old, or they appear suddenly, without predecessors.
13 Sep 2010: Origins, evolution, and phenotypic impact of new genes
Joe Hannon comments, 14 Dec 2009.
Tobias J.A.J. Heinen, Fabian Staubach et al., "Emergence of a New Gene from an Intergenic Region" [abstract], doi:10.1016/j.cub.2009.07.049, p1527-1531 v 19, Current Biology, 29 Sep 2009.
21 Sep 2009: Nearly half of the human genome is derived from transposable elements (TEs).
17 Sep 2009: The gain and loss of exons has contributed to the evolution of new features.
23 Jul 2009: Primate-specific genes were inserted de novo, not generated by gradual divergence from non-primate genes.

References

1. David G. Knowles and Aoife McLysaght, "Recent de novo origin of human protein-coding genes" [abstract], doi:10.1101/gr.095026.109, Genome Research, online 2 Sep 2009.
     Discovery of novel genes..., by EurekAlert!, 1 Sep 2009.
     Genes That Make Us Human, by Elizabeth Pennisi, ScienceNOW, 1 Sep 2009.
     Three human genes evolved from junk, by Michael Le Page, NewScientist, 3 Sep 2009.
     Which Genes Make Us Human? by Alan Boyle, MSNBC, 3 Sep 2009.
2. Metazoan Genes Older Than Metazoa?, 25 Oct 1996.
3. New genetic programs in Darwinism and strong panspermia, 7 Apr 2002.
4. Why Sexual Reproduction?, first posted May 1996.
5. Testing Darwinism versus Cosmic Ancestry, 24 Nov 2002.
6. Duplication Makes A New Primate Gene, 21 Feb 2005.
7. How is it Possible?, first posted May 1996.