Friday, June 15, 2007

Introns transcribed to RNA inside cell nuclei

The last issue of New Scientist contains an article about the discovery that only roughly one half of DNA expresses itself as amino acid sequences. The article is published in Nature (thanks to Doug for the link). The Encyclopedia of DNA Elements (ENCODE) project has quantified RNA transcription patterns and found that while the "standard" RNA copy of a gene gets translated into a protein as expected, for each copy of a gene cells also make RNA copies of many other sections of DNA. In particular, intron portions ("junk DNA", whose share increases as one climbs up the evolutionary hierarchy) are transcribed to RNA in large amounts. What is also interesting is that the RNA fragments correspond to pieces from several genes, which raises the question whether there is some fundamental unit smaller than the gene.

None of the extra RNA fragments gets translated into proteins, so the race is on to discover just what their function is. The TGD proposal is that this RNA gets braided and performs a lot of topological quantum computation (see this). Topologically quantum computing RNA fits nicely with replicating number theoretic braids associated with light-like orbits of partonic 2-surfaces, and with their spatial "printed text" representations as linked and knotted partonic 2-surfaces giving braids as a special case (see this). An interesting question is how printing and reading could take place. Is it something comparable to what occurs when we read consciously? Is the biological portion of our conscious life identifiable with this reading process, accompanied by copying through cell replication and by secondary printing using amino acid sequences?

This picture conforms with the TGD view about pre-biotic evolution. Plasmoids [1], which are known to share many basic characteristics assigned with life, came first: high temperatures are not a problem in the TGD Universe since a given frequency corresponds to an energy above thermal energy for a large enough value of hbar. Plasmoids were followed by RNA, and DNA and amino acid sequences emerged only after the 1- and 2-letter codes fused to the present 3-letter code. The cross-like structure of tRNA molecules carries clear signatures supporting this vision. RNA would still be responsible for roughly half of intracellular life and perhaps for the core of "intelligent life".

I have also proposed that this expression uses a memetic code which would correspond to the Mersenne number M_127 = 2^127 - 1 with 2^126 codons, whereas the ordinary genetic code would correspond to M_7 = 2^7 - 1 with 2^6 codons. Memetic codons in DNA representations would consist of sequences of 21 ordinary codons. Also representations in terms of field patterns with a duration of 0.1 seconds (the secondary p-adic time scale associated with M_127, defining a fundamental biorhythm) can be considered.
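The figure of 21 ordinary codons per memetic codon follows from simple counting, which the following Python sketch checks. It assumes that a code with 2^k codons carries k bits per codon; the constant and function names are hypothetical and only serve the illustration.

```python
# Sketch only: assumes a code with 2**k codons carries k bits per codon.
GENETIC_BITS = 6     # M_7 = 2**7 - 1, with 2**6 = 64 genetic codons
MEMETIC_BITS = 126   # M_127 = 2**127 - 1, with 2**126 memetic codons
BASES_PER_CODON = 3  # one genetic codon is three bases


def codons_per_memetic_codon(memetic_bits=MEMETIC_BITS, genetic_bits=GENETIC_BITS):
    """Number of genetic codons needed to carry one memetic codon."""
    quotient, remainder = divmod(memetic_bits, genetic_bits)
    if remainder:
        raise ValueError("memetic codon is not a whole number of genetic codons")
    return quotient


n_codons = codons_per_memetic_codon()
base_pairs = BASES_PER_CODON * n_codons

# A string of n_codons genetic codons has exactly as many values
# as there are memetic codons: (2**6)**21 == 2**126.
assert (2**GENETIC_BITS) ** n_codons == 2**MEMETIC_BITS

print(n_codons, base_pairs)  # prints: 21 63
```

One memetic codon would thus span 63 base pairs of DNA, the number used in the estimates that follow.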

A hypothesis worth trying to kill would be that the DNA coding for RNA has memetic codons scattered around the genome as basic units. It is interesting to see whether the structure of DNA could give any hints that the memetic codon appears as a basic unit.

  1. In a "relaxed" double-helical segment of DNA, the two strands twist around the helical axis once every 10.4 base pairs of sequence. 21 genetic codons correspond to 63 base pairs, whereas 6 full twists would correspond to 62.4 base pairs.

  2. Nucleosomes are the fundamental repeating units in eukaryotic chromatin, which possesses what is known as the 10 nm beads-on-a-string structure. They repeat roughly every 200 base pairs: an integer number of genetic codons would suggest 201 base pairs. 3 memetic codons make 189 base pairs. Could this mean that only a fraction p ≈ 12/201, which happens to be of the same order of magnitude as the portion of introns in the human genome, consists of ordinary codons? Inside nucleosomes the distance between neighboring contacts between histone and DNA is about 10 nm, the p-adic length scale L(151) associated with the Gaussian Mersenne (1+i)^151 - 1, which also characterizes the cell membrane thickness and the size of nucleosomes. This length corresponds to 10 codons, so that there would be two contacts per single memetic codon in a reasonable approximation. In the example of Wikipedia the nucleosome corresponds to about 146 = 126 + 20 base pairs: 147 base pairs would make 2 memetic codons and 7 genetic codons.

    The remaining 54 base pairs between histone units + 3 ordinary codons from the histone unit would make a single memetic codon. That only a single memetic codon lies between histone units, and that part of this memetic codon overlaps with the histone-containing unit, conforms with the finding that chromatin accessibility and histone modification patterns are highly predictive of both the presence and activity of transcription start sites. This would leave 4 genetic codons, and the 201 base pairs could decompose as memetic codon + 2 genetic codons + memetic codon + 2 genetic codons. The simplest possibility is however that memetic codons are between histone units and histone units consist of genetic codons. Note that memetic codons could be transcribed without the straightening of the histone unit that occurs during the transcription leading to protein coding. Note also that the prokaryote genome lacks the histone units, so that the transition from prokaryotes to eukaryotes would mean the emergence of the memetic code.
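The base-pair bookkeeping in the two items above can be verified with a few lines of Python. This is only an arithmetic check of the numbers quoted in the text, not a model of chromatin; the repeat length of 201 and the histone unit of 147 base pairs are the rounded values chosen above, and all variable names are hypothetical.

```python
# Arithmetic check of the base-pair counts quoted above (sketch only).
BP_PER_CODON = 3
BP_PER_MEMETIC = 21 * BP_PER_CODON   # 63 base pairs per memetic codon
BP_PER_TWIST = 10.4                  # relaxed double helix

# Item 1: six full twists versus one memetic codon.
six_twists = round(6 * BP_PER_TWIST, 1)      # 62.4, close to 63

# Item 2: nucleosome repeat of about 200 bp, rounded to a multiple of 3.
repeat = 201
leftover = repeat - 3 * BP_PER_MEMETIC       # 201 - 189 = 12
fraction_ordinary = leftover / repeat        # roughly 0.06

# Histone unit of 147 bp = 2 memetic codons + 7 genetic codons.
histone_unit = 2 * BP_PER_MEMETIC + 7 * BP_PER_CODON

# Linker between histone units plus 3 codons borrowed from the histone unit.
linker = repeat - histone_unit               # 54
linker_plus_three_codons = linker + 3 * BP_PER_CODON

# What remains of the histone unit after lending 3 codons:
# 2 memetic codons + 4 genetic codons.
remaining = histone_unit - 3 * BP_PER_CODON

print(six_twists, leftover, round(fraction_ordinary, 3))
print(histone_unit, linker, linker_plus_three_codons, remaining)
```

The check confirms that linker plus three codons gives exactly one memetic codon (63 base pairs) and that the rest of the histone unit accommodates two memetic codons and four genetic codons.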

[1] E. Lozneanu and M. Sanduloviciu (2003), Minimal-cell system created in laboratory by self-organization, Chaos, Solitons and Fractals, Volume 18, Issue 2, October, p. 335. See also Plasma blobs hint at new form of life, New Scientist vol. 179 issue 2413 - 20 September 2003, page 16.

For background see the chapter Topological Quantum Computation in TGD Universe of "TGD as a Generalized Number Theory" and the chapter Pre-biotic Evolution in Many-Sheeted Space-Time of "Genes and Memes".

2 comments:

Doug said...

Hi Matti,
1 - Nature, v447, n7149, p799 is an article on the Encode Project for the human genome.
An Editor's Summary is at
http://www.nature.com/nature/journal/v447/n7146/edsumm/e070614-01.html

2 - MicroRNA at Wiki has additional information.

Matti Pitkänen said...

Thank you for the links.

Matti