Tuesday, November 29, 2005

Yes! Genetic code can be understood number theoretically!

For the last three weeks I have been working (or rather fighting) with a number theoretical model of the genetic code, stimulated by an accidental observation: the number of primes smaller than 64, the number of DNA codons, is 18, and together with 0 and 1 this makes 20, the number of amino acids!
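The counting behind this observation is easy to check. The following is only a small sketch; the function name primes_below and the variable labels are introduced here for illustration.

```python
def primes_below(limit):
    """Return all primes p with p < limit, using a simple sieve."""
    is_prime = [True] * limit
    is_prime[0:2] = [False, False]  # 0 and 1 are not primes
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


primes = primes_below(64)   # 64 = number of DNA codons
labels = [0, 1] + primes    # candidate labels for the 20 amino acids
print(len(primes), len(labels))  # prints: 18 20
```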

This led to an intense period of work involving a lot of modular arithmetic and painful MATLAB computations. Things are not made easier by the fact that I have to develop program modules on my home PC and run them on a University computer. It is tragic that the Physics Department of Helsinki University is so poor that it cannot provide me with the number theoretically advanced Mathematica, or even with a MATLAB working on my personal PC. To say nothing of some financial help. Hence I am forced to work as an unemployed person on minimal unemployment money. The far-sighted and wise decision makers of Helsinki University must cry with pain and shame when they cannot do anything to help me. The people working in the universities of rich countries such as India probably cannot realize how difficult the situation of scientists in underdeveloped countries like Finland is.

I have already described the basic ideas behind the number theoretical model of the genetic code. The idea is to maximize the negentropy, defined as a number theoretic variant of Shannon entropy obtained by replacing the arguments of the logarithms with their p-adic norms. The point is that these entropies can also have negative values, depending on the prime defining the p-adic norm. Negentropy maximization makes it possible to assign a unique prime p(n) to a given integer n representing a DNA triplet.
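To make the definition concrete, here is a minimal sketch of the p-adic entropy for rational probabilities, assuming the form S_p = -sum_k P_k log(|P_k|_p). The function names and the toy distribution (1/4, 3/4) are illustrative assumptions and are not taken from the actual MATLAB computations.

```python
from fractions import Fraction
from math import log


def padic_norm(x, p):
    """p-adic norm |x|_p = p**(-v), where p**v is the power of p in the rational x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v = 0
    num, den = x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)


def padic_entropy(probs, p):
    """S_p = -sum_k P_k * log(|P_k|_p); zero probabilities are skipped."""
    return -sum(float(P) * log(float(padic_norm(P, p))) for P in probs if P != 0)


def best_prime(probs, primes):
    """Prime among the candidates that maximizes the negentropy -S_p."""
    return max(primes, key=lambda p: -padic_entropy(probs, p))


probs = [Fraction(1, 4), Fraction(3, 4)]  # toy distribution, assumed for illustration
for p in (2, 3, 5):
    print(p, padic_entropy(probs, p))
print(best_prime(probs, [2, 3, 5]))
```

For this toy distribution the entropy is negative for p=2 (both norms equal 4), positive for p=3, and zero for p=5, so the negentropy is maximal for p=2.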

The task is to determine the map from DNA codons, which are naturally labelled by three base-4 digits, to the set of integers n in the range 0-63. The map is deduced by assigning Boltzmann weights f(r) = exp(-H(r)/T) to the partitions (n,r) of n into r summands and by maximizing the negentropy. One can consider bosonic, fermionic, and supersymmetric thermodynamics. The bosonic case corresponds to all possible partitions, the fermionic case to the partitions containing a given integer at most once, and the supersymmetric case to the product of the bosonic and fermionic partition functions.
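The partition counting can be sketched as follows. This is only an illustration under stated assumptions: d(n,r) denotes the number of partitions of n into exactly r summands, the probabilities are taken to be P(n,r) = d(n,r) f(r)/Z(n) with Z(n) the sum of the weights, and all function names are made up here. The supersymmetric case is left out.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def bosonic(n, r):
    """Number of partitions of n into exactly r positive summands."""
    if n == 0 and r == 0:
        return 1
    if n <= 0 or r <= 0:
        return 0
    # Either the smallest part is 1 (remove it) or all parts are >= 2
    # (subtract 1 from each of the r parts).
    return bosonic(n - 1, r - 1) + bosonic(n - r, r)


def fermionic(n, r):
    """Number of partitions of n into exactly r distinct summands.

    Subtracting 0, 1, ..., r-1 from the sorted parts maps these one-to-one
    to ordinary partitions of n - r(r-1)/2 into r parts.
    """
    return bosonic(n - r * (r - 1) // 2, r)


def probabilities(n, f, count=bosonic):
    """Rational probabilities P(n, r) = d(n, r) f(r) / Z(n) for r = 1..n (n >= 1)."""
    weights = [count(n, r) * Fraction(f(r)) for r in range(1, n + 1)]
    Z = sum(weights)
    return [w / Z for w in weights]


print([bosonic(5, r) for r in range(1, 6)])    # prints: [1, 2, 2, 1, 1]
print([fermionic(5, r) for r in range(1, 6)])  # prints: [1, 2, 0, 0, 0]
print(probabilities(4, lambda r: 1))           # equal weights f(r) = 1
```

With a rational f(r) the probabilities stay rational, so their p-adic norms are well defined.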

The numerical experimentation led to the conclusion that the simplest Hamiltonians do not work. Quantum criticality and fractality of the TGD Universe however inspire the idea that criticality is an inherent property of the Hamiltonian rather than only of the thermodynamical state. Hence the Hamiltonian can depend only weakly on the character of the partition, so that all partitions contribute with almost equal weights to the partition function. The natural assumption is that the Hamiltonian depends only on the number r of summands in the partition. The supersymmetric variants of this kind of Hamiltonian yield the most realistic candidates for the genetic code, and one might hope that a number theoretically small perturbation, not changing the divisors p < 61 of the partition function but affecting the probabilities, could give the correct degeneracies.

Unfortunately, numerical experimentation suggests that this might not be the case and that a simple analytic form of the Hamiltonian is too much to hope for. A simple argument however shows that in the quantum critical case exp(-H/T)=f(r) could be deduced from the genetic code by fixing the 62 values of f(r) so that the desired 62 correspondences n → p(n) result. The idea of an almost universal genetic code would be replaced with the idea that quantum criticality allows one to engineer a genetic code maximizing the total negentropy associated with DNA triplet-amino acid pairs. In principle this would allow one to predict a unique genetic code as the absolute negentropy maximum, but this is beyond my computational resources since the crucial assumption 1< f(n)< n still leaves 63! possibilities to consider.

A natural guess is that the map codon → n from codons to integers is a small deformation of the map obtained by identifying the nucleotides with the base-4 digits 0, 1, 2, 3 (this identification depends on whether the first, second, or third nucleotide is in question). This map predicts an approximate p(n)=p(n+1) symmetry, directly visible over finite ranges in the columns of the code table, and it also has a convincing number theoretical justification in terms of a procedure allowing one to construct f(n) by trial and error. This map is also consistent with the exact A-G symmetry and the almost exact T-C symmetry with respect to the last nucleotide of the codon.

One can deduce both the codon-integer and the amino acid-prime correspondences. At least two Boltzmann weight distributions f(n) are consistent with the genetic code and with the Negentropy Maximization Principle constrained by the degeneracies of the genetic code. Only bosonic thermodynamics works, contrary to the expectations raised by the earlier analytic models.

What is so non-trivial is that the natural map assigning an integer to a given codon gives almost correctly the map of codons to integers n, which in turn allows one to understand the genetic code as a correspondence maximizing the individual negentropies of codon-amino acid pairs as well as their sum. This motivates the attempts to find a physical interpretation of the number theoretical thermodynamics. An interpretation in terms of a broken conformal invariance is highly suggestive, since bosonic partitions can be assigned to the states of a fixed conformal weight n constructed by using ordered sequences of conformal generators Ln or, even better, U(1) Kac Moody generators Jn, so that basically a breaking of Kac Moody symmetry would be in question. What is this system? For instance, could it be associated with the lightlike boundaries of magnetic flux quanta, which are key actors in the TGD based model of topological quantum computation?

For the details of the number theoretic model of the genetic code see the new chapter "Could Genetic Code Be Understood Number Theoretically?" of "Genes, Memes, Qualia, and ...".

Matti Pitkänen
