The $\mathbb{Z_5}$-vector space $\mathfrak{B}$3 over the field $(\mathbb{Z_5}, +, .)$

1. Background

This is a formal introduction to the genetic code $\mathbb{Z_5}$-vector space $\mathfrak{B}^3$ over the field $(\mathbb{Z_5}, +, .)$. This mathematical model is defined based on the physicochemical properties of DNA bases (see previous post). This introduction can be complemented with a Wolfram Computable Document Format (CDF) named IntroductionToZ5GeneticCodeVectorSpace.cdf available in GitHub. This is graphic user interface with an interactive didactic introduction to the mathematical biology background that is explained here. To interact with a CDF users will require for Wolfram CDF Player or Mathematica. The Wolfram CDF Player is freely available (easy installation on Windows OS and on Linux OS).

2. Biological mathematical model

If the Watson-Crick base pairings are symbolically expressed by means of the sum “+” operation, in such a way that hold: G + C = C + G = D, U + A  = A + U = D, then this requirement leads us to define an additive group ($\mathfrak{B}^3$, +) on the set of five DNA bases ($\mathfrak{B}^3$, +). Explicitly, it was required that the bases with the same number of hydrogen bonds in the DNA molecule and different chemical types were algebraically inverse in the additive group defined in the set of DNA bases $\mathfrak{B}$. In fact eight sum tables (like that one shown below), which will satisfice the last constraints, can be defined in eight ordered sets: {D, A, C, G, U}, {D, U, C, G, A}, {D, A, G, C, U}, {D, U, G, C, A},{G, A, U, C},{G, U, A, C},{C, A, U, G} and {C, U, A, G} [1,2]. The sets originated by these base orders are called the strong-weak ordered sets of bases [1,2] since, for each one of them, the algebraic-complementary bases are DNA complementary bases as well, pairing with three hydrogen bonds (strong, G:::C) and two hydrogen bonds (weak, A::U). We shall denote this set SW.

A set of extended base triplet is defined as $\mathfrak{B}^3$ = {XYZ | X, Y, Z $\in\mathfrak{B}$}, where to keep the biological usual notation for codons, the triplet of letters $XYZ\in\mathfrak{B}^3$ denotes the vector $(X,Y,Z)\in\mathfrak{B}^3$ and $\mathfrak{B} =$ {A, C, G, U}. An Abelian group on the extended triplets set can be defined as the direct third power of group: 

$(\mathfrak{B}^3,+) = (\mathfrak{B},+)×(\mathfrak{B},+)×(\mathfrak{B},+)$

where X, Y, Z $\in\mathfrak{B}$, and the operation “+” as shown in the table [2]. Next, for all elements $\alpha\in\mathbb{Z}_{(+)}$ (the set of positive integers) and for all codons $XYZ\in(\mathfrak{B}^3,+)$, the element:

$\alpha \bullet XYZ = \overbrace{XYZ+XYX+…+XYZ}^{\hbox{$\alpha$ times}}\in(\mathfrak{B}^3,+)$ is well defined. In particular, $0 \bullet X =$ D for all $X\in(\mathfrak{B}^3,+) $. As a result, $(\mathfrak{B}^3,+)$ is a three-dimensional (3D) $\mathbb{Z_5}$-vector space over the field $(\mathbb{Z_5}, +, .)$ of the integer numbers modulo 5, which is isomorphic to the Galois field GF(5). Notice that the Abelian groups $(\mathbb{Z}_5, +)$ and $(\mathfrak{B},+)$ are isomorphic. For the sake of brevity, the same notation $\mathfrak{B}^3$ will be used to denote the group $(\mathfrak{B}^3,+)$ and the vector space defined on it.

+DACGU
DDACGU
AACGUD
CCGUDA
GGUDAC
UUDACG

This operation is only one of the eight sum operations that can be defined on each one of the ordered sets of bases from SW.

3. The canonical base of the $\mathbb{Z_5}$-vector space $\mathfrak{B}^3$

Next, in the vector space $\mathfrak{B}^3$, vectors (extended codons): e1=ADD, e2= DAD and e3=DDA are linearly independent, i.e., $\sum\limits_{i=1}^3 c_i e_i =$ DDD implies $c_1=0, c_2=0$ and $c_3=0$ for any distinct $c_1, c_2, c_3 \in\mathbb{Z_5}$. Moreover, the representation of every extended triplet $XYZ\in\mathfrak{B}^3$ on the field $\mathbb{Z_5}$ as $XYZ=xe_1+ye_2+ze_3$ is unique and the generating set $e_1, e_2$, and $e_3$ is a canonical base for the $\mathbb{Z_5}$-vector space $\mathfrak{B}^3$. It is said that elements $x, y, z \in\mathbb{Z_5}$ are the coordinates of the extended triplet $XYZ\in\mathfrak{B}^3$ in the canonical base ($e_1, e_2, e_3$) [3]
  1. José M V, Morgado ER, Sánchez R, Govezensky T. The 24 Possible Algebraic Representations of the Standard Genetic Code in Six or in Three Dimensions. Adv Stud Biol, 2012, 4:119–52.
  2. Sanchez R. Symmetric Group of the Genetic-Code Cubes. Effect of the Genetic-Code Architecture on the Evolutionary Process. MATCH Commun Math Comput Chem, 2018, 79:527–60.
  3. Sánchez R, Grau R. An algebraic hypothesis about the primeval genetic code architecture. Math Biosci, 2009, 221:60–76.