In total there are 4096 possible Codon mutations. Using a binary-image numerical system to express those mutations can significantly benefit the observation, recording, statistics, analysis and organization of info/data regarding the SARS-2 virus and relevant research.
What is Binary-Image Numerical System?
About two years ago, based on the ancient Chinese Bagua (I-Ching) binary system and modern mirror image theory, I developed a new genetic Codon system and its table.
The basics: In classic mirroring, the mirror value of 2 is (-2), and the mirror value of zero (0) is still zero (0). In the said system, the mirror value of 0 is 1 and, vice versa, the mirror value of 1 is 0 instead of (-1). In this way, the mirror value of a numerical value depends on the dimension, i.e., the number of binary digits, because every digit is flipped. For example, the mirror value of the 3D value 2 is 5, that is, the mirror value of 010 is 101; the mirror value of the 3D value zero (000) is 7 (111); the mirror value of the 4D value zero (0000) is 15 (1111), and so on.
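The mirror rule above amounts to flipping every binary digit within a fixed width. A minimal sketch in Python, assuming that reading; the function name `mirror` and its `bits` parameter are illustrative and not part of the original system:

```python
def mirror(value: int, bits: int) -> int:
    """Return the binary mirror of `value` in a `bits`-digit space (flip every digit)."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    mask = (1 << bits) - 1  # all ones, e.g. 0b111 for 3 digits
    return value ^ mask     # XOR with all ones flips each digit


print(mirror(2, 3))  # 010 -> 101
print(mirror(0, 3))  # 000 -> 111
print(mirror(0, 4))  # 0000 -> 1111
```

The three calls reproduce the examples in the text: 5, 7 and 15.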
Incorporating the Bagua binary mirror numerical approach into the ACGT system of the Codon system, and arranging the gene pairs according to the energy strength of each code from weak to strong, we can have:
Fig. 1: Gene Code: A Binary-Image Numerical Approach Corresponding to I-Ching.
Among them, A: binary value 00 (dec 0); its mirror image is gene T with binary value 11 (dec 3).
Similarly, C: binary value 01 (dec 1); its mirror image is gene G with binary value 10 (dec 2).
According to NIH GenBank statistics as of Feb-20 2021, the most frequent SARS-2 mutations are [C>T] and [A>G]. They have a common allele-gap value [2] (binary 10), namely:
Mutation [C>T]: 3(11)-1(01) = 2(10),
Mutation [A>G]: 2(10)-0(00) = 2(10).
This positive value could be a numerical indicator of the energy change caused by the virus mutation and that of an increase in transmission power.
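The base assignment and the two allele-gap calculations above can be sketched in Python as follows. The names `BASE_VALUE`, `mirror_base` and `base_gap` are illustrative only; the numeric assignment A=00, C=01, G=10, T=11 is the one given in the text.

```python
# Two-digit binary value of each base, as assigned in the text.
BASE_VALUE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
VALUE_BASE = {value: base for base, value in BASE_VALUE.items()}


def mirror_base(base: str) -> str:
    """Flip both binary digits of a base: A <-> T and C <-> G."""
    return VALUE_BASE[BASE_VALUE[base] ^ 0b11]


def base_gap(ref: str, alt: str) -> int:
    """Allele gap of a single-base mutation: Alt value minus Ref value."""
    return BASE_VALUE[alt] - BASE_VALUE[ref]


print(mirror_base("A"), mirror_base("C"))        # T G
print(base_gap("C", "T"), base_gap("A", "G"))    # 2 2
print(base_gap("T", "C"))                        # -2
```

Under this coding the reverse mutations T>C and G>A give -2, which is the sign difference the discussion below relies on.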
The arrangement and function of the Bagua binary mirror Codon system is consistent with the ancient Chinese idea of "Yin-Yang with Five Elements". The simple demo is as follows:
As A-T and C-G are pairs to each other, if A-G belongs to Yin, then C-T belongs to Yang. Under this coding, C>T means a positive increase in energy and T>C means a negative one, although their absolute numerical values are the same. The same applies, but in the opposite direction, to the A>G and G>A pair. This expression may well reflect the increased power and decreased energy consumption when the SARS-2 virus interacts with human ACE2 and runs/functions in the body.
A Binary-Image Based Codon Table.
According to the above framework, a new gene Codon table (Table 1) and the assignment of each Codon can be represented by:
AAA can be expressed as: 00 00 00, decimal is 0,
AAC can be expressed as: 00 00 01, decimal is 1,
AAG can be expressed as: 00 00 10, decimal is 2,
AAT can be expressed as: 00 00 11, decimal is 3,
ATT can be expressed as: 00 11 11, decimal is 15;
CAA can be expressed as: 01 00 00, decimal is 16,
CTT can be expressed as: 01 11 11, decimal is 31;
GAA can be expressed as: 10 00 00, decimal is 32,
GTT can be expressed as: 10 11 11, decimal is 47;
TAA can be expressed as: 11 00 00, decimal is 48,
TTT can be expressed as: 11 11 11, decimal is 63.
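The assignments listed above follow from reading a Codon as a six-digit binary number, two digits per base. A minimal Python sketch of that reading; the helper name `codon_value` is illustrative and not from the original table:

```python
BASE_VALUE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def codon_value(codon: str) -> int:
    """Decimal value (0..63) of a three-base Codon, two binary digits per base."""
    if len(codon) != 3:
        raise ValueError("a Codon has exactly three bases")
    value = 0
    for base in codon.upper():
        value = (value << 2) | BASE_VALUE[base]  # shift left, append the next base
    return value


for codon in ("AAA", "AAT", "ATT", "CAA", "GTT", "TAA", "TTT"):
    print(codon, format(codon_value(codon), "06b"), codon_value(codon))
```

The loop prints the same binary and decimal values as the list above, e.g. ATT as 001111 and 15.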
Each gene mutation can be expressed as:
Allele-Gap (Ref, Alt) = Alt(value) - Ref(value).
Here, Ref is the Codon value before the mutation and Alt is the Codon value after the mutation; their difference is termed the [allelic gap] of the genetic mutation, and it is calculated by the above formula.
Example: Among the S-Gene mutations of B.1.1.7 in the UK, there is A23063T/N501Y, whose Codon expression is AAC>TAC. As above, the value of AAC is 1 (000001) and the value of TAC is 49 (110001), so the allelic gap is 49-1=48, and the mutation N501Y can be recorded as: 48 (1, 49).
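The allele-gap formula and the record format used for N501Y can be sketched in Python as below. This is only an illustration under the Codon values defined earlier; the function names `allele_gap` and `record` are hypothetical.

```python
BASE_VALUE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def codon_value(codon: str) -> int:
    """Decimal value (0..63) of a three-base Codon, two binary digits per base."""
    value = 0
    for base in codon.upper():
        value = (value << 2) | BASE_VALUE[base]
    return value


def allele_gap(ref: str, alt: str) -> int:
    """Allele-Gap (Ref, Alt) = Alt(value) - Ref(value)."""
    return codon_value(alt) - codon_value(ref)


def record(ref: str, alt: str) -> str:
    """Format a mutation as 'gap (Ref value, Alt value)'."""
    return f"{allele_gap(ref, alt)} ({codon_value(ref)}, {codon_value(alt)})"


print(record("AAC", "TAC"))  # N501Y: 48 (1, 49)
print(record("TAC", "AAC"))  # the reverse mutation: -48 (49, 1)
```

Keeping the (Ref, Alt) pair next to the gap preserves the direction and location of the change, which the gap alone does not.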
In the same way, all mutations re B.1.1.7. can be expressed and analyzed by using the numerical value in the Bagua binary mirror coding system, such as:
Briefly, on the S-Gene mutations: the arithmetic difference of A570D and that of P681H is the same (-4); however, their Ref-Alt codes are different: the former is (37, 33) and the latter is (21, 27). In the said binary coding system, the former has a stronger negative energy change.
Taking the allelic-gap values of all such S-Gene mutations together, the score is 60, meaning a positive gain in energy function. This may represent an enhanced spreading power of the SARS-2 virus. (Their binary digits can be used for computer chip designs.)
The mutations of ORF1ab may need special attention. It has two mutations with the same absolute allelic gap but opposite coding locations, i.e., T761I has [8 (5, 13)] and I2230T has [-8 (13, 5)]. In classic mirror study, they would balance or cancel each other. However, in the said binary mirror system, they mean different energy changes for the virus affinity power. For instance, the former may increase affinity power and the latter may reduce the energy consumption when the virus runs with ACE2.
In the same way, observing and analyzing all mutations of B.1.1.7 gives a total score of positive 100. That shows a more obvious or more powerful gain of function, i.e., a more dangerous SARS-2 virus.
A Binary-Image Based Codon Coordinate.
According to the above, a binary-mirror-based coordinate can be constructed, taking mutation Ref and Alt as the X and Y axes respectively, and the decimal arithmetic value is the Z axis. In this way, the picture of the aforementioned mutation can be clear at a glance. For example:
Fig.2: Mutations Represented by Binary-Image Based Coordinate.
In the above graph, mutations A570D and P681H have the same absolute allelic value 4 at different locations by Ref/Alt values, meaning different energy powers. Mutations of T761I and I2230T have the same absolute allelic value 8 in opposite directions, meaning different but coordinated energies, not that they are canceling each other.
In fact, as there are 64 Codons, they produce 64*64=4096 allelic mutations, which can be totally covered by the said binary-mirror based coordinate. In other words, an allelic Codon mutation can be expressed in a numerical format. For example:
5(2,7) = AAG>ACT, -18(57,39)=TGC>GCT, 63(0,63)=AAA>TTT;
5(22,27)=CCG>CGT, -18(22,4)=CCG>ACA, -63(63,0)=TTT>AAA, etc.
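The numerical format above can also be read backwards, from a (Ref, Alt) value pair to the two Codons. A minimal Python sketch, assuming the same two-digits-per-base coding; `codon_from_value` and `describe` are illustrative names:

```python
BASES = "ACGT"  # index equals the base value: A=0, C=1, G=2, T=3


def codon_from_value(value: int) -> str:
    """Decode a value 0..63 back into its three-base Codon."""
    if not 0 <= value <= 63:
        raise ValueError("Codon values run from 0 to 63")
    return "".join(BASES[(value >> shift) & 0b11] for shift in (4, 2, 0))


def describe(ref_value: int, alt_value: int) -> str:
    """Render a (Ref, Alt) value pair in the 'gap(Ref,Alt)=REF>ALT' format."""
    gap = alt_value - ref_value
    ref, alt = codon_from_value(ref_value), codon_from_value(alt_value)
    return f"{gap}({ref_value},{alt_value})={ref}>{alt}"


print(describe(2, 7))    # 5(2,7)=AAG>ACT
print(describe(57, 39))  # -18(57,39)=TGC>GCT
print(describe(22, 4))   # -18(22,4)=CCG>ACA

# Every (Ref, Alt) pair on the 64 x 64 coordinate grid.
all_pairs = [(r, a) for r in range(64) for a in range(64)]
print(len(all_pairs))    # 4096
```

Note that the count of 4096 includes the 64 pairs where Ref equals Alt, which sit on the diagonal of the coordinate with gap 0.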
Further, each Codon has 3 gene codes, which can also be expressed by coordinate assignment, but the dimension needs to be increased. For example:
In 5(2,7) = AAG>ACT, [AAG] has two [A] and [ACT] has one [A], which can be expressed as:
5(2,7,0,0): The first A of [AAG]; the first 0 refers to the new dimension of Ref (before mutation), and the second 0 represents the first gene code of the Codon.
5(2,7,0,1): The second A of [AAG]; the first 0 refers to the new dimension of Ref (before mutation), and 1 represents the second gene code of the Codon.
5(2,7,0,2): The G of [AAG]; the first 0 refers to the new dimension of Ref (before mutation), and 2 represents the third gene code of the Codon.
The same goes for: 5(2,7,1,0): The A of [ACT]; the new dimension 1 is Alt (after mutation), and the following 0 represents the first gene code of the Codon.
There are 3 single gene codes for each Codon; thus, under the binary-mirror coordinates, there are in total (64*3)^2=36,864 different possibilities regarding single-gene allelic mutations. They can also be expressed by the numerical approach.
For example, regarding mutation GCC>ACG, the values [-31(37,6,0,2)] and [-31(37,6,1,1)] respectively indicate the third gene code (C) in GCC and the second gene code (C) in ACG. They have the same value [-31] but their energy powers are different.
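The four-part coordinate (Ref value, Alt value, side, position) can be generated mechanically. The Python sketch below is one possible reading of the scheme, with side 0 for Ref and 1 for Alt and positions counted from 0; the function name `base_coordinates` is hypothetical.

```python
BASE_VALUE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def codon_value(codon: str) -> int:
    """Decimal value (0..63) of a three-base Codon, two binary digits per base."""
    value = 0
    for base in codon.upper():
        value = (value << 2) | BASE_VALUE[base]
    return value


def base_coordinates(ref: str, alt: str):
    """List (gap, (Ref value, Alt value, side, position), base) for all six bases."""
    r, a = codon_value(ref), codon_value(alt)
    gap = a - r
    coords = []
    for side, codon in enumerate((ref, alt)):  # side 0 = Ref, side 1 = Alt
        for position, base in enumerate(codon):
            coords.append((gap, (r, a, side, position), base))
    return coords


for gap, coord, base in base_coordinates("GCC", "ACG"):
    print(f"{gap}{coord} -> {base}")

print((64 * 3) ** 2)  # the count stated in the text: 36864
```

For GCC>ACG the output includes -31(37, 6, 0, 2) for the third base of GCC and -31(37, 6, 1, 1) for the second base of ACG, both of which are C.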
In sum, using the said Bagua binary-mirror based numerical system to represent allelic mutations can make it easier for observation, recording, analyzing and organizing relevant info/data, especially when computer work is involved.
Of course, the 4096 possible mutations could be expressed by a simpler approach, something like [allele-gap (0…4095)]. However, the cost is that it cannot directly reflect the mirror status and trend re the energy change during and after the mutation process.
More Discussion: More uses of the binary-mirror Codon table.
The NIH published a new SARS-2 mutation database:
Mutations in SARS-CoV-2 SRA Data.
https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/scov2_snp
As of Feb 22, 2021, the database has 454,464 cases re SARS-2 mutations, of which there are 4777 events in terms of Codon mutation frequency. Using the said binary-image Codon table, each Codon has a unique frequency ratio for both Ref and Alt of the total events, such as:
Fig. 3:
Based on the above, the occurring frequency and score of each Codon base gene can be obtained as follows:
Fig. 3-2:
In the left figure, the numbers 1, 2 and 3 refer to the first, second and third base genes, which are positioned in the said binary-image form. The right figure shows the occurring-frequency score of each base gene code, A-C-G-T.
It can be seen that the mutation frequency scores of the basic genes C and G are significantly lower, indicating that they are in a relatively stable state, or that they work in keeping the virus stable. The mutation frequency scores of A and T are significantly higher, indicating that they are in a more active status, or that they have the function of strengthening the adaptability and stability regarding the virus mutations. Many research papers show that the most frequent mutations of SARS-2 are C>T and A>G, which can be directly expressed by the said binary mirror Codon table.
It is noteworthy that the PRC-PLA doctor Chen Wei and her team described the same trend in their SARS-2 vaccine patent. For example:
Patent number: CN111218459A
Patent Title: Recombinant Novel Coronavirus Vaccine Taking Human Replication-Defective Adenovirus As Vector.
Filed on 2020-03-18 by Chen Wei (etc.) of the PRC-PLA medical institute.
Link: https://patents.google.com/patent/CN111218459A/en
…In this case, we used a method of replacing part of the high-frequency and low-frequency codons by artificial analysis while uniformly distributing the high-frequency codons and the low-frequency codons in the S protein gene. Also considering that increasing the GC content of mRNA helps to enhance the stability of mRNA, we suitably increased the GC content of the S protein gene and distributed G, C nucleotides as evenly as possible throughout the GP gene. #
Dr. Chen Wei, the principal inventor of the vaccine, said that their vaccine can cover all known SARS-2 mutations and that the vaccine was put in production on Feb-26-2020. Recently, PRC officials said the same when promoting China-made vaccines.
The prerequisite for vaccine development is the possession of the relevant viruses, and the pre-condition for covering known mutations is to have a large quantity of different genetic mutation strains. Moreover, it normally takes half a year to several years to complete a new vaccine design before putting it into production.
Therefore, the above patent, which told (or must tell) the truth, strongly suggests that, in or before August 2019, the PRC authorities already had a big enough and well-organized genetic database regarding SARS-2 and its mutations. Otherwise, how could the vaccine developers know so clearly about the C and G functions in SARS-2 mutations and then put their knowledge into vaccine development?
However, at that time, i.e., August 2019, there was no confirmed, large pandemic reported. Thus, it is not an exaggeration to say that SARS-2 comes from a lab product, which made the said C-G vaccine design, its coverage of all known mutations, and its development all possible.
Note: A Useful Explanation From PLA doctor Chen Wei.
Regarding her Covid-Vaccine design, PRC military doctor Chen Wei provided a unique perspective on the frequency, quantity and balanced distribution of the basic genes C and G in the S-gene. This information can help research into the SARS-2 origin, which is artificially synthesized in a laboratory such as PRC-WIV. The following chart is a comparison of the basic gene frequency of the S-gene in different samples:
Fig.3-2-B.: Occurring Frequencies of Base Genes in S-Gene.
Among them,
NC-45512 is the basic sample commonly used to study SARS-2, which came from a patient in PRC Wuhan at the early stage of the Wuhan epidemic.
CW-PATENT is the Covid vaccine, mRNA, developed by PRC military doctor Chen Wei and her team, using gene editing technology.
WIV1 is one of the artificial chimera new coronaviruses, which was made by Shi Zhengli's team at the Wuhan Institute of Virology (WIV), with cooperation from some Americans such as Dr. Daszak in NYC and others.
RaTG13 is a nCov sample carried by Yunnan bats. According to Shi Zhengli team of WIV, it has become the source of SARS-2 with 96.2% identity.
Bat Yunnan-2012/2014 are samples of Yunnan bats carrying corona-virus collected in 2012 and 2014, i.e., one year before and after the RaTG13 collect-date.
It can be seen that the frequency and length of the CG basic genes of the bat virus samples collected in 2012/2014 are significantly different from those of the other samples. That is, if the RaTG13 sample came from nature, then its base genes should have almost identical frequencies to those of the Yunnan-2012/2014 bats. However, it is very similar to WIV1 and CW-Patent, which are man-made nCov; plus, it has a 15bp mutation that is the same as the gene-editing tool sgRNA, which the natural bats do not have at all.
Clearly, WIV1 and RaTG13 come from artificial synthetic chimeras. Since the RaTG13 virus does not transmit to humans but WIV1 can directly jump to humans, it would be reasonable to say that WIV1 is the first candidate origin of SARS-2. In fact, in March 2016, the NAS publication warned that WIV1, a synthetic chimera, can directly jump to humans without an intermediate host, causing a major pandemic and economic losses, and even changing the existing lifestyle. Now the warning has become a harsh and deadly reality that the whole world is facing. (end of note).
Forecasting the SARS-2 Mutations.
Further, using the said binary-image system can help to forecast the mutation trend. For instance, assuming that the allelic mutations of A>G and C>T happen again on each of the basic genes in the above data, using the said allelic-gap formula to calculate, the following chart is obtained:
Fig. 3-3: The SARS-2 Mutation Trend, Codon and Base Genes:
Obviously, in terms of mutation activity and its amplitude,
1] the first basic gene is in the lowest and the third basic gene in the highest trend;
2] the C>T allelic mutation is much more active than A>G mutation, reflected by both of their scores and average;
3] the C>T mutation alleles are mostly positive, and the A>G, negative.
The above information suggests that the SARS-2 mutation is far from over and, by referencing the above NIH data, that it may last for another year, in which the C>T alleles have played and will continue to play a major role.
It is also noteworthy that, in the PRC, gene-editing is not only very popular but also seriously out of control. For example, its management framework states that gene-editing has no ethical or safety issues and that, as long as no foreign gene is involved, a gene-edited product can be managed and used as a conventionally bred one.
Under such a framework, many gene-edited food products are commercialized in the name of conventional (or even natural) breed-products, and even gene-edited babies were delivered (news reported that there were two, though, in fact, open papers alone indicated that there were at least 58 gene-edited embryos, with no report on whether they were born or not).
Within such a framework and gene-editing development in the PRC, a man-made and gene-edited virus (including SARS-2) would be just a piece of cake. So it would be no surprise that Dr. Chen Wei and her team could have a big SARS-2 mutation database to develop their vaccines before the pandemic started and spread all over China.
The receptor ACE2, which interacts with SARS-2, and its mutations should also receive enough research attention, for both academic and practical purposes in helping public health.
Here is the NIH database re ACE2 mutations:
ACE2, Gene ID: 59272, updated on 7-Mar-2021,
https://www.ncbi.nlm.nih.gov/gene/59272/ ,
Go to Variation Viewer for ACE2 variants,
https://www.ncbi.nlm.nih.gov/variation/view/?assm=GCF_000001405.25&q=ACE2[gene]
Using the said binary-image Codon table to organize its raw mutation data regarding the occurring frequencies of the Codon base genes, the result is:
A: Ref:10; Alt:06; total score:16;
C: Ref:10; Alt:03; total score:13;
G: Ref:07; Alt:11; total score:18;
T: Ref:02; Alt:09; total score:11.
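The tally above can be checked with a few lines of Python. The counts are copied from the list; the variable names are illustrative, and the Ref/Alt balance check is simply a consistency test added here, not part of the original analysis.

```python
# (Ref count, Alt count) per base gene, copied from the ACE2 tally above.
counts = {"A": (10, 6), "C": (10, 3), "G": (7, 11), "T": (2, 9)}

# Total score per base gene = Ref count + Alt count.
totals = {base: ref + alt for base, (ref, alt) in counts.items()}
print(totals)

# Each tallied mutation has one Ref base and one Alt base,
# so the two column sums are expected to match.
ref_sum = sum(ref for ref, _ in counts.values())
alt_sum = sum(alt for _, alt in counts.values())
print(ref_sum, alt_sum)
```

The totals reproduce the scores 16, 13, 18 and 11, and both columns sum to 29.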
Repeating the above forecasting method gives the following forward-looking data regarding the ACE2 mutation trend, now and in the future:
Fig 3-4:
It shows that the C>T and A>G mutations are also active, but less so than in the SARS-2 mutations; further, the base genes C-G and A-T are playing balanced or image roles.
The ACE2 mutation database is small, though it tells that, along with the SARS-2 pandemic development, human ACE2 is changing toward a status that matters a lot for public health but has not been studied well enough.
According to research papers by members of the PRC Academy of Science, the C>T mutation has the highest frequency among the gene-edited food-crops they developed. Coincidentally, some Brazilian genetic experts indicated that the C>T allelic mutation appeared the most among off-targets during gene-editing operations on agricultural crops.
The coincidence of highly frequent C>T mutations in the above different fields raises safety issues regarding gene editing and its CASx protein residues. In other words, in order to effectively prevent the spread and recurrence of the epidemic, it would be necessary to consider establishing an effective restriction on gene-editing usage in food-crop products.
Reference Reading:
A Total Binary Mirror Image Codon Table. By Zhiyan-Le, 2019, 04-16.
https://sites.google.com/site/zhiyanpage2/2019/new-issues/zy9417-tbm-eng
Table 1: A Binary-Image Based Codon Table.
Table 2: A Binary-Image Based Coordinate re Allelic Gap of Codon Mutations.
Note: Red color means negative indexing value (the allelic-gap = Alt – Ref values).