2018
In accordance with the budgetary allocation for the calendar year, the project included one activity, namely:
"The growth of shoots, extraction and quantification of DNA".
The necessary purchases of materials and equipment were completed, and the main institutions that will directly benefit from the project results were consulted on the selection of material. These included the National Institute for Agricultural Research (INCDA Fundulea), the three Research Stations for maize (Turda, Șimnic and Lovrin) and the Bank of Plant Genetic Resources Suceava. Following the discussions, 1,735 maize inbred lines were chosen for molecular genotyping. The number of lines was increased (from the 1,200 provided in the project proposal) after additional sources of funding were identified. Moreover, 600 inbred lines from the Republic of Moldova were added to the study, given that they are very close to the Romanian national germplasm, as well as 100 lines from the Bank of Plant Genetic Resources of Serbia (Maize Research Institute Zemun Polje), to give a better picture of the genetic diversity of this crop in the Balkans.
An attempt was made to germinate all 2,435 lines, either within the Institute of Biological Research or at the partners' locations. For 2,236 of them, germination was successful and DNA was extracted.
DNA extraction was performed using specific kits, and DNA was quantified using NanoDrop. Its quality and quantity were within optimal parameters, so the samples were sent out for genotyping using Illumina NGS technology.
A collaboration with the Chinese partner (China Agricultural University, Beijing) was established, considering the similar study that they had undertaken using 3,000 maize inbred lines. The results will be corroborated, and seeds will later be exchanged between the Romanian and Chinese collaborators, both parties being interested in bringing in germplasm from the other country to improve local breeding programs. To this end, seed exchange agreements are being finalized. The collaboration with the Chinese partner will materialize in a joint project to be submitted under a financing scheme that is part of Horizon 2020, with which China is an associated state.
==============================================================================================
2019
As stated in our project proposal, the broad scope of our work was to uncover the molecular background of 1,200 maize inbred lines in order to assign them to heterotic groups (i.e., groups sharing the same molecular genetic background), which are anchored to international standard lines used in maize breeding (B73, Mo17, Fv2, Lo3, C103, D105, OH43, W153R). This impressive Romanian germplasm (kept in the country’s six stock centers) is the result of decades of plant breeding, meticulous observations, and collecting samples from all around the country. The missing piece of information was their molecular background, which we have supplied this year.
Through our network of collaborators, we managed to increase the number of lines included in the analyses to 2,236, incorporating material from the Republic of Moldova and former Yugoslavia. Thus, we now have a broader view of the genetic structure of maize in the Balkan area. In addition, our oral presentation at this year's EUCARPIA meeting in Freising, Germany, led to an extended collaboration with the researchers led by Dr. Domagoj Simic, from Poljoprivredni institut Osijek, Croatia, with whom we agreed to share data, his team complementing our research through the inclusion of Croatian and Hungarian maize germplasm.
All the inbred lines were genotyped by GBS and analyzed according to the workflow detailed below:
Samples were pooled as multiplexed libraries using Illumina barcodes.
Paired-end sequencing was performed on Illumina HiSeq X instruments (0.1x genomic coverage).
Cleaning of raw data was performed using Fastp.
All clean reads were mapped onto the B73_v4 reference genome using BWA-MEM.
Sorting and indexing were done using Samtools.
Duplicates were removed using MarkDuplicates implemented in Picard Tools.
SNPs and indels were called using UnifiedGenotyper from GATK3.
The VCF file output by GATK3 was imputed using BEAGLE 5.0.
The BEAGLE output served as input for:
GCTA, to perform the PCA analysis.
Plink, to calculate the distance matrix (1-ibs), which served as input for MEGA, to calculate the NJ tree.
ADMIXTURE, to infer population structure.
The workflow above yielded > 800,000 informative SNP markers across the inbred lines, filtered from the > 2.3 million originally generated.
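The workflow steps above can be sketched as an ordered list of command lines for a single sample. This is only a minimal sketch under stated assumptions, not the commands actually run in the project: the file names, the jar locations (`picard.jar`, `GenomeAnalysisTK.jar`, `beagle.jar`), the read-group string and the `-glm BOTH` setting are illustrative placeholders. It also assumes the reference has already been indexed, and in practice variant calling and imputation would run jointly over all samples rather than one at a time.

```python
def build_commands(sample, r1, r2, ref="B73_v4.fa"):
    """Return the command lines for one sample, in workflow order.

    Nothing is executed; each command is a list suitable for subprocess.run.
    All file names and jar paths are hypothetical placeholders.
    """
    clean1 = f"{sample}_R1.clean.fq.gz"
    clean2 = f"{sample}_R2.clean.fq.gz"
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    dedup = f"{sample}.dedup.bam"
    vcf = f"{sample}.vcf"
    # Read-group header (GATK requires one); bwa expands the literal "\t" itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Clean the raw paired-end reads.
        ["fastp", "-i", r1, "-I", r2, "-o", clean1, "-O", clean2],
        # 2. Map the clean reads onto the reference genome.
        ["bwa", "mem", "-R", read_group, "-o", sam, ref, clean1, clean2],
        # 3. Sort and index the alignments.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # 4. Remove duplicate reads, then index the result.
        ["java", "-jar", "picard.jar", "MarkDuplicates",
         f"I={bam}", f"O={dedup}", f"M={sample}.dup_metrics.txt",
         "REMOVE_DUPLICATES=true"],
        ["samtools", "index", dedup],
        # 5. Call SNPs and indels.
        ["java", "-jar", "GenomeAnalysisTK.jar", "-T", "UnifiedGenotyper",
         "-R", ref, "-I", dedup, "-o", vcf, "-glm", "BOTH"],
        # 6. Impute missing genotypes.
        ["java", "-jar", "beagle.jar", f"gt={vcf}", f"out={sample}.imputed"],
    ]


if __name__ == "__main__":
    for cmd in build_commands("line001", "line001_R1.fq.gz", "line001_R2.fq.gz"):
        print(" ".join(cmd))
```

Keeping the commands as data rather than running them directly makes it easy to log them or to hand them to a scheduler, which matters when the same steps are repeated for more than 2,000 samples.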
As we hypothesized, the Romanian germplasm has some unique features that make it extremely valuable for future breeding programs, which are nowadays faced with a shrinking pool of favorable alleles. Thus, in a PCA of the genetic relations among the inbred lines, we have shown that many of them form two clusters that deviate from the existing genetic structure based on the international elite lines mentioned above (Figure 1).
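To illustrate the kind of computation behind such a PCA, the sketch below builds a genetic relationship matrix from a genotype matrix coded 0/1/2 and takes its leading eigenvectors. It is a minimal NumPy illustration under stated assumptions, not the GCTA implementation used in the project: the function name, the toy data and the standardization shown here are assumptions, and missing genotypes are assumed to have been imputed already.

```python
import numpy as np


def pca_from_genotypes(G, n_components=3):
    """PCA of samples from a (samples x SNPs) matrix of 0/1/2 allele counts."""
    G = np.asarray(G, dtype=float)
    p = G.mean(axis=0) / 2.0                 # allele frequency per SNP
    keep = (p > 0) & (p < 1)                 # monomorphic SNPs carry no signal
    G, p = G[:, keep], p[keep]
    Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))   # standardize each SNP
    grm = Z @ Z.T / Z.shape[1]               # genetic relationship matrix
    eigval, eigvec = np.linalg.eigh(grm)     # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]
    return eigvec[:, order], eigval[order]


if __name__ == "__main__":
    # Toy data: two groups of fully homozygous lines with contrasting allele frequencies.
    rng = np.random.default_rng(0)
    group_a = 2 * (rng.random((10, 200)) < 0.1)
    group_b = 2 * (rng.random((10, 200)) < 0.9)
    pcs, eigval = pca_from_genotypes(np.vstack([group_a, group_b]))
    print(pcs[:, 0].round(2))   # the first axis separates the two groups by sign
```

Lines that fall far from the reference lines along the leading axes are the ones that would appear as separate clusters in a plot such as Figure 1.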
For 200 of the inbred lines we know the pedigree, and they are confirmed to have been extracted from old local populations of Romania. Our country still has pristine areas where hybrids have not been grown, and peasants prefer sowing their own material, passed on from one generation to another. These populations served as gene pools for our Agricultural Research Stations and the Fundulea Agricultural Institute, where the above inbred lines were created. Not surprisingly, when mapped onto the above figure, the 200 inbred lines clustered in the two groups identified as not being populated by international elite lines (Figure 2).
The clear separation of the Romanian inbred lines is confirmed by the Neighbour Joining method (Figure 3). At least three clusters are present that are deeply rooted and that do not have an elite inbred line included but rather local germplasm.
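The distance matrix that feeds the NJ tree can be illustrated with a small sketch of the 1-IBS distance between lines. It uses the usual allele-sharing definition on genotypes coded 0/1/2 and assumes no missing data (that is, imputation has already been done). It does not claim to reproduce Plink's exact output or file format, and the function name is hypothetical.

```python
import numpy as np


def one_minus_ibs(G):
    """Pairwise 1-IBS distances from a (samples x SNPs) matrix of 0/1/2 allele counts.

    For two samples, the distance is the mean number of alleles NOT shared
    identical-by-state per marker, scaled to the range 0..1.
    """
    G = np.asarray(G, dtype=float)
    n, m = G.shape
    D = np.zeros((n, n))
    for i in range(n):
        # |g_i - g_j| is 0, 1 or 2 per marker; 2*m is the maximum possible total.
        D[i] = np.abs(G - G[i]).sum(axis=1) / (2.0 * m)
    return D


if __name__ == "__main__":
    toy = [[0, 2, 2, 0],
           [0, 2, 0, 0],
           [2, 0, 0, 2]]
    print(one_minus_ibs(toy))
```

In the toy matrix, the first two lines differ at one of four markers and the third line shares no allele with the first, so the third line would sit on a long, separate branch of the tree.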
We also inferred the genetic structure of the > 2,000 inbred lines using ADMIXTURE (Figure 4). Whereas many of them share part of their genome with the other inbred lines, including the international elite lines, four clearly differentiated clusters are conserved no matter which K is used (from K3 to K8). The third cluster includes the 200 inbred lines extracted from local populations.
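One simple way to read clusters with little admixture off an ADMIXTURE result is to look at the ancestry fractions (the Q matrix, one row per line and one column per cluster) and keep only the lines dominated by a single cluster. The sketch below is an illustration under stated assumptions: the 0.8 cut-off and the function name are hypothetical and were not taken from the project.

```python
import numpy as np


def assign_clusters(Q, threshold=0.8):
    """Assign each line to its dominant cluster, or -1 if it is admixed.

    Q is a (lines x K) matrix of ancestry fractions whose rows sum to 1.
    The threshold is an illustrative choice, not a project setting.
    """
    Q = np.asarray(Q, dtype=float)
    dominant = Q.argmax(axis=1)
    return np.where(Q.max(axis=1) >= threshold, dominant, -1)


if __name__ == "__main__":
    toy_q = [[0.95, 0.03, 0.02],   # almost pure cluster 0
             [0.40, 0.35, 0.25],   # admixed
             [0.10, 0.85, 0.05]]   # mostly cluster 1
    print(assign_clusters(toy_q))
```

Running the same assignment for each K and comparing the memberships is one way to check whether a cluster is conserved across K values.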
In summary, using more than 800,000 SNP markers we grouped the Romanian germplasm into heterotic groups and showed that part of the inbred lines are new additions to the international allelic pool available to breeding programs worldwide. These heterotic groups/clusters help breeders design crosses that yield new hybrids with an increased level of heterosis.
Figure 1. PCA results in 3D view, using the GCTA software and the output generated by BEAGLE 5.0 after imputations. In red are the international elite lines. Two clusters are formed that do not contain any international standards.
Figure 2. Same as above, but the 200 inbred lines originating from old local populations are highlighted in orange. They cluster in the two corners of the pyramidal 3D structure, unrelated to the international elite inbred lines.
Figure 3. NJ tree presenting the three big clusters that are characteristic of the Romanian germplasm. In red are the international standards used as reference.
Figure 4. Population structure of the > 2,000 inbred lines inferred using ADMIXTURE, from K3 to K8. The four clusters that share little admixture are highlighted. The largest such cluster includes the 200 inbred lines originating in old Romanian germplasm.
==============================================================================================
2020
The present project focused on the molecular basis of corn diversity in Romania, with the aim of supporting national breeding programs by highlighting national resources. We analyzed the Romanian corn germplasm together with that of the Republic of Moldova (using additional budgetary resources), thus including ~ 2,300 parental lines (lines used to obtain hybrids) in this study. By studying the molecular makeup of this significant number of samples with state-of-the-art laboratory technologies (Genotyping-By-Sequencing; GBS), coupled with complex bioinformatics analyses, we highlighted the genetic richness of the material in the two countries, which has developed since the introduction of corn in Romania at the beginning of the 17th century. This wealth is all the more important as global breeding programs face a reduction in genetic diversity as a result of the intensive use of hybrids, especially after World War II. The main explanation is that Romania still has areas where hybrids have not yet penetrated, the peasants still using local populations transmitted "from father to son". The data we generated allow our collaborators (the National Institute of Agricultural Research and the Agricultural Stations in Romania, together with the Porumbeni Institute of Phytotechnics from the Republic of Moldova) to obtain new hybrids, superior to the current ones and with the potential to compete successfully with those generated by large seed companies.
The main motivation for proposing this study was the lack of solid and comprehensive molecular data about this important crop plant in Romania, our country being a blank spot on the world map from this point of view.
The results validated the working hypotheses: the Romanian germplasm has unique characteristics that make it extremely valuable for future breeding programs, which are currently facing a shrinking pool of favorable alleles. Thus, in a PCA of the genetic relationships between inbred lines, we showed that many of them form two clusters that deviate from the genetic structure given by the elite international lines used as reference: B73, Mo17, Fv2, Lo3, C103, D105, OH43, W153R.
The year 2020 was dedicated to the final interpretation of the data and the presentation of the final results. In the context of the COVID-19 pandemic, the Maize Genetics Conference 2020, where we had planned to present the results, was canceled. However, we gave an oral presentation in the University of Bucharest "Science for all" series at the Romanian Athenaeum, as well as an oral presentation in the webinars organized by the European program ELIXIR (https://elixir-europe.org).
Collaboration with the Chinese partners at China Agricultural University, Beijing, was severely affected by the lockdown caused by the pandemic in China, and the generation of the latest laboratory data was considerably delayed. However, we managed to generate and process all of the data, so that compared to the 2019 report we increased the number of SNP markers from 800,000 to over 1,400,000, which considerably increases the resolution in demarcating heterotic groups.
The genetic information collected by genotyping the 2,236 inbred lines with the 1.4 million SNP markers allows us to determine the degree of kinship between them, and breeders will use the results to select the lines suitable for obtaining hybrids superior to existing ones, both quantitatively and qualitatively. New hybrids have already been generated at the Agricultural Research Station Turda by crossing genetically distant parental lines, i.e. lines belonging to different heterotic groups. In Figure 5 we present a sample of the diversity of the Romanian corn germplasm, together with a hybrid produced at Turda by crossing a parental line obtained from a local population and an elite line from breeding programs.
Figure 5. A sample of the diversity of the Romanian corn germplasm, together with a hybrid produced at Turda by crossing a parental line extracted from a local population and an elite line from breeding programs.
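The idea of crossing genetically distant lines from different heterotic groups can be sketched as a ranking of candidate pairs. This is an illustration only: the line names, group labels and distances below are hypothetical, and genetic distance is treated here as the sole criterion, whereas an actual choice of parents would also rest on field evaluation.

```python
from itertools import combinations


def rank_crosses(names, groups, dist, top=3):
    """List candidate crosses between lines of different groups, most distant first.

    names  : line identifiers
    groups : heterotic group label of each line (same order as names)
    dist   : square matrix of pairwise genetic distances
    """
    candidates = [
        (names[i], names[j], dist[i][j])
        for i, j in combinations(range(len(names)), 2)
        if groups[i] != groups[j]            # skip pairs within the same group
    ]
    candidates.sort(key=lambda pair: pair[2], reverse=True)
    return candidates[:top]


if __name__ == "__main__":
    names = ["lineA", "lineB", "lineC"]      # hypothetical lines
    groups = [0, 0, 1]                       # hypothetical heterotic groups
    dist = [[0.0, 0.1, 0.6],
            [0.1, 0.0, 0.4],
            [0.6, 0.4, 0.0]]
    for parent1, parent2, d in rank_crosses(names, groups, dist):
        print(parent1, "x", parent2, d)
```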
During the 2020 phase of the project, we also addressed the diversity of the mitochondrial and chloroplast genomes, as a complement to the nuclear genome. The two organelles provide important information on the diversity of the cytoplasm in the 2,236 inbred lines, because they are inherited only through the maternal line. We performed these analyses for the set of inbred lines from each of the Agricultural Research Stations in Romania (Turda, Șimnic, Lovrin), INCDA Fundulea, as well as the Plant Gene Bank from Suceava (as keeper of the germplasm from SCDA Suceava and SCDA Podu Iloaiei).
We present the results broken down by location, nuclear, mitochondrial and chloroplastic markers in the form of pdf annexes to the final report sent to UEFISCDI.
The number of SNP markers used for mitochondria is 80, while 23 were identified for chloroplast, proportional to the size of the genomes of the two organelles.
Figure 6 shows, as an example of the existing genetic variation in the organelles (here, mitochondria), the 3D representation of the PCA results for the ~450 inbred lines originating at the National Institute for Agriculture (Fundulea).
Figure 6. The first 3 axes from the PCA of the mitochondrial genetic diversity in the inbred lines from Fundulea. The international standards are colored as follows: B73 = fuchsia, C103_EN = aqua, CO255 = black, D105 = gray, DEM = brown, F1852Nr = chartreuse, FV2 = coral, K1080 = darkgreen, L03Bery = plum, M4 = pink, MO17 = indigo, W153RNeall = gold, OH43_RO = salmon.
The genetic diversity of the chloroplast is severely reduced compared to that of the mitochondria (Figure 7).
Figure 7. Same as for Figure 6 but plotting the genetic diversity of the chloroplast for the ~ 450 inbred lines. Strikingly, a significant number of those inbred lines have an identical chloroplast (as probed by the 23 informative SNP markers).
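The observation that many lines carry an identical chloroplast can be checked by counting how many lines share each haplotype across the informative SNPs. The sketch below assumes the alleles are available as one sequence per line. The line names and alleles are hypothetical toy data, and three SNPs stand in for the 23 used in the project.

```python
from collections import Counter


def haplotype_counts(genotypes):
    """Count how many lines share each haplotype.

    genotypes maps a line name to its sequence of alleles at the organelle SNPs.
    """
    return Counter(tuple(alleles) for alleles in genotypes.values())


if __name__ == "__main__":
    toy = {                                  # hypothetical lines and alleles
        "line1": "ACG",
        "line2": "ACG",
        "line3": "ACG",
        "line4": "ATG",
    }
    counts = haplotype_counts(toy)
    haplotype, n_lines = counts.most_common(1)[0]
    print("".join(haplotype), n_lines, "of", len(toy), "lines")
```

A small number of distinct haplotypes, with one of them covering most of the lines, is what reduced chloroplast diversity looks like in such a count.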
The genotyping of all the inbred lines of Romania is a very important starting point for the future breeding programs implemented nationwide (and beyond). In order to have a comprehensive picture of the existing corn gene pool in our country and of its potential in the economy / agriculture, it is necessary to complement the studies on inbred lines with studies on the local populations from which they originate. In this context, regarding the two manuscripts proposed to be sent for publication before the end of the project, we decided to wait for the results to be corroborated with those of another project we coordinate, which aims at genotyping 480 local populations from Romania by the same technique. The processing of those data has already started, based on the bioinformatics analysis pipeline developed in this project. The two manuscripts resulting from the corroboration of the data will focus on (i) nuclear diversity and (ii) cytoplasmic diversity, respectively, the latter by studying the genomes of the two cellular organelles.
Here, we present a sample of the genetic diversity of the local populations (Figure 8), as unpublished results from our ongoing project, showing a clear differentiation of the regions where hybrid seed has not yet entered and the Romanian germplasm is still pristine (where peasants pass on their seed "from father to son", as mentioned above).
Figure 8. Graphical representation of the Q matrix generated by ADMIXTURE for the 480 local populations from Romania, showing the clear separation of gene pools from the Romanian provinces. In yellow, the gene pool that is specific to higher lands/mountainous regions, like the springs of the Iza River in the North and the Mures River further South, two locations with known "from father to son" passing of seed.