Compositional domains in Cyanobacteria

DNA sequences are formed by patches, or domains, of different nucleotide composition. In simple, homogeneous sequences, domains can be identified by eye; however, most DNA sequences show complex compositional heterogeneity. To analyse such non-stationary sequence structures, we used a computationally efficient segmentation method based on the Jensen–Shannon entropic divergence (Bernaola-Galván et al., 2012; Oliver et al., 1999).
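To make the divergence criterion concrete, the following Python sketch computes the length-weighted Jensen–Shannon divergence between the two parts of a sequence at every candidate cut and returns the cut that maximises it. It is an illustration only, not the SEGMENT implementation: the function names are hypothetical, the scan is a naive quadratic one, and the statistical significance test that decides whether a cut is accepted is omitted.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def js_divergence(left, right):
    """Length-weighted Jensen-Shannon divergence between two subsequences."""
    n_left, n_right = len(left), len(right)
    n = n_left + n_right
    return (
        shannon_entropy(left + right)
        - (n_left / n) * shannon_entropy(left)
        - (n_right / n) * shannon_entropy(right)
    )


def best_cut(seq):
    """Return (position, divergence) for the cut that maximises the divergence.

    Naive O(n^2) scan, kept simple for readability (hypothetical helper).
    """
    best_pos, best_d = None, -1.0
    for i in range(1, len(seq)):
        d = js_divergence(seq[:i], seq[i:])
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d
```

For example, `best_cut("A" * 50 + "G" * 50)` returns position 50 with a divergence of 1 bit, because each side of that cut is perfectly homogeneous.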

We divided each DNA sequence into compositionally homogeneous domains by iterating a local optimization procedure at a given statistical significance level. Once a sequence is partitioned into domains, a global measure of sequence compositional complexity (SCC), which accounts for both the sizes and the compositional biases of all the domains in the sequence, can be derived (Román-Roldán et al., 1998). SCC is computed as a function of the significance level, which provides a multiscale view of sequence complexity. Four DNA alphabets, or mapping rules (Bernaola-Galván et al., 1999), were used: {A,T,C,G}, {S,W}, {R,Y} and {K,M}.
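The mapping rules and the complexity measure can be sketched in Python as follows. The two-letter alphabets use the standard IUPAC groupings (S = C or G, W = A or T, R = A or G, Y = C or T, K = G or T, M = A or C). The `scc` function assumes that SCC is taken as the generalized Jensen–Shannon divergence among the domains, that is, the entropy of the whole sequence minus the length-weighted mean of the domain entropies; see Román-Roldán et al. (1998) for the exact definition. All function names are hypothetical, and the cut positions are supplied by the caller rather than found by segmentation.

```python
import math
from collections import Counter

# Standard IUPAC groupings of the four nucleotides.
ALPHABETS = {
    "ATCG": {"A": "A", "T": "T", "C": "C", "G": "G"},
    "SW": {"C": "S", "G": "S", "A": "W", "T": "W"},  # strong / weak
    "RY": {"A": "R", "G": "R", "C": "Y", "T": "Y"},  # purine / pyrimidine
    "KM": {"G": "K", "T": "K", "A": "M", "C": "M"},  # keto / amino
}


def recode(seq, alphabet):
    """Map a DNA string onto one of the alphabets; other symbols are dropped."""
    table = ALPHABETS[alphabet]
    return "".join(table[b] for b in seq.upper() if b in table)


def entropy(seq):
    """Shannon entropy (in bits) of the symbol frequencies in seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def scc(seq, cuts):
    """Assumed SCC: total entropy minus length-weighted domain entropies.

    cuts is a sorted list of interior cut positions; domains are seq[a:b].
    """
    n = len(seq)
    bounds = [0, *cuts, n]
    within = sum(
        ((b - a) / n) * entropy(seq[a:b]) for a, b in zip(bounds, bounds[1:])
    )
    return entropy(seq) - within
```

With the toy sequence `"AAAATTTTCCCCGGGG"` and cuts at 4, 8 and 12, `scc` gives 2 bits in the four-letter alphabet; after `recode(..., "SW")` the same sequence has a single boundary at position 8 and an SCC of 1 bit, which shows how the choice of alphabet changes the domain structure that is detected.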

The links below open UCSC Genome Browser maps of the compositional domains obtained at the 0.95 significance level in 91 species of Cyanobacteria.

Note that, once in the UCSC Genome Browser, you can obtain a complete list of segment coordinates in plain text by selecting Tools > Table Browser.
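If the segment table is exported in BED format (one of the Table Browser output options), the coordinates can be read with a few lines of Python. The sketch below assumes that the first three columns are sequence name, start and end, with 0-based, half-open coordinates as in standard BED; the function names are hypothetical, and the actual column layout of the exported table should be checked before use.

```python
def parse_bed(text):
    """Return a list of (chrom, start, end) tuples from BED-formatted text."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()  # BED fields are tab- or space-separated
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        segments.append((chrom, start, end))
    return segments


def segment_lengths(segments):
    """Length of each segment; end - start is exact for half-open intervals."""
    return [end - start for _, start, end in segments]
```

Passing the downloaded text to `parse_bed` and then to `segment_lengths` gives the domain-size distribution for a genome, which is a natural first summary of the segmentation.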

References

Bernaola-Galván P, Oliver JL, Hackenberg M, Coronado AV, Ivanov PC, Carpena P. 2012. Segmentation of time series with long-range fractal correlations. Eur Phys J B 85:211. doi:10.1140/epjb/e2012-20969-5
Bernaola-Galván P, Oliver JL, Román-Roldán R. 1999. Decomposition of DNA sequence complexity. Phys Rev Lett 83:3336–3339.
Oliver JL, Román-Roldán R, Pérez J, Bernaola-Galván P. 1999. SEGMENT: identifying compositional domains in DNA sequences. Bioinformatics 15:974–979.
Román-Roldán R, Bernaola-Galván P, Oliver JL. 1998. Sequence compositional complexity of DNA through an entropic segmentation method. Phys Rev Lett 80:1344–1347.

Genome maps of compositional segments at UCSC:

Acaryochloris marina MBIC11017
Anabaena cylindrica PCC 7122 c
Anabaena sp. 90 Chromosome chANA01
Anabaena sp. WA102
Arthrospira platensis NIES 39
Arthrospira platensis YZ
Arthrospira sp PCC 8005
Aulosira laxa NIES 50
Calothrix sp. 336 3
Calothrix sp. NIES 2098
Calothrix sp. NIES 2100
Calothrix sp. NIES 3974
Calothrix sp. NIES 4071
Calothrix sp. NIES 4101
Calothrix sp. PCC 6303
Calothrix sp. PCC 7507
Candidatus Atelocyanobacterium thalassa isolate ALOHA
Chamaesiphon minutus PCC 6605
Chondrocystis sp NIES 4102
Chroococcidiopsis thermalis PCC 7203
Crinalium epipsammum PCC 9333
Cyanobacterium aponinum PCC 10605
Cyanobacterium stanieri PCC 7202
Cyanobium gracile PCC 6307
Cyanothece sp. ATCC 51142
Cyanothece sp. PCC 7424
Cyanothece sp. PCC 7425
Cyanothece sp. PCC 7822
Cyanothece sp. PCC 8801
Cyanothece sp. PCC 8802
Cylindrospermum stagnale PCC 7417
Dactylococcopsis salina PCC 8305
Fischerella sp NIES 3754
Geitlerinema sp. PCC 7407
Geminocystis sp. NIES 3708
Geminocystis sp. NIES 3709
Gloeobacter kilaueensis JS1
Gloeobacter violaceus PCC 7421
Gloeocapsa sp. PCC 7428
Halothece sp. PCC 7418
Leptolyngbya boryana dg5
Leptolyngbya sp. O 77
Leptolyngbya sp. PCC 7376
Microcoleus sp. PCC 7113
Microcystis aeruginosa NIES 2481
Microcystis aeruginosa NIES 843
Microcystis panniformis FACHB 1757
Moorea producens JHB
Moorea producens PAL 8 15 08 1
Nodularia spumigena CCY9414
Nostoc azollae 0708
Nostoc piscinale CENA21
Nostoc punctiforme PCC 73102
Nostoc sp. PCC 7107
Nostoc sp. PCC 7120
Nostoc sp. PCC 7524
Oscillatoria acuminata PCC 6304
Oscillatoria nigro viridis PCC 7112
Pleurocapsa sp. PCC 7327
Prochlorococcus marinus str. AS9601
Prochlorococcus marinus str. MIT 9303
Prochlorococcus marinus str MIT 9312
Prochlorococcus marinus str MIT 9313
Prochlorococcus marinus str. MIT 9515
Prochlorococcus marinus str. NATL1A
Prochlorococcus marinus str. NATL2A
Prochlorococcus marinus subsp marinus str CCMP1375
Prochlorococcus marinus subsp. pastoris str. CCMP1986
Prochlorococcus sp RS01
Pseudanabaena sp. PCC 7367
Rivularia sp. PCC 7116
Stanieria cyanosphaera PCC 7437
Synechococcus elongatus PCC 6301
Synechococcus elongatus PCC 7942
Synechococcus lividus PCC 6715
Synechococcus sp. CC9311
Synechococcus sp. CC9605
Synechococcus sp. CC9902
Synechococcus sp JA 2 3B a 2 13
Synechococcus sp JA 3 3Ab
Synechococcus sp PCC 6312
Synechococcus sp. PCC 7002
Synechococcus sp. PCC 7502
Synechococcus sp. RCC307
Synechococcus sp. WH 8020
Synechococcus sp. WH 8102
Synechococcus sp. WH 8103
Synechococcus sp. WH 8109
Synechocystis sp. PCC 6803
Thermosynechococcus elongatus BP 1
Thermosynechococcus sp. NK55a
Trichodesmium erythraeum IMS101
Trichormus variabilis ATCC 29413