Of the top motifs showed similarity only to known E. coli or Corynebacteria motifs. Within these top motifs, we were able to identify four of the nine known Mtb motifs (DosR, IdeR, KstR, and ZurB). As described above, the KstR motif shows a much stronger signal, in terms of both conservation and information content, than any of the other motifs (top of the ranked conservation list, Table 7). Based on the distribution of highly conserved predicted KstR motif instances across the genome, we predict a more general role for KstR in lipid metabolism. We see KstR motif instances near many lipid genes not related to cholesterol degradation, supporting the view that KstR is a more general lipid regulator controlling a large regulon [36].

One of the most interesting new motif candidates to emerge from our analysis is a conserved palindromic motif consisting of a highly conserved TAC…GTA separated by 6 bp of less well conserved sequence (marked with an X in Table 7). It is found in clusters of 2-3 closely spaced sites upstream of several genes related to fatty acid metabolism (Figure 6). There is a cluster of 3 evenly spaced sites upstream of Rv3229c (linoleoyl-CoA desaturase), a cluster of 2 sites upstream of the adjacent Rv3230c (oxidoreductase), and a cluster of 3 sites upstream of Rv2524c (fatty acid synthase). This is the second highest-scoring new motif identified (Table 7). It is also one of the top motifs associated with the clusters of genes upregulated under saturated fatty acid conditions (specifically palmitate).

Few transcription factor binding motifs have been identified in Mtb. Transcription factors for which binding motifs have been identified include KstR [36], DosR [67], IdeR

Conclusion
To better understand Mtb, we performed a comparative analysis of 31 organisms from the Tuberculosis Database.
We studied the evolution of protein families and metabolic pathways, looked for proteins with evidence

McGuire et al. BMC Genomics 2012, 13:120 http://www.biomedcentral.com/1471-2164/13/

Figure 5 New predicted RNAs. a) An example of a new predicted RNA. This is RNA2 in Table 6. The figure shows a screenshot from the GenomeView Browser [64]. The light blue bars show the coding regions (Rv1230c and Rv1231); the tan bar shows the conserved region predicted by Gumby [65]; and the green bar shows the region predicted to fold by Evofold [66]. The yellow and green plots in the center show the RNA-seq data. Green signifies reads from the negative strand, and yellow shows the total reads (positive and negative strands). The multiple alignment is shown at the bottom (darker grey signifies a higher degree of conservation; red signifies no alignment at that position). This predicted RNA region is conserved through M. avium. The rulers at the top show the gene structure. Small red squares show where stop codons are present in all six reading frames, indicating that this intergenic region is unlikely to be a protein-coding region missed in the annotation. b) Northern blots validating four of the new predicted small RNAs (RNA1, RNA2, RNA3, and RNA9 in Table 6).

Table 6 Top 12 predicted RNAs, ranked by their RPKM score
Columns: RNA id; Conserved region in Mtb H37Rv^1; Region in M. smegmatis^3; # reads; RPKM; Evofold?^2

# reads    RPKM
1013       69526
567        30071
111        9251
55         2139
81         2055
174        6767
517        9280
83         6917
502        24405
58         1829
375        18231
332        9684

Evofold?^2: Y Y Y Y Y
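
Table 6 ranks predicted RNAs by RPKM (reads per kilobase per million mapped reads). As a minimal sketch of how such a score is conventionally computed and used for ranking, the Python below applies the standard RPKM definition. It is not the authors' pipeline: the function names, region names, region lengths, and the total mapped read count are all hypothetical, since the source does not give the region lengths or library size behind the table values.

```python
from typing import List, Tuple


def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    # reads / (length / 1e3) / (total / 1e6) == reads * 1e9 / (length * total)
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def rank_by_rpkm(
    regions: List[Tuple[str, int, int]], total_mapped_reads: int
) -> List[Tuple[str, float]]:
    """Score (name, read_count, length_bp) tuples and sort by RPKM, highest first."""
    scored = [
        (name, rpkm(reads, length, total_mapped_reads))
        for name, reads, length in regions
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical regions: (name, read count, length in bp). Not data from Table 6.
    example_regions = [
        ("regionB", 1000, 400),
        ("regionA", 500, 100),
        ("regionC", 50, 50),
    ]
    for name, score in rank_by_rpkm(example_regions, total_mapped_reads=2_000_000):
        print(f"{name}\t{score:.1f}")
```

Because RPKM normalizes by region length, a short region with fewer reads can outrank a longer region with more reads, which is consistent with the read counts in Table 6 not following the same order as the RPKM values.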