Tutorial
The PathoYeastract (Pathogenic Yeast Search for Transcriptional Regulators And Consensus Tracking; http://yeastract-plus.org/pathoyeastract/) database presently contains almost 37,000 regulatory associations between transcription factors (TFs) and target genes in Candida albicans and C. glabrata, based on 747 bibliographic references. Each regulation has been annotated manually, after examination of the relevant references. The database also contains the description of 107 specific DNA binding sites shared among 100 C. albicans and 34 C. glabrata characterized TFs. Further information about each yeast gene was obtained from the Candida Genome Database (CGD), YEASTRACT and the Gene Ontology (GO) Consortium.

PathoYeastract assists with two major tasks: prediction of gene transcriptional regulation, and global expression analysis according to the Candida transcription networks described in the literature.

This tutorial presents two case studies that exemplify the use of different query options and utilities. Various other ways to exploit the available options and utilities are possible.
- Example 1: Identification of the documented and potential regulatory associations for an ORF/Gene
- Example 2: Gene expression analysis based on regulatory associations

Throughout the PathoYeastract database and this tutorial, the regulatory associations are denominated "Documented" or "Potential".
The accuracy and updating of the information gathered, curated and inserted in this database is crucial to PathoYeastract users. Thus, we will value any contribution from the yeast community to achieve this goal.

The results presented in this Tutorial were computed on June 30, 2016. However, due to subsequent updates, the current ranking may differ from the one presented.

Example 1: Identification of the documented and potential regulatory associations for an ORF/Gene

The functional analysis of an ORF or gene can be guided through the identification of its documented and potential transcription factors (TFs). This example describes one of the possible ways to explore the regulatory associations for the C. glabrata ORF CAGL0G08624g, encoding a Drug:H+ Antiporter of the Major Facilitator Superfamily which remained uncharacterized until recently, using various queries and utilities provided by YEASTRACT.
1.1 - Search for Documented Transcription Factors (TFs)
According to the CGD description of Pdr1, this regulator is involved in the control of multidrug resistance, with special importance in the clinical acquisition of azole resistance. It is therefore of interest to examine the possible link of ORF CAGL0G08624g to these biological processes. Indeed, in a recent study CAGL0G08624g was shown to confer azole drug resistance, being involved in azole drug extrusion from within C. glabrata cells (1), and to be consistently up-regulated in clotrimazole-resistant clinical isolates (2). Given its homology to the S. cerevisiae QDR2 gene, this ORF was named C. glabrata QDR2.
1.2 - Search for Potential Transcription Factors (TFs)
The display of potential TFs on the image can be controlled by un-checking their respective box in the color palette below the image and pressing the Redisplay button. The color palette displays the color for only those TFs for which binding sites are found in the promoter region of the given gene(s). A close observation of the image, looking for the TF which is the documented regulator of CAGL0G08624g (i.e., Pdr1), reveals that a binding site for Pdr1 is indeed present, although at a relatively long distance from the START codon. In addition, binding sites for the TFs Yap1, Yap6, Yap7 and Amt1 can be found in the promoter region of the QDR2 gene, suggesting that they may play a role in the regulation of QDR2 expression. The predicted functions of these TFs, as regulators of oxidative stress, osmotic stress response, iron-cluster biogenesis and metallothionein genes, respectively, further hint at a possible role for QDR2 in these processes.
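The idea of a potential association, a TF binding site found in a gene's promoter at some distance from the START codon, can be illustrated with a minimal sketch. This is not PathoYeastract's own implementation, which the tutorial does not describe. The function names, the promoter sequence and the consensus `TGACGY` are all hypothetical and chosen only for illustration. The sketch scans both strands for a consensus written in IUPAC codes and reports each hit as a position relative to the START codon.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif):
    """Return the reverse complement of a motif written in IUPAC codes."""
    return motif.translate(COMPLEMENT)[::-1]


def find_sites(promoter, consensus):
    """Return sorted (position, strand) hits for a consensus in a promoter.

    `promoter` is assumed to be the upstream sequence ending just before
    the START codon, so its last base sits at position -1.
    """
    promoter = promoter.upper()
    hits = set()
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead makes overlapping occurrences visible too.
        pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in motif) + "))")
        for match in pattern.finditer(promoter):
            hits.add((match.start() - len(promoter), strand))
    return sorted(hits)


# Hypothetical promoter fragment and consensus, for illustration only.
print(find_sites("CCTGACGTCCAAAA", "TGACGY"))
```

A palindromic consensus is reported once per strand at the same position, so callers that only care about site locations may want to collapse the two strands.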
1.3 - Search for Genes
For example, for the C. albicans TF Tac1, the master regulator of drug resistance, the result obtained is shown in Table 1.
Table 1 - Documented target genes of the Candida albicans Tac1 transcription factor.

Interestingly, besides the most commonly known targets of Tac1, the drug efflux pump encoding genes CDR1 and CDR2, it is possible to observe that other genes whose role is apparently unrelated to drug resistance are also Tac1 targets. For example, Adh1 and Snz1 are related to central carbon metabolism and vitamin B synthesis, respectively. This observation raises the possibility that either Tac1 plays additional roles in C. albicans biology or Adh1 and Snz1 contribute somehow to drug tolerance.

Example 2: Gene expression analysis based on regulatory associations

PathoYeastract provides tools for the classification and grouping of large lists of genes of interest, such as those found to be up- or down-regulated under a specific environmental cue or genetic mutation, as suggested by inspection of genome-wide expression data. These analyses are based on known or algorithmically identified potential regulatory associations, deposited in the PathoYeastract database, or on shared Gene Ontology (GO) terms.
2.2 - Rank by Gene Ontology (GO)
Based on the % of genes associated with each GO term, the first hit is "oxidation-reduction process". From a GO term enrichment perspective, however, the GO term "hydrogen peroxide catabolic process" shows up together with "oxidation-reduction process" as the terms with the lowest p-values. Both criteria nevertheless appear to favor the idea that selenium induces oxidative imbalance in C. glabrata cells.
2.3 - Rank by Transcription Factor
TFs predicted to regulate this transcriptional response can be ranked by the % of genes in the list associated with them. Using such a ranking, the TF Pdr1 comes out on top, regulating 30% of the gene set, while ORF CAGL0G08844g regulates 16% of the gene list. The fact that Pdr1 regulates the most genes in response to an azole drug is an expected result, given its known role in this process (5). The second most highly ranked TF is somewhat more surprising, since its closest S. cerevisiae homolog is the TF Asg1, characterized as involved in the response to stress, particularly cell wall related stress. The appearance of this TF as the regulator of a large fraction of the clotrimazole-responsive genes suggests that either this azole drug induces cell wall stress, or the function of the uncharacterized C. glabrata TF encoded by ORF CAGL0G08844g is divergent from that of its homolog in S. cerevisiae.

When ranking by statistical significance of regulations, the TF score is given by a p-value denoting the overrepresentation of regulations of the given TF targeting genes in the list of interest, relative to the regulations of that TF targeting genes in the whole YEASTRACT database. The p-value denotes the probability that the TF regulates at least the number of genes found to be regulated in the list of interest if we were to sample a set of genes of the same size as the list of interest from all the genes in the YEASTRACT database. This probability is modeled by a hypergeometric distribution, and the p-value is finally subjected to a Bonferroni correction for multiple testing.

Below is the output of the utility "Rank by TF" based on regulation enrichment for the clotrimazole dataset, using the default filtering options.
In Table 3, the first column indicates the name of the TF, the second column the % of genes in the list targeted by the TF, the third column the % resulting from the ratio between the number of genes in the list targeted by the TF and the number of genes targeted by the TF in the whole YEASTRACT database, the fourth column the enrichment p-value, and the fifth and final column the genes from the list of interest targeted by the TF.
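The two percentage columns described above can be computed from nothing more than the gene list and the set of genes a TF targets in the database. The following is a minimal sketch under that assumption. The function name and the gene identifiers in the usage line are hypothetical, and the sketch is not the utility's actual code.

```python
def coverage_columns(gene_list, tf_targets):
    """Return the two percentage columns and the shared genes for one TF.

    gene_list  : genes in the list of interest
    tf_targets : all genes targeted by the TF in the whole database
    """
    genes = set(gene_list)
    targets = set(tf_targets)
    shared = sorted(genes & targets)
    # Column 2: share of the list of interest that the TF targets.
    pct_of_list = 100.0 * len(shared) / len(genes) if genes else 0.0
    # Column 3: share of all the TF's targets that appear in the list.
    pct_of_tf_targets = 100.0 * len(shared) / len(targets) if targets else 0.0
    return pct_of_list, pct_of_tf_targets, shared


# Hypothetical identifiers, for illustration only.
print(coverage_columns(["g1", "g2", "g3", "g4", "g5"], ["g1", "g2", "g9", "g10"]))
```

The two percentages answer different questions: a TF with very many targets in the database can cover a large share of the list while only a small share of its own targets is present.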
In this case, the enrichment-based ranking of transcription factors reports essentially the same two TFs as the highest-ranking ones.
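The enrichment p-value described in this section, a hypergeometric tail probability followed by a Bonferroni correction, can be sketched as follows. This is an illustrative reading of that description rather than the database's own code. The function and parameter names are assumptions, and the numbers in the usage lines are invented.

```python
from math import comb


def enrichment_p_value(total_genes, tf_targets_total, list_size, tf_targets_in_list):
    """Probability of drawing at least `tf_targets_in_list` TF targets.

    Models sampling `list_size` genes without replacement from
    `total_genes`, of which `tf_targets_total` are targets of the TF
    (upper tail of a hypergeometric distribution).
    """
    upper = min(tf_targets_total, list_size)
    favourable = sum(
        comb(tf_targets_total, i) * comb(total_genes - tf_targets_total, list_size - i)
        for i in range(tf_targets_in_list, upper + 1)
    )
    return favourable / comb(total_genes, list_size)


def bonferroni(p_value, n_tests):
    """Bonferroni correction: scale by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Invented numbers: 10 genes in total, the TF targets 4 of them,
# and the list of interest holds 3 genes, 2 of which are targets.
p = enrichment_p_value(10, 4, 3, 2)
print(p, bonferroni(p, n_tests=5))
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free. For genome-sized inputs, a statistics library's hypergeometric survival function would be the more usual choice.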
References