N.C. Yeastract

 

Tutorial

The N.C. Yeastract (Non-Conventional Yeast Search for Transcriptional Regulators And Consensus Tracking; http://yeastract-plus.org/ncyeastract/) database presently contains regulatory associations between transcription factors and target genes in biotechnologically relevant non-conventional yeast species, based on bibliographic references. Each regulation has been annotated manually, after examination of the relevant references. Further information about each yeast gene was obtained from the respective GenBank genome assembly, and Gene Ontology terms were either annotated using the Blast2GO software [1] or retrieved from the fungal genomes browser Ensembl Fungi [2]. Additionally, orthology associations between all the non-conventional yeast species added to the database and S. cerevisiae were inferred with the software OrthoFinder [3] and Proteinortho [4].

The N.C. Yeastract database assists in the prediction of gene transcriptional regulation and in global expression analysis in yeasts considered biotechnologically relevant, according to the transcription networks described in the literature. This tutorial presents three case studies exemplifying the use of different query options and utilities. Various other ways to exploit the available options and utilities are possible.

- Example 1: Gene expression analysis based on regulatory associations

- Example 2: Prediction of transcription factor DNA binding sites based on promoter analysis

- Example 3: Identification of potential regulatory associations based on cross species network comparison

Throughout the N.C. Yeastract database and this tutorial, regulatory associations are designated "Documented" or "Potential":

  • A documented association between a transcription factor (TF) and a target gene is supported by published data showing at least one of the following types of experimental evidence: i) a change in the expression of the target gene due to a deletion (or mutation) in the gene encoding the transcription factor, as shown by detailed gene-by-gene analysis or genome-wide expression analysis; ii) binding of the transcription factor to the promoter region of the target gene, as supported by band-shift, foot-printing or Chromatin ImmunoPrecipitation (ChIP) assays. The user is therefore urged to check the literature references provided in the database to fully understand the nature of the evidence underlying the identified regulatory associations.
  • A potential association between a TF and a target gene is based on the occurrence of the TF binding site in the promoter region of the target gene. The binding sites associated with each TF in this database are supported by published experimental evidence for the binding of the TF to the specific nucleotide sequence (data coming from foot-printing or ChIP assays). Again, the user is urged to check the literature references provided in the database.
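
The distinction between the two labels can be summarized as a small decision rule. The following Python sketch is only an illustration of the definitions above; the `Evidence` class, its field names and the `classify` function are hypothetical and are not part of the N.C. Yeastract database or its interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical record of what is known about one TF -> target gene pair."""
    expression_change: bool = False  # target expression changes when the TF gene is deleted or mutated
    dna_binding: bool = False        # band-shift, foot-printing or ChIP shows binding to the promoter
    motif_in_promoter: bool = False  # a known TF binding site occurs in the target promoter


def classify(evidence: Evidence) -> str:
    """Apply the tutorial's definitions: experimental evidence wins over motif occurrence."""
    if evidence.expression_change or evidence.dna_binding:
        return "Documented"
    if evidence.motif_in_promoter:
        return "Potential"
    return "No association"


print(classify(Evidence(dna_binding=True, motif_in_promoter=True)))
print(classify(Evidence(motif_in_promoter=True)))
```

Note that a pair with both experimental evidence and a promoter motif is reported here as "Documented" only; whether the database lists such a pair under both labels is not stated in this tutorial.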

The accuracy and updating of the information gathered, curated and inserted in this database is crucial to N.C. Yeastract users. Thus, we will value any contribution from the yeast community to achieve this goal.

The results presented in this tutorial were computed in July 2019. Due to subsequent database updates, the current ranking may differ from the one presented here.




Example 1: Gene expression analysis based on regulatory associations

N.C. Yeastract provides tools for the classification and grouping of large lists of genes of interest, such as those found up- or down-regulated under a specific environmental cue or genetic mutation, as suggested by genome-wide expression data inspection. These analyses are based on known or algorithmically identified potential regulatory associations, deposited in the N.C. Yeastract database, or on shared Gene Ontology (GO) terms.



1.1 - Rank by Gene Ontology (GO)

The grouping of genes based on shared GO terms is a feature common to several gene expression analysis software packages and is also implemented in N.C. Yeastract. To exemplify this utility, we used the list of genes whose expression was seen to change in Z. bailii cells exposed to acetic acid in a ZbHaa1-dependent manner [5]. The grouping of this gene list, based on the Molecular Function ontology (Biological Process or Cellular Component can also be selected), and considering only the GO terms associated with more than 5% of the genes in the list, is shown in the following table:

| GO ID | GO term | Depth level | % in user set | % in Z. bailii | p-value | Genes/ORF |
|---|---|---|---|---|---|---|
| GO:0019239 | deaminase activity | 4 | 1.43% | 20.00% | 0.001779248635010 | ZbMMF1 |
| GO:0043167 | ion binding | 3 | 1.43% | 25.00% | 0.001077029880752 | ZbMMF1 |
| GO:0003729 | mRNA binding | 6 | 5.71% | 23.53% | 0.000002206013848 | ZbHSP26 ZbHSP26-1 ZbHSP26-2 ZbHSP26-3 |
| GO:0042802 | identical protein binding | 4 | 8.57% | 15.79% | 0.000000576479486 | ZbHSP26 ZbPDR16 ZbHSP26-1 ZbHSP26-2 ZbHSP42 ZbHSP26-3 |
| GO:0051082 | unfolded protein binding | 4 | 8.57% | 13.33% | 0.000001922082332 | ZbHSP26 ZbSSA3 ZbHSP26-1 ZbHSP26-2 ZbHSP42 ZbHSP26-3 |
| GO:0019901 | protein kinase binding | 6 | 1.43% | 10.00% | 0.007661369733558 | ZbPCL10 |
| GO:0016811 | hydrolase activity, acting on carbon-nitrogen (but not peptide) bonds, in linear amides | 5 | 1.43% | 100.00% | 0.000000000000000 | ZbYPC1 |
| GO:0000774 | adenyl-nucleotide exchange factor activity | 6 | 1.43% | 33.33% | 0.000543302076111 | ZbSSE1 |
| GO:0005516 | calmodulin binding | 4 | 1.43% | 33.33% | 0.000543302076111 | ZbSSE1 |
| GO:0005524 | ATP binding | 4 | 14.29% | 2.48% | 0.019151658965812 | ZbSSE1 ZbHRK1 ZbHSP104 ZbPFK27 ZbSSA3 ZbSKS1 ZbHSP78 ZbSAK1 ZbPDR12 ZbPDR12-1 |
| GO:0008134 | transcription factor binding | 4 | 2.86% | 11.76% | 0.001433099847584 | ZbSSE1 ZbHSP104 |
| GO:0042277 | peptide binding | 4 | 2.86% | 66.67% | 0.000002417203689 | ZbSSE1 ZbSSA3 |
| GO:0004672 | protein kinase activity | 4 | 4.29% | 4.23% | 0.015432283413036 | ZbHRK1 ZbSKS1 ZbSAK1 |
| GO:0005515 | protein binding | 3 | 10.00% | 1.04% | 0.713571511182516 | ZbYGP1 ZbBTN2 ZbMTH1 ZbATG8 ZbFKH2 ZbSAP30 ZBIST_5070 |
| GO:0003676 | nucleic acid binding | 4 | 7.14% | 2.94% | 0.027349925087602 | ZbRIE1 ZbNRG2 ZbSTP3 ZbMSN4 ZbCOM2 |
| GO:0005216 | ion channel activity | 5 | 1.43% | 14.29% | 0.003671026315418 | ZbYRO2 |
| GO:0009055 | electron transfer activity | 4 | 1.43% | 5.56% | 0.024285976283325 | ZBIST_1208 |
| GO:0015035 | protein disulfide oxidoreductase activity | 6 | 1.43% | 10.00% | 0.007661369733558 | ZBIST_1208 |
| GO:0016491 | oxidoreductase activity | 3 | 8.57% | 5.50% | 0.000633028456509 | ZbYJR096W ZBIST_2256 ZbMCR1-1 ZBIST_3873 ZbSUR2 ZbYIR035C |
| GO:0003674 | molecular_function | 1 | 4.29% | 1.75% | 0.202967195739200 | ZbECM13 ZBIST_2052 ZBIST_3490 |
| GO:0003873 | 6-phosphofructo-2-kinase activity | 7 | 1.43% | 25.00% | 0.001077029880752 | ZbPFK27 |
| GO:0005085 | guanyl-nucleotide exchange factor activity | 3 | 1.43% | 11.11% | 0.006183258145687 | ZbMUK1 |
| GO:0005096 | GTPase activator activity | 5 | 2.86% | 10.00% | 0.002333294529536 | ZbMUK1 ZbRGS2 |
| GO:0030295 | protein kinase activator activity | 6 | 1.43% | 25.00% | 0.001077029880752 | ZbMTH1 |
| GO:0031072 | heat shock protein binding | 4 | 1.43% | 7.14% | 0.014958554043250 | ZbSSA3 |
| GO:0042623 | ATPase activity, coupled | 9 | 1.43% | 14.29% | 0.003671026315418 | ZbSSA3 |
| GO:0044183 | protein folding chaperone | 2 | 1.43% | 12.50% | 0.004851741764326 | ZbSSA3 |
| GO:0051787 | misfolded protein binding | 4 | 1.43% | 10.00% | 0.007661369733558 | ZbSSA3 |
| GO:0008429 | phosphatidylethanolamine binding | 5 | 1.43% | 100.00% | 0.000000000000000 | ZbATG8 |
| GO:0031386 | protein tag | 2 | 1.43% | 100.00% | 0.000000000000000 | ZbATG8 |
| GO:0004620 | phospholipase activity | 6 | 1.43% | 25.00% | 0.001077029880752 | ZbEHT1 |
| GO:0034319 | alcohol O-butanoyltransferase activity | 8 | 1.43% | 100.00% | 0.000000000000000 | ZbEHT1 |
| GO:0034321 | alcohol O-octanoyltransferase activity | 8 | 1.43% | 100.00% | 0.000000000000000 | ZbEHT1 |
| GO:0034338 | short-chain carboxylesterase activity | 6 | 1.43% | 100.00% | 0.000000000000000 | ZbEHT1 |
| GO:0047372 | acylglycerol lipase activity | 6 | 1.43% | 25.00% | 0.001077029880752 | ZbEHT1 |
| GO:0008289 | lipid binding | 3 | 1.43% | 9.09% | 0.009281954535017 | ZbPDR16 |
| GO:0008525 | phosphatidylcholine transporter activity | 5 | 1.43% | 25.00% | 0.001077029880752 | ZbPDR16 |
| GO:0008526 | phosphatidylinositol transporter activity | 5 | 1.43% | 25.00% | 0.001077029880752 | ZbPDR16 |
| GO:0016787 | hydrolase activity | 3 | 2.86% | 3.33% | 0.047646499172559 | ZbYNL217W ZbSDT1 |
| GO:0000166 | nucleotide binding | 4 | 1.43% | 3.45% | 0.058581005005002 | ZBIST_2390 |
| GO:0046872 | metal ion binding | 5 | 2.86% | 2.56% | 0.089339950308084 | ZBIST_2390 ZbGPP1 |
| GO:0052856 | NADHX epimerase activity | 5 | 1.43% | 100.00% | 0.000000000000000 | ZBIST_2390 |
| GO:0052857 | NADPHX epimerase activity | 5 | 1.43% | 100.00% | 0.000000000000000 | ZBIST_2390 |
| GO:0000121 | glycerol-1-phosphatase activity | 7 | 1.43% | 100.00% | 0.000000000000000 | ZbGPP1 |
| GO:0050308 | sugar-phosphatase activity | 8 | 1.43% | 100.00% | 0.000000000000000 | ZbGPP1 |
| GO:0008721 | D-serine ammonia-lyase activity | 6 | 2.86% | 100.00% | 0.000000000000000 | ZBIST_2574 ZBIST_4826 |
| GO:0004861 | cyclin-dependent protein serine/threonine kinase inhibitor activity | 7 | 1.43% | 100.00% | 0.000000000000000 | ZBIST_2593 |
| GO:0001965 | G-protein alpha-subunit binding | 4 | 1.43% | 100.00% | 0.000000000000000 | ZbRGS2 |
| GO:0004115 | 3',5'-cyclic-AMP phosphodiesterase activity | 9 | 1.43% | 100.00% | 0.000000000000000 | ZbPDE1 |
| GO:0010309 | acireductone dioxygenase [iron(II)-requiring] activity | 6 | 1.43% | 100.00% | 0.000000000000000 | ZbADI1 |
| GO:0008934 | inositol monophosphate 1-phosphatase activity | 9 | 1.43% | 50.00% | 0.000182712161163 | ZbINM1 |
| GO:0005506 | iron ion binding | 7 | 1.43% | 6.25% | 0.019383387800211 | ZbSUR2 |
| GO:0003700 | DNA-binding transcription factor activity | 3 | 1.43% | 2.70% | 0.089717030957501 | ZbFKH2 |
| GO:0043565 | sequence-specific DNA binding | 6 | 1.43% | 3.57% | 0.055013522164445 | ZbFKH2 |
| GO:0030060 | L-malate dehydrogenase activity | 6 | 1.43% | 33.33% | 0.000543302076111 | ZBIST_4902 |
| GO:0042626 | ATPase activity, coupled to transmembrane movement of substances | 7 | 2.86% | 10.00% | 0.002333294529536 | ZbPDR12 ZbPDR12-1 |
| GO:0015103 | inorganic anion transmembrane transporter activity | 5 | 1.43% | 100.00% | 0.000000000000000 | ZbARR3 |
| GO:0015297 | antiporter activity | 6 | 1.43% | 16.67% | 0.002645402729163 | ZbARR3 |
| GO:0005199 | structural constituent of cell wall | 3 | 1.43% | 9.09% | 0.009281954535017 | ZbSED1 |
Figure 1 - Rank by Gene Ontology of the genes described to be regulated by ZbHaa1 in response to acetic acid stress.

Based on the percentage of genes associated with each GO term, the first hit is "inorganic anion transmembrane transporter activity". From a GO term enrichment perspective, however, "unfolded protein binding" and "identical protein binding" stand out among the terms with the lowest p-values. Both criteria appear to support the idea that acetic acid leads to protein denaturation and misfolding in Z. bailii cells, possibly as a result of intracellular acidification.
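
To make the two ranking criteria concrete, the sketch below recomputes the two percentage columns for "identical protein binding" and shows one common way to obtain an enrichment p-value. Several things here are assumptions rather than facts from this tutorial: the statistical test used by N.C. Yeastract is not stated, so a one-sided hypergeometric test is used purely as an illustration; the user set size of 70 genes and the 38 annotated Z. bailii genes are inferred from the percentages in the table (1.43% is 1/70, and 6/38 is 15.79%); and the genome size `GENOME_SIZE` is a hypothetical placeholder.

```python
from math import comb


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k annotated genes in a list of n
    genes drawn from a genome of N genes, K of which carry the GO term."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


USER_SET_SIZE = 70   # inferred from the table, not stated in the tutorial
GENOME_SIZE = 5000   # hypothetical placeholder, not taken from the database

in_user_set = 6      # genes in the list annotated with "identical protein binding"
in_species = 38      # inferred from 15.79% = 6 / 38

print(round(percent(in_user_set, USER_SET_SIZE), 2))  # "% in user set" column
print(round(percent(in_user_set, in_species), 2))     # "% in Z. bailii" column
print(hypergeom_pvalue(in_user_set, USER_SET_SIZE, in_species, GENOME_SIZE))
```

Because the test and the background gene count are assumptions, the p-value printed by this sketch is not expected to match the value shown in the table.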




Example 2: Prediction of transcription factor DNA binding sites based on promoter analysis

Comparative genomics allows the implementation of various approaches for finding regulatory elements. In N.C. Yeastract, the user can make use of the known transcription factor binding motif sequences of a given yeast species to infer regulatory associations. As an example, we use the "Promoter Analysis" tool in N.C. Yeastract/zbailii, considering the TF binding sites from S. cerevisiae, with the gene ZBIST_0509 (YGP1), described to be regulated by ZbHaa1 in Z. bailii and S. cerevisiae [6], and by Haa1 in S. cerevisiae [7], in response to acetic acid stress. The result is a table listing the transcription factors that have a binding sequence in the considered promoter regions (first 1000 bp upstream of the START codon). The table is organized into two main columns, "Uniquely in species" and "On all orthologs" (Figure 2). The former contains the transcription factors that have a putative binding site in the promoter of that gene in only one yeast species (for example, Abf1 has a putative binding site in the promoter of ZBIST_0509 from Z. bailii, but not in that of YGP1 from S. cerevisiae). The "On all orthologs" column contains the transcription factors that have a putative binding site in the promoter regions of all the orthologs of a given gene. Additionally, the user may visualize the putative binding sites located in the promoters of the queried genes and respective orthologs by selecting "Promoter" in the column "Find Binding Sites".

Gene/ORF: ZbYGP1/ZBIST_0509

TFs binding to the promoter:

  • Uniquely in species, Saccharomyces cerevisiae S288c (YGP1/YNL160W): Crz1p Cup2p Gln3p Cad1p Yap3p Cin5p Yap5p Zap1p
  • Uniquely in species, Zygosaccharomyces bailii IST302: Abf1p Aca1p Cst6p Sko1p Azf1p Bas1p Reb1p Rds1p Ste12p Adr1p Cbf1p Tda9p YML081w Pip2p Mig1p
  • On all orthologs: Arg81p Ash1p Gcn4p Gcr1p Gis1p Com2p YER130c Usv1p YPL230w Rph1p Msn2p Msn4p Hap1p Hsf1p Mal63p YFL052w Znf1p Mcm1p Mot3p Pdr1p Pdr3p Pdr8p Rgt1p Rtg1p Rtg3p Xbp1p Yrr1p Gsm1p YJL103C Stb5p Tec1p Skn7p Rox1p Pho2p Mot2p Haa1p Fkh1p Fkh2p Nrg1p Ndt80p Sum1p Yap1p

Find Binding Sites: Promoter
Figure 2 - List of TFs binding to the promoter of ZBIST_0509 and respective orthologs.
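
The split between "Uniquely in species" and "On all orthologs" amounts to set operations on the TFs predicted for each ortholog's promoter. The Python sketch below illustrates this with a small subset of the TFs from Figure 2; the function `split_tfs` and the input layout are hypothetical and do not describe how N.C. Yeastract is implemented.

```python
def split_tfs(tfs_by_species):
    """Return (unique, on_all): TFs found in exactly one species' promoter,
    keyed by species, and TFs found in the promoters of all orthologs."""
    sets = {species: set(tfs) for species, tfs in tfs_by_species.items()}
    on_all = set.intersection(*sets.values())
    unique = {}
    for species, tfs in sets.items():
        # TFs predicted for any other species' ortholog
        elsewhere = set().union(*(s for name, s in sets.items() if name != species))
        unique[species] = tfs - elsewhere
    return unique, on_all


# Subset of the TFs listed in Figure 2, for illustration only
predicted = {
    "Saccharomyces cerevisiae S288c": ["Crz1p", "Cup2p", "Haa1p", "Msn4p"],
    "Zygosaccharomyces bailii IST302": ["Abf1p", "Mig1p", "Haa1p", "Msn4p"],
}

unique, on_all = split_tfs(predicted)
for species, tfs in unique.items():
    print(species, sorted(tfs))
print("On all orthologs", sorted(on_all))
```

With more than two species, a TF found in some but not all orthologs falls into neither group in this sketch.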

In the "Search Transcription Factors" page, it is possible to display the binding sites for specific, all, none or common transcription factors among the queried gene and respective orthologs. Since YGP1 is described to be regulated by ZbHaa1 and Haa1, we can narrow our search to the binding sites of Haa1 in the promoter regions of ZBIST_0509 and its S. cerevisiae ortholog YGP1 (Figure 3). In this way, we can make a rough prediction of the possible location of a ZbHaa1 binding site in the promoter of the queried gene, based on the Haa1 binding site information.



Figure 3 - Haa1 putative binding sites in the promoter region of ZBIST_0509 and YGP1.
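
A search for putative binding sites of this kind can be sketched as a consensus scan over both DNA strands. The Python example below is a minimal illustration, not the algorithm used by N.C. Yeastract: the motif `GGMG` and the promoter sequence are toy values chosen for the example (they are not taken from the database), and positions are reported relative to the START codon, with the last promoter base at -1.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(motif: str) -> str:
    """Reverse complement of a sequence or IUPAC consensus."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def find_sites(promoter: str, motif: str):
    """Return sorted (position, strand, matched forward-strand sequence) tuples."""
    promoter = promoter.upper()
    hits = []
    for strand, consensus in (("+", motif.upper()), ("-", reverse_complement(motif))):
        regex = "".join(IUPAC[base] for base in consensus)
        # A lookahead is used so that overlapping sites are also reported
        for match in re.finditer("(?=(" + regex + "))", promoter):
            position = match.start() - len(promoter)
            hits.append((position, strand, match.group(1)))
    return sorted(hits)


toy_promoter = "TTGGAGTTTTCTCCAA"  # toy sequence, not a real promoter
toy_motif = "GGMG"                 # toy consensus, not a database entry

for hit in find_sites(toy_promoter, toy_motif):
    print(hit)
# (-14, '+', 'GGAG')
# (-6, '-', 'CTCC')
```

A palindromic consensus would be reported once per strand at the same position; a real analysis would also need to decide how to treat such duplicates.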




Example 3: Prediction of transcriptional regulation associations based on orthologous cross species transcription regulatory networks

Comparing regulatory networks across species can help identify functional elements without prior knowledge of their function, as well as significant evolutionary changes in transcriptional regulatory mechanisms. In N.C. Yeastract, the user can search for common and unique transcription regulatory associations, and obtain the respective comparative networks between biotechnologically relevant non-conventional yeast species and the model yeast S. cerevisiae, using the query "Network comparison". To exemplify this utility, we used the documented ZbHaa1 and Haa1 regulons in response to acetic acid stress [5], [7] in N.C. Yeastract/zbailii. The results of this query are three regulatory networks (Common, Unique to Zygosaccharomyces bailii IST302 and Unique to Saccharomyces cerevisiae S288c). It is important to note that the classification of a regulatory association as common (represented in blue) can be restricted to a given environmental condition (in this example, stress by weak acids) but does not take into account the supporting evidence (DNA binding and/or Expression) or the type of association (Positive, Negative or Unspecified). As a complementary feature, the user can therefore also visualize the complete regulatory network of either S. cerevisiae or Z. bailii, filtered by environmental condition, supporting evidence and association type, and compare it with any of the obtained networks. Furthermore, there is an option to toggle on the display of additional regulatory associations between TFs; this considers the regulatory associations described for all the transcription factors among the queried target genes. For example, HSP26 (regulated by Haa1) is also described to be regulated by Msn4 in response to weak acid stress, and Msn4 is in turn regulated by Haa1; this regulation is therefore displayed only if the aforementioned option is selected, and otherwise it is not classified as a unique regulatory association.

The common network displays the transcriptional regulations shared by Z. bailii ZbHaa1 and S. cerevisiae Haa1, based solely on the selected environmental condition and on the regulation and orthology data present in the database. For the weak acid stress condition, we can compare the regulons of these two transcription factors to find possible genes of interest. ZbHaa1 and Haa1 have two regulatory associations in common (HSP26 and YGP1) (Figure 4A). In Z. bailii, HSP26 is present in four copies, ZBIST_0079 (ZbHSP26), ZBIST_3334 (ZbHSP26-1), ZBIST_3442 (ZbHSP26-2) and ZBIST_4487 (ZbHSP26-3), and only two of these (ZbHSP26 and ZbHSP26-2) are regulated by ZbHaa1 under weak acid stress. These two genes, together with ZbYGP1, might be interesting to explore in the context of increasing tolerance to weak acids and of understanding the mechanisms that confer on Z. bailii its extremely high tolerance to acetic acid. It is important to note that the visualization of the common network obtained with the query "Network comparison" depends on the database in use (N.C. Yeastract or Yeastract). For instance, using genes belonging to the ZbHaa1 and Haa1 regulons in response to acetic acid in Yeastract results in a common network displaying three blue arrows (Figure 4A), whereas the same group of genes used in N.C. Yeastract results in a common network displaying two blue arrows (Figure 4B). The result is the same, but in the former case the base is the S. cerevisiae genes, whereas in the latter the base is the Z. bailii genes. Orthology relationships therefore affect the network visualization, and users should be aware of this when using both databases.

Figure 4 - Networks of common transcriptional regulations between Z. bailii ZbHaa1 and S. cerevisiae Haa1 (panels A and B).
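
The logic behind a common/unique comparison through orthology can be sketched as follows. This is an illustration under stated assumptions, not the implementation behind the "Network comparison" query: the function `compare_regulons` is hypothetical, the gene names `ZbGENE_X` and `GENE_Y` are placeholders added only to populate the unique sets, and the remaining genes and orthology links are the ones named in this example.

```python
def compare_regulons(targets_a, targets_b, ortholog_of):
    """Compare the targets of a TF in species A with those of its ortholog in
    species B. `ortholog_of` maps a species A gene to its species B ortholog."""
    common = {}          # species B gene -> species A targets that map onto it
    unique_a = set()
    for gene in targets_a:
        partner = ortholog_of.get(gene)  # None when no ortholog is known
        if partner in targets_b:
            common.setdefault(partner, set()).add(gene)
        else:
            unique_a.add(gene)
    unique_b = set(targets_b) - set(common)
    return common, unique_a, unique_b


# ZbHaa1 targets (species A); ZbGENE_X is a hypothetical placeholder
zb_targets = {"ZbHSP26", "ZbHSP26-2", "ZbYGP1", "ZbGENE_X"}
# Haa1 targets (species B); GENE_Y is a hypothetical placeholder
sc_targets = {"HSP26", "YGP1", "GENE_Y"}
# Orthology links named in this tutorial
zb_to_sc = {
    "ZbHSP26": "HSP26", "ZbHSP26-1": "HSP26",
    "ZbHSP26-2": "HSP26", "ZbHSP26-3": "HSP26",
    "ZbYGP1": "YGP1",
}

common, unique_zb, unique_sc = compare_regulons(zb_targets, sc_targets, zb_to_sc)
print(sorted(common))                              # common targets, S. cerevisiae names
print(sum(len(genes) for genes in common.values()))  # same targets, counted as Z. bailii genes
print(sorted(unique_zb), sorted(unique_sc))
```

Because several Z. bailii copies can map to one S. cerevisiae gene, the same set of common associations yields a different count depending on which species' gene names are used as the base.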

References

  1. A. Conesa, S. Gotz, J. M. Garcia-Gomez, J. Terol, M. Talon, and M. Robles (2005) Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics, vol. 21, no. 18, pp. 3674–3676.
  2. The Gene Ontology Consortium (2012) The Gene Ontology: enhancements for 2011. Nucleic Acids Res., vol. 40, no. D1, pp. D559–D564.
  3. D. M. Emms and S. Kelly (2015) OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol., vol. 16, no. 1, p. 157.
  4. M. Lechner, S. Findeiß, L. Steiner, M. Marz, P. F. Stadler, and S. J. Prohaska (2011) Proteinortho: Detection of (Co-)orthologs in large-scale analysis. BMC Bioinformatics, vol. 12, no. 1, p. 124.
  5. M. Antunes, M. Palma, and I. Sá-Correia (2018) Transcriptional profiling of Zygosaccharomyces bailii early response to acetic acid or copper stress mediated by ZbHaa1. Sci. Rep., vol. 8, no. 1, p. 14122.
  6. N. P. Mira, J. D. Becker, and I. Sá-Correia (2010) Genomic expression program involving the Haa1p-regulon in Saccharomyces cerevisiae response to acetic acid. OMICS, vol. 14, no. 5, pp. 587–601.