Program: Computational and Systems Biology
Current advisor: Barak A. Cohen, PhD
Undergraduate university: University of Notre Dame
Transcription factors (TFs) activate silent genes by binding to and opening heterochromatic instances of their motifs. While we rely on this process for cellular reprogramming, we have an incomplete understanding of which TFs are capable of recognizing inaccessible instances of their motifs and what parameters are important for this activity. My thesis work aimed to address these two questions. The leading model for silent gene activation is the pioneer factor hypothesis (PFH). The PFH states that pioneer factors (PFs) are qualitatively unique TFs that can bind to and open inaccessible DNA and subsequently recruit non-pioneer factors (nonPFs) to activate expression. We tested the predictions of the PFH by ectopically expressing a canonical PF, FOXA1, and a nonPF, HNF4A, in K562 blood cells. While we expected that only FOXA1 would bind inaccessible motifs and that neither TF alone would activate tissue-specific gene expression, we found that both TFs independently bound, opened, and activated tissue-specific loci. When we examined what may control such “pioneer activity,” we found that motif content, TF concentration, and TF binding strength were all important factors.
Having shown that pioneer activity may not be a qualitative trait restricted to just a few TFs, we sought to develop a quantitative metric. Because pioneer activity is essentially “TF binding at hard-to-bind sites,” we suggest that a measure of pioneer activity should capture the relative difference in a TF’s ability to bind at accessible versus inaccessible DNA. We estimated a parameter related to a TF’s dissociation constant (Kd) by using doxycycline induction as a proxy for TF concentration. We call this term the TF’s “dox50.” We propose that the average difference in a TF’s dox50 between accessible and inaccessible binding sites is a measure of its pioneer activity. We call this term the TF’s “delta dox50.” We predict that TFs with lower delta dox50s will more specifically and consistently activate their tissue-specific targets across cell types because they are less sensitive to differences in chromatin accessibility. To demonstrate the feasibility of this metric, we induced FOXA1 and HNF4A across a 1,000-fold range of doxycycline, measured binding, fit binding curves at tens of thousands of loci, and then extracted dox50s. We show that HNF4A has a smaller delta dox50 than FOXA1, which suggests it has stronger pioneer activity. We also show that while both TFs activate overlapping sets of genes between K562 blood cells and BJ5TA fibroblasts, it is HNF4A that activates more significantly enriched sets of tissue-specific genes in both cell types.
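The dox50 and delta dox50 calculations described above can be illustrated with a minimal sketch. This is not the analysis code from the thesis: the saturating curve with a Hill coefficient of 1, the grid-search fit, the log10 scale for the difference, and the names hill, fit_dox50, and delta_dox50 are all assumptions made for illustration.

```python
import math


def hill(dose, top, dox50):
    """Saturating binding curve; a Hill coefficient of 1 is assumed here."""
    return top * dose / (dox50 + dose)


def fit_dox50(doses, signal, n_grid=400):
    """Estimate (dox50, top) for one locus by least squares.

    Candidate dox50 values are scanned on a log-spaced grid (hypothetical
    choice); for each candidate the best-fitting 'top' has a closed form.
    """
    lo = math.log10(min(doses)) - 1
    hi = math.log10(max(doses)) + 1
    best = None
    for i in range(n_grid + 1):
        d50 = 10 ** (lo + (hi - lo) * i / n_grid)
        frac = [d / (d50 + d) for d in doses]
        # Closed-form least-squares amplitude for this candidate dox50.
        top = sum(f * s for f, s in zip(frac, signal)) / sum(f * f for f in frac)
        sse = sum((s - top * f) ** 2 for f, s in zip(frac, signal))
        if best is None or sse < best[0]:
            best = (sse, d50, top)
    return best[1], best[2]


def delta_dox50(accessible, inaccessible):
    """Mean log10(dox50) at inaccessible sites minus accessible sites.

    Working on a log10 scale is an assumption of this sketch.
    """
    def mean_log(values):
        return sum(math.log10(v) for v in values) / len(values)

    return mean_log(inaccessible) - mean_log(accessible)
```

Under these assumptions, fit_dox50 would be called once per locus on the binding signal measured at each doxycycline dose, and the resulting dox50 values would be grouped by prior accessibility before being passed to delta_dox50. A smaller result means the TF needs a more similar dose to bind inaccessible and accessible sites.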
Altogether, we propose that every TF likely has some degree of pioneer activity that depends on its affinity for any given location, the concentration at which it is expressed, and the motif content at each target site. We hope that future work will characterize the pioneer activity of more TFs, making delta dox50 a useful quantitative metric for describing it. We further propose curating a list of each TF’s “true gene targets” from the overlap of activated genes across a variety of ectopic gene activation experiments. These two datasets together will move us closer to predictable, effective ectopic gene activation and thus improved cellular reprogramming.
Hansen JL, Loell KJ, Cohen BA. (2022) A test of the pioneer factor hypothesis using ectopic liver gene activation. eLife, 11:e73358.
Hansen J*, Hong C*, Chaudhari H, Maricque B, and Cohen B. (2018) Genome position scales local cis-regulatory activity. Systems Biology: Global Regulation of Gene Expression at Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, Abstract.