Debanjan Saha

Program: Molecular Genetics and Genomics

Current advisor: Christopher Maher, PhD

Undergraduate university: Rutgers University-Cook College

Research summary
Metastatic castration-resistant prostate cancer (mCRPC) is a subtype of prostate cancer that develops after metastasis and resistance to hormone therapy and carries a poor prognosis. Likewise, pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal malignancies because it tends to present at advanced, unresectable stages. Recent genomic studies have identified molecular features of mCRPC and PDAC, including mutations in key driver genes, extensive intra-tumoral transcriptomic heterogeneity, the role of regulatory elements in dictating driver gene expression, mechanisms of treatment resistance, and interactions with the immunosuppressive and stromal tumor microenvironment (TME) that facilitate disease progression. Although these analyses have been informative, they have focused primarily on protein-coding genes while neglecting long non-coding RNAs (lncRNAs), which are RNA transcripts longer than 200 nucleotides without evidence of coding potential. lncRNAs have been implicated in various steps of the metastatic cascade during tumor progression through many described mechanisms, including transcriptional and epigenetic regulation as well as sequestration and modification of RNA transcripts or proteins. However, few studies have systematically characterized the lncRNA landscape in mCRPC or PDAC using single-cell transcriptomic data integrated with multi-omic datasets. Hence, the goal of this thesis research was to perform an integrative genomics analysis of single-cell RNA sequencing data with bulk transcriptomic, genomic, epigenomic, and clinicopathologic data to better understand the mechanisms of lncRNA dysregulation in cancers.
We sought to identify lncRNAs associated with genomic and regulatory features, tumor progression, intra-tumoral heterogeneity, TME cell types, histologic transformation, and treatment resistance.
In the analysis of mCRPCs, we used transcriptomic data from a recently published study of 2170 cells from 14 patients and 15 biopsies of mCRPC metastatic sites with varied treatment histories and tumor pathologies, coupled with a computational pipeline for lncRNA discovery and validation. This yielded 389 cell-enriched lncRNAs in prostate cancer cells and in the TME, which comprised various immune cells. Regulatory elements, such as hypomethylated regions and transcription factor binding sites, were enriched in these lncRNAs. Analysis of mCRPCs and localized prostate cancers revealed lncRNAs associated with tumor progression, such as the established prostate cell-enriched lncRNAs SCHLAP1 and PCAT14, and novel TME-enriched lncRNAs like MIAT and CYTOR. Prostate cell-enriched lncRNAs correlated with androgen receptor (AR) mutational status and AR signaling, and were altered during treatment with the AR signaling inhibitor enzalutamide. In contrast, a subset of TME-enriched lncRNAs was upregulated in tumors with RB1 deletions and correlated with poor prognosis. Finally, lncRNAs identified in the comparison of prostate adenocarcinomas and neuroendocrine tumors exhibited distinct expression and methylation profiles.
For the PDAC analysis, we used single-cell transcriptomic data from a recent publication of 232,764 cells from 73 multi-region samples from a cohort of 21 PDAC patients to characterize the lncRNA landscape of this disease. We found 111 lncRNAs to be highly enriched in PDAC and TME cells, including the known PDAC-specific lncRNAs CASC8 and CRNDE, as well as novel lncRNAs associated with Treg and fibroblast cell types that were validated across multiple orthogonal datasets. Analysis of PDAC cells with TP53 mutations revealed several lncRNAs associated with genomic status, such as NORAD, that were not detected by bulk sequencing because they are expressed across multiple TME cell types. In addition, we identified lncRNAs and pathways associated with resistance to FOLFIRINOX, a multi-agent chemotherapy, that were validated in treatment-resistant PDAC organoids, highlighting the ability of single-cell analysis to identify PDAC-specific expression changes after treatment exposure. Lastly, we performed tumor subcluster analysis of PDAC cells to identify lncRNAs commonly deregulated across patients, followed by pathway annotation, which yielded eight PDAC tumor subclusters. Six of these subclusters were subsequently validated in multiple orthogonal PDAC single-cell datasets, highlighting the generalizability of our findings. These subclusters and their corresponding lncRNAs were linked to processes related to angiogenesis, metabolic pathways, and epithelial-to-mesenchymal transition, and displayed associations with patient outcomes, demonstrating the clinical relevance of these genes in PDAC.
In summary, this thesis work represents the first systematic analysis of lncRNAs in mCRPC and PDAC using an integrative genomics approach with single-cell and bulk multi-omics datasets. Our findings highlight the utility of single-cell sequencing for ascribing lncRNA alterations to specific cell types and nominating potential mechanisms of action. This work also underscores the value of integrative multi-omic studies of lncRNAs, an approach that can be extended to other tumor types. We hope our results will serve as a resource to inform future work on the biological roles of these lncRNAs and their contributions to mCRPC, PDAC, and tumor biology in general.

Graduate publications

