Christa C
- Research Program Mentor
PhD at University of California, Los Angeles (UCLA)
Expertise
computational genomics, bioinformatics, epigenetics, neurological diseases, health data science, machine learning for health data
Bio
I am a computational geneticist interested in developing equitable genomic tools that translate to the clinic. I received my B.S. in biological physics and art history from Brandeis University and my PhD in bioinformatics from UCLA. My scientific interests are broadly in genetic medicine and applying advanced statistical techniques to big biological datasets. Currently, I’m working on two main projects. The first is developing statistical and technological methods to find epigenetic biomarkers for ALS. I am also working on using genetics to understand the drivers of genetic disease in underrepresented populations. This work is at the intersection of population genomics, epidemiology, and data science. Beyond my research, I am keenly interested in health justice and equitable application of biomedical discoveries to all people. Outside of science, I like art and design. My favorite museums include the Isabella Stewart Gardner Museum in Boston, the Getty Center in Los Angeles, the Leeum Museum in Seoul, and the Asian Art Museum in San Francisco.
Project ideas
Finding novel disease-related variants in founder populations
Overview: Founder populations are populations that experienced a historical bottleneck, which results in less genetic diversity in modern-day individuals. This can mean that disease-causing genetic variants occur at higher frequencies than evolutionary theory would predict. In this project, we can identify a public dataset that might contain founder populations. Then, we can use population genetics algorithms to model the population history and search for potentially pathogenic variants.
Knowledge/Skills to be learned:
- Population genetics algorithms, like identity-by-descent analysis
- Classifying genetic variants based on pathogenicity
- Statistical association testing
Skills needed:
- Intermediate to advanced coding skills in Python or similar programming language
- Basic statistics knowledge
- Introductory genetics knowledge
- Experience with the command line a plus
Expected Outcomes:
- Map of disease-associated genetic segments
- Potential novel variant discoveries
- Documentation of population-specific risks
- Academic paper
- GitHub repository
Possible extensions:
- Create visualization tools or web interfaces
- Optimize computational efficiency of existing tools
- Compare with ancient DNA
- Analyze natural selection effects
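To give a feel for what identity-by-descent analysis looks for, here is a minimal sketch in Python. It is a deliberately crude stand-in, not a real IBD method: it only finds stretches where two phased haplotypes carry identical alleles (identity by state) and keeps runs above a length threshold. The function name `shared_segments`, the `min_markers` parameter, and the toy haplotypes are all made up for illustration. Established tools also account for genotyping error, phasing error, and genetic distance.

```python
def shared_segments(hap_a, hap_b, min_markers=5):
    """Return (start, end) index pairs, end exclusive, for runs where two
    haplotypes carry the same allele at every marker for at least
    `min_markers` consecutive markers.

    This is a toy identity-by-state scan used as a rough proxy for IBD.
    """
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same markers")

    segments = []
    run_start = None  # index where the current matching run began
    for i, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if run_start is None:
                run_start = i
        else:
            # A mismatch ends the current run; keep it only if long enough.
            if run_start is not None and i - run_start >= min_markers:
                segments.append((run_start, i))
            run_start = None

    # Close a run that extends to the last marker.
    if run_start is not None and len(hap_a) - run_start >= min_markers:
        segments.append((run_start, len(hap_a)))
    return segments


# Hypothetical haplotypes coded as 0/1 alleles at 12 markers.
hap_1 = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
hap_2 = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

long_segments = shared_segments(hap_1, hap_2, min_markers=5)
all_segments = shared_segments(hap_1, hap_2, min_markers=4)
print(long_segments)
print(all_segments)
```

In a founder population, pairs of individuals would be expected to share more and longer segments like these than pairs from a population without a bottleneck, which is what makes the shared segments a useful place to search for disease-related variants.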
Identifying reproducible biomarkers for early-stage lung cancer using microRNAs
Overview: MicroRNAs are popular biomarker candidates for cancer because they can be easily measured in blood. However, the field has a reproducibility problem: some promising microRNA biomarkers found in a single dataset do not replicate when applied to other populations. Can we identify our own biomarker candidates with machine learning and then assess how well they reproduce in external datasets? If they don't replicate, what factors are associated with the lack of replication?
Knowledge/Skills to be learned:
- RNA-seq analysis
- Feature selection methods
- Machine learning classification
- Cross-validation techniques
- Statistical association testing
Skills needed:
- Basic coding skills in Python or similar programming language
- Basic statistics knowledge
- Introductory genetics knowledge
- Experience with data visualization
Expected Outcomes:
- Panel of 5-10 blood-based RNA biomarkers
- Classification model with performance metrics
- Replication statistics
- Academic paper
- GitHub repository
Possible extensions:
- Developing deep learning or transfer learning models
- Generalizing to other datasets
- Incorporating genetic risk scores into models
- Incorporating population genetics theory into models
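The discover-then-replicate idea in this project can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: candidates are ranked by a standardized mean difference (Cohen's d) rather than by a machine learning model, and a candidate counts as replicated if its effect in the external dataset has the same direction and clears an arbitrary threshold. The marker names (`miR-A`, `miR-B`, `miR-C`), the expression values, and the `min_abs_d` cutoff are invented for the example and are not real data.

```python
from statistics import mean, stdev


def effect_size(cases, controls):
    """Standardized mean difference (Cohen's d, pooled standard deviation)."""
    n1, n2 = len(cases), len(controls)
    pooled_var = (
        (n1 - 1) * stdev(cases) ** 2 + (n2 - 1) * stdev(controls) ** 2
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        return 0.0
    return (mean(cases) - mean(controls)) / pooled_var ** 0.5


def rank_features(dataset, top_k):
    """Rank features by absolute effect size and return the top_k as
    (name, effect) pairs. `dataset` maps name -> (case values, control values)."""
    scores = {name: effect_size(*groups) for name, groups in dataset.items()}
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return [(name, scores[name]) for name in ranked[:top_k]]


def replication_report(discovery, external, top_k=2, min_abs_d=0.5):
    """Select candidates in the discovery data, then check each one in the
    external data for the same direction of effect and a minimum size."""
    report = {}
    for name, d_disc in rank_features(discovery, top_k):
        d_ext = effect_size(*external[name])
        same_direction = d_disc * d_ext > 0
        report[name] = {
            "discovery_d": d_disc,
            "external_d": d_ext,
            "replicates": same_direction and abs(d_ext) >= min_abs_d,
        }
    return report


# Invented expression values: (cases, controls) per candidate marker.
discovery = {
    "miR-A": ([5.1, 5.3, 5.0, 5.4], [3.9, 4.1, 4.0, 4.2]),
    "miR-B": ([2.0, 2.2, 1.9, 2.1], [3.0, 3.1, 2.9, 3.2]),
    "miR-C": ([4.0, 4.2, 3.9, 4.1], [4.1, 3.9, 4.0, 4.2]),
}
external = {
    "miR-A": ([5.0, 5.2, 4.9, 5.3], [4.0, 4.2, 4.1, 3.9]),
    "miR-B": ([3.0, 3.1, 2.9, 3.2], [3.1, 3.3, 2.9, 3.0]),
    "miR-C": ([4.1, 4.0, 4.2, 3.9], [4.0, 4.1, 3.9, 4.2]),
}

report = replication_report(discovery, external)
for name, row in report.items():
    print(name, row)
```

In this toy data, both selected markers look strong in the discovery set, but only one keeps a sizeable effect in the external set. A real analysis would replace the ranking step with cross-validated feature selection and a classifier, and would then ask which factors (cohort, platform, sample handling) explain the markers that fail to replicate.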