Hope S
Research Program Mentor
PhD candidate at Massachusetts Institute of Technology (MIT)
Expertise
Natural language processing, computational social sciences, data science, machine learning
Bio
Nice to meet you! I’m Hope. I'm California born and raised, and I’m back in California now as a research fellow at Stanford Law School, where I use data science and natural language processing methods to understand legal texts and policy issues. I'm also a PhD student at MIT. I’m interested in the way that new technologies can help us better interpret and interact with the world around us. In particular, I like to use data science and machine learning as tools to understand how people interact with each other online, especially through written language and in networks. Even though I'm from California, I love winter sports! I like ice skating, skiing, and hockey. In places with no ice, I like rollerblading, a hobby I took up during the pandemic. I'm also slowly learning to cook. Do you have a favorite recipe you'd want to share with me? I'd love to know something you learned about yourself in the past year!
Project ideas
How have presidential speeches changed through time?
It’s a polarized time in the US. Was it always this way? Let’s see how parties have changed the way they speak by looking at Presidential speeches. We might just learn how language itself has changed through time! The sky is the limit for folks interested in linguistics, history, and computational analysis.
Data scientist pipeline: sentiment online
This project can teach you the pipeline for becoming an independent data scientist! Answering interesting questions often involves learning how to find data “in the wild” and how to do so ethically. We can learn to use an open API to get raw data, clean it, and analyze it. We will then discuss the ethical implications of collecting and using data for social science projects. We will apply computational methods chosen to fit your question of interest, perhaps scoping a project to study how people on the internet (Twitter, Reddit, etc.) feel about a particular issue, like climate change or COVID-19, through the language they use. We can discuss computational tools for this task: traditional statistics? Machine learning classification? What are the pros and cons of each? The final project can be an analysis of your findings, a blog post, or an academic-style paper.
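To give a flavor of the "clean it and analyze it" steps, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the sample posts are made up and stand in for whatever an API would return, the tiny word lists are a toy lexicon and not a published one, and the function names (`clean`, `sentiment_score`, `label`) are hypothetical. A real project would likely swap in a proper lexicon or a trained classifier.

```python
import re
from collections import Counter

# Toy lexicon (an assumption for illustration, not a published resource).
POSITIVE = {"good", "great", "hope", "love", "progress"}
NEGATIVE = {"bad", "awful", "fear", "hate", "crisis"}


def clean(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def sentiment_score(tokens):
    """Return (positive count - negative count) / total tokens; 0.0 if empty."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    pos = sum(counts[word] for word in POSITIVE)
    neg = sum(counts[word] for word in NEGATIVE)
    return (pos - neg) / len(tokens)


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up posts standing in for raw data collected from an API.
posts = [
    "I love the progress on clean energy! https://example.com",
    "@someone the climate crisis is awful and I fear for the future",
    "Just reading about climate policy today.",
]

labels = [label(sentiment_score(clean(post))) for post in posts]

for post, lab in zip(posts, labels):
    print(f"{lab:>8}: {post}")
```

Counting words from a fixed list is easy to inspect and explain, but it misses negation and sarcasm. That trade-off is exactly the kind of pro and con we would weigh against a machine learning classifier.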