Mariel W
- Research Program Mentor
PhD at University of California Berkeley (UC Berkeley)
Expertise
Machine Learning, Artificial Intelligence, Computer Science, and Math
Bio
Having just graduated with a PhD in machine learning and AI, I am interested in applying machine learning to address emergent needs and problems in the world. Machine learning allows us to synthesize the profusion of data around us, but doing so securely, responsibly, and meaningfully is an ongoing challenge. I am interested in finding ways to coexist, in a way that maximizes and preserves human value, with new AI technologies that are rapidly changing domains of human creativity and labor. Additionally, I am a serious classical cellist and pianist, and am expanding my familiarity with other musical genres, particularly jazz, hip-hop, rap, and soul. Recently, I've taken up road and trail running.
Project ideas
Building an LLM from Scratch
Large language models (LLMs) are currently the dominant force in AI technology. Today's state-of-the-art LLMs (such as OpenAI's GPT-4) are capable of such complex tasks, and are improving so rapidly, that even top researchers in the field do not yet fully understand their power. What is certain, however, is their impact on the future of humanity, from labor markets to creative enterprises to technological innovation. It is critical to understand how LLMs function, so that we can harness and regulate their power and so that such knowledge is not the privileged domain of a few companies.

In this project, we will build an LLM from scratch and develop a rigorous, hands-on understanding of how these models work. I will guide the student through PyTorch, the current state-of-the-art Python library for deep learning applications, including LLMs. (Ideally, the student is already fluent in standard Python; however, depending on coding experience, we can adjust the technical rigor of the project.) After building the LLM, we can personalize it by training it on our own browsing data. The project will have three major components: 1) learning the architecture of basic deep learning models, 2) coding such a model in PyTorch, and 3) training the model on a variety of data modalities (text, images, etc.). The student will develop familiarity with the theoretical architecture of LLMs, practical experience coding them from the ground up, and an understanding of how training data influences the output of such models.
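To give a flavor of what "building from scratch" involves, here is a minimal sketch of scaled dot-product attention, the core operation inside modern LLM architectures. It is written in plain Python (no PyTorch) purely for illustration, and all names here are illustrative; in the project itself the same computation would be expressed with PyTorch tensors and modules.

```python
import math

def softmax(xs):
    # numerically stable softmax: subtract the max before exponentiating
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each query scores every key (dot product, scaled by sqrt of the
    key dimension), the scores become weights via softmax, and the
    output is the weight-averaged combination of the values.
    """
    d = len(keys[0])  # key/query dimension, used for scaling
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

For example, when all keys are identical the attention weights are uniform, so each output is simply the average of the values; the LLM we build will stack many such attention layers (with learned projections) on top of one another.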