Ryan Lung
Class of 2025B, Washington
About
Projects
- "Highly Compressed LLMS as Classifiers Remain Effective" with mentor Efthimios (Feb. 12, 2024)
Ryan's Symposium Presentation
Project Portfolio
Highly Compressed LLMs as Classifiers Remain Effective
Started Nov. 22, 2023
Abstract
There have been massive strides in natural language processing in recent years, largely due to the widespread adoption of the self-attention mechanism in large language models (LLMs). However, this improvement in model output has come at the cost of memory and compute demands that are prohibitive for most devices. Specialized hardware such as GPUs and TPUs, which can cost from a few hundred dollars to tens of thousands, is required to run these models effectively. These issues only compound in a mobile setting, where hardware and storage constraints become even more pronounced. Yet LLMs on mobile devices have a wide range of applications, an essential one being text classification. Given the amount of time young people spend on social media, their posts contain a wealth of potential red flags that could suggest declining mental health, and a lightweight yet accurate model is vital for detecting these users and getting them help. We fine-tune Google's BERT model for emotion text classification and find that it retains much of its performance even after a combination of quantization and pruning, accurately classifying text at a fraction of the computational cost of the original model. With these steps toward improving the efficiency of LLMs, it becomes easier to serve a large user base quickly while lowering costs.
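The sketch below illustrates the kind of compression pipeline the abstract describes, assuming a PyTorch / Hugging Face stack; the checkpoint name, emotion-label count, pruning amount, and quantization settings are illustrative assumptions, not the project's exact configuration.

```python
# Minimal sketch: prune and quantize a BERT text classifier.
# All specific values (model name, label count, 30% sparsity) are assumptions.
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"   # assumed base checkpoint
NUM_EMOTIONS = 6                   # assumed number of emotion labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_EMOTIONS
)
# (Fine-tuning on an emotion-classification dataset would happen here.)

# 1) Unstructured magnitude pruning: zero out the smallest 30% of weights
#    in every linear layer, then make the sparsity permanent.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")

# 2) Post-training dynamic quantization: store linear-layer weights in int8,
#    shrinking the model and speeding up CPU inference.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Classify a sample message with the compressed model.
inputs = tokenizer("I can't take this anymore", return_tensors="pt")
with torch.no_grad():
    logits = quantized(**inputs).logits
print(logits.argmax(dim=-1))
```

Dynamic quantization and magnitude pruning are chosen here because both can be applied after fine-tuning without retraining; combining the two is one straightforward way to trade a small amount of accuracy for a much smaller, faster classifier on commodity or mobile-class hardware.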