Ryan Lung
Class of 2025B, Washington
About
Projects
- "Highly Compressed LLMS as Classifiers Remain Effective" with mentor Efthimios (Feb. 12, 2024)
Ryan's Symposium Presentation
Project Portfolio
Highly Compressed LLMs as Classifiers Remain Effective
Started Nov. 22, 2023
Abstract
There have been massive strides in natural language processing in recent years, largely due to the widespread adoption of the self-attention mechanism in large language models (LLMs). However, this improvement in model output has come at the cost of memory and compute demands that are prohibitive for most devices. Specialized hardware such as GPUs and TPUs, which can cost from a few hundred dollars to tens of thousands, is required to run these models effectively. These issues only compound in a mobile setting, where hardware and storage constraints become even more pronounced. Yet LLMs on mobile devices have a wide range of applications, an essential one being text classification. Given the amount of time young people spend on social media, their posts contain a wealth of potential red flags that could suggest declining mental health, and a lightweight yet accurate model is vital for detecting these users and getting them help. We fine-tune Google's BERT model for emotion text classification and find that it retains much of its performance even after a combination of quantization and pruning, accurately classifying text at a fraction of the computational cost of the original model. With these steps toward improving the efficiency of LLMs, it becomes easier to serve a large user base quickly while lowering costs.
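The sketch below illustrates the kind of compression pipeline the abstract describes, assuming a PyTorch / Hugging Face stack; the checkpoint name, emotion-label count, pruning amount, and quantization settings are illustrative assumptions, not the project's exact configuration.

```python
# Minimal sketch: prune and quantize a BERT text classifier.
# All specific values (model name, label count, 30% sparsity) are assumptions.
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"   # assumed base checkpoint
NUM_EMOTIONS = 6                   # assumed number of emotion labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=NUM_EMOTIONS
)
# (Fine-tuning on an emotion-classification dataset would happen here.)

# 1) Unstructured magnitude pruning: zero out the smallest 30% of weights
#    in every linear layer, then make the sparsity permanent.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")

# 2) Post-training dynamic quantization: store linear-layer weights in int8,
#    shrinking the model and speeding up CPU inference.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Classify a sample message with the compressed model.
inputs = tokenizer("I can't take this anymore", return_tensors="pt")
with torch.no_grad():
    logits = quantized(**inputs).logits
print(logits.argmax(dim=-1))
```

Dynamic quantization and magnitude pruning are chosen here because both can be applied after fine-tuning without retraining; combining the two is one straightforward way to trade a small amount of accuracy for a much smaller, faster classifier on commodity or mobile-class hardware.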