Hari Sagar
Class of 2026
Redwood City, California
About
Projects
- "Analysis of Pretrained ViT Backbones in Classification for Social Impacts" with mentor David (July 7, 2024)
Project Portfolio
Analysis of Pretrained ViT Backbones in Classification for Social Impacts
Started Mar. 18, 2024
Abstract or project description
This paper compares the effectiveness of frozen pre-trained backbones. Three Vision Transformer (ViT) backbones were evaluated and compared with convolutional neural network (CNN) backbones on classification datasets relevant to social impact. One ViT in particular, DINOv2, performed especially well, achieving an accuracy of 95.9% on the Functional Map of the World dataset.
This paper evaluates recent pre-trained backbones, including iBOT and Masked Autoencoder (MAE), on three datasets: RealWaste, Functional Map of the World, and DeFungi. These datasets were chosen because they represent different aspects of human social impact: human-generated waste, environmental loss due to infrastructure, and the classification of medical and biological images.
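To illustrate what evaluating a frozen backbone can look like, here is a minimal sketch in plain Python. It is not the paper's code: the abstract does not state the classifier head, training settings, or data pipeline, so everything below is an assumption. The "backbone" is a hypothetical stand-in (a fixed random projection, not DINOv2, iBOT, or MAE), the data are synthetic two-class blobs, and only a linear softmax head is trained on the extracted features.

```python
import math
import random


def make_frozen_backbone(in_dim, feat_dim, seed=0):
    """Hypothetical stand-in for a pre-trained backbone.

    The weights are created once and never updated, which is what
    "frozen" means here. A real study would load pre-trained weights.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) / math.sqrt(in_dim) for _ in range(in_dim)]
               for _ in range(feat_dim)]

    def backbone(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
                for row in weights]

    return backbone


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(W, b, f):
    return [sum(w * fi for w, fi in zip(W[c], f)) + b[c]
            for c in range(len(W))]


def train_linear_probe(features, labels, num_classes, epochs=300, lr=0.1):
    """Full-batch gradient descent on cross-entropy; only W and b change."""
    feat_dim = len(features[0])
    n = len(features)
    W = [[0.0] * feat_dim for _ in range(num_classes)]
    b = [0.0] * num_classes
    for _ in range(epochs):
        gW = [[0.0] * feat_dim for _ in range(num_classes)]
        gb = [0.0] * num_classes
        for f, y in zip(features, labels):
            probs = softmax(logits_for(W, b, f))
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                gb[c] += err
                for j in range(feat_dim):
                    gW[c][j] += err * f[j]
        for c in range(num_classes):
            b[c] -= lr * gb[c] / n
            for j in range(feat_dim):
                W[c][j] -= lr * gW[c][j] / n
    return W, b


def accuracy(W, b, features, labels):
    correct = 0
    for f, y in zip(features, labels):
        scores = logits_for(W, b, f)
        pred = max(range(len(scores)), key=scores.__getitem__)
        correct += int(pred == y)
    return correct / len(labels)


def make_blobs(n_per_class, in_dim, rng):
    """Synthetic toy data: two Gaussian blobs, one per class."""
    xs, ys = [], []
    for label, center in enumerate((-2.0, 2.0)):
        for _ in range(n_per_class):
            xs.append([rng.gauss(center, 0.5) for _ in range(in_dim)])
            ys.append(label)
    return xs, ys


rng = random.Random(42)
backbone = make_frozen_backbone(in_dim=4, feat_dim=8)
train_x, train_y = make_blobs(50, 4, rng)
test_x, test_y = make_blobs(50, 4, rng)

# Features are extracted once; the backbone is never trained.
train_f = [backbone(x) for x in train_x]
test_f = [backbone(x) for x in test_x]

W, b = train_linear_probe(train_f, train_y, num_classes=2)
test_acc = accuracy(W, b, test_f, test_y)
print(f"held-out accuracy: {test_acc:.3f}")
```

Because the backbone stays fixed, features can be computed once and reused, so swapping in a different backbone changes only the feature-extraction step. That separation is what makes it practical to compare several backbones on the same datasets.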