Polygence Scholar 2022

Antonio Cuan

High school, Class of 2023, Sunnyvale, California

About

Hi! My name is Antonio, and my Polygence project is about music recommendation systems. I chose to work on this project because both machine learning and music fascinate me, and it brings those two interests together.

Projects

"Creating Music Recommendation Systems using Euclidean Distance and K-Means Clustering" with mentor Lucien (July 19, 2022)

Project Portfolio

Creating Music Recommendation Systems using Euclidean Distance and K-Means Clustering

Started Jan. 27, 2022

Abstract or project description

We created a music recommendation system that takes a single song as the basis for its recommendations, and we explored whether the program could recommend similar songs from this limited information. In contrast, other music recommendation systems require more data to function, such as a user’s listening history or a playlist with multiple songs. We also implemented a feature that lets the user specify how similar the returned songs should be to the liked song, and we assessed whether this feature functioned properly. Our music recommendation system uses two machine learning algorithms: K-means clustering and Euclidean distance. Before applying the algorithms, we extracted data from Spotify playlists, scaled the data, and applied Principal Component Analysis (PCA). We wrote our own implementation of the K-means clustering algorithm and also ran the algorithm through a software package, Scikit-learn. Our implementation returned well-defined clusters, similar to those from Scikit-learn’s implementation. After clustering, we sampled from the clusters using either random sampling or Euclidean distance. The program worked successfully: it recommended songs that were similar to a given liked song, and it recommended songs with varying degrees of similarity as specified by the user. Of the two algorithms, Euclidean distance was better suited to recommending very similar songs, while K-means clustering was better suited to exploring songs that were more different.
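
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the project's code: the two-feature song table, the function names (`standardize`, `kmeans`, `recommend`), and the `mode` parameter are all hypothetical, the PCA step is omitted for brevity, and only the Python standard library is used so the example stays self-contained. It scales the features, clusters the songs with a simple K-means loop, and then picks recommendations from the liked song's cluster either by smallest Euclidean distance or by random sampling.

```python
import math
import random

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, iters=50, seed=0):
    """Plain K-means: assign each point to its nearest centroid, then update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:  # assignments have stopped changing
            break
        centroids = updated
    return labels, centroids

def recommend(points, labels, liked, n=1, mode="nearest", seed=0):
    """Pick n songs from the liked song's cluster.

    mode="nearest": closest songs by Euclidean distance (very similar).
    mode="random":  random sample from the cluster (more exploratory).
    """
    same = [i for i, lab in enumerate(labels)
            if lab == labels[liked] and i != liked]
    if mode == "nearest":
        same.sort(key=lambda i: math.dist(points[i], points[liked]))
        return same[:n]
    return random.Random(seed).sample(same, min(n, len(same)))

# Hypothetical audio features per song: [tempo in BPM, energy from 0 to 1].
songs = [[120, 0.80], [122, 0.82], [125, 0.78],
         [70, 0.20], [72, 0.25], [68, 0.22]]

points = standardize(songs)
labels, centroids = kmeans(points, k=2)
nearest = recommend(points, labels, liked=0, n=1, mode="nearest")
explore = recommend(points, labels, liked=0, n=1, mode="random")
print(labels, nearest, explore)
```

In this sketch the distance-based mode always returns the closest neighbour of the liked song, while the random mode can return any song in the same cluster, which mirrors the contrast the abstract draws between recommending very similar songs and exploring more different ones.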