Ishan P
- Research Program Mentor
MS at Stanford University
Expertise
Computer Science, Machine Learning and AI (Computer Vision, Robotics, Large Language Models, Generative AI)
Bio
Hi! I currently work as a Machine Learning Engineer on Apple's AI/ML team. Previously, I worked as a Deep Learning Engineer at Focal Systems, a retail AI startup working to automate the grocery store by predicting product inventory on shelves using cutting-edge computer vision neural networks. Before that, I was a Software Engineer at Amazon, where I was one of the founding members of the Robotics AI team that launched the Amazon Astro home robot. I completed my MS at Stanford University, where I carried out ML research as part of the Stanford Vision Lab and the Stanford School of Medicine, and received my B. Tech from the Indian Institute of Technology, Indore. Outside of work and research, I love spending time outdoors - road biking, running, playing tennis and, more recently, skiing!
Project ideas
Predicting dance genre from street dance videos
In this project, we will build a machine learning model to identify the genre of a street dance (jazz, hip-hop, break, etc.) from input demonstration videos. Starting with the AIST Dance Video Database (https://aistdancedb.ongaaccel.jp/), we will first use a library like OpenCV to extract frames from each video, converting the video data into a format suitable for training a neural network. We will then define the model architecture using a popular library like PyTorch. This could be a deep learning model that captures both the semantic content of each frame (CNN) and the temporal context across frames (RNN/LSTM) to predict the genre. We will then evaluate this model on held-out test data from the same dataset and look into ways to improve it. Finally, we can use the model to classify other street dance videos not seen during training, including your own street dance video!
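To make the CNN-plus-LSTM idea concrete, here is a minimal PyTorch sketch of one way such a model could be laid out. It is an illustration under stated assumptions, not a finished design: the class name `FrameSequenceClassifier`, the layer sizes, the clip shape, and the value of `num_genres` are all hypothetical placeholders, and a real project would likely swap the tiny CNN for a pretrained backbone.

```python
import torch
import torch.nn as nn


class FrameSequenceClassifier(nn.Module):
    """Hypothetical CNN + LSTM classifier for clips of video frames."""

    def __init__(self, num_genres: int, feature_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        # Small per-frame CNN (assumed sizes; a pretrained backbone could replace it).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, feature_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # one feature vector per frame
        )
        # LSTM reads the per-frame features in temporal order.
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_genres)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, frames, channels, height, width)
        b, t, c, h, w = clips.shape
        # Fold time into the batch dimension so the CNN sees individual frames.
        feats = self.cnn(clips.reshape(b * t, c, h, w)).flatten(1)
        feats = feats.reshape(b, t, -1)  # (batch, frames, feature_dim)
        _, (h_n, _) = self.lstm(feats)   # h_n: (1, batch, hidden_dim)
        return self.head(h_n[-1])        # genre logits: (batch, num_genres)


# Placeholder genre count and random tensors standing in for extracted frames.
model = FrameSequenceClassifier(num_genres=10)
dummy_clips = torch.randn(2, 8, 3, 64, 64)  # 2 clips, 8 frames each, 64x64 RGB
logits = model(dummy_clips)
print(logits.shape)
```

The final hidden state of the LSTM summarizes the whole clip, so a single linear layer on top of it produces one score per genre. Training would then pair these logits with a cross-entropy loss over the genre labels.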