Random Forest Identification of Pulsars

Project by Polygence alum Ankhita


Project's result

Ankhita wrote a research paper that she plans to submit for publication. She is currently getting feedback on it from professors and professionals in machine learning and radio astronomy, and has shared the paper in the Research Archive of Rising Scholars!


Summary

A pulsar is a type of neutron star that emits beams of radio waves, which can often be detected from Earth. As a pulsar rapidly spins, its beams sweep across Earth, producing the periodic pulses by which it is detected. Pulsars are used to study extreme states of matter and exoplanets, to measure cosmic distances, and to search for gravitational waves. Traditionally, pulsar candidates have been identified through manual signal processing. As data volumes increase, automated methods, such as artificial neural networks and other machine learning tools, have been proposed. In this project, we used another machine learning tool, the random forest classifier (an algorithm that takes the majority output of multiple decision trees), to separate real pulsar candidates from radio frequency interference (RFI) and other noise. Once identified, these candidates can be studied further and possibly allotted telescope time to confirm them as pulsars. To develop our tool, we used the HTRU2 dataset from the UCI Machine Learning Repository, which contains 1,639 real pulsar examples and 16,259 examples of RFI/noise. The features we used included the mean, standard deviation, excess kurtosis, and skewness of both the integrated pulse profile and the DM-SNR (dispersion measure vs. signal-to-noise ratio) curve. Our model identified pulsars with 98% accuracy. Our results indicated that the excess kurtosis, skewness, and mean of the integrated profile were the most important features for distinguishing real pulsars from interference. This tool could be used to process data from future surveys, narrowing down the number of candidates that humans need to inspect directly.
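As a rough illustration of two ideas in the summary, the sketch below computes the four statistics for a single curve and combines per-tree labels by majority vote. It is not the code used in the project: the function names `profile_statistics` and `majority_vote`, the toy curves, the example votes, and the use of population (divide-by-n) moments are all assumptions made for this example. A real analysis would more likely rely on a library implementation of random forests.

```python
import math
from collections import Counter


def profile_statistics(values):
    """Return (mean, standard deviation, excess kurtosis, skewness) of a 1-D curve.

    Assumption: population (divide-by-n) moments are used here; the exact
    convention behind the HTRU2 features may differ.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("constant curve: skewness and kurtosis are undefined")
    std = math.sqrt(m2)
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return mean, std, excess_kurtosis, skewness


def majority_vote(tree_outputs):
    """Combine per-tree class labels into one prediction.

    Ties go to the label that appears first in tree_outputs.
    """
    return Counter(tree_outputs).most_common(1)[0][0]


# Hypothetical toy curves, not taken from HTRU2.
peaked_curve = [0, 0, 0, 0, 8, 0, 0, 0]  # one narrow peak
flat_curve = [1, 2, 1, 2, 1, 2, 1, 2]    # no dominant peak

for name, curve in [("peaked", peaked_curve), ("flat", flat_curve)]:
    mean, std, kurt, skew = profile_statistics(curve)
    print(f"{name}: mean={mean:.3f} std={std:.3f} kurtosis={kurt:.3f} skewness={skew:.3f}")

# Hypothetical outputs of three decision trees for one candidate.
print(majority_vote(["pulsar", "noise", "pulsar"]))
```

In this toy case, the curve with a single narrow peak comes out with positive skewness and positive excess kurtosis, while the flat alternating curve has zero skewness and negative excess kurtosis. The vote step shows only how a forest combines its trees; training the individual trees is not covered here.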

Kristen

Polygence mentor

PhD candidate

Subjects

Computer Science, Quantitative, Physics

Expertise

Physics, Astrophysics, Cosmology, Math, Computer Science, Machine Learning, Artificial Intelligence

Ankhita

Student

School

Eastlake High School

Graduation Year

2023

Project review

“Polygence is a great program for getting into research! The mentor they paired me with was perfect, and the platform really helped me stay on track and reach my goals.”

About my mentor

“Kristen was the best mentor throughout the entire project! She was really knowledgeable on the topic of my research, and was really helpful in guiding me and explaining difficult concepts or pointing me towards resources I could use to answer my own questions. She was very fun to talk with and be around, and I'm so grateful I got the chance to meet and work with her!”