Clayton G
- Research Program Mentor
PhD at Universität des Saarlandes
Expertise
data science, computer science, computational linguistics, natural language processing, web scraping (Twitter) and analytics, probability theory, statistics, calculus
Bio
Hi, I'm Clayton Greenberg and I was a professor in the Computer and Information Science department at the University of Pennsylvania. Students are sometimes quite surprised to learn that there is much more to computer science than programming, but this is really an opportunity in disguise. It means that you are capable and welcome to do cool computational projects even if you didn't learn how to program during kindergarten :) My favorite classes to teach provide students with the mathematical background that they need to be successful in computer science and its young cousin, data science. In my spare time, I like to troll Siri. (My Ph.D. dissertation, Evaluating Humanness in Language Models, focuses on this.) And I like to sing. I did a lot of singing in college, leading to YouTube videos that my students "discover" every semester. Perhaps it was a little too much singing and not enough studying, but this leads to my best advice for college: pick the program that fits you, rather than changing yourself to fit the program.
Project ideas
Detecting bots on Twitter
Now that computers are good enough to generate very convincing text completely on their own, people have become quite concerned about "fake news". In this project, we will investigate, in four steps, how easy it is to detect Tweets that have been written by computers: 1) Collect some data, some of it possibly labelled already as "fake". 2) Look at the statistical properties of "real" Tweets versus "fake" Tweets. 3) Write a computer program, for example a Naive Bayes classifier, for labelling new Tweets as "real" or "fake". 4) Evaluate how good the program is using a sensible metric.
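To give a feel for steps 3 and 4, here is a minimal sketch of a Naive Bayes classifier with add-one smoothing, scored with F1. Everything in it is an assumption made for illustration: the function names (tokenize, train_naive_bayes, predict, f1_score), the whitespace tokenizer, the choice of F1 as the metric, and especially the tiny hand-written "Tweets", which are invented and stand in for the real data collected in step 1.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; real Tweets need more care
    # (URLs, hashtags, mentions, emoji).
    return text.lower().split()


def train_naive_bayes(examples):
    """Count labels and per-label words from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior for the label.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing so unseen words do not zero things out.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(gold, predicted, positive="fake"):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy data, NOT real Tweets; replace with the data from step 1.
train = [
    ("click here to win a free prize now", "fake"),
    ("win free followers click this link", "fake"),
    ("free prize click now", "fake"),
    ("had a great lunch with my sister today", "real"),
    ("the train was late again this morning", "real"),
    ("watching the game with friends tonight", "real"),
]
test = [
    ("click now for a free prize", "fake"),
    ("lunch with friends this morning", "real"),
]

model = train_naive_bayes(train)
gold = [label for _, label in test]
predicted = [predict(model, text) for text, _ in test]
print("F1 for 'fake':", f1_score(gold, predicted))
```

On a toy set this small the score says nothing about real performance. With real data, it would be worth holding out a separate test set and comparing F1 against a simple baseline such as always guessing the most common label.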