About This Project

Team

Edit: Thanks for the gold kind stranger

Class

Texas A&M University CSCE 489 with Dr. Caverlee

Team Members

Overview and Motivation

In our project, we analyzed Reddit comments from January through May 2015 to build a classifier that predicts whether a comment will be gilded.

Our motivation was to predict whether a comment would be gilded when submitted to a specific subreddit. With a gilding probability for each comment, we could create a new sorting algorithm that orders the comments of a post by how likely they are to be gilded rather than by the original score-based sorting.
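As a rough illustration of that sorting idea, the sketch below orders a post's comments by a predicted gilding probability. The function name `sort_by_gild_probability` and the `predict_proba` callable are hypothetical stand-ins for this example, not names from our code base; any model that maps a comment to a probability between 0 and 1 could be plugged in.

```python
from typing import Callable, Dict, List

Comment = Dict[str, object]


def sort_by_gild_probability(
    comments: List[Comment],
    predict_proba: Callable[[Comment], float],
) -> List[Comment]:
    """Return the comments ordered from most to least likely to be gilded.

    `predict_proba` is an assumed callable that returns the estimated
    probability (0.0 to 1.0) that a single comment will be gilded.
    """
    # sorted() is stable, so comments with equal probability keep
    # their original relative order.
    return sorted(comments, key=predict_proba, reverse=True)
```

A caller would pass the comments of a single post together with whatever probability function the trained classifier exposes.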

Initial Questions

We originally wanted to analyze which features (such as word count or time of posting) make a comment gild-worthy. We expected these features to differ between subreddits, and we wanted to see how.
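To make the kind of features we had in mind concrete, here is a minimal sketch that derives word count and hour of posting from a single comment record. It assumes each comment is a dictionary with `body`, `created_utc`, and `subreddit` keys, as in the public Reddit comment dumps; the function name `extract_features` and the exact feature set are illustrative assumptions, not the final features of our classifier.

```python
from datetime import datetime, timezone
from typing import Dict


def extract_features(comment: Dict[str, object]) -> Dict[str, object]:
    """Compute a few simple features from one comment record.

    Assumes `body` is the comment text, `created_utc` is a Unix
    timestamp in seconds, and `subreddit` is the subreddit name.
    """
    body = str(comment["body"])
    posted = datetime.fromtimestamp(int(comment["created_utc"]), tz=timezone.utc)
    return {
        # Number of whitespace-separated tokens in the comment text.
        "word_count": len(body.split()),
        # Hour of posting in UTC (0 to 23).
        "hour_of_day": posted.hour,
        # Kept so that features can be compared between subreddits.
        "subreddit": comment["subreddit"],
    }
```

Grouping the resulting feature dictionaries by `subreddit` is one way to compare how gilded comments differ from one community to another.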

Final Project

For our final project, we decided to build a Django web server that takes in a comment and returns a boolean value indicating whether our classifier believes it will be gilded. Our original idea was to classify a completely new comment that had no score associated with it. However, predicting a comment's score to then use in our classifier not only proved inaccurate but was also a completely different problem that would require much more time to tackle. In the end, we decided to let the users of our application supply the expected score of their comment.
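The request flow described above can be sketched in plain Python, without the Django layer. Everything here is an assumption made for illustration: the JSON field names `body` and `expected_score`, the function `handle_prediction_request`, and especially `toy_classifier`, which is a placeholder rule and not our trained model.

```python
import json
from typing import Callable


def handle_prediction_request(
    raw_json: str,
    classifier: Callable[[str, int], bool],
) -> str:
    """Parse a request body, run the classifier, and return a JSON reply.

    The request is assumed to carry the comment text in `body` and the
    user-supplied expected score in `expected_score`.
    """
    payload = json.loads(raw_json)
    body = str(payload["body"])
    expected_score = int(payload["expected_score"])
    will_be_gilded = bool(classifier(body, expected_score))
    return json.dumps({"gilded": will_be_gilded})


def toy_classifier(body: str, expected_score: int) -> bool:
    """Placeholder rule standing in for the real trained classifier."""
    # Hypothetical thresholds chosen only so the example runs.
    return expected_score >= 1000 and len(body.split()) >= 20
```

In a Django project, a view would read the request body, call a function like this one, and wrap the returned string in an HTTP response.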