Reddit data — Graduate School talk

This is the first post in a series about what was posted to Reddit in 2013. Each post in the series searches through every single post made to Reddit that year: over 50 GB of data and more than 39,000,000 posts!

For this post, I searched the “title” and the “self-text” of every post made to any subreddit on each day of 2013 for words related to graduate school (including law and medical school). The list of keywords that counted as a positive match is at the end of this post.
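The per-day counting described above can be sketched roughly as follows. This is only an illustration, not the code actually used for this post: it assumes each post is a dict with `created_utc` (seconds since the Unix epoch), `title`, and `selftext` fields, and the function name `daily_match_counts` and the `is_match` predicate are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

def daily_match_counts(posts, is_match):
    """Count posts per UTC calendar day whose title or self-text matches."""
    counts = Counter()
    for post in posts:
        # Treat a missing or null field as empty text.
        text = f"{post.get('title') or ''} {post.get('selftext') or ''}"
        if is_match(text):
            day = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc).date()
            counts[day] += 1
    return counts

# Toy records standing in for the real data (assumed field names).
posts = [
    {"created_utc": 1356998400, "title": "Applying to grad school", "selftext": ""},
    {"created_utc": 1357002000, "title": "Cat picture", "selftext": None},
    {"created_utc": 1357084800, "title": "Advice?", "selftext": "Is law school worth it?"},
]

# A deliberately tiny stand-in predicate covering two phrases only.
def toy_match(text):
    lowered = text.lower()
    return "grad school" in lowered or "law school" in lowered

counts = daily_match_counts(posts, toy_match)
```

Days are bucketed in UTC here; whether the original analysis used UTC or a local time zone is not something the sketch can settle.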


Posts made to Reddit in 2013 containing words related to graduate school. One data point per day; the red line is a 7-point moving average.

Maybe not as telling as I had expected: there’s a ton of variance from day to day and week to week. The most obvious feature is the spike in graduate-school-related posts in April, following a steady increase through March. I suspect this is because that is the time of year when a lot of acceptance decisions come out.

Normalizing the data against the total number of posts for each day is not any more telling; the profile is just stretched a bit in the y-direction. The 7-point moving average is an attempt to remove the weekly periodicity of Reddit posting.
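To make the smoothing and normalization concrete, here is a minimal sketch. It assumes a centered window (the plot may just as well have used a trailing one), and the function names `moving_average` and `normalize` are illustrative only.

```python
def moving_average(values, window=7):
    """Centered moving average for an odd window size.

    Positions without a full window on both sides are left as None.
    """
    half = window // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        smoothed[i] = sum(values[i - half : i + half + 1]) / window
    return smoothed

def normalize(matches, totals):
    """Fraction of each day's posts that matched (0.0 on days with no posts)."""
    return [m / t if t else 0.0 for m, t in zip(matches, totals)]

# Three weeks of made-up counts with a weekend dip.
weekly = [10, 10, 10, 10, 10, 3, 3] * 3
smoothed = moving_average(weekly)

fractions = normalize([2, 0, 5], [100, 0, 50])
```

Because every 7-day window in the toy series contains exactly one full week, the smoothed values are all identical and the weekend dip disappears, which is the effect the 7-point window is meant to have on the real data.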

The key used was:

["grad school", "graduate school", "master's", "masters", "phd", "gre", "letter of recommendation", "letters of recommendation", "doctorate", "law school", "med school", "medical school", "transcript", "undergraduate gpa", "undergrad gpa"]

There are of course many more keywords that could have been used, but many of them have multiple meanings, so this list was chosen to minimize false positives.
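One way such a key could be applied is sketched below. The case-insensitive, whole-word matching is an assumption made for this example, not necessarily how the matching for this post was done; the name `mentions_grad_school` is hypothetical.

```python
import re

KEYWORDS = [
    "grad school", "graduate school", "master's", "masters", "phd", "gre",
    "letter of recommendation", "letters of recommendation", "doctorate",
    "law school", "med school", "medical school", "transcript",
    "undergraduate gpa", "undergrad gpa",
]

# Assumption: match whole words only, so that "gre" does not fire on
# words such as "agree" or "great".
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def mentions_grad_school(text):
    """Return True if the text contains any keyword as a whole word or phrase."""
    return _PATTERN.search(text) is not None
```

Whole-word matching trades some recall for precision: "transcripts" would no longer count as a match for "transcript", and an ambiguous word like "masters" can still produce false positives.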