Initial Commit wrote an interesting post examining the most popular Git commit messages. Their analysis used Google BigQuery, a cloud data warehouse that hosts a number of public datasets. From the GitHub repository data, they selected a subset and analyzed it to find the 20 most common initial commit messages and how often each one appears.
You can read the full post on the blog at Initial Commit.
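To make the idea concrete, here is a minimal sketch of the same kind of frequency count, done locally in Python instead of in BigQuery. It is not the query Initial Commit ran. The function name `top_commit_messages`, the sample messages, and the choice to trim whitespace and lowercase each message before counting are all assumptions made for illustration.

```python
from collections import Counter


def top_commit_messages(messages, n=20):
    """Return the n most frequent messages as (message, count) pairs.

    Normalization (strip + lowercase) is an assumption of this sketch,
    not something taken from the original analysis.
    """
    normalized = (m.strip().lower() for m in messages if m and m.strip())
    return Counter(normalized).most_common(n)


if __name__ == "__main__":
    # Made-up sample data, standing in for messages pulled from a dataset.
    sample = [
        "Initial commit",
        "initial commit",
        "first commit",
        "Initial commit ",
        "init",
    ]
    for message, count in top_commit_messages(sample, n=3):
        print(f"{count:>3}  {message}")
```

On a real dataset the counting would happen inside the warehouse with a grouped query, but the logic is the same: normalize, group, count, and keep the top N.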
I thought it was a really cool use of the public datasets provided by Google BigQuery, and it made me wonder what other interesting analysis could be done with all of GitHub's public repositories. Are there any other interesting datasets available in Google BigQuery?