The next task is exploratory data analysis. The basic goal here is to understand the statistical properties of the data set so that you can build a good prediction model down the road. Two key questions to consider are: how frequently do certain words appear in the data set, and how frequently do certain pairs of words appear together? Once you've considered these basic questions, you can move on to more complex ones, like how frequently triplets of words appear together.

Now, before we start digging around the data set, I'd encourage you to take a moment and do a little bit of thinking. Thinking is important because it helps you build expectations about the data set, and when you have expectations, you can tell when certain features or observations are unexpected. These unexpected things might be errors or anomalies in the data set, or they might be really interesting features that you have to account for. Your initial expectations might be wrong, of course, but as you look at the data you can calibrate and refine them as you go.

The key thing is that when you do not have any expectations about a data set, everything you see will seem correct, and it can be difficult to sort out what's useful and what's not. So this is a key task in developing our prediction model for this project, and I encourage you to take the time to get to know your data really well.
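To make the questions about single words, pairs, and triplets concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample sentence is made up, the helper names `tokenize` and `ngram_counts` are hypothetical, and the tokenizer is deliberately crude (lowercase letters and apostrophes only). A real pass over the project data would likely need more careful cleaning.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes.

    This is a simplifying assumption, not a full tokenizer: digits,
    punctuation, and non-ASCII letters are dropped.
    """
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens (n=1 words, n=2 pairs, ...)."""
    # zip over n shifted copies of the token list yields each n-gram as a tuple
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Made-up sample text, standing in for the real data set
sample = "the cat sat on the mat and the cat slept"
tokens = tokenize(sample)

word_freq = ngram_counts(tokens, 1)    # how often each word appears
pair_freq = ngram_counts(tokens, 2)    # how often each pair appears together
triple_freq = ngram_counts(tokens, 3)  # how often each triplet appears together

# Show the most frequent items at each level
print(word_freq.most_common(3))
print(pair_freq.most_common(3))
print(triple_freq.most_common(3))
```

Before running something like this on the real data, it helps to write down what you expect the most common words and pairs to be. Comparing that guess against the counts is a quick way to spot either errors in the data or features worth accounting for in the model.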