To populate the transition matrix, you need to calculate the probability of a tag occurring, given that another tag came right before it. You also need to calculate the probability of a tag occurring at the beginning of a sentence. You will also learn about a new concept known as smoothing. Let's get started.

Begin by filling the first column of your matrix with the counts of the associated tags. Remember, the rows in the matrix represent the current states and the columns represent the next states. The values represent the transition probabilities of going from the current state to the next state. The states for this use case are the part-of-speech tags. As you can see, the defined tags and the elements in the corpus are marked with corresponding colors.

For the first column, you'll count the occurrences of the following tag combinations. As you can see here, a noun following a start token occurs once in our corpus. A noun following a noun doesn't occur at all. A noun following a verb doesn't occur either. A noun following an other tag occurs six times. The rest of the matrix is populated accordingly, but I'll take a little shortcut: the corpus, as you can see, is a verb-less haiku, so because there are no tag combinations with the tag VB, they are all 0. Unfortunately, there are no such shortcuts in the programming assignments. You've been warned.

The O tag following a start token occurs twice. The O tag following a noun tag, NN, occurs six times, and the last entry in the transition matrix, an O tag following an O tag, has a count of eight. In the last line, you have to take into account the tagged word pairs "on a", "a wet", "wet ,", and ", black" to calculate the correct counts.

Now that you have calculated the counts of all tag combinations in the matrix, you can calculate the transition probabilities. So far, you've calculated and entered the counts in the matrix, which correspond to the numerator in our formula. Now, you just have to divide each count by the corresponding row sum. Remember what this row sum represents: for the row where the current state is a noun part of speech, the sum across that row counts all pairs of words where the current state is a noun and the next state is any part of speech, whether it's a noun, a verb, or other. For the transition probability of the noun tag NN following a start token, or in other words, the initial probability of the NN tag, we divide 1 by 3. For the transition probability of a noun tag following an other tag, we divide 6 by 14.

You may have realized that there are two problems here. One is that the row sum of the VB tag is 0, which would lead to a division by 0 using this formula. The other is that lots of entries in the transition matrix are 0, meaning that these transitions will have probability 0. This won't work if you want the model to generalize to other haikus, which might actually contain verbs. To handle this, change your formula slightly by adding a small value epsilon to each of the counts in the numerator, and add N times epsilon to the denominator, where N is the number of tags, so that each row still sums to one. This operation is also referred to as smoothing, which you might remember from previous lessons. If you substitute epsilon with a small value, say 0.001, then you'll get the following transition matrix. The values shown here are rounded to three decimal digits, so don't worry if the row sums don't add up to one exactly.
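To make the counting and smoothing steps concrete, here is a minimal Python sketch of the computation described above. The tiny two-sentence corpus, the "--s--" start-token name, and the variable names are illustrative assumptions; the tag set {NN, VB, O} and epsilon = 0.001 follow the values used in this lesson.

```python
from collections import defaultdict

tags = ["NN", "VB", "O"]        # part-of-speech states (columns of the matrix)
epsilon = 0.001                 # smoothing constant
N = len(tags)

# Hypothetical tagged corpus: a list of sentences, each a list of (word, tag) pairs.
corpus = [
    [("pale", "O"), ("moon", "NN"), ("rises", "VB")],
    [("over", "O"), ("the", "O"), ("hill", "NN")],
]

# Count tag-to-tag transitions, using "--s--" as the start token of each sentence.
transition_counts = defaultdict(int)
for sentence in corpus:
    prev_tag = "--s--"
    for _, tag in sentence:
        transition_counts[(prev_tag, tag)] += 1
        prev_tag = tag

# Smoothed transition probability:
#   P(tag | prev) = (C(prev, tag) + epsilon) / (C(prev) + N * epsilon)
A = {}
for prev in ["--s--"] + tags:
    row_sum = sum(transition_counts[(prev, t)] for t in tags)
    for t in tags:
        A[(prev, t)] = (transition_counts[(prev, t)] + epsilon) / (row_sum + N * epsilon)
```

Note that in this toy corpus VB never appears as a current state, so its row sum is zero and smoothing alone determines its outgoing probabilities, each equal to one-third, which mirrors the situation in the haiku example.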
The result of smoothing, as you can see, is that you no longer have any zero-valued entries in A. Further, since the transition probabilities from the VB state are all one-third, its outgoing transitions are equally likely. That's reasonable, since you didn't have any data to estimate these transition probabilities. One more thing before you go: in a real-world example, you might not want to apply smoothing to the initial probabilities in the first row of the transition matrix. That's because if you apply smoothing to that row by adding a small value to possibly zero-valued entries, you'll effectively allow a sentence to start with any part-of-speech tag, including punctuation. You just learned about smoothing and why it is important. This is great. In the next video, we will move on and see how you can populate another type of matrix, known as the emission matrix.
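As a brief aside on the point about the first row, here is a minimal sketch of computing the initial probabilities without smoothing, using the start-token counts from this haiku (NN once, O twice, VB never). The dictionary and variable names are illustrative assumptions.

```python
tags = ["NN", "VB", "O"]
start_counts = {"NN": 1, "VB": 0, "O": 2}   # C(--s--, tag) from the counting step

# Leave the start row unsmoothed, so sentences can only begin with tags that were
# actually observed at sentence starts; fall back to uniform only if the row is empty.
start_row_sum = sum(start_counts.values())
initial_probs = {
    tag: (start_counts[tag] / start_row_sum) if start_row_sum > 0 else 1.0 / len(tags)
    for tag in tags
}
# Here NN gets 1/3, O gets 2/3, and VB stays at 0, so no sentence starts with a verb.
```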