Now, let us assume that we have a random variable, but we don't have access to its mathematical specification. We don't know its cumulative distribution function or its probability density function. However, we can sample from this random variable. It means that we can take its values several times, and these values are independent of each other. You can think of it this way: you can ask the computer to give you values of this random variable, or you can make several measurements of some quantity and get independent results. In this case, we can estimate what the probability density function looks like. This can be done with a histogram, which is a useful tool in data visualization. So let us discuss the relation between probability density functions and histograms.
Let us assume that we have a random variable X and we have a sample from this random variable. It means that we have numbers x_1 and so on, x_N, which are obtained as values of this random variable. How do we find the probability density function of X? We cannot do it exactly, but we can approximate this probability density function using the sample. The idea is quite simple. Let us draw a line and put the values from our sample on this line. So for example, we have x_1 here, x_2 here, x_3 here, and so on. Let me put some more points, like here and here, and that's all. Let us assume that the last one is x_N. If we look at this sample, we see that we have three points in one region and only one point in another. So it is reasonable to expect that the value of the probability density function is larger in the first region than in the second, because we have more points there. We can use this idea to draw an approximation to the graph of the probability density function. To do so, let us divide our horizontal axis into several segments. Theoretically, these segments should be small, but I don't want to draw very small segments, so I will use large ones.
Anyway, let us enumerate the endpoints of the segments: this will be a_0, this is a_1, and so on, and this is a_n. We also call this segment A_1, this one A_2, and so on, and the last one is A_n. Now we can say that A_k is the segment from a_{k-1} to a_k. Let us assume that the width of all segments is the same and equal to W. So we have W here, W here, and W here. Now, we can approximate the probability that our random variable takes a value on, for example, this segment by its relative frequency in our sample. What does it mean? First of all, let us calculate the values f_k. They are called frequencies, and f_k is just the number of points x_i that lie inside the segment A_k. For example, for this picture, f_1 is equal to 2, f_2 is equal to 3, and f_3 is equal to 1. Now, let us find the relative frequencies. The relative frequency is equal to f_k over N, where N is the overall number of points in the sample. For example, for our picture, for k equal to 1, 2, 3, we have f_k equal to 2, 3, 1, and the overall number of points is equal to 6. So the relative frequencies are 2 over 6 here, 3 over 6 here, and 1 over 6 here.
As you can see, the relative frequency is an approximation of the probability of the corresponding event. This follows from the frequency interpretation of probability: over many independent trials, the probability of an event is approximately the ratio of the number of cases in which it occurs to the number of all cases that we consider. So we have an approximate equality: the probability that X belongs to A_k is approximately f_k over N. Now, we can return to the definition of the probability density function. The probability density function at some point of the segment A_k is approximately equal to the probability that X belongs to this segment divided by the length of this segment. Replacing the probability with the relative frequency, we get f_k over N times W. Let us denote this value by h_k. So we have an approximation for the probability density function. Now we can draw an approximate graph of this function. It will be the graph of a piecewise constant function: over each segment A_k it is a constant, and this constant is equal to h_k.
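The computation of f_k, f_k over N, and h_k can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six sample values, the endpoints 0, 0.5, 1, 1.5 (so W = 0.5), and the helper name histogram_heights are all made up here, chosen so that the frequencies come out as 2, 3, 1 as in the picture.

```python
from bisect import bisect_right

def histogram_heights(sample, edges):
    """Return frequencies f_k, relative frequencies f_k/N and heights h_k.

    edges = [a_0, a_1, ..., a_n]; segment A_k runs from a_{k-1} to a_k.
    A point equal to an inner endpoint is counted in the segment to its right.
    Points outside [a_0, a_n] are not counted in any segment.
    """
    n_bins = len(edges) - 1
    freqs = [0] * n_bins
    for x in sample:
        if edges[0] <= x <= edges[-1]:
            k = min(bisect_right(edges, x) - 1, n_bins - 1)
            freqs[k] += 1
    n = len(sample)                      # N, the overall number of points
    rel = [f / n for f in freqs]         # f_k / N
    heights = [r / (edges[k + 1] - edges[k]) for k, r in enumerate(rel)]  # f_k / (N * W)
    return freqs, rel, heights

# Hypothetical sample of six points and three segments of width W = 0.5.
sample = [0.1, 0.4, 0.6, 0.7, 0.9, 1.2]
edges = [0.0, 0.5, 1.0, 1.5]

freqs, rel, heights = histogram_heights(sample, edges)
print(freqs)    # [2, 3, 1]
print(rel)
print(heights)
```

Note that the heights h_k, multiplied by W and summed, give 1, as the area under a probability density function should.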
For example, we will have some value here, some value here, and some value here. Now, the height of the third rectangle is equal to the height of the first rectangle divided by two, because we have only one point in the third segment and two points in the first. In the same way, the height of the second rectangle is equal to the height of the third rectangle multiplied by three, because we have three points in the second segment and one point in the third. So the height of each rectangle is proportional to the number of points that lie in the corresponding segment. The figure that consists of these rectangles is called the histogram associated with this set of numbers. As we discussed, the histogram approximates the graph of the probability density function. In fact, it can be proved that if we increase the number of items in the sample, that is capital N, and decrease the size of the segments, that is W, then the histogram tends to the graph of the probability density function. This allows us to use samples from random variables to estimate their probability densities. It is an important tool in the visual investigation of data and in modeling data with random variables.
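To see this convergence numerically, here is a small experiment. It is a sketch under assumptions of my own choosing: the random variable is taken from a triangular distribution on the segment from 0 to 1, because its density is known exactly, and the function names and the particular values of N and the number of segments are illustrative. The segments should shrink slowly enough that each of them still collects many points.

```python
import random

def triangular_pdf(x):
    """Exact density of the triangular distribution on [0, 1] with peak at 0.5."""
    return 4 * x if x < 0.5 else 4 * (1 - x)

def sample_heights(n_points, n_bins, rng):
    """Draw n_points values and return the histogram heights h_k = f_k / (N * W)."""
    w = 1.0 / n_bins
    counts = [0] * n_bins
    for _ in range(n_points):
        x = rng.triangular(0.0, 1.0, 0.5)
        k = min(int(x * n_bins), n_bins - 1)   # index of the segment containing x
        counts[k] += 1
    return [c / n_points / w for c in counts]

def max_error(heights):
    """Largest gap between a histogram height and the density at the segment midpoint."""
    w = 1.0 / len(heights)
    return max(abs(h - triangular_pdf((k + 0.5) * w)) for k, h in enumerate(heights))

rng = random.Random(0)  # fixed seed so that the run is reproducible
for n_points, n_bins in [(200, 5), (20_000, 20), (200_000, 50)]:
    err = max_error(sample_heights(n_points, n_bins, rng))
    print(f"N = {n_points:>7}, W = {1.0 / n_bins:.2f}, max error = {err:.3f}")
```

With more points and narrower segments, the printed error should typically get smaller, which is what the statement above predicts.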