One of my colleagues was recently asked by his young son what epidemiologists do. His response was that they count sick people. But how do we count sick people? At the end of this lecture, you will know how this is done. You'll be able to define the population-at-risk, calculate measures of prevalence and incidence, and know the difference between cohorts and dynamic populations.

Before we can start counting, we need to consider who our population of interest is. It should only comprise those who are able to develop the disease of interest. For example, if we intend to measure the frequency of ovarian cancer, the population-at-risk should only include women. Furthermore, keeping in mind that most women who develop ovarian cancer are diagnosed after menopause, we might consider excluding women younger than 50 years of age.

There are two types of frequency measures: prevalence and incidence. Prevalence measures are static, in the sense that they describe the presence of a disease at a certain moment in time. In contrast, incidence measures are dynamic, focusing on newly occurring cases of disease as time goes by. Of course, these measures need to take into account the size of the population and, in the case of incidence measures, also the time that has elapsed.

Prevalence is the proportion of a population that has the disease at a specific point in time. It may represent a snapshot of a population at a single moment, or it may describe how many people had the disease during a longer period of time, for example a month or even a year. Many of the numbers you encounter in the news will be measures of prevalence. For example, according to the World Health Organization, about 40 percent of the adult world population was overweight in 2016 and 13 percent was obese. These are prevalences: they describe states or disease outcomes which have already occurred. They help us to evaluate disease burden at a given time in a specific population. This could be a country, a city, or even a neighbourhood.
To calculate prevalence, we divide the number of cases that existed in a given period or on a specific date by the number of people in the population-at-risk during the same period or on the same date. This gives us the period prevalence or the point prevalence, respectively.

Incidence measures come in two flavours. First, we have the cumulative incidence, which is often referred to as risk. Formally, it is the probability of developing disease over a certain period of time. Here you see an arrow denoting the time period of interest. Say that we follow 20 people during this time period. If four out of the 20 develop the disease of interest, the cumulative incidence would be 20 percent. In other words, the risk of disease occurring is calculated by dividing the number of subjects developing disease during a time period by the number of subjects followed during the same time period. Intuitively, we understand that knowing the length of this time period is necessary to interpret risks. For example, my own risk of dying might be quite low during the next hour, but will rise to nearly 100 percent when a century of follow-up is considered. It is simple to calculate the cumulative incidence when all individuals are followed for the full observational period. However, people are often followed up for different periods of time, and then it becomes more difficult.

The golden measure of disease occurrence is the incidence rate. This second measure of incidence does not depend on the length of follow-up. The main difference from the cumulative incidence is that we do not refer to the number of people in the population-at-risk but to their aggregated follow-up time. This is known as person-time, often expressed as person-years. Specifically, we need to know how long each individual was followed. The incidence rate is then the number of subjects developing disease divided by the summed person-time of the population-at-risk.
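As a rough sketch, the three calculations described so far could be written as small Python functions. The function names are illustrative assumptions, and the prevalence figures and the five follow-up times are made-up values that do not come from the lecture; only the 4-out-of-20 example does.

```python
def prevalence(existing_cases, population_at_risk):
    """Proportion of the population-at-risk that has the disease
    on a given date (point) or during a given period (period)."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return existing_cases / population_at_risk


def cumulative_incidence(new_cases, people_followed):
    """Risk: new cases divided by the number of people followed,
    assuming everyone is followed for the full period."""
    if people_followed <= 0:
        raise ValueError("people_followed must be positive")
    return new_cases / people_followed


def incidence_rate(new_cases, follow_up_times):
    """New cases divided by the summed follow-up time (person-time)."""
    person_time = sum(follow_up_times)
    if person_time <= 0:
        raise ValueError("total person-time must be positive")
    return new_cases / person_time


# Lecture example: 4 of the 20 people followed develop the disease.
risk = cumulative_incidence(4, 20)
print(f"cumulative incidence: {risk:.0%}")

# Hypothetical: 150 existing cases among 3,000 people on a single date.
point_prev = prevalence(150, 3000)
print(f"point prevalence: {point_prev:.1%}")

# Hypothetical: 5 people followed for different lengths of time (in years),
# 2 of whom develop the disease.
years_followed = [2.0, 1.5, 1.0, 0.5, 1.0]
rate = incidence_rate(2, years_followed)
print(f"incidence rate: {rate:.2f} cases per person-year")
```

Note that the cumulative incidence only means something together with the length of the period it refers to, whereas the incidence rate carries time in its unit.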
The essence of person-time is that 10 people followed for one year contribute the same number of person-years as five people followed for two years. Here you see 20 people again. However, at the end of this one-year study, we note that some people have only recently entered the study while others have been in it from the start. It is therefore only fair that we take into account that they contribute different periods of follow-up. If we add up the follow-up time of all 20, we might observe that their combined person-time is 11 years. If four people developed the disease of interest, the incidence rate would be 4 per 11 person-years.

As I said earlier, the cumulative incidence depends on the follow-up time. For mortality, it will reach 100 percent with a very long follow-up. The incidence rate, however, does not depend on the duration of follow-up, and it will remain the same for a short and for a long follow-up. Hence, it is the golden frequency measure. As you may expect, the incidence rate and the cumulative incidence will be numerically similar for short observation windows, but for longer periods of time they are likely to diverge. They are, however, fundamentally different in principle, which is reflected in their units: the cumulative incidence has no unit, whereas the incidence rate has the unit one over time. Nonetheless, these formulas show that conversion between the two remains possible.

When calculating measures of incidence, we may be dealing with either cohorts, which consist of the same people, whose number decreases over time, or with dynamic populations, in which the members vary over time. We have already discussed cohorts in a previous lecture on cohort studies. You might visualize a dynamic population as a bathtub with an open drain, which is simultaneously being filled through an open faucet. The water level may go up or down, or even remain constant when the faucet is set exactly right. This last situation is called a stationary dynamic population.
The water molecules are never the same, but the volume is. In population health, we often work with dynamic populations. An example might be the population of a city like The Hague. The membership of the city's population is not fixed, because of births, deaths, and people moving to or from the city. But under stationary conditions, its composition overall and by sex or age remains the same. This contrasts with a cohort, which is a fixed population. In cohorts, we can calculate both cumulative incidence and incidence rates. However, in dynamic populations, directly calculating the cumulative incidence is impossible, as we don't observe the same individuals for the full observational period. In a dynamic population, the importance of the concept of person-time becomes immediately clear. For example, when there are 500,000 people living in a city, they contribute 500,000 person-years every year. If in that year 500 individuals experience bicycle accidents, the incidence rate would be one per 1,000 person-years.

Finally, a general rule of thumb is that the prevalence equals the product of the incidence rate and the mean duration of disease. A chronic disease, for instance, may have a low incidence rate, but its prevalence may still be high when few die or are cured. For example, compare lung cancer, which has a high incidence but a low survival rate and therefore a low prevalence, with prostate cancer, which has a relatively high survival rate and therefore a high prevalence.

To summarize, different measures of disease frequency can be used to describe how often a disease occurs in a population. To do so, we need to define the population-at-risk of interest and combine the information on the number of cases of disease, the size of the population-at-risk, and the time that the population-at-risk was followed. In the next lecture, we will discuss different ways of comparing frequency measures between groups of individuals.
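To make the arithmetic in this lecture concrete, here is a minimal Python sketch of the bicycle-accident rate, the link between rate and risk, and the prevalence rule of thumb. The conversion formulas shown on the slide are not reproduced in this transcript, so the sketch assumes the standard relation for a constant incidence rate, risk = 1 - exp(-rate × time), which may differ from what was shown. The rule of thumb is only approximate and fits best for a stationary population with a low prevalence. The function names and the two "diseases" at the end are hypothetical.

```python
import math


def incidence_rate(new_cases, person_time):
    """New cases per unit of person-time."""
    return new_cases / person_time


def risk_from_rate(rate, duration):
    """Cumulative incidence implied by a rate over a period,
    ASSUMING the rate stays constant over that period."""
    return 1.0 - math.exp(-rate * duration)


def prevalence_from_rate(rate, mean_duration):
    """Rule of thumb: prevalence ~ incidence rate x mean disease duration."""
    return rate * mean_duration


# Lecture example: 500 bicycle accidents in a city of 500,000 people in one year.
city_rate = incidence_rate(500, 500_000)
print(f"{city_rate * 1000:.1f} accidents per 1,000 person-years")

# Lecture example: 4 cases in 11 person-years.
study_rate = incidence_rate(4, 11)

# Short windows: rate x time and risk are close. Long windows: they diverge,
# because a risk can never exceed 1 while rate x time keeps growing.
for years in (0.1, 1, 10):
    approx = study_rate * years
    risk = risk_from_rate(study_rate, years)
    print(f"{years:>4} years: rate x time = {approx:.3f}, risk = {risk:.3f}")

# Hypothetical diseases with the same incidence rate (per person-year)
# but a different mean duration (in years).
same_rate = 0.002
short_lived = prevalence_from_rate(same_rate, 0.5)
long_lasting = prevalence_from_rate(same_rate, 10)
print(f"short duration: {short_lived:.3%}, long duration: {long_lasting:.3%}")
```

The last two lines mirror the lung cancer and prostate cancer comparison: with a similar incidence rate, the disease that lasts longer ends up with the higher prevalence.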