As we have already discussed, bias in epidemiological studies can be roughly classified into three main categories: selection bias, information bias, and confounding. In this lecture, we'll focus our attention on information bias. Much like selection bias, information bias has many different names and subcategories. This shouldn't be confusing though. The core principle of information bias is rather simple. For a number of reasons, some of which we will discuss later, the exposure, the disease status, or both are misclassified.
Let's consider an example of a case-control study which aims to look at a potential association between smoking and lung cancer. Regarding exposure, we would obviously need to assess whether participants were smokers or not and how much they smoked. We would also need to classify people as having lung cancer or not, as this is the outcome of interest. Both exposure and outcome could be misclassified. For instance, some heavy smokers may be erroneously classified as light smokers, or some lung cancer patients may not receive the correct diagnosis. Usually this happens either because the study variables are not properly defined or because of flaws in data collection.
Let's examine some of these flaws more closely. One common flaw in data collection occurs when interviewers ask individuals about their exposure status. In our example, interviewers would ask individuals with and without lung cancer whether they have smoked. But the interviewers might be more thorough in assessing past smoking when interviewing people who have been diagnosed with lung cancer, exactly because they expect that lung cancer patients are likely to have been smokers. This would lead to misclassification of exposure status and eventually to a biased odds ratio. This type of information bias is called interviewer bias.
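
To make the interviewer scenario above concrete, here is a minimal sketch with made-up numbers. The counts, the assumption that interviewers detect every smoker among cases but miss 10 of the 50 smokers among controls, and the helper name `odds_ratio` are all hypothetical and chosen only for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from the four cells of a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical true exposure status (illustrative numbers, not real data).
cases_smokers, cases_non_smokers = 80, 20
controls_smokers, controls_non_smokers = 50, 50

true_or = odds_ratio(cases_smokers, cases_non_smokers,
                     controls_smokers, controls_non_smokers)

# Assumption: thorough interviews find every smoker among cases, while
# less thorough interviews record 10 control smokers as non-smokers.
missed_control_smokers = 10
recorded_controls_smokers = controls_smokers - missed_control_smokers
recorded_controls_non_smokers = controls_non_smokers + missed_control_smokers

biased_or = odds_ratio(cases_smokers, cases_non_smokers,
                       recorded_controls_smokers, recorded_controls_non_smokers)

print(f"True OR:     {true_or:.2f}")
print(f"Observed OR: {biased_or:.2f}")
```

With these invented counts the true odds ratio is 4.0, but the recorded data give 6.0, so the uneven thoroughness of the interviews exaggerates the association.
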
Luckily, this can be prevented if the interviewer does not know the disease status of the individual or if the collection process has been carefully standardised, so that interviewers follow a strictly defined protocol when they collect data from participants.
However, interviewers are not the only potential source of information bias. When patients with lung cancer are asked to report whether they have smoked in the past, they might be more likely to recall a brief period of smoking a long time ago compared to those who don't have lung cancer. This is not unexpected. Our memory is not perfect and we often forget things that have happened in the past. But when we get sick, we try hard to remember any details that could be linked to our disease, details that we would otherwise forget. This phenomenon is called recall bias and is a common type of information bias. We can prevent it by using objective ways to assess exposure, such as medical records or biomarkers.
I should highlight that recall bias specifically refers to the differentially inaccurate recall of past exposure between cases and controls. When all the participants have trouble remembering their exposure status, but this has nothing to do with their disease, there's no recall bias. This is a principle that can be generalised: when exposure status is misclassified, but equally so among cases and controls, we speak of non-differential misclassification. The same term applies when there are errors in determining the outcome, but they occur equally among exposed and non-exposed individuals. When non-differential misclassification of a binary exposure or outcome occurs, the odds ratio we obtain is, on average, biased towards the null. In contrast, misclassification is differential when errors in determining an individual's exposure status occur unevenly among cases and controls, or when there are errors in the diagnosis of the disease which occur unevenly among the exposed and non-exposed individuals.
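
The difference between non-differential and differential misclassification of exposure can be sketched with expected counts. Everything here is an illustrative assumption: the true counts, the sensitivity values (the probability that a true smoker is recorded as a smoker), perfect specificity by default, and the function names `misclassify` and `observed_or`.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected recorded (exposed, unexposed) counts after exposure misclassification."""
    recorded_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    recorded_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return recorded_exposed, recorded_unexposed

def observed_or(cases, controls, case_sens, control_sens, specificity=1.0):
    """Odds ratio computed from the recorded, possibly misclassified, counts."""
    a, b = misclassify(cases[0], cases[1], case_sens, specificity)
    c, d = misclassify(controls[0], controls[1], control_sens, specificity)
    return (a * d) / (b * c)

# Hypothetical true counts as (smokers, non-smokers); not real data.
cases = (80, 20)
controls = (50, 50)

true_or = observed_or(cases, controls, 1.0, 1.0)

# Non-differential: everyone under-reports smoking to the same degree.
non_differential = observed_or(cases, controls, 0.8, 0.8)

# Differential, recall-bias pattern: cases remember better than controls.
cases_recall_better = observed_or(cases, controls, 0.95, 0.7)

# Differential, opposite pattern (assumed): cases under-report more than controls.
cases_recall_worse = observed_or(cases, controls, 0.6, 0.9)

for label, value in [("true", true_or),
                     ("non-differential", non_differential),
                     ("cases recall better", cases_recall_better),
                     ("cases recall worse", cases_recall_worse)]:
    print(f"{label:>20}: OR = {value:.2f}")
```

Under these assumptions the true odds ratio of 4.0 shrinks to roughly 2.7 when the error is the same in both groups, rises to roughly 5.9 when cases recall better, and falls to roughly 1.1 when cases recall worse. The non-differential error pulls the estimate towards 1, while the differential errors move it in either direction depending on which group is misclassified more.
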
Differential misclassification also leads to a biased estimate, but we cannot predict whether it is biased towards or away from the null.
As you have seen, on all these occasions, there is information bias that could lead to a biased estimate. You should now be familiar with how these types of bias can influence the results of your study and with ways to prevent them. Together with confounding, which I will explain later in this module, the broad categories of selection and information bias can explain essentially all the issues that could undermine the validity of a study.