HANDS-ON COURSE: EPIDEMIOLOGY

BIAS IN NEPHROLOGICAL STUDIES: WHAT IT IS AND HOW TO DEAL WITH IT

Carmine Zoccali, Reggio Calabria, Italy

 

Prof C. Zoccali
Unità di Nefrologia, Dialisi, Trapianto e Ipertensione
OO.RR and CNR Istituto di Biomedicina
Laboratorio di Epidemiologia Clinica e Fisiopatologia delle Malattie Renali e dell'Ipertensione
Reggio Calabria, Italy

Slide 1

Don't be afraid that we will keep building on numbers: from the first presentation to the second there has been a steady, well-presented increase in complexity. What I am going to present now is simpler, so rest assured that you will be able to get to the end of the session.

My topic today is bias and, of course, I will try to illustrate it using studies in nephrology. In clinical practice and in clinical research we aim at producing valid studies.

Slide 2

What does a valid study mean? Valid studies are those that avoid systematic errors, and this is achieved by adequate design and by conducting the study properly because, I reiterate, bias is the tendency towards erroneous results.

Slide 3

David Sackett, the founding father of clinical epidemiology (he doesn't like the term evidence-based medicine), has described more than 60 types of bias. With his armoured car he tried to disintegrate them. Now, for the sake of compactness I will group the types of bias into 4 groups: selection bias, information bias, selection-information bias, which is a combination of the first two types, and finally publication bias. I will go through these quite quickly.

Slide 4

Let's start with selection bias. You are all familiar with case-control studies. What do we do in case-control studies? We separate patients who have a disease, the cases, from those who have no disease, the controls. We then check exposure retrospectively and identify patients who have been exposed and those who remained unexposed. We put the numbers in a 2 by 2 table. Here it is fundamental to avoid mistakes in the identification of cases and controls, because the identification of cases should be independent of the exposure status. I will give a medical example in a minute, but be aware that if you are wrong in the identification of cases or controls, if you select them badly, you will get bad results.

Slide 5

Prospective cohort studies. In prospective cohort studies we identify the groups on the basis of exposure: the exposed group, that is, a group of individuals who have been exposed to toxic substances or to any purported risk factor for disease, let's say smoking, and a group of unexposed individuals. What we do now is follow them along the time line, and during the follow-up we identify the patients who develop the disease and, of course, also those who remain non-diseased. We put these results in the standard 2 by 2 table. What is fundamental here is to avoid a systematic error in the ascertainment of exposed and unexposed subjects. But let's go into a practical example. I will give an example focused on a case-control study, but you can envisage a similar example in a follow-up study, a prospective cohort study.

Slide 6

Let's first of all imagine the ideal study. The ideal study is the study where you have assessed the exposure status of cases and controls in the whole population. So in this case you have what epidemiologists call complete ascertainment in the reference population.

Now, let's have 1000 cases and 9000 controls. Imagine that the numbers of exposed and unexposed persons among cases are equal, 500 and 500. Among controls the situation is different. Imagine that one fifth of the controls are exposed, 1800, and four fifths remain unexposed, 7200. Let's now calculate the odds of exposure. The odds of exposure in cases is 1, 500 divided by 500. In the controls it is 0.25, 1800 divided by 7200, that is 1 to 4. Is it ok? Now let's calculate the odds ratio: the odds ratio is 4. This is the ideal population, but in the real world this kind of full ascertainment is practically impossible and usually we select some cases and some controls. Usually, the proportion of cases selected is higher than the proportion of controls selected.
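The arithmetic of the ideal population can be checked with a few lines of Python. This is only a sketch: the function names `odds` and `odds_ratio` are chosen here for illustration and do not come from the talk.

```python
def odds(exposed, unexposed):
    """Odds of exposure: number exposed divided by number unexposed."""
    return exposed / unexposed


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio from the four cells of a 2 by 2 table."""
    return odds(case_exp, case_unexp) / odds(ctrl_exp, ctrl_unexp)


# Whole reference population from the example (complete ascertainment).
cases_odds = odds(500, 500)        # 500 exposed and 500 unexposed cases
controls_odds = odds(1800, 7200)   # 1800 exposed and 7200 unexposed controls
true_or = odds_ratio(500, 500, 1800, 7200)

print(cases_odds, controls_odds, true_or)  # 1.0 0.25 4.0
```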

Slide 7

Are you following? Imagine that we select 50% of cases and 10% of controls. So we have to change the bottom line: 50% of 1000 is 500, and 10% of 9000 is 900. This is the subsample that we have drawn.

Slide 8

Now, let's imagine that we have sampled cases and controls without any bias, that is, we have kept the identification of cases completely independent of the exposure status. We then expect to have 250 exposed cases and 250 unexposed cases. The odds of exposure remains exactly the same, 1. For controls, if we have done exactly the same, we will have 180 exposed controls and 720 unexposed controls. Again the odds of exposure remains exactly the same, 0.25. Of course, the odds ratio will also remain exactly the same. So unbiased sampling has produced an unchanged final odds ratio. But let's go to the bad example.
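A short sketch of the unbiased sampling step, using the 50% and 10% sampling fractions of the example. The helper `sample_table` is a hypothetical name; it simply applies one fraction to both case cells and another to both control cells, which is what "independent of exposure" means here.

```python
def sample_table(table, case_percent, control_percent):
    """Sample cases and controls with fixed percentages, ignoring exposure.

    table is (exposed cases, unexposed cases, exposed controls, unexposed controls).
    """
    case_exp, case_unexp, ctrl_exp, ctrl_unexp = table
    return (
        case_exp * case_percent // 100,
        case_unexp * case_percent // 100,
        ctrl_exp * control_percent // 100,
        ctrl_unexp * control_percent // 100,
    )


def odds_ratio(table):
    """Odds ratio from the four cells of a 2 by 2 table."""
    case_exp, case_unexp, ctrl_exp, ctrl_unexp = table
    return (case_exp / case_unexp) / (ctrl_exp / ctrl_unexp)


population = (500, 500, 1800, 7200)
sample = sample_table(population, case_percent=50, control_percent=10)

print(sample)                                      # (250, 250, 180, 720)
print(odds_ratio(population), odds_ratio(sample))  # 4.0 4.0
```

The sampling fractions cancel out of the ratio, which is why cases and controls can be sampled at very different rates without harm, as long as the rate does not depend on exposure.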

Slide 9

Imagine that the investigator drew a biased sample. He was not aware of it, but the bias was created by the fact that the general practitioner who helped in recruiting the cases selected cases on the basis of exposure. Let's imagine that due to this bias the split between exposed and unexposed individuals among cases was not half and half but 60-40. So we have to recalculate. 60% of 500 is 300. 40% of 500 is 200. Let's recalculate the odds of exposure. This time it's 1.5. With controls there is no problem, their odds remains 0.25, so we can calculate the biased odds ratio, which is 6. As you remember, 6 is higher than the true odds ratio, which was 4. So the differential bias that we have seen distorts the magnitude of the association between, let's say, hydrocarbon exposure and the generation of nephrotic syndrome or renal disease in general.
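The size of the distortion can be traced back to the selection fractions in each cell of the table. The sketch below uses the numbers of the example; the name `bias_factor` is introduced here for illustration and is not a term used in the talk.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio from the four cells of a 2 by 2 table."""
    return (case_exp / case_unexp) / (ctrl_exp / ctrl_unexp)


# Reference population and the biased sample from the example.
population = {"case_exp": 500, "case_unexp": 500, "ctrl_exp": 1800, "ctrl_unexp": 7200}
biased_sample = {"case_exp": 300, "case_unexp": 200, "ctrl_exp": 180, "ctrl_unexp": 720}

true_or = odds_ratio(**population)
biased_or = odds_ratio(**biased_sample)

# Fraction of each population cell that ended up in the sample.
selection = {cell: biased_sample[cell] / population[cell] for cell in population}

# The sample odds ratio equals the true one multiplied by this factor.
# The factor is 1 only when selection does not depend on exposure.
bias_factor = (selection["case_exp"] / selection["case_unexp"]) / (
    selection["ctrl_exp"] / selection["ctrl_unexp"]
)

print(true_or, biased_or, round(bias_factor, 2))  # 4.0 6.0 1.5
```

Here exposed cases were selected with probability 0.6 and unexposed cases with probability 0.4, while both control cells were selected with probability 0.1, so the true odds ratio of 4 is inflated by a factor of 1.5.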

Slide 10

Let’s move on to information bias.

Slide 11

What is information bias? Information bias is a bias deriving from imperfect definitions of study variables or from flawed data collection. We have 2 types of information bias: exposure identification bias and outcome identification bias. We will go into detail, starting with exposure identification bias. Let's take the example of the association between nephrotic syndrome and hydrocarbon exposure. It is quite possible that, if we try to establish the exposure status by asking patients, the cases, who are very concerned about their health status, may be more likely to recall exposure, while controls are quite satisfied with their health status and have no such concern or prejudice, so they forget easily. This, of course, creates a bias, the recall bias. How can we control for recall bias? We may pay attention to verifying the exposure information: for example, we can check the names of all persons who worked or lived in the proximity of plants processing hydrocarbons, which is quite a complex task. Or we may take as a control group a selected group of symptomatic subjects, for example patients referred to the same renal clinic who turned out to be unaffected by renal disease; this is a good control. Or we may use objective markers of hydrocarbon exposure: for example, we may try to measure hydrocarbon metabolites in biological samples.

Slide 12

Now let's move to another type of bias, the interviewer bias. This bias is created by an interviewer who is biased towards the final result and helps the cases to recall past exposure by using more intensive questioning and by posing secondary questions. How can we control it? We can control it by standardising the interview very carefully and by submitting identical questionnaires to cases and controls. We can also blind the interviewer to the case-control status.

Slide 13

Now let's move to outcome identification bias. One outcome identification bias is the observer bias, that is, the observer's knowledge of the exposure status may affect the decision as to whether the outcome is present. We have a very nice example taken from the nephrology literature. This is a paper which appeared in the American Journal of Epidemiology in 1995 and tried to understand whether the identification of cases of hypertensive end-stage renal disease may be biased. The investigators prepared a standard clinical history. By standard I mean that the same clinical history was submitted to several observers; the clinical history was absolutely the same. What they noted is that the attribution of the final diagnosis was much affected by knowledge of race, which is an exposure for the diagnosis of hypertensive renal disease. Notwithstanding that the clinical history was exactly the same, the diagnosis was made more frequently when the observer, the doctor, knew the race of the patient. The remedy is to blind the assessor to the exposure status, or to grade the diagnosis as possible, probable and definite; we can suspect bias if the difference emerges only in the weakest category, the possible category. Another way of protecting against observer bias is to use multiple assessors and to adjudicate the diagnosis only when the assessors agree.

Slide 14

Now, what is the risk of information bias? The risk is misclassification. We have two types of misclassification: non-differential and differential. Again I will give an example from case-control studies, but you may apply the same example to cohort studies. What is non-differential misclassification? It is misclassification that is independent of case-control status, that is, the same in cases and in controls. Example: exposed and unexposed among cases, 50 and 50; exposed and unexposed among controls, 20 and 80. The odds of exposure are 1 and 0.25: 20 divided by 80, that is 1 to 4, is 0.25. The odds ratio is 4. Now imagine that you have non-differential misclassification, that is, you wrongly classify as unexposed 30% of the exposed cases and 30% of the exposed controls: the same degree of misclassification. Now we have to make some calculations. 30% of 50 is 15, which means that the cases classified as exposed are 35, 50 minus 15. Of course, the cases classified as unexposed are 65, so the odds of exposure among cases is 0.54. Make the same calculation among controls: 30% of 20 is 6, which means that the controls classified as exposed are 14, and we should add the 6 that we took out to the unexposed, making 86. So the odds of exposure among controls is 0.16. Let's recalculate the odds ratio. The odds ratio is now about 3.3 (3.4 if the two odds are rounded first). So non-differential misclassification weakens a true association. The true odds ratio is 4, the odds ratio we got with non-differential misclassification was about 3.3. So the association was weakened.
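The non-differential example can be reproduced in Python. The sketch assumes, as the numbers of the example imply, that misclassification moves 30% of the exposed subjects into the unexposed cell and never the other way round; the helper name `misclassify` is hypothetical.

```python
def misclassify(exposed, unexposed, percent_missed):
    """Move percent_missed of the exposed subjects into the unexposed cell."""
    moved = exposed * percent_missed // 100
    return exposed - moved, unexposed + moved


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio from the four cells of a 2 by 2 table."""
    return (case_exp / case_unexp) / (ctrl_exp / ctrl_unexp)


true_or = odds_ratio(50, 50, 20, 80)

# Same 30% misclassification in cases and in controls (non-differential).
cases = misclassify(50, 50, 30)      # (35, 65)
controls = misclassify(20, 80, 30)   # (14, 86)
observed_or = odds_ratio(*cases, *controls)

print(true_or, cases, controls, round(observed_or, 2))  # 4.0 (35, 65) (14, 86) 3.31
```

Without intermediate rounding the observed odds ratio is 3.31, still below the true value of 4, so the conclusion that the association is weakened holds.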

Slide 15

Now let's move to differential misclassification. Differential misclassification is when the misclassification differs between cases and controls. Take no misclassification in cases and 30% misclassification in controls. The numbers among cases stay as they are, 50 and 50, and we have to change the numbers among controls, using the 30% misclassification that we have already seen in the previous example, 14 and 86, and recalculate the odds ratio. Now the odds ratio is about 6.1 (6.2 if the odds among controls is rounded to 0.16), which is higher than the true odds ratio. This time the odds ratio was higher, but it would have been lower if the misclassification had occurred among cases instead of controls. So differential misclassification may either increase or decrease a true association.
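Both directions of differential misclassification can be shown with the same numbers. As before, this is a sketch that assumes 30% of exposed subjects are wrongly recorded as unexposed in the affected group only; `misclassify` is a hypothetical helper name.

```python
def misclassify(exposed, unexposed, percent_missed):
    """Move percent_missed of the exposed subjects into the unexposed cell."""
    moved = exposed * percent_missed // 100
    return exposed - moved, unexposed + moved


def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Odds ratio from the four cells of a 2 by 2 table."""
    return (case_exp / case_unexp) / (ctrl_exp / ctrl_unexp)


true_cases, true_controls = (50, 50), (20, 80)
true_or = odds_ratio(*true_cases, *true_controls)

# Misclassification only among controls: the association is inflated.
or_controls_wrong = odds_ratio(*true_cases, *misclassify(20, 80, 30))

# Misclassification only among cases: the association is weakened.
or_cases_wrong = odds_ratio(*misclassify(50, 50, 30), *true_controls)

print(true_or, round(or_controls_wrong, 2), round(or_cases_wrong, 2))  # 4.0 6.14 2.15
```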

Slide 16

Let’s move now to selection-information bias.

Slide 17

The incidence-prevalence bias is a type of mixed bias because we wrongly select the patients from whom we get the information we want about risk. Let's imagine a population of 24 people. Let's follow this population until time t: a cohort study to establish the risk of getting a given disease. During this follow-up 8 people get the disease, but you see that the 4 people in yellow get a less severe form of the disease, so these people are all alive at time t. The 4 people in red die before getting to time t. So if we do a survey at time t to estimate the risk of disease, we get a wrong estimate of risk, because the prevalence of disease at time t is 4 divided by 24, that is 0.17, and we know that the incident risk in reality is 8 divided by 24, that is 0.33. So prevalence gives a biased estimate of incident risk because severe cases may die early. The remedy is, whenever possible, to estimate risk as incident risk.
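The two estimates can be put side by side in a few lines. The sketch follows the talk in keeping the original 24 people as the denominator for both figures; the variable names are illustrative.

```python
population = 24
mild_cases_alive_at_t = 4    # less severe disease, still alive at the survey
severe_cases_dead_at_t = 4   # died before time t, invisible to the survey

# What a cohort study sees: everyone who developed the disease during follow-up.
incident_risk = (mild_cases_alive_at_t + severe_cases_dead_at_t) / population

# What a survey at time t sees: only the cases who survived until t.
prevalence_at_t = mild_cases_alive_at_t / population

print(round(incident_risk, 2), round(prevalence_at_t, 2))  # 0.33 0.17
```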

Slide 18

Another type of selection-information bias is lead-time bias, that is, a bias deriving from the fact that we apply different criteria, that is, different information, to selected patients when assessing the efficacy of screening or early detection programs. Again a cohort study, but here we have just 2 patients. The first patient gets the disease at, let's say, time t minus 1, and we make the diagnosis because he is symptomatic. This patient dies at time t, so his survival is very short. You see this is a young patient, so there is concern in the community where these patients live and doctors may set up an early detection program, a screening program. A girl gets the disease and is diagnosed very early, at a preclinical stage. She progresses until time t and dies at time t. You see her survival is much longer than that of the first patient. So should we conclude that early diagnosis is useful in this case? We should think about it, because we applied different criteria: the symptomatic phase in case 1 and the preclinical stage in case 2. Had we observed the first case at a preclinical stage, his survival might have been exactly the same. We should not ignore the time the disease needs to go from the preclinical stage to the clinical symptomatic phase, the lead time. So the remedy is this: if we are going to assess the usefulness of screening programs by survival analysis, we should take the lead time into proper account and adjust the analysis accordingly.
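A toy calculation makes the lead-time argument concrete. The talk gives no numbers for this slide, so the ages below are entirely hypothetical and serve only to show why survival measured from an earlier diagnosis looks longer even when the date of death is unchanged.

```python
def survival_years(age_at_diagnosis, age_at_death):
    """Survival counted from the moment of diagnosis."""
    return age_at_death - age_at_diagnosis


# Hypothetical ages, invented for illustration only.
AGE_DETECTABLE_BY_SCREENING = 40   # preclinical stage
AGE_AT_FIRST_SYMPTOMS = 45         # clinical, symptomatic stage
AGE_AT_DEATH = 47                  # same natural history in both patients

survival_clinical = survival_years(AGE_AT_FIRST_SYMPTOMS, AGE_AT_DEATH)
survival_screened = survival_years(AGE_DETECTABLE_BY_SCREENING, AGE_AT_DEATH)

# Lead time: how much earlier screening makes the diagnosis.
lead_time = AGE_AT_FIRST_SYMPTOMS - AGE_DETECTABLE_BY_SCREENING

# Subtracting the lead time puts both patients on the same clock.
survival_screened_adjusted = survival_screened - lead_time

print(survival_clinical, survival_screened, survival_screened_adjusted)  # 2 7 2
```

In this invented case the apparent gain of 5 years is entirely lead time: once it is subtracted, the screened patient lived no longer than the symptomatic one.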

Slide 19

Now the last type of bias. Publication bias.

Slide 20

As you know, publication bias is the bias that can be generated in the process of getting information published. You see, we have many studies on renal disease progression and on drugs that can affect renal disease progression. The issue is so important that it is studied worldwide. Kasiske 5 years ago collected 13 studies on the effect of lipid-lowering drugs on renal outcomes in patients with CKD. As you know, he made a meta-analysis and published it in Kidney International. The results of a meta-analysis are summarised in plots like this. You see there is a vertical line at 0. If the treatment is better the points are on the left side, while if the points are on the right side it means the treatment is worse than non-treatment. Now Kasiske noted that in 11 studies treatment was better than nothing or than control, and in 2 studies treatment was worse than control or nothing. So this may suggest bias. The idea that bias may be present in these data is reinforced if you look at the numbers: all these studies were based on small numbers.

Slide 21

Now, how can bias be generated? The journal editors might have been biased towards accepting positive studies for publication because all these studies were small or, alternatively, investigators who obtained negative results might not even have submitted their studies because they were certain that the paper would be rejected.

Slide 22

How can we detect and prevent publication bias? We can plot the data in a funnel plot. The funnel plot is quite a simple plot: on the horizontal axis you put the sample size, on the vertical axis the outcome measure, better or worse. If you spot that small studies show better results while the outcome tends towards no effect, towards neutral, as the study size gets larger, you should think that publication bias is likely. When the outcome is independent of the study size, you can think that publication bias is unlikely.
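The visual reading of a funnel plot has a crude numeric counterpart: compare the average effect in the smaller studies with that in the larger ones. The sketch below is not a formal test of funnel asymmetry, and the study data are invented for illustration; they are not the studies of the meta-analysis discussed in the talk.

```python
from statistics import mean, median

# Invented (sample size, effect) pairs; a negative effect means treatment better.
studies = [
    (20, -0.9), (25, -0.7), (30, -0.8), (40, -0.6),
    (200, -0.2), (350, -0.1), (500, 0.0), (800, -0.05),
]


def small_vs_large(studies):
    """Mean effect in studies below the median size and in the remaining ones."""
    cutoff = median(size for size, _ in studies)
    small = [effect for size, effect in studies if size < cutoff]
    large = [effect for size, effect in studies if size >= cutoff]
    return mean(small), mean(large)


small_mean, large_mean = small_vs_large(studies)

# A much more favourable effect in the small studies is the pattern that
# should raise the suspicion of publication bias.
print(round(small_mean, 3), round(large_mean, 3))
```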

Slide 23

Now let's summarise. Bias is the tendency to produce erroneous results. Proper selection of cases and controls and of exposed and unexposed subjects is fundamental if we are to avoid bias. Recall bias, interviewer bias and observer bias all produce wrong information and misclassification of study subjects. Non-differential misclassification weakens the strength of a true association, while differential misclassification may either inflate or weaken the strength of a given association. Prevalence may give a biased estimate of risk. Whenever possible, risk should be estimated in cohort studies rather than in surveys.
The time the disease takes to go from the preclinical phase to the clinical phase should be carefully considered when assessing the usefulness of early detection programs and, finally, be aware of publication bias.

Ok. I thank you for having participated in this short course on clinical epidemiology. If you have questions, please write to me. You can find my e-mail address on the website of NDT-educational. So don't hesitate: write to me and I will be pleased to reply to your questions. Thank you.