OBSERVATIONAL STUDIES REVISITED: DESIGNS, STRENGTHS AND WEAKNESSES
Kitty Jager, Amsterdam, Netherlands
Chair:
Kitty Jager, Amsterdam, Netherlands
Alison MacLeod, Aberdeen, UK
Dr. K.J. Jager
Slide 1
Ladies and gentlemen, I would like to go into more detail on observational studies, designs, the strengths and the weaknesses.
Slide 2
Well, I think you’re all very aware that randomised controlled trials are at the top of the hierarchy of study designs. That’s what everybody always thinks: the randomised controlled trial is the best thing you can do when you want to study a particular effect or association. However, what I would like you to have as a take-home message today is that this is true, but only for studies on the effects of therapy or any other intervention. Why is that so? Because the prescribing of treatment will usually be guided by the prognosis of the patient. That means that the worse the prognosis, the more therapy is given, and that is called confounding by indication. I think you’re all very aware of the fact that when you have a patient with a very high blood pressure, you are inclined to give more anti-hypertensive medication. So there is a link between the prognosis and the therapy that is given, and also a link with the outcome in the end. By having treatment randomly allocated to groups of patients, you break the link between therapy prescription by the doctor and the patient’s prognosis, and that’s very important.
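To make confounding by indication concrete, here is a minimal sketch in Python. All numbers are hypothetical and chosen only for illustration: the death risks, the group sizes and the prescribing probabilities are assumptions, not data from any study. The therapy is given no effect at all, so any difference between treated and untreated patients comes purely from who gets treated.

```python
# Hypothetical numbers, chosen only for illustration.
# Death risk depends on prognosis only; the therapy itself has NO effect here.
DEATH_RISK = {"good": 0.10, "poor": 0.40}
N_PER_STRATUM = {"good": 1000, "poor": 1000}


def crude_risks(p_treated):
    """Return (death risk in treated, death risk in untreated).

    p_treated maps each prognosis group to the probability of being treated.
    Expected counts are used, so the result is deterministic.
    """
    treated_n = treated_deaths = 0.0
    untreated_n = untreated_deaths = 0.0
    for prognosis, n in N_PER_STRATUM.items():
        n_treated = n * p_treated[prognosis]
        n_untreated = n - n_treated
        treated_n += n_treated
        untreated_n += n_untreated
        treated_deaths += n_treated * DEATH_RISK[prognosis]
        untreated_deaths += n_untreated * DEATH_RISK[prognosis]
    return treated_deaths / treated_n, untreated_deaths / untreated_n


# The doctor treats poor-prognosis patients far more often (assumed 80% vs 20%).
by_indication = crude_risks({"good": 0.2, "poor": 0.8})
# Random allocation: the same chance of treatment whatever the prognosis.
randomised = crude_risks({"good": 0.5, "poor": 0.5})

print("prescribed by indication:", by_indication)
print("randomly allocated:      ", randomised)
```

With these assumed numbers the treated patients appear to die about twice as often when the doctor decides (roughly 34% against 16%), although the therapy does nothing. Under random allocation both groups have the same risk, which is the link being broken.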
Slide 3
Let’s go to the design of the randomised controlled trials.
As you can see here, when you do a randomised controlled trial you first compose a cohort of patients. In that cohort you allocate exposure in a random way, and after a particular time (months, years, whatever) you assess the outcome in the experimental group and in the control group.
Slide 4
Let’s give an example. The very famous example of the 4D study. What the group of Wanner in Germany did was they used a double blind RCT to investigate whether atorvastatin was associated with less cardiovascular disease and mortality. In this case they used a composite end point of death and some other diseases, whichever occurred first.
So what we can see here is that the big strength of an RCT is that it is the gold standard in making causal inferences, as randomisation breaks the link between therapy prescription and prognosis.
Slide 5
However, randomised controlled trials also have weaknesses and I would like to go into these in detail. First of all, they are very expensive. You may know that if you want to start a randomised controlled trial, you need millions of euros. So they’re extremely expensive, and for that reason they’re not performed very often, only to solve very important problems.
The second weakness is that they may be unethical. For example, in some countries people argue that some patients are as well off on conservative treatment as they are on dialysis treatment. But if I submitted a study proposal for an RCT to the medical ethical committee of the AMC, stating that I would like to investigate the difference in mortality between patients on conservative treatment and patients on dialysis, I don’t think that I would get permission to do that study.
Another point is that they may be inappropriate because they may be very inefficient. For example, suppose you would like to detect an adverse effect of a drug, and suppose that effect is very rare: it only happens in 1 in ten thousand patients, and only over a period of 5 years. If you look at the size of the randomised controlled trial that you would need to study this, you would have a very inefficient study design.
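A rough calculation shows why such a trial becomes so large. The sketch below is only an illustration under simple assumptions: every treated patient is followed for the full 5 years, events are independent, and the goal is merely to see at least one event with 95% probability (an assumed threshold). Actually comparing two arms would need considerably more patients. The function name `patients_needed` is made up for this example.

```python
import math


def patients_needed(event_risk, detection_probability=0.95):
    """Smallest n with P(at least one event among n patients) >= detection_probability.

    Uses P(at least one event) = 1 - (1 - event_risk) ** n.
    """
    return math.ceil(math.log(1.0 - detection_probability) / math.log(1.0 - event_risk))


risk = 1 / 10_000  # adverse effect in 1 in ten thousand patients over 5 years
n = patients_needed(risk)
chance = 1.0 - (1.0 - risk) ** n

print(f"patients needed in the treated arm: {n}")
print(f"chance of seeing at least one event: {chance:.3f}")
```

The answer is in the order of thirty thousand treated patients, each followed for 5 years, just to observe a single event.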
Slide 6
Another problem is that they may also be inadequate. Why? Randomised controlled trials very often have very strict inclusion and exclusion criteria. As a result, the patient populations that are studied in our RCTs are very strongly selected, and therefore the results may have very low generalizability.
An example: you may know the study of Korevaar et al in the Netherlands. What they tried to do was to study the effect of starting with hemodialysis compared to peritoneal dialysis on the mortality of dialysis patients. Here, as you can see, they started off with more than 1200 patients. Almost 500 didn’t fulfil the inclusion criteria, so more than 700 did. But of those 773, 735 patients had a preference for one of the two dialysis modalities, which left only 38. So this study could only be performed in a very, very small group: only 3% of the total population. Therefore we can say that this population is very strongly selected, and the results may have low generalizability.
Slide 7
Lastly they may also be unnecessary.
I think most of you know the rare cases where the effects of an intervention were so dramatic that it was not really necessary to do an RCT. That is very rare, but it happens, for example in the discovery of insulin. You may know the story: in 1922 Banting and Best administered this new insulin drug to a young boy of eleven years. He became the first human to be injected with insulin, and by this first administration of insulin in humans his life was saved. And in the years that followed, deaths by diabetic coma dropped from more than 60% to only 1%. But this is very rare. I think this is an interesting example; another example is the administration of neostigmine in patients with myasthenia, or maybe even the first antibiotics that were administered and had a very important and dramatic effect.
Slide 8
So I think we can indeed say that, when it comes to establishing causality in studies on the effects of therapy, the randomised controlled trial is really at the top, and then the evidence goes down to cohort studies, case-control studies, and case reports and case series at the bottom.
Slide 9
So, the RCT is the king among the study designs when it comes to the potential to establish causality. But please remember: this increase in evidence from the bottom to the top only applies to studies on the effects of therapy or other interventions. So let’s now discuss observational studies. What you see as a weakness of the RCT is sometimes a strength of the observational study, and vice versa.
Slide 10
Let’s start with the cohort study. I think you all know the cohort study very well; we have two types of cohort studies. There is the prospective cohort study. What do we do? At the beginning you compose a cohort of people and assess their exposure to particular risk factors. Then we follow them over time and assess the outcome later on in relation to the exposure. There are also retrospective cohort studies. What you do then is more or less the same, but you compose the cohort in the past. You form a group of people, look at their exposure, then look at the outcomes and see whether the outcomes are different in the exposed and unexposed groups. An example: one of the colleagues in Leiden assessed the BMI at the start of dialysis and looked at the relationship with mortality. She was able to show that in haemodialysis patients being underweight was associated with an increased risk of death, but any effect of obesity was not statistically significant.
Slide 11
So the idea is that the very big strength of cohort studies is that they have a very high generalizability. These kinds of populations are rarely very selected; most of the time they are very broad patient populations. The weakness, however, is that although they do have some potential to make causal inferences, when it comes to the effects of therapy they suffer from this link between the prescription of the therapy and the patient’s prognosis at the beginning.
Slide 12
In case-control studies you select cases on the basis of outcome. When you see a particular outcome in a patient, you select controls without the outcome, and then you look back into the past to assess the exposure to particular risk factors in cases and in controls. An example: these authors determined prenatal and perinatal factors that were associated with the development of renal agenesis. In this study the cases were live-born infants with renal agenesis as reported in a state-wide birth registry; the controls were a random sample of all births in that registry that were not reported to have renal agenesis, so they did not have the outcome. Then the authors went back to the past to see to what risk factors the different groups had been exposed.
Well, the strength of a case-control study is that it is a very efficient study design. It is very suitable for studying rare outcomes (renal agenesis is a very rare outcome) and for studying outcomes that take a long time to develop. The reason is that you look for these cases and controls in the present. Then you look to the past and see whether they were exposed or not. That’s very efficient because you can do it in, let’s say, a few weeks’ time. You are also able to study multiple exposures: you have patients with the outcome, for example cancer, and patients without the outcome, and then you can look in the past at exposure to asbestos but also to any other risk factor. And because you can do this in a few weeks, these studies are relatively inexpensive.
The weakness, however, is that, as in cohort studies, you can make causal inferences but they are much weaker, again because of this link between therapy prescription and prognosis.
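The talk does not go into the analysis, but the association in a case-control study is commonly summarised as an odds ratio, because the investigator fixes the ratio of cases to controls and risks cannot be read off directly. The sketch below uses hypothetical counts (not the renal agenesis data) and a made-up helper `odds_ratio`. It shows that the odds ratio does not change when more controls are sampled.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds ratio from a 2x2 case-control table."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


# Hypothetical counts: 100 cases (30 exposed) and 100 controls (10 exposed).
or_one_control_per_case = odds_ratio(30, 70, 10, 90)

# Same exposure distribution among controls, but twice as many controls sampled.
or_two_controls_per_case = odds_ratio(30, 70, 20, 180)

print(f"odds ratio, 1 control per case:  {or_one_control_per_case:.2f}")
print(f"odds ratio, 2 controls per case: {or_two_controls_per_case:.2f}")
```

The same table can be recomputed for any other exposure recorded in the past, which is why multiple exposures can be studied for the one outcome on which the cases were selected.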
Slide 13
Another weakness, another disadvantage, is that you can only study one outcome at a time. The reason is that this is the way you select your cases and your controls: that choice was based on your outcome. Another important weakness that is very often reported is recall bias. Suppose that you have a particular illness, so you are a case, so to say. Then you are inclined to remember what happened to you in the past much better than if you were a control. Controls haven’t really thought about exposure to risk factors, whereas someone with a particular disease has already gone over in his or her mind whether he or she was exposed to particular drugs or to whatever.
Slide 14
Then a cross-sectional study: in a cross-sectional study you assess the exposure and the outcome at the same time. This leads to problems, as it usually cannot be clearly distinguished what is the cause and what is the consequence, because they are measured at the same time. The study on the slide, however, is a good example of a cross-sectional study. You will all know this study of S. Hallan and his co-workers, who used a population-based survey, which is a type of cross-sectional study, when they wanted to explain the difference in the incidence of ESRD between Norway and the US by comparing the prevalence of CKD. So this prevalence of CKD was, so to say, the exposure, and the incidence of ESRD was the outcome.
Cross-sectional studies do have strengths. They can assess the prevalence and the burden of disease. Suppose you are looking at the prevalence of CKD: you can also say something about the morbidity and the quality of life in a particular population. So they can study disease burden. They’re also very fast and inexpensive, and that is why they are performed so often. But they are only hypothesis generating.
Slide 15
The weakness is that you can draw only very limited causal inferences from them, exactly because the time order of exposure and outcome is not clear. This order cannot be determined, except in rare cases like the Hallan study. Another point to reckon with in cross-sectional studies is survival bias: you only look at the people who are still living, the survivors, and in that group the association between exposure and outcome may be somewhat different from that in the group as a whole.
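Survival bias can be illustrated with a small worked example. The cohort below is entirely hypothetical, and it makes the simplifying assumption that only people with the disease die before the survey. The exposure doubles the risk of disease, but exposed patients with the disease also die sooner, so fewer of them are still alive to be counted in the cross-section.

```python
# Hypothetical cohort, numbers chosen only for illustration.
# group: (people, people who develop the disease, diseased people alive at the survey)
cohort = {
    "exposed": (1000, 200, 80),
    "unexposed": (1000, 100, 80),
}


def incidence(group):
    """Risk of developing the disease in the whole group."""
    people, diseased, _ = cohort[group]
    return diseased / people


def prevalence_among_survivors(group):
    """Disease prevalence that a cross-sectional survey of survivors would see."""
    people, diseased, diseased_alive = cohort[group]
    healthy = people - diseased  # simplification: everyone without the disease survives
    return diseased_alive / (healthy + diseased_alive)


risk_ratio = incidence("exposed") / incidence("unexposed")
prevalence_ratio = (
    prevalence_among_survivors("exposed") / prevalence_among_survivors("unexposed")
)

print(f"risk ratio in the whole cohort:   {risk_ratio:.2f}")
print(f"prevalence ratio among survivors: {prevalence_ratio:.2f}")
```

In the whole cohort the exposed have twice the risk, but among the survivors the prevalence of disease is only slightly higher in the exposed, so the cross-sectional picture understates the association.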
Slide 16
Then the group of studies at the bottom of the hierarchy of study designs that we saw at the beginning of this presentation: the case report and the case series. Case reports and case series are always descriptive, but they can be prospective and they can also be retrospective. For example, you can assess an exposure in a patient and then after some time see a particular outcome and say: hey, listen, this is very strange, I’ve never seen this before.
On the other hand when you see a particular outcome, a particular illness in a patient, you can go back in time and see whether there was any particular factor that might have caused this specific disease for example.
Again, a very famous example is the study of Casadevall and others, who described the relationship between the use of Epo and the development of pure red-cell aplasia. This was a very important example of the discovery of a new and rare adverse event. I’m sure that you all know that this study has had very important implications for the Epo market.
Slide 17
Well, what are the strengths of case reports or of case series? These are mostly the first form of publication of a new disease or a rare event. They’re fast and inexpensive, which is why they may be popular, and they may be hypothesis generating. But a disadvantage is that they have a very limited potential to make causal inferences, except in very dramatic cases like the insulin example that I showed you.
Slide 18
I would like to end my presentation by showing you an idea that was suggested by Professor Vandenbroucke, a Professor in clinical epidemiology from Leiden, the same author as the one on the STROBE paper. He suggests that there are two views of medical science: that of the discoverers and of what I have called evaluationists.
What do these people do? Discoverers observe: for example, they can observe an odd cause of a disease, strange results of a lab experiment or a strange behaviour of a particular group of patients in analyses. Then what will they do? They will make a hypothesis, they will send in a publication, and thereafter others will seek confirmation in other data. The point is that evaluationists have a problem with this type of approach, because they say it is highly biased: you see something, you make a hypothesis, you write something up, you re-analyse your data, you change your hypothesis, etc. This is more or less a fishing expedition and that’s not a good way of doing science. No: evaluationists say that one needs to set up studies to evaluate whether a patient’s fate is really improved by new therapies or diagnostic tests that looked so wonderful in the first place. They say one needs a really good evaluation by performing a carefully planned trial, preferably an RCT. The discoverers, on the other hand, may say that if you only evaluate, you can never discover something new. You can only confirm or refute the hypothesis made by someone else.
Slide 19
Professor Vandenbroucke thinks that there are two views on science, and he says that for this reason you also need two hierarchies of study designs. There is one hierarchy of study designs for the intended effects of therapy. That’s the one we have seen before, with the randomised controlled trials at the top, going down to the cohort studies, to the case-control studies, and then at the bottom case reports and case series. However, there is also a hierarchy of study designs for discovery and explanation. In this case the case reports are not at the bottom; no, they are at the top. First you make a discovery, then you publish a case report or a case series, then you can try to confirm your hypothesis in a case-control study or maybe in a follow-up study. But already this becomes more difficult, because especially in the case of a prospective follow-up study you need a very long time to study. Randomised controlled trials are not at all fit for the purpose. In this case they are at the bottom of this hierarchy of study designs.
Slide 20
He also says that this should not lead to uncritical acceptance of all observational research about the causes of diseases. And, finally, that to guide our judgment we should position the research on the so-called axis of haphazardness of exposure.
Slide 21
And this is the subject which Friedo Dekker will approach in further detail. I can highly recommend this paper, recently published in PLoS Medicine, about two different types of hierarchy in study designs. It’s a very interesting paper which makes you think about how you may look at different types of studies. Thank you.