Causal Inference by Paul Rosenbaum - Book Review
In scientific inquiry, nothing beats the quest for causal understanding. While statistical associations may offer intriguing glimpses into relationships between variables, they fall short of establishing the coveted cause-and-effect relationship. This is where the field of causal inference steps in, providing a rigorous framework to navigate the intricate web of causality. My interest in causal inference, as a field of study, dates back to grad school, when I was building machine learning models to distinguish kidney cancer patients from healthy controls using metabolites in their urine. I couldn't stop myself from thinking - as I should - about all the ways these patients might differ from the controls beyond the disease itself. I ended up applying a method called propensity score matching (more on this later), and I concluded my paper by stating:
"We have shown the potential utility of a urine assay in the clinical setting for RCC (renal cell carcinoma) detection. This study, like others of its kind, has the limitation of numerous potential confounders that could impact biomarker discovery results. While randomized control trials (RCTs) are gold standards for epidemiology research, observational studies remain inescapable for studies like this as randomizing the intervention (RCC) is impossible. As such, to argue for the reduction in selection bias, we adjusted for five potential confounders in the study: age, BMI, gender, smoking history, and race. [...] Going forward, a much larger cohort, representing the diversity of race and geographical locations would be required for the validation of our biomarker proposals."
Many of the concepts mentioned in that excerpt are covered very clearly in the book.
In "Causal Inference," Paul R. Rosenbaum masterfully distills the essence of this field, presenting a concise yet comprehensive overview that is accessible to a wide audience. With numerous motivating examples, Rosenbaum illuminates the key concepts and methods that underpin causal inference.
I. The Essence of Causal Effects
At the heart of causal inference lies the concept of a causal effect, which quantifies the difference in an outcome for an individual under two alternative scenarios: receiving the treatment versus receiving the control. This seemingly simple notion is fraught with challenges, as we can only observe an individual under one of these scenarios, leaving the other outcome forever hidden from view.
To illustrate this point, Rosenbaum recounts the historical example of George Washington's demise. In 1799, Washington fell ill with a sore throat, and his doctors, in their wisdom, decided to bleed him profusely. Washington succumbed to his illness the following day, leaving us to ponder whether the bleeding hastened his death or whether he would have met the same fate regardless. This question, like many others in the context of causal inference, involves comparing what actually happened to what might have happened in a counterfactual world, a world we can never fully access.
II. Randomized Experiments
The gold standard for causal inference is the randomized experiment, where a truly random device, such as a coin flip, dictates whether an individual receives the treatment or the control. This random assignment ensures that the treated and control groups are comparable prior to treatment, minimizing the risk of confounding factors that could distort the true causal effect.
Rosenbaum highlights the importance of randomized experiments with the example of the PALM trial, a clinical trial that compared two treatments for Ebola virus disease. In this trial, patients were randomly assigned to receive either ZMapp or mAb114, two drugs derived from antibodies. The random assignment ensured that any differences in survival rates between the two groups could be confidently attributed to the treatment and not to pre-existing differences between the patients. But we don't always have the luxury of a randomized experiment. A case in point is my grad school study, which I briefly outlined earlier.
III. Observational Studies
In many cases, we are stuck with observational studies, where the assignment of treatment is not under the control of the investigator. This lack of control introduces the risk of bias, as the treated and control groups may differ in ways that affect the outcome, leading to spurious conclusions about cause and effect.
Rosenbaum illustrates the challenges of observational studies with the example of smoking as a possible causal factor for periodontal disease. Smokers and nonsmokers differ in numerous ways, including age, gender, education, and income. These differences could account for the observed association between smoking and periodontal disease, even if smoking itself has no causal effect.
IV. Measured Covariates
As such, to mitigate the risk of bias in observational studies, researchers often adjust for measured covariates, such as age and gender, that could confound the relationship between treatment and outcome.
One such technique, known as propensity score matching (PSM), stands out as a powerful tool for adjusting for measured covariates and creating comparable groups for analysis.
In essence, PSM involves pairing treated individuals with control individuals who share similar probabilities of receiving the treatment, based on a set of observed covariates. This probability, known as the propensity score, is typically estimated using statistical models which take into account the measured covariates that could potentially confound the relationship between treatment and outcome.
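The matching step can be sketched in a few lines. Here is a minimal, illustrative Python sketch (not from the book), assuming the propensity scores have already been estimated, e.g. by logistic regression of treatment status on the measured covariates; the unit IDs, scores, and caliper value are all hypothetical:

```python
# Minimal sketch of the matching step of PSM (illustrative only).
# Each unit is a (id, estimated propensity score) pair; scores are assumed
# to come from a model such as logistic regression on measured covariates.

def nearest_neighbor_match(treated, controls, caliper=0.05):
    """Pair each treated unit with the closest unmatched control
    whose propensity score lies within the caliper."""
    available = dict(controls)  # id -> propensity score
    pairs = []
    for t_id, t_score in treated:
        best_id, best_dist = None, caliper
        for c_id, c_score in available.items():
            dist = abs(t_score - c_score)
            if dist <= best_dist:
                best_id, best_dist = c_id, dist
        if best_id is not None:
            pairs.append((t_id, best_id))
            del available[best_id]  # match without replacement
    return pairs

# Hypothetical estimated scores:
treated = [("T1", 0.62), ("T2", 0.35)]
controls = [("C1", 0.60), ("C2", 0.33), ("C3", 0.90)]
print(nearest_neighbor_match(treated, controls))  # → [('T1', 'C1'), ('T2', 'C2')]
```

Note that "C3" goes unmatched: the caliper discards controls whose scores are too far from any treated unit, which is the whole point of creating comparable groups.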
Beyond this point, I didn't have a great deal of understanding of causal inference as a field, and this is exactly why I picked up the book.
V. Unmeasured Covariates
I gave a talk at Morehouse School of Medicine a few weeks ago, and one of the professors who attended kept asking me the same question in different flavors - while using the word 'bias' a lot, and rightfully so. He was speaking to unmeasured covariates. As Rosenbaum states in the book, "an observational study is met with objections not applause."
While adjusting for measured covariates can help reduce bias in observational studies, it cannot eliminate it entirely. The challenge of unmeasured covariates, those lurking variables that remain unobserved and unaccounted for, casts a long shadow over causal inference. These unmeasured covariates could be genetic predispositions, personality traits, or any other factors that influence both the treatment and the outcome, potentially distorting the true causal relationship.
In the context of smoking and periodontal disease (the example used in the book), unmeasured covariates could include dietary habits, oral hygiene practices, or even genetic factors that affect both smoking behavior and susceptibility to gum disease. These unmeasured covariates could create a spurious association between smoking and periodontal disease, even if smoking itself has no direct causal effect.
To address the challenge of unmeasured covariates, researchers turn to sensitivity analysis, a technique that quantifies how much bias would need to be present to explain away the observed association between treatment and outcome. In essence, sensitivity analysis asks: "If the observed association is not due to a true causal effect, how strong would the influence of unmeasured covariates need to be to create this association?"
The answer to this question is typically expressed in terms of a sensitivity parameter, which quantifies the magnitude of bias that would be required to overturn the causal conclusion. If the sensitivity analysis reveals that a large amount of bias would be needed to explain away the association, then the causal claim is strengthened, as it suggests that unmeasured covariates are unlikely to be driving the observed relationship. Conversely, if even a small amount of bias could account for the association, then the causal claim is weakened, as it raises the possibility that unmeasured covariates are playing a significant role.
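To make the sensitivity parameter concrete, here is an illustrative Python sketch in the spirit of Rosenbaum's approach for matched pairs with a binary outcome (the specific numbers and function are mine, not the book's). The parameter gamma bounds how much an unmeasured covariate could tilt the odds of treatment within a pair; gamma = 1 recovers the ordinary randomization p-value:

```python
# Illustrative Rosenbaum-style sensitivity analysis for matched binary pairs.
# gamma bounds the within-pair odds of treatment due to an unmeasured
# covariate; gamma = 1 corresponds to no hidden bias.
from math import comb

def worst_case_p(discordant_pairs, treated_events, gamma):
    """Upper bound on the one-sided p-value for a McNemar-type test
    when hidden bias is at most a factor of gamma."""
    # Worst case: the unmeasured covariate pushes the event toward
    # the treated unit in every discordant pair.
    p_plus = gamma / (1 + gamma)
    n = discordant_pairs
    return sum(comb(n, k) * p_plus**k * (1 - p_plus)**(n - k)
               for k in range(treated_events, n + 1))

# Hypothetical data: 20 discordant pairs, 16 with the event in the treated unit.
for gamma in (1.0, 1.5, 2.0):
    print(f"gamma={gamma}: worst-case p <= {worst_case_p(20, 16, gamma):.4f}")
```

Reading the output top to bottom answers the question in the text: how large would gamma have to be before the worst-case p-value stops being convincing? If the association survives a large gamma, the causal claim is harder to explain away.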
VI. Quasi-Experimental Devices
In some cases, researchers can exploit so-called quasi-experimental devices, which are tools used in observational studies to address potential biases and strengthen causal claims. They involve adding elements to the study design to investigate and potentially invalidate anticipated counterclaims. Quasi-experimental devices furnish new data intended to advance a claim by undermining specific grounds for doubt.
Rosenbaum discusses several quasi-experimental devices, including natural experiments, discontinuities, and instruments. Out of these three, I have only read about natural experiments before, and it was in the context of the 2021 Nobel Prize in Economics which was awarded to David Card, Joshua Angrist, and Guido Imbens for their groundbreaking work on natural experiments in economics. Card demonstrated how natural experiments can be used to study labor market issues, while Angrist and Imbens developed methodological frameworks to draw causal conclusions from such experiments, revolutionizing empirical research in economics.
In brief, in natural experiments, researchers capitalize on naturally occurring events that resemble randomized experiments. For instance, in the study of the effects of winning the lottery on financial well-being, an example used in the book, researchers might compare the outcomes of winners of large vs. small amounts to assess the impact of cash infusions on financial stability, exploiting the random nature of lottery draws to approximate a randomized experiment.
Another quasi-experimental design is the discontinuity design, which exploits abrupt treatment cutoffs along a continuum, creating a natural experiment near the point of discontinuity. For example, researchers studying the effects of a scholarship program might compare the outcomes of students who barely qualified for the scholarship to those who barely missed the cutoff, exploiting the discontinuity in scholarship eligibility to approximate a randomized experiment.
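The simplest version of that comparison can be sketched directly. This is an illustrative Python sketch with made-up numbers, not an analysis from the book: compare mean outcomes in a narrow window on either side of the cutoff.

```python
# Illustrative sharp regression-discontinuity sketch for the scholarship
# example. All scores, outcomes, and the cutoff are hypothetical.

def rd_estimate(scores, outcomes, cutoff, bandwidth):
    """Difference in mean outcomes within a narrow window around the cutoff.
    Units at or above the cutoff received the treatment (the scholarship)."""
    above = [y for x, y in zip(scores, outcomes)
             if cutoff <= x < cutoff + bandwidth]
    below = [y for x, y in zip(scores, outcomes)
             if cutoff - bandwidth <= x < cutoff]
    return sum(above) / len(above) - sum(below) / len(below)

# Hypothetical exam scores and a later outcome; a score of 70 wins the award.
scores   = [65, 68, 69, 70, 71, 74, 60, 80]
outcomes = [30, 31, 32, 38, 39, 40, 28, 45]
print(rd_estimate(scores, outcomes, cutoff=70, bandwidth=5))  # → 8.0
```

The bandwidth matters: students far from the cutoff (scores of 60 or 80 here) are excluded, because only units just on either side of the threshold are plausibly comparable.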
Instrumental variables, another quasi-experimental device, are variables that influence treatment assignment but do not directly affect the outcome. They are used when the treatment of interest is not randomized, but some form of encouragement to accept the treatment is. For example, imagine a study where some people are encouraged to quit smoking and others are not - the encouragement is the instrument. The encouragement is randomized, so it affects who tries to quit; however, it is the act of quitting smoking, not the encouragement itself, that is thought to improve lung function. The effect of quitting on lung function can then be estimated among compliers, people who quit smoking only if encouraged to do so. And why is this useful? Because it allows researchers to get closer to estimating a causal effect when true randomization is not possible, by focusing on the effect of the treatment on people whose treatment status is affected by the instrument, i.e., the compliers.
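The standard estimator for this setup is the Wald (instrumental variable) ratio: the jump in the outcome caused by the encouragement, divided by the jump in treatment uptake it induces. Here is a minimal Python sketch of it for the smoking example, with entirely hypothetical data:

```python
# Illustrative Wald (instrumental variable) estimator for the smoking
# encouragement example. Z = randomized encouragement, D = actually quit,
# Y = a lung function score. All data below are hypothetical.

def wald_estimate(z, d, y):
    """Effect of quitting among compliers: (E[Y|Z=1] - E[Y|Z=0])
    divided by (E[D|Z=1] - E[D|Z=0])."""
    mean = lambda xs: sum(xs) / len(xs)
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    return (y1 - y0) / (d1 - d0)

z = [1, 1, 1, 1, 0, 0, 0, 0]          # encouraged?
d = [1, 1, 1, 0, 0, 0, 1, 0]          # quit smoking?
y = [80, 82, 78, 70, 68, 72, 79, 69]  # lung function score
print(wald_estimate(z, d, y))  # → 11.0
```

Dividing by the uptake difference is what rescales the (diluted) effect of encouragement into an effect of quitting itself, under the assumption that encouragement moves lung function only through quitting.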
In sum, the beauty of quasi-experimental devices lies in their ability to provide insights into causal effects even when randomized experiments are not possible. By carefully selecting and analyzing quasi-experimental settings, researchers can gain a deeper understanding of causal relationships and inform policy decisions in a variety of fields, from economics to public health.
VII. Replication and Evidence Factors
We have all heard about the replication crisis in one field or another. Replication is a cornerstone of scientific inquiry, as it helps to ensure that findings are not due to chance or bias. However, simply repeating the same study with a new sample is not always sufficient to establish causality.
Rosenbaum emphasizes the importance of varied replication, where studies are conducted in different settings, with different populations, and using different methods. The goal here is to have studies/analyses that are inevitably flawed, but in different ways, converging on the same causal claim. This leads us to the idea of evidence factors, which are multiple, independent sources of evidence that converge on the same causal conclusion. The more evidence factors that support a causal claim, the stronger the claim becomes.
VIII. Uncertainty and Complexity
Causal inference is often fraught with uncertainty and complexity, especially in observational studies. It is important to acknowledge these limitations and to be cautious in drawing causal conclusions.
Rosenbaum concludes the book with a discussion of the ongoing debate about the health effects of alcohol (or the lack thereof), highlighting the uncertainty and complexity that surround this topic. He discusses the conflicting evidence from various studies, as well as the challenges of disentangling the effects of alcohol from other lifestyle factors.
If you have read this piece thus far and enjoyed it, you should definitely give the book a try.