Correlation and Causation

An important purpose of scientific research is to establish an understanding of the causal relationship between variables and outcomes.  This may be related to the development of a disease or condition, effectiveness of health-related skills or treatments, or prognostic variables.  Given the manner in which the human brain attempts to identify patterns, we are highly susceptible to confusing correlation and causation.  This distinction is important because if we are to truly understand the relationship between variables and, on this basis, recommend the most beneficial interventions, it is crucial to identify which relationships are causal in nature and which are merely the result of correlation or coincidence.

Correlation or coincidence occurs when there is an association between two or more factors or variables.  In contrast, causation occurs when one variable leads to (or “causes”) the other.  It is typically simple to find evidence of correlation or coincidence between variables.  Establishing a causal pathway or relationship, on the other hand, is more difficult and requires consideration of several factors.  In essence, establishing that one variable causes another requires more evidence, and evidence of a higher level, than establishing an association between two variables.

In 1965, Sir Austin Bradford Hill presented several considerations for establishing a causal relationship between variables.  These included the strength of the association, consistency of the association, temporality, presence of a dose-response relationship, plausibility, and coherence.  With the exception of temporality, which describes the concept that the causal variable must occur prior to the outcome, each of the factors is a relative indicator of causation.  This means that neither the presence nor the absence of any single factor definitively establishes or refutes a causal relationship.  In addition to these criteria, it is also helpful to consider whether other factors, such as unmeasured or intermediary variables, may be causing the effect being studied.

There are several examples demonstrating the difference between correlation and causation.  As a hypothetical example, suppose measurement revealed an association between the amount of ice cream consumed and the amount of sunscreen applied, with greater amounts of each occurring during the summer months of the year.  While there may be an association between these variables, it would most likely reflect a common third variable, specifically warm and sunny weather during the summer.  When the weather is warm and sunny, people are more likely both to eat ice cream and to apply sunscreen.  Neither variable has a direct causal impact on the other.
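The role of a common third variable can be illustrated with a small simulation.  The sketch below is only an illustration under invented assumptions: the function names, the coefficients, and the noise levels are all hypothetical and do not come from any real dataset.  Daily temperature drives both ice cream consumption and sunscreen use, and neither affects the other.  The raw correlation between the two is strong, but it largely disappears once the linear effect of temperature is removed from each.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Part of ys left over after removing its linear dependence on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate(n_days=2000, seed=0):
    """Simulate a confounded association (all numbers are hypothetical)."""
    rng = random.Random(seed)
    # Common cause: daily temperature in degrees Celsius.
    temperature = [rng.uniform(0, 35) for _ in range(n_days)]
    # Each outcome depends on temperature plus independent noise.
    # Neither outcome appears in the formula for the other.
    ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
    sunscreen = [1.5 * t + rng.gauss(0, 10) for t in temperature]

    raw = pearson(ice_cream, sunscreen)
    # Correlation after adjusting both variables for temperature.
    adjusted = pearson(residuals(ice_cream, temperature),
                       residuals(sunscreen, temperature))
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate()
    print(f"raw correlation:      {raw:.2f}")
    print(f"adjusted correlation: {adjusted:.2f}")
```

In real research the common cause is often unmeasured, so this kind of adjustment is not always possible.  That is one reason why an association alone is weak evidence of causation.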

While the example above may seem trivial, an article published in The New England Journal of Medicine in 2012 described an association between the amount of chocolate consumed in a country's population and the number of Nobel Prize laureates from that country.  Although this article was published to emphasize the distinction between correlation and causation, articles in the mainstream media suggested a causal pathway between chocolate consumption and subsequent Nobel Prize winners.  In reality, the data did not establish any such causal relationship.  This confusion between causation and correlation can readily lead to misleading or incorrect recommendations.  In this case, readers may have wrongly concluded that increasing chocolate consumption would result in an increased number of Nobel Prize winners.

Klein et al. recently reported a cross-sectional study of numerous immune-related measurements and other hematologic findings amongst four groups of patients with the aim of identifying features to define long COVID.  The study included 40 healthy, uninfected control patients, 37 healthy, unvaccinated, previously infected control patients, 39 healthy, previously infected patients with no persistent symptoms, and 99 patients with prior COVID infection and persistent symptoms.  This fourth group was defined as having long COVID, and their median time from acute infection was 432 days.  In addition to several tests of immune function, cortisol levels were measured amongst all those defined as having long COVID, 15 previously infected patients with no symptoms, and 25 healthy, uninfected patients.  The authors reported that cortisol levels amongst those with long COVID “…were roughly half of those found in healthy or convalescent controls.  Based on machine learning, cortisol levels alone were the most significant predictor for Long COVID classification…”.  On the basis of this study, there have been several articles in the mainstream media reporting a causal relationship between cortisol levels and development of long COVID.  Despite the certainty of the authors' interpretation of the results, it is worth considering the discussion above regarding the distinction between correlation and causation and whether or not these findings are proof of a causal relationship.  Because the study was cross-sectional, it cannot by itself establish temporality, that is, whether the lower cortisol levels preceded the development of long COVID.

The Practices of the Healthcare Athlete involves integration of mind-based and body-based skills which are evidence-driven and implemented from a polyvagal-informed perspective.  This paradigm is based upon an understanding and application of the best available evidence, including an assessment of whether or not the relationship between variables is causal in nature.  In addition, integration of emerging practices and evidence is based upon a similar evidence-driven process.  This ensures that all the skills and strategies developed and recommended within this perspective are based upon high-quality data and a valid assessment of the relationship between variables.

REFERENCES

Klein J, et al.  Distinguishing Features of Long COVID Identified Through Immune Profiling.  medRxiv, 2022.  https://doi.org/10.1101/2022.08.09.22278592

Hill AB.  The Environment and Disease: Association or Causation?  Proc of Royal Soc Med, 58: 295-300, 1965.

Messerli F.  Chocolate Consumption, Cognitive Function, and Nobel Laureates.  N Engl J Med, 367: 1562-1564, 2012.
