Can stats find causation when a study can’t?

A common problem with some scientific research, particularly projects studying human health, is that it is often difficult, if not impossible, to prove that a specific action directly causes an effect.

For example, scientists have found that people who smoke cigarettes are also more likely to suffer from depression. However, scientists cannot definitively determine whether smoking directly causes depressive symptoms, or whether people with depression are more likely to engage in health-damaging behaviors, including smoking.

Now, Wolfgang Wiedermann, a quantitative psychologist and assistant professor in the University of Missouri College of Education, and Alexander von Eye, a quantitative methodologist at Michigan State University, have developed a new statistical technique that can help scientists determine the direction of causation in the effects they are studying.

Wiedermann says this method can help scientists advance research that otherwise would stall out in its early phases.

“It is a limitation of observational studies, such as the smoking and depression example, that scientists can only find links and correlations between actions and effects,” Wiedermann says. “Often, this is due to ethical boundaries scientists face. It would be unethical to ask nonsmokers to start smoking to see if depressive symptoms appear, which would be the only true way to determine a causation.

“This new statistical approach can help provide scientists a direction, or cause, in their research instead of only finding links or correlations.”

In a series of six recently published papers, Wiedermann and von Eye illustrated the effectiveness of their approach by applying it to observational data from studies performed by other scientists.

One such study reported a correlation between Attention Deficit Hyperactivity Disorder (ADHD) in children and high levels of lead in the blood. Ethically, scientists could not inject children with lead to see whether ADHD symptoms appeared, so the strongest conclusion the original research could support was a link between the two conditions. Wiedermann and von Eye applied their statistical model to these data and were able to determine a direction: high levels of lead in the blood may cause ADHD symptoms in children.

In another example, Wiedermann and von Eye found support for hierarchical stages of development in how children learn and process numbers and mathematics. Wiedermann says the new technique determines the direction of an effect by examining distributional characteristics of the data, such as asymmetry in variable distributions.

“It is a modern myth that all datasets sit on symmetrical, normally distributed bell curves,” Wiedermann says. “In reality, every dataset for every study has some level of ‘non-normality.’ Taking distributional characteristics into account leads to situations where two variables cannot be exchanged in their status as cause and effect without systematically violating assumptions of the model. These systematic violations can be used to identify whether an action or condition causes a certain effect (high lead-blood levels causing ADHD) from large enough sample sizes of observational data.

“This could be an important tool for scientists to use in furthering their research. Ethical boundaries in scientific experiments certainly always will remain, thus we should start working on pushing the limits of what we can learn from observational data.”
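The asymmetry idea Wiedermann describes can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions, not the procedure published by Wiedermann and von Eye: it assumes a simple linear relationship, a skewed cause, and a normally distributed error term, and every name in it (`skewness`, `residuals`, `cause`, `effect`) is hypothetical. Under those assumptions, the true cause tends to be more skewed than its effect, and fitting the model in the wrong direction tends to leave visibly skewed residuals.

```python
import numpy as np


def skewness(values):
    """Sample skewness: the third standardized moment."""
    centered = values - values.mean()
    return float((centered**3).mean() / (centered**2).mean() ** 1.5)


def residuals(outcome, predictor):
    """Residuals from a least-squares line fitted as outcome ~ predictor."""
    slope, intercept = np.polyfit(predictor, outcome, 1)
    return outcome - (slope * predictor + intercept)


# Simulated data (an assumption for illustration, not real study data):
# a skewed "cause" drives the "effect" linearly, plus normal noise.
rng = np.random.default_rng(0)
n = 50_000
cause = rng.exponential(scale=1.0, size=n)
effect = 0.7 * cause + rng.normal(loc=0.0, scale=1.0, size=n)

# Check 1: compare the asymmetry of the two observed variables.
skew_cause = skewness(cause)
skew_effect = skewness(effect)

# Check 2: compare residual asymmetry for the two candidate directions.
skew_res_correct = skewness(residuals(effect, cause))   # effect ~ cause
skew_res_reversed = skewness(residuals(cause, effect))  # cause ~ effect

print(f"skewness of cause:            {skew_cause:.2f}")
print(f"skewness of effect:           {skew_effect:.2f}")
print(f"residual skew, effect~cause:  {skew_res_correct:.2f}")
print(f"residual skew, cause~effect:  {skew_res_reversed:.2f}")
```

In this simulated setting, the correctly specified direction leaves residuals that are close to symmetric, while the reversed model does not. With real observational data, both checks depend on the linearity and error assumptions holding and, as Wiedermann notes, on a large enough sample.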

Wiedermann and von Eye’s six studies appear in the British Journal of Mathematical and Statistical Psychology, the Journal of Person-Oriented Research, Educational and Psychological Measurement, Multivariate Behavioral Research, the International Journal of Behavioral Development, and Psychological Methods.

In the recently published volume Statistics and Causality: Methods for Applied Empirical Research, which Wiedermann and von Eye edited, the two present their methods for metric and categorical data. Other researchers from around the world present modeling approaches related to direction dependence, and leading philosophers discuss how these methods relate to philosophical accounts of causality.

Source: University of Missouri