Directed Acyclic Graphs (DAGs)

When conceptualizing and designing a study, or when developing plans to test a research question, it is important to draw a directed acyclic graph (DAG). DAGs, like path diagrams, are causal diagrams. Causal diagrams depict the hypothesized causal processes that link two or more variables. Path diagrams are typically used after analysis to describe and report the findings (e.g., when using path analysis, factor analysis, or structural equation modeling). By contrast, DAGs are particularly useful when designing a study or before analysis, because they can help specify which variables it is important to control for and, just as importantly, which variables it is important not to control for.

When drawing a DAG for your study, draw all the variables that link the hypothesized cause to the hypothesized effect, including confounds, mediators, and colliders.
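
As a rough illustration of how a drawn DAG can be turned into something you can query, the sketch below stores a small hypothetical DAG as a Python dictionary and labels variables as confounders, mediators, or colliders relative to a predictor X and an outcome Y. The variable names (C, M, K, D), the edges, and the helper functions are all invented for this example, and the classification rules are deliberately simplified; this is not a full implementation of any formal adjustment criterion.

```python
# Hypothetical DAG: each key is a cause; each value lists its direct effects.
dag = {
    "C": ["X", "Y"],  # C causes both X and Y
    "X": ["M", "K"],  # X causes M and K
    "M": ["Y"],       # M causes Y
    "Y": ["K", "D"],  # Y causes K and D
}


def descendants(graph, node, removed=frozenset()):
    """Variables reachable from `node` along directed edges, ignoring `removed`."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, []):
            if child not in seen and child not in removed:
                seen.add(child)
                stack.append(child)
    return seen


def ancestors(graph, node, removed=frozenset()):
    """Variables from which `node` can be reached, ignoring `removed`."""
    nodes = set(graph) | {c for children in graph.values() for c in children}
    return {
        n for n in nodes
        if n != node and n not in removed
        and node in descendants(graph, n, removed)
    }


def classify(graph, x, y):
    """Simplified role labels relative to predictor `x` and outcome `y`."""
    return {
        # Causes of x that also reach y by a path that does not pass through x.
        "confounders": ancestors(graph, x) & ancestors(graph, y, removed={x}),
        # Variables on a directed path from x to y.
        "mediators": descendants(graph, x) & ancestors(graph, y),
        # Effects of y that x also reaches without passing through y.
        "colliders": descendants(graph, x, removed={y}) & descendants(graph, y),
    }


roles = classify(dag, "X", "Y")
for role, variables in roles.items():
    print(role, sorted(variables))
```

In this made-up graph, D is a descendant of the outcome only, so it receives none of the three labels. Dedicated DAG software handles more complex graphs than this sketch does.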

In your study, it is important to control for all confounds. In addition, it is okay to control for ancestors of the outcome variable that are not confounds or mediators. That is, it is okay to control for variables that influence the outcome variable (Y) but that do not influence the predictor variable (X) and are not influenced by it. When these variables are included as control variables in a model, they are called precision variables. You do not need to include precision variables in the model, because the estimate of the association is already unbiased if you have controlled for all confounds. However, including precision variables in the model reduces residual variance in the outcome variable and can yield more precise estimates (i.e., smaller standard errors) of the association between the predictor variable and outcome variable.
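
The precision argument can be checked with a small simulation. The sketch below is an illustration under assumed settings, not a recommended analysis: the effect sizes (0.5 for X, 2.0 for the precision variable P), the sample size, the number of replications, and the helper `ols_coefficients` are all hypothetical choices made for this example. P influences Y only, so the estimated slope for X should be centered on the same value with or without P, while its spread across replications should shrink when P is included.

```python
import random
import statistics


def ols_coefficients(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(row[r] * row[c] for row in rows) for c in range(k)]
        + [sum(row[r] * yi for row, yi in zip(rows, y))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[r][k] / a[r][r] for r in range(k)]


def simulate_once(rng, n=200):
    """One simulated study; returns the slope for X without and with P."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    p = [rng.gauss(0, 1) for _ in range(n)]  # influences Y only
    y = [0.5 * xi + 2.0 * pi + rng.gauss(0, 1) for xi, pi in zip(x, p)]
    slope_without_p = ols_coefficients([x], y)[1]
    slope_with_p = ols_coefficients([x, p], y)[1]
    return slope_without_p, slope_with_p


rng = random.Random(1)
results = [simulate_once(rng) for _ in range(300)]
without_p = [r[0] for r in results]
with_p = [r[1] for r in results]

# Both sets of estimates should center near the simulated effect of 0.5;
# the estimates that adjust for P should vary less from study to study.
print("mean slope, P omitted: ", round(statistics.mean(without_p), 3))
print("mean slope, P included:", round(statistics.mean(with_p), 3))
print("SD of slopes, P omitted: ", round(statistics.stdev(without_p), 3))
print("SD of slopes, P included:", round(statistics.stdev(with_p), 3))
```

The standard deviation of the slope across replications plays the role of the standard error here, which is why a smaller spread corresponds to a more precise estimate.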

In addition, there are some variables that are important not to control for. It is important not to control for mediators of the association between the two variables whose causal effect you want to estimate, unless you are interested in the direct causal effect of the predictor variable on the outcome variable above and beyond the mediator. In addition, it is important not to control for a) ancestors of the predictor variable that are not confounds, b) descendants of the outcome variable, and c) colliders (unless the collider is also a confound).
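
To make the consequences concrete, the following sketch simulates two hypothetical scenarios: one in which X has no effect on Y but both influence a collider K, and one in which X affects Y only through a mediator M. All effect sizes, variable names, and helper functions are assumptions chosen for illustration. "Controlling for" a variable is done here by residualizing both X and Y on it and regressing the residuals on each other, which for linear regression gives the same slope as including the variable as a covariate.

```python
import random


def slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """The part of y that is not linearly predicted by x."""
    b = slope(x, y)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return [(yi - mean_y) - b * (xi - mean_x) for xi, yi in zip(x, y)]


def adjusted_slope(x, y, control):
    """Slope of y on x after removing `control` from both variables."""
    return slope(residuals(control, x), residuals(control, y))


rng = random.Random(42)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]

# Scenario 1 (collider): X has no effect on Y, but both influence K.
y_null = [rng.gauss(0, 1) for _ in range(n)]
k = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y_null)]
collider_unadjusted = slope(x, y_null)            # should be near 0
collider_adjusted = adjusted_slope(x, y_null, k)  # spurious negative slope

# Scenario 2 (mediator): X -> M -> Y, so the total effect is 0.8 * 0.5 = 0.4
# and there is no direct effect.
m = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y_med = [0.5 * mi + rng.gauss(0, 1) for mi in m]
mediator_unadjusted = slope(x, y_med)            # should be near 0.4
mediator_adjusted = adjusted_slope(x, y_med, m)  # should be near 0

print("collider: unadjusted", round(collider_unadjusted, 2),
      "| adjusted for K", round(collider_adjusted, 2))
print("mediator: unadjusted", round(mediator_unadjusted, 2),
      "| adjusted for M", round(mediator_adjusted, 2))
```

Under these settings, adjusting for the collider manufactures an association where none was simulated, and adjusting for the mediator removes the total effect and leaves only the (here absent) direct effect.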

For more information on DAGs, including ancestors, descendants, confounders, and colliders, see here: https://isaactpetersen.github.io/Fantasy-Football-Analytics-Textbook/causal-inference.html#sec-causalDiagrams.

After determining which variables are confounders and which are important to control for, there are various ways to control for them, as described here: https://isaactpetersen.github.io/Fantasy-Football-Analytics-Textbook/causal-inference.html#sec-causalInferenceControlVariables.

Developmental Psychopathology Lab