6  Introduction to causal mediation analysis

In this session we will go over fundamental concepts of causal inference in general and then focus more specifically on causal mediation analysis.

6.1 What is a cause?

We employ causal inference constantly in our daily lives. Consider a simple example: when you press a light switch, you expect the light to turn on. Or in another scenario, you anticipate that performing action A will result in outcome Y. In some instances, we can identify deterministic relationships where action A will lead to outcome Y. While these straightforward cause-and-effect relationships are intuitive and help us navigate the world, they become significantly more complex when we explore health outcomes.

Notation

A = received treatment/intervention/exposure (e.g., 1 = intervention, 0 = no intervention)

Y = observed outcome (e.g., 1 = developed the outcome, 0 = no outcome)

\(Y^{a=1}\) = Counterfactual outcome under treatment a=1 (i.e., the outcome had everyone, counter to the fact, received treatment a = 1)

\(Y^{a=0}\) = Counterfactual outcome under treatment a =0 (i.e., the outcome had everyone, counter to the fact, received treatment a = 0)

6.1.1 Individual causal effect

When investigating health outcomes, we would ideally want to know if you do X, then Y will happen. We could have a specific question:

Will eating more red meat give me higher blood glucose in 1 year?

To answer this question we would ideally have you consume more red meat over 1 year and measure your blood glucose levels. Then, we would turn back time, and make you eat something else over 1 year and then measure your blood glucose levels again. If there is a difference between your two outcomes, then we say there is a causal effect.

But we can never do this in the real world.

6.1.2 Average causal effect

Instead, we can perform a randomized controlled trial. We now ask a slightly different question:

Will eating more red meat give adults higher blood glucose in 1 year?

We can randomely assigning one group to consume more red meat and the other group to consume more of something else over 1 year. Then we compare the average blood glucose levels after 1 year in each of the groups. If there is a difference, we could say there is an average causal effect.

Notation

We now modify the terms a little.

\(E[Y^{a=1}]\) the average counterfactual outcome, had all subjects in the population received treatment a = 1.

\(Pr[Y^{a=1}]\) the proportion of subjects that would have developed the outcome Y had all subjects in the population of interest received treatment a = 1.

\(E[Y^{a=0}]\) the average counterfactual outcome, had all subjects in the population received no treatment a = 0.

\(Pr[Y^{a=0}]\) the proportion of subjects that would have developed the outcome Y had all subjects in the population of interest received no treatment a = 0.

6.1.3 Definition of a causal effect

More formally we can now define a causal effect (1):

\(E[Y^{a=1} = 1] - E[Y^{a=0} = 1] \ne 0\)

6.2 Association vs causation

What makes it complicated to estimate a causal effect is that we cannot observe the outcome under different treatments.

When we only have a subset of the outcomes, we have an association. This is illustrated in Figure 6.1.

Figure 6.1: Causation vs association

![Causation vs association](/images/causation-association.png){#fig-causation-association}

Causation vs association. From (1)

If we want to infer a causal effect (i.e., what would have happened, had everyone done A=1 vs A=0), we need three assumptions to be fulfilled:

  • Exchangeability

  • Consistency

  • Positivity

    Causal Assumptions
    • Exchangeability

    The risk of the outcome in A = 1 would have been the same as the risk of the outcome in A = 0, had those in A = 1 received A = 0.

    Think about a randomized trial where you, by mistake, give the intervention to the other group. The effect should be the same as the one you would have observed had the groups been correct.

    We can also have conditional exchangeability. The risk is similar in subsets of the population. That could be within the levels of a variable W (e.g., education).

    • Consistency

    The treatments under comparison are well-defined and correspond to the versions of treatment observed in the data:

    • Precise definition of \(Y^a\) via a
    • Link counterfactual outcomes with observed outcomes

    \(Y^a = Y\) for every individual with A = a.

    The observed outcome for all treated equals the outcome if they had received the treatment.

    • Positivity

    The probability of receiving every value of treatment conditional on L is greater than zero. In other words, there must be a probability of being assigned to each treatment level.

    Image if we don’t have anyone with A = 0 among those with L=1 (e.g. long education), then we cannot estimate the conditional probability of the outcome.

6.3 Causal mediation analysis

recall the major limations of tranditional mediation methods:
  • Non-linearity
  • Interactions
  • Confounding
  • Multiple mediators

Causal mediation analysis is an extension of the traditional approach by:

Note
  • Outlining all confounding assumptions needed
  • Handling non-linearity and interaction
  • Clearly defining estimands of interest

6.4 Confounding assumptions

By using DAGs, the assumptions about confounding are made much more explicit. DAG under no mediator-outcome relation affected by treatment:

We can see that we not only have to take confounding between the treatment and outcome into account, but we also have to take mediator-outcome (\(A \leftarrow W \rightarrow Y\)) confounding into account.

In addition, we can have a more complicated situation where the treatment also impacts another mediator that is also a mediator-outcome confounder.

Below DAG shows where the mediator-outcome relation is affected by treatment:

From the DAG rules, we have a special problem that we cannot solve with traditional regression approaches. If we adjust for W2 we open a backdoor path by adjusting for the collider \(W1 \rightarrow W2 \leftarrow A\). We will work on how to solve this problem later in the course.

6.5 Non-linearity and interactions

Neither the product method nor the difference method in traditional mediation analysis can effectively account for interaction effects or non-linear relationships between variables.

Causal mediation analysis addresses these limitations through more sophisticated approaches. Specifically, it can:

  • Utilize regression-based methods and alternative causal inference techniques like g-computation
  • Fundamentally differ from traditional approaches by:
    • Constructing a comprehensive causal model that explicitly incorporates non-linear relationships and interaction effects
    • Artificially manipulating the data to set the treatment and the mediator to certain values
    • Predicting outcomes using the developed causal model
    • Comparing predicted outcomes to understand causal mechanisms

Note: additional approaches also exists, but we will not focus on these in this course.

6.6 Causal mediation estimation

Counterfactual-based direct and indirect effects can be estimated from regression models, provided that the no confounding assumption hold. To be able to do so, we need a model for the mediator and a model for the outcome.

Model for the mediator:

\(E(M|A = a, W = w) = \beta_0 + \beta_1a + \beta'_2w\)

Model for the outcome:

\(E(Y|A = a, M = m, W = w) = \sigma_0 + \sigma_1a + \sigma_2m + \sigma_3am + \sigma'_4w\)

From these two regression models we can estimate the CDE, NDE and NIE.

6.7 Example

The example here include a continuous treatment, mediator and outcome. We use these equations on an additive scale because it is easier to explain the intuition behind the estimands. Similar equations exist for other types and combinations of variables.

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# This includes readr, or use library(readr) specifically
nhanes <- read_csv(here::here("data/NHANES.csv")) 
Rows: 10000 Columns: 76
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (31): SurveyYr, Gender, AgeDecade, Race1, Race3, Education, MaritalStatu...
dbl (45): ID, Age, AgeMonths, HHIncomeMid, Poverty, HomeRooms, Weight, Lengt...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# keep the variables used in the analyses 
data <- nhanes %>% 
  dplyr::select(ID, 
                w1 = Age, 
                w2 = Gender, 
                w3 = Education, 
                w4 = Smoke100, 
                a = PhysActive,  # this is the exposure
                m = BMI,         # this is the mediator
                y = Pulse,       # this is the outcome
                y2 = Diabetes ) %>% 
  na.omit()

We can now run a model for the mediator only adjusting for a single variable. In practice you would adjust for many more variables to satisfy the confounding assumptions.

lm_m <- lm(m ~ a + w1 , data = data)
summary(lm_m)

Call:
lm(formula = m ~ a + w1, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-14.741  -4.578  -1.086   3.421  51.417 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 29.440321   0.267707 109.972   <2e-16 ***
aYes        -1.883085   0.162184 -11.611   <2e-16 ***
w1           0.007546   0.004793   1.575    0.115    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.627 on 6913 degrees of freedom
Multiple R-squared:  0.02101,   Adjusted R-squared:  0.02073 
F-statistic: 74.19 on 2 and 6913 DF,  p-value: < 2.2e-16

Here, we see that physical activity (a) is associated with lower pulse (y).

Now, we run the model for the outcome, notice we have included an interaction term.

lm_y <- lm(y ~ a + m + a:m + w1 , data = data)
summary(lm_y)

Call:
lm(formula = y ~ a + m + a:m + w1, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-32.464  -8.254  -1.007   7.254  68.847 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 75.200976   0.954088  78.820  < 2e-16 ***
aYes        -5.625108   1.252664  -4.491 7.22e-06 ***
m            0.143529   0.028201   5.090 3.68e-07 ***
w1          -0.122102   0.008402 -14.533  < 2e-16 ***
aYes:m       0.125536   0.042449   2.957  0.00311 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 11.61 on 6911 degrees of freedom
Multiple R-squared:  0.04701,   Adjusted R-squared:  0.04646 
F-statistic: 85.23 on 4 and 6911 DF,  p-value: < 2.2e-16

In this model we can see:

  • higher physical activity (a) decreases pulse, independent of the mediator (BMI)

  • excess body weight (BMI) is associated with higher pulse, independent of physical activity

  • there is statistically significant interaction between physical activity (a) and BMI (m)

    Based on the coefficients of these two models, we can now estimate the different effects.

The equations are from (2) and (3).

6.7.1 Controlled direct effect

The controlled direct effect can be obtained using this formula based on the regression coefficients:

\(CDE(m) = (\sigma_1 + \sigma_3*m)(a - a^*)\)

\(a^*\) is for a change from level \(a^*\) (=control) to level \(a\) (=intervention).

For the controlled direct effect we set m to a specific value.

Let’s look at the distribution of m (BMI).

data %>%
  select(m) %>%
  summary()
       m        
 Min.   :15.02  
 1st Qu.:24.10  
 Median :27.80  
 Mean   :28.80  
 3rd Qu.:32.22  
 Max.   :81.25  
median_m <- round(median(data$m), digits = 1)

In this example we will set it to the median of BMI, at 27.8 which is a low level. You can also try to set BMI to 25kg/m2.

CDE_m <- (lm_y$coefficients[2] + lm_y$coefficients[5] * median_m) * (1 - 0)
round(as.numeric(CDE_m), digits = 2)
[1] -2.14

The CDE is how much the outcome, here the pulse, would change on average if the mediator, here BMI, were fixed at level (m = 27.8) uniformly in the population but the treatment, physical activity, was changed from 0 to 1.

The CDE answer the question, what would be the effect of A on Y, when fixing M at a specific value for everyone in the population (4)?

6.7.2 (Pure) natural direct effect

PNDE can be obtained using this formula:

\(PNDE = (\sigma_1 + \sigma_3 * (\beta_0 + \beta_1*a^* + \beta'_2*w))(a - a^*)\)

We set the value for the confounder w1 for the interaction to be the mean value of w1.

mean_w1 <- mean(data$w1)
mean_w1
[1] 47.17467

Then we calculate the PNDE.

PNDE <- (lm_y$coefficients[2] + (lm_y$coefficients[5] * (lm_m$coefficients[1] + lm_m$coefficients[2] * 0 + lm_m$coefficients[3] * mean_w1))) * (1 - 0)
round(as.numeric(PNDE), digits = 2)
[1] -1.88

The PNDE is how much the outcome, pulse, would change if the treatment a, physical activity, was set at 1 versus 0 but for each individual the mediator was kept at the level it would have taken, for that individual, in the absence of the exposure.

how pulse rate would change if the treatment (or exposure) physically active (a=1) was set at 1 versus 0, while for each individual the mediator (BMI) was kept at the level it would naturally be in the absense of physical activity(a=0).

In other words, it measures the direct effect of physical activity on pulse rate that doesn’t work through changing BMI.

6.7.3 (Total) natural direct effect

TNDE can be obtained by this formula:

\(TNDE = (\sigma_1 + \sigma_3 * (\beta_0 + beta_1*a + \beta'_2*w))(a - a^*)\)

TNDE <- (lm_y$coefficients[2] + (lm_y$coefficients[5] * (lm_m$coefficients[1] + lm_m$coefficients[2] * 1 + lm_m$coefficients[3] * mean_w1))) * (1 - 0)

round(as.numeric(TNDE), digits = 2)
[1] -2.12

Here, the value of M is enabled to act (as opposed to the PNDE). Here we see a little lower pulse rate, this means we’re measuring the direct effect of physical activity on pulse rate while allowing BMI to be at the level it would naturally be with physical activity (a=1).

The TNDE asks the question: “to what extent does physical activity cause lower pulse via pathways other than through BMI, allowing BMI to boost up or tune down such effect at the same time?” (4)

6.7.4 Total natural indirect effect

The TNIE can be obtained by this formula:

\(TNIE = (\sigma_2 * \beta_1 + \sigma_3 * \beta_1 * a)(a - a^*)\)

TNIE <- ((lm_y$coefficients[3] * lm_m$coefficients[2]) + (lm_y$coefficients[5] * lm_m$coefficients[2] * 1)) * (1 - 0)

round(as.numeric(TNIE), digits = 2)
[1] -0.51

The TNIE is how much the outcome would change on average if the treatment was fixed at level a = 1 but the mediator was changed from the level it would take if a* = 0 (physical inactive) to the level it would take if a =1 (physical active).

Note that exposure has to have an effect on M otherwise this will be zero.

The TNIE asks the question: to what extent does physical activity cause pulse via BMI and the possible interaction between physical activity and BMI in affecting pulse rate? In other words, the effect of exposure that ‘would be prevented if the exposure did not cause the mediator’ (i.e., the portion of the effect for which mediation is ‘necessary’) (4).

This is often the effect we are interested in in biomedical research for questions regarding mediation.

6.7.5 Pure natural indirect effect

The PNIE can be obtained by this formula:

\(PNIE = (\sigma_2 * \beta_1 + \sigma_3 * \beta_1 * a^*)(a - a^*)\)

PNIE <- ((lm_y$coefficients[3] * lm_m$coefficients[2]) + (lm_y$coefficients[5] * lm_m$coefficients[2] * 0)) * (1 - 0)

round(as.numeric(PNIE), digits = 2)
[1] -0.27

The PNIE is different from the TNIE because it does not include the interaction effect. We estimate the effect of red meat on inflammation and then subsequent effect of inflammation on blood glucose.

The PNIE answer the question: “to what extent does physical activity cause lower pulse rate via BMI only (i.e., due to physical activity affecting BMI and subsequently, BMI affecting pulse rate), not accounting for the possible interaction between physical activity and BMI? In other words, the effect that the exposure would have had if ‘its only action were to cause the mediator’ (i.e., the portion of the effect for which mediation is ‘’sufficient’’)” (4).

6.7.6 Total effect

The total effect can be decomposed as:

\(TE = PNDE + TNIE\)

TE <- PNDE + TNIE

round(as.numeric(TE), digits = 2)
[1] -2.39

This is the overall effect of red meat on blood glucose levels. Higher red meat is associated with higher blood glucose.

6.7.7 Proportion mediation

From this, we can calculate the proportion mediated.

\(PM = \frac{TNIE}{TE}\)

PM <- TNIE / TE

as.numeric(PM) * 100
[1] 21.18849

21% of the association between physical activity and pulse rate is mediated by the mediator BMI.

6.8 Using R package

We can easily conduct mediation analysis using R package once you understand the basic concepts of mediation.

library(CMAverse)
res_rb <- cmest(
  data = data, 
  model = "rb",     #rb means regression based 
  outcome = "y",    #this is the name of outcome
  exposure = "a",   #this is the name of exposure
  mediator = "m",   #this is the name of mediator
  basec = c("w1"),  #this is confounding factors, you can add more confounders here 
  EMint = TRUE,     #whether include interaction, TRUE indicates to include 
  mreg = list("linear"), #specify the mediator model, we use linear regression here 
  yreg = "linear",  #specify the outcome model, we use linear regression here 
  astar = 0,       # the level of exposure (not exposed)
  a = 1,           # the level of exposure (exposed)
  mval = list(27.8),  #specifiy the value of controlled mediator 
  estimation = "paramfunc", #parameteric function 
  inference = "delta"
)
Warning in estinf(): a is not a value of the exposure; Yes is used
Warning in estinf(): astar is not a value of the exposure; No is used
summary(res_rb)
Causal Mediation Analysis

# Outcome regression:

Call:
glm(formula = y ~ a + m + a * m + w1, family = gaussian(), data = getCall(x$reg.output$yreg)$data, 
    weights = getCall(x$reg.output$yreg)$weights)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 75.200976   0.954088  78.820  < 2e-16 ***
aYes        -5.625108   1.252664  -4.491 7.22e-06 ***
m            0.143529   0.028201   5.090 3.68e-07 ***
w1          -0.122102   0.008402 -14.533  < 2e-16 ***
aYes:m       0.125536   0.042449   2.957  0.00311 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 134.882)

    Null deviance: 978151  on 6915  degrees of freedom
Residual deviance: 932170  on 6911  degrees of freedom
AIC: 53553

Number of Fisher Scoring iterations: 2


# Mediator regressions: 

Call:
glm(formula = m ~ a + w1, family = gaussian(), data = getCall(x$reg.output$mreg[[1L]])$data, 
    weights = getCall(x$reg.output$mreg[[1L]])$weights)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 29.440321   0.267707 109.972   <2e-16 ***
aYes        -1.883085   0.162184 -11.611   <2e-16 ***
w1           0.007546   0.004793   1.575    0.115    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 43.91474)

    Null deviance: 310099  on 6915  degrees of freedom
Residual deviance: 303583  on 6913  degrees of freedom
AIC: 45790

Number of Fisher Scoring iterations: 2


# Effect decomposition on the mean difference scale via the regression-based approach
 
Closed-form parameter function estimation with 
 delta method standard errors, confidence intervals and p-values 
 
              Estimate Std.error   95% CIL 95% CIU    P.val    
cde          -2.135202  0.289670 -2.702945  -1.567 1.69e-13 ***
pnde         -1.884592  0.290945 -2.454834  -1.314 9.33e-11 ***
tnde         -2.120987  0.289390 -2.688180  -1.554 2.32e-13 ***
pnie         -0.270278  0.057982 -0.383921  -0.157 3.14e-06 ***
tnie         -0.506673  0.073986 -0.651682  -0.362 7.48e-12 ***
te           -2.391265  0.286308 -2.952417  -1.830  < 2e-16 ***
intref        0.250610  0.086005  0.082043   0.419 0.003570 ** 
intmed       -0.236395  0.082486 -0.398066  -0.075 0.004159 ** 
cde(prop)     0.892918  0.029698  0.834710   0.951  < 2e-16 ***
intref(prop) -0.104802  0.038003 -0.179288  -0.030 0.005821 ** 
intmed(prop)  0.098858  0.036181  0.027944   0.170 0.006290 ** 
pnie(prop)    0.113027  0.027197  0.059721   0.166 3.24e-05 ***
pm            0.211885  0.038694  0.136046   0.288 4.35e-08 ***
int          -0.005945  0.006247 -0.018189   0.006 0.341319    
pe            0.107082  0.029698  0.048875   0.165 0.000311 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(cde: controlled direct effect; pnde: pure natural direct effect; tnde: total natural direct effect; pnie: pure natural indirect effect; tnie: total natural indirect effect; te: total effect; intref: reference interaction; intmed: mediated interaction; cde(prop): proportion cde; intref(prop): proportion intref; intmed(prop): proportion intmed; pnie(prop): proportion pnie; pm: overall proportion mediated; int: overall proportion attributable to interaction; pe: overall proportion eliminated)

Relevant variable values: 
$a
[1] "Yes"

$astar
[1] "No"

$mval
$mval[[1]]
[1] 27.8


$basecval
$basecval[[1]]
[1] 47.17467

6.9 Defining estimands

Imagine we have a hypothetical randomized controlled trial where we give participants treatment or no treatment on a specific outcome Y.

\(Y^{a=1} - Y^{a=0}\)

For mediation, we are also interested in the effect of a mediator on this pathway. Now image that we also intervene on the mediator in a new hypothetical randomized controlled trial.

\(Y^{m=1} - Y^{m=0}\)

Now consider if we, in the same trial, could intervene on both because we are interested in whether treatment causes the outcome because it causes the mediator.

Notation

\(Y^a\) = a subject’s outcome if treatment A were set, possible contrary to fact, to a

\(M^a\) = a subject’s value of the mediator if the exposure A were set to the value of a

\(Y^{a,m}\) = a subject’s outcome if A were set to a and M were set to m

\(Y^{a,M_a}\) = a subject’s outcome if A were set to a and M were set the value m would have had had a been set to a. Note, this is a nested counterfactual

We can now define these estimands:

  • the controlled direct effect (CDE)
  • natural direct effect (NDE)
  • natural indirect effect (NIE)

6.9.1 Controlled direct effect

The effect of A on Y not mediated through M. Fixing the value of M to m.

\(Y^{a=1,m}\) - \(Y^{a=0,m}\)

We intervene on \(a\) but fix \(m\) to a certain value. The CDE is how much the outcome would change on average if the mediator were fixed at level m uniformly in the population but the treatment were changed from 0 to 1.

This could be relevant in the context of a change in a policy that impacted the mediator for everyone. For instance, if air pollution was a mediator between physical activity and cardiovascular disease risk. If a new policy would change the level of air pollution for all while we implement an intervention to increase biking in the city.

This effect is not used that often. But can be highly relevant in some situations.

6.9.2 (Pure) Natural direct effect

The effect that would remain, if we were to disable the pathway from exposure to mediator.

\(Y^{a=1,M_a=0}\) - \(Y^{a=0,M_a=0}\)

The PNDE is how much the outcome would change if the exposure was set at a = 1 versus a* = 0 but for each individual the mediator was kept at the level it would have taken, for that individual, in the absence of the exposure.

Note that the word “natural” refers to the nested counterfactual, the level the mediator would have taken in the absence of exposure. What it would naturally have been in the absence of exposure.

6.9.3 Total natural direct effect

\(Y^{a=1,M_a=1}\) - \(Y^{a=0,M_a=1}\)

Note, different from above in that the mediator is kept at the level it would have taken in the presence of the exposure.

6.9.4 (Total) Natural indirect effect

The effect of the mediator pathway.

\(Y^{a=1,M_a=1}\) - \(Y^{a=1,M_a=0}\)

The NIE is how much the outcome would change on average if the exposure were fixed at level a = 1 but the mediator were changed from the level it would take if a* = 0 to the level it would take if a = 1.

Note that exposure has to have an effect on M otherwise this will be zero.

6.9.5 Pure natural indirect effect

\(Y^{a=0,M_a=1}\) - \(Y^{a=0,M_a=0}\)

Note, this is different from the TNIE in that the exposure is set to no intervention.

6.9.6 Interaction effects

More information on the interaction effects can be found in (3).

Reference interaction:

\(INT_{ref} = PNDE - CDE\)

Mediation interaction:

\(INT_{med} = TNIE - PNIE\)

6.9.7 Effect decomposition

Using the causal inference framework also allow for effect decomposition, even when there are interaction and non-linearity.

Effect decomposition is important when we want to assess relative contributions such as the proportion mediated and eliminated.

TE = PNDE + TNIE = TNDE + PNIE

More information about decomposition of effects can be found in (3).

6.9.8 Proportions

Proportion mediated:

\(PM = NIE / TE\)

Proportion eliminated:

\(PE = (TE-CDE(m)) / TE\)

6.10 References

1.
Hernan MA. A definition of causal effect for epidemiological research. Journal of Epidemiology & Community Health [Internet] 2004 [cited 2024 May 6];58:265–71. Available from: https://jech.bmj.com/lookup/doi/10.1136/jech.2002.006361
2.
Valeri L, VanderWeele TJ. Mediation analysis allowing for exposure-mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological methods [Internet] 2013 [cited 2024 May 14];18:137–50. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3659198/
3.
VanderWeele TJ. A unification of mediation and interaction: A four-way decomposition. Epidemiology (Cambridge, Mass) [Internet] 2014 [cited 2024 May 14];25:749–61. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4220271/
4.
Wang A, Arah OA. G-computation demonstration in causal mediation analysis. European Journal of Epidemiology [Internet] 2015 [cited 2023 Jun 20];30:1119–27. Available from: https://doi.org/10.1007/s10654-015-0100-z