Chapter 55 Estimating a Mediation Model Using Path Analysis

In this chapter, we will learn how to estimate a mediation analysis using path analysis.

55.1 Conceptual Overview

Mediation analysis is used to investigate whether an intervening variable transmits the effects of a predictor variable on an outcome variable. Both multiple regression and structural equation modeling (path analysis) can be used to perform mediation analysis, but structural equation modeling (path analysis) offers a single-step process that often more efficient. When implemented using path analysis, mediation analysis follows the same model specification, model identification, and model fit considerations as general applications of path analysis. For more information on path analysis, please refer to the previous chapter on path analysis.

Just as in a conventional path analysis model, we often represent our mediation analysis model visually as a path diagram. For illustration purposes, let’s begin with a path diagram of the direct relation between a predictor variable called Attitude with an outcome variable called Behavior (see Figure 1). This path diagram specifies Attitude (towards the Behavior) as a predictor of the Behavior, with any error in prediction captured in the (residual) error term. By convention, we commonly refer to this path as the c path.

Figure 1: Path diagram of the direct relation between Attitude (predictor variable) and Behavior (outcome variable)
Figure 1: Path diagram of the direct relation between Attitude (predictor variable) and Behavior (outcome variable)

In the path diagram depicted in Figure 2, we introduce an intervening variable, which we commonly refer to as the mediator variable – or just as the mediator. As we saw in Figure 1, a direct relation between the predictor variable Attitude and the outcome variable Behavior is shown; however, note that we now refer to this relation as the c’ path (spoken: “c prime”) because it represents the direct effect of the predictor variable on the outcome variable in the presence of (i.e., statistically controlling) for the effect of the mediator variable. The mediator variable is called Intention, and as you can see, it intervenes between the predictor and outcome variables. The direct relation between the predictor variable and mediator variable is commonly referred to as the a path, and note that because the mediator variable is an endogenous variable (i.e., has a specified cause in the model), it as a (residual) error term. The direct relation between the mediator variable and the outcome variable is commonly referred to as the b path. The effect of the predictor variable on the outcome variable via the mediating variable is referred to as the indirect effect given that the predictor variable indirectly influences the outcome variable through another variable; in other words, the mediator variable transmits the effect of the predictor variable to the outcome variable.

Figure 2: Path diagram of the mediation analysis model in which Intention (mediator) mediates the effect of Attitude (predictor variable) on Behavior (outcome variable)
Figure 2: Path diagram of the mediation analysis model in which Intention (mediator) mediates the effect of Attitude (predictor variable) on Behavior (outcome variable)

The mediation model shown in Figure 2 is an example of a partial mediation. We can distinguish between a partial and a complete mediation. A partial mediation is a mediation in which the direct relation (c’) between the predictor variable and the outcome variable is non-zero; in other words, a direct relation between the predictor and the outcome is estimated as a free parameter. A complete mediation (i.e., full mediation) is a mediation in which the direct relation (c’) between the predictor variable and the outcome variable is zero; in other words, the direct relation between the predictor and the outcome is fixed at zero, which means we don’t specify the direct relation at all in our model – or we specify the direct relation but fix it to zero. A complete mediation is shown in Figure 3.

Figure 3: Examples of two equivalent path diagram representations of the same complete mediation
Figure 3: Examples of two equivalent path diagram representations of the same complete mediation

55.1.1 Estimation of Indirect Effect

There are different approaches to estimating the indirect effect, such as causal steps (Baron and Kenny 1986), difference in coefficients, and product of coefficients (MacKinnon and Dwyer 1993). Because the product of coefficients approach is efficient to implement, we will focus on that approach in this tutorial. Using this approach, the indirect effect is estimated by computing the product of the coefficients for directional relations (i.e., paths), which include the relation between the predictor variable and the mediator variable (a path) and the relation between the mediator variable and the outcome variable (b). The statistical significance of the indirect effect can be computed by dividing the product of the coefficients by its standard error, and then the resulting ratio (quotient) can be compared to a standard normal distribution – or another distribution.

There are different methods for implementing the product of coefficients approach, and we will focus on the (multivariate) delta method and a resampling method. First, in the (multivariate) delta method (where the Sobel test is a specific form), the standard error for the product of coefficients is compared to a standard normal distribution to determine the indirect effect’s significance level (i.e., p-value). This method does tend to show bias in smaller samples [e.g., N < 50; MacKinnon, Warsi, and Dwyer (1995); MacKinnon et al. (2002)], however, and assumes the sampling distribution of the indirect effect is normal, which is not often a tenable assumption when dealing with the product of two coefficients. Second, resampling methods constitute a broad family of methods in which data are re-sampled (with replacement) a specified number of times (e.g., 5,000), and a normal sampling distribution need not be assumed. Examples of resampling methods broadly include bootstrapping, the permutation method, and the Monte Carlo method. Bootstrapping is perhaps one of the most common a popular resampling methods today, and bootstrapping can be used to estimate confidence intervals for an indirect effect estimated using the product of coefficients approach. Many resampling approaches tend to show some bias in small samples [e.g., N < 20-80; Koopman et al. (2015)]; however, resampling methods like bootstrapping are often preferred over the delta method because they do not hold the strict assumption that the sampling distribution is normal. There are a multiple types of bootstrapping, such as percentile, bias-corrected, and bias-corrected and accelerated, most of which can be implemented with relative ease in R.

55.1.2 Model Identification

Because we are testing for mediation using path analysis in this chapter, please refer to the Model Identification section from previous chapter on path analysis.

55.1.3 Model Fit

please refer to the Model Fit section from previous chapter on path analysis.

55.1.4 Parameter Estimates

In mediation analysis, parameter estimates (e.g., direct relations, covariances, variances) can be interpreted like those from a regression model, where the associated p-values or confidence intervals can be used as indicators of statistical significance.

55.1.5 Statistical Assumptions

The statistical assumptions that should be met prior to estimating and/or interpreting a mediation analysis model are the same that should be met when running a path analysis model; for a review of those assumptions please review to the previous chapter path analysis.

55.1.6 Conceptual Video

For a more in-depth review of path analysis, please check out the following conceptual video.

Link to conceptual video: https://youtu.be/3zA4iB0aAvg

55.2 Tutorial

This chapter’s tutorial demonstrates perform mediation analysis in a path analysis framework using R.

55.2.1 Video Tutorial

As usual, you have the choice to follow along with the written tutorial in this chapter or to watch the video tutorial below.

Link to video tutorial: https://youtu.be/za-KTX1A5us

55.2.2 Functions & Packages Introduced

Function Package
sem lavaan
summary base R
set.seed base R
parameterEstimates lavaan

55.2.3 Initial Steps

If you haven’t already, save the file called “PlannedBehavior.csv” into a folder that you will subsequently set as your working directory. Your working directory will likely be different than the one shown below (i.e., "H:/RWorkshop"). As a reminder, you can access all of the data files referenced in this book by downloading them as a compressed (zipped) folder from the my GitHub site: https://github.com/davidcaughlin/R-Tutorial-Data-Files; once you’ve followed the link to GitHub, just click “Code” (or “Download”) followed by “Download ZIP”, which will download all of the data files referenced in this book. For the sake of parsimony, I recommend downloading all of the data files into the same folder on your computer, which will allow you to set that same folder as your working directory for each of the chapters in this book.

Next, using the setwd function, set your working directory to the folder in which you saved the data file for this chapter. Alternatively, you can manually set your working directory folder in your drop-down menus by going to Session > Set Working Directory > Choose Directory…. Be sure to create a new R script file (.R) or update an existing R script file so that you can save your script and annotations. If you need refreshers on how to set your working directory and how to create and save an R script, please refer to Setting a Working Directory and Creating & Saving an R Script.

# Set your working directory
setwd("H:/RWorkshop")

Next, read in the .csv data file called “PlannedBehavior.csv” using your choice of read function. In this example, I use the read_csv function from the readr package (Wickham, Hester, and Bryan 2024). If you choose to use the read_csv function, be sure that you have installed and accessed the readr package using the install.packages and library functions. Note: You don’t need to install a package every time you wish to access it; in general, I would recommend updating a package installation once ever 1-3 months. For refreshers on installing packages and reading data into R, please refer to Packages and Reading Data into R.

# Install readr package if you haven't already
# [Note: You don't need to install a package every 
# time you wish to access it]
install.packages("readr")
# Access readr package
library(readr)

# Read data and name data frame (tibble) object
df <- read_csv("PlannedBehavior.csv")
## Rows: 199 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (5): attitude, norms, control, intention, behavior
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Print the names of the variables in the data frame (tibble) object
names(df)
## [1] "attitude"  "norms"     "control"   "intention" "behavior"
# Print variable type for each variable in data frame (tibble) object
str(df)
## spc_tbl_ [199 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ attitude : num [1:199] 2.31 4.66 3.85 4.24 2.91 2.99 3.96 3.01 4.77 3.67 ...
##  $ norms    : num [1:199] 2.31 4.01 3.56 2.25 3.31 2.51 4.65 2.98 3.09 3.63 ...
##  $ control  : num [1:199] 2.03 3.63 4.2 2.84 2.4 2.95 3.77 1.9 3.83 5 ...
##  $ intention: num [1:199] 2.5 3.99 4.35 1.51 1.45 2.59 4.08 2.58 4.87 3.09 ...
##  $ behavior : num [1:199] 2.62 3.64 3.83 2.25 2 2.2 4.41 4.15 4.35 3.95 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   attitude = col_double(),
##   ..   norms = col_double(),
##   ..   control = col_double(),
##   ..   intention = col_double(),
##   ..   behavior = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
# Print first 6 rows of data frame (tibble) object
head(df)
## # A tibble: 6 × 5
##   attitude norms control intention behavior
##      <dbl> <dbl>   <dbl>     <dbl>    <dbl>
## 1     2.31  2.31    2.03      2.5      2.62
## 2     4.66  4.01    3.63      3.99     3.64
## 3     3.85  3.56    4.2       4.35     3.83
## 4     4.24  2.25    2.84      1.51     2.25
## 5     2.91  3.31    2.4       1.45     2   
## 6     2.99  2.51    2.95      2.59     2.2
# Print number of rows in data frame (tibble) object
nrow(df)
## [1] 199

There are 5 variables and 199 cases (i.e., employees) in the df data frame: attitude, norms, control, intention, and behavior. Per the output of the str (structure) function above, all of the variables are of type numeric (continuous: interval/ratio). All of the variables are self-reports from a survey, where respondents rated their level of agreement (1 = strongly disagree, 5 = strongly agree) for the items associated with the attitude, norms, control, and intention scales, and where respondents rated the frequency with which they enact the behavior associated with the behavior variable (1 = never, 5 = all of the time). The attitude variable reflects an employee’s attitude toward the behavior in question. The norms variable reflects an employee’s perception of norms pertaining to the enactment of the behavior. The control variable reflects an employee’s feeling of control over being able to perform the behavior. The intention variable reflects an employee’s intention to enact the behavior. The behavior variable reflects an employee’s perception of the frequency with which they actually engage in the behavior.

55.2.4 Specify & Estimate a Mediation Analysis Model

We will use functions from the lavaan package (Rosseel 2012) to specify and estimate our mediation analysis model. The lavaan package also allows structural equation modeling with latent variables, but we won’t cover that in this tutorial. If you haven’t already install and access the lavaan package using the install.packages and library functions, respectively. For background information on the lavaan package, check out the package website.

# Install package
install.packages("lavaan")
# Access package
library(lavaan)

Let’s specify a partial mediation model in which attitude is the predictor variable, intention is the mediator variable, and behavior is the outcome variable. [Note: If we wanted to specify a complete mediation model, then we would not specify the direct relation from the predictor variable to the outcome variable.] We will first apply the delta method for estimating the indirect effect and then follow it up with the percentile bootstrapping method.

55.2.4.1 Delta Method

To apply the delta method, which is a product of coefficients method, we will do the following.

  1. We will specify the model and name it specmod using the <- operator.
  • If we precede a predictor variable with a letter or a series of characters without spaces between them and then follow that up with an asterisk (*), we can label a path (or covariance or variance) so that later we can reference it. In this case, we will label the direct relation between the predictor variable (attitude) and the outcome variable as c (i.e., behavior ~ c*attitude), and we will label the paths between the predictor variable and the mediator variable and the mediator variable and the outcome variable as a and b (respectively).
  • To estimate the indirect effect using the product of coefficients method, we will type an arbitrary name for the indirect effect (such as ab) followed by the := operator, and then followed by the product of the a and b paths we previously labeled (i.e., a*b).
  1. We will create a model estimation object (which here we label fitmod), and apply the sem function from lavaan, with the model specification object (specmod) as the first argument. As the second argument, type data= followed by the name of the data from which the variables named in the model specification object come (df).
  2. Type the name of the summary function from base R. As the first argument, type the name of the model estimation object we created in the previous step (fitmod). As the second and third arguments, respectively, type fit.measures=TRUE and rsquare=TRUE to request the model fit indices and the unadjusted R2 values for the endogenous variables.
# Specify mediation analysis model
specmod <- "
# Path c' (direct effect)
behavior ~ c*attitude
# Path a
intention ~ a*attitude
# Path b
behavior ~ b*intention

# Indirect effect (a*b)
ab := a*b
"

# Estimate model
fitmod <- sem(specmod, data=df)

# Request summary of results
summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
## lavaan 0.6.15 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                           199
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               103.700
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -481.884
##   Loglikelihood unrestricted model (H1)       -481.884
##                                                       
##   Akaike (AIC)                                 973.767
##   Bayesian (BIC)                               990.234
##   Sample-size adjusted Bayesian (SABIC)        974.393
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   behavior ~                                          
##     attitude   (c)    0.029    0.071    0.412    0.680
##   intention ~                                         
##     attitude   (a)    0.484    0.058    8.333    0.000
##   behavior ~                                          
##     intention  (b)    0.438    0.075    5.832    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .behavior          0.698    0.070    9.975    0.000
##    .intention         0.623    0.062    9.975    0.000
## 
## R-Square:
##                    Estimate
##     behavior          0.199
##     intention         0.259
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     ab                0.212    0.044    4.778    0.000

Again, we have 199 observations (employees) used in this analysis, and the degrees of freedom is equal to 0, which indicates that our model is just-identified. Because the model is just-identified, there are no available degrees of freedom with which to estimate some of the common model fit indices (e.g., RMSEA), so we will not evaluate model fit. Instead, we will move on to interpreting model parameter estimates. The path coefficient (i.e., direct relation, presumed causal relation) between attitude and intention (i.e., path a) is statistically significant and positive (b = .484, p < .001). Further, the path coefficient between intention and behavior (i.e., path b) is also statistically significant and positive (b = .484, p < .001). The direct effect path between attitude and behavior, however, is nonsignificant (b = .029, p = .680). The (residual) error variances for behavior and intentions are not of substantive interest, so we will ignore them. The unadjusted R2 values for the endogenous variable intention is .259, which indicates that attitude explains 25.9% of the variance in intention, and the unadjusted R2 value for the outcome variable behavior is .199, which indicates that, collectively attitude and intention explain 19.9% of the variance in behavior. Moving on to the estimate of the indirect effect (which you can find in the Defined Parameters section), we see that the indirect effect object we specified (ab) is positive and statistically significant (IE = .212, p < .001), which indicates that intention mediates the association between attitude and intention. Again, this estimate was computed using the delta method.

55.2.4.2 Percentile Bootstrapping Method

The percentile bootstrapping method is generally recommended over the delta method, and thus, my recommendation is to use the percentile bootstrapping method when estimating indirect effects. Depending upon the processing speed and power of your machine, the approach make take a minute or multiple minutes to perform the number of bootstrap iterations (i.e., re-samples) requested. The initial model specification and model estimation steps the same as would be used for the delta method, as we will use these to get the model fit information and direct effect parameter estimates, which don’t require bootstrapping. However, after that, differences emerge.

  1. Specify the model in the same way as was done previously in preparation for the delta method, and name the object something (e.g., specmod).
  2. Estimate the model using the sem function, just as you would with the delta method, as this will generate the model fit indices and parameter estimates that we will interpret for direction relations; let’s name the model estimation object fitmod1 in this example.
  3. Just as you would with the delta method, type the name of the summary function from base R. As the first argument, type the name of the model estimation object we created in the previous step (fitmod1). As the second and third arguments, respectively, type fit.measures=TRUE and rsquare=TRUE to request the model fit indices and the unadjusted R2 values for the endogenous variables.
  4. Bootstrapping involves random draws (i.e., re-samples) so the results may vary each time unless we set a “random seed” to allow us to reproduce the results; using the set.seed function from base R, let’s choose 2019 as the arbitrary seed value, and enter it as the sole parenthetical argument in the set.seed function.
  5. Now it’s time to generate the bootstrap re-sampling by once again applying the sem function. This time, let’s use the <- operator to name the model estimation object fitmod2 to distinguish it from the one we previously created.
  • As the first argument in the sem function, the name of the model specification object is entered (specmod).
  • As the second argument, data= is entered, followed by the name of the data frame object (df).
  • As the third argument, the approach for estimating the standard errors is indicated by typing se= followed by bootstrap to request bootstrap standard errors.
  • As the fourth argument, type bootstrap= followed by the number of bootstrap re-samples (with replacement) we wish to request. Generally, requesting at least 5,000 re-sampling iterations is advisable, so let’s type 5000 after bootstrap=.
  1. Type the name of the parameterEstimates function from the lavaan package to request that percentile bootstrapping be applied to the 5,000 re-sampling iterations requested earlier.
  • As the first argument, type the name of the model estimation object (fitmod).
  • As the second argument, type ci=TRUE to request confidence intervals.
  • As the third argument, type level=0.95 to request the 95% confidence interval be estimated.
  • As the fourth argument, type boot.ci.type="perc" to apply the percentile bootstrap approach. It may take a few minutes for your requested bootstraps to run.
# Specify mediation analysis model
specmod <- "
# Path c' (direct effect)
behavior ~ c*attitude
# Path a
intention ~ a*attitude
# Path b
behavior ~ b*intention

# Indirect effect (a*b)
ab := a*b
"

# Estimate model (direct effect parameter estimates)
fitmod1 <- sem(specmod, data=df)

# Request summary of results
summary(fitmod1, fit.measures=TRUE, rsquare=TRUE)
## lavaan 0.6.15 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                           199
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               103.700
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)               -481.884
##   Loglikelihood unrestricted model (H1)       -481.884
##                                                       
##   Akaike (AIC)                                 973.767
##   Bayesian (BIC)                               990.234
##   Sample-size adjusted Bayesian (SABIC)        974.393
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   behavior ~                                          
##     attitude   (c)    0.029    0.071    0.412    0.680
##   intention ~                                         
##     attitude   (a)    0.484    0.058    8.333    0.000
##   behavior ~                                          
##     intention  (b)    0.438    0.075    5.832    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .behavior          0.698    0.070    9.975    0.000
##    .intention         0.623    0.062    9.975    0.000
## 
## R-Square:
##                    Estimate
##     behavior          0.199
##     intention         0.259
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     ab                0.212    0.044    4.778    0.000
# Set seed
set.seed(2022)

# Estimate model (bootstrap)
fitmod2 <- sem(specmod, data=df, se="bootstrap", bootstrap=5000)

# Request percentile bootstrap 95% confidence intervals
parameterEstimates(fitmod2, ci=TRUE, level=0.95, boot.ci.type="perc")
##         lhs op       rhs label   est    se      z pvalue ci.lower ci.upper
## 1  behavior  ~  attitude     c 0.029 0.065  0.451  0.652   -0.099    0.156
## 2 intention  ~  attitude     a 0.484 0.054  8.980  0.000    0.378    0.587
## 3  behavior  ~ intention     b 0.438 0.071  6.166  0.000    0.297    0.576
## 4  behavior ~~  behavior       0.698 0.056 12.375  0.000    0.581    0.800
## 5 intention ~~ intention       0.623 0.055 11.351  0.000    0.515    0.730
## 6  attitude ~~  attitude       0.928 0.000     NA     NA    0.928    0.928
## 7        ab :=       a*b    ab 0.212    NA     NA     NA    0.136    0.300

Note: If you receive the following error message, you can safely ignore it, as it simply means that some of the bootstrap draws (i.e., re-samples) did not work successfully but most did; we can still be confident in our findings, and this warning does not apply to the original model we estimated prior to the bootstrapping.

lavaan WARNING: the optimizer warns that a solution has NOT been found!lavaan WARNING: the optimizer warns that a solution has NOT been found!lavaan WARNING: only 4997 bootstrap draws were successful.

The model fit information and direct relation (non-bootstrapped) parameter estimates are the same as when we estimated the model as part of the delta method. Nonetheless, I repeat those findings here and then extend them with the percentile bootstrap findings for the indirect effect. We have 199 observations (employees) used in this analysis, and the degrees of freedom is equal to 0, which indicates that our model is just-identified. Because the model is just-identified, there are no available degrees of freedom with which to estimate some of the common model fit indices (e.g., RMSEA), so we will not evaluate model fit. Instead, we will move on to interpreting model parameter estimates. The path coefficient (i.e., direct relation, presumed causal relation) between attitude and intention (i.e., path a) is statistically significant and positive (b = .484, p < .001). Further, the path coefficient between intention and behavior (i.e., path b) is also statistically significant and positive (b = .484, p < .001). The direct effect path between attitude and behavior, however, is nonsignificant (b = .029, p = .680). The (residual) error variances for behavior and intentions are not of substantive interest, so we will ignore them. The unadjusted R2 values for the endogenous variable intention is .259, which indicates that attitude explains 25.9% of the variance in intention, and the unadjusted R2 value for the outcome variable behavior is .199, which indicates that, collectively attitude and intention explain 19.9% of the variance in behavior. Moving on, we get the delta method estimate of the indirect effect (which you can find in the Defined Parameters section), and we see that the indirect effect object we specified (ab) is positive and statistically significant (IE = .212, p < .001), which indicates that intention mediates the association between attitude and intention. Again, however, that is the delta method estimate of the indirect effect, so let’s now take a look at the percentile bootstrap estimate of the indirect effect. At the very end of the output appears a table containing, among other things, the 95% confidence intervals for various parameters. We are only interested in the very last row which displays the 95% confidence interval for the ab object we specified using the product of coefficients approach. The point estimate for the indirect effect is .212, and 95% confidence interval does not include zero and positive, so we can conclude that the indirect effect of attitude on behavior via intention is statistically significant and positive (95% CI[.136, 300]). In other words, the percentile bootstrapping method indicates that intention mediates the association between attitude and behavior. If this confidence interval included zero in its range, then we would conclude that there is no evidence that intention is a mediator.

55.2.5 Summary

In this chapter, we explored the building blocks of mediation analysis using a path analysis framework. To do so, we used the sem and parameterEstimates functions from the lavaan package and to estimate the indirect effect.

References

Baron, Reuben M, and David A Kenny. 1986. “The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations.” Journal of Personality and Social Psychology 51 (6): 1173–82.
Koopman, Joel, Michael Howe, John R Hollenbeck, and Hock-Peng Sin. 2015. “Small Sample Mediation Testing: Misplaced Confidence in Bootstrapped Confidence Intervals.” Journal of Applied Psychology 100 (1): 194–202.
MacKinnon, David P, and James H Dwyer. 1993. “Estimating Mediated Effects in Prevention Studies.” Evaluation Review 17 (2): 144–58.
MacKinnon, David P, Chondra M Lockwood, Jeanne M Hoffman, Stephen G West, and Virgil Sheets. 2002. “A Comparison of Methods to Test Mediation and Other Intervening Variable Effects.” Psychological Methods 7 (1): 83–104.
MacKinnon, David P, Ghulam Warsi, and James H Dwyer. 1995. “A Simulation Study of Mediated Effect Measures.” Multivariate Behavioral Research 30 (1): 41–62.
Rosseel, Yves. 2012. lavaan: An R Package for Structural Equation Modeling.” Journal of Statistical Software 48 (2): 1–36. https://www.jstatsoft.org/v48/i02/.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2024. Readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.