Chapter 29 Estimating Internal Consistency Reliability Using Cronbach’s alpha

In this chapter, we will learn how to estimate the internal consistency reliability of a multi-item measure (i.e., scale, inventory, test) by using Cronbach’s alpha (\(\alpha\)).

29.1 Conceptual Overview

Link to conceptual video: https://youtu.be/IAjR0lnCu-s

We can think of reliability as how consistently or dependably we have measured something in a given sample. Common types of reliability that we encounter in human resource management include inter-rater reliability, test-retest reliability, and internal consistency reliability. Conventionally, a measurement tool demonstrates an acceptable level of reliability in a sample when the reliability estimate is .70 or higher, where .00 indicates very low reliability and 1.00 indicates very high reliability. That being said, we should always strive for reliability estimates that are much closer to 1.00.

When working with multi-item measures, we often estimate internal consistency reliability, which can be defined as “a reliability estimate based on intercorrelation (i.e., homogeneity) among items on a test, with [Cronbach’s] alpha being a prime example” (Schultz and Whitney 2005). In other words, internal consistency reliability tells us how consistent scores on different items (e.g., questions) are with one another. Homogeneity among items provides some evidence that the items are reliably measuring the same construct (i.e., concept). Of course, just because we are consistently measuring something doesn’t necessarily mean we are measuring the correct something, which echoes the notion that high reliability is a necessary but not sufficient condition for high validity. Nonetheless, internal consistency reliability is a useful form of reliability when it comes to evaluating multi-item scales and determining whether it is appropriate to create a composite variable (i.e., overall scale score variable) based on the sum or mean of item scores for each case (e.g., observation, person, employee, individual).

Cronbach’s alpha (\(\alpha\)) is commonly used as an indicator of internal consistency reliability. (By contrast, if our goal were to understand the extent to which the items relate to underlying factor(s), exploratory and/or confirmatory factor analysis would be appropriate.) Cronbach’s alpha can be used to assess internal consistency reliability when the variables (e.g., survey items, measure items) analyzed are continuous (interval or ratio measurement scale); however, we often relax this assumption to allow the analysis of Likert-type response scale formats (e.g., 1 = Strongly Disagree, 5 = Strongly Agree) for variables that are technically ordinal in nature. For dichotomous (binary) items, we might instead use the Kuder-Richardson (K-R) coefficient of equivalence to assess internal consistency reliability.
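For reference, for a measure composed of \(k\) items, Cronbach’s alpha can be expressed in terms of the item variances and the variance of the total (summed) score:

\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right)
\]

where \(\sigma^{2}_{Y_i}\) is the variance of item \(i\) and \(\sigma^{2}_{X}\) is the variance of the total score formed by summing the \(k\) items. Intuitively, the more the items covary with one another, the larger the total-score variance becomes relative to the sum of the item variances, and the closer alpha gets to 1.00.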

There are different thresholds we might apply to evaluate the internal consistency reliability based on Cronbach’s alpha, and the table below shows the thresholds for qualitative descriptors that we’ll apply throughout this book.

Cronbach’s alpha (\(\alpha\))   Qualitative Descriptor
.95-1.00                        Excellent
.90-.94                         Great
.80-.89                         Good
.70-.79                         Acceptable
.60-.69                         Questionable
.00-.59                         Unacceptable

For additional information on internal consistency reliability, Cronbach’s alpha, and reliability in general, please check out this open-source resource (Price et al. 2017).

29.2 Tutorial

This chapter’s tutorial demonstrates how to estimate internal consistency reliability using Cronbach’s alpha in R.

29.2.1 Video Tutorial

As usual, you have the choice to follow along with the written tutorial in this chapter or to watch the video tutorial below.

Link to video tutorial: https://youtu.be/7k35isYrE4Q

29.2.2 Functions & Packages Introduced

Function Package
alpha psych
c base R

29.2.3 Initial Steps

If you haven’t already, save the file called “survey.csv” into a folder that you will subsequently set as your working directory. Your working directory will likely be different than the one shown below (i.e., "H:/RWorkshop"). As a reminder, you can access all of the data files referenced in this book by downloading them as a compressed (zipped) folder from my GitHub site: https://github.com/davidcaughlin/R-Tutorial-Data-Files; once you’ve followed the link to GitHub, just click “Code” (or “Download”) followed by “Download ZIP”, which will download all of the data files referenced in this book. For the sake of parsimony, I recommend downloading all of the data files into the same folder on your computer, which will allow you to set that same folder as your working directory for each of the chapters in this book.

Next, using the setwd function, set your working directory to the folder in which you saved the data file for this chapter. Alternatively, you can manually set your working directory folder in your drop-down menus by going to Session > Set Working Directory > Choose Directory…. Be sure to create a new R script file (.R) or update an existing R script file so that you can save your script and annotations. If you need refreshers on how to set your working directory and how to create and save an R script, please refer to Setting a Working Directory and Creating & Saving an R Script.

# Set your working directory
setwd("H:/RWorkshop")

Next, read in the .csv data file called “survey.csv” using your choice of read function. In this example, I use the read_csv function from the readr package (Wickham, Hester, and Bryan 2022). If you choose to use the read_csv function, be sure that you have installed and accessed the readr package using the install.packages and library functions. Note: You don’t need to install a package every time you wish to access it; in general, I would recommend updating a package installation once every 1-3 months. For refreshers on installing packages and reading data into R, please refer to Packages and Reading Data into R.

# Install readr package if you haven't already
# [Note: You don't need to install a package every 
# time you wish to access it]
install.packages("readr")
# Access readr package
library(readr)

# Read data and name data frame (tibble) object
df <- read_csv("survey.csv")
## Rows: 156 Columns: 11
## ── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (11): SurveyID, JobSat1, JobSat2, JobSat3, TurnInt1, TurnInt2, TurnInt3, Engage1, Engage2, Engage3, Engage4
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Print the names of the variables in the data frame (tibble) object
names(df)
##  [1] "SurveyID" "JobSat1"  "JobSat2"  "JobSat3"  "TurnInt1" "TurnInt2" "TurnInt3" "Engage1"  "Engage2"  "Engage3" 
## [11] "Engage4"
# Print number of rows in data frame (tibble) object
nrow(df)
## [1] 156
# Print top 6 rows of data frame (tibble) object
head(df)
## # A tibble: 6 × 11
##   SurveyID JobSat1 JobSat2 JobSat3 TurnInt1 TurnInt2 TurnInt3 Engage1 Engage2 Engage3 Engage4
##      <dbl>   <dbl>   <dbl>   <dbl>    <dbl>    <dbl>    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1        1       3       3       3        3        3        3       2       1       2       2
## 2        2       4       4       4        3        3        2       4       4       4       4
## 3        3       4       4       5        2        1        2       4       4       4       4
## 4        4       2       3       3        4        4        4       4       4       4       4
## 5        5       3       3       3        4        3        3       3       3       3       3
## 6        6       3       3       3        3        2        2       4       4       5       3

The data frame includes annual employee survey responses from 156 employees to three Job Satisfaction items (JobSat1, JobSat2, JobSat3), three Turnover Intentions items (TurnInt1, TurnInt2, TurnInt3), and four Engagement items (Engage1, Engage2, Engage3, Engage4). Employees responded to each item using a 5-point response format, ranging from Strongly Disagree (1) to Strongly Agree (5). Assume that higher scores on an item indicate higher levels of that variable; for example, a higher score on TurnInt1 would indicate that the respondent has higher intentions of quitting the organization.
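Before estimating Cronbach’s alpha, it can be helpful to glance at the inter-item correlations for a set of items, since internally consistent items should correlate positively with one another. As an optional check (not a required step in this chapter’s workflow), a quick sketch using the base R cor function might look like the following, where use = "pairwise.complete.obs" accommodates any missing responses:

```r
# Inspect inter-item correlations for the Turnover Intentions items
# (pairwise deletion handles missing responses)
cor(df[,c("TurnInt1","TurnInt2","TurnInt3")],
    use = "pairwise.complete.obs")
```

If any of these correlations were near zero or negative, that would be an early warning sign before we even compute Cronbach’s alpha.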

29.2.4 Compute Cronbach’s alpha

Prior to creating a composite score (i.e., overall scale score) for each case (e.g., observation, respondent, individual, employee) within our data frame based on their responses to each of the multi-item survey measures, it is common to compute Cronbach’s alpha as an estimate of internal consistency reliability. Cronbach’s alpha provides us with information we can use to judge whether variables (e.g., items) are internally consistent with one another. If they’re not internally consistent, then it won’t make much sense to compute a composite variable (i.e., overall scale score variable) based on the mean or sum of the variables (e.g., items). For the purposes of this book, we will consider a scale with an alpha greater than or equal to .70 to demonstrate acceptable internal consistency for the particular sample, whereas an alpha that falls within the range of .60-.69 would be considered questionable, and an alpha below .60 would be deemed unacceptable. Here is a table of more nuanced qualitative descriptors for Cronbach’s alpha:

Cronbach’s alpha (\(\alpha\))   Qualitative Descriptor
.95-1.00                        Excellent
.90-.94                         Great
.80-.89                         Good
.70-.79                         Acceptable
.60-.69                         Questionable
.00-.59                         Unacceptable

To compute internal consistency reliability, we will use the alpha function from the psych package (Revelle 2022). To get started, install and access the psych package using the install.packages and library functions, respectively (if you haven’t already done so).

# Install package
install.packages("psych")
# Access package
library(psych)

Presumably, each set of similarly named items is intended to “tap into” the same underlying concept (e.g., Turnover Intentions: TurnInt1, TurnInt2, TurnInt3), which we are attempting to assess using the items. Let’s practice estimating Cronbach’s alpha for some of the measures in our data frame, beginning with the three Turnover Intentions items (i.e., variables) (TurnInt1, TurnInt2, TurnInt3). Note that you must list the name of the data frame (df) containing these items before the first bracket ([). After that, within the c (combine) function, provide the names of the items (variables) for which you would like to estimate internal consistency reliability. Because we have included a comma (,) before the c function, we are using matrix/bracket notation to reference columns, which correspond to variables in this data frame; if we had instead placed the c function before the comma, we would be referencing rows (which will almost never be the case when we’re using a data frame with the alpha function). Finally, the item (variable) names listed as arguments within the c function should be in quotation marks.

# Estimate Cronbach's alpha for Turnover Intentions items
alpha(df[,c("TurnInt1","TurnInt2","TurnInt3")])
## 
## Reliability analysis   
## Call: alpha(x = df[, c("TurnInt1", "TurnInt2", "TurnInt3")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.83      0.83    0.78      0.63   5 0.023  2.9 0.64     0.59
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.78  0.83  0.87
## Duhachek  0.79  0.83  0.88
## 
##  Reliability if an item is dropped:
##          raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## TurnInt1      0.75      0.75    0.59      0.59 2.9    0.041    NA  0.59
## TurnInt2      0.73      0.74    0.58      0.58 2.8    0.042    NA  0.58
## TurnInt3      0.83      0.83    0.70      0.70 4.8    0.028    NA  0.70
## 
##  Item statistics 
##            n raw.r std.r r.cor r.drop mean   sd
## TurnInt1 154  0.88  0.88  0.80   0.72  3.0 0.77
## TurnInt2 154  0.88  0.88  0.81   0.73  2.8 0.73
## TurnInt3 154  0.83  0.84  0.69   0.64  2.8 0.72
## 
## Non missing response frequency for each item
##             1    2    3    4    5 miss
## TurnInt1 0.02 0.23 0.51 0.23 0.01 0.01
## TurnInt2 0.03 0.29 0.54 0.14 0.01 0.01
## TurnInt3 0.02 0.31 0.51 0.16 0.00 0.01

Note: If you see the following message at the top of your output, you can often safely ignore it – that is, unless you know that one or more items should have been reverse-coded. If an item needs to be reverse coded, then you would need to take care of that prior to running the alpha function.

\(\color{red}{\text{Some items ( [ITEM NAME] ) were negatively correlated with the total scale and probably should be reversed.}}\)

Based on the output from the alpha function, we can conclude that the raw alpha (raw_alpha) of .83 for all three items exceeds our cutoff of .70 for acceptable internal consistency and, in fact, enters the realm of what we would consider good internal consistency. Next, take a look at the output table called Reliability if an item is dropped; this table indicates what would happen to Cronbach’s alpha if you were to drop the item listed in a given row and then re-estimate Cronbach’s alpha using the remaining items. For example, if you dropped TurnInt1 and retained all other items, Cronbach’s alpha would drop to .75. Similarly, if you dropped TurnInt2 and retained all other items, Cronbach’s alpha would drop to .73. Finally, if you dropped TurnInt3 and retained all other items, Cronbach’s alpha would remain the same (.83). Thus, given that Cronbach’s alpha for all three items exceeds .70 and that dropping any one of the items would not increase Cronbach’s alpha, from an empirical perspective we can be reasonably confident that the three Turnover Intentions items are internally consistent with one another, which means it would be acceptable to create an overall scale score based on the sum or mean of these three items.
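As an aside, if you would like to pull specific values from these results rather than reading them off the printed output, the object returned by the alpha function is a list; for example, the overall estimates are stored in its total element and the item-dropped estimates in its alpha.drop element. A brief sketch:

```r
# Save the reliability results to an object
ti_alpha <- alpha(df[,c("TurnInt1","TurnInt2","TurnInt3")])

# Extract the overall raw alpha estimate
ti_alpha$total$raw_alpha

# Extract the "Reliability if an item is dropped" table
ti_alpha$alpha.drop
```

This can be handy if, say, you want to report alpha estimates for several measures in a single summary table.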

Importantly, before making a final decision on whether to retain all of the items, we should review the qualitative content of each item to determine whether it meets our conceptual definition for turnover intentions. Let’s imagine our conceptual definition of turnover intentions is a person’s thoughts and intentions to leave an organization, and the three turnover intentions items are as follows.

  1. TurnInt1 - “I regularly think about leaving this organization.”
  2. TurnInt2 - “I plan on quitting this job within the next year.”
  3. TurnInt3 - “I regularly search for new jobs outside of this organization.”

In this example, all three of these items appear to “tap into” our conceptual definition of turnover intentions. Thus, when combined with the acceptable internal consistency reliability for these three items, we can reasonably justify creating a composite variable (i.e., overall scale score variable) based on the mean or sum of these three items; you will learn how to do this in the following chapter.

Now, let’s apply the alpha function to the three Job Satisfaction items (JobSat1, JobSat2, JobSat3). Given that we’re working with the same data frame object (df), all we need to do is swap out the three turnover intentions item names (i.e., variable names) with the three job satisfaction item names. Please note that if we had fewer or more than three variables, we would simply list fewer or more variable-name arguments in the c function nested within the alpha function.

# Estimate Cronbach's alpha for Job Satisfaction items
alpha(df[,c("JobSat1","JobSat2","JobSat3")])
## 
## Reliability analysis   
## Call: alpha(x = df[, c("JobSat1", "JobSat2", "JobSat3")])
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.78      0.78    0.72      0.54 3.5 0.032  3.3 0.68     0.47
## 
##     95% confidence boundaries 
##          lower alpha upper
## Feldt     0.71  0.78  0.83
## Duhachek  0.71  0.78  0.84
## 
##  Reliability if an item is dropped:
##         raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## JobSat1      0.62      0.62    0.45      0.45 1.6    0.061    NA  0.45
## JobSat2      0.63      0.64    0.47      0.47 1.7    0.058    NA  0.47
## JobSat3      0.82      0.82    0.70      0.70 4.7    0.028    NA  0.70
## 
##  Item statistics 
##           n raw.r std.r r.cor r.drop mean   sd
## JobSat1 156  0.86  0.87  0.79   0.68  3.1 0.82
## JobSat2 152  0.85  0.86  0.78   0.67  3.2 0.80
## JobSat3 156  0.78  0.77  0.55   0.50  3.4 0.86
## 
## Non missing response frequency for each item
##            1    2    3    4    5 miss
## JobSat1 0.02 0.19 0.48 0.28 0.03 0.00
## JobSat2 0.01 0.14 0.47 0.33 0.04 0.03
## JobSat3 0.00 0.15 0.39 0.36 0.10 0.00

Based on the output from the alpha function shown above, we can conclude that the raw alpha (raw_alpha) of .78 for all three items exceeds our cutoff of .70 for acceptable internal consistency. Take a look at the output table called Reliability if an item is dropped; this table indicates what would happen to Cronbach’s alpha if we were to drop the item listed in the row in which the item appears. For example, if we dropped JobSat1 and retained all other items, Cronbach’s alpha would drop to .62. Similarly, if we dropped JobSat2 and retained all other items, Cronbach’s alpha would drop to .63. Finally, if we dropped JobSat3 and retained all other items, Cronbach’s alpha would increase to .82. Now we are faced with a dilemma: Should we drop JobSat3 to improve Cronbach’s alpha by .04? Or should we retain JobSat3 because this increase might be described by some as only marginal? Well, this is a situation where it is especially important to look at the actual qualitative item content – just like we did with the turnover intentions items. Let’s imagine our conceptual definition for job satisfaction is a person’s evaluation of their work and job, and the three job satisfaction items are as follows.

  1. JobSat1 - “I enjoy completing my daily work for my job.”
  2. JobSat2 - “I am satisfied with my job.”
  3. JobSat3 - “I am satisfied with my work and with my direct supervisor.”

In this example, the qualitative item content corroborates the increase in internal consistency reliability should we drop JobSat3. Specifically, the item content for JobSat3 indicates that it is double-barreled (i.e., references two objects), and this likely explains why this item seems to be less consistent with the other two items. Given all that, we would make the decision to drop JobSat3 and create the composite variable for overall job satisfaction using only the two remaining items.
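To follow through on that decision, we could re-estimate Cronbach’s alpha using only the two retained items; note that with just two items, alpha is driven entirely by the single inter-item correlation, so it should be interpreted with that caveat in mind:

```r
# Re-estimate Cronbach's alpha after dropping JobSat3
alpha(df[,c("JobSat1","JobSat2")])
```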

I encourage you to practice computing Cronbach’s alpha on your own using the alpha function for the four Engagement items (Engage1, Engage2, Engage3, Engage4). Let’s say that the conceptual definition for engagement is: The extent to which a person feels enthusiastic, energized, and driven to perform their work. And let’s pretend that the actual items are:

  1. Engage1 - “When I’m working, I’m full of energy.”
  2. Engage2 - “I complete my work with enthusiasm.”
  3. Engage3 - “I find inspiration in my work.”
  4. Engage4 - “I have no problem working for long periods of time.”
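As a starting point for this practice exercise, the function call follows the same pattern we used for the previous two measures:

```r
# Estimate Cronbach's alpha for the four Engagement items
alpha(df[,c("Engage1","Engage2","Engage3","Engage4")])
```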

In the following chapter, we will begin by estimating Cronbach’s alpha for these four engagement items and then determine which of the items should be included in a composite variable for engagement.

29.2.5 Summary

In this chapter, we learned how to estimate the internal consistency reliability of a multi-item measure by computing Cronbach’s alpha (\(\alpha\)). Cronbach’s alpha represents one way to estimate the internal consistency reliability of a set of variables (e.g., items), and it can help us understand whether those variables are homogeneous. The alpha function from the psych package offers an efficient approach to estimating Cronbach’s alpha.

References

Price, Paul C, Rajiv S Jhangiani, I-Chant A Chiang, Dana C Leighton, and Carrie Cuttler. 2017. Research Methods in Psychology (3rd American Ed.). Montreal, Canada: Pressbooks. https://opentext.wsu.edu/carriecuttler/.
Revelle, William. 2022. Psych: Procedures for Psychological, Psychometric, and Personality Research. https://personality-project.org/r/psych/.
Schultz, K S, and D J Whitney. 2005. Measurement Theory in Action: Case Studies and Exercises. Thousand Oaks, CA: Sage.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2022. Readr: Read Rectangular Text Data. https://CRAN.R-project.org/package=readr.