Preface

This book is free to read and is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The contents of this book may not be used for commercial purposes. This is Version 0.1.4 of this book, which means that it is in a nearly complete form and will undergo further editing and expansion.

0.1 Growth of HR Analytics

The term human resource analytics can mean different things to different people and to different organizations. Further, human resource analytics sometimes goes by other names like people analytics, talent analytics, workforce analytics, and human capital analytics. While some may argue for distinctions between these different names, for this book, I will treat them as interchangeable labels. Moreover, for the purposes of this book, human resource (HR) analytics is defined as the “process of collecting, analyzing, interpreting, and reporting people-related data for the purpose of improving decision making, achieving strategic objectives, and sustaining a competitive advantage” (Bauer et al. 2025, 42).

The foundation of HR analytics formed over a century ago with the emergence of disciplines like industrial and organizational (I/O) psychology. In recent decades, advances in information technology and systems have reduced the time HR professionals spend on transactional and administrative activities, thereby creating more time and opportunity for transformational activities supporting the realization of strategic organizational objectives. HR analytics has the potential to play an integral role in such transformational activities, as it can inform HR system design (e.g., choosing selection tools, validating selection tools) and high-stakes decision making involving people-related data from the organization.

A 2023 survey of companies highlighted the lack of readiness to adopt and integrate HR analytics (HR Research Institute 2023).

0.2 Skills Gap

Although many organizations regard HR analytics as strategically important for organizational success, today many of those same organizations face an HR analytics talent shortage. To some extent, the talent shortage can be attributed to data literacy – or the lack thereof. Historically, academic and professional HR training and development opportunities did not emphasize data-literacy skills, and this omission has left organizations today scrambling to hire external talent or to close the skills gap of existing HR professionals.

To address the HR analytics talent shortage and skills gap, organizations have, broadly speaking, two options. First, for some organizations, closing the skills gap may be as straightforward as hiring a “quant” (e.g., data scientist, statistician), provided the individual works closely with HR professionals when working with data associated with HR systems, policies, and procedures, and identifying HR-specific legal and ethical issues. Second, I would argue that for most organizations perhaps a better alternative is to close the skills gap among current HR professionals, as their HR-specific knowledge, skills, abilities, and other characteristics (KSAOs) offer tremendous value when deriving insights from HR data as well as a solid domain-specific foundation for subsequently layering on data-literacy KSAOs. Importantly, those with existing HR domain expertise presumably have working knowledge of prevailing employment and labor laws and experience with anticipating and uncovering ethical issues, both of which are necessary when acquiring, managing, analyzing, visualizing, and reporting HR data.

At the most basic level, proficiency in HR analytics involves the integration of knowledge, skills, abilities, and other characteristics (KSAOs) associated with HR expertise and data literacy.

0.3 Project Life Cycle Perspective

When building efficacy in HR analytics, I have found that it’s helpful to envision where and how contributions can be made at the project level and which specific KSAOs are required at each phase. To that end, I developed the HR Analytics Project Life Cycle (HRAPLC) as a way to conceptualize the prototypical phases of a generic project life cycle. These phases include: Question Formulation, Data Acquisition, Data Management, Data Analysis, Data Interpretation and Storytelling, and Deployment and Implementation. I dedicate Part 1 of this book to providing a conceptual overview of the HRAPLC in Chapters 1-7.

The Human Resource Analytics Project Life Cycle (HRAPLC) offers a way to conceptualize the prototypical phases of a generic HR analytics project life cycle.

0.4 Overview of HRIS & HR Analytics

If you are just looking for a basic overview of HR information systems (HRIS) and HR analytics, consider checking out the following introductory video.

Link to conceptual video: https://youtu.be/3X7qmb1M39A

And if you are looking for an introduction to human resource management with supplementary Excel-based tutorials and data exercises, I recommend checking out one of the two textbooks I co-authored, Human Resource Management: People, Data, and Analytics or Fundamentals of Human Resource Management.

0.5 My Philosophy for This Book

Working with data does not need to be scary or intimidating; yet, over the years, I have interacted with students and professionals who carry with them what I refer to as a numerical phobia or quantitative trauma. Unfortunately, at some point in their lives, some people begin to believe that they are not suited for mathematics, statistics, and/or generally working with data. Given these psychological barriers, a primary objective of this book is to make data analytics – and HR analytics specifically – relevant, accessible, and maybe even a little fun. In early chapters, my intention is to ease the reader into foundational concepts, applications, and tools in order to build self-efficacy in HR analytics incrementally. The tutorials in each chapter are grounded in common and (hopefully) meaningful HR contexts (e.g., validating employee selection tools). As the book progresses, I introduce more challenging statistical concepts and data-analytic techniques. Reading this book and following along with the in-chapter tutorials will not lead to expert-level knowledge and skill; however, my hope is that working through this book will do the following:

  1. Build excitement for working with data to inform decision making.
  2. Instill a sense of intellectual curiosity about data and a hunger to expand boundaries of expertise.
  3. Inspire further in-depth training, education, and learning in areas and topics introduced in this book.
  4. Enhance data literacy, including knowledge and skills related to (a) critical thinking and logic, (b) mathematics, statistics, and data analysis, and (c) data visualization and storytelling with data.

0.5.1 Rationale for Using R

Today, we have the potential to access and use a remarkable number of statistical and data-analytic tools. Examples of such tools include (in no particular order) R, Python, SPSS, SAS, Stata, MATLAB, Mplus, Alteryx, Tableau, Power BI, and Microsoft Excel. Notably, some of these programs can be quite expensive when it comes to user licensing or subscription costs, which can be a barrier to access for many.

Programming languages like R and Python have several desirable qualities when it comes to managing, analyzing, and visualizing data. Namely, both are free to use, and both have an ever-growing number of free (add-on) packages with domain- or area-specific functions (e.g., data visualizations). It is beyond the scope of this Preface to provide an exhaustive comparison of the relative merits of R versus Python; however, when it comes to the statistical analysis of data, specifically, I argue that R provides a more user-friendly entry point for beginners as well as more advanced capabilities desired by expert users, especially for ad-hoc analyses. Moreover, the integrated development environment program called RStudio (which “sits on top of” base R) offers useful workflow tools and generally makes for an inviting environment.

That said, Python has been catching up in these regards, and I wouldn’t be surprised if Python closes these gaps relative to R in the next few years. I would be remiss if I didn’t mention that the Python language is powerful and has capabilities that extend far beyond the management, analysis, and visualization of data. Fortunately, learning R makes learning Python easier (and vice versa), which means that this book can serve as a springboard for learning Python or other programming languages; in fact, RStudio allows users to create and run Python code. Finally, I believe it to be unlikely that one tool (e.g., program, language) will emerge that is ideal for every task, and thus, I encourage you to build familiarity with multiple tools so that you develop a “toolbox” of sorts, thereby allowing you to choose the best (or at least better) tool for each task.

0.5.2 Audience

I have written this book with current or soon-to-be HR professionals in mind, particularly those who have an interest in upskilling their data-analytic knowledge and skills. This book can provide a meaningful context for learning key data-analytic concepts, applications, and tools that are applicable beyond the HR context. Relatedly, this book may serve as a user-friendly gateway and introduction to the programming language called R for those who are interested in other non-HR domains.

0.6 Structure

This book consists of the following parts and associated chapters:

  1. HR Analytics Project Life Cycle:
  2. Introduction to R:
  3. Data Acquisition & Management:
  4. Employee Demographics:
  5. Employee Surveys:
  6. Employee Training:
  7. Employee Selection:
  8. Employee Separation & Retention:
  9. Employee Performance Management:
  10. Employee Compensation & Reward Systems:
  11. Odds & Ends:

0.7 About the Author

David Caughlin is an Assistant Professor of Organizational Behavior in the Orfalea College of Business at Cal Poly (California Polytechnic State University) in San Luis Obispo, where he teaches and researches topics related to organizational behavior, human resource management, and data analytics. David received his B.S. in psychology and B.A. in Spanish from Indiana University, M.S. in industrial and organizational psychology from Indiana University - Purdue University at Indianapolis, and Ph.D. in industrial and organizational psychology from Portland State University with concentrations in quantitative methodology and occupational health psychology. His research interests are generally focused on supervisor support, work motivation, and occupational safety and health. David has co-authored research published in peer-reviewed outlets such as Journal of Applied Psychology, Journal of Management, Human Resource Management, Journal of Business and Psychology, Journal of Vocational Behavior, Journal of Occupational Health Psychology, and Psychology, Public Policy, and Law. In addition, he has co-authored the textbooks Human Resource Management: People, Data, and Analytics and Fundamentals of Human Resource Management. David teaches undergraduate and graduate courses on topics related to organizational behavior, human resource management, information systems, and data analytics. In his HR analytics courses, David teaches students how to apply the statistical programming language R to manage, analyze, and visualize HR data to improve strategic decision making; in the process, students build data-literacy, critical-thinking, and logical-reasoning skills. He received the following teaching awards from Portland State University’s School of Business: Teaching Innovation Award (2018), “Extra Mile” Teaching Excellence Award (2019), Teaching Innovation Award (2020), and the Brenda Eichelberger “Extra Mile” Teaching Excellence Award (2022).
In his free time, David enjoys outdoor activities like running, cycling, hiking, paddle boarding, and skiing.

0.8 Contacting the Author

I created this free resource as a side project and outside of my day job. Thus, if you reach out to me, I may not have time to respond in a timely manner – or at all. That being said, if you would like to attempt to reach me, please email: davidcaughlin@rforhr.com.

0.9 Acknowledgements

My inspiration for writing and compiling the contents of this book stems from interactions with countless colleagues, professional acquaintances, and undergraduate and graduate students. A broad “thank you” is in order for anyone with whom I have taught or had a conversation about data analytics and data science. Further, I want to thank Liz Harman for lending her copy-editing skills to this book, David Gerbing for the many conversations we have shared about teaching R using his lessR package (Gerbing 2021) and his introductory R data analysis textbook, and Sam Caughlin for creating the cover art for this book. Finally, I created this book using the following programs and packages: R (R Core Team 2023), RStudio (RStudio Team 2020), rmarkdown (Xie, Allaire, and Grolemund 2018; Allaire et al. 2023), knitr (Xie 2014, 2015a, 2023b), and bookdown (Xie 2016, 2023a).

References

Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2023. rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.
Bauer, Talya N., Berrin Erdogan, David E. Caughlin, and Donald M. Truxillo. 2025. Human Resource Management: People, Data, and Analytics (2nd Ed.). Thousand Oaks, California: Sage.
Gerbing, David. 2021. lessR: Less Code, More Results. https://CRAN.R-project.org/package=lessR.
HR Research Institute. 2023. “The State of People Analytics 2022-2023.”
R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
RStudio Team. 2020. RStudio: Integrated Development Environment for R. Boston, MA: RStudio, PBC. http://www.rstudio.com/.
Xie, Yihui. 2014. “Knitr: A Comprehensive Tool for Reproducible Research in R.” In Implementing Reproducible Computational Research, edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng. Chapman and Hall/CRC.
Xie, Yihui. 2015a. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. https://yihui.org/knitr/.
Xie, Yihui. 2016. Bookdown: Authoring Books and Technical Documents with R Markdown. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/bookdown.
Xie, Yihui. 2023a. Bookdown: Authoring Books and Technical Documents with R Markdown. https://CRAN.R-project.org/package=bookdown.
Xie, Yihui. 2023b. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown.