For and Against 12 Training Evaluation Metrics

On Monday I asked readers to share their thoughts on the most compelling of twelve training evaluation metrics. Whether you’re trying to create a case for funding for a training program you feel is essential or you’re trying to market the value of your existing professional development offerings, using qualitative and quantitative training evaluation metrics will be an important part of your argument.  

Typically when you use metrics to make a case for your training programs, a combination of various metrics is optimal. Training evaluation experts James and Wendy Kirkpatrick suggest connecting a variety of qualitative and quantitative data points across the various levels of evaluation to construct a chain of evidence in order to demonstrate business value.

Unfortunately, we don’t always have access to a variety of data points.

Strengths and Weaknesses of Training Evaluation Metrics

Here is a review of the strengths and weaknesses of each of the training evaluation metrics I previously posted, in the event we’re left with a single metric upon which to rely. I have divided these up based on “level” of evaluation.

Level 0: “Butts in Seats”

Definition: The number of people attending a course or completing an eLearning module.

Example from the previous blog post:

  • 97% of staff have completed the 75-minute mandatory Unconscious Bias training session since its roll-out six weeks ago.

Reasons to use this training evaluation metric: As reader Peter Sharp commented in the previous blog post ("97% completed, whilst at a compliance level that may be needed"), this data point is most useful for knowing whether people are being exposed to certain concepts, rules, regulations or practices. After all, if nobody is taking a course, creating it was a total waste of time and money. If a lot of people are taking a course, it could be an indicator that it is filling a need, that marketing of the course has gone well and/or that there’s something very compelling about the course.

Reasons to be careful: Beyond the number of people exposed to the course content, this truly is a vanity metric unless it’s combined with higher level evaluation data.

Level 1: Participant Reaction

Definition: What participants think of your training program.

Examples from the previous blog post:

  • Average participant feedback scores for this training component were 4.8 out of a possible 5.
  • “This was by far the most interesting and engaging eLearning module I’ve taken in my 10 years of working for this organization!”
  • The Net Promoter Score (NPS) on our 1-day New Employee Orientation program is 54.
  • “Thanks for offering this. I’m pretty sure I’ll be better with difficult customers when I head into work tomorrow.”
  • 88% of participants responded that they are either extremely likely or likely to apply these skills in their work.
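The NPS figure in the examples above comes from a standard formula: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 through 6). A minimal sketch, using made-up ratings rather than any real survey data:

```python
def nps(ratings):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))

# Hypothetical 0-10 "How likely are you to recommend this program?" responses:
ratings = [10, 9, 9, 8, 10, 7, 9, 6, 10, 9]
print(nps(ratings))  # 7 promoters, 1 detractor out of 10 -> prints 60
```

Note that passives (7s and 8s) count toward the total number of responses but neither add to nor subtract from the score.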

Reasons to use this training evaluation metric: This is perhaps the most common way to come up with a quantitative metric around training, with data often coming in the form of responses to post-training evaluation forms completed by a captive audience before they leave the room or log off the eLearning module. Again referring to a comment from reader Peter Sharp, the qualitative comments that come from these evaluation forms, such as “that module was the most interesting and engaging in my 10 years with this organization” can indicate a change in attitude toward the learning programs you offer.

Reasons to be careful: While low Level 1 scores may indicate participants had a difficult time paying attention or learning, I’ve never seen a study that correlates high training evaluation scores with better on-the-job performance. Many post-training evaluation results, if not combined with higher level evaluation data, come awfully close to vanity metric territory.

Level 2: Knowledge

Definition: The amount of new information with which participants leave your training program.

Example from the previous blog post:

  • Participant post-test scores rose by an average of 17% compared to their pre-test scores.
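A figure like the one above is simply the average of each participant’s post-test score minus their pre-test score. A quick sketch, with hypothetical score pairs standing in for real assessment data:

```python
# Hypothetical matched pre-test and post-test scores (out of 100) for four participants:
pre = [60, 72, 55, 80]
post = [78, 85, 70, 90]

# Per-participant gain, then the average across the group:
gains = [po - pr for pr, po in zip(pre, post)]
avg_gain = sum(gains) / len(gains)
print(f"Average gain: {avg_gain:.1f} points")  # prints "Average gain: 14.0 points"
```

Keeping the pre/post pairs matched per participant (rather than comparing group averages) also lets you spot individuals whose scores dropped.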

Reasons to use this training evaluation metric: We obviously want our learners to walk away from a training program with a higher level of knowledge, skills, and abilities than they had before they completed the program.

Reasons to be careful: This level of evaluation is often measured with a pre-test delivered at the start of the program and a post-test administered at the end of the program. Administering a post-test immediately upon the conclusion of a training program doesn’t necessarily indicate someone has committed knowledge, skills or abilities to their long-term memory. Waiting a day or a week to administer the post-test, while administratively more difficult, could offer better insights as to the concepts that were retained over time.

Level 3: Transfer to the Job

Definition: When training participants adopt new knowledge, skills, and abilities after leaving the training environment.

Examples from the previous blog post:

  • After 3 months, 78% of learners who responded to the survey said they were still applying the new technique.
  • After 90 days 62% of managers reported that employees are applying new skills after the training.

Reasons to use this training evaluation metric: If individuals aren’t doing new things, doing things differently, or doing things better, then why did they take time out of their schedules to complete the training program? John Luber commented that individual performance offers “the most long-term” impact of the training initiative.

Reasons to be careful: Caution should be taken, especially with the first example of a Level 3 metric above, because it’s a self-reported data point, and people often have an inflated sense of performance. Key questions that will need to be answered at this level of evaluation include:

  • How long after the training program would you like to collect this data?
  • How will you collect this data? A survey of trainees? Supervisors? On-the-job observations?

This data collection can be tricky: response rates are often low (people are busy!), and you may no longer have access to trainees or their supervisors once the training program is complete.

Level 4: Impact on the Organization

Definition: Training that leads to measurable organizational results.

Examples from the previous blog post:

  • Following the roll-out of the new customer service training, customer wait times have been reduced by an average of 2 minutes per call.
  • Locations that have completed this 2-day training program have realized a 13% increase in revenue compared to those that have not yet completed this training program.

Reasons to use this training evaluation metric: This was a popular choice for readers who commented on Monday’s post. Missy Brown simply said: “Customer service = bottom line – EVERYTHING!” Tracey Ebert thought these were powerful numbers while reminding us that getting this kind of data “takes some foresight as you need to know what you want to measure BEFORE the training and take a baseline.” This is such an important metric because at the end of the day, most training initiatives should be leading to improved organizational performance.

Reasons to be careful: Numbers tell a story, but exactly what story are they telling? Melanie Cossette suggests that these numbers look good… assuming something like decreasing wait times was the goal. Collecting organizational impact data will take planning and patience as it may take a year or more to begin seeing consistent organizational impact.

Level 5: Return on Investment (ROI)

Definition: The (financial) impact resulting from the training compared to the cost of putting it together.

Example from the previous blog post:

  • The eLearning module cost us $18,900 to produce. Employees who completed it have seen re-work errors drop by 24%, saving an average of 2.25 hours of re-work per employee per day. This equates to an average daily boost in productivity of $540, or $2,700 per week.
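The arithmetic behind an example like this can be laid out in a few lines, which also makes the assumptions visible. A sketch using the figures from the example above, assuming a 5-day work week and 52 working weeks per year (neither is stated in the example):

```python
cost = 18_900        # one-time eLearning production cost ($), from the example
daily_savings = 540  # average daily productivity recovered ($), from the example

weekly_savings = daily_savings * 5          # assumes a 5-day work week
payback_weeks = cost / weekly_savings       # weeks until savings cover the cost
first_year_roi = (weekly_savings * 52 - cost) / cost * 100  # assumes 52 weeks

print(f"Weekly savings: ${weekly_savings}")     # matches the $2,700 in the example
print(f"Payback in about {payback_weeks:.0f} weeks")
print(f"First-year ROI: {first_year_roi:.0f}%")
```

Laying the numbers out this way makes it easier for a skeptical stakeholder to challenge any single assumption (the work week, the savings estimate) without dismissing the whole calculation.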

Reasons to use this training evaluation metric: This data point was the overwhelming choice of blog readers who commented on Monday. Paula Baker suggested this shows numbers that demonstrate how training supports the bottom line. Traci Cordle explained that “Everyone, from the new hire to the CEO, can understand the value of training related to this statement.” Lisa McCoy stated: “Money talks.” Mark Nilles pointed out that these are numbers that would resonate with everyone – from management to marketing.

Reasons to be careful: Calculating the ROI for any training initiative is tricky at best, and can be downright dishonest depending on how the numbers are put together. It’s simply difficult to isolate the impact of training alone. It’s possible the training coincided with an upturn in the economy, an organizational policy change, or some other behind-the-scenes variable that impacts performance. While monetizing the value of training is a noble initiative, it’s something that must be done with extreme caution.

Thank you to all of the wonderful professionals who contributed to this post. Use the comments below to keep the conversation about training evaluation metrics going.
