Selection Assessment Analysis

A retail company wanted to replace their unstructured interview with an evidence-based hiring process. I was asked to evaluate a set of assessment items and recommend how to use them to hire fairly and predict performance.

Sample: 225 sales associates

Inputs: 15 assessment items, cognitive ability test

Tools: SAS, SPSS, regression, factor analysis

The short version

I tested which assessment items actually predict job performance, and how using them would affect hiring across race and gender. The result was a hiring composite that used three predictors instead of the full set, weighted equally. It predicted performance well and produced less adverse impact than the alternatives I tested, although the adverse impact ratio for race still fell below the usual 0.8 threshold.

The problem

The company had been hiring sales associates using an unstructured interview. They wanted to switch to something more evidence-based. They had already written 15 assessment items and given those, along with a cognitive ability test, to 225 of their current sales associates. They asked me to figure out which of those items were worth using and how to use them.

They also wanted to know whether using these items to hire would unfairly disadvantage any group of applicants. A hiring tool that predicts performance well but produces big race or gender gaps in who gets hired is not a tool a company can actually use.

What I did

I worked with two outcomes the company already tracked: performance ratings (averaged across two managers) and the number of rule infractions each employee had. I also built a combined score that captured both.

The 15 items the company wrote looked like they were measuring more than one thing, so I ran a factor analysis to see what was actually in there. Three factors came out: counterproductive work behavior, emotional stability, and conscientiousness. Together with the cognitive ability test, that gave me four possible predictors.

Then I checked each predictor for reliability, ran multiple regressions to see which ones predicted the outcomes, and ran t-tests to see whether any of them favored one race or gender over another. Finally, I simulated what would happen if the company used these predictors to hire the top 100 of 225 applicants, and calculated the adverse impact ratio for each scenario.
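The hiring simulation described above can be sketched in a few lines of Python. This is a minimal illustration and not the SAS code used in the project: the function name, the group labels, the 75/150 group split, and the scores are all assumptions made up for the example. It ranks applicants on a score, hires the top n, and divides the focal group's selection rate by the reference group's selection rate.

```python
import random


def adverse_impact_ratio(scores, groups, n_hired, focal, reference):
    """Hire the top n_hired scorers, then compare group selection rates."""
    # Rank applicant indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hired = set(ranked[:n_hired])

    def selection_rate(group):
        members = [i for i, g in enumerate(groups) if g == group]
        if not members:
            raise ValueError(f"no applicants in group {group!r}")
        return sum(i in hired for i in members) / len(members)

    reference_rate = selection_rate(reference)
    if reference_rate == 0:
        raise ValueError("reference group has no hires; ratio is undefined")
    return selection_rate(focal) / reference_rate


# Synthetic data only: 225 made-up applicants, NOT the study's scores.
# The group sizes and the 0.5 SD mean gap are arbitrary assumptions.
rng = random.Random(0)
groups = ["focal"] * 75 + ["reference"] * 150
scores = [rng.gauss(-0.5 if g == "focal" else 0.0, 1.0) for g in groups]

ratio = adverse_impact_ratio(
    scores, groups, n_hired=100, focal="focal", reference="reference"
)
print(f"adverse impact ratio: {ratio:.2f}")
print("meets 0.8 guideline" if ratio >= 0.8 else "below 0.8 guideline")
```

Running the same function once per candidate composite gives one ratio per scenario, which makes the scenarios directly comparable.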

What I found

Finding 01

Not every predictor was worth using.

Cognitive ability and conscientiousness were strong predictors of performance ratings. Counterproductive work behavior was the only meaningful predictor of infractions. Emotional stability added almost nothing on top of the others, and it showed a gender difference that would increase adverse impact. Including it would have hurt the company on fairness without helping on prediction.

Finding 02

Cognitive ability was the strongest predictor but had the biggest race gap.

Cognitive ability explained 24 percent of the variance in performance, more than any other single predictor. But it also showed a large race difference, with white applicants scoring noticeably higher than non-white applicants on average. Dropping it would have hurt prediction. Keeping it meant pairing it with other predictors to limit adverse impact.
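A group difference like this one is often summarized as a standardized mean difference alongside the t-test. The sketch below is an illustration under stated assumptions, not the analysis from the report: the function names are hypothetical and the scores are toy numbers. It computes Cohen's d with a pooled standard deviation and the matching pooled-variance t statistic.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def pooled_t(group_a, group_b):
    """Equal-variance two-sample t statistic, derived from d."""
    n_a, n_b = len(group_a), len(group_b)
    return cohens_d(group_a, group_b) / sqrt(1 / n_a + 1 / n_b)


# Toy scores for illustration only, not data from the study.
d = cohens_d([2, 4, 6], [1, 3, 5])
t = pooled_t([2, 4, 6], [1, 3, 5])
print(f"d = {d:.2f}, t = {t:.2f}")
```

Unlike the t statistic, d does not grow with sample size, so it is the easier number to compare across predictors.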

Finding 03

Equal weighting beat optimal weighting.

I tested two ways to combine the predictors. Optimal weighting used the regression coefficients to set the weights. Equal weighting gave each predictor the same weight. The two performed similarly on prediction, but equal weighting produced less adverse impact on race. Equal weighting is also more stable in small samples, so it travels better to future hires.
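Equal weighting can be sketched as follows. This is a minimal illustration, not the project's SAS code: the function names and the example scores are assumptions, and it assumes every predictor is already scored so that higher is better (a counterproductive work behavior scale would need reverse-scoring first). Each predictor is converted to z-scores so that no scale dominates because of its units, and the z-scores are then averaged per person.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of scores to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def unit_weighted_composite(predictors):
    """Average the z-scores of each predictor for every person.

    predictors: dict mapping predictor name -> list of scores,
    all lists the same length and in the same person order.
    """
    standardized = [z_scores(column) for column in predictors.values()]
    return [mean(person) for person in zip(*standardized)]


# Made-up scores for three people, for illustration only.
example = {
    "cognitive_ability": [1, 2, 3],
    "conscientiousness": [10, 20, 30],
}
print(unit_weighted_composite(example))
```

Optimal weighting would replace the plain average with a weighted sum using the regression coefficients, which is where the extra sample dependence comes from.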

What I recommended

Use three predictors: cognitive ability, conscientiousness, and counterproductive work behavior. Weight them equally. Drop emotional stability, because it adds no predictive value and makes the hiring process less fair.

The adverse impact ratio for race was 0.55 under this scheme. That's below the 0.8 threshold typically used in selection, which means the company should expect to face scrutiny on race-based hiring outcomes. The recommendation noted this clearly and suggested the company think about whether the predictive gain from cognitive ability was worth the fairness cost, and whether there were ways to source applicants differently to address the gap upstream.

Materials

Full technical report (PDF)
SAS analysis code (GitHub)