
AI Hiring Bias Has an Accomplice: You

By Lance Haun
If AI has an opinion, people trust it. We need new strategies to stop humans from signing off on AI bias.

Human oversight was supposed to fix AI bias in hiring. Experts and software vendors alike rely on people as the ultimate safeguard. Let the algorithm screen resumes and surface candidates, sure, but keep a human in the loop to catch the mistakes and make the final call.

This isn’t a new idea. An IBM training manual from 1979 said, “A computer can never be held accountable, therefore a computer must never make a management decision.” And 80% of companies using AI hiring tools say they do require human review before rejecting anyone. It's what we point to when regulators or ethicists ask how we're being responsible.

And yet, a University of Washington study has just shown what human review actually looks like: recruiters clicking "yes" on whatever the AI recommended, almost all of the time. Even when the bias was obvious.

Human-in-the-loop has become the problem.

Where Human-in-the-Loop Is Failing

The UW researchers ran an experiment with 528 people screening resumes for 16 different jobs. Each person saw four equally qualified candidates, two white men and two men from other racial groups (Asian, Black or Hispanic), plus one clearly less qualified candidate as a control.

Participants made their choices in two rounds. First, no AI recommendations. Then, with AI recommendations that were either neutral (recommending one candidate from each racial group), moderately biased (reflecting bias levels found in earlier studies of real AI tools) or severely biased (a strong preference for one racial group).

When there was no AI or the AI was neutral, people split their picks roughly evenly across racial groups. The moment the AI showed a preference, participants mirrored it. If the AI favored white candidates, so did the humans. If it favored non-white candidates, the humans followed right along. Under severe bias conditions, humans followed the AI's lead about 90% of the time.

The direction of bias didn't matter. What mattered was that the AI had an opinion, and people trusted it.

The Bias Safeguard That Isn't

AI can process volume and spot patterns humans miss, but it might encode biases from its training data. The solution? Keep a human in the decision loop. The human acts as a check, a final layer of judgment to catch what the algorithm gets wrong.

In theory, that's exactly the right thing to do.

The human-in-the-loop narrative has shaped procurement decisions, vendor pitches and regulatory frameworks. It's why New York City's automated employment decision tool law requires bias audits but still allows the tools as long as a human reviews the output. It's why HR teams feel comfortable rolling out resume screening and candidate ranking software. We've got a person looking at it. We're being responsible.

The UW study shows that's wishful thinking. When an AI tool ranks candidates and a recruiter reviews that ranking, the recruiter is mostly ratifying the machine's output.

This isn't a story about lazy or biased recruiters, though. 

It's a story about how humans interact with authority, especially automated authority. When a system that claims to be objective tells you "This is the best candidate," questioning it feels like second-guessing expertise. 

We built a system that entrenches bias while calling it responsible.

Why It’s Difficult to Fix a Bad Feedback Loop

An earlier study from UW tested three open-source AI models by having them rank more than 550 real resumes with names signaling white or Black men and women. The models compared over 3 million resume and job description combinations across nine occupations. White-associated names were preferred 85% of the time. Female-associated names, 11% of the time. Black male names were never preferred over white male names. Not once.

It gets worse.

A separate study published in Nature analyzed 1.4 million images and videos from Google, Wikipedia, IMDb and other platforms, plus nine large language models trained on billions of words. The finding: women are systematically portrayed as younger than men across the internet, even though there's no real age gap in the workforce. When ChatGPT generated roughly 40,000 resumes for 54 occupations, it assumed women were 1.6 years younger and had less work experience than identically qualified men. When evaluating those resumes, it rated older men higher.

When those models interact with humans who trust their recommendations, the bias gets operationalized. Turned into an offer letter, or more often, a rejection email. Those decisions become new data points, patterns the next generation of models will learn from. The loop tightens.

Most fairness audits miss this kind of bias. If your AI is writing job descriptions, summarizing candidate qualifications or ranking applicants, it's importing these patterns whether you know it or not. The best tools try to counteract them with guardrails, but AI is notoriously inconsistent at applying those guardrails.

That's why the human in the loop fails: recruiters defer to AI for predictable, human reasons. Screening resumes is cognitively exhausting, with high volume, tight deadlines and a tool that promises to surface the best candidates faster. When the AI says "this one," it feels like help. Disagreeing requires effort, confidence that you know better than a system trained on millions of data points, and time most recruiters don't have. The path of least resistance is to click yes. The bias gets a human signature, the decision becomes data and the cycle continues.

What Actually Helps Reduce Bias

The UW researchers tested one intervention that made a difference: giving participants an implicit association test before they reviewed resumes reduced bias by 13%. It's not a cure, but it's a start. Making people pause and consider their own biases before they interact with AI changes how much they defer to it.


Other strategies worth trying:

  • Audit the human-AI interaction, not just the algorithm. Most bias audits test the model in isolation. That tells you what the AI recommends, not what your recruiters do with those recommendations. You need to measure the downstream decisions, the ones that actually affect candidates (see the sketch after this list).
  • Treat AI recommendations as one input among many. If your workflow is "AI ranks, recruiter clicks," you've automated the decision and added a human signature. If it's "AI surfaces candidates, recruiter independently evaluates using a structured rubric," you might get actual oversight.
  • Require vendors to disclose training data and known biases. New York City mandates independent audits of automated employment decision tools, but those audits don't cover how the tool performs when humans are using it. Procurement teams should demand transparency on what biases the model exhibits, how it was tested and what happens when people follow its recommendations.
  • Train hiring managers on how AI works and where it fails. If recruiters don't know the tool has preferences baked in, they can't correct for them. Most don't. Most assume the AI is objective because it's a machine. That assumption is the problem.
  • Default to skepticism, not deference. The recruiters in the UW study weren't malicious. They were doing what felt efficient. The fix is to design systems around the assumption that human reviewers will trust the AI unless they're given a reason not to.
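
For teams that keep a decision log, the audit described in the first item doesn't require anything exotic: compare what the AI recommended with what reviewers actually did, and check whether advancement rates differ by group. Here is a minimal sketch in Python, using a hypothetical log format and made-up column names rather than any vendor's actual export:

```python
# Minimal sketch of auditing the human-AI interaction, not just the model.
# Assumes a hypothetical decision log with one row per screened candidate:
# "ai_recommended" (bool), "human_advanced" (bool) and "group" (str).
# All field names are illustrative, not from any real tool's API.
from collections import defaultdict

def audit_decisions(rows):
    followed = 0                 # human decision matched the AI recommendation
    screened = defaultdict(int)  # candidates screened, per demographic group
    selected = defaultdict(int)  # candidates advanced, per demographic group

    for row in rows:
        screened[row["group"]] += 1
        if row["human_advanced"]:
            selected[row["group"]] += 1
        if row["human_advanced"] == row["ai_recommended"]:
            followed += 1

    follow_rate = followed / len(rows)
    # Selection rate per group, plus each group's rate divided by the highest
    # group's rate (the "four-fifths" adverse-impact check: flag ratios < 0.8).
    rates = {g: selected[g] / screened[g] for g in screened}
    top = max(rates.values())
    impact_ratios = {g: r / top for g, r in rates.items()}
    return follow_rate, rates, impact_ratios

# Example with made-up data:
log = [
    {"group": "A", "ai_recommended": True,  "human_advanced": True},
    {"group": "B", "ai_recommended": False, "human_advanced": False},
    {"group": "A", "ai_recommended": True,  "human_advanced": True},
    {"group": "B", "ai_recommended": False, "human_advanced": True},
]
print(audit_decisions(log))
```

A follow rate near 100% paired with uneven selection rates across groups is the signal that the "human review" step has stopped functioning as oversight.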

One last consideration: The push to cut recruiting staff in the name of AI-driven efficiency has put even more pressure on untested hiring systems. It's worth looking at whether your staffing levels are feeding the natural inclination to rely on AI simply because it saves time.

Closing the Accountability Gap

Responsibility sits with everyone in the hiring chain: the people who buy these tools, the people who deploy them, the people who use them and the regulators who decide what "human oversight" actually means.

Right now, most organizations are treating human-in-the-loop as a checkbox. We've got a person looking at it, so we're covered. 

Lead author Kyra Wilson put it plainly, though: the companies building these systems need to reduce bias, and policy is needed to align models with societal values. But until that happens, organizations using AI hiring tools are responsible for what those tools do, and more importantly, for what their people do with them.

AI learns bias, generates biased outputs, humans follow those outputs and the cycle tightens. Breaking it requires honesty about what human oversight actually looks like in practice and what we can do to truly make hiring less biased.


About the Author
Lance Haun

Lance Haun is a leadership and technology columnist for Reworked. He has spent nearly 20 years researching and writing about HR, work and technology.

Main image: Matthias Oberholzer | unsplash