The Fairness Trilemma in employment

Economists like to draw triangles. In trade, you cannot have high tariffs, no retaliation, and stable prices. In monetary policy, you cannot fix interest rates, fix the money supply, and promise absolute stability all at once. In hiring under unequal starting conditions, there is a similar triangle, and most arguments about unfairness in hiring glide right past it.
When firms turn to algorithms to allocate scarce positions, they are drawn to three attractive goals: strict efficiency (choose the candidates most likely to perform well), strict representativeness (make outcomes roughly mirror group shares), and strict formal neutrality (apply the same rules to everyone, automatically).
The problem is simple but uncomfortable: they cannot have all three at once. They can choose any two, but the third will give way. That is the “fairness trilemma,” and once you see it, much of the confusion about hiring algorithms and equity programs looks less like mystery and more like ordinary price theory. You can find the formal statement and proof in my working paper, “The Fairness Trilemma: An Impossibility Theorem for Algorithmic Governance.”
An old promise
For a long time, the story most firms told about hiring was simple. Bias lived in people’s heads. The dysfunction was in gut judgment. The solution was obvious: measure, measure, measure. Replace judgment with data, and hiring will become both fairer and more efficient.
That story has fueled a wave of investment in DEI programs and algorithmic recruiting tools. Vendors promised something unusually attractive to both public policy and corporate management: moral progress without trade-offs. Better outcomes for disadvantaged groups, no loss of performance, and fewer uncomfortable conversations about discretion or power.
Algorithmic recruiting systems are marketed as a way out of the bind. Ingest résumés and applications, learn what predicts performance, enforce statistical “fairness,” and let the model do the deciding.
But algorithms do not eliminate discretion. They relocate it: into model design, into data selection, into the definition of “fairness” itself. And they tend to relocate it to places that are hard to see and hard to contest.
The trilemma in practice
The now-famous story of Amazon’s experimental hiring algorithm is a useful illustration. Trained on historical résumés and hiring decisions, the system learned that applicants whose profiles resembled those of previously hired men were more likely to score highly for technical roles. In effect, it downgraded résumés that looked “feminine-coded,” reflecting a tech workforce that was already male-dominated.
In a narrow technical sense, the model did not misbehave. It optimized predictive performance on the data it was given. It applied the same scoring rule to every applicant. It was efficient and formally neutral. What it could not do was produce representative outcomes from unrepresentative data.
At that point, the company faced three options that map cleanly onto the trilemma. It could keep the model and accept unequal outcomes (efficiency + neutrality, weak representativeness); add fairness constraints to push outcomes toward parity and accept lower predictive accuracy (efficiency + representativeness, weak neutrality); or reintroduce human judgment and discretion to correct the pattern (representativeness + discretion, weak formal neutrality). Amazon eventually scrapped the system.
A similar arc played out with HireVue’s AI video interviews. The company touted automated analysis of facial expressions, tone, and word choice as a way to standardize early-stage screening and strip out bias. Critics argued that these signals correlate with disability, neurodivergence, and cultural background in ways that are hard to justify as job-related. Under mounting pressure, HireVue dropped facial analysis altogether.
In both cases, what failed was not the idea of measurement itself. What failed was the belief that measurement could be neutral in a world of unequal starting conditions, and that the right model could deliver efficiency, representation, and neutrality “for free.”
A toy model
A simple model makes the logic clear. Consider a firm that needs to fill a fixed number of positions from an applicant pool divided into two groups, A and B. Applicants in both groups are scored by a predictive model that estimates their probability of success. Because of unequal starting conditions (quality of schooling, prior experience, background), group A has a higher average predicted success rate than group B. The firm considers a single rule: hire everyone whose predicted success score exceeds a fixed threshold.
Under unequal base rates, a single rule cannot do all three things at once: select the candidates with the highest expected performance, hire from groups A and B roughly in proportion to their shares of the applicant pool (or the population), and apply the same threshold to everyone. If the firm insists on strict efficiency and strict neutrality, it sets one common threshold. Hires will be drawn disproportionately from group A, the group with the higher predicted scores. Representation diverges from group shares.
If it insists on strict efficiency and strict representation, it must relax neutrality, using group-specific thresholds or score adjustments to hire more group B applicants while still choosing the best candidates within each group. But applicants from A and B with identical scores are now treated differently.
If it insists on strict representation and strict neutrality (the same rule for everyone, the same hiring rates by group), it can no longer simply take the highest-scoring candidates. It leaves some high-scoring applicants unhired and takes on some lower-scoring ones, sacrificing efficiency unless the underlying inequalities disappear.
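For readers who like symbols, here is one compact way to state the three conditions in the toy model. The notation is mine and purely illustrative; the formal statement and proof live in the working paper, not here. Suppose there are $n$ applicants, applicant $i$ has group $g_i \in \{A, B\}$ and predicted score $s_i$, there are $k$ slots, and $H$ (with $|H| = k$) is the set of hires.

\[
\begin{aligned}
\textbf{Efficiency:}\quad & H \in \arg\max_{S:\,|S| = k} \ \sum_{i \in S} s_i \\
\textbf{Representativeness:}\quad & \frac{\lvert \{\, i \in H : g_i = B \,\} \rvert}{k} \;\approx\; \frac{\lvert \{\, i : g_i = B \,\} \rvert}{n} \\
\textbf{Neutrality:}\quad & \text{there is a single threshold } t \text{ such that } i \in H \iff s_i \ge t
\end{aligned}
\]

When the score distributions of A and B differ, a single threshold that fills the $k$ slots generally violates the representativeness line, and any repair restores it only by giving up one of the other two.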
This is the fairness trilemma in its simplest form. You can choose any two corners of the triangle, but the third will give way. The impossibility is not primarily about machine learning; it is about allocating scarce slots under unequal starting conditions.
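To see the corners move in numbers, here is a minimal simulation sketch in Python. Every parameter in it (group sizes, score distributions, the number of slots, the specific rules) is an illustrative assumption of mine rather than anything taken from the working paper; the point is only that each pair of goals forces the third to give way.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical applicant pool (all numbers are illustrative assumptions) ---
n_a, n_b = 700, 300            # group A is the majority of applicants
slots = 100                    # scarce positions to fill

# Predicted success scores; group A has a higher mean purely because of
# unequal starting conditions (schooling, prior experience, background).
score_a = rng.normal(0.60, 0.15, n_a).clip(0, 1)
score_b = rng.normal(0.50, 0.15, n_b).clip(0, 1)

scores = np.concatenate([score_a, score_b])
group = np.array(["A"] * n_a + ["B"] * n_b)

def summarize(name, hired_idx):
    """Report the average predicted score of hires and the group B share."""
    hired_scores = scores[hired_idx]
    share_b = np.mean(group[hired_idx] == "B")
    print(f"{name:35s} mean score {hired_scores.mean():.3f}   share B {share_b:.2f}")

pool_share_b = n_b / (n_a + n_b)
print(f"Group B share of applicant pool: {pool_share_b:.2f}\n")

# 1) Efficiency + neutrality: one ranking, same rule for everyone.
#    Hire the top `slots` scores regardless of group.
top_k = np.argsort(scores)[::-1][:slots]
summarize("Single threshold (top-k overall)", top_k)

# 2) Efficiency + representativeness: hire the best within each group,
#    in proportion to group shares -> group-specific cutoffs, not neutral.
slots_b = round(slots * pool_share_b)
slots_a = slots - slots_b
best_a = np.argsort(score_a)[::-1][:slots_a]         # indices into group A
best_b = np.argsort(score_b)[::-1][:slots_b] + n_a   # offset into full array
summarize("Group quotas (top-k within group)", np.concatenate([best_a, best_b]))

# 3) Representativeness + neutrality: one rule for everyone (a pure lottery),
#    which matches pool shares in expectation but ignores scores.
lottery = rng.choice(len(scores), size=slots, replace=False)
summarize("Lottery (same rule, random draw)", lottery)
```

With these made-up parameters, the single top-k rule under-hires group B relative to its roughly 30 percent pool share, the quota rule ends up applying different effective cutoffs to A and B, and the lottery matches pool shares in expectation while pulling down the average predicted score of the hires.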
Scarcity never ends; it moves
Economists have seen this movie before. Consider rent control. If a price ceiling is set below the market-clearing level, the shortage does not disappear. It moves. It shows up as queues, non-price rationing, side payments, and deteriorating quality. Landlords who cannot ration by rent will ration by waiting lists, personal networks, and appearances. Empirical work such as the Diamond–McQuade–Qian study of San Francisco rent control illustrates the pattern.
Hiring systems behave the same way. Shut down one channel of rationing, and scarcity finds another. When firms cannot ration openly on performance metrics because of equity constraints, they ration through committees, exceptions, holistic reviews, and fuzzy exemptions. Each move preserves two corners of the trilemma by relaxing the third. Policy constraints relocate the shortage; they do not make scarcity disappear.
What companies should do
If you accept that efficiency, representation, and formal neutrality cannot all be maximized at once, the question changes. Instead of asking “How do we eliminate bias without trade-offs?” firms should ask “Which corner are we willing to give up, and where should discretion reside?”
A more honest approach to equity programs and hiring algorithms would do at least three things. Be explicit about priorities, and design governance around those choices. Put discretion where it can be monitored (standing committees, written overrides, review processes) rather than burying value judgments inside model design and opaque fairness metrics. And stop selling algorithms as silver bullets. Models cannot dissolve the trade-offs created by unequal starting conditions; at best, they can make clear where the constraints bind and what each choice costs.
The goal is not perfection. It is legitimacy: deciding openly where to settle within the trilemma in a given context, and taking responsibility for the results.


