The problem with the forced distribution model

In 2019 the UK’s civil service changed its mind about its use of forced distributions (https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/971013/SCS-PM-UpdatedGuidance-April2021_1.0.pdf), deciding instead to move to more flexible, non-mandatory objective setting. The change took effect only recently, from April 2021, for the 2021–22 performance year. But it is a big change: the civil service has half a million employees.

The details of the former scheme are instructive. Civil servants’ performance was ranked from top to bottom, and a distribution overlaid as follows:

- the top 25% of Senior Civil Servants were ranked as top performing;

- the 65% of civil servants deemed as coming below top performers were graded as ‘achieving’; then

- the bottom 10% were ‘low performers’.
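
The banding above can be sketched in a few lines of Python. This is only an illustration: the function name `forced_bands`, the made-up scores and the rounding rule for band sizes are assumptions, not details taken from the civil service guidance.

```python
def forced_bands(scores, top_share=0.25, low_share=0.10):
    """Rank people by score (highest first) and overlay a forced split.

    `scores` maps a name to a numeric score. The shares default to the
    25% / 65% / 10% split described above; rounding band sizes to the
    nearest whole person is an assumption made for this sketch.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    n_top = round(n * top_share)
    n_low = round(n * low_share)

    bands = {}
    for position, name in enumerate(ranked):
        if position < n_top:
            bands[name] = "top"
        elif position >= n - n_low:
            bands[name] = "low"
        else:
            bands[name] = "achieving"
    return bands


# Hypothetical cohort of 20 people with scores 0..19.
cohort = {f"person_{i}": i for i in range(20)}
bands = forced_bands(cohort)
counts = {label: list(bands.values()).count(label)
          for label in ("top", "achieving", "low")}
print(counts)
```

Whatever the scores are, the band sizes are fixed in advance by the shares. Only the ordering matters.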

Where do these numbers come from? It is not entirely clear. If the system assumes a Gaussian (normal) distribution of performance, then one would expect it to apply the Empirical Rule, which states that, under a normal distribution, about 68% of observations fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three.
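
The Empirical Rule figures can be checked directly. A minimal sketch, assuming a standard normal distribution: the share of observations within k standard deviations of the mean is erf(k / √2). The helper name `share_within` is just a label chosen for this example.

```python
import math


def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k) * 100:.1f}%")
```

The printed shares round to the familiar 68%, 95% and 99.7%.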

So that central 65% looks like a rounded version of the 68% lying within one standard deviation of the mean. Of course the ordering of the civil service bands does not follow the standard deviation form: the 65% band sits between the bottom 10% and the top 25%, not symmetrically around the mean. In consequence the thresholds applied by the civil service appear to be rather arbitrary. But in any event there is always going to be arbitrariness in attempting to overlay a mathematical model on human performance.
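
To see how far the bands depart from the standard deviation form, one can translate the 10% and 25% cut-offs into standard deviations. This sketch assumes, purely for illustration, that performance really is normally distributed. It uses `statistics.NormalDist` from the Python standard library, and the variable names are this example's own.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Cut-off below which the bottom 10% fall, and above which the top 25% fall.
low_cut = standard_normal.inv_cdf(0.10)
top_cut = standard_normal.inv_cdf(1 - 0.25)

# For comparison: the cut-off a band of 65% centred on the mean would use.
symmetric_cut = standard_normal.inv_cdf(0.5 + 0.65 / 2)

print(f"'achieving' band: {low_cut:+.2f} to {top_cut:+.2f} standard deviations")
print(f"a centred 65% band: -{symmetric_cut:.2f} to +{symmetric_cut:.2f}")
```

Under that assumption the ‘achieving’ band runs from roughly 1.28 standard deviations below the mean to roughly 0.67 above it, so it is lopsided and not a band of one standard deviation either side.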

The civil service is by no means the only large employer using this system. Microsoft used it, and the forced distribution system was recently the subject of staff complaints at KPMG, whose chairman, Bill Michael, told employees to stop moaning about the system. They did not stop; he had to step aside.

The appeal of the system is that it forces managers to have awkward conversations with under-performers, and also provides a quantifiable way for workers to judge their own performance relative to their cohort.

So why would the UK civil service change course? For one thing, the approach measures relative performance, but arguably the proper test of interest to management is absolute performance. If you are a civil servant in a particularly talented team, your own impeccable work could land you in the low performers band even though you are not in fact underperforming.
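
A toy example makes the point concrete. Everything here is hypothetical: the names, the scores out of 100, and the idea of an absolute standard of 70 are assumptions for illustration only.

```python
# Hypothetical scores (out of 100) for a particularly talented team of ten.
team = {
    "Asha": 97, "Ben": 95, "Chloe": 94, "Dev": 93, "Ela": 92,
    "Finn": 91, "Gita": 90, "Hugo": 89, "Ines": 88, "Jo": 87,
}

ABSOLUTE_STANDARD = 70  # assumed pass mark for acceptable work
LOW_SHARE = 0.10        # the bottom 10% band from the former scheme

# Relative test: the lowest-ranked 10% are labelled low performers.
n_low = round(len(team) * LOW_SHARE)
ranked_low_to_high = sorted(team, key=team.get)
flagged_relative = ranked_low_to_high[:n_low]

# Absolute test: only people below the standard are flagged.
flagged_absolute = [name for name, score in team.items()
                    if score < ABSOLUTE_STANDARD]

print("flagged by forced distribution:", flagged_relative)
print("flagged by absolute standard:", flagged_absolute)
```

The forced distribution flags Jo, whose score of 87 is comfortably above the assumed standard, while the absolute test flags nobody.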

A further problem comes with the ranking exercise itself. It is plainly difficult to rank employees by performance, because to do so is to compare apples with pears. The difficulties entailed by redundancy selection processes are replicated at each stage of regular performance review.

And then there is the issue of employees gaming the system: strong performers have an incentive not to work together (hardly desirable). And, worse, there is an incentive to sabotage your teammates’ work for your own relative advantage.

Writing in the Financial Times, Sarah O’Connor further points out that, according to research, feedback from line managers hurts performance in a third of all cases (Footnote 1). The effect on morale is toxic, because necessarily just under half the population of workers are being told that their work is sub-par.

And another point: in terms of absolute performance, who says the normal distribution should apply anyway? For example, what if performance is influenced by a team’s positive group dynamics so that the actual performance distribution is right-skewed?
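
Whether a set of scores is skewed can be checked before assuming a bell curve. A minimal sketch using the moment-based skewness coefficient (third central moment divided by the variance raised to the power 1.5): zero indicates symmetry and a positive value indicates right skew. The function name and both sets of scores are made up for the example.

```python
def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    second_moment = sum((v - mean) ** 2 for v in values) / n
    third_moment = sum((v - mean) ** 3 for v in values) / n
    return third_moment / second_moment ** 1.5


# Hypothetical scores: one symmetric set, one with a long right tail.
symmetric_scores = [1, 2, 3, 4, 5]
right_skewed_scores = [1, 1, 2, 2, 3, 10]

print("symmetric:", skewness(symmetric_scores))
print("right-skewed:", skewness(right_skewed_scores))
```

A value well away from zero would suggest that thresholds derived from a normal curve do not describe the cohort.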

That seems likely, and has an analogy with the performance of the stock market: you might expect the performance of stock-market assets, in aggregate, to be always normally distributed. In fact, due to reinforcing loops, stocks commonly move up and down together. That idea is a simple but powerful one and a rebuttal to those seeking to classify performance using relative measures as a proxy for absolute measures of value.

Footnote 1: https://www.ft.com/content/0691002c-2200-4583-88c9-9c942d534228