Probabilistic fallacies in law

The Prosecutor’s fallacy

This is the classic illustration: a murder takes place. The forensics team identify hair from the crime scene. A DNA test is carried out, indicating that someone with a previous conviction who lives in the same town is a match.

The suspect is arrested. An expert report is produced: it states that there is only a 1 in 10,000 chance that the DNA of an unrelated person would match the hair sample. Naturally the Crown Court’s jury is persuaded: here is a scientific expert, apparently stating that there is only a one-in-10,000 chance that the accused is not responsible for the DNA trace.

But of course, after a moment’s reflection, we ought to realise that, if the DNA database were to contain tens of thousands of entries, that would be expected to lead not to one but perhaps to a number of other potential matches. Should we ignore those innocent matches?

Algebraic expression of the fallacy

The prosecutor’s fallacy occurs when the probability of innocence, given the evidence, is wrongly conflated with the very small probability that the evidence would occur if the defendant were innocent. In probabilistic notation, it is wrongly assuming that

P(Innocence | Evidence) [the probability of the accused being innocent given he matches the DNA]

is no different to

P(Evidence | Innocence) [the probability of an innocent person having a DNA match]

So: coming back to the example set out above, the probabilistic fallacy occurs if the jury confuse the (apparently) really tiny chance that there should be a DNA match if the accused were innocent, with the probability that the accused is in fact innocent in the light of the evidence.

They are not the same thing at all, although unfortunately this distinction does not jump out at us when verbally formulated: regrettably the two grammatical structures sound more like rephrased versions of each other. Instead, the confusing situation is best represented graphically, in the tree diagram as follows –

[Source: Norman Fenton and Martin Neil: ‘Avoiding Probabilistic Reasoning Fallacies in Legal Practice Using Bayesian Networks’, presentation at 7th International Conference on Forensic Inference and Statistics (ICFIS) Lausanne, 21 August 2008]

In the example, only one of the 10,000 potential suspects can be the actual source of the DNA (i.e. the guilty person actually present at the scene and leaving a trace). However, because each innocent person has a 1/1,000 probability of producing a positive match, we should expect about 10 (i.e. 10,000/1,000) innocent suspects among the 10,000-strong population of potential suspects to be a positive match.

The Bayesian party trick

The shocking, almost unbelievable conclusion we are forced to accept, applying Bayes’ Theorem, is that the accused’s positive test can only be interpreted properly once it is combined with the objective statistical information about the wider population of potential suspects.

The probability of the accused’s innocence, conditional on a positive DNA match, is not the tiny figure quoted for a random match (as would appear if you only considered the top ‘branch’ of the tree diagram above and as ‘common sense’ points to). Instead there is roughly a 10/11 (i.e. 90.9%) probability that the accused is innocent, based on the DNA match alone: out of the suspect population, around 10 innocent people would be expected to be DNA matches alongside the one true source, so the accused, being one of about 11 matches, has only a roughly 1/11 chance of guilt before we consider other evidence. Not guilty.
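The arithmetic can be checked with a few lines of Python. This is only a sketch using the figures from the example (10,000 potential suspects, a 1/1,000 random-match probability); the function name and the assumption that the true source always tests positive are mine, not part of the original diagram.

```python
from fractions import Fraction


def posterior_innocence(population: int, random_match_prob: Fraction) -> Fraction:
    """P(innocent | match), assuming exactly one person in the population is the source."""
    prior_guilt = Fraction(1, population)
    prior_innocent = 1 - prior_guilt
    # Assumption: the true source always produces a positive match.
    p_match_given_guilt = Fraction(1)
    p_match_given_innocent = random_match_prob
    # Total probability of a match (the denominator in Bayes' Theorem).
    p_match = (p_match_given_guilt * prior_guilt
               + p_match_given_innocent * prior_innocent)
    return p_match_given_innocent * prior_innocent / p_match


p = posterior_innocence(10_000, Fraction(1, 1_000))
print(p, float(p))
```

The exact answer is 9999/10999, which is very slightly below the 10/11 given by the rough head count, because only 9,999 of the 10,000 suspects are innocent.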

In finance the prosecutor’s fallacy is referred to as the base rate fallacy – the tendency in judgment to give preeminence to information specific to an individual and neglect the broader statistical context. It is, it would seem, a common problem in our habits of probabilistic reasoning across domains.

Bayes in the employment law context?

The use of Bayesian reasoning has been deprecated by the UK’s senior criminal courts for reasons Professor Fenton takes exception to. As an employment practitioner my concern is that our failure to apply rigorous probabilistic reasoning results in numerous errors of judgment, in circumstances where the nature of the judge’s decision-making process remains unexamined.


The future of employment litigation

The justice system’s debts

You don’t have to pore through the pages of the Ministry of Justice’s statistics publications to be struck by the stark imbalance between the tally of employment cases that are filed each year, and the tribunal service’s ability to hear them. But if you do take a moment to review the statisticians’ report, it tells of a growing backlog, snowballing as though it were a cumulative debt being run up against the justice system’s tab.

Litigation risk and ‘administrative risk’

That debt will need to be settled at some point, if the system is to retain its credibility. There comes a point at which, if a complainant knows that getting a hearing will take many years, it ceases to be desirable to pursue a claim at all. Parties always have to deal with the vicissitudes of trial – ‘litigation risk’. But the risk we are talking about is of a different character – essentially there is now ‘administrative risk’, being the risk of not being heard within a satisfactory timeframe due to underfunding of the tribunal system. At that point of despair, while the laws on employment rights continue to exist on paper, they cease to be relevant because they go unenforced.

Instead of looking at the figures about the annual shortfall in serving justice, you could also simply talk with a sample of litigants about their experiences, as I commonly need to do in my role. Litigants commonly have the unfortunate experience of having to wait a year – or more commonly several years – to have their cases heard, and tribunal correspondence often suffers from the same lag-time caused by the imbalance between case load and resourcing. Tribunal cases that have taken a year or more to come to trial can be put off at the last minute, sometimes for a whole year or more, due to a lack of judicial resources. Anecdotally, I have experienced this situation more than once in recent months.

Time series forecasting for the MoJ

The graphic I have included at the head of this article is a time series forecast I have made, based on MoJ historical data and using statistical software, suggesting that over the course of the next five years at current rates we can expect the backlog to rise as high as 420,000 cases by 2026.

If the Ministry of Justice were a business with supply chain problems, you would expect an efficient market to satisfy demand by the entry of alternative service providers. What alternative is there to the justice system? There is, of course, alternative dispute resolution (ADR) in the form of conciliation via ACAS. And there have been great efforts by the tribunal service to encourage mediation. But the statistics clearly indicate that, however praiseworthy such initiatives have been, as currently implemented they remain inadequate to the task of clearing the backlog.

I previously wrote about the benefits of remote hearings in terms of speed, convenience, and cost reduction, and suggested that these benefits would also translate well to enriching the conciliation process. Earlier this year, the Presidents of the Employment Tribunals in England & Wales and Scotland (Judge Barry Clarke P and Judge Shona Simon, respectively) wrote about the advantages of the new Cloud Video Platform (CVP) in their Roadmap for 2022:

Video technology allowed us to move at pace to retain listed cases in response to the restrictions announced in January 2021, and it will do so again if coronavirus restrictions return on a seasonal or regional basis. It allows for flexible use of judiciary, as the virtual region will demonstrate. It has facilitated a steady increase in the rate at which we can adjudicate upon the claims that have been presented to the Employment Tribunals – what we call our “disposal rate”. The disposal rate had returned to its pre-pandemic rate by the Autumn of 2020, and it continues to improve. It does so because, freed of the constraints of the physical estate, more hearings can be listed and remain effective. No hearings were lost in recent months for want of a hearing room. That is because CVP has effectively tripled the size of our estate.

While I share this enthusiasm for the technology, and the giddy feeling of technology suddenly tripling the tribunal’s estate, my feeling after some reflection is that this message focuses too much on freedom from the constraints of the physical estate, as though that were a panacea. It cannot be, otherwise we would be seeing a different picture in the data.

Why have things got so bad?

While reference to an improved ‘disposal rate’ is hopeful, the fact remains that disposals are outstripped by the rate of new ‘receipts’. Below is my data visualisation indicating what the statistics actually report about the relative rates of disposals: you can see that, while disposals and receipts went in lockstep until 2015, something appeared to break at that point.
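The mechanics of a backlog are simple enough to sketch: each year the backlog grows by receipts minus disposals. The figures below are purely illustrative placeholders and are not taken from the MoJ statistics; the variable names are likewise hypothetical.

```python
from itertools import accumulate

# Illustrative figures only (NOT MoJ data): cases received and disposed of per year.
receipts = [100_000, 100_000, 105_000, 120_000, 125_000]
disposals = [100_000, 98_000, 95_000, 100_000, 102_000]
opening_backlog = 50_000  # hypothetical starting point

# Net inflow each year: positive whenever receipts outstrip disposals.
net_inflow = [r - d for r, d in zip(receipts, disposals)]

# Running total: the opening backlog followed by the backlog at each year end.
backlog = list(accumulate(net_inflow, initial=opening_backlog))

for year, cases in enumerate(backlog):
    print(f"end of year {year}: {cases:,} cases outstanding")
```

Even a modest improvement in the disposal rate leaves the backlog growing so long as receipts stay ahead of disposals, which is the point the chart is intended to make.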

Employment law is certainly not alone in feeling this justice gap. Consider the following graphic drawing on CPS statistics (source: ‘Is the Criminal Justice System Fit for Purpose?’ by Georgina Sturge and Sally Lipscombe) – in the criminal justice system, there is also a marked aggravation in the gap between recorded crime and prosecutions around 2015.

The explanation provided in the Commons Insight paper which I have just cited draws on updated calculations from the Institute for Fiscal Studies:

In 2017, the Institute for Fiscal Studies calculated that in the decade from 2010/11, the Ministry of Justice’s (MoJ) budget would be cut by around 40%. Spending plans have been revised upwards since then, so that in 2019/20 the total MoJ budget was only around 25% lower than in 2010/11.

So it would seem that, over a decade after David Cameron’s Conservative Party came to power and implemented a Budget founded on ‘Responsibility’, interpreted within the Red Book Budget as meaning the need to reduce or eliminate the structural deficit by the end of the 5 year forecast period – i.e. by 2015 – the bitter fruits of that policy are still being reaped by the justice system, and certainly in employment law.

A second explanation which I should address is the abolition of tribunal fees in July 2017 following the Supreme Court’s judgment in R (on the application of UNISON) (Appellant) v Lord Chancellor (Respondent) [2017] UKSC 51. The background secondary legislation was the Lord Chancellor’s Employment Tribunals and the Employment Appeal Tribunal Fees Order 2013 (SI 2013/1893).

The bar chart below (from the House of Commons Briefing 7081, 18 Dec 2017, ‘Employment tribunal fees’) reflects that fees led to a sudden near 70% reduction in cases being filed. Upon the abolition of fees four years later in July 2017 there was a partial recovery, with a 30% increase in receipts felt in 2017/Q3.

If we return to a detailed view of the original area chart I created to examine the backlog, you can now see how the cuts in justice spending introduced by the 2010 Budget (which, as noted, left the MoJ budget around 25% below its 2010/11 level by 2019/20) combined with the post-abolition mini-renaissance in employment litigation in late 2017 to create a growing mountain of cases in the backlog.

Three solutions

We can now see why the current situation developed. What other solutions – beyond the current steps around the use of video technology and promoting or mandating judicial mediation and conciliation – can we identify? I have three suggestions which I may return to later:

  1. The most obvious solution is political: reverse the MoJ funding cuts to eliminate the structural justice deficit. But any such 25% MoJ funding surge would require UK voters to consider justice a service worth spending money on. At present, I am unaware of it being a vote-winner. The media don’t like it, and the trope of the fat-cat lawyer is easily dusted off to rubbish claims that spending money on justice is in people’s interests. Such spending increases may simply not be politically viable within the coming decade, and both Brexit and the pandemic have conspired to give UK GDP a substantial haircut of 4-6%. In the case of Brexit, that 4% GDP loss is ongoing until/unless better trade terms are negotiated. There is no headroom for discretionary spending, let alone for big structural spending.
  2. Tool-up Acas. I have previously written about Acas’s services being plainly under-funded. Most tribunal claims settle, or are withdrawn, or get struck out. Many of the remaining 10% make it to a full hearing at which both parties would be better served by quality conciliation services. The simple truth is that current services are commonly just box-ticking exercises. The tribunal rules require a claim to be intimated to Acas by the Claimant. Nothing more need be done – hence for many the box gets ticked, but matters proceed to trial. I am not suggesting that rules should mandate more steps being taken – but better-funded, and better-equipped conciliators would result in substantial savings because, quite simply, a day spent with a trained conciliator would cost much less than a judge, wing members and courtroom with attendant administrative staff. At present parties sometimes get an email passed on by a conciliator, and if they respond to the conciliator there is no guarantee of having a reply of any sort.
  3. This one is more controversial: Commercial B2B contracts often mandate arbitration to avoid the cost and delay of court proceedings. Put simply, per s203 ERA 1996, in the UK employment agreements cannot provide for the employee ‘contracting out’ of their rights to bring a claim to enforce their legal rights. But if the employment tribunal system will not be made ready perhaps a shadow employment arbitration service should be allowed to serve the growing need for speedy remedy. To do so, Parliament would need to amend the ERA to introduce an arbitration service. It would be a radical, and controversial solution and would essentially risk claimants contracting out of their right to remedy for all manner of statutory torts arising in the course of employment; it might also water down the value of current legal protections. But subject to careful scrutiny and limited trials, I expect it would be a cheap solution, and it would largely push many of the costs of tribunals onto employers.

Why claims don’t settle more often

I do not propose to set out a complete explanation – but one factor is a mismatch between expectations and the reality. There is a survivor bias in the media whereby high-value payouts are widely reported, and the generality of cases – with more modest tribunal awards – do not warrant media interest.

There are free statistics published by the Ministry of Justice each year. These are picked up and disseminated via solicitors’ websites. Why doesn’t that help? Well, the summary statistics presented are also less useful than they should be. The statisticians’ report headlines with the highest value award and the mean award.

Those two figures sound worthy of attention. But let me illustrate the problem by showing the actual distribution of unfair dismissal awards in the employment tribunal (left-hand side) plotted alongside the distribution which one might assume, based on the summary statistics: the shapes are quite different.

If, as would be quite understandable among litigants aware of the usual range of awards, you assume normally-distributed awards around the reported mean, your willingness to settle will be limited to offers that are, in most cases, well above the award’s true expected value.
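To see why the reported mean misleads when awards are right-skewed, consider the following sketch. It assumes, purely for illustration, that awards follow a log-normal distribution with a median of roughly £7,000; those parameters are hypothetical and are not fitted to the tribunal statistics.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical right-skewed awards: log-normal, median exp(8.85), i.e. about £7,000.
awards = [random.lognormvariate(8.85, 1.0) for _ in range(20_000)]

mean_award = statistics.mean(awards)
median_award = statistics.median(awards)

# Under a symmetric (normal) assumption half of all awards would fall below the mean.
# With a long right tail, a clear majority do.
share_below_mean = sum(a < mean_award for a in awards) / len(awards)

print(f"mean   £{mean_award:,.0f}")
print(f"median £{median_award:,.0f}")
print(f"share of awards below the mean: {share_below_mean:.0%}")
```

A litigant who treats the mean as the ‘typical’ award is therefore anchoring on a figure that most claimants in such a distribution never reach.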

Or if, as is also very common indeed, you were unaware of mean values, but had noticed reports of the outlier awards in excess of £150k, that will also lead to psychological anchoring around an unrealistic settlement figure.

Naturally, each case turns on its own facts. Some claims must belong to those outlier groups – and typically those relate to claims brought by highly-paid claimants whose compensatory or pension loss awards make up the bulk of the compensatory figure.

But for me the key message is that a lack of helpful statistical information and interpretation (such as might assist both parties to a dispute in avoiding legal costs) fosters needless litigation.


Brier scores and the Bar

Most barristers are independent professionals. In practice this means they are self-employed and in competition with each other.

It is notoriously difficult to get into the English Bar (3,000 annual applicants, <400 places, so a prior probability of around 11% of obtaining pupillage as an academically-trained barrister in a given year), largely due to

a) that shortage of pupillages (sub-400 per annum) as well as

b) high cost barriers to entry (typically up to £30k for post-graduate tuition, plus the opportunity cost of forgoing graduate-level earnings for a further two years or more).

But life is also very competitive after successfully completing pupillage: chambers compete with each other, and even tenants of the same chambers must compete with colleagues in their practice areas to win market share, or at least to win a share of the better-paid end of the price distribution within that particular market for barristers’ services.

That is the hidden message the ubiquity of what might seem like ‘sales puff’ tells you: barristers need to continually market and promote themselves in order to survive.

How do you evaluate a barrister?

Consequently, if you trawl through barristers’ profiles, they are replete with impressive claims about their expertise and past successes. And that may be well and good: you would expect that the difficulty of obtaining the academic training and then securing pupillage would be a Darwinian indicator of quality. So in most cases I would suggest those professional boasts are quite accurate too.

But since there is a UK population of around 16k barristers, how on earth can you distinguish between them? In other words how – as a member of the public, or a solicitor searching on behalf of a client – might you be confident you are selecting the barrister best suited to advance your case?

Common ways for solicitors to choose Counsel are:

a) personal direct knowledge through past experiences;

b) indirect knowledge – recommendations from solicitor colleagues or client requests;

c) third-party professional publications’ reviews (Legal500; Chambers and Partners);

d) assumptions of a barrister’s quality based on the reputation of chambers (and representations made by the practice manager/clerk) with a particular focus on the barrister’s online CV;

e) inferences drawn from number of years’ call and price (the higher the figure the better the expectation of quality);

f) other information from the internet, including mentions in the legal media and social media presence.

As a member of the non-legal public you are likely to be more limited in the sources of information you rely on to form a judgment – a solicitor might recommend a barrister for you; and/or you might google the barrister’s webpage.

Forget about ‘rate-my-barrister’

What is missing is a more objective (and transparent) way of gauging quality. What about voting? As far as I know there is no Google or Amazon reviews system to aggregate other clients’ assessments of individual barristers. This is because barristers are not set up like businesses. You might find reviews of a particular chambers, but that is not entirely helpful: a chambers may have hundreds of members.

There is a site set up by the Bar Council to facilitate direct access – it certainly does not have star ratings and I anticipate the introduction of such a scheme would be controversial among members.

But if there were, it would not help. This is because there is a significant problem baked into any voting system: sometimes barristers lose cases not through lack of skill but because the case was bound to fail. Indeed, it may well be that it is the high-risk cases that go to the most experienced Counsel, in much the same way that critically-ill patients would be cared for by an experienced consultant in an ICU. So their case ‘failure’ rate is likely to be higher. And a client who has paid for representation and lost is less likely to feel generous in assessing the services they received. So neither success rate nor market-based voting is a viable approach.

A composite measure?

My view is that the best metric of skill probably has to be a composite measure: it is not controversial to state that years of call is an indicator of experience, and a small step to say that greater experience is one aspect of quality.

Another metric would need to be reputation, although again there is no single measure of reputation that is universally appropriate. The Treasury has its bands of approved Counsel – that is a mark of appreciation; and Chambers and Partners assigns some barristers ‘tiers’ reflecting their standing. I think those peer-review grades have a role to play too.

The third metric (a novel one, which I think I may be the first to advocate for the Bar) should be an objective reflection of a barrister’s long-term average ability to forecast the outcome of a case, both because this is a valuable skill and because it indicates the application of experience and analytical skills.

We already have a scoring system, initially developed to gauge the accuracy of meteorological forecasts: Brier Scores. A Brier Score is essentially another term for the mean squared difference between forecasts and outcomes. To give a single example, if I were to advise a claim enjoyed 55% prospects of success, and the claim succeeded (the outcome being coded as ‘1’), then the Brier Score would be (1-0.55)^2 = 0.2025, or roughly 0.20. A lower score is better, since a score of zero indicates perfect prediction, and a score of 1 indicates a forecast that was entirely wrong. Clearly you would want an average score based on as large a number of observations as possible, and probably at least 30 chronologically consecutive assessments.
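For anyone wishing to keep their own tally, the calculation is short. The sketch below is a minimal implementation of the mean squared difference described above; the function name and the second, four-case track record are made up for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally sized, non-empty lists of forecasts and outcomes")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# The single example from the text: 55% prospects, and the claim succeeded.
single = brier_score([0.55], [1])

# A hypothetical record of four advices (illustrative numbers only).
forecasts = [0.55, 0.70, 0.30, 0.80]
outcomes = [1, 1, 0, 0]  # 1 = claim succeeded, 0 = claim failed
track_record = brier_score(forecasts, outcomes)

print(round(single, 4), round(track_record, 4))
```

Note how the one confident but wrong forecast (80% prospects, claim lost) dominates the average: the squared penalty punishes overconfidence far more than cautious error.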

Since a barrister’s advice is confidential, I do not think there could be any independent validation of such a metric. But I am hopeful that the inherent interest of reflecting on the accuracy of one’s predictions would encourage barristers to be honest with themselves – and an honestly-reported Brier Score would be a far more helpful indicator of quality than the occasional silly boasts one reads that someone is the ‘best barrister’ in their field.


(Friday off-topic) : we know nothing

When I was a child, we occasionally invited family over for a couple of days around dinner. I was always very excited to see family members at this time of year: they lived several hours’ drive away so it was a rare treat to see them. But I knew one thing: it would be awkward – very awkward – if my father and my uncle discussed the true meaning of ‘science’.

This was because my father was a social sciences professor, and my uncle an engineer whose work was squarely covered by the Official Secrets Act 1989. My father’s field was business management; my uncle’s was (essentially) ballistics engineering and computer coding. To put it mildly – they did not see eye-to-eye regarding the epistemic status of their respective disciplines: my uncle struggled to see that the social sciences had contributed anything resembling scientific knowledge.

These things largely passed me by: as a child I was interested in literature. I was highly competitive and satisfied if I could out-compete anyone in my school in that particular field. I did not really have any skin in the social sciences vs physical sciences game. I contented myself with asking lots of questions, and proving I could be equally irritating to both social scientists and ‘hard science’ engineers.

Fast-forward 30 years or so, and imagine my surprise as a middle-aged lawyer to have spent the past few years mugging up on both management science and techniques borrowed from the ‘hard science’ of engineering. I have become indebted to both fields.

My foray began with an interest in risk analysis, which led me to discover the use of decision trees in oil prospecting decisions. Then I became interested in fault tree analysis, as deployed in nuclear power plants and (as I know from my own work) in finance as a way of finding out what went wrong.

Then as I taught myself programming (both R and Python) and got more interested in finance, I became interested in some of the hacks that can be used when you cross-pollinate computing and statistics – I became obsessed with various Monte Carlo techniques, Brier scores and other forms of secular soothsaying.

Latterly, I have been living and breathing all things Bayesian: I labour over articles in mathematical journals I am sensu stricto poorly qualified to understand, and can at a general level offer arguments to help you distinguish Gibbs from Hamiltonian sampling techniques.

One of my interests is Bayesian Networks (“BNs”) – something I have written about in this blog in the past. Again, it is ironic that one of the primary use-cases for BNs has been in the field of engineering risk and reliability analysis. It is ironic because, scratch the surface, and you find that the physical sciences are just as in hock to the fuzziness of probabilistic inferences as other fields. It is a little difficult to see the bright red line separating social from physical sciences in this regard.

The failure of key components such as the O-rings in Nasa’s Challenger Space Shuttle in 1986 could be modelled usefully using a Poisson probability distribution (e.g. see discussion at p11 of ); but so too we might model the frequency of economic crises using the same family of distributions (there are several such articles but is one example). Neither the Challenger disaster nor the 2008 financial crisis was widely predicted; both otherwise wholly distinct types of crisis were readily explicable with the benefit of hindsight and references to the theory around the frequency of rare events.
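As a reminder of what a Poisson model of rare events involves, here is a minimal sketch. The rate of 0.1 events per period is a hypothetical figure chosen for illustration; it is not an estimate for O-ring failures or for financial crises.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events in a period, given the average rate per period."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


rate = 0.1  # hypothetical: one event every ten periods on average

p_none = poisson_pmf(0, rate)
p_at_least_one = 1 - p_none

# Over ten independent periods the expected count is 10 * rate.
p_at_least_one_in_ten = 1 - poisson_pmf(0, 10 * rate)

print(f"P(no event in one period)      = {p_none:.3f}")
print(f"P(at least one in one period)  = {p_at_least_one:.3f}")
print(f"P(at least one in ten periods) = {p_at_least_one_in_ten:.3f}")
```

The same function serves whether the ‘event’ is a component failure or an economic crisis, which is the sense in which the two fields share a toolkit.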

I suppose I point to these analogies in order to propose the following precepts:

  • all areas of human endeavour are unavoidably subject to ‘known unknowns’ as well as ‘unknown unknowns’ – and they are inherently uncertain in consequence.
  • consequently: probability theory is a useful interpretive lens for all areas of our lives (physics, medicine, law among others);
  • tools originally developed in areas such as theoretical physics (such as Stanisław Ulam’s Monte Carlo technique) are cross-functional, meaning they can be gainfully applied to other domains of human knowledge;
  • the fact that techniques developed in physics research might be applied to other domains such as finance, medicine, epidemiology, or legal studies does not entail the conclusion that to do so is a bastardisation of those techniques – or, if that line of argument is maintained, it is neither self-proving nor an obviously well-founded one.

In any event, my current interest in BNs is leading me to develop a radically new approach to the estimation of the likelihood of success for a given legal argument: indeed, I am not sure how I might reason or articulate my thought processes absent the use of such graphical probabilistic models.


Building Bayesian priors for Unfair dismissal awards

A histogram showing the spread of awards for unfair dismissal in 2020

Each year solicitors’ firms publish summaries of the latest tribunal statistics on unfair dismissal (among other types of claim).

The summaries invariably include the following data – the maximum award, the mean, and the median.

It occurs to me that this is not the most helpful format, since as summary statistics, the mean, median and maximum tell you very little indeed about the spread of awards.

As someone who advises on valuation, it would be useful for me to be able to say what sort of distribution of awards might be expected (before looking at the data of a given case). I was surprised that this data is not readily available, leading me to make a Freedom of Information request.

The response from the MoJ politely declined my request, on the basis that the data is already in the public domain. I was sent the relevant link. In fact, that data is still not public – there are counts of awards within 18 separate bands, but nothing like a list of the actual award figures which one could build a distribution from.

But it is true that with some imputation it makes little difference: I built the above histogram by using a random number generator to simulate awards within each of the 18 bands. I don’t believe there is an alternative way to generate a useful distribution given the current constraints on what data is shared.
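The imputation step might look something like the sketch below. The four bands and their counts are hypothetical placeholders rather than the 18 published bands, and drawing uniformly within each band is an assumption made for illustration, not necessarily the method behind the histogram above.

```python
import random

random.seed(42)  # fixed seed so the simulated awards are repeatable

# Hypothetical bands: (lower bound £, upper bound £, number of awards in the band).
bands = [
    (0, 5_000, 300),
    (5_000, 10_000, 200),
    (10_000, 20_000, 120),
    (20_000, 50_000, 60),
]

# Draw one simulated award per counted case, uniformly within its band (an assumption).
simulated = [
    random.uniform(lower, upper)
    for lower, upper, count in bands
    for _ in range(count)
]

print(len(simulated), "simulated awards")
print(f"lowest £{min(simulated):,.0f}, highest £{max(simulated):,.0f}")
```

The resulting list can be passed to any histogram routine, or used as the empirical starting point for a prior over award size.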

But I have followed the same process in respect of each type of claim, since reliable records began, and use these distributions to assist me in building prior models for the valuation of a claim, which I can then feed into a Bayesian Monte Carlo model.


Game theory — and Scottish independence — Alex MacMillan:

A brief excursus from the law to tackle current affairs…

Following the electoral results from Saturday (8 May 2021), the Financial Times reports today that Michael Gove, UK Cabinet Office Minister, has waved aside the suggestion that an advisory referendum on independence is on the cards — Westminster’s ‘exclusive attention’ is on the nation’s recovery from the pandemic. And Douglas Ross, the leader of the Scottish Conservatives, points to the failure of the SNP to achieve an outright majority — there is in consequence no democratic mandate for a rerun of the previous 2014 independence referendum, he says (outcome: 45% to 55% in favour of the Union).

Unfortunately for Mr Ross, his argument omits to consider the role of the Green Party — which also committed to a referendum in its manifesto. The SNP came very close indeed to an outright majority; and putting the SNP vote together with the Green vote, there unmistakably is a democratic mandate for a further referendum (if, adopting his reasoning, that is how mandates are properly deemed to exist).

The loss of Scotland from the UK would be a catastrophic legacy for this Government, which is a government of (to give it its full title) the Conservative and Unionist Party. And it would be particularly painful for PM Boris Johnson, since there is a fairly neat analogy between the arguments he mobilised in favour of Brexit and the arguments he would have to try and bat away.

What strategic choices lie open to this pro-Union Government? The SNP have made the first move in this sequential game by calling for a referendum. The ball is now in the PM’s side of the court. What is the best response to the SNP’s calls for an advisory referendum? Simplifying slightly, there are only three possible stances for the Prime Minister:

  1. Oppose the referendum outright;
  2. Accede to the referendum; or
  3. Delay.

Going through these in turn, from the Government’s perspective (and using ordinal numbers to express levels of preference) the payoffs are as follows:

  1. Oppose ~ 1. The Government pleases its political base by supporting the union, but risks antagonising the Scots;
  2. Accede ~ -2. The Government would be seen as failing to uphold its commitment to the Union and would set in train a risk of the breakup of the UK.
  3. Delay ~ 2. The Government avoids the politically and constitutionally dangerous territory of Scottish independence, hopefully until the political landscape has shifted in its favour.

Turning the tables, we also need to consider the SNP’s perspective. What are its payoffs in respect to these three Government strategies? Well, clearly its payoffs depend on how it chooses to act. But these are the guiding considerations, I suggest:

  1. Oppose. The SNP can rely on this as evidence of Westminster’s oppressive conduct, and can litigate with a reasonable chance of winning on this point of constitutional law, which the Supreme Court would have to rule on.
  2. Accede. The SNP have a second bite at the cherry with a referendum; given the fallout from Brexit (a constitutional decision taken without the democratic assent of the people of Scotland) there is a reasonable chance that the outcome of a second referendum would favour independence — but it would without doubt be a close-run thing.
  3. Delay. The SNP can accuse Westminster of frustrating Scotland’s mandate, and may try and take steps to arrange an advisory referendum in any event, which may lead on to scenario 1 above. But there is some risk that the political landscape will shift and the optimum moment for a referendum may pass.

And the SNP/Holyrood’s strategy set contains three strategies:

-Proceed with a referendum with or without Westminster’s blessing;

-Litigate to try and establish the right to hold a referendum;

-Wait to see if the political landscape changes to favour allowing a referendum.

Plotting the above, and giving the SNP the response strategy set of Proceed — Litigate — Wait as an answer to Westminster’s Oppose — Accede — Delay strategy set, the normal form strategy matrix looks like this:

The stars indicate the three Nash equilibria that exist for this strategic game (ie mark the scenarios in which neither party has an incentive to change strategy, if the other player’s strategy remains unchanged); the arrows represent the best response moves. I will say a little bit more about these annotations below.

Nash equilibria do not necessarily represent optimal outcomes for all parties. The Prisoner’s Dilemma is the classic example: each party’s best response to the other leads to an outcome that is worse for both than mutual cooperation, illustrating how players’ apparently rational choices can lead to undesirable outcomes.

To explain one Nash equilibrium from the matrix above, my assumption is that, given Westminster’s choice to oppose a referendum, Holyrood would suffer a loss if it chose to do anything other than proceed (litigate and the SNP might lose; wait and the chance slips away); and equally if the SNP are intent on proceeding with an advisory referendum, Westminster would suffer a loss if they did anything other than oppose that referendum by attacking its legitimacy/legality. So the top left-hand strategic pairing represents one equilibrium state.
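The equilibrium test described above (neither player can gain by changing strategy while the other stands still) can be checked mechanically. The following is a minimal sketch only: the payoff numbers are hypothetical ordinal values chosen for illustration and are not the values in the matrix discussed in this article, so the equilibria it reports need not match the stars on that matrix. The function name pure_nash_equilibria is likewise my own.

```python
from itertools import product

# Strategy sets as named in the text.
WESTMINSTER = ["Oppose", "Accede", "Delay"]   # row player
HOLYROOD = ["Proceed", "Litigate", "Wait"]    # column player

# HYPOTHETICAL ordinal payoffs, for illustration only:
# PAYOFFS[(westminster, holyrood)] = (Westminster payoff, Holyrood payoff)
PAYOFFS = {
    ("Oppose", "Proceed"): (1, 2),
    ("Oppose", "Litigate"): (0, 1),
    ("Oppose", "Wait"): (1, -1),
    ("Accede", "Proceed"): (-2, 2),
    ("Accede", "Litigate"): (-2, 2),
    ("Accede", "Wait"): (-2, 2),
    ("Delay", "Proceed"): (-1, 1),
    ("Delay", "Litigate"): (1, 2),
    ("Delay", "Wait"): (2, -2),
}


def pure_nash_equilibria(rows, cols, payoffs):
    """Return every (row, col) pair where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(rows, cols):
        row_payoff, col_payoff = payoffs[(r, c)]
        # Row player keeps the column fixed and tries every alternative row.
        row_cannot_improve = all(payoffs[(alt, c)][0] <= row_payoff for alt in rows)
        # Column player keeps the row fixed and tries every alternative column.
        col_cannot_improve = all(payoffs[(r, alt)][1] <= col_payoff for alt in cols)
        if row_cannot_improve and col_cannot_improve:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(WESTMINSTER, HOLYROOD, PAYOFFS))
```

With these made-up numbers the check reports two pure-strategy equilibria, Oppose/Proceed and Delay/Litigate. Changing any payoff changes the answer, which is the point: the prediction is only as good as the assessment of each side's preferences.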

The arrows indicate the players’ motivations to change strategy to avoid loss. For the SNP, for example, the strategy of waiting for a referendum to be allowed is highly unsatisfactory and will always lead to a shift to a more aggressive strategy in order to achieve a higher payoff.

Our game theory matrix predicts that, assuming my assessment of the strategic options and order of preferences is right, the Scottish referendum issue will not be wished away by the Conservative government, although ‘delay’ remains their best strategy at present.

Assuming Westminster’s present strategy is to delay addressing the question, the matter will either be resolved in the courts or Holyrood will proceed with an advisory referendum without Westminster’s approval (which would also be challenged in the courts). Either way, the Prime Minister has a substantial constitutional headache that is most unlikely to recede in the coming months.

And… Game theory in the context of legal advice

I touch on a practical application of Game Theory because I wanted to illustrate that the same approaches used in political analysis can also be helpful in litigation and settlement. Where helpful and relevant, I try and incorporate game theoretical analysis into my legal advices: its main advantage lies in forcing the advisor to rise above a single party’s interests and imagine the stakes for the other side. An aim of litigation should not merely be to establish the strengths of your own case, but also to gather information about the other side’s likely payoffs. Because by better understanding those payoffs, you can anticipate the future course of the litigation and establish the other side’s ‘pain threshold’ in settlement discussions — being the point at which their interests are better served by proceeding to court in lieu of pursuing settlement.

Originally published on May 10, 2021.


How to guess your award (without paying lawyers)

Imagine that, for whatever reason, you do not wish to engage a lawyer. Perhaps they are too expensive; perhaps lawyers do not have your trust. But you are numerate, and approach the endeavour with the seriousness of a bookie.

The context is that you believe you have a strong employment claim (whether the claim is yours or simply one that your organisation anticipates paying for). Normally people would reach into their pockets and commission an expert assessment from a specialist employment barrister. If you do not wish to do so, how else might you know the right level of award at which to settle the claim and avoid tribunal?

Making inferences from statistics

One obvious way would be to look at the published statistics for levels of award for your type of claim. You would also want to look at the statistics over a number of years (to average out any outliers from unusual years). Then you would consider not only the average award, but above all the distribution of awards so that you could with some confidence estimate the likely range of values the award might take.

To do this, you would have regard to the shape of the distribution, and would calculate confidence intervals — what would be the upper and lower bounds of the award that you could be 90% confident any award handed down by the tribunal would fall within?
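If one did have a list of individual award amounts, the 90% range described above could be read straight off the data as the 5th and 95th percentiles. The sketch below assumes exactly that. The award figures are synthetic, generated purely for illustration, and the function name ninety_percent_interval is my own.

```python
import random
import statistics


def ninety_percent_interval(awards):
    """Return the (5th percentile, 95th percentile) of a list of award amounts."""
    # n=20 splits the data into twentieths, so the first and last
    # cut points are the 5% and 95% quantiles respectively.
    cuts = statistics.quantiles(awards, n=20, method="inclusive")
    return cuts[0], cuts[-1]


# SYNTHETIC awards in pounds (not real tribunal data), drawn from a
# skewed distribution so that most values are small and a few are large.
rng = random.Random(0)
synthetic_awards = [rng.gammavariate(1.2, 9000.0) for _ in range(2000)]

lower, upper = ninety_percent_interval(synthetic_awards)
print(f"90% of the synthetic awards fall between £{lower:,.0f} and £{upper:,.0f}")
print(f"median £{statistics.median(synthetic_awards):,.0f}, "
      f"mean £{statistics.fmean(synthetic_awards):,.0f}")
```

Strictly speaking this is an empirical range for individual awards rather than a confidence interval for an average, but it answers the practical question of where a single award is likely to land.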

But… statistics are not available in an immediately useful format

The truth is you would need to dig around quite a bit in order to get any meaningful data on this. There are repositories of tribunal statistics (see Footnote 1), published each quarter. But despite being publicly available, they are not at all designed for public consumption: there is no convenient way to obtain, for example, lists of the amounts of each award (footnote 2), which are a basic prerequisite for constructing a distribution (footnote 3).

Solicitors’ firms regularly use their websites to publish updates listing the average awards for each type of claim and the upper thresholds, because those summary statistics are published by the Ministry of Justice. But there is a limit to the usefulness of knowing the outlier award, and to the uninitiated it is actually quite a misleading figure to headline with: how valuable is it to the majority of claimants to learn that the highest claim in a quarter was worth a quarter of a million pounds? It is in fact about as helpful as knowing the size of the National Lottery jackpot: good to know, but unlikely to be relevant to your life. It is equally a stressful statistic for employers and, again, not commonly the level of award for which they would be found liable. Nor is the mean award a good measure, since it is a moment that is unduly influenced by outlier amounts.

Claimants will hate this skewness

Coming back to modelling the likely bounds of any award, one can see from the plot of the cumulative distribution function above on the right that in 2020 90% of the unfair dismissal awards fell below the value of £30k. And what type of distribution actually fits the data? In the Cullen and Frey graph below, the blue dot represents the empirically observed data (the string of values representing 2020 unfair dismissal awards). Reading off the x-axis’s ‘Square of Skewness’ we can see our data is highly skewed (that is, highly asymmetrical if one compares the left and right sides of the distribution’s centre point). The bulk of the awards sits to the left of the centre point, at the low values, with a long thin tail stretching out to the right (in statistical terms, a positive or right skew). In other words, it is skewed in a way claimants will hate.

Employers will hate this kurtosis

Looking at the y-axis, we can see the blue dot representing the observations has a high positive score for kurtosis. Kurtosis tells us where the risk exists: is risk evenly spread through the distribution, or does it suddenly hit all at once, concentrated in the tail events?

Low kurtosis (or negative excess kurtosis, measured against the normal distribution) means most observations fall within a predictable range, with little tail risk. High kurtosis, by contrast, indicates the presence of extreme surprises in the tails of the distribution. Our data has very high kurtosis. So: not good for those responding to claims. In terms of fitting the data to a known type of distribution, one can see that the unfair dismissal data appears quite close to a gamma distribution.
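The two coordinates plotted on a Cullen and Frey graph can be computed by hand from the central moments of the data. The sketch below is illustrative only: it uses synthetic values rather than the tribunal figures, the function name is my own, and it reports plain (Pearson) kurtosis, on which a normal distribution scores 3. It also uses the known property that every gamma distribution satisfies kurtosis = 3 + 1.5 × skewness², which is why gamma distributions appear as a line on that graph.

```python
import random


def skewness_and_kurtosis(values):
    """Return (skewness, kurtosis) using population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # a normal distribution gives 3
    return skewness, kurtosis


# SYNTHETIC right-skewed sample (not real award data).
rng = random.Random(42)
sample = [rng.gammavariate(1.0, 10000.0) for _ in range(50000)]

skew, kurt = skewness_and_kurtosis(sample)
gamma_line = 3 + 1.5 * skew ** 2  # kurtosis a gamma distribution would have at this skewness

print(f"square of skewness: {skew ** 2:.2f}")
print(f"kurtosis:           {kurt:.2f}")
print(f"gamma line at this skewness: {gamma_line:.2f}")
```

The closer the observed kurtosis is to the gamma line at the observed skewness, the more plausible a gamma distribution is as a candidate for the data.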

Conclusions: hope for the best; prepare for the worst

From this cursory analysis, one can clearly see why there is appetite to pay into insurance schemes covering employment claims: while most claims for unfair dismissal resolve for a relatively low sum of money (whether through settlement or by tribunal order), there is significant tail risk that should make any HR professional think twice before declining conciliation. We are not dealing with a normal distribution but with a distribution that is characterised by skewness, kurtosis, and a fat tail.

The gamma distribution is commonly used in finance as an alternative to the normal distribution — to model the returns on stock indices as well as various other areas of finance such as options pricing (Footnote 4). From our consideration of the 2020 unfair dismissal data, the gamma distribution is a solid candidate for describing the likelihood of that data (see the plots below fitting a gamma curve to the data).
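For readers who want to try a gamma fit themselves, one simple approach is the method of moments: match the gamma distribution's mean and variance to those of the data. This is a swapped-in, simplified technique and not necessarily the fitting method behind the plots in this article. The data below are synthetic and the function names are my own.

```python
import random
import statistics


def fit_gamma_by_moments(values):
    """Estimate gamma (shape, scale) by matching the sample mean and variance."""
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)
    shape = mean ** 2 / variance  # gamma mean = shape * scale
    scale = variance / mean       # gamma variance = shape * scale^2
    return shape, scale


def simulated_quantile(shape, scale, q, draws=100_000, seed=0):
    """Approximate the q-quantile of a gamma distribution by simulation."""
    rng = random.Random(seed)
    sample = sorted(rng.gammavariate(shape, scale) for _ in range(draws))
    return sample[int(q * (draws - 1))]


# SYNTHETIC awards in pounds (not real tribunal data).
rng = random.Random(1)
synthetic_awards = [rng.gammavariate(2.0, 5000.0) for _ in range(20000)]

shape, scale = fit_gamma_by_moments(synthetic_awards)
print(f"fitted shape {shape:.2f}, fitted scale {scale:,.0f}")
print(f"simulated 90th percentile: £{simulated_quantile(shape, scale, 0.90):,.0f}")
```

Once shape and scale are estimated, the fitted curve can be used to ask tail questions, such as the award level that only one claim in ten would exceed.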

Of course, if one did engage a lawyer, these non-specific insights can be used in conjunction with expert advice on the particular facts of your case to determine the most likely outcome with more precision than a general statistical approach can afford.

  2. However, there are indications of the numbers of awards falling within pre-defined bands. It is an odd way of presenting data to the public, but is enough to use bootstrapping to generate useful statistics — which is the technique I had to use to generate the density histogram above.
  3. I am grateful for the help of the Ministry of Justice in promptly answering my Freedom of Information request and indicating the extent of the statistics (and confirming to me the limits as to what is publicly available).
  4. See “Modeling and Risk Analysis Using Parametric Distributions with an Application in Equity-Linked Securities” by Sun-Yong Choi and Ji-Hun Yoon published in Mathematical Problems in Engineering (2020, Special Issue)

The problem with the forced distribution model

In 2019 the UK’s civil service changed its mind about its use of forced distributions, deciding instead to move to more flexible, non-mandatory objective setting. The change came about only recently, from April 2021, for the 2021–22 performance year. But it is a big change: the civil service has half a million employees.

The details of the former scheme are instructive — it worked as follows: civil servants’ performance was ranked from top to bottom, and a distribution overlaid as follows:

-the top 25% of Senior Civil Servants were ranked as top performing;

-the 65% of civil servants deemed as coming below top performers were graded as ‘achieving’; then

-the bottom 10% were ‘low performers’.

Where do these numbers come from? It is not entirely clear. If the system assumes a Gaussian (normal) distribution of performance, then one would expect it to apply the Empirical Rule, which states that (under a normal distribution) 68% of the data falls within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three.

So that central 65% looks like a rounded version of 1 standard deviation from the mean. Of course the ordering of the civil service bands does not follow the standard deviation form — in consequence the thresholds applied by the civil service appear to be rather arbitrary. But in any event there is always going to be arbitrariness in attempting to overlay a mathematical model on human performance.
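The comparison between the civil service bands and the Empirical Rule can be checked numerically. The sketch below assumes, purely for the sake of argument, that performance follows a standard normal distribution; the variable and function names are my own.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Share of a normal population lying within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")

# Where the 25% / 65% / 10% bands would fall on a normal curve:
top_cutoff = standard.inv_cdf(0.75)     # boundary between 'top' and 'achieving'
bottom_cutoff = standard.inv_cdf(0.10)  # boundary between 'achieving' and 'low'
print(f"top 25% starts at    {top_cutoff:+.2f} standard deviations")
print(f"bottom 10% ends at   {bottom_cutoff:+.2f} standard deviations")

# Half-width of a band that would hold 65% if it were centred on the mean:
centred_half_width = standard.inv_cdf(0.5 + 0.65 / 2)
print(f"a centred 65% band would span ±{centred_half_width:.2f} standard deviations")
```

On this assumption the ‘achieving’ band runs from roughly 1.28 standard deviations below the mean to roughly 0.67 above it, so it is neither centred on the mean nor bounded at whole standard deviations, which supports the point that the thresholds are arbitrary.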

The civil service is by no means the only large employer using this system. Microsoft used it, and the forced distribution system was recently the subject of staff complaints at KPMG — KPMG Chairman Bill Michael told employees to stop moaning about the system. They did not stop; he had to step aside.

The appeal of the system is that it forces managers to have awkward conversations with under-performers, and also provides a quantifiable way for workers to judge their own performance relative to their cohort.

So why would the UK civil service change course? For one thing, the approach measures relative performance, but arguably the proper test of interest to management is absolute performance. If you are a civil servant in a particularly talented team, your own impeccable work could land you in the low performers band in circumstances where of course you are not in fact underperforming.

A further problem comes with the ranking exercise — it is plainly a difficult matter to rank employees by performance, because to do so is to compare apples with pears. The difficulties entailed by redundancy selection processes are replicated at each stage of regular performance review.

And then there is the issue with employees gaming the system: strong performers have an incentive not to work together (hardly desirable). And, worse, there is an incentive to sabotage your team mates’ work for your own relative advantage.

Writing in the Financial Times, Sarah O’Connor further points out that, according to research, the act of line managers giving feedback hurts performance in a third of all cases (Footnote 1). The effect on morale is toxic, because necessarily just under half the population of workers are being told that their work is sub-par.

And another point: in terms of absolute performance — who says the normal distribution should apply anyway? For example, what if performance is influenced by a team’s positive group dynamics so that the actual performance distribution is right-skewed?

The latter seems likely, and has an analogy with the performance of the stock market: you might expect the performance of stock-market assets, in aggregate, to be always normally-distributed. In fact, due to reinforcing loops, stocks commonly move up and down together. That idea is a simple but powerful one and a rebuttal to those seeking to classify performance using relative measures as a proxy for absolute measures of value.

Footnote 1: