Beyond the 80:20 Principle

By Gordon Rugg, Jennifer Skillen & Colin Rigby

There’s a widely used concept called the 80:20 Principle, or the Pareto Principle, named after the Italian economist Vilfredo Pareto, whose observations about wealth distribution inspired it. It’s extremely useful.

In brief, across a wide range of fields, about 80% of one thing will usually come from 20% of another.

In business, for example, 80% of your revenue will come from 20% of your customers. In any sector, getting the first 80% of the job done will usually take about 20% of the resources involved; getting the last 20% of the job done will usually be much harder, and will take up 80% of the resources. The figure won’t always be exactly 80%, but it’s usually in that area. Good managers are very well aware of this issue, and keep a wary eye out for it when planning.

Here’s a diagram showing the principle. It’s pretty simple, but very powerful. However, that doesn’t mean it’s perfect; it can be developed into something richer and more powerful still, which is what we’ll describe in this article.

[Image: the 80:20 split]

A more powerful version

The 80:20 Principle divides the world into two categories. However, the world isn’t usually that simple. Usually, there are more than two common categories that you need to be able to handle. The number usually isn’t enormous; it’s usually only a handful of categories, but that’s still more than two. So, is there a neat way of extending the 80:20 Principle to include those other common cases?

One logical next step is to ask whether the 20% itself follows the 80:20 principle, with a small sub-minority within it. The answer is that usually it does. And, to answer the follow-up question, yes, that small sub-minority will usually itself contain an even smaller minority. Here’s an illustration showing those layers of minorities.

[Image: the 70:20:9:1 layered split]

So, a large majority of cases fall in the light green category on the left. A fair-sized minority fall in the yellow category. A noticeable but smaller minority fall in the beige category, and a very small minority fall in the red category at the right.

The proportions that we’ve used above are 70:20:9:1.

These proportions aren’t completely arbitrary. We’ve chosen them via a combination of real-world examples and statistical principles, which we unpack in the notes at the end of this article. The actual distributions that you see in real life probably won’t match these figures exactly, but they’ll usually follow the same overall pattern. (So don’t get hung up on the precise figures; the key point is the general principle.)

In brief, there’s usually one very common category, then a fairly common second category, and then a few less common categories, followed by an assortment of rare or very rare categories. This type of distribution has been extensively studied in statistics and related fields, via concepts such as normal distributions and power laws, which we briefly describe in the notes at the end of this article.
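One quick way to get a feel for such layered minorities is to apply the 80:20 split recursively to each successive minority. The resulting 80:16:3:1 pattern is close to, though not identical with, the 70:20:9:1 figures above; this is a purely arithmetical sketch of ours, not a claim about any particular dataset.

```python
# Apply an 80:20 split recursively: at each layer, 80% of what's left
# forms the majority, and the remaining 20% is split again.
total = 100.0
layers = []
for _ in range(3):
    layers.append(total * 0.8)  # the majority at this layer
    total *= 0.2                # the minority, to be split again
layers.append(total)            # the final, smallest minority
print([round(x, 1) for x in layers])  # → [80.0, 16.0, 3.2, 0.8]
```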

So what?

Here’s an example of this principle in everyday life. It involves people’s surnames.

Very common: Surnames look straightforward at first glance. You ask someone for their surname; they answer with a single word like “Smith” or “Patel”. Most of the time, it’s straightforward.

Fairly common: Quite often, though, it’s not so straightforward. A common case is married women who use their married name for everyday purposes, but who continue to use their maiden name professionally. Similarly, a lot of authors, actors and other public figures use a pen name or stage name.

Less common: There are also other cases which are not so common, but which are still far from rare. A classic example involves multi-word names, such as the patronymics ap Rhys and ní Gabháin, or prefixed names like de Vere. Unlike “Smith” or “Jones”, these names each consist of more than one word.

Rare: Some answers to the name question will be along the lines of: “Sorry, it’s not that simple”. For example, a lot of Indonesian names consist of a single word, rather than the more familiar “personal name plus family name” combination. Sometimes, the answer will be “Don’t know”. This can happen when relief agencies are trying to deal with a young, lost child. It can also happen when the emergency services are dealing with someone unconscious who isn’t carrying any identification.

This issue has very real practical implications. One example is passport control when you’re travelling; if some of your identification uses one version of your name, and the rest uses another version, you’ll probably have to spend a lot of time trying to explain this to sceptical immigration officials. Another example is official records, such as bank and health records, where similar problems can easily arise.

If you’re dealing with customer or client records, then you’ll encounter very similar complications with just about every other data field. The less common the complication, the longer it will usually take to sort out, and the greater the potential for chaos if different people in the organisation make different judgement calls about similar cases. In the worst cases, this can lead to lawsuits because you’re treating people inconsistently, or even to deaths, if confusion over medical records leaves doctors unaware of critical information about a patient.

So what can you do about this?

You can reduce this problem significantly by using a simple model. We’ve named it the SALT model, for:

  • S: Standard process, for the standard, very common cases
  • A: Alternate approach, for the fairly common alternatives
  • L: List of the less common alternatives
  • T: Trained professional judgement, applied to the rare cases

From the viewpoint of someone handling a customer query, for the sake of example, it would work like this.

Standard process: A customer asks a common, straightforward question, and you tell them the standard answer, swiftly and easily, from memory.

Alternate approach: Your organisation trains you in how to handle the handful of fairly common alternatives to the standard process. You will encounter these cases often enough to keep your memory fresh as regards the correct way to handle them.

List: Your organisation provides a list of the less common alternatives, with the policy for handling each one. When you encounter a case you haven’t been trained in, you look for it on the list.

Trained professional judgment: If you encounter a case that you haven’t been trained in, and you can’t find it on the list, you hand the case over to a trained professional. Where possible, they will add their decision about the case to the list, so that future cases are handled in the same way. Where this is not possible, the organisation has a standard legal disclaimer saying, in effect: “We’re treating you as a one-off special case, without any legal implication that the way we treat you can count as a precedent in future cases”.
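As a rough sketch, the four SALT tiers described above could be implemented as a simple tiered lookup. All the topic names, answers and function names here are invented for illustration; they aren’t from the article.

```python
# A minimal sketch of the SALT triage flow (all topics, answers,
# and function names are hypothetical).

STANDARD = {"opening hours": "We are open 9 to 5, Monday to Friday."}
ALTERNATES = {"refund": "Refunds go through the returns desk."}
LESS_COMMON = {"data export": "Escalate to the records team."}  # the "list"

def refer_to_professional(topic):
    # Stand-in for handing the case over to a trained professional.
    return f"One-off ruling for '{topic}' (no precedent implied)."

def handle_query(topic):
    if topic in STANDARD:        # S: standard process, answered from memory
        return STANDARD[topic]
    if topic in ALTERNATES:      # A: trained alternate approach
        return ALTERNATES[topic]
    if topic in LESS_COMMON:     # L: look it up on the list
        return LESS_COMMON[topic]
    ruling = refer_to_professional(topic)  # T: professional judgement
    LESS_COMMON[topic] = ruling  # record the ruling so future cases match
    return ruling
```

Note that the final tier writes its ruling back onto the list, which is exactly the “add their decision to the list” step described above.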

Further thoughts

This approach can easily be integrated with some other useful concepts.

One is optimising the standard process, by making it as efficient as possible. When you’re dealing with large numbers of cases, which is what the standard process does, then even a small improvement in efficiency soon mounts up. This type of optimisation is a key part of Total Quality Management in the best manufacturing processes, and also in software design for systems that need to handle a very large number of transactions. Shaving a second off the time for an average online process can make a very big difference when spread across a large system.

Another is systematically deciding when to base your processes on knowledge in the head as opposed to knowledge in the world. Training a member of staff in a given procedure costs time and money. However, if the staff member can then handle that procedure efficiently from memory (knowledge in the head), they can work faster and better than if they had to look up the relevant information online or in a handbook (knowledge in the world).

The “list” approach overlaps with the issue of information representation, which has featured repeatedly in previous articles on this blog. The list needs to be formatted in such a way that it can be used swiftly and easily by staff. This is a well recognised problem in software design, because there is no single best way of structuring a list. For instance, if you structure a list alphabetically, you hit the problem of different names for the same concept; if you structure it from most common to least common cases, then the user won’t know where to look for the case that they’re dealing with, and will have to work through the entire list item by item.
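For instance, one simple way to soften the “different names for the same concept” problem is to pair an alphabetical list with an alias table that redirects synonyms to a single canonical entry. This sketch uses invented entries throughout.

```python
# An alphabetical policy list plus an alias table: synonyms are
# redirected to one canonical entry (all entries are hypothetical).

POLICIES = {
    "patronymic surname": "Record all name elements; do not truncate.",
    "single-word name": "Enter the name in the family-name field and flag the record.",
}
ALIASES = {
    "multi-word surname": "patronymic surname",
    "mononym": "single-word name",
}

def look_up(term):
    term = ALIASES.get(term, term)  # redirect synonyms to the canonical term
    return POLICIES.get(term)       # None means: escalate to a professional
```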

A fourth significant point about this approach is that it reconciles the apparent conflict between procedures and professional judgment. This is a long-standing problem in professions such as medicine and academia, where professional judgment is a crucial element, but can conflict with the drive towards standardisation that is a central feature of successful bureaucracies.


This approach has some major implications for best practice in public policy, as well as for individual organisations.

A key implication is that any new policy should include an explicit statement of how it will handle the first three categories: what goes into the standard process, what goes into the alternate approach, and what goes onto the list.

For instance, many government initiatives in recent years have implicitly been based on the assumption that everyone is literate, and that everyone has a bank account, with only a tiny minority of people not fitting those descriptions. The reality is quite different, and the gulf between assumptions and realities has often meant that policies were unworkable from the start. Governments have to govern for everyone, not just for the most common categories; the way that a nation treats its marginal minorities says a lot about how civilised it really is.

Simply having to list the less common cases as part of policy creation, preferably with some reality-based research into how common each of those cases actually is, would have avoided quite a few policy fiascos, and a lot of human tragedy for the people caught in the chaos of botched systems.

Another, less obvious, implication was pointed out by our colleague Steve Linkman, in Computing at Keele. The people in the less common and rare categories will usually have experienced a lifetime of problems with systems that don’t have a neat, simple way of handling them. Vegans have had years of explaining that they’re not the same as vegetarians; people with surnames like “ní Gabháin” have had years of seeing their name mangled by systems that were designed for names like “Smith”; people with food intolerances have had years of being treated as just fussy eaters.

That sort of experience is likely to leave people tetchy when they encounter yet another case of a system causing them problems. That can in turn easily lead to confrontations between the individual and the organisation that is deploying the system – for instance, long-suffering receptionists and front office staff.

One simple way of addressing this is to include awareness training as part of the training for alternate cases, so that staff don’t make ill-advised jokes or comments to people in unusual categories. A little effort here could go a long way, in terms of helping improve people’s experiences and people’s lives.

On which cheering note we’ll end.

Notes and links

You’re welcome to use Hyde & Rugg copyleft images for any non-commercial purpose, including lectures, provided that you state that they’re copyleft Hyde & Rugg.

There’s more about the theory behind this article in my latest book: Blind Spot, by Gordon Rugg with Joseph D’Agnese.


The statistical background

The numbers we’ve used are inspired by the figures for statistical normal distributions. If something is distributed on a normal distribution, then most cases are near the middle, with fewer cases occurring as you move further away from the middle. For instance, most people are about average height; quite a few are a bit above or a bit below average height; a few are very tall or very short; and a very few people are extremely tall or extremely short.

If you want to translate concepts like “quite a few” into actual numbers, then there are elegant statistical ways of doing so. This is extremely useful for all sorts of purposes, such as predicting what proportion of your customers will want a particular size of a product.

In statistical terms, if the distribution follows a normal curve, then about 68% of the cases will be within one standard deviation of the mean (roughly a 70:30 distribution). Of the remaining cases, most will be between one and two standard deviations from the mean; only about 5% of cases fall more than two standard deviations out, and about 0.3% fall more than three.

We haven’t tried to follow the normal distribution exactly; this article is about a rough approximation that is fairly easy to remember, and about ways of handling that distribution of cases, not about precise prediction or statistical detail.
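For readers who do want the precise figures, the exact normal-curve proportions can be computed from the error function; here is a quick sketch using Python’s standard library.

```python
import math

def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sd: {within(k):.1%}")
# within 1 sd: 68.3%; within 2 sd: 95.4%; within 3 sd: 99.7%
```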

You also see a very similar distribution in a statistical technique known as Principal Component Analysis (PCA). When you examine how much of the variation in a dataset each underlying component accounts for, you usually find that the first few components account for the vast majority of the variation, with the remaining components contributing only very minor amounts.
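As a rough illustration of that pattern, here is a two-variable PCA on invented data, where one shared factor drives most of the variation; for two variables, the component variances are simply the eigenvalues of the 2×2 covariance matrix, which have a closed form.

```python
import math
import random

random.seed(0)
# Two measurements driven mostly by one shared underlying factor
# (hypothetical data, for illustration only).
xs, ys = [], []
for _ in range(1000):
    f = random.gauss(0, 1)               # the shared cause
    xs.append(f + random.gauss(0, 0.3))  # plus independent noise
    ys.append(f + random.gauss(0, 0.3))

def covariance(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

# Eigenvalues of the 2x2 covariance matrix = variance per component.
sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
half_trace = (sxx + syy) / 2
disc = math.sqrt(half_trace**2 - (sxx * syy - sxy**2))
lam1, lam2 = half_trace + disc, half_trace - disc
print(f"first component explains {lam1 / (lam1 + lam2):.0%} of the variance")
```

With a strong shared factor and small noise, the first component typically explains well over 90% of the variance, mirroring the pattern described above.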

Another concept which shows similar principles is a power law. Power laws crop up in areas as different as city sizes and earthquake severity.

In power law distributions, you usually see one category that’s much more common than the others (the one at the left of the diagram) followed by a few categories that drop steeply from quite common to moderately common, and then a long tail of categories that range from uncommon to very rare.

[Image: “Long tail” by User:Husky – own work, public domain, via Wikimedia Commons]

Here’s a real-world example.


The diagram shows the sizes of cities in Tennessee. The first two are large (more than 500,000 inhabitants), but after them the size drops off rapidly, with the majority being well below 100,000.
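That shape can be sketched with the classic Zipf rule of thumb, where the k-th largest city is about 1/k the size of the largest. The starting figure here is invented, not the Tennessee data.

```python
# Generate a Zipf-style power-law sequence of city sizes:
# the k-th largest city is roughly 1/k the size of the largest.
largest = 650_000  # hypothetical size of the biggest city
sizes = [round(largest / rank) for rank in range(1, 11)]
print(sizes)
# → [650000, 325000, 216667, 162500, 130000, 108333, 92857, 81250, 72222, 65000]
```

Even in this toy version, the first entry towers over the rest, and by rank three the sizes are already in the long tail.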

So, in summary, the overall principle we’re describing is very widespread, and the SALT approach gives a richer, more powerful way of handling it than the 80:20 principle.

