By Gordon Rugg and Sue Gerrard
What are chunking, schemata and prototypes, and why should anybody care?
The second question has a short answer. These are three core concepts in how people process and use information, so they’re centrally important to fields as varied as education and customer requirements gathering.
The first question needs a long answer, because although these concepts are all fairly simple in principle, they have a lot of overlap with each other. This has frequently led to them being confused with each other in the popular literature, which has in turn led to widespread conceptual chaos.
This article goes through the key features of these concepts, with particular attention to potential misunderstandings. It takes us through the nature of information processing, and through a range of the usual suspects for spreading needless confusion.
Original images from Wikipedia; details at the end of this article
The core principles of sensory-level chunking are widely known. Human working memory has a very limited capacity, identified by Miller in the 1950s as “seven plus or minus two”. It’s very different from human long-term memory, which has a huge capacity, and which works in a different way.
We can think of human working memory as being like a set of pegs on which you can hang things, as in the image below.
In this image, there are seven yellow pegs. To keep the analogy with human working memory, we’ll use two constraints: each peg can hold only one ring, and items have to be hung up in the correct sequence. In the example above, for the digits 0,6,1,6, there are four dog tags, each with a number on it, each attached to its own ring. This arrangement means that we’ll need four pegs to hang up all four dog tags.
Now let’s suppose that we’re allowed to attach several dog tags to each other via a linking rod, and then attach that rod to a ring. This means that we would only need one peg for this set of four dog tags.
That’s what goes on in sensory-level chunking. In chunking, the human sensory system spots a link between two or more items, and does the equivalent of joining them together.
The image below shows this in operation for a different sequence involving the same digits. Imagine that instead of the sequence above (0,6,1,6), we have to handle the sequence 1,0,6,6. To most British readers, these digits form the date of the Battle of Hastings, so the four digits would be automatically chunked into a single group.
The advantage of this type of chunking is that it frees up a lot of processing space; when you only have about seven pegs to hang things off, then freeing up three pegs, as in the example above, is extremely useful.
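For readers who like to see the bookkeeping spelt out, here is a minimal sketch of the peg analogy in Python. It is only an illustration of the counting, not a model of how the sensory system works: real chunking is automatic and involuntary, whereas this code does an explicit lookup. The names `PEG_COUNT`, `KNOWN_CHUNKS` and `pegs_needed` are our own inventions for this example, and the peg count of seven is simply the midpoint of Miller's range.

```python
# Illustrative sketch only: working memory as a fixed number of "pegs".
PEG_COUNT = 7  # midpoint of "seven plus or minus two"; an assumption

# Hypothetical store of digit sequences this person already recognises.
KNOWN_CHUNKS = {("1", "0", "6", "6")}  # the date of the Battle of Hastings


def pegs_needed(digits, known_chunks=KNOWN_CHUNKS):
    """Count the pegs used when each familiar run hangs from one linking rod."""
    items = tuple(digits)
    pegs = 0
    position = 0
    while position < len(items):
        # Find the longest known chunk that starts at this position.
        match_length = 1
        for chunk in known_chunks:
            if items[position:position + len(chunk)] == chunk:
                match_length = max(match_length, len(chunk))
        pegs += 1  # one peg per chunk, or per lone digit
        position += match_length
    return pegs


unfamiliar = pegs_needed("0616")  # no familiar pattern in this order
familiar = pegs_needed("1066")    # recognised as a single chunk
print(unfamiliar, familiar, PEG_COUNT - familiar)
```

The same four digits need four pegs in one order and one peg in the other, which is the three-peg saving described above. Note that the chunks here are flat tuples with no sub-chunks.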
This type of chunking is a common feature in human information processing, and is one of the key features of expertise; experts have chunked huge numbers of pieces of information about their domain of expertise, so they can both see huge numbers of associations that novices can’t, and can also process much more information more efficiently than novices can.
So far, so good.
Key features, and what this isn’t
What we’ve described above is “proper” chunking, in the way that chunking is normally understood in cognitive psychology and related fields.
Unfortunately, novices sometimes confuse this concept with various forms of hierarchical information organisation that are completely different from chunking.
“Proper” chunking occurs at a very early stage of information processing, which is why we’ve used the phrase “sensory-level chunking” in the description above. It’s an involuntary, automatic process that takes place before any verbal or conscious processing begins.
Another key feature of chunking is that it’s “flat”. Your sensory system perceives a chunk in the incoming information stream from your senses, and automatically treats it as a chunk. The chunk doesn’t contain sub-chunks. Using the analogy of the pegs and dog tags, you only get to use one linking rod, and one layer of dog tags hanging off the rod.
This is completely different from what goes on with hierarchical organisation of information. This difference has major implications for learning and education, which is why understanding the difference is critically important.
Not the same as chunking: The knowledge pyramid, and a taxonomy
The Tree of Life diagram is from Wikipedia; details at the end of this article
Chunking is closely associated with implicit learning, about which there is a solid body of research evidence in the psychology literature. Implicit learning involves seeing large numbers of examples, and having some way of knowing which examples are associated with which outcome. The learner then induces rules and underlying principles from those examples.
Implicit learning is a tacit process in the strict sense; i.e. the learner has no valid introspection into what is going on within their brain while they are learning.
This is very different from learning other forms of association between pieces of information, many of which can easily be handled via verbal, explicit knowledge. Implicit learning can be slow and inefficient, so it’s often not the best way to learn something.
This has obvious and serious implications for the debate about the role of “facts” in education, where novice-level misunderstandings of key concepts are ubiquitous. For example, misunderstanding the nature of chunking could easily lead a novice to the completely erroneous conclusion that showing large numbers of “facts” to children will be a “natural” and efficient way for the children to learn the underlying principles. The reality is very different.
Schema theory, schemata and scripts
A schema is a sort of mental template for something. We’ve deliberately started with a broad and vague description, since the term has been used in a range of ways since its popularisation by Bartlett in the 1930s. There are two standard plural forms, namely schemas and schemata. We use schemata because it’s less likely to be misunderstood as schemers in spoken discussion with non-specialists.
There’s a very similar related concept, namely the script in the psychological or computational sense, as opposed to the sense of film script. Broadly speaking, scripts are a subset of schemata that deal with sequences of actions. Schemata can apply both to static knowledge and to knowledge about sequences of actions. For brevity, we won’t go into scripts and script theory in this article.
The schema is an extremely useful concept as long as it isn’t pushed outside its range of convenience, i.e. the range of contexts in which it can be meaningfully applied. That’s one reason that we’ve been deliberately loose in our terminology so far. As long as people treat the definition of “schema” with suitable caution, it’s a very useful concept.
At the heart of this concept is the mental template. There’s been a lot of debate about what form this template might take, and how the template is formed, etc. We’ll deliberately not get into these questions, for brevity.
Instead, we’ll go straight into an example, which should make the principle clearer. Our example is a lay person’s schema for a bird, which might include the key features that a bird is an animal, which has wings, has feathers, lays eggs, and can fly.
This core concept is as simple as it looks. However, because it looks superficially similar to various other concepts, it’s liable to being misunderstood and misinterpreted. Here are some examples, to clear the ground before we get into detail with this example.
The schema for bird above is an example of folk taxonomy, as opposed to formal scientific taxonomic theory or other systematic, formalised classifications. Folk taxonomies are generally based on surface features rather than deep structure, and are often internally inconsistent.
The schema above is different from sensory-level chunking in various ways. For instance:
- It involves explicit knowledge, which can be explained in words via valid introspection, whereas chunking doesn’t
- It can be learnt explicitly, whereas chunking is learnt via implicit learning
- It can be passed on purely verbally, whereas chunking can’t (i.e. you can verbally tell someone the schema for a bird, but you can’t teach someone how to chunk something just by telling them the key features in words)
- It involves hierarchically organised information, such as including the concept of wing, whose definition involves further layers of explanation; this isn’t the case for chunking
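The structural difference in the last point can be sketched as data. The example below is a rough illustration, assuming that a chunk can be represented as a flat tuple and a schema as a nested dictionary; neither representation is a claim about how the brain stores anything. The details nested under "wings" are made up purely to show a further layer of explanation, and the `depth` helper is our own name.

```python
# A chunk is flat: one linking rod, one layer of items, no sub-chunks.
hastings_chunk = ("1", "0", "6", "6")

# A lay schema for "bird", using the key features listed in the text.
# The detail nested under "wings" is a made-up example of a further layer.
bird_schema = {
    "is_a": "animal",
    "features": {
        "wings": {"count": 2, "used_for": "flying"},
        "feathers": {},
        "lays_eggs": {},
        "can_fly": {},
    },
}


def depth(structure):
    """Return how many layers of nesting a structure has."""
    if isinstance(structure, dict):
        return 1 + max((depth(v) for v in structure.values()), default=0)
    if isinstance(structure, (list, tuple)):
        return 1 + max((depth(v) for v in structure), default=0)
    return 0  # a plain value such as "1" or "animal"


print(depth(hastings_chunk))
print(depth(bird_schema))
```

The chunk has a single layer, while the schema has several, and each feature in the schema could be unpacked further in the same way.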
Experts differ from novices in having much more sophisticated organisation for their knowledge, typically involving multiple hierarchical structures for it, multiple facets for it, and multiple schemata for it. All of these are completely different from chunking in the proper sense of the term, and all of these have different implications for learning and teaching.
A common problem with schemata involves someone mistakenly believing that a schema with which they are already familiar can be applied without modification to a different field.
A classic example is the belief that the schema of managing a business can be applied without modification to managing a public sector organisation such as a school or a government department. Similarly, there’s a common belief that there’s a single schema for good writing, as opposed to various different schemata for good writing in different disciplines, usually with detailed rationales for how the writing style needs to fit with key issues specific to that discipline.
One of the main findings of the substantial body of research into expertise is that expertise is tightly bound to the domain in which the expert specialises, and that it is highly dependent on huge numbers of facts about that domain, rather than being based on general transferable principles.
This is another finding with major implications for government policy and for education in particular, but it has so far received less attention than it should have.
The term prototype is yet another term that has completely distinct meanings in different fields.
To add further confusion, the Ancient Greeks have contributed their usual mixture of interesting observations combined with plausible but horribly warped conclusions. To add yet more confusion, the Ancient Greeks were aided and abetted centuries later by Jung, who introduced yet another plausible-looking but profoundly wrong speculation. We’ll return to them later.
Returning to the first ambiguity: in this article, we’re not talking about the prototype in the sense of “an early version of a product”.
Instead, we’re talking about the prototype in the sense of “a classic, typical member of a category” as described by Rosch. This is very different from Plato’s idea of abstract essences, and from Jung’s concept of archetypes, for reasons that we’ll explain below.
Here’s an example. We’ll return to the lay person’s schema we used above for a bird. This schema includes the key features that a bird is an animal, which has wings, has feathers, lays eggs, and can fly.
In prototype theory, the more of these key features a bird has, the more prototypical it is.
In the picture below, the column on the left contains prototypical birds – the American robin and the European robin. Both of them have all the key features in the list.
Prototypicality and category membership
Original images from Wikipedia; details at the end of this article
The next column contains birds that are unusual, namely a takahe and an ostrich; both these birds are flightless, so they’re missing one of the key features in the lay person’s schema. They’re still definitely birds, though.
The next column along contains a bird that is flightless, and whose feathers are quite different from “normal” feathers, and whose wings are quite different from “normal” wings.
The last column contains two examples that are not birds, even though they can both fly.
In more formal language, the folk category of “bird” is a fuzzy set, i.e. a set which doesn’t have clear-cut either/or boundaries. Membership of the “bird” category is decided by the fuzzy test of having most or all of several key features, each of which is itself fuzzy.
This is a classic recipe for confusion, and when you look at early classification systems and folk classification systems, you find exactly the sort of oddities that you might expect – for instance, some types of aquatic bird being categorised as fish for the purposes of Catholic Friday observances.
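The idea of graded membership can be sketched in a few lines of Python. This is a deliberately crude illustration, assuming that each key feature can be scored between 0.0 (absent) and 1.0 (fully present) and that the scores are simply averaged; prototype theory does not prescribe this particular formula. The feature list comes from the lay schema above, but every number is our own guess, and the bat is an assumed stand-in for a flying animal that is not a bird.

```python
# Key features from the lay person's schema for a bird.
BIRD_FEATURES = ("is_animal", "has_wings", "has_feathers", "lays_eggs", "can_fly")

# Degree (0.0 to 1.0) to which each example shows each feature.
# All values are illustrative guesses; missing features count as 0.0.
EXAMPLES = {
    "European robin": {"is_animal": 1.0, "has_wings": 1.0, "has_feathers": 1.0,
                       "lays_eggs": 1.0, "can_fly": 1.0},
    "ostrich": {"is_animal": 1.0, "has_wings": 1.0, "has_feathers": 1.0,
                "lays_eggs": 1.0, "can_fly": 0.0},
    # A flightless bird with unusual wings and feathers (not a named species).
    "unusual flightless bird": {"is_animal": 1.0, "has_wings": 0.5,
                                "has_feathers": 0.5, "lays_eggs": 1.0},
    # Assumed example of a flying non-bird.
    "bat": {"is_animal": 1.0, "has_wings": 1.0, "can_fly": 1.0},
}


def membership(degrees, schema=BIRD_FEATURES):
    """Average the degree to which each schema feature is present."""
    return sum(degrees.get(feature, 0.0) for feature in schema) / len(schema)


for name, degrees in EXAMPLES.items():
    print(f"{name}: {membership(degrees):.2f}")
```

With these guessed numbers the robin scores highest and the ostrich a little lower, while the unusual flightless bird and the bat end up with the same score even though only one of them is a bird. That tie is the kind of oddity that feature-based folk categories produce.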
There are also various more surprising findings that emerged from work on prototype theory. For instance, humans can learn what the prototypical example of a category would be like, even if they have never encountered that prototypical example.
The image below shows how this can happen. Someone who sees the almost-complete shape can infer what a “proper” example would look like.
Example of shapes, and a learned prototype
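One simple way to see how a never-encountered prototype could be inferred is to count which features turn up in most of the examples. The sketch below assumes, purely for illustration, that each shape seen is a square with one side missing, and that a feature belongs to the prototype if it appears in more than half of the examples. The shapes, the threshold and the function name `infer_prototype` are our own assumptions, not a description of the actual cognitive mechanism.

```python
from collections import Counter

# Hypothetical examples: each is a square with one side missing.
SEEN_SHAPES = [
    {"top", "right", "bottom"},
    {"top", "right", "left"},
    {"top", "bottom", "left"},
    {"right", "bottom", "left"},
]


def infer_prototype(examples, threshold=0.5):
    """Keep each feature that appears in more than `threshold` of the examples."""
    counts = Counter(feature for example in examples for feature in example)
    return {f for f, n in counts.items() if n / len(examples) > threshold}


prototype = infer_prototype(SEEN_SHAPES)
print(sorted(prototype))
print(prototype in SEEN_SHAPES)
```

Every side appears in three of the four examples, so the inferred prototype is the complete four-sided shape, even though no complete shape was ever among the examples.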
This looks superficially very similar to Plato’s idea of the abstract essence, and to Jung’s idea of the archetype. The difference is that prototype theory provides an explanation for the phenomenon using observable cognitive mechanisms, without needing to invoke speculative concepts such as Plato’s imaginings about a mystical other world, or Jung’s guesses about a racial subconscious.
Some issues arising from folk taxonomies have significant implications with regard to value and belief systems. Traditional value and belief systems tend to favour crisp, binary, either/or categorisations, such as ritually clean versus ritually unclean, or male versus female, or human versus animal. They usually have a strong dislike for anything that doesn’t fit unequivocally into one or other of the pigeonholes within the belief system.
Folk taxonomies often include similar undertones of moral judgement, with the prototypical cases being viewed as “proper” and the least prototypical cases being treated as somehow wrong or abnormal. Gordon has blogged about this topic in an earlier article:
The bigger picture
From the issues identified above, it’s clear that the way in which people learn, aggregate and organise their knowledge has major implications for education and related fields.
Precisely because these implications are so important and far-reaching, anyone who wants to build them into a proposed education policy should have a very thorough understanding of what the relevant processes are and of how they work.
For instance, any attempt to present information to children in the way best suited to help them learn should be based on a detailed understanding of the difference between chunking, implicit learning, schema theory and prototype theory. As a specific example, any model of teaching children to read needs to be based on a thorough knowledge of the relevant literature, down to the level of how the human visual system does or doesn’t use chunking when it processes graphemes. The key principles in this example were identified by Hubel and Wiesel long before the current crop of graduates were born, and there is a considerable body of sophisticated research into reading that dates back even further, so the information is available for anyone who needs it.
We hope that this article will help clarify some of the issues involved in the current education debate. We also hope that it will help steer that debate towards solidly established specialist literature, and away from popular misconceptions that suit the political preferences of the government of the day.
Notes, related articles and links
You’re welcome to use Hyde & Rugg copyleft images for any non-commercial purpose, including lectures, provided that you state that they’re copyleft Hyde & Rugg.
There’s more about the theory behind this article in my latest book, Blind Spot, by Gordon Rugg with Joseph D’Agnese
Categorisation and morality
Sources for the images above