By Gordon Rugg
This article is the first in a short series about things that look complex, but which derive from a few simple underlying principles. Often, those principles involve strategies for reducing cognitive load. These articles are speculative, but they give some interesting new insights.
I’ll start with Inuit tactile maps, because they make the underlying point particularly clearly. I’ll then look at how they share the same deep structure as satirical caricatures, and then consider the implications for other apparently complex and sophisticated human activities that are actually based on very simple processes.
By Gustav Holm, Vilhelm Garde – http://books.google.com/books?id=iDspAAAAYAAJ, Public Domain, https://commons.wikimedia.org/w/index.php?curid=8386260
One of the most striking discoveries from early work in Artificial Intelligence (AI) was that some of the skills long viewed as pinnacles of human intelligence could actually be replicated quite easily with very basic software. A classic example is chess, long viewed as a supreme triumph of pure intellect. Chess actually turned out to be easy to automate.
There were similar discoveries from classic work in the 1970s and 1980s about human judgment and decision-making. For tasks such as predicting the likelihood of a bank’s client defaulting on a loan, simple mathematical models turned out to be better predictors than experienced bank managers.
So, just because human beings think that something is complex and sophisticated doesn’t mean that it necessarily is complex and sophisticated. Often, very simple explanations turn out to be surprisingly accurate and powerful. This is the case with caricature drawing, which can be explained using a couple of very simple principles.
One of those principles is widely used in map making. The Inuit tactile map in the image below shows a stretch of coastline. The map is intended to be kept within a mitten, so that in cold weather you can consult it without having to take your hand out of the mitten.
https://en.wikipedia.org/wiki/Ammassalik_wooden_maps (cropped for clarity)
Each protruding part of the map corresponds to a headland protruding into the sea; each concave part of the map corresponds to a bay. The map can also show prominent mountains, etc, because it’s three-dimensional. This link shows how the tactile map corresponds to a paper map of the same area.
The tactile map doesn’t use the same scale for the distances between headlands as it does for how far the bays go inland, or for the steepness of the terrain. This sort of inconsistency is quite common in map making. One well known reason is the problem of displaying curved sections of a spherical world on a flat surface, which becomes a serious problem when trying to show near-polar latitudes in the widely used Mercator projection, where areas are stretched more and more as you approach the poles.
A less well known issue is that vertical relief is often deliberately exaggerated in raised relief maps, such as the one below. Often, the vertical relief is exaggerated by a factor of 5 or 10.
A raised relief map
Because the exaggerated version often falls within the range of steepness that actually occurs within real mountainous landscapes, the exaggeration can easily go unnoticed.
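A rough way to see why the exaggeration goes unnoticed is to work out what it does to a slope angle. The sketch below is illustrative only: it assumes that the horizontal scale is left unchanged and that every height is multiplied by a single factor, and the function name and sample slopes are made up for the example.

```python
import math

def exaggerated_slope_degrees(true_slope_degrees, vertical_factor):
    """Slope angle on a relief model whose heights are multiplied by vertical_factor.

    Assumes the horizontal scale is untouched, so only the rise changes.
    """
    rise_over_run = math.tan(math.radians(true_slope_degrees))
    return math.degrees(math.atan(vertical_factor * rise_over_run))

# Hypothetical gentle slopes, shown with the two factors mentioned in the text
for true_slope in (2, 5, 10):
    for factor in (5, 10):
        shown = exaggerated_slope_degrees(true_slope, factor)
        print(f"{true_slope} degree slope, x{factor} relief -> {shown:.1f} degrees")
```

Under these assumptions, a gentle 5 degree slope with its relief exaggerated by a factor of 5 comes out at roughly 24 degrees. That is steep, but it is still the sort of slope that real mountains have, so nothing about it looks wrong.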
Why do mapmakers do this? Because of the way the human sensory system and cognitive system work. The exaggeration makes it easier to process the image or model, and to identify key features in it.
How does this relate to caricatures? Because exactly the same process of systematic exaggeration can be used to generate a caricature from an image via software, without any need for a human artist. The steps involved are as follows. I’ve used profiles because they make it easier to demonstrate the process.
Step 1: Measure some pictures of human faces, as shown below in blue, with red measurement lines.
Step 2: Calculate the average value of each measurement, to produce an average profile (shown in black).
Original black profile image By SimonWaldherr – Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=16531195
Step 3: Compare the face that you want to caricature with the average profile, as shown below, where the target face is in beige.
Step 4: Exaggerate the differences between the target face and the average profile. You can do this in various ways; for instance, if the nose in the target face protrudes 3 millimetres from the average profile, you could double this distance to make it protrude 6 millimetres, or you could square it to 9 millimetres, to make it particularly prominent. I’ve gone for a comparatively small exaggeration, to illustrate the uncanny valley issue, described below.
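The four steps above can be sketched in a few lines of code. This is a minimal illustration, not the software used to produce the images in this article. It assumes that each profile has already been reduced to a list of measurements in millimetres, taken at the same points on every face. The function names, the sample numbers and the choice of a simple multiplying factor are all assumptions made for the example.

```python
from statistics import mean

def average_profile(profiles):
    """Step 2: average each measurement across all the measured profiles."""
    return [mean(values) for values in zip(*profiles)]

def caricature(target, average, factor=2.0):
    """Steps 3 and 4: find each difference from the average, then scale it.

    factor=1.0 returns the target unchanged; larger values exaggerate more.
    """
    return [avg + factor * (t - avg) for t, avg in zip(target, average)]

# Step 1: hypothetical measurements in millimetres, in the order
# forehead, nose, lips, chin
sample_profiles = [
    [10.0, 30.0, 18.0, 12.0],
    [12.0, 34.0, 20.0, 14.0],
    [14.0, 32.0, 22.0, 16.0],
]
average = average_profile(sample_profiles)

# A target face whose nose protrudes 3 mm more than average,
# and whose chin recedes 3 mm more than average
target = [12.0, 35.0, 20.0, 11.0]
print(caricature(target, average, factor=2.0))
```

With a factor of 2, the nose that protruded 3 millimetres beyond the average now protrudes 6 millimetres beyond it, and the receding chin recedes twice as far. Squaring the difference instead of doubling it would need one extra detail: the sign of the difference has to be kept, so that a receding feature doesn’t turn into a protruding one.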
The images below show the before and after versions of the original target image.
Here’s an ancient Roman example for comparison; caricatures have been around for a long time…
Photo by Vincent Ramos – Vincent Ramos, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1137967
The example above shows how the principle of caricature works, but it doesn’t discuss how far the exaggeration should be pushed. That’s where the second underlying principle comes in.
A useful guideline for caricatures is to push the exaggeration far enough to take it to the other side of the uncanny valley. The uncanny valley is the uncomfortable space between two categories. Usually, this is used in the context of the border territory between human and not-human, such as lifelike models of human beings, or part-human monsters such as werewolves, vampires and zombies. However, the uncanny valley also occurs in other contexts; I’ve blogged about these contexts in this article.
Caricatures are usually well on the far side of the uncanny valley, which reduces uncertainty and cognitive load for the viewer. “Well on the far side” can be measured; as a rough rule of thumb, it’s about three standard deviations or more from the mean, corresponding to a level that most people would see only rarely, or never, in a lifetime.
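The three standard deviations rule of thumb can be checked with a standard score. The sketch below is a rough illustration under stated assumptions: the sample measurements are invented, the function names are hypothetical, and a real check would need measurements from a much larger set of faces.

```python
from statistics import mean, stdev

def z_score(value, population):
    """How many standard deviations a value lies from the population mean."""
    return (value - mean(population)) / stdev(population)

def is_well_past_uncanny_valley(value, population, threshold=3.0):
    """Rule of thumb from the text: about three standard deviations or more."""
    return abs(z_score(value, population)) >= threshold

# Hypothetical nose measurements in millimetres from a set of ordinary profiles
nose_measurements = [30.0, 31.0, 32.0, 33.0, 34.0]

for candidate in (34.0, 38.0):
    z = z_score(candidate, nose_measurements)
    verdict = is_well_past_uncanny_valley(candidate, nose_measurements)
    print(f"{candidate} mm: {z:.2f} standard deviations, clear caricature: {verdict}")
```

In this made-up sample, a 34 millimetre nose is only a little over one standard deviation from the mean, so it is still an ordinary nose, whereas a 38 millimetre nose is well past the threshold of three.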
Conclusion and further thoughts
Silhouette caricatures can be produced using just a couple of simple underlying principles. Those same principles can be applied to three-dimensional caricatures, like the heads of puppets. You just need to measure the distance between the centre of the head and each point on the skin, and apply the same principles of exaggerating the differences between an average head and the head that you are caricaturing in the puppet. Yes, that involves a lot of measurements, but it’s just a case of repeating the same basic steps a lot of times, rather than having to invoke any new, complex principles.
So, what might look like a complex triumph of human skill can be modelled using just a couple of simple underlying principles.
With caricatures, a key feature is that the end product is unambiguously on the far side of the uncanny valley. What happens when it isn’t?
When the end product is within the uncanny valley, then by definition it will produce an uncomfortable feeling in most viewers. (The question of why it only produces this feeling in most viewers, rather than all, is one which I’ll revisit in a later article.)
There’s a very different reaction when the end product is about one to two standard deviations from the mean; for example, when a man is about six feet tall. This tends to be a “sweet spot” for favourable perceptions, before reaching the “too much of a good thing” level. I’ve blogged about this topic in this article and this article, and gone into more depth in my book Blind Spot.
This in turn raises broader questions about what people want, and about what regularities there are within those desires, and about the implications for the entertainment industry and for social norms and for politics. Those, though, are questions for later articles…
You’re welcome to use Hyde & Rugg copyleft images for any non-commercial purpose, including lectures, provided that you state that they’re copyleft Hyde & Rugg.
There’s more about the theory behind this article in Gordon’s latest book:
Blind Spot, by Gordon Rugg with Joseph D’Agnese