By Gordon Rugg & Gavin Taylor
That’s not exactly the most inspiring title ever written, but sometimes humble examples illustrate much more profound principles, as when the eminent scientist Michael Faraday used an ordinary candle to demonstrate some of the key concepts in chemistry.
This particular case isn’t quite so illustrious, but it’s still a lot more interesting than it might appear, and a good illustration of some important principles about how best to do research, whichever field you’re working in. It’s also a good example of mistakes to avoid…
The story so far… We’re working on the D’Agapeyeff Cipher, a cipher that has remained uncracked since its publication in 1939. One source of clues is the set of coding methods covered in the book where it appeared. Unfortunately, that book doesn’t list those methods in its table of contents.
So, we’ve ended up having to go through the book page by page, listing which codes are covered and in how much detail. It’s a classic part of research: unglamorous, methodical legwork. It’s also a good illustration of a couple of common mistakes that occur in research. Those mistakes are the main topics of this article. We’ll return to the list of contents in a later article. First, the mistakes.
First common mistake: If it’s not on the Internet, it either doesn’t exist, or it isn’t worth bothering with.
Gordon knows several people who have worked on the D’Agapeyeff Cipher. They all took it for granted that they would start by buying a first edition copy of D’Agapeyeff’s 1939 book Codes and Ciphers, which is where the Cipher appeared. This isn’t because they’re technophobes harking back to some idealised past; in fact, all of them have done work at the sharp end of innovative computing.
Serious researchers tend to be perfectionists – a theme that we return to later in this article – and they typically want to go back to the original evidence, not to a second-hand version of the evidence. After you’ve done this a few times, you start being pretty sceptical about second-hand versions; the original version is often strikingly different. This principle applies to printed second-hand versions as well as to Internet-based versions, but the Internet makes it very easy for half-true versions to spread fast and wide.
Another issue is that a lot of information isn’t on the Internet yet, and quite possibly will never make it onto the Internet. Codes and Ciphers is a classic example. It’s not yet out of copyright, so it’s not available on sites such as Project Gutenberg. It’s not a best-seller, so it’s not available as an e-book. Even if it did appear as an e-book, a serious researcher would worry about whether there had been errors when the book was scanned in. Here’s an example from Gordon’s work with Search Visualizer.
“In placing the light artillerists from the Army of Tennessee on duty as infantry, you will assure both officers and men that such assignment is only temporary, and they will be returned to their proper arm of the service as soon as gnus can be obtained for them.”
(From the official war records from the American Civil War.)
That particular example is easily identifiable as a typo (although there’s something endearing about the thought of issuing Army of Tennessee light artillerists with gnus). In the case of a cipher, though, a typo isn’t usually easy to identify, and may make a decipherment impossible. So, there are sound reasons for going back to the original sources.
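To see just how much damage a single transcription error can do, here is a minimal sketch using an autokey cipher. (This is a hypothetical illustration, not an example from D’Agapeyeff’s book.) In an autokey cipher, the keystream is a short primer followed by the plaintext recovered so far, so one mistyped ciphertext letter corrupts not only its own position but later positions too:

```python
def autokey_decrypt(ciphertext, primer):
    """Decrypt an A-Z autokey cipher: the keystream is the primer
    followed by the plaintext recovered so far."""
    keystream = list(primer)
    plaintext = []
    for i, c in enumerate(ciphertext):
        shift = ord(keystream[i]) - ord('A')
        p = chr((ord(c) - ord('A') - shift) % 26 + ord('A'))
        plaintext.append(p)
        keystream.append(p)  # the recovered letter extends the keystream
    return ''.join(plaintext)

# "ATTACKATDAWN" enciphered with primer "QUEEN" gives "QNXEPKTMDCGN".
print(autokey_decrypt("QNXEPKTMDCGN", "QUEEN"))  # ATTACKATDAWN

# One mistyped letter (X -> Y at position 2) garbles that position
# *and* a later one, because the error re-enters the keystream:
print(autokey_decrypt("QNYEPKTMDCGN", "QUEEN"))  # ATUACKASDAWN
```

In a longer message, each such error spawns another one a primer-length further on, so a solver working from a mistranscribed ciphertext may be chasing a text that no key will ever decrypt cleanly.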
Once you start going back to the original sources, you can make some unexpected discoveries. A classic example from the world of codebreaking is the long-lost Book of Soyga. (Yes, seriously. Anything even vaguely connected to the Voynich Manuscript story is usually weirder than anything that Dan Brown could ever produce.) This was a book that obsessed the Elizabethan polymath Dr John Dee. It was believed lost for centuries, until Deborah Harkness, an expert on Elizabethan literature, happened to be in the Bodleian Library in Oxford. She tried searching the manual card index there for one of the book’s alternative titles (it’s also known as the Book of Agyos and as the Aldaraia). She discovered that two copies had been sitting in peaceful obscurity on the shelves of the Bodleian for the last few hundred years.
Rich SantaColoma has been doing some very interesting archival work looking at original records about the Voynich Manuscript. Some of his findings are described in Gordon’s book Blind Spot, including a tantalising reference to a mysterious book owned by Edward Kelley, one of the prime suspects for producing the Voynich Manuscript. What Rich has found has often been very different from the standard accounts on Internet sites. If you’re interested in the Voynich Manuscript, his site is well worth visiting.
A lot of Gordon’s work has also involved getting back to original evidence, but in a different way. In his archaeology work, he’s interested in a range of topics such as the origins of handedness, and ways of measuring technological complexity, where he’s ended up having to do hands-on replication of early technologies such as flint working and bronze working. You learn a lot from the hands-on work that you wouldn’t get any other way.
So, the Internet can be helpful, but it has serious limitations, and if you’re a serious researcher, then you need to master the other sources of information in your field. That leads into the next common mistake, which often stops people from becoming serious researchers.
Second common mistake: Fear of failure
One of the most common reasons for people dropping out of research careers involves mistakes. A lot of potentially excellent researchers react traumatically to their first encounter with hostile feedback, or their first public mistake. That’s a tragic waste of human potential, particularly because making mistakes is a key part of good research.
Contrary to what most people expect, experts usually make more mistakes than novices. The TV series House is an excellent example of how this happens; even though it’s fiction, it’s got the key point absolutely right. The usual plotline involves Gregory House tackling a baffling medical case by trying one possible diagnosis after another until he finds the correct diagnosis in the final ten minutes. He couldn’t reach that right, rare diagnosis without first working through the wrong but more probable ones. When you look at it this way, the process isn’t actually about mistakes in the strict sense; instead, it’s about eliminating possible answers that happen not to be correct.
So, if you plan your research correctly, whatever you find will be useful, either because it eliminates one of the possibilities that hadn’t previously been ruled out, or because it identifies a promising route forward.
The phrase “planning your research correctly” raises the spectre of a second type of mistake, where you actually do simply get something wrong – you miss a key reference when you’re working through the previous literature, or you mis-type a key figure in your data, for instance. If you miss a key reference, then you might waste months or years trying to tackle a problem that’s already been solved; worse, you might look like a glory hound who is trying to claim credit for work that’s already been done by other people. If you mis-type a key figure in your data, then other researchers who accept that figure as a starting point in their own research will be building on a flawed foundation from the outset.
This is why experienced, successful researchers typically have these three characteristics.
They’re perfectionists. One of Gordon’s PhD students included the following classic line in the acknowledgments section of his PhD dissertation: “I would like to thank my supervisor, Dr Gordon Rugg, for teaching me that perfection would do at a pinch.” Often, that attention to detail is what produces the solution to a problem, where previous researchers have glossed over an apparently trivial detail that is actually the key issue. In the case of Gordon’s Voynich Manuscript work, for instance, one key issue was the distinction between randomly generated gibberish and quasi-randomly generated gibberish.
Despite this, they accept that they will make mistakes. The alternative is paralysis by perfectionism. Quite a few promising researchers spend so long trying to produce the utterly perfect piece of research that they never produce anything. You need to accept that sooner or later, you need to stop checking for errors, and submit the paper, or whatever it is that you’re producing.
They’re sceptical of everyone’s findings, including their own. The late, great physicist Richard Feynman was legendary for working things out for himself, rather than simply trusting what he read in the literature. The more you learn about how the literature is produced, the more wary you become of trusting it uncritically, and the more discerning you become about assessing the trustworthiness of a piece of published research.
These characteristics can be liberating; they allow you to combine quality with realism. They’re also invaluable for ego-free design and ego-free research. In both of these, the key point is that you criticise the concepts (the design, or the research) not the person who produced them. This makes it much easier to weed out flaws and to produce something much better as a result. Some organisations which use these approaches actually have celebrations for the worst idea of the week/month/project, to reinforce the message that making the right types of mistake is an important part of finding the best solution.
So, returning to the theme of this article, we’re posting the list of codes and ciphers that D’Agapeyeff mentions in his book Codes and Ciphers. The list also includes notes about whether each mention is a passing mention or a detailed one, and whether it includes a worked example.
Being human, we’ve probably made one or two mistakes. If you spot any, please let us know, and we’ll amend the list. We’ll publish it on the Hyde & Rugg website later.
A couple more points before the list itself:
If you’d like to find out more about the issue of error in research, it’s a recurrent theme in Gordon’s book with Joe D’Agnese, Blind Spot, available on Amazon.
If you’d like to find out more about assessing publications in the literature, there’s in-depth coverage of this topic in one of Gordon’s books with Marian Petre.
The table of contents
Contents of the 1939 first edition of D’Agapeyeff’s Codes and Ciphers:
Table collated by Gavin Taylor and Gordon Rugg
Key to annotations
D# Diagram – diagram or full table
C# Corpus – text used in a worked example
B# Book reference – mention of another book
A# Author mention – mention of another author or cryptographer
E# Example – worked example of a cipher
In the next episode, Gavin finds a typo in Codes and Ciphers. It’s more significant than it might sound…