The Rugg and Taylor “Cryptologia” article on the Voynich Manuscript

By Gordon Rugg and Gavin Taylor

We’ve recently had an article published in Cryptologia about our work on the Voynich Manuscript, which was discussed in New Scientist. The Cryptologia article is behind a paywall, so in this article we’ve summarised the key points, for anyone who wants some more detail.

The background

Our involvement with the Voynich Manuscript started when Gordon needed a test of concept for the Verifier method that he had developed with Jo Hyde, for detecting errors in previous research into hard, unsolved problems.

The Voynich Manuscript is a book in a unique script, with odd illustrations, which had previously been believed to be an undeciphered text, either in an unidentified language or in an uncracked code. There were serious problems with both those explanations for the manuscript. If it was an unidentified language, then it was an extremely strange one. If it was an uncracked code, then it was either astonishingly sophisticated, or was based on a very unusual set of principles. The third main possibility, namely that the manuscript contained only meaningless gibberish, had been generally discounted, because there are numerous odd statistical regularities in the text of the manuscript, which everyone believed were much too complex to have been hoaxed.

Gordon’s work showed that this belief was mistaken, and that the most distinctive qualitative features of the Voynich Manuscript could be replicated using low-tech hoaxing methods. This resulted in an article in Cryptologia in 2004.

Gordon’s initial work, however, did not address the quantitative statistical regularities of the text in the manuscript.

Our recent article in Cryptologia addresses this issue, and shows how the most distinctive quantitative features of the Voynich Manuscript can be replicated using the same low-tech hoaxing methods as Gordon’s previous work. These features arise as unintended consequences of the technology being used, which produces statistical regularities as unplanned but inevitable side-effects.

Taken together, these two articles show that the key unusual features of the Voynich Manuscript can be explained as the products of a low-tech mechanism for producing meaningless gibberish.


The Torsten Timm Voynich article

By Gordon Rugg

There’s a new article about the Voynich Manuscript. It’s by Torsten Timm, and it’s on arXiv.

It’s 70 pages long; the abstract claims that: “As main result, the text generation method used will be disclosed”.

That’s a big claim.

In brief, the mechanism proposed in this article looks fairly sensible at first sight – it basically consists of a way of generating new words from a particular word, using a set of rules about what can be substituted for what. It’s low tech and simple, and it can produce something that looks like Voynichese.
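To make the general idea of rule-based word variation concrete, here is a minimal sketch in Python. It is not Timm's method: the substitution table, the seed word and the function names are all invented for illustration, and the real proposal involves considerably more detail than this.

```python
import random

# Hypothetical substitution rules, invented purely for illustration.
# They are NOT the rules proposed in Timm's article.
SUBSTITUTIONS = {
    "a": ["o"],
    "o": ["a"],
    "k": ["t"],
    "t": ["k"],
    "r": ["l"],
    "l": ["r"],
}


def vary_word(word, rng):
    """Return a copy of word with one substitutable letter swapped."""
    positions = [i for i, ch in enumerate(word) if ch in SUBSTITUTIONS]
    if not positions:
        return word  # nothing in this word can be substituted
    i = rng.choice(positions)
    replacement = rng.choice(SUBSTITUTIONS[word[i]])
    return word[:i] + replacement + word[i + 1:]


def generate_words(seed_word, count, seed=0):
    """Grow a word list by repeatedly varying words already in the list."""
    rng = random.Random(seed)
    words = [seed_word]
    while len(words) < count:
        source = rng.choice(words)  # pick any word generated so far
        words.append(vary_word(source, rng))
    return words


if __name__ == "__main__":
    print(" ".join(generate_words("otal", 10)))
```

Even a toy like this produces a stream of similar-looking words, which is why the approach looks plausible at first sight. Whether it reproduces the manuscript's statistical details is a separate question.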

However, as usual, the devil is in the detail, and I’m not convinced that this method provides a good explanation for the odd statistical details of the Voynich Manuscript. For instance, it doesn’t provide a compelling argument for why words in the first half of a line tend to be different in length from words in the second half of a line. I was also unconvinced by the explanation on page 18 for the “dialects” and the handwriting differences in the Voynich Manuscript: ‘The difference between both “languages” may only be that the scribe changed his preferences while writing the manuscript.’
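The line-position effect mentioned above is the kind of thing that can be checked with a few lines of code. The sketch below is one simple way of measuring it, assuming a transliteration is available as a list of text lines with words separated by spaces. The function name and the decision to put the middle word of an odd-length line into the second half are my own assumptions.

```python
def mean_length_by_half(lines):
    """Return (mean word length in first half of lines,
    mean word length in second half of lines)."""
    first, second = [], []
    for line in lines:
        words = line.split()
        mid = len(words) // 2  # odd counts: middle word goes to second half
        first.extend(len(w) for w in words[:mid])
        second.extend(len(w) for w in words[mid:])

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(first), mean(second)


if __name__ == "__main__":
    # Made-up input, not text from the manuscript.
    print(mean_length_by_half(["aaaa bbbb cc dd"]))  # (4.0, 2.0)
```

Any proposed generation method can be tested by running the same measurement on its output and on the manuscript.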

This article also sits awkwardly on the fence about whether the Voynich Manuscript contains only meaningless gibberish or also contains meaningful material. That’s a significant issue for how the text of the manuscript was actually generated, and leaving it unresolved doesn’t fit comfortably with the claim that the article shows how the text was generated.

There’s no mention of significant work by previous researchers on the idea that the Manuscript might contain coded material concealed among gibberish padding text. In addition to my own discussion of this idea (and the problems with it) in Cryptologia, it has been discussed and investigated in some depth by other Voynich Manuscript researchers who aren’t mentioned in Timm’s article.

So, in summary, it’s an interesting idea, and there are some sensible, interesting suggestions, but there are some major gaps in its references to previous work, and I don’t think it provides a compelling explanation for the statistical oddities that are a key feature of the Voynich Manuscript.


One hundred Hyde & Rugg articles, and the Verifier framework

By Gordon Rugg

This is the 100th post on the Hyde & Rugg blog. We’re taking this opportunity to look back at what we’ve covered and look forward to what comes next.

The image below shows some of the main themes and outputs so far, in the “knowledge cycle” format that underlies our Verifier framework for tackling human error. If you’ve come to this blog after reading Blind Spot, you might be pleased to discover that we’ve been covering the contents of Verifier here in more depth than was possible in the book, and that we’re well on the way to a full description.

The knowledge cycle, and topics that we’ve blogged about

Hoaxing the Voynich Manuscript, part 7: Producing the text

By Gordon Rugg

The six previous articles in this series looked at the component parts of a hoax. This article shows how those components can be put together, to produce the text for a large document consisting of meaningless gibberish. This process is much the same regardless of which script you use for that gibberish, and regardless of which illustrations you use. The script and illustration issues are discussed in article 8, which I’ve already published.

There are a few key points about this hoaxing process that are absolutely central to understanding why it gives new insights into Voynich Manuscript research. These points are:

  • This process isn’t random.
  • This process isn’t deterministic – there isn’t an algorithm that would let a future researcher reproduce the text within a given page using the same table and grille.
  • This process produces numerous complex statistical regularities in the output text as completely unintended side-effects of a very simple production process.

This method is fast and easy to use. You can generate meaningless gibberish text as fast as you can write it down. I’ve produced quasi-copies of various pages from the Voynich Manuscript, where I’ve copied the original illustration, and generated the appropriate amount of meaningless gibberish text to match the amount of text in the original page. It consistently took about an hour and a half per page. More time spent on text within a page was balanced by less time spent on illustration and vice-versa, so each page took about the same time regardless of whether it was mainly text, mainly picture or a mixture.

At that rate, one person working alone could produce a document as long as the Voynich Manuscript (about 240 pages) in under ten weeks.
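The arithmetic behind that estimate can be written out as follows. The page count and the time per page come from the figures above; the 40-hour working week is my assumption for illustration, not a figure from the experiments.

```python
PAGES = 240            # approximate length of the Voynich Manuscript
HOURS_PER_PAGE = 1.5   # observed time per quasi-copied page
HOURS_PER_WEEK = 40    # assumption: a 40-hour working week

total_hours = PAGES * HOURS_PER_PAGE  # 360.0 hours in total
weeks = total_hours / HOURS_PER_WEEK  # 9.0 weeks at the assumed rate

if __name__ == "__main__":
    print(f"{total_hours:.0f} hours, or about {weeks:.0f} weeks")
```

With a shorter working week the total rises: at 36 hours a week, 360 hours is exactly ten weeks.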

Here’s how the method works.

Applying the Bax proposed solution

By Gordon Rugg

Stephen Bax’s article provides provisional “real” transliterations for over half the commonly used letters in the Voynich Manuscript’s alphabet. If his transliteration is even approximately correct, that should be enough to give some useful insights when applied to a page from the manuscript.

I’ve tried that, and the results are unconvincing. For instance, according to his transliteration, about half the words in one of the pages he analysed end in the letter “r”.

A language where half the words end in “r”? Even in a Latin page crammed with third person passives, that would take a lot of doing. There’s a lot more that’s strange about what emerges.
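The word-ending check described above is easy to reproduce for any transliterated page. Here is a minimal sketch, assuming the transliterated text is a plain string with words separated by whitespace. The function name and the sample string are invented for illustration, and the sample is not a real transliteration.

```python
def final_letter_share(text, letter):
    """Return the fraction of words in text that end with letter."""
    words = text.split()
    if not words:
        return 0.0
    return sum(w.endswith(letter) for w in words) / len(words)


if __name__ == "__main__":
    # Made-up sample, not text produced by Bax's transliteration.
    sample = "kar tor dal sen"
    print(final_letter_share(sample, "r"))  # 0.5
```

Running the same function on ordinary text in a candidate language gives a baseline for judging how unusual a figure of one half is.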

If this is a decipherment, as claimed by the press release, or even a partial decipherment, as claimed by the actual article, then it’s an interesting use of the word “decipherment”.

