Applying the Bax proposed solution

By Gordon Rugg

Stephen Bax’s article provides provisional “real” transliterations for over half the commonly used letters in the Voynich Manuscript’s alphabet. If his transliteration is even approximately correct, that should be enough to give some useful insights when applied to a page from the manuscript.

I’ve tried that, and the results are unconvincing. For instance, according to his transliteration, about half the words in one of the pages he analysed end in the letter “r”.

A language where half the words end in “r”? Even in a Latin page crammed with third person passives, that would take a lot of doing. There’s a lot more that’s strange about what emerges.

If this is a decipherment, as claimed by the press release, or even a partial decipherment, as claimed by the actual article, then it’s an interesting use of the word “decipherment”.

The story begins with the idea that possible interpretations of ten words in a document more than two hundred pages long count as enough of a decipherment to merit a press release. (Italics are in the original.)

[Image: bax solution v3]

The original article doesn’t claim to identify a language, or even a language family.

[Image: bax solution language]

That’s actually a more reasonable claim than it might appear. Bax begins by trying to find out possible transliterations for Voynichese words that he believes to be the names of plants and a constellation. Names need to be treated with care in linguistics, since they’re disproportionately likely to come from another language. English, for instance, has a lot of names for stars that come from Arabic, which belongs to a completely different language family. However, foreign-language names might give you transliterations for the script that will then let you get further into the language that you’re really interested in.

So far, so good.

When Bax applied this method to a number of illustrations, he came up with possible readings for ten words, which gave him possible values for more than half the commonly used letters in the Voynich alphabet (fourteen out of about twenty – the precise number depends on the definition you choose).

The next step is to plug those letter values into some chunks of text from the Manuscript, and see what you get. Bax viewed this as future work, and only plugged the letter values into other possible plant names, producing his reading of “kaur” and his reading for “cotton”. The latter raised problems that he acknowledges in his article, since the plant image on the page with a name apparently meaning “cotton” didn’t look like a cotton plant.

That’s scarcely a surprise to anyone who’s familiar with the history of Voynich Manuscript research. Bax’s approach is just another variation on a theme that has been tried repeatedly over the years, and has failed repeatedly over the years, because it simply doesn’t fit the facts in any sort of sensible way.

When you plug some of his proposed transliterations into the Voynich Manuscript text, you soon (i.e. within the first half hour) start to see the problems with it.

Some problems with the obvious approach

It sounds sensible that the first word on each page with a picture of a plant (the “plant pages” for brevity) will be the name of the plant in the picture. However, if you look at the plant pages, then you spot something odd. A high proportion of them start with one or other of the same two letters. We’re dealing with about a hundred plant pages, so it doesn’t take long to do the sums. That’s either an awful lot of plants which have names starting with one of just two letters, or those first words aren’t as straightforward as Bax has assumed.
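Those sums are easy to sketch in code. What follows is a minimal illustration, not the count behind the claim above: the words are invented placeholders rather than a real transcription, and `first_letter_shares` is a hypothetical helper name.

```python
from collections import Counter

def first_letter_shares(first_words):
    """Return {letter: share of pages whose first word starts with it}."""
    counts = Counter(word[0] for word in first_words if word)
    total = sum(counts.values())
    # most_common() orders the letters from most to least frequent.
    return {letter: n / total for letter, n in counts.most_common()}

# Invented placeholder words, one per imaginary plant page.
page_first_words = ["kab", "tor", "kel", "tam", "pod", "kus", "tir", "kan"]

shares = first_letter_shares(page_first_words)
top_two = sum(list(shares.values())[:2])
print(shares)
print(f"Share covered by the two commonest initials: {top_two:.0%}")
```

With a real transcription, the list would hold the first word of each plant page. A very high share for the top two initials is the pattern that makes the "first word is the plant name" reading hard to sustain.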

The obvious counter-claim, that these pages might simply start with the Voynichese equivalent of the word “The”, doesn’t hold up well either. You just don’t get the same word occurring at the start of various different plant pages.

So there are already problems with the idea that the first word is a name, and that the image gives you some insight into what that name might be.

This has been known for years in the Voynich research community, and is one reason that nobody in the community uses this approach in its simple form any more.

The body text

If you look closely at the body text of a plant page, you soon see more problems.

I’ve tried plugging Bax’s letter values into one of the pages he analysed, folio 3v, shown below. If his transliteration is correct, then about half the words in the body text end with the letter “r” and most of the rest end in the letter “n”. Just two letters accounting for the endings of most of the words in a real language? If that’s correct, then just about every known language could be excluded as a candidate for the language of the manuscript, which would be extremely useful to researchers.
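The check itself is simple enough to sketch. The snippet below is only an illustration under stated assumptions: the glyph codes `g1` to `g5`, the letter values and the sample text are all made up, and none of them is Bax’s actual table or a real transcription of f3v.

```python
from collections import Counter

# Hypothetical glyph-to-letter table. The glyph codes and values are
# invented for illustration; they are not Bax's actual proposals.
GLYPH_TO_LETTER = {"g1": "k", "g2": "a", "g3": "r", "g4": "n", "g5": "u"}

def transliterate(word):
    """Map a word (a list of glyph codes) to a string; '?' marks unknowns."""
    return "".join(GLYPH_TO_LETTER.get(glyph, "?") for glyph in word)

def ending_shares(words):
    """Return {final letter: share of words ending in that letter}."""
    endings = Counter(transliterate(word)[-1] for word in words if word)
    total = sum(endings.values())
    return {letter: n / total for letter, n in endings.items()}

# Invented placeholder text: each word is a list of glyph codes.
text = [["g1", "g2", "g3"], ["g5", "g4"], ["g2", "g3"], ["g1", "g5", "g3"]]
endings = ending_shares(text)
print(endings)
```

Run against a real transcription and a real letter table, the same tally is what shows one or two letters dominating the word endings.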

Or, perhaps, Bax is simply wrong.

Here are some close-ups showing the text of the page in question. It’s f3v, whose plant he identifies as a hellebore, with a name he transliterates as “kauur” (with a schwa for the first “u”) or “kaur” when he searched for it with Google.

[Image: f3v version2]

First paragraph detail:

[Image: f3v upper para]

Second paragraph detail:

[Image: lower para detail]

(Original image courtesy of the Beinecke Library)

If you look at the page closely, then you start noticing that the same few letters occur over and over again at the ends of words. There’s the one that looks like the number “9” and the one that looks like a number “2” on a slant and the one that looks like a looped “x” – those are usual suspects for word endings. In this particular page, there are also a fair few instances of the character that looks like a curly “n”.

Whatever transliteration system you use for the text, you’re going to end up with only a very few letters at the ends of the words, because there are only a very few letters at the ends of words in the original text. However, since the Bax transliteration treats no fewer than three Voynichese characters as being an “r”, the result is that the number of word endings in the transliterations is even further reduced.
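The effect of a many-to-one mapping can be shown with a toy example. Everything here is an assumption made for illustration: the glyph codes `x1` to `x4` and the sample of word-final glyphs are invented, and the table only mimics the general shape of three characters sharing one value.

```python
# Hypothetical table in which three different glyphs all map to "r".
MERGED = {"x1": "r", "x2": "r", "x3": "r", "x4": "n"}

# Invented sample of word-final glyphs, one entry per word.
final_glyphs = ["x1", "x2", "x3", "x4", "x1", "x3"]

distinct_before = set(final_glyphs)                  # endings in the script
distinct_after = {MERGED[g] for g in final_glyphs}   # endings once mapped

print(len(distinct_before), "distinct endings in the script")
print(len(distinct_after), "distinct endings after transliteration")
```

The original variety of endings is already small, and collapsing several glyphs onto one letter can only shrink it further.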

You also notice that a lot of the words are very short, and that the letter that looks like a backward “S” occurs several times in isolation, as if it’s a word in its own right. That’s odd, because usually consonants don’t occur on their own in real languages; it’s usually vowels that occur on their own, like “I” and “a” in English, or “y” and “à” in French. The examples on this page of Voynichese might be a case of a single symbol standing for the combination of “s plus vowel”, but that doesn’t fit well with the comparatively small number of commonly used letters in the Voynich alphabet.

You get other oddities as well. There are words that begin with “nk” and words that end with “tn”. That’s unusual among the written versions of Indo-European languages. As with the isolated “S” words, this might be a case of one symbol standing for a combination of consonant plus vowel, but that hits the same problem of the comparatively small number of commonly used letters in the Voynich Manuscript.

There’s also a huge amount of repetition. This is particularly striking with another of Bax’s claimed readings, for the plant centaury. The four middle characters of the Voynichese word come out as “tuir” (with a schwa for the “u”) in his transliteration. These characters happen to be one of the most common words in Voynichese. It has some odd characteristics in its distribution, including the fact that it sometimes occurs twice or even three times in a row. You can get repetition in normal languages, but on nothing like this scale. If you start claiming that the Manuscript contains some form of incantation or poetry or both, then that’s hard to reconcile with Bax’s starting assumption that the Manuscript is what it appears at first sight to be, namely a straightforward document in prose whose main function is to transmit factual information. It’s also hard to reconcile with the sheer length of the Manuscript – over two hundred pages – and with the lack of end rhyme, alliteration or metre that would be expected with known types of poetry.
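Runs of the same word back to back are easy to look for mechanically. The sketch below assumes the text is already split into words; the tokens are invented placeholders, and `repeated_runs` is a hypothetical helper, not a tool from the Voynich literature.

```python
from itertools import groupby

def repeated_runs(words, min_len=2):
    """Return (word, run_length) for each run of identical adjacent words."""
    runs = ((word, sum(1 for _ in group)) for word, group in groupby(words))
    return [(word, length) for word, length in runs if length >= min_len]

# Invented placeholder tokens standing in for a line of transcribed text.
tokens = ["ab", "cd", "cd", "ef", "cd", "cd", "cd", "gh"]
runs = repeated_runs(tokens)
print(runs)
```

Comparing the number and length of such runs against the same tally for an ordinary prose text of similar size would be one way to put a figure on "nothing like this scale".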

When you dig into the Voynich literature, you find that these and numerous other issues were spotted decades ago, and have been analysed in considerable depth by some of the best brains in relevant fields. There are some features of letter distributions in the Voynich Manuscript that are very odd indeed, and that have been thoroughly analysed by code breakers who analyse letter distributions for a living.

The conclusion that those features force you into is that the “straightforward unidentified language” theory just doesn’t explain most of the puzzling anomalies within the text in the Manuscript.

In brief, the idea that the Manuscript is simply written in a language that nobody has identified yet is an idea that looks plausible at first sight, and that looks like an obvious starting place. However, the fact that it’s obvious should ring alarm bells to anyone wanting to try their luck with the Voynich Manuscript. The Manuscript has been studied in detail for over a century. Does anyone really think that nobody will have tried this approach before? In fact, it’s been tried repeatedly, and it’s failed repeatedly, because it simply ignores the facts about the Manuscript that make it a difficult challenge.

In summary, Bax’s proposed solution doesn’t hold up for long when you apply it to the text in the Manuscript. The problems with the simple “unidentified language” theory were exposed by Currier and others decades ago, and they’re show-stopping problems. Bax may be an award-winning expert on eye tracking, but when it comes to the Voynich Manuscript, his theory appears to be just another example of the tragedy at the heart of science: The slaying of beautiful theories by ugly facts.

6 thoughts on “Applying the Bax proposed solution”


  2. Forgive my naivete – I’m just an interested outsider, not a member of the “Voynich research community” – but has anyone ever proposed that the script might be a non-alphabetic writing system? It seems that something like a syllabary or an abjad would better suit that smallish number of characters, and also solve some of the problems you raised about the stand-alone characters, for example.
    -Allison

    • I don’t know if anyone’s suggested either of those ideas. If I recall correctly, syllabaries tend to have more characters than alphabets – typically in the dozens – so that wouldn’t fit well. Also, a pure syllabary would give some very long words in Voynichese. A mixture of syllabic and alphabetic might fit; also, it’s been suggested that some characters might be abbreviations, and there are some examples online (at either Stolfi’s site or Rene Zandbergen’s) of Latin technical documents with a lot of abbreviations.

      There’s also been work showing that the distribution of apparent vowels in Voynichese maps fairly well onto what we’d expect from a “real” alphabet that showed all the vowels, if I recall correctly.

      One oddity that’s a major problem for any straightforward explanation is that the distribution of characters across the manuscript is odd. If I recall correctly, a lot of characters only occur once or twice in the whole manuscript, and I think they’re clustered in the early part of it, as if someone was experimenting with a script before settling down. I’d need to check the details – I’ve gone for a swift response rather than a more exhaustive but slower one. I hope this helps.

      I’d be interested in your thoughts about the Bax proposed decipherment, since its linguistic features strike me as odd – among other things, he hasn’t identified any bilabials (though he claims to have identified three separate characters for “r”) and the vowel system he proposes looks unbalanced with regard to front and back vowels etc.

    • Indeed I’m just like you, an amateur. I read the Bax article and learned through it what an Abjad is. His theory contemplates the possibility that it’s thoroughly in an Abjad system or at least has some Abjad characteristics.
