The Structure of the Words in the Voynich Manuscript

2017-03-14

The glyphs within words in the Voynich Manuscript are arranged with a high degree of regularity, which I attempt to capture in Tables 1-6. Certain prefixes and suffixes appear again and again. Each table gives the number of occurrences of every word formed by joining one of the common prefixes (rows) to one of the common suffixes (columns); a lone hyphen stands for an empty prefix or suffix. Altogether, these words occur just under 14,000 times in the manuscript, which contains just over 37,000 words.

--al-aly-aim-ain-aiin-aiiin-air-aiir-am-ar-ary-as
-02532969045741732387348245
d-442432810189807161002289298223
od-51320185745242400
qod-7701114123031110
chod-773063902011300
shod-103014220001010
yd-23003190200201
k-823204765314095244
ok-8133221141209421626124112
qok-918815227726211732514913
chok-582010160104700
shok-5400030010300
yk-51560104417253301
t-519111641113054301
ot-111371929515012144713954
qot-858506475260126312
chot-49004801051100
shot-2000020000000
yt-118611241031122611

Table 1: prefixes without l, suffixes beginning with a

For example, the word qokaiin occurs 262 times.
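
A table of this kind can be tallied from a transcription along the following lines. This is only a minimal sketch: the function name `tabulate` and the toy word list are illustrative assumptions, not the procedure or data actually used for the tables here.

```python
from collections import Counter

def tabulate(words, prefixes, suffixes):
    """Count how often each prefix+suffix concatenation occurs as a whole word."""
    counts = Counter(words)  # a Counter returns 0 for words it has never seen
    return {p: {s: counts[p + s] for s in suffixes} for p in prefixes}

# Hypothetical toy word list, NOT taken from the manuscript.
sample = ["qokaiin", "qokaiin", "okar", "daiin", "qokal", "okar", "chedy"]
table = tabulate(sample, ["d", "ok", "qok"], ["al", "aiin", "ar"])
```

An empty string can be passed as a prefix or suffix to obtain the rows and columns marked with a lone hyphen.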


--al-aly-aim-ain-aiin-aiiin-air-aiir-am-ar-ary-as
-02532969045741732387348245
ld-3010231000500
old-1201290006500
qold-0000000000100
chold-0100010000100
shold-0000010000000
yld-0000000000000
lk-1501354914273000
olk-01111333104191900
qolk-0000520000000
cholk-0200240000300
sholk-0100000100100
ylk-0000020000000
lt-1300210001120
olt-0200220001300
qolt-0000000000000
cholt-0110010001100
sholt-0000000000000
ylt-0000000000000

Table 2: prefixes with l, suffixes beginning with a


--ol-oly-oim-oin-oiin-oiiin-oir-oiir-om-or-ory-os
-0528500433923213541729
d-441113051820867042
od-5200010000810
qod-7100000001201
chod-7200000000000
shod-10300000000100
yd-2010000000100
k-837300500022612
ok-879400900073338
qok-9103101620123501
chok-5400010000610
shok-5200000000100
yk-510100400001110
t-547200200012314
ot-11811001300014244
qot-847100300112901
chot-4600000002400
shot-2310000000000
yt-19000210011411

Table 3: prefixes without l, suffixes beginning with o


--ol-oly-oim-oin-oiin-oiiin-oir-oiir-om-or-ory-os
-0528500433923213541729
ld-3200000000000
old-1000000000200
qold-0000000000000
chold-0000000000000
shold-0000000000000
yld-0000000000000
lk-1500000000400
olk-0500000000410
qolk-0000000000000
cholk-0000000000000
sholk-0000000000000
ylk-0000000000000
lt-1000000000000
olt-0000000000000
qolt-0000000000000
cholt-0000000000000
sholt-0000000000000
ylt-0000000000000

Table 4: prefixes with l, suffixes beginning with o


--y-ey-eey-eeey-dy-ed-edy-eed-eedy
-01431422290005
d-4422917121202
od-54312200002
qod-71702000003
chod-79020010000
shod-105200000000
yd-2700000000
k-8221444110144153
ok-8886317427121163103
qok-9139105308264726515301
chok-539610201313
shok-5843100000
yk-51875461019127
t-516102110142213
ot-1111057135811154899
qot-882234240491374
chot-43697100201
shot-2600000100
yt-124102530021127

Table 5: prefixes without l, suffixes usually beginning with e


--y-ey-eey-eeey-dy-ed-edy-eed-eedy
-01431422290005
ld-32200000000
old-12401000000
qold-0100000000
chold-0900000000
shold-0800000000
yld-0000000000
lk-11774180128341
olk-021124090127142
qolk-0416100107
cholk-0401200003
sholk-0000000002
ylk-0000000000
lt-1202000705
olt-0112000503
qolt-0000000101
cholt-0100000000
sholt-0000000000
ylt-0000000000

Table 6: prefixes with l, suffixes usually beginning with e

In each case, the number of occurrences of a word is close to the product of three quantities: the total number of words, the relative frequency of the prefix, and the relative frequency of the suffix. Table 7 gives these products, rounded to the nearest integer, computed from the data in Table 1. The close agreement suggests that the probability of a suffix is independent of the prefix to which it is attached.
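
The product described above can be sketched as follows. The sketch assumes that the total and the relative frequencies are all taken over the cells passed in, which may differ from the totals used for Table 7; the name `expected_counts` and the toy counts are hypothetical.

```python
def expected_counts(observed):
    """Expected count per cell if prefix and suffix are independent."""
    prefixes = list(observed)
    suffixes = list(next(iter(observed.values())))
    total = sum(observed[p][s] for p in prefixes for s in suffixes)
    row = {p: sum(observed[p].values()) for p in prefixes}
    col = {s: sum(observed[p][s] for p in prefixes) for s in suffixes}
    # total * (row/total) * (col/total) simplifies to row * col / total.
    # Note: Python's round() sends exact .5 ties to the nearest even integer.
    return {p: {s: round(row[p] * col[s] / total) for s in suffixes}
            for p in prefixes}

# Hypothetical toy counts, NOT taken from the manuscript.
toy = {"ok": {"al": 30, "ar": 10}, "qok": {"al": 60, "ar": 20}}
model = expected_counts(toy)
```

In this toy table the suffix proportions are identical in both rows, so the computed table reproduces the observed one exactly; real counts will only approximate the product.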

-al-aly-aim-ain-aiin-aiiin-air-aiir-am-ar-ary-as
3023429520448316591369276156
d3930437726662920771789359197
od32230194516162611
qod21420122914141710
chod21320112613141510
shod17106140202800
yd15105110102600
k5385133783102114421
ok1511514310123882963413673
qok201551931353201039945183104
chok191081812031010
shok0300250101300
yk32130184315162510
t32431214926172811
ot131031329021372663012262
qot6486142993123145731
chot17106140202800
shot0100110000100
yt32020174115162410

Table 7: computed version of Table 1

When this is repeated for Tables 2-6, the root-mean-square difference between the actual and computed values is 24.312449. However, because the frequency of a word depends strongly on where in the manuscript it appears, it is better to analyse one section at a time when investigating this further.
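
For reference, a root-mean-square difference between an actual and a computed table can be calculated as in the sketch below. It assumes both tables are nested dictionaries with the same prefixes and suffixes; the name `rms_difference` and the toy values are illustrative only, and the sketch does not reproduce the figure quoted above.

```python
import math

def rms_difference(actual, computed):
    """Root-mean-square difference between two equally shaped count tables."""
    diffs = [actual[p][s] - computed[p][s] for p in actual for s in actual[p]]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical toy tables, NOT taken from the manuscript.
a = {"ok": {"al": 3, "ar": 0}, "qok": {"al": 1, "ar": 4}}
c = {"ok": {"al": 0, "ar": 0}, "qok": {"al": 1, "ar": 0}}
value = rms_difference(a, c)  # sqrt((9 + 0 + 0 + 16) / 4) = 2.5
```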

© Copyright Donald Fisk 2017