voynich - pannous/hieros GitHub Wiki

Folio 49b (page 98) of the Voynich Manuscript contains Indo-Arabic numerals presumably annotating numbers:

https://www.kreativekorp.com/software/fonts/voynich/

Voynich numbers 1-5

Whether these annotations were contemporary and informed or later guesses, the repeating character sequences do hint at a true connection with numerals:

Voynich numbers repeated

Interestingly, numerals in various Brahmi-derived scripts show a remarkable resemblance to Voynich numbers (in 5½ out of 7 cases):

1 ≈ 󿐔 EVA 'o'
2 ≈ 󿐃 EVA 'r'
3 ≈ 󿐗 EVA 'y'
4 ≈ 󿐌 EVA 'c' 󿐆 'e' 󿐏 'h' ?
5 ≈ 𓎆 EVA 'v' 󿐛 ?
6 ≈ 󿐢 EVA 'k'
7 ≈ 󿐊 EVA 's'
8 ≈ 󿐡 EVA 'p'

1 ≈ 𑇑 ౧ ၁ ᧑
2 ≈ २
3 ≈ 󿐗 ≈ 𑽒 ≈ ৩ ≈ ၃
4 ≈ 󿐆 ≈ ၄ ≈ ༤
5 ≈ 𓎆 ≈ 𑁖 ≈ ༥ II. 'ᵓ' ٥ @ arabic 𑁖 5 "pancha" ≈ 𐨤 "p" @ Kharosthi
6 ≈ 󿐢 ≈ 'ꠉ' ≈ ೬ ≈ 𑽖 ?
7 ≈ ౭
8 ≈ 0 ≈ P ?

While no extant Brahmi-derived script matches the numerals perfectly, Myanmar comes closest, with 4 good matches out of 7:

Brahmi Telugu Myanmar Other
𑁒 ONE 𑇑 Sharada, Kawi, ൧ ୧ ᧑ Tai Lue
𑁓 TWO २ Devanagari
𑁔 THREE ৩ Bengali 𑽒 Kawi
𑁕 FOUR ༤ Tibetan S Bengali Burmese
𑁖 FIVE 𑁖 Brahmi Bengali ৫ ≈ c Sindhi ५ Nepali-Marathi-Gurkhali
𑁗 SIX ೬ Kannada early form! 300BC-700CE Andraha … Nepali 12th CE
𑁘 SEVEN
𑁙 EIGHT
𑁚 NINE
𑁛 TEN

The Myanmar variant is especially interesting as it is close to the Brahmi-derived Tai Le script of Yunnan, China. Phags-pa, a very different Brahmi-derived alphabet, was historically used during the Yuan dynasty for various languages, including Chinese. Like Phags-pa, the Marchen script is unlike the Voynich script.

For more matches see Georges Ifrah "The Universal History of Numbers" p.368…

Even though ೬ is the best extant match for Voynich '6', φ forms were widespread in the first millennium AD.

Note that the series does not repeat predictably after 5.

The three series look like:
P 1 2 3 4 5 6? 7?
P 1 2 3 4 5 6'?
P 1 2 3 4 5 X
"3 4 6? 3"

Six is the most problematic number in this series: there is no known good approximation. In general, the form might be closest to Brahmi signs for L.
ꠉ บ ⇔ ៦ ๖ ໖ ೬ ၆ ⇔ LO ល LY ឭ liao LLLA ഴ LA ல LA ల PLA ป LA ལ BUT ౫ = 5 !

The resemblance of the sign २ to the European 2 is no coincidence, since European numerals are ultimately derived from Brahmi numerals: 𑁦 𑁧 𑁨 𑁩 𑁪 𑁫 𑁬 𑁭 𑁮 𑁯.

The representation of numbers using letters of derived Brahmi scripts was often used in various metrical compositions, which are poetic verses with specific rules for rhythm and syllabic count, especially during the classical period of Indian mathematics and astronomy (roughly from the 5th to the 12th century CE).

Just as the Roman alphabet does not reveal the underlying language, Brahmi-derived signs are compatible with a wide swath of languages, including all Far Eastern languages of countries that adopted Buddhism, e.g.:

౧ Telugu ⇨ Dravidian:Tamil,Telugu PIE:Pallava,Prakrit, Sanskrit!, Brahmic/Sharada
၁ Myanmar ⇨ Burma: Sino-Tibetan

In any case, the letter frequency distribution of Voynich is typical for a real language written with an alphabet or an abjad, so while the numbers could be Brahmic, the full syllabary is certainly not used.

Other linguistic methods are therefore needed to establish possible connections to language families:

Word length distribution hints at a tonal language such as:
Vietnamese binomial
or Chinese:
Chinese binomial
or other…
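The comparison behind these plots can be sketched as follows. This is a minimal illustration, not the method used to produce the original charts: the function names `length_distribution` and `binomial_fit` are hypothetical, the binomial is fitted simply by matching the mean, and word length is counted in EVA characters (an assumption, since EVA digraphs such as ch may stand for a single glyph).

```python
from collections import Counter
from math import comb


def length_distribution(words):
    """Return {word length: relative frequency} for a list of word tokens."""
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


def binomial_fit(dist):
    """Binomial(n, p) over lengths 1..n+1, fitted by matching the mean.

    Assumption: lengths are shifted by one (the shortest possible word has
    length 1) and n is taken from the longest observed word.
    """
    n = max(dist) - 1
    mean = sum(k * f for k, f in dist.items())
    p = (mean - 1) / n
    return {
        k: comb(n, k - 1) * p ** (k - 1) * (1 - p) ** (n - k + 1)
        for k in range(1, n + 2)
    }


if __name__ == "__main__":
    # Sample tokens taken from the Topic 0 word list on this page.
    sample = ("chedy daiin shedy ol aiin chol or ar gokeedy gokedy gokain "
              "chey gokeey gokaiin shey al dar chor dal okaiin").split()
    observed = length_distribution(sample)
    fitted = binomial_fit(observed)
    for k in sorted(fitted):
        print(k, round(observed.get(k, 0.0), 3), round(fitted[k], 3))
```

A meaningful comparison needs the full transcription rather than a 20-word sample; the sample only shows the mechanics.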

A tonal language is further indicated by character entropy [0]. The only plausible alternatives to a Sino-Tibeto-Burman origin of the language (not the script!) are other tonal languages, perhaps in the Niger-Congo or Malayo-Polynesian families.

Min Dong and Hakka belong to the Sino-Tibetan language family, specifically to the Chinese branch. The study [0] did not include Pinyin or Wade-Giles transliterations of Mandarin, Cantonese or other Chinese dialects, so ancestral forms of these could be good candidates too.
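For readers who want to try this on a transcription, character entropy can be sketched with the textbook definitions below. This is an assumption-laden illustration, not the estimator of [0]: `character_entropies` is a hypothetical helper, and whether spaces are kept in the input is left to the caller.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def character_entropies(text):
    """Return (h1, h2) for a string.

    h1 is the entropy of single characters; h2 is the conditional entropy
    of a character given its predecessor, H(pair) - H(first of pair).
    """
    chars = list(text)
    h1 = entropy(Counter(chars))
    pairs = Counter(zip(chars, chars[1:]))
    firsts = Counter(chars[:-1])
    h2 = entropy(pairs) - entropy(firsts)
    return h1, h2


if __name__ == "__main__":
    h1, h2 = character_entropies("chedy daiin shedy ol aiin chol or ar")
    print(round(h1, 3), round(h2, 3))
```

A low h2 means the next character is highly predictable from the previous one, which is the property compared across languages.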

Statistics rule out substitution cyphers of any non-tonal language, as well as 'random' babbling since the script has structural features very typical of real languages, specifically typical word and letter frequency curves.

Cyphers which would break Zipf's law are also ruled out.
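A rough way to check a word list against Zipf's law is to fit a straight line to log(frequency) versus log(rank); a slope near -1 is the classic Zipfian signature. The sketch below is only an illustration under that assumption (the helper name `zipf_slope` is hypothetical, and an ordinary least-squares fit is a crude estimator):

```python
from collections import Counter
from math import log


def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank).

    Requires at least two distinct words.
    """
    freqs = sorted(Counter(words).values(), reverse=True)
    xs = [log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy corpus whose frequencies are exactly 12 / rank.
    toy = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
    print(zipf_slope(toy))
```

On a real transcription the input would be the full token list, and the slope would be compared with slopes from natural-language texts of similar size.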

Word duplication or triplication is a puzzling feature of the Voynich script, but it is a grammatical feature of said languages, especially if one makes the assumption that tones are not represented in the transcription system. In fact, there is an infamous 94-word Chinese poem ("Lion-Eating Poet in the Stone Den") in which every word is pronounced shi ([ʂɻ̩]).

No one (including scholars at the Chinese Academy of Sciences in Beijing) has been able to find any clear examples of Asian symbolism or Asian science in the illustrations (except perhaps for the lotus and dragon), indicating that the document was produced in Europe after the return of the Marco Polo delegation.

However, the apparent division of the year into 360 days (rather than 365), in groups of 15 and starting with Pisces, matches features of the Chinese agricultural calendar (jie qi, 節氣).

The Voynich script originated in the 15th century CE, about a century after the return of Marco Polo from China and the very successful publication of his Livre des Merveilles du Monde or Devisement du Monde ("Description of the World"). Importantly, though, the text was composed in a European context, as evidenced by much of the iconography.

Summary of Evidence:
European symbolism with some Middle Eastern elements
Voynich vocabulary follows Zipf's law
Voynich character distribution typical for real languages
Voynich character entropy only compatible with tonal languages [0]
Many classes of cyphers are ruled out
Two dialects, five hands
Brahmi numerals / numbers
Chinese calendar ( 360 = 2 * 12 * 15 )
Numerals coincide with letters, similar to Greek and Sanskrit ancestors.

The authors …
were generally embedded in European culture.
had seen typical medieval herbals and astrological writings, and imitated that style.
had good handwriting.
had limited artistic abilities (the quality of the drawings clearly improved as work on the VMS progressed).
had practically no geometric sensibility or education.
did not give much thought to layout (bent and tilted lines, uneven line spacing, crooked template-drawn circles, overlapping and overflowing diagrams, etc.).
worked at the VMS for many months, perhaps several years.

Topics and hands

Two languages/dialects 4-5 hands

Topics are correlated with the different 'hands':

Topics cluster with specific words:

Most frequent words by topic (using standard EVA transcription, NOT to be read phonetically):

Topic 0: chedy daiin shedy ol aiin chol or ar gokeedy gokedy gokain chey gokeey gokaiin shey al dar chor dal okaiin
Topic 1: daiin chol chor thy chy shol sho cthol cthor shor shy cho dy chaiin dain gotchy otchy cheor dor they
Topic 2: aiin ar al or okar air otaiin oteos oteey okaiin otar oteody okal cheody chdy otal am dar ykar okey
Topic 3: okeol cheol gokeey gokeol okey cheor cheey shey sheol ckhey cheody ol cheo oteey okey gokeody okol gokeedy chey dol
Topic 4: gokaiin al gokain gokeedy otaiin ar gokeey lkaiin chey gotaiin oteey lchedy chol chy raiin lkeey gotain chaiin otain
Topic 5: okaiin or gokain ol gokar chol gokaiin okar okaiin goal godaiin gokol otain okain okal chdy kaiin olkain gotaiin okol gokeo

Chinese phonology has a very limited set of consonants at the end of words: mostly n and ng, rarely r, with some exceptions in Cantonese. Words starting with vowels are mostly of the form a-, wu- /u/ or yi- /í/. This motivates a very preliminary, incomplete mapping:

Map d⇨ń dy⇨ng l⇨ƞ o⇨í/yi go⇨yi'

Topic 0: cheng ńaiin sheng íƞ aiin chíƞ ír ar yi-keeng yi-keng yi-kain chey yi-keey yi-kaiin shey aƞ ńar chír ńaƞ íkaiin
Topic 1: ńaiin chíƞ chír thy chy shíƞ shí cthíƞ cthír shír shy chí ng chaiin ńain yi-tchy ítchy cheír ńír they
Topic 2: aiin ar aƞ ír íkar air ítaiin íteís íteey íkaiin ítar íteíng íkaƞ cheíng chng ítaƞ am ńar ykar íkey
Topic 3: íkeíƞ cheíƞ yi-keey yi-keíƞ íkey cheír cheey shey sheíƞ ckhey cheíng íƞ cheí íteey íkey yi-keíng íkíƞ yi-keeng chey ńíƞ
Topic 4: yi-kaiin aƞ yi-kain yi-keeng ítaiin ar yi-keey ƞkaiin chey yi-taiin íteey ƞcheng chíƞ chy raiin ƞkeey yi-tain chaiin ítain
Topic 5: íkaiin ír yi-kain íƞ yi-kar chíƞ yi-kaiin íkar íkaiin yi-aƞ yi-ńaiin yi-kíƞ ítain íkain íkaƞ chng kaiin íƞkain yi-taiin íkíƞ yi-keí

At this stage ń, ƞ and n do not represent definitive sounds; they are used only to keep the characters in the domain of the map distinct.
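The substitution used for the lists above can be reproduced with a few lines of code. The sketch below is an illustration only: the names `EVA_MAP` and `map_word` are hypothetical, the rules are the five from the "Map" line, and go is rendered as "yi-" because that is the form the mapped topic lists show. Longer EVA sequences are matched first so that go and dy take precedence over o and d.

```python
import re

# Preliminary rules from the "Map" line; "go" is written "yi-" as in the
# mapped topic lists (an assumption about the intended notation).
EVA_MAP = {"go": "yi-", "dy": "ng", "d": "ń", "l": "ƞ", "o": "í"}

# Longest keys first, so "go" and "dy" win over "o" and "d".
_PATTERN = re.compile(
    "|".join(sorted(map(re.escape, EVA_MAP), key=len, reverse=True))
)


def map_word(word):
    """Apply the preliminary EVA mapping to a single word."""
    return _PATTERN.sub(lambda m: EVA_MAP[m.group(0)], word)


if __name__ == "__main__":
    for eva in "chedy daiin shedy ol gokeedy okaiin godaiin".split():
        print(eva, "->", map_word(eva))
```

Applied to the Topic 0 list, this turns chedy into cheng, ol into íƞ and gokeedy into yi-keeng, matching the mapped list.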

Speculative phonology of some numbers is consistent with the above mapping:
1 ≈ 'o' ౧ ၁ ⇔ 𓂂 ain yin ≈ yi
2 ≈ २ 'r'
3 ≈ '9' ≈ ၃ ?
4 ≈ 'ˢ' 'ᶜ' ⇔ ၄ ༤ 'ci' / 'si' vs 𑁕 𑁪 ૪ ੪ ৪ 𑀫 MU 𔖻 𔑿 mi mu 𔗘 𒐼 le'mu 𒇹
5 ≈ '𓎆' ≈ 𑁖 ≈ ༥ II. ᵓ ٥ @ arabic ⇔ ه/و hu/wu ?
6 ≈ ꠉ บ ೬ ⇔ ៦ ๖ ໖ ၆ LO ល LY ឭ liao LLLA ഴ LA ல LA ల PLA ป LA ལ BUT ౫ = 5 !
7 ≈ ౭ = ౭ si? qi?
8 ⇔ 0 ? P? pa 𐋨 ba ?

Comparing the EVA readings to modern Chinese:

1 ≈ 󿐔 EVA 'o' yi
2 ≈ 󿐃 EVA 'r' er
3 ≈ 󿐗 EVA 'y' san ⚠️ σ 'ng' in -dy
4 ≈ 󿐌 EVA 'c' si 󿐆 'e' 󿐏 'h'
5 ≈ 󿐛 EVA 'v' wu
6 ≈ 󿐢 EVA 'k' liao ⚠️
7 ≈ 󿐊 EVA 's' qi
8 ≈ 󿐡 EVA 'p' ba

5 out of 8 EVA mappings match the expected phonology reasonably well! Note that we don't know which Sino-Tibetan dialect exactly is represented, nor do we know with 100% certainty the phonology of reconstructed Chinese dialects, thus 'reasonably well' in this context is the best we can hope for.

If there is any reason in this mapping, it frees the EVA l 󿐚

Note that a reading of '󿐔' as í / yi was already dictated by the phone frequencies of Chinese! Note that a reading of '󿐌' as 'c' is cheating since it only occurs in compound signs!

The following reading is especially problematic: 3 ≈ 󿐗 EVA 'y' san ⚠️, because it almost certainly has to map to word-final 'g' ('ng' in -dy). Maybe it is a tone marker?

https://www.voynichese.com/#/f9v/exa:dy/0

A similar very tentative mapping yields these most frequent words:
'qíkiŋ': 132, 'ítuiŋ': 133, 'chng': 137, 'íkain': 137, 'µbu': 140, 'chøíïⁿ': 144, 'qíkar': 144, 'ain': 145, 'íten': 152, 'chuiŋ': 152, 'ítøng': 153, 'mí': 162, 'íkuiŋ': 173, 'qíkaïⁿ': 184, 'ḫaïⁿ': 187, 'ng': 188, 'ḫain': 195, 'chír': 196, 'muiŋ': 196, 'iŋ': 204, 'íken': 211, 'míïⁿ': 218, 'aïⁿ': 226, 'miŋ': 253, 'qíken': 264, 'qíkøng': 270, 'ḫar': 276, 'qíkain': 276, 'chøiŋ': 298, 'qíkung': 301, 'qíkuiŋ': 302, 'ír': 320, 'chíïⁿ': 335, 'ar': 347, 'møiŋ': 427, 'íïⁿ': 433, 'chøng': 451, 'en': 472, 'møng': 652, 'ḫen': 786, '不': 4301}

d⇨ń poses a problem for the most frequent word, which would be ḫen under d⇨ḫ or ńan/nam under d⇨ń.

The long tail is especially interesting since long words show combinations similar to qíngkøiŋ, miŋ'ckhíï …

Since ch has ligatures with P, t, k and f and is predominantly found at word starts, it's likely that ch denotes aspiration pʰ tʰ kʰ fʰ (bʰ?).

As noted by Bowern [1], chedy and shedy (cheng/sheng) should be variants of the same word in order to match the Zipf distribution, which seems very plausible and may hint at a lack of expertise on the part of the original scribe when writing down what he heard.

This is certainly not the final word but should encourage further research. In fact, in order to not introduce bias into further research, a full mapping is intentionally not provided here.

References and recommended further reading:

[0] Luke Lindemann, Claire Bowern : Character Entropy in Modern and Historical Texts https://arxiv.org/abs/2010.14697 May 20, 2021
[1] Claire L. Bowern and Luke Lindemann: The Linguistics of the Voynich Manuscript
https://www.annualreviews.org/doi/pdf/10.1146/annurev-linguistics-011619-030613 November 11, 2020
[2] Rachel Sterneck, Annie Polish, Claire Bowern: Topic Modeling in the Voynich Manuscript https://arxiv.org/abs/2107.02858
[3] Cardan grille https://www.reddit.com/r/voynich/comments/oyeayi/a_cipher_wheel_inspired_by_rene_zandbergens_paper/
[4] 360 degrees https://www.reddit.com/r/voynich/comments/dd4zj2/zodiac_and_moirogenesis_paranatellonta/
[5] General Observations https://www.reddit.com/r/voynich/comments/mt18bu/some_general_observations/
[6] Stolfi's Chinese Voynich hypothesis https://www.ic.unicamp.br/~stolfi/voynich/02-01-18-chinese-redux/
[7] Chinese investigated: https://web.archive.org/web/20180928122339/http://graphometrie.free.fr/publications/Voynich_en.pdf

Theory: Some member(s) of the Marco Polo delegation tried to teach some Chinese dialect to European scholars and convey some of the knowledge gathered on the journey. Thus the Voynich script may represent a scientific and linguistic chapter of Devisement du Monde which was either published separately or dropped from later copies for obvious reasons.

Further reading and updated information at https://github.com/pannous/hieros/wiki/voynich

Noteworthy okeo in same position on moon chart https://www.voynichese.com/#/f68r2/exa:okeo/751