Tabula · translator

Tabula · English Canon

T. M. Jones, Ph.D. · DOI: 10.5281/zenodo.19039226

The Sun Also Rises — E. Hemingway (1926)

| Metric | Hemingway Original | Pidgin Compression | Δ Retention |
|---|---|---|---|
| Volume | | | |
| Total Words (space-delimited tokens) | 67,707 | ~11,200 | ~16.5% |
| Total Characters (incl. spaces & punctuation) | 363,955 | ~63,000 | ~17.3% |
| Total Characters (letters only, no spaces) | 296,660 | ~51,800 | ~17.5% |
| Total Syllables (vowel-cluster algorithm, ±2%) | 89,293 | ~18,200 | ~20.4% |
| Density | | | |
| Avg Syllables / Word | 1.319 | 1.627 | +23% |
| Avg Letters / Word (letter characters only) | 4.38 | 4.62 | +5% |
| Structure | | | |
| Books | 3 | 3 | |
| Chapters | 19 | 19 | |
Word count: ~17% retained (a reduction of ~83%)
Syllable count: ~20% retained (a reduction of ~80%)
Character count: ~17% retained (a reduction of ~83%)
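The retention figures can be checked with a few lines of arithmetic. The sketch below assumes the Δ Retention column is the pidgin total divided by the original total. The totals are copied from the table above, and the function names are illustrative only.

```python
# Totals copied from the table above: (original, pidgin).
# The pidgin figures are the page's own estimates, not exact counts.
TOTALS = {
    "words": (67_707, 11_200),
    "characters (incl. spaces)": (363_955, 63_000),
    "characters (letters only)": (296_660, 51_800),
    "syllables": (89_293, 18_200),
}


def retention_pct(original: int, pidgin: int) -> float:
    """Share of the original volume that survives in the pidgin, in percent."""
    return 100.0 * pidgin / original


def summarize(totals: dict) -> dict:
    """Map each metric to (percent retained, percent reduced), one decimal place."""
    rows = {}
    for metric, (original, pidgin) in totals.items():
        kept = retention_pct(original, pidgin)
        rows[metric] = (round(kept, 1), round(100.0 - kept, 1))
    return rows


if __name__ == "__main__":
    for metric, (kept, cut) in summarize(TOTALS).items():
        print(f"{metric}: retained {kept}%, reduced {cut}%")
```

Under that reading, roughly one sixth of the word volume is retained, so the corresponding reduction is a little over 83%.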

Original text: Hemingway, Ernest. The Sun Also Rises. New York: Charles Scribner's Sons, 1926. Text via Project Gutenberg Canada #1257. The word count of 67,707 is the standard verified count across all Scribner's / Gutenberg reproductions (range across counters: 66,800–68,200). Character and syllable counts were computed from a calibrated multi-chapter sample (n=1,148 tokens).

Pidgin text: Newlang pidgin compression of the full novel, covering all 19 chapters. Statistics are estimated from a calibrated sample (n=826 tokens) and extrapolated to the full document (~63,000 chars). Syllables are computed with the same vowel-cluster algorithm as for the original; the margin of error on projected totals is ±3%.
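The page does not spell out its vowel-cluster algorithm. A minimal sketch of the general technique, which counts each maximal run of vowel letters as one syllable, might look like the following. The silent-e adjustment and the treatment of "y" as a vowel are assumptions made here, not details taken from the source.

```python
import re

# One syllable per maximal run of vowel letters ("y" counted as a vowel: assumption).
VOWEL_RUN = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Estimate syllables as the number of vowel clusters in a word."""
    letters = re.sub(r"[^a-z]", "", word.lower())  # drop hyphens, punctuation, digits
    if not letters:
        return 0
    runs = len(VOWEL_RUN.findall(letters))
    # Assumed heuristic: a final silent "e" (as in "alone") is not a syllable,
    # except after "l" ("people") or in "ee" ("agree").
    if runs > 1 and letters.endswith("e") and not letters.endswith(("le", "ee")):
        runs -= 1
    return max(runs, 1)


def syllables_per_word(text: str) -> float:
    """Average estimated syllables over space-delimited tokens that contain letters."""
    counts = [count_syllables(token) for token in text.split()]
    counts = [c for c in counts if c > 0]
    return sum(counts) / len(counts) if counts else 0.0
```

A heuristic of this kind miscounts some words (for example, it scores "fished" as two syllables), which fits the page quoting an error margin instead of exact totals.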

Density inversion: Despite the overall compression, average syllables per word is higher in the pidgin (+23%). This is structurally expected. The pidgin drops short English function words (the, a, of, in, and) and replaces them with particles of one or two letters (e, a, na, de), which still score as one syllable each but are shorter in character count, while it retains polysyllabic content words (proper nouns, verbs, nouns). The net phonological load per retained word is therefore higher even as total volume collapses.
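The density figures follow from the volume totals. As a rough check, the sketch below divides the syllable totals by the word totals for each text. The inputs are the values reported above, and because the pidgin totals are rounded estimates, the results match the published 1.627 and +23% only to within rounding.

```python
def density_change(orig_units: int, orig_words: int,
                   pidgin_units: int, pidgin_words: int) -> tuple:
    """Return (original density, pidgin density, percent change in density)."""
    orig_density = orig_units / orig_words
    pidgin_density = pidgin_units / pidgin_words
    change_pct = 100.0 * (pidgin_density / orig_density - 1.0)
    return orig_density, pidgin_density, change_pct


if __name__ == "__main__":
    # Syllables per word, using the reported syllable and word totals.
    orig, pidgin, change = density_change(89_293, 67_707, 18_200, 11_200)
    print(f"original {orig:.3f}, pidgin {pidgin:.3f}, change {change:+.1f}%")
```

Total syllables fall by about 80% while total words fall by about 83%, so the ratio of the two necessarily rises.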

TJID3 Research · Cleveland, Ohio · T.M. Jones, Ph.D. · ORCID 0000-0001-7372-6345

Phoneme Table

v2 design changes: Diacritics are removed entirely: á à ā ã ạ are all retired. Tone and register are now encoded by preverbal particles (see § Tone & Register below). The Oceanic (Māori, Hawaiian) and Australian Aboriginal (Warlpiri) sources were removed, which stabilizes the inventory. The phoneme count is reduced by 9, and syllable-onset clusters are simplified. The Latin script can now be typed and read without special keyboard support.

Vowels

| Letter | IPA | Sources | Notes |
|---|---|---|---|
| a | /a/ | universal | Open central |
| e | /e/ | romance, germanic, slavic | Mid front unrounded |
| i | /i/ | universal | High front |
| o | /o/ | universal | Mid back rounded |
| u | /u/ | universal | High back rounded |
| | /y/ | germanic, romance, mandarin | High front rounded (optional) |

Tone & Register — Particle System

Design rationale: Version 1 encoded tone via diacritics. Version 2 moves all register and modality marking to preverbal particles.
| Particle | Function | Source | Example | Gloss |
|---|---|---|---|---|
| wa | copula / completed past / assertion | slavic | ta wa lao | he was old |
| shi | copula / present state | mandarin | ta shi lao | he is old |
| le | perfective aspect | mandarin | ta le pesk | he has fished |
| ma | question / uncertainty | mandarin, arabic | ta ma shi lao? | is he old? |
| bi | duration / sustained state | slavic | ta bi sol | he has long been alone |
| li | conditional / softening | romance | ta li ven | he might come |
| da | emphasis / contradiction | slavic, arabic | da ta sol | he is alone |
| om | inevitability / grief / sea | hindi, arabic | om golf strem | the Gulf Stream (as fate) |

Grammar Particles

| Particle | Function | Source |
|---|---|---|
| ke | relative clause marker | arabic, mandarin |
| na | locative (in/on/at) | slavic, swahili |
| de | genitive / of | romance |
| e | and / also | romance |
| bez | without / negation | slavic |
| ima | now / current | japanese |
| cho | who / that (relative) | slavic |
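The two particle tables above can be treated as lookup tables. As a sketch only, the following tags each token of a Tabula sentence as a tone particle, a grammar particle, or a content word. The dictionaries are transcribed from the tables; the function name and the punctuation handling are illustrative assumptions and not part of the Tabula tooling.

```python
# Transcribed from the Tone & Register table.
TONE_PARTICLES = {
    "wa": "copula / completed past / assertion",
    "shi": "copula / present state",
    "le": "perfective aspect",
    "ma": "question / uncertainty",
    "bi": "duration / sustained state",
    "li": "conditional / softening",
    "da": "emphasis / contradiction",
    "om": "inevitability / grief / sea",
}

# Transcribed from the Grammar Particles table.
GRAMMAR_PARTICLES = {
    "ke": "relative clause marker",
    "na": "locative (in/on/at)",
    "de": "genitive / of",
    "e": "and / also",
    "bez": "without / negation",
    "ima": "now / current",
    "cho": "who / that (relative)",
}


def tag_tokens(sentence: str) -> list:
    """Return (token, kind, function) triples for a space-delimited sentence."""
    tagged = []
    for raw in sentence.split():
        token = raw.strip(".,?!;:").lower()  # assumed punctuation set
        if not token:
            continue
        if token in TONE_PARTICLES:
            tagged.append((token, "tone", TONE_PARTICLES[token]))
        elif token in GRAMMAR_PARTICLES:
            tagged.append((token, "grammar", GRAMMAR_PARTICLES[token]))
        else:
            tagged.append((token, "content", ""))
    return tagged


if __name__ == "__main__":
    for row in tag_tokens("ta ma shi lao?"):
        print(row)
```

Because v2 particles are plain Latin strings with no diacritics, an exact string match is enough here, and no Unicode normalization step is needed.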

Calibration Sample

E. Hemingway — The Old Man and the Sea (opening)
"He was an old man who fished alone in a skiff in the Gulf Stream, and he had gone eighty-four days now without taking a fish."
Ta wa lao vir ke pesk-a sol na shkaf na Golf Strem, e le bi chit okto-das-arba yaum bez chap pesk.
ta = he · wa = copula/past · lao = old · vir = man · ke = who/relative · pesk-a = fished · sol = alone · na = in/locative · shkaf = skiff · Golf Strem = proper noun · e = and · le = perfective · bi chit = gone/duration · okto-das-arba = 84 · yaum = days · bez = without · chap = taking · pesk = fish
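The word breakdown above can be reproduced mechanically. A minimal sketch, assuming a lexicon limited to the entries listed in the breakdown, glosses the calibration sentence token by token and tries two-word entries ("golf strem", "bi chit") before single words. The lexicon layout and the function name are hypothetical.

```python
# Entries taken from the word breakdown above; nothing else is assumed to exist.
LEXICON = {
    "ta": "he", "wa": "copula/past", "lao": "old", "vir": "man",
    "ke": "who/relative", "pesk-a": "fished", "sol": "alone",
    "na": "in/locative", "shkaf": "skiff", "golf strem": "proper noun",
    "e": "and", "le": "perfective", "bi chit": "gone/duration",
    "okto-das-arba": "84", "yaum": "days", "bez": "without",
    "chap": "taking", "pesk": "fish",
}

SAMPLE = ("Ta wa lao vir ke pesk-a sol na shkaf na Golf Strem, "
          "e le bi chit okto-das-arba yaum bez chap pesk.")


def gloss(sentence: str) -> list:
    """Return (tabula, gloss) pairs, preferring two-word lexicon entries."""
    tokens = [t.strip(".,?!;:").lower() for t in sentence.split()]
    tokens = [t for t in tokens if t]
    pairs = []
    i = 0
    while i < len(tokens):
        two = " ".join(tokens[i:i + 2])
        if i + 1 < len(tokens) and two in LEXICON:
            pairs.append((two, LEXICON[two]))
            i += 2
        else:
            # Unknown words are marked "?" instead of being guessed.
            pairs.append((tokens[i], LEXICON.get(tokens[i], "?")))
            i += 1
    return pairs


if __name__ == "__main__":
    for tabula, meaning in gloss(SAMPLE):
        print(f"{tabula} = {meaning}")
```

Matching two-word entries first keeps "bi chit" together as a single gloss, so "bi" is not glossed on its own there.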