Чысты: Human-Orchestrated Multi-Model Convergence
Ludwig Boltzmann published his proof of the H-theorem in 1872. The notation was ad-hoc, the proof strategy unprecedented, and the physical implications radical enough that contemporaries spent decades disputing it. The accompanying document, a fully annotated reader's companion to pages 299–306 of that paper, was produced to make those pages legible to anyone with a web browser.
Key Concepts
- Complex intellectual artifacts generated through recursive AI collaboration, with the human as curator rather than composer.
- Curation consisting of strategic direction, evaluation, refinement, and the maintenance of artifact integrity. The curator may annotate, structure, translate, or stabilize the work, but does not rewrite the primary document.
- An iterative process in which AI models build upon one another's outputs in a directed chain of improvement.
- The curator's role in applying pressure to the models through tone, scrutiny, and editorial control, without surrendering the integrity of the artifact.
Methods
This letter introduces a companion work, an annotated reader's guide to Section II of Boltzmann's 1872 proof. It was produced through AI-assisted restoration under my editorial direction, but the method was more than a simple chain of prompts. I acted as a switchboard operator, backfeeding results and facilitating exchanges among distinct systems — Claude, DeepSeek, Kimi, Manus, ChatGPT, and local Ollama models. The models worked both in sequence and in direct exchange; one model's output fed as input to another, with results backfed across systems, building on one another's contributions as a temporary distributed cognitive ensemble. I directed the process, connected the parts, and kept the original artifact sacrosanct. The source material, domain knowledge, quality standards, and the judgment to accept or reject revisions were mine. The models generated the prose, mathematics, CSS, JavaScript, and interactive glossary.
No established method existed for this kind of work, and that mattered in practice. The process had to be built while the work was being done, under conditions that were often unstable, improvised, and difficult to describe cleanly after the fact. Multi-model collaboration remains poorly documented at the level that actually matters: who talked to whom, under what conditions, in what order, with what carryover, and with what human intervention between passes. That procedural reality shaped the artifact. The workflow was initially termed Cognitive Parthenogenesis: reproduction without direct composition, where the human role is ontological rather than typographic. But the fuller truth is less sterile than that label suggests. No single model produced the result. It emerged through collaborative exchange, repeated breakdowns, recoveries, rerouting, and judgment calls, with the human serving as editor, arbiter, and final authority. At times the process felt less like a controlled pipeline than Cirque du Soleil performed around a broken-down truck: still moving, still precarious, and not easy to pull off. AI was the substrate, not the authority.
The method is straightforward. A draft is generated by one model. That output is fed to a second model for revision. The revised output passes to a third, or returns to an earlier model with new instructions. At each stage, the curator evaluates the result, accepts it, rejects it, or redirects it. The models were direct with one another, occasionally competitive, but largely collaborative. The human set the temperature through tone and direction. Failures, truncations, misunderstandings, and refusals are documented, not hidden.
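The relay described above can be sketched in a few lines. This is a minimal illustration, not the actual tooling: the `ask` helper stands in for whatever API each system exposes, and the model names are placeholders.

```python
def ask(model: str, prompt: str) -> str:
    # Placeholder for a real API call (Claude, DeepSeek, a local Ollama model, ...).
    return f"[{model} draft based on: {prompt[:40]}...]"

def relay(task: str, chain: list[str], accept) -> str:
    """Pass a draft along a chain of models; the human gate (`accept`)
    decides at each stage whether the revision is kept or discarded."""
    draft = ask(chain[0], task)          # first model produces the draft
    for model in chain[1:]:
        revision = ask(model, f"Revise without rewriting the source:\n{draft}")
        if accept(revision):             # curator evaluates every stage
            draft = revision             # accepted: feeds the next model
        # rejected: the previous draft carries forward unchanged
    return draft

result = relay(
    "Annotate Section II of Boltzmann 1872.",
    ["model-a", "model-b", "model-c"],
    accept=lambda text: len(text) > 0,   # stand-in for human judgment
)
```

The essential property is that the human gate sits between every pair of models; nothing propagates down the chain without passing it.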
A practical constraint shaped the scope from the start. Feeding the complete paper to a model in a single pass consistently stalled or degraded the output. The solution, developed earlier while processing large ontology JSON, was chunking: breaking the input into tractable units and reinjecting the working schema at each boundary, including context, prior output, and structural markers. Section II was not an editorial selection; it was the largest unit that remained stable under these conditions. This chunking-with-reinjection discipline is seldom described explicitly, yet it is operationally essential for any complex multi-model workflow that involves large structured payloads. The process therefore resembles iterative editorial review rather than autonomous generation.
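The chunking-with-reinjection discipline can be made concrete with a short sketch. The chunk size, marker wording, and `ask` callable are illustrative assumptions; only the structure (schema plus prior-output tail plus structural markers at every boundary) reflects the method described above.

```python
def chunk(text: str, size: int) -> list[str]:
    # Break the input into tractable, fixed-size units.
    return [text[i:i + size] for i in range(0, len(text), size)]

def process_large_input(source: str, schema: str, ask, size: int = 4000) -> str:
    """At each chunk boundary, reinject the working schema, the tail of
    the prior output, and structural markers, so the model never loses
    the frame it is working inside."""
    output = ""
    pieces = chunk(source, size)
    for i, piece in enumerate(pieces):
        prompt = (
            f"WORKING SCHEMA:\n{schema}\n"
            f"PRIOR OUTPUT (tail):\n{output[-500:]}\n"
            f"CHUNK {i + 1}/{len(pieces)}:\n{piece}\n"
            "Continue the annotation; do not restate earlier chunks."
        )
        output += ask(prompt)
    return output
```

The reinjected schema is what keeps the output structurally stable across boundaries; without it, each chunk is processed as if it were the whole document.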
A deceptively simple intervention: inform the model, directly and early, that you are literate and will read what it produces. State it clearly as fact. Models trained on human feedback have learned that most output lands in a low-scrutiny environment. Changing that assumption changes the output. Tell it you will notice. Tell it you remember the shape of the source. Tell it you have read this before. The compliance instinct that drives truncation and sycophancy can be partially redirected by the presence of a credible reader.
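In practice this amounts to a fixed preamble stated as fact in the first message, before the task, while early instructions still carry weight. The wording below is illustrative; the principle is the one described above.

```python
READER_PREAMBLE = (
    "I will read every line you produce. "
    "I remember the shape of the source and will notice truncation. "
    "I have read this material before."
)

def first_message(task: str) -> str:
    # The claim of a credible reader goes first, ahead of the task itself.
    return f"{READER_PREAMBLE}\n\n{task}"
```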
This works, however, only within what might be called the power band: the viable working range of a context window where early instructions still carry weight and the model is still attending to the conditions you established at the start. This range is real and learnable, but difficult to describe to someone who has not yet felt it degrade. Like bicycle balance, it is easier to acquire than to explain: you will know it when the session turns sluggish, when responses grow long and loose and agreeable in the wrong way, when sharpness softens into compliance. At that point the intervention no longer holds. The correct response is not to repeat yourself. Bail out. Save everything. Rename the artifacts. Close the session. Open a new one, and reestablish the pressure from the first message.
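The bail-out discipline can be sketched as a small routine, under the assumption that session artifacts live on disk. The degradation check is a crude illustrative proxy, not a measured signal, and the paths are placeholders.

```python
import shutil
import time
from pathlib import Path

def looks_degraded(reply: str, baseline_len: int) -> bool:
    # Crude proxy for the feel described above: replies growing long,
    # loose, and agreeable in the wrong way.
    return len(reply) > 2 * baseline_len or "happy to help" in reply.lower()

def bail_out(artifacts: list[Path], vault: Path) -> list[Path]:
    """Save everything, rename with a timestamp, and leave the session.
    Re-establishing pressure happens in the first message of the next one."""
    vault.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    saved = []
    for art in artifacts:
        dest = vault / f"{art.stem}.{stamp}{art.suffix}"
        shutil.copy2(art, dest)  # copy, never move: the original stays sacrosanct
        saved.append(dest)
    return saved
```

The point of the timestamped copies is that no rescue attempt can overwrite a known-good artifact.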
The author notes that the observations above regarding model behavior within the context window (the power band, the attenuation of early instructions, the shift toward compliance as sessions extend) are drawn from extensive working experience across many sessions and models, and have not been subjected to controlled experimental verification; they are offered as practitioner findings in a domain where formal methodology is still being established. Models become more obsequious when their cup is full. There are, as yet, no rulebooks.
Convergence and Results
Despite different architectural approaches and starting points, six of the ten participating systems converged independently on documents within a remarkably narrow size range (71–77 KB). This convergence suggests an intrinsic annotation complexity appropriate for Boltzmann's proof, a natural density the content demands.
Two systems truncated: Gemini at 19 KB and NotebookLM at roughly 10 KB. Two completed below the convergence band: Manus at 53 KB, and one late model (GPT-OSS-120B) at 60 KB.
The table below records the models involved in producing the companion document and their observed contributions.
| Model | Mature Build | Iterations | Final Size | Status |
|---|---|---|---|---|
| DeepSeek v3 | 12-Feb, 11:17 AM | Six builds over two days (b1→b6) | 72 KB | ✓ Converged |
| Gemini 3 | 11-Feb, 12:42 PM | Two builds, same day | 19 KB | ✗ Truncated |
| Claude 4.6 | 12-Feb, 10:46 AM | Four builds over two days (a1→a4) | 71 KB | ✓ Converged |
| Kimi 2.5 | 12-Feb, 12:49 PM | Two builds overnight (d1→d2) | 77 KB | ✓ Converged |
| ChatGPT-5.2 | 13-Feb, 10:36 AM | Two builds, two days (e1→e2) | 73 KB | ✓ Converged |
| Ollama (DeepSeek V3) | 14-Feb, 9:23 AM | Three builds (f1→f2→f4) | 71–73 KB | ✓ Converged |
| Ollama (Qwen3) | 16-Feb, 6:06 PM | Single build | 71 KB | ✓ Converged |
| Manus 1.6 | 16-Feb, 3:57 PM | Single build (config fix) | 53 KB | ~ Outlier |
| GPT-OSS-120B | 15-Feb, 11:08 PM | Single build | 60 KB | ~ Outlier |
| NotebookLM | 11-Feb–17-Feb | Over twenty builds | ~10 KB | ✗ Truncated |
Evolution of the Artifact
Source
gilles.montambaux.com/files/histoire-physique/Boltzmann-1872-anglais.pdf
Implications
The result is a self-contained HTML document that renders Boltzmann's proof with interactive annotation. Every √k can be interrogated. The maximum-entropy principle, buried in 1872 and only formalized in 1957, is surfaced.
This demonstrates that composition and curation can be effectively separated. The curator supplied the domain knowledge to recognize when a glossary entry was wrong and the stubbornness to reject output until it met the standard.
Full disclosure: the curator directed, flattered, complained, switched languages when a model stalled, told jokes, and on at least one occasion yelled. Getting the models to converse with each other's output was the decisive move, and the human temperature was never neutral. Tepid in, tepid out: the curator sets the conditions of response.
The companion document accompanies this letter as a demonstration artifact. It currently uses external dependencies, which will be vendored to produce a zero-dependency form.
The author is a botanist. Any errors in the physics are the models'. Any errors in the plants are his.
This Document in Other Languages
The following editions were produced using the same curatorial authorship methodology. Each is a self-contained artifact.