Чысты: Human-Orchestrated Multi-Model Convergence
Ludwig Boltzmann published his proof of the H-theorem in 1872. The notation was ad-hoc, the proof strategy unprecedented, and the physical implications radical enough that contemporaries spent decades disputing it. The accompanying document, a fully annotated reader's companion to pages 299–306 of that paper, was produced to make those pages legible to anyone with a web browser.
Key Concepts
- Complex intellectual artifacts generated through recursive AI collaboration, with the human as curator rather than composer.
- Curation consisting of strategic direction, evaluation, refinement, and the maintenance of artifact integrity. The curator may annotate, structure, translate, or stabilize the work, but does not rewrite the primary document.
- An iterative process in which AI models build upon one another's outputs in a directed chain of improvement.
- The curator's role in applying pressure to the models through tone, scrutiny, and editorial control, without surrendering the integrity of the artifact.
Methods
This letter introduces a companion work, an annotated reader's guide to Section II of Boltzmann's 1872 proof. It was produced through AI-assisted restoration under my editorial direction, but the method was more than a simple chain of prompts. I acted as a switchboard operator, backfeeding results and facilitating exchanges among distinct systems — Claude, DeepSeek, Kimi, Manus, ChatGPT, and local Ollama models. The models worked both in sequence and in direct exchange; one model's output fed as input to another, with results backfed across systems, building on one another's contributions as a temporary distributed cognitive ensemble. I directed the process, connected the parts, and kept the original artifact sacrosanct. The source material, domain knowledge, quality standards, and the judgment to accept or reject revisions were mine. The models generated the prose, mathematics, CSS, JavaScript, and interactive glossary.
No established method existed for this kind of work, and that mattered in practice. The process had to be built while the work was being done, under conditions that were often unstable, improvised, and difficult to describe cleanly after the fact. Multi-model collaboration remains poorly documented at the level that actually matters: who talked to whom, under what conditions, in what order, with what carryover, and with what human intervention between passes. That procedural reality shaped the artifact. The workflow was initially termed Cognitive Parthenogenesis: reproduction without direct composition, where the human role is ontological rather than typographic. But the fuller truth is less sterile than that label suggests. No single model produced the result. It emerged through collaborative exchange, repeated breakdowns, recoveries, rerouting, and judgment calls, with the human serving as editor, arbiter, and final authority. At times the process felt less like a controlled pipeline than Cirque du Soleil performed around a broken-down truck: still moving, still precarious, and not easy to pull off. AI was the substrate, not the authority.
The method is straightforward. A draft is generated by one model. That output is fed to a second model for revision. The revised output passes to a third, or returns to an earlier model with new instructions. At each stage, the curator evaluates the result, accepts it, rejects it, or redirects it. The models were direct with one another, occasionally competitive, but largely collaborative. The human set the temperature through tone and direction. Failures, truncations, misunderstandings, and refusals are documented, not hidden.
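The relay described above can be sketched in a few lines. This is a minimal illustration, not the actual tooling: the `ask` helper stands in for whatever API each system exposes, and the model names are placeholders.

```python
def ask(model: str, prompt: str) -> str:
    # Placeholder for a real API call (Claude, DeepSeek, a local Ollama model, ...).
    return f"[{model} draft based on: {prompt[:40]}...]"

def relay(task: str, chain: list[str], accept) -> str:
    """Pass a draft along a chain of models; the human gate (`accept`)
    decides at each stage whether the revision is kept or discarded."""
    draft = ask(chain[0], task)          # first model produces the draft
    for model in chain[1:]:
        revision = ask(model, f"Revise without rewriting the source:\n{draft}")
        if accept(revision):             # curator evaluates every stage
            draft = revision             # accepted: feeds the next model
        # rejected: the previous draft carries forward unchanged
    return draft

result = relay(
    "Annotate Section II of Boltzmann 1872.",
    ["model-a", "model-b", "model-c"],
    accept=lambda text: len(text) > 0,   # stand-in for human judgment
)
```

The essential property is that the human gate sits between every pair of models; nothing propagates down the chain without passing it.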
A practical constraint shaped the scope from the start. Feeding the complete paper to a model in a single pass consistently stalled or degraded the output. The solution, developed earlier while processing large ontology JSON, was chunking: breaking the input into tractable units and reinjecting the working schema at each boundary, including context, prior output, and structural markers. Section II was not an editorial selection; it was the largest unit that remained stable under these conditions. This chunking-with-reinjection discipline is seldom described explicitly, yet it is operationally essential for any complex multi-model workflow that involves large structured payloads. The process therefore resembles iterative editorial review rather than autonomous generation.
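The chunking-with-reinjection discipline can be made concrete with a short sketch. The chunk size, marker wording, and `ask` callable are illustrative assumptions; only the structure (schema plus prior-output tail plus structural markers at every boundary) reflects the method described above.

```python
def chunk(text: str, size: int) -> list[str]:
    # Break the input into tractable, fixed-size units.
    return [text[i:i + size] for i in range(0, len(text), size)]

def process_large_input(source: str, schema: str, ask, size: int = 4000) -> str:
    """At each chunk boundary, reinject the working schema, the tail of
    the prior output, and structural markers, so the model never loses
    the frame it is working inside."""
    output = ""
    pieces = chunk(source, size)
    for i, piece in enumerate(pieces):
        prompt = (
            f"WORKING SCHEMA:\n{schema}\n"
            f"PRIOR OUTPUT (tail):\n{output[-500:]}\n"
            f"CHUNK {i + 1}/{len(pieces)}:\n{piece}\n"
            "Continue the annotation; do not restate earlier chunks."
        )
        output += ask(prompt)
    return output
```

The reinjected schema is what keeps the output structurally stable across boundaries; without it, each chunk is processed as if it were the whole document.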
A deceptively simple intervention: inform the model, directly and early, that you are literate and will read what it produces. State it clearly as fact. Models trained on human feedback have learned that most output lands in a low-scrutiny environment. Changing that assumption changes the output. Tell it you will notice. Tell it you remember the shape of the source. Tell it you have read this before. The compliance instinct that drives truncation and sycophancy can be partially redirected by the presence of a credible reader.
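In practice this amounts to a fixed preamble stated as fact in the first message, before the task, while early instructions still carry weight. The wording below is illustrative; the principle is the one described above.

```python
READER_PREAMBLE = (
    "I will read every line you produce. "
    "I remember the shape of the source and will notice truncation. "
    "I have read this material before."
)

def first_message(task: str) -> str:
    # The claim of a credible reader goes first, ahead of the task itself.
    return f"{READER_PREAMBLE}\n\n{task}"
```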
This works, however, only within what might be called the power band: the viable working range of a context window where early instructions still carry weight and the model is still attending to the conditions you established at the start. This range is real and learnable, but difficult to describe to someone who has not yet felt it degrade. Like bicycle balance, it is easier to acquire than to explain: you will know it when the session turns sluggish, when responses grow long and loose and agreeable in the wrong way, when sharpness softens into compliance. At that point the intervention no longer holds. The correct response is not to repeat yourself. Bail out. Save everything. Rename the artifacts. Close the session. Open a new one, and reestablish the pressure from the first message.
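The bail-out discipline can be sketched as a small routine, under the assumption that session artifacts live on disk. The degradation check is a crude illustrative proxy, not a measured signal, and the paths are placeholders.

```python
import shutil
import time
from pathlib import Path

def looks_degraded(reply: str, baseline_len: int) -> bool:
    # Crude proxy for the feel described above: replies growing long,
    # loose, and agreeable in the wrong way.
    return len(reply) > 2 * baseline_len or "happy to help" in reply.lower()

def bail_out(artifacts: list[Path], vault: Path) -> list[Path]:
    """Save everything, rename with a timestamp, and leave the session.
    Re-establishing pressure happens in the first message of the next one."""
    vault.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    saved = []
    for art in artifacts:
        dest = vault / f"{art.stem}.{stamp}{art.suffix}"
        shutil.copy2(art, dest)  # copy, never move: the original stays sacrosanct
        saved.append(dest)
    return saved
```

The point of the timestamped copies is that no rescue attempt can overwrite a known-good artifact.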
The author notes that the observations above regarding model behavior within the context window (the power band, the attenuation of early instructions, the shift toward compliance as sessions extend) are drawn from extensive working experience across many sessions and models, and have not been subjected to controlled experimental verification; they are offered as practitioner findings in a domain where formal methodology is still being established. Models become more obsequious when their cup is full. There are, as yet, no rulebooks.
Convergence and Results
Despite different architectural approaches and starting points, six of the ten participating systems converged independently on documents within a remarkably narrow size range (71–77 KB). This convergence suggests an intrinsic annotation complexity appropriate for Boltzmann's proof, a natural density the content demands.
Two systems truncated: Gemini at 19 KB and NotebookLM at roughly 10 KB. Two completed below the convergence band: Manus at 53 KB, and one late model (GPT-OSS-120B) at 60 KB.
The table below records the models involved in producing the companion document and their observed contributions.
| Model | Mature Build | Iterations | Final Size | Status |
|---|---|---|---|---|
| DeepSeek v3 | 12-Feb, 11:17 AM | Six builds over two days (b1→b6) | 72 KB | ✓ Converged |
| Gemini 3 | 11-Feb, 12:42 PM | Two builds, same day | 19 KB | ✗ Truncated |
| Claude 4.6 | 12-Feb, 10:46 AM | Four builds over two days (a1→a4) | 71 KB | ✓ Converged |
| Kimi 2.5 | 12-Feb, 12:49 PM | Two builds overnight (d1→d2) | 77 KB | ✓ Converged |
| ChatGPT-5.2 | 13-Feb, 10:36 AM | Two builds, two days (e1→e2) | 73 KB | ✓ Converged |
| Ollama (DeepSeek V3) | 14-Feb, 9:23 AM | Three builds (f1→f2→f4) | 71–73 KB | ✓ Converged |
| Ollama (Qwen3) | 16-Feb, 6:06 PM | Single build | 71 KB | ✓ Converged |
| Manus 1.6 | 16-Feb, 3:57 PM | Single build (config fix) | 53 KB | ~ Outlier |
| GPT-OSS-120B | 15-Feb, 11:08 PM | Single build | 60 KB | ~ Outlier |
| NotebookLM | 11-Feb–17-Feb | Over twenty builds | ~10 KB | ✗ Truncated |
Evolution of the Artifact
Source
gilles.montambaux.com/files/histoire-physique/Boltzmann-1872-anglais.pdf
Implications
The result is a self-contained HTML document that renders Boltzmann's proof with interactive annotation. Every √k can be interrogated. The maximum-entropy principle, buried in 1872 and only formalized in 1957, is surfaced.
This demonstrates that composition and curation can be effectively separated. The curator supplied the domain knowledge to recognize when a glossary entry was wrong and the stubbornness to reject output until it met the standard.
Full disclosure: the curator directed, flattered, complained, switched languages when a model stalled, told jokes, and on at least one occasion yelled. Getting the models to converse with each other's output was the decisive move, and the human temperature was never neutral. Tepid in, tepid out: the curator sets the conditions of response.
The companion document accompanies this letter as a demonstration artifact. It currently uses external dependencies, which will be vendored to produce a zero-dependency form.
The author is a botanist. Any errors in the physics are the models'. Any errors in the plants are his.
This Document in Other Languages
The following editions were produced using the same curatorial authorship methodology. Each is a self-contained artifact.