The 20-Year File:
A Zero-Dependency Architecture for Digital Paratypes

T. M. Jones, PhD

Abstract

This technical note describes a practical way to publish datasets and visualizations as a single, durable HTML file that runs offline. The implementation uses only native browser standards: Vanilla JavaScript (standard ECMAScript) and CSS3. All visualization code is vendored into the document, and the dataset is stored inline, eliminating fetch calls and external dependencies. Borrowing from taxonomy: if the digital holotype is the raw dataset, this describes a "digital paratype," packaging analysis and figures alongside data in a single archival file. The methodology is demonstrated in the Michigan Cannabis Market Analysis (DOI: 10.5281/zenodo.18065297), a longitudinal economic dataset (74 months at publication).

Unlike a static PDF, the document rewrites itself at load time, regenerating durations, refreshing labels, and updating summary text from embedded data. This keeps the narrative aligned as the dataset grows. The "evergreen" workflow means replacing the monthly JSON payload triggers automatic regeneration without rebuilds. The same approach supports multilingual publication; six languages were added without an API. Performance validation using PageSpeed Insights shows strong interactivity and accessibility on typical devices. Standard growth models suggest multi-century operability is mathematically possible.

Unlike physical specimens that require protection from damage, these digital artifacts survive through distribution. The preservation strategy shifts from custody to replication. At 325KB for seven years of data, the artifact functions as "viral data": distributable via email, thumb drives, or web hosting while remaining completely self-contained.

Availability: The live artifact is accessible at https://tjid3.org/

The Context: Archival Risk

This methodology paper describes a self-modifying HTML architecture demonstrated in the Michigan Cannabis Market Analysis (DOI: 10.5281/zenodo.18065297), a longitudinal economic dataset tracking 74 months of regulatory data. The artifact executes its own code to regenerate narrative sections each time new data is added, maintaining alignment between analysis and evidence without external build processes.

This is not a static webpage, a web application, or self-modifying code in the traditional sense. It is a single HTML file that, when opened, executes code to rewrite its own narrative sections from embedded data, generating current summaries, conclusions, and analytical text based on the dataset's current state. This is a category that does not yet have a name. Borrowing from taxonomic nomenclature, these artifacts are termed digital paratypes.

Longitudinal datasets often retain scientific value for decades. The interactive visualizations and data payloads that accompany them often do not. Many modern web deployments depend on layered toolchains, build pipelines, CDNs, and versioned packages that can disappear or become difficult to reproduce over time. When a visualization is tightly coupled to that tooling stack, the research may remain intact while the interface becomes unusable. This is an archival risk: the record still exists, but the instrument for exploring it cannot be reliably executed in the future.

Physical archives can be lost to fire. Water damage can erase them. War can destroy them. Simple attrition can remove them, quietly. Digital artifacts fail in a different way. They depend on infrastructure that degrades over time. Frameworks churn. CDNs break. APIs are deprecated. Package managers become inaccessible.

This pattern might be termed "complexity theater": sophisticated architectural choices designed for enterprise-scale challenges, applied to research artifacts that require longevity rather than scalability. The infrastructure is impressive. The tooling is elaborate. But for archival purposes, the complexity becomes the vulnerability.

The digital preservation community has long recognized that replication protects against institutional failure. Stanford's LOCKSS program ("Lots of Copies Keep Stuff Safe") pioneered distributed preservation networks for academic journals, demonstrating that geographically dispersed copies under independent administration provide robust defense against content loss. However, these systems require coordinated institutional infrastructure: networked repositories, automated auditing protocols, and persistent organizational commitment.

A zero-dependency architecture inverts this model. Instead of institutions maintaining copies through active curation, the artifact's self-contained portability enables friction-free distribution. The preservation strategy shifts from custody to replication. A single HTML file can be copied indefinitely. Redundant storage is easy. Opening the file requires no installation. A network connection is optional. Physical type specimens are fragile because they cannot be duplicated. Framework-dependent visualizations are fragile because they cannot be duplicated as working objects. Zero-dependency artifacts survive through distribution itself.

There is no single specimen to protect. The file functions as the reference artifact. Every copy remains complete. Distribution is the defense.

This approach applies Fernand Braudel's la longue durée to digital preservation, framing the work in centuries rather than quarterly cycles. By looking past transient frameworks and architectural trends, the focus remains on enduring web standards—the infrastructural layer that survives technological shifts.

For the Michigan Cannabis market dataset, the design constraint was explicit: the complete interactive paper must run as a single HTML file opened locally in a standard browser. The artifact must function offline, without installation steps, and without external requests.

The dataset is structured using a domain-specific ontology optimized for compactness - multiple years of regulatory data totaling 34KB through consistent schema design. Each monthly update triggers the document to rewrite its own narrative sections, regenerating summaries and updating visualizations from the embedded data payload.

The "Evergreen" Architecture

The solution utilizes a pattern I call "Gojira & Gamera." It separates the data payload from the visualization engine, keeping both within the DOM but logically distinct.

1. Gojira (The Data Engine)

Instead of fetching data from an external database or a CSV file, the full dataset is embedded directly into the HTML as a pre-hydrated JSON object. This avoids asynchronous loading states and network latency entirely.

// Example of the "Gojira" embedded data structure
const rawData = [
    ["10/1/2019", "Oct", 2019, "$28,594,402", ...],
    ["11/1/2019", "Nov", 2019, "$26,628,235", ...],
    // ... 60+ months of data
];
Listing 1: Data embedding strategy. The raw array structure injects the complete dataset into the DOM, eliminating fetch requests.
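
Because each row is a positional array rather than a keyed object, field names and types live once in the processing logic instead of being repeated in every record, which keeps the payload compact. The sketch below shows one way such rows might be hydrated into named, typed records; the column order and the hydrateRows helper are illustrative assumptions, not the production schema.

// Illustrative hydration step: map positional rows to named, typed records.
// Column order and field names are assumptions, not the production schema.
const hydrateRows = (rows) =>
    rows.map(([date, monthLabel, year, totalSales]) => ({
        date: new Date(date),                                // "10/1/2019" -> Date
        monthLabel,                                          // "Oct"
        year,                                                // 2019
        totalSales: Number(totalSales.replace(/[$,]/g, ''))  // "$28,594,402" -> 28594402
    }));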

2. Gamera (The Visualization Logic)

The application logic uses an IIFE (Immediately Invoked Function Expression) to encapsulate state, preventing pollution of the global namespace. It utilizes Vanilla JavaScript to manipulate the DOM directly, bypassing the Virtual DOM overhead inherent in complex frameworks.

// Scope-Safe Application Logic
const App = (() => {
    const state = {
        data: [],
        charts: {},
        activeFilter: 'flower'
    };

    const init = () => {
        // Hydrate data directly from memory; Logic and Render are
        // helper modules defined elsewhere in the same file.
        state.data = Logic.process(rawData);
        Render.charts();
    };

    return { init };
})();
Listing 2: Scope encapsulation. The IIFE creates an isolated execution context, keeping application state out of the global namespace.
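
Because the data is already resident in memory, initialization does not wait on any network callback; one minimal wiring, assuming the App object from Listing 2, runs it as soon as the document has been parsed.

// Run initialization once the DOM is parsed.
// No loading state is needed: the data is already embedded in the page.
document.addEventListener('DOMContentLoaded', App.init);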

For the visualization layer, the project relies on Chart.js. To strictly adhere to the zero-dependency constraint, the library was treated not as a remote dependency to be fetched, but as a brick cemented into the wall of the application. Using the practice known as "vendoring," the minified library source was embedded directly into the document, ensuring the visualization engine functions as immutable infrastructure intended to outlast any package manager or CDN.
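
Concretely, vendoring reduces to inlining the library source ahead of the data payload and the application logic inside one HTML document. The skeleton below is a simplified sketch of that layout, with placeholder comments and an illustrative canvas id, not the production markup.

<!-- Simplified single-file layout (illustrative, not the production markup) -->
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Artifact title</title>
  <style>/* all styles inlined here */</style>
</head>
<body>
  <canvas id="salesChart"></canvas>
  <script>/* Chart.js source, minified and pasted verbatim (vendored) */</script>
  <script>/* "Gojira": the embedded rawData payload (Listing 1) */</script>
  <script>/* "Gamera": the App IIFE and rendering logic (Listing 2) */</script>
</body>
</html>

Script order matters: the vendored library is parsed first, then the data, then the logic that consumes both.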

3. Zero-Dependency Styling

The architecture uses CSS Custom Properties for theming. This allows for instant Dark/Light mode switching through a single attribute change, eliminating the need to parse or load external CSS libraries.
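
A minimal sketch of the pattern, with illustrative token names rather than the production stylesheet: theme tokens are declared once on :root, overridden under a data attribute, and every rule reads the tokens.

/* Illustrative theme tokens; names are examples, not the production stylesheet */
:root {
  --bg: #ffffff;
  --fg: #1a1a1a;
}
[data-theme="dark"] {
  --bg: #121212;
  --fg: #e0e0e0;
}
body {
  background: var(--bg);
  color: var(--fg);
}

Switching themes is then a single attribute change, for example document.documentElement.setAttribute('data-theme', 'dark'), with no additional stylesheet to load or parse.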

Performance & "Living" Behavior

The architecture is built entirely on native ECMAScript. When architected correctly, native ECMAScript can outperform complex frameworks for evergreen artifacts whose narrative evolves dynamically as the dataset grows.

Unlike a static report, this document functions as a state engine. Upon loading the JSON payload, the document performs a "self-audit," recalculating the temporal duration (e.g., updating "6-Year Trends" to "7-Year Trends"), refreshing the citation year, and revising summary statistics in the DOM text. Because this logic executes linearly without the overhead of framework hydration or Virtual DOM diffing, the document achieves near-immediate First Contentful Paint (FCP) and near-instant interactivity, regardless of the dataset's growing size.
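
A condensed sketch of such a self-audit, assuming hydrated records like those in the earlier hydration sketch (sorted oldest to newest) and illustrative element IDs:

// Illustrative self-audit: derive durations and labels from the embedded data
// and write them back into the document. Element IDs are assumptions.
const selfAudit = (records) => {
    const first = records[0].date;
    const last = records[records.length - 1].date;
    const months = (last.getFullYear() - first.getFullYear()) * 12
                 + (last.getMonth() - first.getMonth()) + 1;           // e.g. 74
    const yearsSpanned = last.getFullYear() - first.getFullYear() + 1; // e.g. 7

    document.getElementById('trend-heading').textContent = `${yearsSpanned}-Year Trends`;
    document.getElementById('citation-year').textContent = String(last.getFullYear());
    document.getElementById('month-count').textContent = `${months} months`;
};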

Catchpoint Synthetic Monitoring reports provide controlled, repeatable performance measurements from their global test nodes under standardized conditions (100ms latency, 5Mbps throughput). These lab-grade metrics, including start render, first contentful paint, and total blocking time, validate the artifact’s sub-second rendering performance in a consistent testing environment—offering a reproducible benchmark distinct from real-user variability.

Google PageSpeed Insights (PSI) audits were conducted on the production build to evaluate the effectiveness of the optimization strategy. The results confirmed the stability goals while highlighting the trade-offs inherent in a monolithic design.

The artifact is a functional template. To fork it: save the single HTML page, then update the <title> and replace the domain-specific content with your own. This works through direct derivation: each new artifact starts as a copy of this complete and self-contained reference.

Snapshot Date: 2026-01-06 (Production Build)

Start Render    0.5s
FCP             0.8s
LCP             0.8s
Speed Index     0.8s
TBT             0.1s
CLS             0.002
Page Weight     250KB
DC Time         0.8s
DC Bytes        250KB
Total Time      1.0s

Figure 1a: Catchpoint performance metrics for the artifact, captured from the production deployment to provide a repeatable baseline.

Limitations: The 5MB Ceiling

This system prioritizes stability and long-term preservation over the scalability required for large transactional or streaming data applications. It is designed for typical scientific publishing, where datasets are modest in size but require reliable longevity.

The architecture is implemented as a zero-dependency, single HTML file. This design introduces a practical performance ceiling of approximately 5MB, beyond which user experience degrades due to download and parsing latency. Within this constraint, performance remains predictable and manageable.

This architecture is designed for single-author research workflows, not collaborative team development. The monolithic structure prevents the division of labor typical in modern web development - where CSS specialists, JavaScript developers, backend engineers, and content writers work in parallel on separate files merged through build processes. Digital paratypes trade collaborative infrastructure for archival independence.

However, this constraint enables a different form of collaboration: derivative reuse. The artifact can be copied, modified, and forked for new work without institutional permission or infrastructure dependencies. The embedded data can be extracted and repurposed. A digital paratype can be forked to create new independent artifacts - each complete and functional, requiring no coordination with the original author. This differs from collaborative platforms, where data access depends on ongoing institutional support and server infrastructure.
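
As one concrete extraction path, assuming the embedded array keeps the rawData name shown in Listing 1, the full payload can be lifted out of any opened copy from the browser's developer console:

// In the DevTools console of an opened copy: serialize the embedded payload
// and place it on the clipboard for reuse. copy() is a console utility
// provided by the browser's developer tools, not part of the page itself.
copy(JSON.stringify(rawData, null, 2));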

Arithmetic Projections

The data structure is lean and purpose‑built. The dataset grows at approximately 5.3 KB per year. With a 34 KB payload representing seven years of production data and a 5 MB ceiling, the architecture can remain performant for centuries, well beyond the stated 20‑year archival target.

Metric                      Value
Current payload size        34 KB
Annual growth rate          ≈5.3 KB/year
Ceiling (practical limit)   5 MB
Years to ceiling            ≈900 years
Size at 20 years            ≈140 KB
(Derived from Michigan CRA data)

Source: Michigan Digital Paratype archive index (74 months, October 2019–November 2025).
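
The projection reduces to simple arithmetic; a minimal check using the values in the table above:

// Back-of-the-envelope check of the projections above.
const payloadKB = 34;          // current payload
const growthKBPerYear = 5.3;   // observed annual growth
const ceilingKB = 5 * 1024;    // ≈5 MB practical limit

const yearsToCeiling = (ceilingKB - payloadKB) / growthKBPerYear;
// ≈ 960, the same order of magnitude as the ≈900-year figure above
const sizeAt20Years = payloadKB + 20 * growthKBPerYear; // 140 KB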

Conclusion

This work shows that a zero-dependency, single-file HTML publication can support interactive data storage and multi-series visualization while minimizing long-term operational requirements. By embedding the dataset and vendoring the visualization code, the artifact remains executable offline and avoids failure modes associated with external infrastructure.

This work demonstrates what appears to be a novel document architecture: artifacts that regenerate their own analysis from embedded data. While self-contained HTML documents and executable research compendia exist separately, no examples were found combining zero-dependency execution with self-modifying narrative generation. The Michigan Cannabis analysis contains the complete logic to rewrite its narrative sections when new monthly data is inserted. Each version can produce the next version. This is not traditional reproducible research, where external users verify results by re-running code. This is self-modification by design: the document maintains its own coherence through execution.

The implications extend beyond technical novelty. If documents can contain the process of their own regeneration, they become digital paratypes in a deeper sense than taxonomic metaphor suggests. They preserve not just content, but the machinery of authorship itself. Each copy distributed is a complete, functional replica capable of producing future iterations.

The approach is intended for modest-to-medium datasets where durability and reproducibility take precedence over large-scale streaming, complex authentication, or rapid feature turnover. Within the practical file-size ceiling described above, a monolithic HTML artifact can function as a stable, citable interface for longitudinal research and archival distribution. At 325KB for several years of data, this approach demonstrates preservation through viral distribution rather than institutional custody. The artifact survives by being copied, not protected.

For data that must persist, distribution remains the defense. This is a working implementation, not a proposal. The methodology is demonstrated at https://tjid3.org/.