The 20-Year File:
A Zero-Dependency Architecture for Sustainable Data Visualization

T. M. Jones, PhD

Abstract

This technical note describes a practical architecture for publishing interactive data visualizations as a single, durable HTML file. The goal is straightforward: an artifact that remains usable without a server, a build step, a package manager, or a network connection.

The implementation uses native browser standards, Vanilla JavaScript (standard ECMAScript) and CSS3, and vendors any required visualization code directly into the document. The dataset is stored inline so the page opens “already loaded,” avoiding runtime fetch calls and reducing failure modes tied to external infrastructure.

In addition to rendering charts, the document supports “living” behavior at load time: it recomputes durations, refreshes date-dependent labels, and updates summary text based on the newest data present in the embedded payload. This keeps the narrative consistent as the dataset grows, without server-side processing.

Performance was evaluated using Lighthouse and PageSpeed Insights on the production build. Results indicate strong real-world interactivity and accessibility on typical devices. This performance reflects an explicit trade-off: a monolithic file incurs a parsing and initialization cost that grows with total payload size.

Within those constraints, standard growth models suggest that long-term operability over multi-century timescales is mathematically possible. In practice, a ceiling of roughly 5MB defines the intended operating range for archival-grade, single-file publications.

Availability: The live artifact is accessible at https://tjid3.org/

The Context: The Crisis of Archival Risk

Longitudinal datasets often retain scientific value for decades. Interactive visualizations that accompany them often do not. Many modern web deployments depend on layered tooling—frameworks, build pipelines, CDNs, and versioned packages—that can fail or become difficult to reproduce over time.

When a visualization is tightly coupled to that tooling, the research may remain intact while the interface becomes unusable. This is an archival risk: the record still exists, but the instrument for exploring it cannot be reliably executed in the future.

For the Michigan Cannabis market dataset used here, the design constraint was explicit: the complete interactive paper must run as a single HTML file opened locally in a standard browser. The artifact must function offline, without installation steps, and without external requests.

The "Evergreen" Architecture

The solution utilizes a pattern I call "Gojira & Gamera." It separates the data payload from the visualization engine, keeping both within the DOM but logically distinct.

1. Gojira (The Data Engine)

Instead of fetching data from an external database or a CSV file, the full dataset is embedded directly into the HTML as a pre-hydrated JavaScript array. This avoids asynchronous loading states and network latency entirely.

// Example of the "Gojira" embedded data structure
const rawData = [
    ["10/1/2019", "Oct", 2019, "$28,594,402", ...],
    ["11/1/2019", "Nov", 2019, "$26,628,235", ...],
    // ... 60+ months of data
];
Listing 1: Data embedding strategy. The raw array structure injects the complete dataset into the DOM, eliminating fetch requests.
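
As a sketch of how such a row might be normalized at load time (the helper name, and any column layout beyond what Listing 1 shows, are illustrative assumptions rather than the production code):

// Minimal sketch: normalize one embedded row into a typed record.
// Column order follows Listing 1: [date, month label, year, sales, ...].
// Non-ISO date strings such as "10/1/2019" parse consistently in major
// browser engines, though strictly the behavior is implementation-defined.
const parseRow = ([dateStr, monthLabel, year, salesStr]) => ({
    date:  new Date(dateStr),
    month: monthLabel,
    year:  year,
    sales: Number(salesStr.replace(/[$,]/g, ''))   // "$28,594,402" -> 28594402
});

const records = rawData.map(parseRow);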

2. Gamera (The Visualization Logic)

The application logic uses an IIFE (Immediately Invoked Function Expression) to encapsulate state, preventing pollution of the global namespace. It utilizes Vanilla JavaScript to manipulate the DOM directly, bypassing the Virtual DOM overhead inherent in complex frameworks.

// Scope-Safe Application Logic
const App = (() => {
    const state = {
        data: [],
        charts: {},
        activeFilter: 'flower'
    };

    const init = () => {
        // Hydrate data directly from memory
        state.data = Logic.process(rawData);
        Render.charts();
    };

    return { init };
})();
Listing 2: Scope encapsulation. The IIFE keeps application state private and out of the global namespace.

For the visualization layer, the project relies on Chart.js. To adhere strictly to the zero-dependency constraint, the library is treated not as a remote dependency to be fetched but as a brick cemented into the wall of the application: its source code was minified and vendored (embedded directly into the document), so the visualization engine functions as immutable infrastructure intended to outlast any package manager or CDN.
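
As an illustrative sketch (the canvas id and chart configuration are assumptions, not the production markup), the vendored library is invoked through its ordinary global constructor, with no script tag pointing off-document:

// Minimal sketch: the minified Chart.js source sits in an earlier inline
// <script> block of the same file, so the Chart constructor is already defined.
const ctx = document.getElementById('sales-chart').getContext('2d');   // canvas id is illustrative
new Chart(ctx, {
    type: 'line',
    data: {
        labels: rawData.map(row => `${row[1]} ${row[2]}`),              // "Oct 2019", "Nov 2019", ...
        datasets: [{
            label: 'Monthly sales (USD)',
            data: rawData.map(row => Number(row[3].replace(/[$,]/g, '')))
        }]
    },
    options: { responsive: true }
});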

3. Zero-Dependency Styling

The architecture uses CSS Custom Properties for theming. This allows for instant Dark/Light mode switching through a single attribute change, eliminating the need to parse or load external CSS libraries.
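
A minimal sketch of that attribute-driven switch (the data-theme attribute, variable names, and colors are illustrative; the production stylesheet may differ):

// CSS side (inside the document's single <style> block), sketched as comments:
//   :root               { --bg: #ffffff; --fg: #111111; }
//   [data-theme="dark"] { --bg: #111111; --fg: #eeeeee; }
//   body                { background: var(--bg); color: var(--fg); }
//
// JS side: one attribute change re-themes everything that reads the custom properties.
const toggleTheme = () => {
    const root = document.documentElement;
    root.dataset.theme = root.dataset.theme === 'dark' ? 'light' : 'dark';
};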

Performance & "Living" Behavior

The architecture rests on a simple claim: native ECMAScript, when structured carefully, can outperform complex frameworks for evergreen data artifacts whose narrative evolves as the dataset grows.

Unlike a static report, this document functions as a state engine. Upon loading the JSON payload, the document performs a "self-audit," recalculating the temporal duration (e.g., updating "6-Year Trends" to "7-Year Trends"), refreshing the citation year, and revising summary statistics in the DOM text. Because this logic executes linearly without the overhead of framework hydration or Virtual DOM diffing, the document achieves near-immediate First Contentful Paint (FCP) and near-instant interactivity, regardless of the dataset's growing size.
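
As a sketch of that self-audit step (element ids and the label format are assumptions; only the year column comes from Listing 1):

// Minimal sketch: recompute date-dependent text from the newest embedded row.
const selfAudit = (rows) => {
    const firstYear = rows[0][2];                     // year column from Listing 1
    const lastYear  = rows[rows.length - 1][2];
    const span      = lastYear - firstYear + 1;

    // e.g. "6-Year Trends" becomes "7-Year Trends" once a new year of data lands
    document.getElementById('trend-title').textContent   = `${span}-Year Trends`;
    // keep the citation year pinned to the newest data point
    document.getElementById('citation-year').textContent = String(lastYear);
};
selfAudit(rawData);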

Google PageSpeed Insights (PSI) audits and standalone Lighthouse analyses were conducted on the production build to evaluate the effectiveness of the optimization strategy. The results confirmed the stability goals while highlighting the trade-offs inherent in a monolithic design:

Snapshot date: 2025-12-21 (production build)
Performance (PSI): 100 · Accessibility: 100 · Best Practices: 100 · SEO: 100
Figure 1a: Google PageSpeed Insights (PSI) scores, captured 2025-12-21. These scores reflect the document's real-world performance for the majority of users on average devices and networks.

The Lighthouse Performance score of ~90, shown in the Lighthouse scores view of the infographic, reflects the cost of embedding the full charting library (Gamera) directly into the document. This score is generated under a simulated low-end mobile CPU profile and represents the expected overhead of satisfying the zero-dependency requirement.

The primary penalty contributing to this score is Total Blocking Time (TBT), which captures how long the main thread is blocked by long synchronous tasks on constrained devices. In this implementation, the impact of TBT was explicitly managed by deferring initialization of the heavy script using window.requestIdleCallback. This approach preserves main-thread availability during the initial render, allowing immediate content display and broad device accessibility.
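
A minimal sketch of that deferral (App.init is the entry point from Listing 2; the timeout value and the setTimeout fallback for browsers without requestIdleCallback are assumptions):

// Minimal sketch: let the first paint happen, then initialize the heavy script.
const deferInit = () => {
    if ('requestIdleCallback' in window) {
        window.requestIdleCallback(() => App.init(), { timeout: 2000 });
    } else {
        setTimeout(() => App.init(), 0);   // fallback where requestIdleCallback is unavailable
    }
};
document.addEventListener('DOMContentLoaded', deferInit);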

Limitations: The 5MB Ceiling

This system prioritizes stability and long-term preservation over the scalability required for large transactional or streaming data applications. It is designed for typical scientific publishing, where datasets are modest in size but require reliable, indefinite longevity.

The architecture is implemented as a zero-dependency, single HTML file. This design introduces a practical performance ceiling of approximately 5MB, beyond which user experience degrades due to download and parsing latency. Within this constraint, performance remains predictable and manageable, aided by the compact, domain-specific data schema.

Because the data structure is lean and purpose-built, the dataset grows at approximately 5.3 KB per year. With a 34KB payload representing seven years of production data and a 5MB ceiling, simple extrapolation indicates that the architecture can remain performant for centuries, well beyond the stated 20-year archival target.
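
Taking those figures at face value, the remaining headroom is roughly (5,120 KB - 34 KB) ÷ 5.3 KB per year ≈ 960 years of additional growth before the ceiling is reached, which is the basis for the multi-century claim above.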

Conclusion

This work shows that a zero-dependency, single-file HTML publication can support interactive, multi-series visualization while minimizing long-term operational requirements. By embedding the dataset and vendoring the visualization code, the artifact remains executable offline and avoids failure modes associated with external infrastructure.

The approach is intended for modest-to-medium datasets where durability and reproducibility take precedence over large-scale streaming, complex authentication, or rapid feature turnover. Within the practical file-size ceiling described above, a monolithic HTML artifact can function as a stable, citable interface for longitudinal research and archival distribution.