Ludwig Boltzmann · 1872

Thermal equilibrium of gas molecules

p. 299 · Boltzmann, Kinetic Theory

From these equations it can again be proved that

E = u_1 \log u_1 + \sqrt{2}\,u_2 \log u_2 + \dots + \sqrt{p}\,u_p \log u_p

must always decrease unless $u_2^2 = u_1 u_3,\; u_3^2 = u_2 u_4,\; \dots$ (in other words, unless all the expressions multiplied by the coefficients $B$ in equations (35) vanish). Equations (35) have the inconvenient feature that, while they can be written with summation formulae, they cannot be written out completely explicitly. It would undoubtedly be an aid to clarity, therefore, if we begin with the simplest case and then proceed gradually to the general case.

First let $p = 3$; the molecules can only have three different kinetic energies, $\epsilon$, $2\epsilon$, and $3\epsilon$. Then the system of equations (35) reduces to the following three equations:

\begin{aligned} \sqrt{1}\,\frac{du_1}{dt} &= B_{11}^{22}\,(u_2^2 - u_1 u_3) \\[4pt] \sqrt{2}\,\frac{du_2}{dt} &= 2B_{11}^{22}\,(u_1 u_3 - u_2^2) \\[4pt] \sqrt{3}\,\frac{du_3}{dt} &= B_{11}^{22}\,(u_2^2 - u_1 u_3) \end{aligned} \tag{36}

and the expression for $E$ reduces to

E = u_1 \log u_1 + \sqrt{2}\,u_2 \log u_2 + \sqrt{3}\,u_3 \log u_3

The differentiation gives

\frac{dE}{dt} = (\log u_1 + 1)\frac{du_1}{dt} + \sqrt{2}(\log u_2 + 1)\frac{du_2}{dt} + \sqrt{3}(\log u_3 + 1)\frac{du_3}{dt}

or, with a different arrangement of the terms,

\frac{dE}{dt} = \log u_1 \frac{du_1}{dt} + \sqrt{2}\log u_2 \frac{du_2}{dt} + \sqrt{3}\log u_3 \frac{du_3}{dt} + \left( \frac{du_1}{dt} + \sqrt{2}\frac{du_2}{dt} + \sqrt{3}\frac{du_3}{dt} \right)

The sum of the last three terms vanishes according to equations (36) so that one obtains $dE/dt$ by multiplying the first of these equations by $\log u_1$, the second by $\log u_2$, and the third by $\log u_3$, and adding all three together. If one does this he obtains

\frac{dE}{dt} = B_{11}^{22}\,(u_2^2 - u_1 u_3)\bigl(\log u_1 + \log u_3 - 2\log u_2\bigr)

\frac{dE}{dt} = B_{11}^{22}\,(u_2^2 - u_1 u_3)\,\log\!\left(\frac{u_1 u_3}{u_2^2}\right)

Of the two factors multiplying $B_{11}^{22}$ on the right-hand side of this equation, the first is positive and the second is negative when $u_2^2 > u_1 u_3$, whereas the first is negative and the second is positive when $u_2^2 < u_1 u_3$. Hence their product is always negative, and since $B_{11}^{22}$ must be positive, $dE/dt$ is always negative or zero. The latter is true when $u_2^2 = u_1 u_3$. Now it can easily be shown that $E$ cannot become negatively infinite. Obviously none of the three quantities $u_1, u_2, u_3$ may be negative or imaginary. For positive $u$, however, $u\log u$ cannot have a larger negative value than $-1/e$, hence $E$ cannot have a larger negative value than

-\frac{1 + \sqrt{2} + \sqrt{3}}{e}

where $e$ is the base of natural logarithms. Therefore $E$, since its derivative cannot be positive, must continually approach a minimum for which $dE/dt = 0$, and for which $u_2^2 = u_1 u_3$.
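The monotone decrease of $E$ for $p = 3$ can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates equations (36) with a forward-Euler step, written so that the three right-hand sides are $+1$, $-2$, $+1$ times $B(u_2^2 - u_1 u_3)$, which is the form under which the bracketed sum in $dE/dt$ vanishes. The rate constant `B = 1`, the initial values, the step size, and all function names are assumptions chosen for illustration.

```python
import math

def rhs_p3(u, B=1.0):
    """Return (du1/dt, du2/dt, du3/dt) for the three-level system (36)."""
    u1, u2, u3 = u
    d = u2 * u2 - u1 * u3                    # common factor u_2^2 - u_1 u_3
    return (B * d, -2.0 * B * d / math.sqrt(2.0), B * d / math.sqrt(3.0))

def E_p3(u):
    """E = u1 log u1 + sqrt(2) u2 log u2 + sqrt(3) u3 log u3."""
    return sum(math.sqrt(k) * x * math.log(x) for k, x in enumerate(u, start=1))

def integrate_p3(u0, dt=1e-3, steps=20000, B=1.0):
    """Forward-Euler integration; returns the final state and the history of E."""
    u = tuple(u0)
    history = [E_p3(u)]
    for _ in range(steps):
        du = rhs_p3(u, B)
        u = tuple(x + dt * dx for x, dx in zip(u, du))
        history.append(E_p3(u))
    return u, history

# Assumed initial state, far from u_2^2 = u_1 u_3.
u_final, E_hist = integrate_p3((0.2, 1.0, 0.3))
u1, u2, u3 = u_final
print("u_2^2 - u_1 u_3 at the end:", u2 * u2 - u1 * u3)
print("E at start and end:", E_hist[0], E_hist[-1])
```

In this sketch $E$ falls at every step (up to rounding) and the run ends with $u_2^2$ indistinguishable from $u_1 u_3$, while $u_1 + \sqrt{2}u_2 + \sqrt{3}u_3$ stays at its initial value.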

The proof can be carried out in just the same way when $p > 3$. I consider here only the case $p = 4$. In this case equations (35) reduce to:

\begin{aligned} \sqrt{1}\frac{du_1}{dt} &= B_{11}^{22}(u_2^2 - u_1u_3) + (B_{12}^{23}+B_{11}^{24})(u_2u_3 - u_1u_4) \\[4pt] \sqrt{2}\frac{du_2}{dt} &= 2B_{11}^{22}(u_1u_3 - u_2^2) + (B_{12}^{23}+B_{11}^{24})(u_1u_4 - u_2u_3) + B_{11}^{33}(u_3^2 - u_2u_4) \\[4pt] \sqrt{3}\frac{du_3}{dt} &= B_{11}^{22}(u_2^2 - u_1u_3) + (B_{12}^{23}+B_{11}^{24})(u_1u_4 - u_2u_3) + 2B_{11}^{33}(u_2u_4 - u_3^2) \\[4pt] \sqrt{4}\frac{du_4}{dt} &= (B_{12}^{23}+B_{11}^{24})(u_2u_3 - u_1u_4) + B_{11}^{33}(u_3^2 - u_2u_4) \end{aligned} \tag{37}

For $E$ one finds

E = u_1\log u_1 + \sqrt{2}u_2\log u_2 + \sqrt{3}u_3\log u_3 + \sqrt{4}u_4\log u_4

\frac{dE}{dt} = \log u_1\frac{du_1}{dt} + \sqrt{2}\log u_2\frac{du_2}{dt} + \sqrt{3}\log u_3\frac{du_3}{dt} + \sqrt{4}\log u_4\frac{du_4}{dt}

If one substitutes here for the derivatives their values from equations (37), he obtains, with a suitable rearrangement of terms,

\begin{aligned} \frac{dE}{dt} = B_{11}^{22}(u_2^2-u_1u_3)\log\frac{u_1u_3}{u_2^2} &+ B_{11}^{33}(u_3^2-u_2u_4)\log\frac{u_2u_4}{u_3^2} \\[4pt] &+ (B_{12}^{23}+B_{11}^{24})(u_1u_4 - u_2u_3)\log\frac{u_2u_3}{u_1u_4} \end{aligned}

I remark that the change in the order of the summands, which is necessary here, is analogous to our previous transformation of definite integrals. From the above expression one sees at once that $dE/dt$ is again necessarily negative, unless simultaneously we have

u_2^2 = u_1u_3,\qquad u_3^2 = u_2u_4,\qquad u_2u_3 = u_1u_4

which can also be written

u_3 = \frac{u_2^2}{u_1},\qquad u_4 = \frac{u_2^3}{u_1^2}.

Likewise one finds in the general case that $dE/dt$ is necessarily negative so that $E$ must decrease unless

u_3 = \frac{u_2^2}{u_1},\quad u_4 = \frac{u_2^3}{u_1^2},\quad\ldots\tag{38}

Since $E$ cannot have a larger negative value than

-\frac{1+\sqrt{2}+\sqrt{3}+\dots+\sqrt{p}}{e}\tag{39}

it must necessarily approach a minimum value for which equations (38) hold. Thus it continually approaches the distribution of states determined by equations (38).
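The approach to the distribution (38) for $p = 4$ can be illustrated in the same numerical spirit. This is a sketch under stated assumptions rather than a reproduction of the author's calculation: it takes the three energy-conserving exchanges $(2,2)\leftrightarrow(1,3)$, $(3,3)\leftrightarrow(2,4)$ and $(2,3)\leftrightarrow(1,4)$, each with its own positive rate constant. The names `Ba`, `Bb`, `Bc`, their values, the initial state, and the forward-Euler step are all hypothetical choices made for the example.

```python
import math

ROOTS = [math.sqrt(k) for k in (1, 2, 3, 4)]

def rhs_p4(u, Ba=1.0, Bb=0.7, Bc=0.4):
    """Return du_k/dt (k = 1..4) for the four-level system."""
    u1, u2, u3, u4 = u
    da = u2 * u2 - u1 * u3      # net rate factor of (2,2) -> (1,3)
    db = u3 * u3 - u2 * u4      # net rate factor of (3,3) -> (2,4)
    dc = u2 * u3 - u1 * u4      # net rate factor of (2,3) -> (1,4)
    # s_k = sqrt(k) * du_k/dt: each exchange adds its products, removes its reactants.
    s1 = Ba * da + Bc * dc
    s2 = -2.0 * Ba * da + Bb * db - Bc * dc
    s3 = Ba * da - 2.0 * Bb * db - Bc * dc
    s4 = Bb * db + Bc * dc
    return tuple(s / r for s, r in zip((s1, s2, s3, s4), ROOTS))

def E_p4(u):
    """E = sum over k of sqrt(k) * u_k * log(u_k)."""
    return sum(r * x * math.log(x) for r, x in zip(ROOTS, u))

def integrate_p4(u0, dt=1e-3, steps=40000):
    """Forward-Euler integration; returns the final state and the history of E."""
    u = tuple(u0)
    history = [E_p4(u)]
    for _ in range(steps):
        du = rhs_p4(u)
        u = tuple(x + dt * dx for x, dx in zip(u, du))
        history.append(E_p4(u))
    return u, history

u, E_hist = integrate_p4((1.0, 0.2, 0.6, 0.1))
u1, u2, u3, u4 = u
print("u3 vs u2^2/u1:  ", u3, u2 ** 2 / u1)
print("u4 vs u2^3/u1^2:", u4, u2 ** 3 / u1 ** 2)
```

With these assumed values the final state satisfies $u_3 = u_2^2/u_1$ and $u_4 = u_2^3/u_1^2$ to within rounding, and the recorded values of $E$ never increase.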

We now have to prove that equations (38) uniquely determine the distribution of states. If we add together all the equations (35), we obtain

\frac{d}{dt}\bigl(u_1 + \sqrt{2}u_2 + \sqrt{3}u_3 + \dots + \sqrt{p}u_p\bigr) = 0

hence

u_1 + \sqrt{2}u_2 + \sqrt{3}u_3 + \dots + \sqrt{p}u_p = a \tag{40}

In a similar way we find that

u_1 + 2\sqrt{2}u_2 + 3\sqrt{3}u_3 + \dots + p\sqrt{p}u_p = \frac{b}{\epsilon} \tag{41}

where $a$ and $b$ are constants. The meaning of these equations is obvious. In particular, $u_1 + \sqrt{2}u_2 + \sqrt{3}u_3 + \dots = a$ is the total number of molecules in unit volume, while $b$ is their total kinetic energy. Equations (40) and (41) therefore tell us that these two quantities are constant.

Suppose that the two quantities $a$ and $b$ are given. Then we set the quotient $u_2/u_1$ equal to $y$. Equations (38) then reduce to

u_3 = y^2 u_1,\quad u_4 = y^3 u_1,\quad\ldots,\quad u_p = y^{p-1}u_1

If one substitutes these values into equations (40) and (41), then he finds easily:

\begin{aligned} 0 &=\; \bigl(pa - \frac{b}{\epsilon}\bigr)\sqrt{p}\,y^{p-1} + \bigl((p-1)a - \frac{b}{\epsilon}\bigr)\sqrt{p-1}\,y^{p-2} + \dots \\ &\quad + \bigl(3a-\frac{b}{\epsilon}\bigr)\sqrt{3}\,y^2 + \bigl(2a-\frac{b}{\epsilon}\bigr)\sqrt{2}\,y + \bigl(a-\frac{b}{\epsilon}\bigr) \end{aligned} \tag{42}

Since all the $u$'s are necessarily positive, we see immediately that $\frac{b}{\epsilon} - a$ must be positive while $\frac{b}{\epsilon} - pa$ must be negative. Hence $b$ must lie between $\epsilon a$ and $p\epsilon a$. Hence, in equation (42) the coefficient of $y^{p-1}$ is positive, while the term independent of $y$ must be negative. The polynomial is therefore positive for $y = \infty$, and negative for $y = 0$; therefore there is one and only one positive root for $y$, since the series of coefficients changes sign only once. Negative or imaginary values for $y$ are of course meaningless. But from $y$ we can determine uniquely all the $u$'s and also all the $w$'s. Hence, whatever may be the initial distribution of states, there is one and only one distribution which it approaches with increasing time. This distribution depends only on the constants $a$ and $b$, the total number and total kinetic energy of the molecules (density and temperature of the gas).
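The uniqueness argument can be tried out numerically. The sketch below is an illustration, not the author's procedure: it forms the polynomial obtained by eliminating $u_1$ between (40) and (41) with $u_k = u_1 y^{k-1}$, namely $\sum_k \sqrt{k}\,(ka - b/\epsilon)\,y^{k-1}$, finds its positive root by bisection, and rebuilds the $u$'s. The function names, the bisection tolerance, and the sample values of $a$, $b/\epsilon$ and $p$ are assumptions.

```python
import math

def poly42(y, a, b_over_eps, p):
    """sum_{k=1..p} sqrt(k) * (k*a - b/eps) * y**(k-1)."""
    return sum(math.sqrt(k) * (k * a - b_over_eps) * y ** (k - 1)
               for k in range(1, p + 1))

def solve_y(a, b_over_eps, p, tol=1e-13):
    """Bisection for the single positive root; requires a < b/eps < p*a."""
    if not (a < b_over_eps < p * a):
        raise ValueError("b/eps must lie strictly between a and p*a")
    lo, hi = 0.0, 1.0
    while poly42(hi, a, b_over_eps, p) < 0.0:   # widen until the sign has changed
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if poly42(mid, a, b_over_eps, p) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equilibrium(a, b_over_eps, p):
    """Return [u_1, ..., u_p] with u_k = u_1 * y**(k-1), normalised by (40)."""
    y = solve_y(a, b_over_eps, p)
    u1 = a / sum(math.sqrt(k) * y ** (k - 1) for k in range(1, p + 1))
    return [u1 * y ** (k - 1) for k in range(1, p + 1)]

# Assumed sample values: a = 2, b/eps = 5, p = 4.
u = equilibrium(2.0, 5.0, 4)
print(u)
print(sum(k * math.sqrt(k) * x for k, x in enumerate(u, start=1)))  # should be close to 5
```

Bisection is used here only because the sign pattern of the coefficients guarantees a single sign change of the polynomial on the positive axis; any other bracketing root finder would serve equally well.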

This theorem was proved first only for the case that the distribution of states is initially uniform. It must also hold, however, when this is not true, provided only that the molecules are distributed in such a way that they tend to become mixed as time progresses, so that the distribution becomes uniform after a very long time. This will always happen with the exception of certain special cases, for example, when the molecules move initially in a straight line and are reflected back in this straight line at the walls. Since we have established this for arbitrary $p$ and $\epsilon$, we can immediately go to the case where $\frac{1}{p}$ and $\epsilon$ become infinitesimal.

† For very large $p$, the expression (39) will be very large, of order $p$. In this case it is necessary to look for a smaller negative value that $E$ can never exceed. The quantity denoted here by $E$ differs by a constant from the one earlier so denoted. If we wish to obtain the quantity denoted by $E_1$ in equation (17a), page 113, which again differs only by a constant from the other quantities denoted by this letter, then we must add to our present $E$, $-\frac{3\log\epsilon}{2}(u_1+\sqrt{2}u_2+\dots)$. Therefore

E_1 = E - \frac{3\log\epsilon}{2}(u_1+\sqrt{2}u_2+\dots) = u_1\log\frac{u_1}{\epsilon^{3/2}} + \sqrt{2}u_2\log\frac{u_2}{\epsilon^{3/2}} + \dots

It is clear now that $E_1$ is a real and continuous function of the $u$'s for all real positive values of them. Furthermore, if we say that a negative quantity is smaller, the greater its numerical value is, then $E$ is not smaller than the expression (39), hence $E_1$ is not smaller than

-\frac{1}{e}(1+\sqrt{2}+\dots+\sqrt{p}) - \frac{3}{2}a\log\epsilon

Hence, $E_1$ must have a minimum if the $u$'s run through all real positive values compatible with equations (40) and (41). One can then easily show that for this minimum none of the $u$'s can be equal to zero, so that the minimum cannot lie on the boundary of the space formed from the $u$'s, and consequently it can be found by applying the usual rules of differential calculus. If we add to the total differential of $E_1$ that of the two equations (40) and (41), multiplying the former with the undetermined multiplier $\lambda$, and the latter by $\mu$, then we obtain

(\log u_1 + \lambda + \mu)du_1 + (\log u_2 + \lambda + 2\mu)\sqrt{2}du_2 + \dots = 0

At the minimum, the factor of each differential must vanish, whence on elimination of $\lambda$ and $\mu$ one obtains

\log u_2 - \log u_1 = \log u_3 - \log u_2 = \dots

or $u_2/u_1 = u_3/u_2 = u_4/u_3 = \dots$, which we recognize to be the same as equations (38). These equations therefore determine the smallest value that $E_1$ can have when the $u$'s take all possible values consistent with equations (40) and (41). However, since the $u$'s are actually subject to equations (40) and (41) during the entire process, this is the smallest value of $E_1$ during the entire process. In order to calculate it, we set again $u_2 = u_1y,\; u_3 = u_1y^2,\dots$ We know that we then find from equations (38), (40) and (41) a unique positive value for $y$, which must correspond to the actual minimum of $E_1$. This minimum value of $E_1$ is therefore

E_1 = \left(\frac{b}{\epsilon} - a\right)\log y + a\log\left(\frac{u_1}{\epsilon^{3/2}}\right)

$E_1$ cannot have a smaller value than this. This value remains finite for infinitesimal $\epsilon$ and infinite $p$. Taking account of equations (43), we see that it reduces to $a\log C - bh$ or, since $a = \frac{\sqrt{\pi}}{2}\,C\,h^{-3/2}$ and $b = \frac{3a}{2h}$, one can write for it $a\bigl(\log C - \tfrac{3}{2}\bigr)$, which is a finite quantity. Hence $E_1$ cannot be minus infinity. On the other hand, it may be plus infinity. We still have to show that in that case there cannot be thermal equilibrium. This proof, as well as an explicit discussion of the exceptional case where $\lim_{\tau\to0} \frac{\epsilon}{\tau}[\dots]$ comes out to be different according as $\epsilon/\tau$ or $\tau/\epsilon$ vanishes, will not be pursued further here.

We have first:

w_k = \sqrt{k}u_k = u_1\sqrt{k}y^{k-1}

For infinitesimal $\epsilon$ we can again set:

\epsilon = dx,\quad k\epsilon = x,\quad y = e^{-h\epsilon},\quad \frac{u_1}{\epsilon^{3/2}} = C \tag{43}

y^k = e^{-hk\epsilon} = e^{-hx}

and obtain

w_k = C\,\sqrt{x}\,e^{-hx}\,dx

which is again the Maxwell distribution. Likewise one can convince himself that the sum which we have here denoted by $E$ reduces, aside from a constant additive term, to the integral in equation (17a); we therefore obtain by this method all the results that we earlier found by transformations of definite integrals, but this method has the advantage of being much simpler and clearer. One only has to accept the abstraction that a molecule may have only a finite number of kinetic energies as a transition stage.
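The passage to the limit can be checked with a few lines of arithmetic. The sketch below is illustrative only: it evaluates the discrete $w_k = \sqrt{k}\,u_1 y^{k-1}$ with $y = e^{-h\epsilon}$ and takes $u_1 = C\epsilon^{3/2}$, the scaling under which $w_k$ comes out proportional to $dx = \epsilon$. It then compares $w_k$ with $C\sqrt{x}\,e^{-hx}\,dx$ at $x = k\epsilon$. The values of `eps`, `h`, `C`, the cutoff `x_max`, and the function names are assumptions.

```python
import math

def discrete_w(k, eps, h, C):
    """w_k = sqrt(k) * u_1 * y**(k-1), with y = exp(-h*eps) and u_1 = C * eps**1.5."""
    y = math.exp(-h * eps)
    return math.sqrt(k) * C * eps ** 1.5 * y ** (k - 1)

def maxwell_w(x, eps, h, C):
    """C * sqrt(x) * exp(-h*x) * dx, with dx = eps."""
    return C * math.sqrt(x) * math.exp(-h * x) * eps

def total_number(eps, h, C, x_max=60.0):
    """Sum of w_k over all levels up to the (assumed) cutoff energy x_max."""
    p = int(x_max / eps)
    return sum(discrete_w(k, eps, h, C) for k in range(1, p + 1))

eps, h, C = 1e-3, 1.0, 1.0
k = 500
print(discrete_w(k, eps, h, C) / maxwell_w(k * eps, eps, h, C))   # equals exp(h*eps) up to rounding
print(total_number(eps, h, C), C * math.sqrt(math.pi) / (2.0 * h ** 1.5))
```

The ratio of the two expressions is $e^{h\epsilon}$ for every $k$, which tends to 1 as $\epsilon$ becomes infinitesimal, and the summed $w_k$ approaches $\int_0^\infty C\sqrt{x}\,e^{-hx}\,dx = \frac{\sqrt{\pi}}{2}\,C\,h^{-3/2}$.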

If one sets the time derivatives in equations (35) equal to zero, he obtains the conditions that the distribution of states does not change with time but is stationary. For a continuous distribution of kinetic energies, the corresponding condition is obtained by setting $\frac{\partial f(x,t)}{\partial t} = 0$ in equation (16). This gives:

\begin{aligned} 0 = \int_0^\infty \int_0^{x+x'} &\left[ \frac{f(\xi)\,f(x+x'-\xi)}{\sqrt{\xi\,(x+x'-\xi)}} - \frac{f(x)\,f(x')}{\sqrt{x x'}} \right] \\ &\times \sqrt{x x'} \; \psi(x,x',\xi) \; d\xi \, dx' \end{aligned}

A solution of this equation is

f(x) = C\sqrt{x}\,e^{-hx}

which is the Maxwell distribution. From what has been said previously it follows that there are infinitely many other solutions, which are not useful however since $f(x)$ comes out negative or imaginary for some values of $x$. Hence, it follows very clearly that Maxwell's attempt to prove a priori that his solution is the only one must fail, since it is not the only one but rather it is the only one that gives purely positive probabilities, and therefore it is the only one that is useful.
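That the Maxwell distribution makes the stationarity integral vanish can be seen pointwise: $f(x)/\sqrt{x} = Ce^{-hx}$ is a pure exponential, so the product of two such factors depends only on the sum of the two energies, which a collision leaves unchanged. The short check below is a sketch with assumed values of $C$ and $h$, arbitrarily chosen sample points, and hypothetical function names. It evaluates the difference $f(\xi)f(x+x'-\xi)/\sqrt{\xi(x+x'-\xi)} - f(x)f(x')/\sqrt{xx'}$, first for the Maxwell form and then, for contrast, for a bare exponential.

```python
import math

def f_maxwell(x, C=1.0, h=1.3):
    """Maxwell distribution in the energy variable: C * sqrt(x) * exp(-h*x)."""
    return C * math.sqrt(x) * math.exp(-h * x)

def bracket(f, x, xp, xi):
    """f(xi) f(x+x'-xi) / sqrt(xi (x+x'-xi))  -  f(x) f(x') / sqrt(x x')."""
    rest = x + xp - xi                      # energy of the second molecule after collision
    gain = f(xi) * f(rest) / math.sqrt(xi * rest)
    loss = f(x) * f(xp) / math.sqrt(x * xp)
    return gain - loss

# Sample (x, x', xi) triples with 0 < xi < x + x'.
points = [(0.5, 1.2, 0.3), (2.0, 0.7, 2.5), (0.1, 3.0, 1.0)]
for x, xp, xi in points:
    print(bracket(f_maxwell, x, xp, xi))                 # zero up to rounding
print(bracket(lambda x: math.exp(-x), 0.5, 1.2, 0.3))    # clearly nonzero
```

The check covers only the integrand, not the integral: it shows that the Maxwell form annihilates the bracket for every choice of $\psi$, and it says nothing about the other, partly negative solutions mentioned in the text.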

Boltzmann 1872 / 2025 · critical edition · text complete, unexpurgated
Claude · after Gemini v4 · after Brush · after the German