Alternative derivation of the Boltzmann distribution

It is instructive to provide an alternative derivation of the Boltzmann distribution, due to Gibbs. Consider our system, which we call system 1, as part of an ensemble of a large number of systems identical to it. Let us call 0 the entire ensemble, and 2 the ensemble minus system 1. Systems 0 and 2 are then not very different, as we can include as many systems as we want. We make the usual assumption that the systems in the ensemble do not interact with each other in a way that affects their physical properties; in particular, the energy spectrum of each is the same as that of a truly isolated system. They only have weak interactions that allow the systems to exchange energy in order to come to thermal equilibrium. From this point of view system 2 acts as a heat bath for system 1. Since the energy spectrum of system 1 is discrete, so are those of systems 2 and 0. From our assumption of independence, a value of the energy of system 0, $E_0^{i,j}$, is the sum of one of the possible energy levels of system 1, $E_1^i$, and one of system 2, $E_2^j$:

$\displaystyle E_0^{i,j} = E_1^i + E_2^j$ (3.77)

and since the systems in the ensemble are all independent (apart from the weak interactions that bring them to equilibrium), we have

$\displaystyle p_0(E_1^i + E_2^j) = p_1(E_1^i) p_2(E_2^j),$ (3.78)

where $p_0$, $p_1$ and $p_2$ are the probabilities in the ensembles 0, 1 and 2. We now consider extended probability functions defined on the whole energy axis, so that we can take derivatives. We will not use a separate notation for them; it is understood that the real probabilities in [*] coincide with the extended ones whenever the energies coincide with any of the allowed energy levels. Let us now take the logarithmic derivative w.r.t. $E_2$ of the two sides of [*]:

$\displaystyle \frac{d \ln p_0(E_1 + E_2)}{d E_2} = \frac{d \ln p_2(E_2)}{d E_2} = - \beta_2,$ (3.79)

where we have dropped the indices $i$ and $j$ to highlight that we are now using the continuous energy axis. The derivatives in [*] are in general complicated functions of the energy spectrum of system 2, and we denote their common value by $-\beta_2$. Clearly, taking the derivative of $p_0$ w.r.t. $E_2$ is the same as taking it w.r.t. $E_1$, and so we also have:

$\displaystyle \frac{d \ln p_0(E_1 + E_2)}{d E_1} = \frac{d \ln p_1(E_1)}{d E_1} = - \beta_2.$ (3.80)
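
The chain-rule step behind (3.79) and (3.80) can be spelled out as follows. This is only a sketch of the intermediate algebra, and the symbol $E$ for the total energy is introduced here purely for illustration. Taking the logarithm of (3.78) on the continuous energy axis gives

$\displaystyle \ln p_0(E_1 + E_2) = \ln p_1(E_1) + \ln p_2(E_2).$

Writing $E = E_1 + E_2$, so that $\partial E/\partial E_1 = \partial E/\partial E_2 = 1$, and differentiating with respect to one energy while holding the other fixed, the chain rule gives

$\displaystyle \frac{\partial \ln p_0(E)}{\partial E_1} = \frac{d \ln p_0(E)}{d E} = \frac{\partial \ln p_0(E)}{\partial E_2},$

and therefore

$\displaystyle \frac{d \ln p_1(E_1)}{d E_1} = \frac{d \ln p_2(E_2)}{d E_2}.$

The left-hand side depends only on $E_1$ and the right-hand side only on $E_2$, which is consistent with treating $\beta_2$ as independent of $E_1$.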

We now make the further assumption that system 2 has an unpredictable energy spectrum, in such a way that the energy spectrum of system 1 is independent of it. Since $p_1$ does not depend on the energy spectrum of system 2, we can treat $\beta_2$ as a constant with respect to $E_1$, integrate [*] and obtain:

$\displaystyle p_1(E_1) = A e^{-\beta_2 E_1},$ (3.81)

which has the same form as the Boltzmann distribution [*]. Here $A$ is a normalisation constant. Since all systems in the ensemble are identical, we could have chosen any of them for the above discussion, which means that the parameter $\beta_2$ is common to all of them and we can drop the subscript 2. Therefore, systems in thermal equilibrium share a parameter $\beta$. This is equivalent to the thermodynamic statement that they are at the same temperature, although we have not introduced the temperature in this derivation. The physical meaning of $\beta$ and its connection with temperature are not directly apparent from [*]. However, noting from [*] that $\partial \bar{E}/\partial \beta = - (\Delta E)^2 \le 0$, we see that the energy of a system increases if the parameter $\beta$ decreases. Since two systems that have different $\beta$ parameters will eventually share a common $\beta$ at equilibrium, energy flows from the system with the lower $\beta$ to the one with the higher $\beta$. This shows that $\beta$ acts as a reciprocal temperature on some scale, up to a suitable constant that accounts for the difference in units between the two.
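
The relation $\partial \bar{E}/\partial \beta = - (\Delta E)^2$ and the direction of the energy flow can be checked numerically. The following is a minimal Python sketch and not part of the derivation itself. The four-level spectrum `LEVELS` and all function names are hypothetical, chosen for illustration. The sketch also assumes that two systems exchanging energy conserve the sum of their mean energies and end up described by a Boltzmann distribution with a shared $\beta$.

```python
import math

# Hypothetical discrete spectrum in arbitrary units (illustrative assumption).
LEVELS = [0.0, 1.0, 2.0, 3.5]


def boltzmann(levels, beta):
    """Return normalised probabilities p_i = A * exp(-beta * E_i)."""
    # Shift by the lowest level for numerical stability; the shift cancels
    # once the weights are normalised.
    e_min = min(levels)
    weights = [math.exp(-beta * (e - e_min)) for e in levels]
    norm = sum(weights)
    return [w / norm for w in weights]


def mean_energy(levels, beta):
    """Mean energy of the spectrum at parameter beta."""
    probs = boltzmann(levels, beta)
    return sum(p * e for p, e in zip(probs, levels))


def energy_variance(levels, beta):
    """Mean square fluctuation (Delta E)^2 at parameter beta."""
    probs = boltzmann(levels, beta)
    mean = sum(p * e for p, e in zip(probs, levels))
    return sum(p * (e - mean) ** 2 for p, e in zip(probs, levels))


def dmean_dbeta(levels, beta, h=1e-5):
    """Central finite-difference estimate of d(mean energy)/d(beta)."""
    return (mean_energy(levels, beta + h) - mean_energy(levels, beta - h)) / (2 * h)


def common_beta(levels_a, beta_a, levels_b, beta_b, iterations=100):
    """Find the shared beta that conserves the total mean energy.

    Assumption of this sketch: the two systems only exchange energy with
    each other, so the sum of their mean energies is fixed.
    """
    total = mean_energy(levels_a, beta_a) + mean_energy(levels_b, beta_b)
    lo, hi = min(beta_a, beta_b), max(beta_a, beta_b)
    # The mean energy decreases with beta, so bisection brackets the root.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mean_energy(levels_a, mid) + mean_energy(levels_b, mid) > total:
            lo = mid  # too much energy at this beta: the root lies higher
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0):
        print(beta, dmean_dbeta(LEVELS, beta), -energy_variance(LEVELS, beta))
    beta_eq = common_beta(LEVELS, 0.5, LEVELS, 2.0)
    print("common beta:", beta_eq)
    print("energy change of the low-beta system:",
          mean_energy(LEVELS, beta_eq) - mean_energy(LEVELS, 0.5))
```

In this sketch the two printed derivative columns should agree to within the finite-difference error, and the system that starts with the lower $\beta$ should show a negative energy change, meaning that it gives up energy to the other one.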