Entropy, Information, and Physics: A Deep Dive into Statistical Mechanics
Understanding Entropy in Physics and Information Theory
Many people first encounter the concept of entropy in physics, particularly through the second law of thermodynamics or statistical mechanics. This prominence is justified: the statistical formulation of entropy was introduced by Boltzmann in the 1870s and later refined by Gibbs in 1902. However, a more intuitive grasp of entropy can be achieved through the lens of information theory. As we will discuss, statistical mechanics can be viewed as a specific application of maximizing entropy under certain constraints.
Entropy as a Measure of Information
In 1948, Claude Shannon introduced a groundbreaking metric for quantifying information. While this concept wasn't entirely novel, Shannon acknowledged its roots in his paper.
> "The form of H will be recognized as that of entropy as defined in certain formulations of statistical mechanics where ( p_i ) is the probability of a system being in cell ( i ) of its phase space. H is then, for example, the H in Boltzmann's famous H theorem."
What was extraordinary about Shannon's work was how this measure related to information theory, leading to his designation as the "father of information theory." But how does this serve as an effective measure of information?
Consider a simple example: a coin toss. There are two outcomes: heads (1) or tails (0), representing one "bit" of information. Each outcome has an equal likelihood of 1/2, resulting in a Shannon Entropy of 1 bit, indicating that one bit is required to represent this information.
Now, let’s expand to four outcomes: 11, 00, 10, 01. A bit of calculation reveals that ( H = 2 ), meaning two bits are necessary to encode these four options. This leads us to understand why entropy serves as the definitive measure of uncertainty—and by extension, information.
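To make the arithmetic concrete, here is a minimal Python sketch of the Shannon entropy ( H = -\sum_i p_i \log_2 p_i ); the helper name shannon_entropy is just illustrative, not notation from Shannon's paper.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))                # fair coin: 1.0 bit
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # four equally likely outcomes: 2.0 bits
```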
When faced with two possibilities, one with probability ( p ) and the other ( q = 1 - p ), maximum entropy occurs when both outcomes are equally probable, yielding a value of 1 bit, as with the fair coin toss. Conversely, if one outcome is certain (probability of 1), entropy drops to 0, indicating no uncertainty at all, as with a two-headed coin or a die that always lands on the same face. Thus, we can regard entropy as a measure of the unpredictability of the outcome of a process.
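Reusing the helper above, a quick sweep over the probability ( p ) of a two-outcome process confirms this: the entropy peaks at 1 bit when ( p = 1/2 ) and approaches 0 as either outcome becomes nearly certain (a quick numerical check, not part of the original discussion).

```python
# Binary entropy H(p) for outcomes with probabilities p and 1 - p,
# using the shannon_entropy helper defined above.
for p in [0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99]:
    print(f"p = {p:.2f}  ->  H = {shannon_entropy([p, 1 - p]):.3f} bits")
```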
But why does this specific measure work? Shannon outlines three critical properties that ( H ) must satisfy:
- It should be continuous with respect to probabilities.
- For equally likely outcomes, increasing the number of possibilities should increase ( H ), since there is more uncertainty.
- If a choice can be broken down into successive decisions, the original ( H ) must equal the weighted sum of the individual ( H ) values (a worked example follows this list).
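Shannon illustrates this third property with a small worked example in his paper: a choice among three outcomes with probabilities 1/2, 1/3, and 1/6 can be split into a fair binary choice followed, half the time, by a second choice with probabilities 2/3 and 1/3, and the entropies must agree:

\[
H\left(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}\right) = H\left(\tfrac{1}{2}, \tfrac{1}{2}\right) + \tfrac{1}{2}\, H\left(\tfrac{2}{3}, \tfrac{1}{3}\right) \approx 1.46 \text{ bits}.
\]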
Shannon emphasizes that while these properties lend credibility to his definitions, their true value lies in their practical applications.
In the realm of physics, entropy assumes a central role, forming the bedrock of statistical mechanics. Although the term dates back to the 19th century, its deeper significance for statistical mechanics became clear only after Shannon's groundbreaking paper. The links between information theory and statistical mechanics were notably elucidated by Jaynes in 1957. He remarked:
> "(previously)…the identification of entropy was made only at the end, by comparison of the resulting equations with the laws of phenomenological thermodynamics. Now, however, we can take entropy as our starting concept, and the fact that a probability distribution maximizes the entropy subject to certain constraints becomes the essential fact which justifies the use of that distribution for inference."
Jaynes' assertion was bold; previously, entropy was regarded as peripheral, with thermodynamics at the forefront of statistical mechanics. Yet he proposed that entropy could serve as a more universally applicable concept, positioning thermodynamics as a specific instance of entropy maximization.
To break this down, the Boltzmann distribution, which describes the likelihood of particles having a certain energy, is central to statistical mechanics. However, as Jaynes pointed out, the probability distributions that maximize entropy must adhere to certain constraints.
These constraints are a fixed mean energy and the requirement that all probabilities sum to one (normalization). The latter must hold for any probability distribution, while the former expresses equilibrium with a heat bath: a constant mean energy corresponding to a fixed absolute temperature. To find the distribution that maximizes entropy under these constraints, one applies constrained optimization using Lagrange multipliers.
Maximizing Shannon's entropy ( H ) under these constraints yields a probability distribution that decays exponentially with energy, mirroring the Boltzmann distribution. This connection shows that the foundational probability distribution of statistical mechanics emerges as a special case of entropy maximization.
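For readers who want the details, here is the standard calculation (using natural logarithms, so ( H ) is measured in nats rather than bits): introduce Lagrange multipliers ( \alpha ) and ( \beta ) for the normalization and mean-energy constraints and set the derivative with respect to each ( p_i ) to zero,

\[
\mathcal{L} = -\sum_i p_i \ln p_i - \alpha \Big( \sum_i p_i - 1 \Big) - \beta \Big( \sum_i p_i E_i - \langle E \rangle \Big),
\qquad
\frac{\partial \mathcal{L}}{\partial p_i} = -\ln p_i - 1 - \alpha - \beta E_i = 0,
\]

\[
\Rightarrow \quad p_i = \frac{e^{-\beta E_i}}{Z}, \qquad Z = \sum_j e^{-\beta E_j}.
\]

Identifying ( \beta = 1 / k_B T ) recovers the familiar Boltzmann distribution, with ( Z ) as the partition function.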
An Important Note on Ensemble Theory
The resulting maximum entropy probability distribution represents the "Canonical Ensemble" in statistical mechanics, where a system is in thermal equilibrium with an infinite reservoir at a fixed temperature and can exchange energy while maintaining a constant mean energy ( <E> ).
A more generalized distribution, known as the "Grand Canonical Ensemble," arises when a system can exchange both energy and particles with a vast reservoir. The constraints here include both a constant mean energy ( <E> ) and a constant average particle number ( <N> ). The grand canonical ensemble probability extends the canonical concept by incorporating particle count as an additional variable.
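Concretely (a standard result rather than something derived above), the extra constraint on the average particle number ( <N> ) introduces a second Lagrange multiplier, conventionally expressed through the chemical potential ( \mu ), and the same maximization yields

\[
p_i = \frac{e^{-\beta (E_i - \mu N_i)}}{\mathcal{Z}}, \qquad \mathcal{Z} = \sum_j e^{-\beta (E_j - \mu N_j)},
\]

where ( N_i ) is the number of particles in microstate ( i ) and ( \mathcal{Z} ) is the grand partition function.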
Exploring Hidden Order in Non-Equilibrium Systems
While I have argued that statistical mechanics can be understood as a special case of entropy maximization, it is essential to recognize that this discussion applies to thermodynamic equilibrium, where macroscopic properties remain constant over time. In contrast, numerous real-world systems exist far from equilibrium, ranging from amorphous materials to biological systems and financial markets. Describing them is a profound unresolved challenge: the equilibrium framework assumes static macroscopic properties, which is precisely what these driven, evolving systems lack.
Even so, carrying equilibrium ideas over to non-equilibrium situations can yield meaningful insights. Many physicists suspect that non-equilibrium systems harbor structure that is simply less evident than in their equilibrium counterparts, and one active line of research is the search for this hidden order. Recent work has shown that compressing snapshots of a system's configuration, much as one compresses an image file, can serve as a proxy for its entropy.
Using the Lempel-Ziv 1977 (LZ77) compression algorithm, the researchers computed the ratio of the compressed file size to the original size, a quantity they call the Computable Information Density (CID), as an estimate of entropy. They found that CID captures specific non-equilibrium phase transitions, such as the one occurring in the 1D conserved lattice gas model.
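The idea is easy to prototype. Below is a rough sketch in Python that uses the standard-library zlib module (whose DEFLATE algorithm is LZ77-based) as a stand-in for the LZ77 coder used in the paper; the function name, test configurations, and normalization are illustrative, not the authors' implementation.

```python
import random
import zlib

def compression_ratio(symbols):
    """Crude entropy proxy: compressed size / original size for a sequence of small integers.
    zlib's DEFLATE (LZ77 plus Huffman coding) stands in for the LZ77 coder behind CID."""
    raw = bytes(symbols)  # one byte per lattice site
    return len(zlib.compress(raw, 9)) / len(raw)

random.seed(0)
disordered = [random.randint(0, 1) for _ in range(100_000)]  # random occupations: high entropy
ordered = [0, 1] * 50_000                                    # perfectly periodic: low entropy

print(compression_ratio(disordered))  # larger ratio: randomness resists compression
print(compression_ratio(ordered))     # tiny ratio: hidden order is cheap to describe
```

A lower ratio means the configuration can be described more compactly, signaling more order and hence lower entropy.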
In summary, entropy is a multifaceted concept with extensive implications in both information theory and physics. Although it originated within the context of statistical mechanics, it is more broadly applicable and better understood through the prism of information theory. Over the years, we have learned that statistical mechanics and thermodynamics can be viewed as particular instances of entropy maximization. Recent research further suggests that insights gleaned from file compression can enhance our understanding of non-equilibrium physics.
To put it simply, the maximum entropy distribution is the best-informed guess one can make given limited information. When no information is available at all, a uniform distribution is the only unbiased assumption. One could even argue that all physical laws stem from the principle of maximum entropy. Paraphrasing Jaynes: a theory makes sharp predictions about experiments only if it leads to sharp distributions, and those are precisely the maximum entropy distributions consistent with that theory's constraints.
Feel free to follow my writings if you found this article intriguing; I frequently explore topics at the intersection of complex systems, physics, and societal issues.
References
Shannon, Claude E. (July 1948). "A Mathematical Theory of Communication". Bell System Technical Journal. 27 (3): 379–423.
Gibbs, Josiah Willard (1902). Elementary Principles in Statistical Mechanics. New York: Charles Scribner's Sons.
Jaynes, E.T. (1957). "Information theory and statistical mechanics". Physical Review. 106 (4): 620–630.
Martiniani, S., Chaikin, P. M., and Levine, D. (2019). "Quantifying Hidden Order out of Equilibrium". Physical Review X. 9 (1): 011031.