# Is the Fluctuation Theorem a theorem?

[This is intended to be the first episode of a series of posts on what the so-called Fluctuation Theorem is and what it’s not; while I promise follow-up posts (one has actually already appeared, on whether we should talk about a theorem or a relation), I must admit that all my previous plans for serial posts were aborted]

[I suggest clicking on the title to view the post on a white background]

The Fluctuation “Theorem” (FT) is a major result in (relatively) recent Non-Equilibrium Statistical Mechanics. It roughly states that, in a wide class of probabilistic theories of dynamical systems, there exists a “physically meaningful” real-valued observable $\Omega$ such that the probability of a positive value $+\Omega$ is exponentially favoured over that of the negative value $-\Omega$:

$P(+\Omega)/ P(-\Omega) = \exp {\Omega}$.

I must say right away that I am not in favour of naming formulas after people, so I won’t append any name to the above relation. From the above detailed relation, the following Integral FT follows by a simple integration over increments $d\Omega$:

$\left\langle e^{-\Omega} \right\rangle = 1$.

As a further consequence, a simple inequality for convex functions (Jensen’s) yields an incarnation of the holy Second Law of Thermodynamics:

$\left\langle\Omega \right\rangle \geq 0$.
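Spelled out, assuming $\Omega$ admits a probability density $P$, the two consequences are a two-line affair:

$\left\langle e^{-\Omega} \right\rangle = \int P(\Omega)\, e^{-\Omega}\, d\Omega = \int P(-\Omega)\, d\Omega = 1$,

where the first relation was used in the form $P(\Omega)\, e^{-\Omega} = P(-\Omega)$; then, since $e^{-x}$ is convex, Jensen’s inequality gives

$1 = \left\langle e^{-\Omega} \right\rangle \geq e^{-\left\langle \Omega \right\rangle}$,

whence $\left\langle \Omega \right\rangle \geq 0$.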

A thousand+ papers are based on this trilogy of formulas, and I contributed my own:

M. P. and M. Esposito, J. Stat. Mech. P10033 (2014), arXiv:1408.5941.

What I want to discuss here is not the validity of the FT as an overall discourse about thermodynamics, but rather in what sense the FT is a theorem at all. By “theorem” I don’t mean that a very abstract and formal framework is needed; I just mean that at some point in the development of the argument there is a clear-cut passage between some hypothesis and some thesis, whereby the thesis is not obviously embedded in the hypothesis. I will call this the “step”. Nor will I embark on the analysis of a-thousand-plus papers (most of which cite each other in mutually inconsistent ways) to pinpoint if and where exactly the step lies. Nevertheless, my overall experience with this stuff is that, if you really dig down the rabbit hole, very often the observable of interest is defined as:

$\Omega = \log\frac{P(+\Omega)}{P(-\Omega) }$.

Ta-ta-ta-taaaaa! See any problem here? This definition coincides with the first formula of the trilogy. Therefore, this cannot be the step we are looking for. Where lies the step?

*  *  *

Let me narrow the discussion down to discrete-state-space Markov processes (deterministic dynamical systems are special in their own way, but not that different; I will tackle them in a separate post, if at all). In this context, $\Omega$ can be defined as a special linear combination of the net numbers of jumps between pairs of states,

$\Omega = \sum_{i,j} \# (i \gets j) \log \frac{w_{ij}}{w_{ji}}$

where $\# (i \gets j)$ is the number of times the process jumped from state $j$ to state $i$ and $w_{ij}$ is the corresponding transition rate. Deriving the FT from this definition does require an important passage: we need to know the explicit expression for $P(\Omega)$. In a physicist’s words, we need to know the exact path-integral representation of the process. It turns out that for Markov jump processes this explicit expression indeed exists, and that the FT follows trivially (I won’t give the expression, which can be found in many papers, including my own; but check out this remarkable review: Markus F. Weber, Erwin Frey, Master equations and the theory of stochastic path integrals, arXiv:1609.02849, Sec. 1.4).
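The triviality can even be checked on a computer. Here is a minimal numerical sketch (the three-state rates, the duration and the seed are all made-up choices of mine): simulate the jump process started from its stationary distribution $\pi$, accumulate $\Omega$ along each trajectory — including the boundary term $\log \pi(x_0)/\pi(x_T)$ that makes the Integral FT exact at finite times — and compare with the trilogy.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up rates: w[i, j] is the rate of the jump j -> i, i.e. of #(i <- j)
w = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 2.0],
              [2.0, 1.0, 0.0]])
escape = w.sum(axis=0)              # total exit rate out of each state

# Stationary distribution: kernel of the generator L = w - diag(escape)
L = w - np.diag(escape)
evals, evecs = np.linalg.eig(L)
pi = np.real(evecs[:, np.argmin(np.abs(evals))])
pi = np.abs(pi) / np.abs(pi).sum()

def omega(T=2.0):
    """Entropy production of one trajectory of duration T started from pi,
    including the boundary term that makes the Integral FT exact."""
    x = x0 = rng.choice(3, p=pi)
    t, s = 0.0, 0.0
    while True:
        t += rng.exponential(1.0 / escape[x])
        if t > T:
            break
        y = rng.choice(3, p=w[:, x] / escape[x])   # jump x -> y
        s += np.log(w[y, x] / w[x, y])             # environment part
        x = y
    return s + np.log(pi[x0] / pi[x])              # boundary part

samples = np.array([omega() for _ in range(20000)])
print("<Omega>       =", samples.mean())           # Second Law: >= 0
print("<exp(-Omega)> =", np.exp(-samples).mean())  # Integral FT: ~= 1
```

With these (non-detailed-balanced) rates the mean entropy production comes out positive and the exponential average sits near 1, up to Monte Carlo error.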

Have we found the step? If this were the case, then the FT would boil down to a known result in the theory of Markov processes that has nothing to do with physics. That can’t be our step. After all, we are borrowing results from mathematics to obtain something more.

*  *  *

So far I have used $\Omega$ as an abstract object without identity. In fact, the term $\log w_{ij}/w_{ji}$ in the last formula seems quite arbitrary and ad hoc. However, in the mind of physicists this object has a clear nature: it is called the “entropy production”, quantifying the amount of entropy delivered to the environment along a process. If we can provide a different and relevant definition of this quantity in a way that is fairly independent of the Markov process itself, if we can show that this definition stands on its own feet without invoking the FT, then we can say that we have made a step.

Arrhenius’s theory of transitions between states might come to our rescue, as it tells us that each jump rate is related to its reverse by

$\frac{w_{ij}}{w_{ji}} = \exp {\frac{u_j - u_i}{k_B T_{ij}} }$

where we imagine that the states are energy wells with energies $u_i$, and that between wells there is a potential barrier that can be overcome by the effect of an external bath that heats the system up to temperature $T_{ij}$. If we disconnect states $i,j$ from the rest of the state space, the system relaxes to a steady state given by

$p_i \propto \exp {\frac{- u_i}{k_B T_{ij}} }$

which is the equilibrium “Gibbs” state. Notice however that the temperature may differ from one pair of states to another; therefore in general (that is, if we do not isolate a single transition) the system does not relax to an equilibrium Gibbs state. This is why we call the above condition local detailed balance. Local detailed balance provides the expression

$\Omega = \frac{1}{k_B} \sum_{ij} \frac{\delta Q_{ij}}{T_{ij}}$

where the heat flow is defined as the net amount of energy that has flowed between two states along the process:

$\delta Q_{ij} = \# (i \gets j) (u_j - u_i)$.
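The equivalence between this heat-based expression and the earlier rate-based one is an immediate consequence of local detailed balance, but it does no harm to check it explicitly. A minimal sketch, where the energies, bath temperatures, symmetric rate prefactor and jump counts are all made-up illustrative numbers:

```python
import numpy as np

k_B = 1.0                        # natural units
u = np.array([0.0, 1.0, 0.5])    # made-up well energies u_i
Tb = np.array([[1.0, 1.0, 2.0],  # made-up bath temperatures T_ij = T_ji
               [1.0, 1.0, 1.5],
               [2.0, 1.5, 1.0]])

# Rates obeying local detailed balance,
#   w_ij / w_ji = exp((u_j - u_i) / (k_B T_ij)),
# realized here with a symmetric prefactor (an arbitrary choice)
w = np.exp((u[None, :] - u[:, None]) / (2 * k_B * Tb))
np.fill_diagonal(w, 0.0)

# Made-up jump counts n[i, j] = #(i <- j) along some trajectory
n = np.array([[0, 3, 1],
              [2, 0, 4],
              [5, 2, 0]])

pairs = [(i, j) for i in range(3) for j in range(3) if i != j]
# Rate-based expression: sum of #(i <- j) log(w_ij / w_ji)
omega_rates = sum(n[i, j] * np.log(w[i, j] / w[j, i]) for i, j in pairs)
# Heat-based expression: (1/k_B) sum of dQ_ij / T_ij, dQ_ij = #(i <- j)(u_j - u_i)
omega_heat = sum(n[i, j] * (u[j] - u[i]) / (k_B * Tb[i, j]) for i, j in pairs)
print(omega_rates, omega_heat)   # the two expressions coincide
```

Of course this only confirms that, under local detailed balance, the two formulas are two spellings of the same quantity — which is precisely the point being made.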

The condition of local detailed balance thus gives the missing link between mathematics and physics, allowing us to connect to well-known lingo and concepts in thermodynamics.

*  *  *

However, our question still stands: have we made an inferential step yet? Clearly, the terms of the problem have shifted from the language of Markov processes to that of thermodynamics, but so far the condition of local detailed balance looks just like a renaming of quantities. To claim that we actually made a step we need to look at the definitions of energy, temperature, and so on. It is clear that this search requires a trip down to the very foundations of thermodynamics.

We will not engage in such a huge enterprise; just a few observations.

Temperature is a very hostile concept in thermodynamics. The story of the mercury column by which thermometers are introduced in the tale of basic physics is just a myth, one of many that physicists tell themselves to cast their research on safe ground, just like any human community likes to tell a mythological story of its origins. There is no way one can use a mercury column to measure the temperature of the environment where a protein unfolds. Of course, better thermometers exist, but none of them overcomes the intrinsic problems that the mercury column has. What is important in physics is not so much to refer to an absolute value of temperature that some special apparatus measures at all scales (even if better thermometers do exist!), but rather to define at each scale a self-consistent notion of temperature (in the spirit of the “renormalization” approach to Quantum Field Theory, if you like…). I doubt that, when Brownian motion is at work, the operational definition of temperature can be disentangled from the observed stochasticity of the Brownian degrees of freedom. This is a very different perspective from that of a mathematician’s world where consequences follow from assumptions: here quantities need to be renegotiated, and there is no clear-cut separation between measurement apparatus and measured quantities.

The problem is not just with temperature. For definiteness, let me consider one particular derivation of the Master Equation governing Markov jump processes: the quantum case, where the condition of local detailed balance is replaced by an analogous condition named after Kubo, Martin and Schwinger (see for example Breuer & Petruccione, The Theory of Open Quantum Systems, Oxford University Press). In all these derivations, one starts from an isolated universe evolving by some Hamiltonian and separates it into an environment and an effective system. It is assumed that (each piece of) the environment is in a thermal state at some temperature, that the system’s Hamiltonian is such and such, and that the system undergoes a Markovian dynamics. Then the KMS condition (local detailed balance) does follow, and with it comes the definition of the entropy production.

But again, how much of this “derivation” lies in the assumptions, and how much goes into its consequences? As you can see, facts that are assumed and facts that are derived are intertwined in quite complicated ways. In other words, it is not clear how and why exactly the entropy production is defined, given a Markov process, since the two go hand in hand.

*  *  *

So, where does the inferential step lie? As you might have guessed already from Betteridge’s law of headlines, the question in this post’s title has a negative answer. My point is that there is no such step. In fact, following a comment I heard from Giovanni Gallavotti (in reply to Denis Evans), I prefer to call these results Fluctuation “Relations” rather than “Theorems”.

Nevertheless, I do not mean in any way that these relations don’t make sense. That’s not my point, or I would have quit physics by now. What I mean is that in physics there is no net demarcation line between assumptions and consequences. In many respects, much of physics is about developing a self-consistent discourse, and it is already a good thing that this discourse is at least self-consistent.

To conclude:

1. The so-called Fluctuation Theorem is not a theorem at all; it’s a relation between quantities that might occasionally take the form of a theorem, provided one spells out which are the assumptions and which is the output. This has nothing to do with being “mathematical”: it has to do with reasoning right.
2. The fundamental status of the fluctuation relation (just like that of many other laws of physics) is unclear: I argue that it is an overarching concept that holds together a self-consistent discourse about processes of a certain kind.

I might also conclude with a slightly more “political” observation. At present, the industry of scientific paper production puts pressure on scientists to produce better narratives of their research. To some degree this is good, because papers are more pleasant to read that way. However, it may come at the detriment of the logical development of papers: it pushes people to pass off discourses as deductions, and it does not help distinguish which are the premises, which are the consequences, and which are the unknown factors…