Bad Honnef school on Entropy and Information

This week I participated in the school Entropy and Information: the Statistical Mechanical Perspective at the Physikzentrum in Bad Honnef, a beautiful venue in an old palace with modern facilities, close to the Rhine. At the moment an interesting talk on the entropy and information of black holes is going on, but live-blogging was impossible this time.

Myself, I compressed a three-hour course into a one-hour lecture on the Stochastic Thermodynamics of Chemical Networks. Notes can be found here. These notes will serve as the first step towards drafting more complete lecture notes, as I feel the time is ripe for me to produce some pedagogical material on my favourite topics. Stay tuned.

 

BEST statistics of Markovian fluxes: a tale of Eulerian tours and Fermionic ghosts

New paper out!

MP, BEST statistics of Markovian fluxes: a tale of Eulerian tours and Fermionic ghosts,
J. Phys. A: Math. Theor. 48, 365005 (2015), arXiv:1503.03045

This paper represents my first incursion into the field of Large Deviation Theory. I like it because it connects results in two different areas, namely graph theory and probability theory (sneaking into the territory of Quantum Field Theory), though I understand that it is a rather obscure paper that will find few readers.

In this paper I prove that

[a long formula]

OK, I just wanted to show off a long formula that means nothing. So suppose a person is blindfolded, cast away in the middle of the desert and asked to walk in some random direction. How much sweat and tears will he pour? Of course, the answer to the above question is probabilistic, since every time this strange experiment is repeated the person (assuming he is still alive and memoryless) will walk a different path, due to the influence of environmental disturbances: sand, wind, sun, scorpions etc. According to the path he follows, perspiration and desperation will vary.

In mathematical terms, this person is performing a random walk, and the question we are asking is to calculate some number characterizing how “expensive” (in thermodynamic terms) the trajectory is.

So, quite unremarkably, it turns out that… the answer does not exist. In fact, unfortunately, assuming we know how much sweat and tears it takes to walk every possible step in the desert, we still don’t know which exact sequences of steps produce exactly N liters of sweat and tears.

Probability theory does not have the general answer to this question, but does it have an answer to any question at all?

Indeed, yes. If, instead of focusing on sweat and tears, we ask the much more detailed question: for every possible step in the desert, how often did the walker walk that step? Then we have the answer. That's what my paper is about. Before my paper, answers could only be provided after the walker had walked for an infinitely long time. My work extends this to finite times. More details might be coming, or not.
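To make the observable concrete, here is a minimal sketch of the quantity at stake: the number of times each transition is crossed in a finite-time trajectory, and its statistics over repeated experiments. The chain, the rates and all names are my own illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy Markov chain on 4 states; rows of P are jump probabilities.
P = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.3, 0.0, 0.3, 0.4],
              [0.5, 0.2, 0.0, 0.3],
              [0.0, 0.6, 0.4, 0.0]])

def empirical_fluxes(P, steps, x0=0):
    """Count how often each transition (i, j) is crossed in a finite path."""
    n = P.shape[0]
    counts = np.zeros((n, n), dtype=int)
    x = x0
    for _ in range(steps):
        y = rng.choice(n, p=P[x])
        counts[x, y] += 1
        x = y
    return counts

# Finite-time flux statistics: repeat the experiment and look at the ensemble.
samples = [empirical_fluxes(P, steps=200) for _ in range(1000)]
print("mean number of crossings per edge:\n", np.mean(samples, axis=0))
```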

Cycle representations of Markov paths

Cycles are my personal obsession, so I try to intercept any paper containing the words “cycle” and “Markov” (with the aim of, one day, condensing all this stuff into a report on cycles in Stochastic Thermodynamics). In this post and a future one I will try to explain what I understand about this remarkable paper:

Chen Jia, Daquan Jiang, Minping Qian, Cycle symmetries and circulation fluctuations for discrete-time and continuous-time Markov chains, arXiv:1407.1263

The Qian dynasty and their students and coworkers are major experts on the techniques in Markov processes that sit at the interface with Nonequilibrium Statistical Mechanics. I suggest taking a look at the very impressive book Mathematical Theory of Nonequilibrium Steady States. In particular, at some point in one of their early papers I found one of the “early manifestations” of the Fluctuation Theorem for Markov processes that was re-discovered years later by many people. I also have my own version of the FT, on which we will briefly touch later on.

So let’s consider the trajectory performed by a Markovian walker on some state space (not displayed: if you really want it, draw dots at line intersections and elsewhere):

[Figure 1: a sample trajectory of the Markovian walker]

It obviously forms many cycles. How do we characterize this trajectory in terms of the cycles it forms?

There are several ways. The first, which underlies Schnakenberg’s theory and the corresponding Fluctuation Theorems by Andrieux and Gaspard [J. Stat. Phys. 2007] (then generalized by myself [J. Stat. Mech. 2014] to transient times), is as follows. Consider the support of the trajectory (the figure it draws), and then remove as many pieces (corresponding to some Markovian transition between inner states) as it takes so that the support is still connected but contains no cycles. One then obtains a so-called spanning tree, which is sort of the cycle-less skeleton of the trajectory, for example:

[Figure 2: a spanning tree of the trajectory’s support]

Now, notice that the trajectory is fully described by the initial and final states and by the ordered sequence of missing segments crossed by the Markovian walker, for example x = a,b,c,d,e on the left-hand side of this figure:

[Figure 3: left, the missing segments x = a,b,c,d,e; right, the corresponding cycles]

Moreover, notice that, by definition, when one adds one of the missing lines (e.g. c on the l.h.s.) one encloses a unique cycle, so that the trajectory can be fully described by the sequence of cycles on the right-hand side of the above figure. Physically speaking, one can associate to each such cycle a value \Sigma(x), sometimes called affinity, measuring the entropy delivered to the environment when performing a thermodynamic cycle, such that the total entropy produced by the trajectory is \Sigma(a) + \ldots + \Sigma(e).
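For the graph-minded reader, here is a small sketch of Schnakenberg’s construction on a made-up graph (I use networkx’s minimum spanning tree; any spanning tree would do). Each chord added back to the tree closes a unique fundamental cycle:

```python
import networkx as nx

# Support of the trajectory: an undirected graph (an assumed toy example).
G = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3), (3, 0), (1, 3)])

# A spanning tree: the cycle-less skeleton of the support.
T = nx.minimum_spanning_tree(G)
chords = [e for e in G.edges if not T.has_edge(*e)]

# Each chord closes a unique fundamental cycle: the tree path between its
# endpoints plus the chord itself (Schnakenberg's construction).
for (u, v) in chords:
    path = nx.shortest_path(T, u, v)
    print(f"chord {(u, v)} -> fundamental cycle {path + [u]}")
```

The affinity of a trajectory is then the sum of the affinities of the fundamental cycles identified by the chords it crosses.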

However, what the Qians have in mind by cycle representation is different. The idea is very simple: whenever the Markov walker encloses a cycle, record that cycle and cut it off, until you’re left with a final trajectory without cycles:

[Figure 4: cycles recorded and cut off as the walker encloses them]

Again, one obtains a sequence of cycles that represents the trajectory (i.e. one could reconstruct the trajectory knowing the initial and final states and the sequence of cycles), but it differs from the Schnakenberg one. The difference is quite substantial. In fact, suppose that the Markovian walker kept moving along these lines, performing more and more cycles. Then, in the Schnakenberg representation this would keep adding cycles of type a,b,c,d,e, and nothing different. Instead, in the latter representation any self-avoiding cycle that can be drawn without lifting the pencil could be generated, which is a larger set comprising a,b,c,d,e,ab,bc,abc,ec,ea,eca. As regards the entropy production, one obtains the same result provided the cycle affinity of compound cycles is additive, \Sigma(ab) = \Sigma(a)+\Sigma(b) etc.
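A minimal sketch of this cycle-popping procedure (the trajectory is a made-up example, not the one in the figures):

```python
def pop_cycles(trajectory):
    """Record and cut off a cycle each time the walker revisits a state,
    leaving a final cycle-less path (the Qians' representation)."""
    stack, cycles = [], []
    for state in trajectory:
        if state in stack:
            i = stack.index(state)
            cycles.append(stack[i:] + [state])  # the enclosed cycle
            stack = stack[:i + 1]               # cut it off
        else:
            stack.append(state)
    return cycles, stack

cycles, remainder = pop_cycles(list("abcabdceb"))
print("popped cycles:", cycles)          # ['a','b','c','a'] and ['b','d','c','e','b']
print("cycle-less remainder:", remainder)  # ['a', 'b']
```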

Another interesting difference between these two representations has to do with time reversal, which is believed to be a crucial operation in the formulation of Fluctuation Theorems. A simple look at the Schnakenberg cycle representation of a trajectory shows that by swapping the initial and final points and by inverting the sequence of visited missing edges to -e,-d,-c,-b,-a (the minus meaning that the cycles now have to be toured in the other direction) one obtains the exact time-reversed trajectory. This is not the case in the Qians/Kalpazidou representation. In fact, if we were to invert the trajectory we would obtain a different set of (backward) cycles:

[Figure 5: the backward cycles obtained by time-reversing the trajectory]

This means that time inversion leads to a nontrivial symmetry of the cycle currents. In fact, numbering from 1 to 11 the sequence of segments of the forward trajectory (on the left of the next figure), the trajectory that does invert the sequence of cycles is the following (on the right):

[Figure 7: left, the forward trajectory with segments numbered 1 to 11; right, the trajectory that inverts the sequence of cycles]

Notice that it is not obtained by setting i \to 12 - i (which would correspond to time inversion). Still, the second trajectory is a cut-and-paste of fragments of the time-reversed trajectory, and therefore by the Markov property they have the same probability. Moreover, it can be proven (I think) that the trajectory that inverts all of the cycle fluxes is unique, hence there is a one-to-one correspondence. Therefore, one should expect the Fluctuation Theorem for cycle currents to be satisfied, despite it not being a direct consequence of time-reversal arguments.

System/Environment Duality of Nonequilibrium Network Observables

Finally published is one of my first papers, which I wrote in 2011, at a time when I was not knowledgeable about the strategies of scientific publishing. I naïvely believed that, being a crucial piece of science, it would be warmly welcomed. I submitted it to a much-too-high-impact-factor journal, where it was refused without even going through the peer-review process. My disappointment was such that I dismissed it. Four years later I picked it up, made some minor modifications and proposed it as a conference proceeding:

• MP, System/environment duality of nonequilibrium network observables, in
Mathematical Technology of Networks, Springer Proceedings in Mathematics & Statistics Volume 128, ed. Delio Mugnolo (Springer, 2015), pp 191-205, arXiv:1106.1280

The idea is the following. A system is said to be in a nonequilibrium steady state when its state does not change in time but there is still internal motion. Take for example convective cells of water in a pan where a temperature difference is kept between the top and the bottom surfaces. The steadiness condition then implies that the working substance within the system, water in this case, must flow in cycles with no net flow, meaning that if one ideally cuts the pan in two, there will be no flow from one side to the other. On a network, this is a manifestation of Kirchhoff’s Current Law.

Now let’s focus on the environment. As the water flows in cycles, it picks up heat from the hotter reservoir and it gives it away to the colder one, hence creating a net flux of heat across the environment (a manifestation of Kirchhoff’s Loop Law). This directed flux does not contain cycles.

So, there is a dual behavior of system and environment: cyclic flow in the system implies directed flow through the environment, and vice versa, so that in a way the system is the environment of the environment.

In fact, this duality can be formulated in rigorous terms, at least when such flows occur on a network. The mathematical technology is that of graphs, their cycle and cut vector spaces, and the well-known notion of a dual of a (planar) graph (from the Wikipedia page):

[Figure: duals of planar graphs, from Wikipedia]

Although I found this idea quite fascinating, I didn’t pursue it further, partly because it has a static character that does not adapt well to the thermodynamics of master-equation systems, which is my main interest.
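For concreteness, here is a small numerical check of the system side of the duality on a toy graph of my own choosing: cycle vectors span the kernel of the incidence matrix, i.e. cycle flows automatically satisfy Kirchhoff’s Current Law, while the cut space (spanned by the rows of the incidence matrix) is its orthogonal complement.

```python
import numpy as np
import networkx as nx

# A toy network; edges get an arbitrary reference orientation.
G = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)])
edges = list(G.edges)
B = np.zeros((G.number_of_nodes(), len(edges)))   # incidence matrix
for k, (u, v) in enumerate(edges):
    B[u, k], B[v, k] = -1.0, 1.0

def cycle_vector(cycle):
    """+/-1 on the edges of a cycle, respecting the reference orientation."""
    z = np.zeros(len(edges))
    for u, v in zip(cycle, cycle[1:] + cycle[:1]):
        if (u, v) in edges:
            z[edges.index((u, v))] = 1.0
        else:
            z[edges.index((v, u))] = -1.0
    return z

for cycle in nx.cycle_basis(G):
    # Kirchhoff's Current Law: cycle flows have zero net flux at every node,
    # i.e. the cycle space is the kernel of B, and the cut space its
    # orthogonal complement: the linear-algebra face of the duality.
    assert np.allclose(B @ cycle_vector(cycle), 0.0)
print("all cycle vectors satisfy KCL")
```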

Seminar – Matthieu Verstraete: Ab initio thermoelectric properties

From the University of Liège. Work with Xu, Di Gennaro, Delhadj, Dewandre, Verstraete.

We are trying to scramble the post-petroleum energy scenarios. While nuclear seems dead and solar and wind rule, biomass is not really ecological; the real key point is to make sure we use energy efficiently and recover as much of it as possible. Projections of global energy consumption show that the projected growth in renewables will all go to fill the energy needs of fast-“developing” (my quotes) countries like China, India and Brazil, hence we will keep burning oil at the same rate.

Physical strategies to save energy are: waste-heat scavenging, exploiting phase transformations [Nature 502, 85 (2013)], magnetic induction, MEMS/NEMS (via sensors, smart clothes, WiFi, power plants), and most importantly thermoelectrics.

Thermoelectrics uses either the Peltier effect (for cooling) or its inverse, the Seebeck effect (for power generation), to produce thermodynamic cycles that transform heat flows into electric currents or vice versa. A crucial quantity for the quality of a thermoelectric is the figure of merit:

ZT=\frac{S^2 \sigma_{el} T}{k_{latt}+k_{el}}

One wants to make this number as big as possible. The denominator is the total thermal conductivity: the first term is due to phonons (the “phonon heat shunt”), which you can’t easily get rid of (this is part of the reason why thermoelectrics are not so efficient), while the second term is due to electrons. So one wants to minimize the phonon part and reduce the electronic one too, without reducing the electronic conductivity (in the numerator) too much. Finally, the coefficient S is the so-called Seebeck coefficient, whose ab-initio calculation this talk aims at.
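Just to fix ideas, a few lines with plausible orders of magnitude (the numbers are my own rough choices, not from the talk):

```python
def figure_of_merit(S, sigma_el, k_latt, k_el, T):
    """Dimensionless thermoelectric figure of merit ZT = S^2 * sigma * T / k."""
    return S**2 * sigma_el * T / (k_latt + k_el)

# Rough SI values for a decent thermoelectric:
# S ~ 200 uV/K, sigma ~ 1e5 S/m, k ~ 1.5 W/(m K), T = 300 K  ->  ZT ~ 0.8
print(figure_of_merit(200e-6, 1e5, 1.0, 0.5, 300.0))
```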

In the ’90s there was revived interest in thermoelectrics, also industrially, because of new niches, and also because of some new paradigms that allow one to reduce thermal conductivity without reducing electrical conductivity too much. However, there is quite some controversy over the possibility that thermoelectrics will ever be efficient:

In particular, the latter has argued that whatever people do doesn’t make much of a difference in the economics of beating other resources.

Nevertheless, theory has made quite some steps since the ’90s. Hicks & Dresselhaus [PRB 47, 16631 (1993)] showed that in principle, if you nano-structure normal materials, you can improve the figure of merit quite spectacularly, especially if you can move to 2D or 1D materials (5-10 Å layers). But Sofo and Mahan showed that the model was too simplistic [Appl. Phys. Lett. 65, 2690 (1994)], because layers communicate if they are too thin (see also Mahan & Sofo, The Best Thermoelectric, PNAS). Another advance is “all-scale nanostructuring” [Biswas et al., Nature 489, 414 (2012)], a technique designed to kill phonons of different wavelengths. This is done at different scales: from atomic doping to nanoscale and mesoscale grains. By killing all phonon modes one can greatly improve ZT.

In this talk we learn about ab-initio calculations of the Seebeck coefficient, starting from the band structure of electrons in materials. The calculations are based on the Boltzmann transport equation for the electrons, so the arguments are semiclassical. Once one solves the equation one can obtain the currents (assuming diffusive scattering, no ballistic transport, no hopping, which works quite well across different scales, even in lower-dimensional systems). One crucial approximation (that can eventually be waived, to some extent), which allows one to solve the Boltzmann equation, is the constant relaxation-time approximation (RTA):

\frac{\partial f}{\partial t} = - \frac{f- f_0}{\tau}

The RTA is sometimes very wrong, for example in simple metals. My understanding is that it corresponds to a “linear regime” approximation in nonequilibrium thermodynamics.
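To illustrate what such a calculation involves at the most naive level, here is a toy RTA estimate of S for a made-up transport distribution. All functional forms and numbers are my own; the actual ab-initio machinery works with the full band structure and scattering rates.

```python
import numpy as np

kB = 8.617333e-5   # Boltzmann constant [eV/K]

def transport_average(E, sigma_E, mu, T):
    """Average of (E - mu) over the 'transport window' sigma(E) * (-df/dE),
    the kernel of the RTA Boltzmann transport coefficients."""
    x = (E - mu) / (kB * T)
    dfdE = np.exp(x) / (kB * T * (1.0 + np.exp(x))**2)   # -df/dE, Fermi-Dirac
    w = sigma_E * dfdE
    return np.sum(w * (E - mu)) / np.sum(w)

# Toy electron-like transport distribution: a band edge at E = 0 (made up).
E = np.linspace(-1.0, 1.0, 20001)      # energy grid [eV]
sigma_E = np.clip(E, 0.0, None)
T, mu = 300.0, 0.1
S = -transport_average(E, sigma_E, mu, T) / T   # Seebeck coefficient [V/K]
print(f"S = {S * 1e6:.1f} uV/K")   # negative: electron-like carriers
```

Flipping the transport distribution to a hole-like band (nonzero below the chemical potential) flips the sign of S, which is the effect at stake in the anomalous metals discussed next.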

As an application, there are metals (Li, Au, Cu, Ag) where the Seebeck coefficient is anomalous, that is, it has the wrong sign (it is positive: the majority of carriers pushed through the system are holes) (Robinson, Phys. Rev. 161; Robinson and Dow, Phys. Rev. 171; Jones, Proc. Phys. Soc.). With their method they can explain why, based on some funny band structure:

  • Xu and Verstraete, First Principles Explanation of the Positive Seebeck Coefficient of Lithium,  PRL 112, 196603 (2014)

They apply the theory to other metals; they get the right sign but not good agreement with the measured curves. The question now is: can one engineer S?

The reason why both positive and negative Seebeck coefficients are needed is that a thermoelectric device requires materials with both properties in order to circulate currents. Hence, one wants these materials to be chemically compatible, and the best would be to obtain both behaviours from a single material, by doping it appropriately.

 

Mathematical Trends in Reaction Network Theory in Copenhagen /3

Alan Rendall, Dynamics of phosphorylation cycles. A protein is a chain of amino acids. Often other chemical groups are attached to the amino acids, for example the phosphate group attached to serine etc. Phosphorylation and dephosphorylation then happen, catalysed by enzymes (kinases and phosphatases). There emerge interacting networks of phosphoproteins, which allow storage and transfer of information. There is a cascade of phosphorylated things that act as enzymes for the next step, as studied by Huang and Ferrell, 1996 (with mass-action kinetics, MAK, and explicit enzymes included). For Michaelis-Menten one needs a clear distinction between enzyme and substrate, but here that is not possible; moreover, it can happen that one enzyme is not specific, i.e. it catalyses two reactions (sequestration). Moreover, the cascade can be stuck in feedback loops, which can lead to oscillations (Kholodenko 2000). Originally no interesting dynamics was expected without feedback, but in 2008 Wang & Sontag proved multistationarity in a model with a dual futile cycle, and similarly already in Michaelis-Menten one can obtain bistability (Ortega et al. 2006) for certain parameter values. Baez asks the question everybody wanted to ask: what’s a futile cycle? Something like X -> X + Phosphate -> X + 2 Phosphate -> X, which uses energy but goes back to the initial state, so it seems pointless. A quite detailed list of more and more mathematical results on the existence of periodic solutions in the cascade follows.
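To see a futile cycle turn, here is a minimal mass-action simulation of a single phosphorylation/dephosphorylation cycle with explicit enzymes. The rate constants are my own illustrative choices, far simpler than the Huang-Ferrell cascade.

```python
from scipy.integrate import solve_ivp

# Mass-action futile cycle with explicit enzymes (kinase E, phosphatase F):
#   E + S <-> ES -> E + Sp,    F + Sp <-> FSp -> F + S
# All rate constants below are made-up illustrative numbers.
a1, d1, c1 = 1.0, 1.0, 0.5
a2, d2, c2 = 1.0, 1.0, 0.3

def rhs(t, y):
    E, S, ES, F, Sp, FSp = y
    v1f, v1b, v1c = a1*E*S, d1*ES, c1*ES
    v2f, v2b, v2c = a2*F*Sp, d2*FSp, c2*FSp
    return [-v1f + v1b + v1c,   # E
            -v1f + v1b + v2c,   # S
             v1f - v1b - v1c,   # ES
            -v2f + v2b + v2c,   # F
             v1c - v2f + v2b,   # Sp
             v2f - v2b - v2c]   # FSp

sol = solve_ivp(rhs, (0, 200), [1.0, 10.0, 0.0, 1.0, 0.0, 0.0], rtol=1e-8)
print(f"steady state: S = {sol.y[1, -1]:.3f}, Sp = {sol.y[4, -1]:.3f}")
# At steady state the cycle keeps turning (phosphate is continuously attached
# and removed) with no net change: that is what makes it 'futile'.
```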

Manoj Gopalkrishnan, Statistical inference with a chemical soup. In the same spirit as Doty: how would we design a network to do something interesting? Website: Molecular Programming Project. The idea is to find a model of computation native to reaction networks, via Boolean circuits (Winfree, Qian) etc. His proposal is to use statistical inference, in particular log-linear models, “arguably the most popular and important statistical models for the analysis of categorical data” (Fienberg et al., Maximum likelihood… log-linear models). There is a design matrix A that contains some “stoichiometric coefficients” regulating the probability of outputs from certain unknown parameters. One asks for Pr[X | \Theta] = \Theta^A (in compressed CN notation). Given several trials, one can build the maximum likelihood estimator \Theta^\ast = \mathrm{argmax}_\Theta \Pr[x|\Theta] (x being the frequency of X in a series of trials), and a maximum likelihood distribution for the probability of certain parameters. Now he has what looks like a very nice result: if you take a design matrix A, applying MAK etc., the maximum likelihood estimator is the equilibrium point of the corresponding reaction network. He has some claim that “MAK maximises entropy”. Refs: Napp & Adams, Message passing inference with chemical reaction networks; Pachter and Sturmfels, Algebraic Statistics for Computational Biology; Soloveichik, Seelig and Winfree, DNA as a universal substrate for chemical kinetics.
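For concreteness, a small numerical sketch of maximum likelihood estimation in a toy log-linear model. The design matrix and the counts are made up, and this only illustrates the statistical side, not the chemical implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Toy log-linear model: Pr[x | theta] ~ prod_j theta_j^A[j, x].
A = np.array([[1, 0, 1],
              [0, 1, 1]])           # 2 parameters, 3 outcomes
counts = np.array([30, 50, 20])    # observed frequencies of the 3 outcomes

def neg_log_likelihood(log_theta):
    log_p = log_theta @ A                    # log of theta^A, unnormalized
    log_p -= np.log(np.sum(np.exp(log_p)))   # normalize
    return -np.sum(counts * log_p)

theta_star = np.exp(minimize(neg_log_likelihood, np.zeros(2)).x)
print("maximum likelihood estimator:", theta_star)
# Gopalkrishnan's point, as I understand it: this optimum coincides with the
# equilibrium point of a mass-action network built from the design matrix A.
```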

David Anderson, Stochastic models of biochemical reaction systems and absolute concentration robustness. He talks about a subclass of deficiency-one models, called ACR (absolute concentration robustness) models (Shinar and Feinberg, Structural sources of robustness in biochemical reaction networks, Science 2010; remarkably for Science, the main result is a theorem!). Consider

A+B \stackrel{\alpha}{\to} 2B, B \stackrel{\beta}{\to} A

Fixed points are \bar{x}_A = \beta/\alpha, \bar{x}_B = M-\beta/\alpha, where M is a conserved quantity. Interestingly, the concentration \bar{x}_A is robust in the sense that it does not depend on the conserved quantity (while \bar{x}_B does). Then there is a theorem (Shinar and Feinberg): consider a deterministic mass-action reaction system that has a deficiency of one, admits a positive steady state and has two nonterminal complexes that differ only in species S; then the system has absolute concentration robustness in S. “Differing only in species S” means, e.g., A and A+B; A and A+2B should also count. To prove it, let’s write

\dot{x} = YA_k \Psi(x)

where Y is the complex-to-species matrix and A_k the so-called Kirchhoff matrix (a Laplacian) containing the rates, and \Psi(x) contains the monomials corresponding to complexes. We have

\Psi(\bar{x})= \sum_i c_i v_i, \qquad v_i \in \ker{YA_k}

Now there must be (at least one) vector v_i that has support in non-terminal complexes, and then if one can write the solution in terms of its entries then one is done. But here I didn’t really see how deficiency steps in.
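Numerics are easier than the proof. Here is a quick check of ACR on the toy network above (the rate values are my own illustrative choices): the steady-state value of A ignores the conserved total M.

```python
from scipy.integrate import solve_ivp

alpha, beta = 2.0, 1.0   # rates of A+B -> 2B and B -> A (illustrative values)

def rhs(t, y):
    a, b = y
    return [-alpha*a*b + beta*b, alpha*a*b - beta*b]

# Different conserved totals M = a + b, same steady-state value of A:
for M in (2.0, 5.0, 10.0):
    sol = solve_ivp(rhs, (0, 100), [M - 0.5, 0.5], rtol=1e-9)
    print(f"M = {M}: A -> {sol.y[0, -1]:.4f}  (beta/alpha = {beta/alpha})")
```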

Now, at the stochastic level, they started with a conjecture: what is the marginal distribution of a robust species? The conjecture was that it should be Poissonian, but the simplest example proved it wrong. Instead they came up with a theorem: consider a CN satisfying deficiency one etc.; then with probability one the species will go extinct (Anderson, Enciso et al.). But they observed that these systems do have some quasi-stationary Poisson behavior, and they are turning it into a theorem (Anderson, Cappelletti, Kurtz, being written): time averages of some functions look like Poisson averages. Here the story is somehow reminiscent of ageing and glassy behavior. A simple example of a system that has a unique stable fixed point for the ODE, but no stationary distribution, is (Ankit Gupta) \emptyset \to X_1, \emptyset \to X_2, X_1+X_2 \to \emptyset.
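The contrast with the deterministic picture is easy to see in a stochastic simulation of the same toy network (a minimal Gillespie sketch with made-up rates; the state with no B left is absorbing):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta = 0.5, 2.0   # illustrative rates; robust value of A is beta/alpha

def gillespie(a, b):
    """Stochastic A+B -> 2B, B -> A; the state (a+b, 0) is absorbing."""
    t = 0.0
    while b > 0:
        r1, r2 = alpha * a * b, beta * b
        t += rng.exponential(1.0 / (r1 + r2))
        if rng.random() < r1 / (r1 + r2):
            a, b = a - 1, b + 1   # A + B -> 2B
        else:
            a, b = a + 1, b - 1   # B -> A
    return t, a

t, a = gillespie(a=4, b=4)
print(f"B extinct at t = {t:.1f}; all {a} molecules frozen as A")
```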

Mustafa Khammash, Real-time control of gene expression. A control-theory perspective on gene expression, based on feedback control. He showed the simplest feedback control algorithm, with e(t) = r - y(t) being the error, y(t) the controlled variable and r the desired value (I didn’t have time to copy the equation). Basically you want to correct y(t) based on some measurement; there will be an error, and some control system adapts the response of the system. Similar processes happen in the body, for example the maintenance of the calcium concentration in blood, via input from bone resorption and intestinal absorption, and output via calcium clearance. Similar phenomena appear in balancing. Integral feedback in control theory allows one to reach a steady state that is independent of the external conditions and has zero error, while with proportional feedback the steady state depends on the external conditions. He then applies these ideas to the Chemical Master Equation, but since controlling the full distribution is hard, one would want to control the moments. The problem is that moment closure gives spectacular problems (as mentioned before). The control problem is: augment the stochastic reaction network so as to obtain a given average for the species in the long-time limit.
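The textbook point about integral versus proportional feedback can be checked in a few lines (a toy first-order plant of my own choosing, not Khammash’s gene-expression model):

```python
# First-order plant dy/dt = -y + u + d, with constant disturbance d.
# Compare proportional control u = Kp*e with integral control u = Ki*int(e).
def simulate(controller, T=50.0, dt=1e-3, r=1.0, d=0.5):
    y, z = 0.0, 0.0                 # z accumulates the integral of the error
    for _ in range(int(T / dt)):
        e = r - y
        z += e * dt
        u = controller(e, z)
        y += (-y + u + d) * dt      # Euler step of the plant
    return y

Kp, Ki = 5.0, 5.0
print("proportional:", simulate(lambda e, z: Kp * e))   # steady offset
print("integral:    ", simulate(lambda e, z: Ki * z))   # zero steady error
```

With proportional feedback the steady state y = (Kp r + d)/(1 + Kp) still carries the disturbance d; with the integrator, steadiness forces e = 0 exactly, whatever d is.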

 

Mathematical Trends in Reaction Network Theory in Copenhagen /2

Following from here.

Ovidiu Radulescu, Taming the complexity of biochemical networks through model reduction and tropical geometry. Actual Chemical Networks (CN) can be extremely complex, while rate equations cannot go much beyond 20 reactions. Hence one needs ways to reduce the model, by lumping reactions and replacing them with “elementary modes”. Most models have a multitude of time scales. Ovidiu represents monomolecular reactions by integer-labelled digraphs. Rules for reduction are: “pooling” a fast cycle, “pruning” a fast reaction, etc. All of these are meant to somehow preserve the dynamics (while I have my own ideas on how to reduce systems based on thermodynamics). So far, though, he was describing linear networks; the nonlinear case is much more complicated (although I did not understand in which respect his model reduction preserves the dynamics even in the linear case). The Tikhonov-Fenichel reduction theorems are mentioned again, which makes them one of the more important things I’ll have to learn about.

Nikki Meshkat, Algebraic techniques for the parameter identifiability problem in systems biology. Suppose you have a state, an input function, an output function and a parameter. The question is which parameters can be identified given the input/output. One wants to be able to identify the parameters very clearly in a noise-free situation. She considers linear models, but all of this can be applied to nonlinear models. She considers a two-species model with an input current, where the output is the concentration of one of the two species; the only known things are then one concentration and an influx. The unknown parameters of the problem are the kinetic rates. She needs to do some “differential elimination” (see Boulier 2007) to obtain a second-order input/output equation. From the time course she can then estimate the coefficients of this second-order equation. And then the question is: is it possible to trace back the parameters, i.e. is the map from the parameters to the coefficients one-to-one? The map is polynomial, and most often it will be many-to-one, hence the system is “not identifiable”. Even in that case, though, she can come up with a reparametrization that maps the system into an identifiable one. Things are not always that easy, and the math soon becomes complicated, including Groebner bases, matroid theory (which seems to be more and more important in many areas…) etc.
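Here is a sketch of the flavor of these computations on a standard two-compartment model (my stand-in example, not necessarily the one she used), using sympy to expose the input/output coefficients:

```python
import sympy as sp

s, k01, k21, k12, k02 = sp.symbols('s k01 k21 k12 k02', positive=True)

# Two-compartment linear model: input u enters compartment 1, we measure y = x1.
#   x1' = -(k01 + k21)*x1 + k12*x2 + u
#   x2' =  k21*x1 - (k02 + k12)*x2
A = sp.Matrix([[-(k01 + k21), k12], [k21, -(k02 + k12)]])
H = ((s*sp.eye(2) - A).inv() * sp.Matrix([1, 0]))[0]   # transfer function u -> y
num, den = sp.fraction(sp.cancel(sp.together(H)))

# The coefficients of the input/output equation, i.e. what a noise-free
# experiment lets you estimate:
print(sp.Poly(den, s).all_coeffs())   # [1, k01 + k02 + k12 + k21, k01*k02 + k01*k12 + k02*k21]
print(sp.Poly(num, s).all_coeffs())   # [1, k02 + k12]
# Three independent coefficient combinations constrain four rate constants:
# the map from parameters to coefficients is many-to-one, so the individual
# rates are not identifiable from this input/output experiment.
```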

Janos Toth, The form of kinetic differential equations. The first page was filled with formulas I could not understand.

David Doty, Computation by (not about) chemistry. This was the coolest talk; Doty is really good at captivating the audience’s attention by asking them to solve simple “exercises”. You may be wondering “why compute with chemistry?”, given that it’s slower and not significantly bigger. The reason might be that it adapts to wet, hot environments. CRNs have many properties that might be suitable for computation: Boolean logic, being discrete or analog, having oscillations etc. He showed that many functions (linear functions, min and max) can be implemented by a chemical reaction network, where some variable species X_1, X_2 etc. interact with “Yes” and “No” species; interaction between the species performs the computation (e.g. f(X_1,X_2) = X_1-X_2 is implemented by X_2+Y \to \emptyset, X_1 \to Y), and interaction with the Yes and No species allows one to program logical sentences and evaluate their truth value. Furthermore, CRNs can simulate counter machines (which are very slow universal Turing machines). This only holds if certain reactions can artificially be made extremely slow, and at very high energetic cost (all reactions are irreversible).
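Here is a minimal stochastic simulation of that difference-computing network (my own quick implementation). Note that the output is deterministic even though the schedule of reactions is random:

```python
import numpy as np

rng = np.random.default_rng(0)

def compute_difference(x1, x2):
    """Stochastic simulation of the CRN  X1 -> Y,  X2 + Y -> 0,
    which computes f(X1, X2) = max(X1 - X2, 0) in the species Y."""
    y = 0
    while True:
        r1, r2 = float(x1), float(x2 * y)   # mass-action propensities
        if r1 + r2 == 0:
            return y
        if rng.random() < r1 / (r1 + r2):
            x1, y = x1 - 1, y + 1           # X1 -> Y
        else:
            x2, y = x2 - 1, y - 1           # X2 + Y -> 0
    
print(compute_difference(30, 12))   # -> 18
```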

David Schnoerr, Breakdown of the chemical Langevin equation and moment closure approximations for stochastic chemical kinetics. Evidence for intrinsic noise in living cells was found by Elowitz et al., Science 297. In E. coli, genes are controlled by the same promoters, hence extrinsic noise should be the same, and if that were the only source of noise the genes should be perfectly correlated. Nevertheless the genes are expressed differently, hence there is intrinsic stochasticity. Schnoerr compared the Chemical Langevin Equation (CLE) and moment closure approximations to the Chemical Master Equation. Moment closure works like this: one takes the CME and obtains a hierarchy of moment equations, then approximates higher-order moments in terms of lower-order ones by assuming a distribution (e.g. Gaussian, which gives certain relations between the moments). A problem with the CLE: molecule numbers can become negative! People have tried to fix this, sometimes by posing a reflecting boundary at zero; unfortunately this does not give the exact first two moments. A solution is to let the variables become complex; still the moments remain real, and this can be proven exactly, which is remarkable. Moreover, he showed by simulations that going complex performs much better in calculating mean and variance. Instead, moment closure approximations can fail to give physical predictions: they diverge from the real values and can only be used above a certain critical volume. Interestingly, moment closure at second order is not equivalent to Van Kampen’s system-size expansion.
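The negativity problem is easy to reproduce (a minimal Euler-Maruyama sketch of the CLE for plain degradation, with made-up rate and step size):

```python
import numpy as np

rng = np.random.default_rng(7)

# Chemical Langevin equation for plain degradation X -> 0 with rate k:
#   dx = -k*x*dt + sqrt(k*x)*dW
# At low copy numbers the noise term can push x below zero, where sqrt(k*x)
# is no longer defined: the breakdown discussed in the talk.
k, dt = 1.0, 1e-3
x, t = 10.0, 0.0
while x > 0:
    x += -k * x * dt + np.sqrt(k * x * dt) * rng.normal()
    t += dt
print(f"x went negative at t = {t:.2f}: x = {x:.4f}")
```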

Daniele Cappelletti, Complex balanced reaction systems and product-form Poisson distribution. He covered the great results relating complex balance, deficiency and product-form Poisson-like stationary distributions. Apparently some of the results can be stated in a more general setting than mass-action law. Earlier, he gave me a very interesting example of a network that is not zero-deficiency but has a product-form stationary distribution, namely A+B \to 2A + B, 2A \to A, which is formally product-form (because B does not evolve) but whose rates differ in every stoichiometric compatibility class. The novelty is a definition of a “complex-balanced stationary distribution”, which is kind of a new definition. It turns out that it is always a product-form Poisson-like distribution, and that one can parallel for stochastic systems the same construction as Horn and Jackson and Feinberg, but for the Chemical Master Equation.
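The simplest instance of the product-form result can be checked numerically. My own toy example: the birth-death network \emptyset \to A \to \emptyset, which is complex balanced, and whose CME has a Poisson stationary law with mean \lambda/\mu.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(3)
lam, mu = 5.0, 1.0   # rates of 0 -> A and A -> 0 (illustrative values)

# Occupation times from a long Gillespie run estimate the stationary
# distribution of the Chemical Master Equation.
n, total_t, occupation = 0, 0.0, {}
for _ in range(200_000):
    rate = lam + mu * n
    dt = rng.exponential(1.0 / rate)
    occupation[n] = occupation.get(n, 0.0) + dt
    total_t += dt
    n += 1 if rng.random() < lam / rate else -1

for m in range(10):
    empirical = occupation.get(m, 0.0) / total_t
    poisson = exp(-lam / mu) * (lam / mu)**m / factorial(m)
    print(f"n = {m}: empirical {empirical:.4f} vs Poisson {poisson:.4f}")
```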