Relation with Statistical Mechanics

The mathematical structure of quantum field theory, in the form in which it was defined in section 3.1, is formally identical to the mathematical formalism used in statistical mechanics for lattice systems. The mathematical difficulties that must be faced in the calculation of the averages are the same in either case and, in fact, the case $d=3$ coincides completely with the formalism of the canonical ensemble of statistical mechanics. In the case $d=4$, particularly because there is then the additional issue of changing from Euclidean space to Minkowski space, we have only an analogy with respect to the physical aspects of the theory. However, this analogy is extremely useful as a guide for our physical intuition within quantum field theory. Often concepts of statistical mechanics find use in quantum field theory and their nomenclature is used for the corresponding mathematical elements in that theory, but we should not lose sight of the great differences of physical interpretation that exist between the two theories.

We will make here a few comparisons between terms and concepts of each theory, relating each element of our structure to the corresponding elements of statistical mechanics. We will also point out the main differences of interpretation between the two theories, in addition to introducing some concepts that are of great importance and usefulness. Without intending to develop the subject in detail or to present objective evidence for the facts mentioned, we will try to describe the main facts relating to the aspects of statistical mechanics that are most important for quantum field theory, especially those related to the phenomenology of systems that display phase transitions and critical behavior.

In statistical mechanics the lattice usually represents some real crystalline structure, which implies, in particular, that in this case there is a natural length scale in the system, defined by the lattice spacing of this crystalline structure as measured in terms of the atomic and molecular parameters of matter. The paradigmatic topic for the use of the lattice in statistical mechanics is the study of crystalline substances with magnetic properties. In this case the fields $\varphi(s)$ associated to each site are representations of the spins of the components of matter, and of their magnetic moments. In this context the quantity that plays the role of the action is the energy, represented by the Hamiltonian function $H$ of the system of spins, the relative statistical weights being given by the usual Boltzmann distribution $\exp(-\beta H[\varphi])$, where $\beta=1/(kT)$ is the usual factor involving the temperature $T$ of the system. A simple model that is very popular for this type of study is the Ising model, in which we have at each site a one-dimensional spin $\varphi$ that can assume only two discrete values, $1$ and $-1$. The energy of the system is given by


\begin{displaymath}
H[\varphi]=-\sum_{\ell}\varphi_{(-)}\varphi_{(+)}-j\sum_{s}\varphi(s).
\end{displaymath}

In future volumes we will see that there are indeed close relations between this model and the models of scalar fields in quantum field theory. Observe that this Hamiltonian causes it to be energetically favorable for neighboring spins to have the same sign, that is, for them to align with each other. The denominator that appears in (3.1.1) corresponds in this case to the partition function of the statistical model,


\begin{displaymath}
Z=\sum_{\cal C}e^{-\beta H[\varphi]},
\end{displaymath}

where the indicated sum is over all the configurations ${\cal C}$ of the system, that is, all possible combinations of $1$ or $-1$ at all the sites of the lattice. This model was created and is widely used for the study of critical phenomena in statistical mechanics, which are associated to phase transitions in the materials. Processes such as the boiling of liquids and the spontaneous magnetization of certain metals and other materials are examples of phase transitions. The Ising model can be solved without too much difficulty in the case $d=1$, but in this case it does not display critical behavior. On the other hand, in any dimension equal to or greater than $d=2$ it does display critical behavior, but the exact solution of the model is unknown in the majority of these cases. The case $d=2$ is extremely special because it is one of the very few models with critical behavior between two distinct phases that can be solved exactly, under certain conditions. It is necessary to emphasize here that all these models only display critical behavior in the $N\rightarrow \infty $ limit, that is, when we have extremely large lattices, as is the case for the real crystalline lattices of macroscopic quantities of materials.
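The sum over configurations that defines $Z$ can be carried out literally on very small lattices. The following is a minimal sketch in Python, assuming a one-dimensional chain with periodic boundary conditions, so that there are $N$ links. The function names are introduced here only for illustration. For $j=0$ the result is compared with the well-known transfer-matrix expression $Z=(2\cosh\beta)^{N}+(2\sinh\beta)^{N}$, which is quoted here and not derived in the text.

```python
import itertools
import math


def ising_energy(spins, j=0.0):
    """H = -(sum over links of phi(-) phi(+)) - j (sum over sites of phi(s)).

    Assumes a one-dimensional chain with periodic boundary conditions,
    so that site i is linked to site (i + 1) mod N.
    """
    n = len(spins)
    link_sum = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
    return -link_sum - j * sum(spins)


def partition_function(n, beta, j=0.0):
    """Sum exp(-beta H) over all 2**n configurations of +1 and -1."""
    return sum(
        math.exp(-beta * ising_energy(config, j))
        for config in itertools.product((1, -1), repeat=n)
    )


def transfer_matrix_z(n, beta):
    """Closed form for the periodic chain with j = 0 (standard result)."""
    return (2.0 * math.cosh(beta)) ** n + (2.0 * math.sinh(beta)) ** n


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, partition_function(n, 0.7), transfer_matrix_z(n, 0.7))
```

The number of terms grows as $2^{N}$, which is why this direct approach is limited to very small lattices.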

Models like these, that display critical behavior, will be of extreme interest for quantum field theory. In the case of the Ising model the spins are discrete variables, but it is also possible to define similar models with continuous variables, which will be of even greater interest. One such example is the Heisenberg model, in which we consider that there exists at each site a three-dimensional classical spin, that is, a vector $\vec{\varphi}$ with three components and fixed modulus $\vert\vec{\varphi}\vert=1$. These are continuous variables that span the two-dimensional sphere $S_{(2)}$, rather than discrete variables as in the Ising model. In this case the Hamiltonian is given by


\begin{displaymath}
H[\vec{\varphi}]=-\sum_{\ell}\vec{\varphi}_{(-)}\cdot\vec{\varphi}_{(+)}
-\vec{\jmath}\cdot\sum_{s}\vec{\varphi}(s),
\end{displaymath}

where the dot denotes the scalar product of vectors. As we shall see in future volumes, this model also has close relations with the models of scalar fields of quantum field theory. An important difference between this type of model and the Ising model is that in this case $H$ is invariant under a continuous set of symmetry transformations, the set of three-dimensional rotations, while in the discrete Ising model $H$ is invariant under a discrete set of transformations, the sign reflections of the spins. In this continuous case the partition function is not given by a discrete sum, but rather by a functional integral


\begin{displaymath}
Z=\int_{S_{(2)}}[{\bf d}\sigma]\;e^{-\beta H[\vec{\varphi}]},
\end{displaymath}

where ${\rm d}\sigma$ is the area element of $S_{(2)}$. These models only display critical behavior for $d>2$, not for $d=2$ or $d=1$. In fact, it is fairly well established that in $d=1$ there are no models with couplings only between nearest neighbors that display the long-range order which is characteristic of the type of critical behavior that is of interest for us in quantum field theory. The same is true in $d=2$ for models which are invariant under continuous symmetry transformations, as is the case for the Heisenberg model. The particular case of the Ising model in $d=2$ is not an exception to this rule, because in this very special case the invariance transformations are discrete, not continuous.
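To make the Heisenberg Hamiltonian and its rotational invariance concrete, the following is a minimal sketch in Python that evaluates $H[\vec{\varphi}]$ for a given configuration. It assumes a cubical lattice with $N$ sites along each of the $d$ directions and periodic boundary conditions, so that each site owns one link in each positive direction. The function name and the dictionary layout of the configuration are hypothetical choices made only for this example.

```python
import itertools


def heisenberg_energy(spins, n, d, j=(0.0, 0.0, 0.0)):
    """H = -(sum over links of phi(-) . phi(+)) - j . (sum over sites of phi(s)).

    spins maps each site, a tuple of d integers in range(n), to a
    three-component unit vector. Periodic boundary conditions are assumed.
    """

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    energy = 0.0
    for site in itertools.product(range(n), repeat=d):
        for mu in range(d):
            # Neighbor of this site in the positive direction mu.
            neighbor = list(site)
            neighbor[mu] = (neighbor[mu] + 1) % n
            energy -= dot(spins[site], spins[tuple(neighbor)])
        energy -= dot(j, spins[site])
    return energy


if __name__ == "__main__":
    n, d = 3, 2
    sites = list(itertools.product(range(n), repeat=d))
    aligned = {s: (0.0, 0.0, 1.0) for s in sites}
    # All spins aligned: every one of the d * n**d links contributes -1.
    print(heisenberg_energy(aligned, n, d))
    # The same configuration rotated so that all spins point along x.
    rotated = {s: (1.0, 0.0, 0.0) for s in sites}
    print(heisenberg_energy(rotated, n, d))
```

With $\vec{\jmath}=0$ the two configurations printed above have the same energy, since they differ by a global rotation. A non-zero external source $\vec{\jmath}$ selects a direction and breaks this invariance.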

Figure 3.2.1: Qualitative diagram of the magnetization as a function of $\beta $.
\begin{figure}\centering
\epsfig{file=c3-s02-magnetizacao.fps,scale=0.6,angle=0}
\end{figure}

The behavior of the Heisenberg models for $d>2$ may be described in a qualitative way as follows. The case of the Ising models is a little different due to the fact that the variables are discrete, but all the fundamental facts relative to the behavior close to the critical point are similar. First, we define a quantity $\vec{M}$, which we refer to here as the magnetization, which is simply the sum of all spins,


\begin{displaymath}
\vec{M}=\sum_{s}\vec{\varphi}(s).
\end{displaymath}

In the case of the quantum theory of fields, we would be more interested in the average value of the fields over the lattice,


\begin{displaymath}
\overline{\vec{\varphi}}=\frac{1}{N^{d}}\sum_{s}\vec{\varphi}(s),
\end{displaymath}

which is basically the same quantity with a different normalization. We say that the modulus of the average value of $\vec{M}$ is the order parameter of the system, because its behavior characterizes the two phases in which the system can exist. For high temperatures $T$, that is, for small $\beta $, the model has a phase that is denominated symmetrical or disordered and that is characterized by the value


\begin{displaymath}
M=\vert\langle\vec{M}\rangle\vert=0
\end{displaymath}

for the quantity shown, which we name the scalar magnetization $M$, where the statistical average is defined by


\begin{displaymath}
\langle\vec{M}\rangle=\frac{\displaystyle \int_{S_{(2)}}[{\bf
d}\sigma]\;\vec{M}\;e^{-\beta H[\vec{\varphi}]}}{\displaystyle \int_{S_{(2)}}[{\bf
d}\sigma]\;e^{-\beta H[\vec{\varphi}]}}.
\end{displaymath}

For low temperatures $T$ and hence large $\beta $ the model has an ordered or broken-symmetrical phase, in which $M\neq 0$. These two regions of values of $T$ are separated by a certain value $T_{c}$, the critical temperature, which is finite and non-zero for $d>2$. The two phases have very different thermodynamical characteristics, which change abruptly at $T_{c}$. For example, the typical qualitative behavior of the scalar magnetization is given in the graph of figure 3.2.1, where $\beta_{c}=1/(kT_{c})$.

In the symmetrical phase the spins are distributed in a very random way across the lattice and the correlations between a site and its neighbors are weak, that is, if the spin at a certain site points in one direction, the probabilities that the spin of one of its neighbors points in the same direction or in the opposite direction are practically the same. Sites which are more distant from one another than nearest neighbors are even less correlated. Clearly, this tends to make the average of $\vec{M}$ go to zero. We say that this phase is highly uncorrelated or that it has a short correlation length. In the broken-symmetrical phase the situation is the opposite: the spins tend to be all aligned with each other, causing the average of $\vec{M}$ to be different from zero. In this phase there are long-range correlations in the system, which appear dynamically as spin waves that propagate along the crystalline lattice. If disturbed, the spins oscillate in a coordinated way, each one significantly affecting its neighbors and giving rise to perturbations that propagate like waves over long distances. We say that in this case the system is highly correlated or that it has a long correlation length. The point $T=T_{c}$ is very special because this is the only point where we have at the same time $M=0$ and long-range correlations.

As one can see in the graph of the scalar magnetization given in figure 3.2.1, at the critical point the magnetization has a singular behavior, and is not differentiable as a function of $\beta $. In general the systems that display phase transitions are characterized by some form of singular behavior at the critical point that separates the two phases. We may classify the critical systems according to the degree of singularity that they display at the transition point. The first-order critical systems, of which boiling liquids are an example, are systems in which the order parameter, for example the density of the fluid, itself has a discontinuous behavior at the transition. Systems like the spontaneous magnetization models that we discuss here, in which the order parameter is continuous but not differentiable at the transition point, are denominated second-order critical systems, and are the only ones of real interest for the quantum theory of fields. This is due to the fact that the first-order systems, unlike the second-order ones, do not have long-range correlations at the critical point $T_{c}$. The existence of these long-range correlations is essential for the very existence of the quantum field theories in the continuum limit. Due to this, only the immediate neighborhoods of the critical points of models with second-order phase transitions are of interest for the quantum theory of fields, unlike what happens in statistical mechanics, where all the other regions of the space of parameters of the models also correspond to situations of physical interest.

In the classical theory of the free scalar field we saw that in order to obtain a finite mass $m_{0}$ in the continuum limit it is necessary to make the parameter $\alpha_{0}$ go to zero in the limit. It was mentioned then that this was a special value of this parameter, the critical value. We will see that in the quantum theory this is in fact a critical point of the model. In this case there is no phase transition, properly speaking, because the model only exists at all in one of the two regions of the $\alpha_{0}$ real line separated by the critical value, the half-axis in which $\alpha_{0}>0$. In the other half-axis the model is unstable, in the sense that in this region it is not possible to define it by means of the Euclidean lattice as we did here. We may denominate this region as the unstable “phase”, a name that comes from the fact that the computer simulations that one may try to execute in this region are in fact unstable, making the dimensionless fields $\varphi$ diverge randomly to infinity. The phase that does exist is denominated “symmetrical phase” for reasons that will become clear in future volumes when we examine the polynomial models of scalar fields. We can represent this whole situation by means of a critical diagram like the one in figure 3.2.2, as we will do in future volumes for less trivial models than this one. In statistical mechanics the free theory is called the Gaussian model and the critical point $\alpha_{0}=0$ is called the Gaussian critical point.

One of the most fundamental differences between statistical mechanics and quantum field theory relates to the types of limits that are of interest in each case. In both cases we are interested in the limit $N\rightarrow \infty $, but in statistical mechanics this limit is taken in a way that does not characterize it as a continuum limit, but rather as the thermodynamical limit in which we make the volume of the system tend to infinity. This is due to the fact that in this case the lattice spacing $a$ does not go to zero, but is instead kept constant, which implies that the size $L$ of the box must become infinite in the limit. This is the limit that corresponds to the study of macroscopic samples of materials whose structure is a lattice at the atomic level, where the lattice spacing $a$ establishes the physically relevant scale. In the case of quantum field theory we may either make the volume tend to infinity or keep it finite, but what is important is that in either case the lattice spacing $a$ be made to go to zero in comparison to the length scales that are relevant to the physics of the model. Hence, when we consider some finite and non-zero length in the case of statistical mechanics, it will always correspond to a finite number of consecutive links. In quantum field theory a finite and non-zero length will always correspond to an infinite number of consecutive links. This difference regarding the nature of the limits is one of the main conceptual differences between statistical mechanics and quantum field theory.
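As a rough worked illustration of this difference, with the symbols $\ell$ and $n$ introduced here only for the example and assuming the relation $L=Na$ between the size of the box, the number of sites along each direction and the lattice spacing, a physical length $\ell$ spans a number of consecutive links given by

\begin{displaymath}
n=\frac{\ell}{a}=\frac{\ell N}{L}.
\end{displaymath}

In the thermodynamical limit $a$ is kept fixed, so that $n=\ell/a$ stays finite while $L=Na$ grows without bound. In a continuum limit taken at fixed $L$ we have instead $n=\ell N/L\rightarrow\infty$ for any fixed $\ell>0$.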

Figure 3.2.2: Critical diagram for the theory of the free field.
\begin{figure}\centering
\epsfig{file=c3-s02-critdiagfree.fps,scale=0.6,angle=0}
\end{figure}

In these statistical systems we may define a function, which we will call the correlation function, that measures the range of the correlations among the spins at the various sites, as a function of the distances among them. Assuming that the model is such that the averages of the variables $\varphi$ at the sites are zero, $\langle\varphi\rangle=0$, while the variables undergo statistical fluctuations with a certain characteristic magnitude around this value, we may define this function, relating two sites $s_{1}$ and $s_{2}$, as


\begin{displaymath}
g(s_{1},s_{2})=\langle\varphi(s_{1})\varphi(s_{2})\rangle.
\end{displaymath}

It has the following property: if, when $\varphi(s_{1})$ has a positive value of typical magnitude, the probabilities that $\varphi(s_{2})$ be positive or negative are similar, then the average value of the product tends to go to zero, resulting in a small or zero $g(s_{1},s_{2})$; on the other hand, if the fact that $\varphi(s_{1})$ has a positive value of typical magnitude implies that the probability that $\varphi(s_{2})$ is aligned with it is significantly larger than the probability that it has the opposite sign, then the average value tends to be positive and non-zero, resulting in a non-zero $g(s_{1},s_{2})$, with a magnitude related to the typical value of the fluctuations of the variables at the sites. Hence, the fact that this function is either large or small compared to the typical size of the fluctuations measures the level of statistical correlation between the variables associated to the sites $s_{1}$ and $s_{2}$. If $s_{1}$ and $s_{2}$ are the same site $s$, then $g(s,s)=\sigma^{2}$ is the square of the average magnitude of the fluctuations of the variables, a positive and non-zero number. Since we are not interested here in the absolute values of the fluctuations of these variables but rather in the correlations between two of them, it is natural to normalize the correlation function so that it is unity at the origin. In addition to this, in case the variables $\varphi$ do not have zero averages, we can always calculate this average value $\bar{\varphi}$ and then describe the model in terms of new variables $\varphi'=\varphi-\bar{\varphi}$, that do have zero averages. With all these considerations we arrive at the final definition of the statistical correlation function. Given statistical variables $\varphi(s)$, we define the corresponding two-point correlation function as


\begin{displaymath}
\mathfrak{f}(s_{1},s_{2})=\frac{\langle\varphi'(s_{1})\varphi'(s_{2})\rangle}
{\langle[\varphi'(s)]^{2}\rangle},
\end{displaymath}

where


\begin{displaymath}
\varphi'(s)=\varphi(s)-\langle\varphi(s)\rangle.
\end{displaymath}

The function $\mathfrak{f}(s_{1},s_{2})$ has the property that $\mathfrak{f}(s,s)=1$, which represents the trivial fact that the variable at a certain site is always completely correlated to itself. In homogeneous systems, that have discrete translational invariance on the lattice, $\mathfrak{f}$ is in fact a function only of the distance $r$ between the sites, measured in terms of the number of links crossed in order to go from one site to the other. Besides, $\mathfrak{f}(r)$ is never an increasing function of the distance: usually it decreases, or at most it remains constant. In the great majority of systems $\mathfrak{f}(r)$ displays one of two general classes of behavior: it can display a decay with distance according to some inverse power of $r$, a situation which we denominate polynomial decay; or it can display an exponential decay with $r$, always much faster than any polynomial decay. In the latter case, for large distances $r$, we have that $\mathfrak{f}(r)$ assumes the general form


\begin{displaymath}
\mathfrak{f}(r)\sim \mathfrak{f}_{0}\frac{e^{-\frac{r}{r_{0}}}}{r^{p}},
\end{displaymath}

where $\mathfrak{f}_{0}$ and $r_{0}$ are positive constants and $p$ is a positive integer or half-integer power. The constant $r_{0}$ defines the range of the correlations, since for $r<r_{0}$ there will be appreciable correlations, while for $r>r_{0}$ the correlations vanish very quickly. We call $r_{0}$ the correlation length of the statistical system. As measured here, in terms of number of links and therefore using as the unit of length the lattice spacing $a$, this is the correlation length of interest for statistical mechanics. The statistical systems that display second-order critical behavior are characterized by the fact that the correlation length $r_{0}$ goes to infinity when we approach the critical point, which means that $\mathfrak{f}(r)$ ceases to display an exponential decay and acquires a polynomial decay at this point. We say then that the system has acquired long-range order. In these systems the exponential decay of $\mathfrak{f}(r)$ is characteristic of the symmetrical or disordered phases, while the polynomial decay is characteristic of the broken-symmetrical or ordered phases. In the context of quantum field theory, on the other hand, an $r_{0}$ that is a finite multiple of the lattice spacing $a$ represents a correlation length that goes to zero in the continuum limit, because by definition $a$ goes to zero in this limit. Hence, in the quantum theory only the situations in which $r_{0}$ tends to infinity in terms of $a$ are of any interest. It is due to this that in the quantum theory of fields we are interested only in the critical points, which are the points where $r_{0}$ behaves in this way.
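The exponential decay and the correlation length can be seen explicitly in the simplest case. The following is a minimal sketch in Python, assuming the one-dimensional Ising chain with periodic boundary conditions and $j=0$, for which $\langle\varphi\rangle=0$ by symmetry and $\langle\varphi^{2}\rangle=1$, so that $\mathfrak{f}$ reduces to $\langle\varphi(s_{1})\varphi(s_{2})\rangle$. The function names are hypothetical, and the closed form used for comparison, $\mathfrak{f}(r)=(t^{r}+t^{N-r})/(1+t^{N})$ with $t=\tanh\beta$, is the standard transfer-matrix result, quoted here and not derived in the text.

```python
import itertools
import math


def correlation(n, beta, r):
    """Normalized two-point function f(r) on a periodic Ising chain, j = 0.

    With j = 0 the average <phi> vanishes by symmetry and <phi^2> = 1,
    so f(r) is simply <phi(0) phi(r)>, computed by direct enumeration.
    """
    z = 0.0
    numerator = 0.0
    for config in itertools.product((1, -1), repeat=n):
        energy = -sum(config[i] * config[(i + 1) % n] for i in range(n))
        weight = math.exp(-beta * energy)
        z += weight
        numerator += weight * config[0] * config[r % n]
    return numerator / z


def correlation_closed_form(n, beta, r):
    """Standard transfer-matrix result for the same chain, t = tanh(beta)."""
    t = math.tanh(beta)
    return (t ** r + t ** (n - r)) / (1.0 + t ** n)


def correlation_length(beta):
    """r_0 in units of links for the infinite chain, f(r) = exp(-r / r_0)."""
    return -1.0 / math.log(math.tanh(beta))


if __name__ == "__main__":
    n, beta = 10, 0.8
    for r in range(n // 2 + 1):
        print(r, correlation(n, beta, r), correlation_closed_form(n, beta, r))
    print("r_0 =", correlation_length(beta))
```

In this one-dimensional example $r_{0}$ is finite for every finite $\beta $ and grows without bound only when $\beta\rightarrow\infty $, in agreement with the absence of critical behavior at finite temperature in $d=1$.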

We close with an observation regarding the concept of temperature in the context of quantum field theory. Observe that the statistical-mechanical quantity that really corresponds to the action $S$ of quantum field theory is the product $\beta H$. In many important models such as, for example, the gauge theories, it is possible to change variables in the action so that it ends up multiplied by a parameter such as this $\beta $. In these models we tend to refer to this parameter as the inverse of a temperature, since the analogy with the temperature of statistical mechanics is very useful to guide our intuition regarding the statistical inner workings of the model. However, it is necessary to emphasize that this parameter is in no way related to the thermodynamical temperature of the physical system described by the model defined by $S$. Usually the parameter is related to what we call the non-renormalized or bare coupling constant of the theory, and not to the true physical temperature. Of course there is a concept of thermodynamical temperature that can be defined as part of our models of quantum field theory, but it is not related to this parameter and it is important to keep in mind a clear distinction between the two concepts, since one involves the real thermodynamical temperature and the other is only a very useful mathematical analogy.

Problems

  1. Consider the Ising model in one dimension, as defined in the text, on finite lattices of size $N$. Write a program to calculate directly, summing over all possible configurations, the quantities $M=\vert\langle\vec{M}\rangle\vert$ and $M'=\langle\vert\vec{M}\vert\rangle$, given a value of $\beta $. Plot $M$ and $M'$ as functions of $\beta $ for a few values of $N$, up to the largest lattice for which it is still possible to run the program in a few minutes or less, for each value of $\beta $. How can you understand the results that you got?

  2. Repeat the previous problem for the Ising model in two dimensions.