Definition of the Quantum Theory

In this section we will define the mathematical object which we will call quantum field theory and enumerate some of its most important properties in a purely descriptive way. We will also mention a few points of fundamental importance for the physical interpretation of the theory. In this section we will make no effort to justify these points of physical interpretation or to derive the properties of the theory from its definition. Essentially all the rest of this book will be dedicated to such activities, and in future volumes we intend to explore other specific models and examples that may serve as illustrations, with the objective of progressively clarifying the structure of the theory. As for this section, we will consider its objectives achieved if it makes clear that a complete definition exists and that this definition is constructive, being given very explicitly by means of an algorithm, that is, a set of rules of procedure that, at least in principle, allow us to answer any question formulated within the structure of the theory.

For the definition of the quantum theory of fields, we start from the same discrete mathematical structure in which we obtained the classical theory. Once again we will use the action $S_{0}$ to illustrate the definition. In a way similar to that used to define the classical theory, we will first define a finite quantum theory on each finite lattice, and only after that consider the limit $N\rightarrow \infty $. As we shall see, a very important point is that, unlike the case of the classical theory, in this case it will not be necessary to introduce a dimensional scale, external to the model, when we take the continuum limit. We will define the quantum theory on each finite lattice of size $N$ as a finite statistical model on that lattice. The quantities of most immediate physical interest, the observables of the theory, will be defined as statistical averages of functionals of the field within this statistical model. The statistical model establishes that all the possible configurations of the fields contribute to the statistical averages, with relative probabilities defined by the action functional of the model. These configurations of the fields are simply all possible field-functions that we can define on the lattice, which can be described either directly in position space or by means of their Fourier components in momentum space. The relative statistical weights are given by a Boltzmann factor involving the action functional. For example, in the case of the free scalar field we have for these factors


\begin{displaymath}
e^{-S_{0}[\varphi]}.
\end{displaymath}

The set of field configurations with these associated probabilities is referred to as the ensemble of configurations or as the distribution of configurations of the model. The definition would be the same for any other model, with any number and types of fields, defined by some action functional $S$. Given a certain functional ${\cal
O}[\varphi]$ of the field, the expectation value of the observable associated to it on a lattice of size $N$ is defined as the average

  $\displaystyle
\langle{\cal O}\rangle_{N}=\frac{\displaystyle \int_{-\infty}^{\infty}\prod_{s}{\rm d}\varphi(s)\;{\cal O}[\varphi]\;e^{-S_{0}[\varphi]}}{\displaystyle \int_{-\infty}^{\infty}\prod_{s}{\rm d}\varphi(s)\;e^{-S_{0}[\varphi]}},
$ (3.1.1)

where the integration element is


\begin{displaymath}
\prod_{s}{\rm d}\varphi(s)=
\prod_{n_{1}=1}^{N}\ldots\prod_{n_{d}=1}^{N}{\rm d}\varphi(\vec{n})
\end{displaymath}

and the integral extends over all possible values of the field, on all the sites. In our case here, the value of the field at each site ranges over the whole real line. This is a ratio of two multiple integrals of large but finite dimension, being therefore a well-defined and familiar mathematical object. The conditions imposed before on the action and the fact that it appears as the argument of a decreasing exponential imply that, for all reasonably well-behaved functionals ${\cal O}$, we do not need to worry about the convergence of such integrals on finite lattices. We see now that the conditions imposed on $S[\varphi]$ so that it may be used in the role of an action functional have the objective of making sure that these integrals exist for a large set of observables, including those of physical interest for the theory of fields. From now on we will simplify the notation of these integrals a little, denoting $\prod_{s}{\rm d}\varphi(s)$ simply by $[{\bf d}\varphi]$. In more general cases, in which the field may have several components, this notation will refer to the integration over all independent field components at all the sites. For example, if we have a field $\vec{\varphi}$ with several components $\varphi_{i}$, the complete definition would be


\begin{displaymath}
[{\bf d}\varphi]\equiv\prod_{s}\prod_{i}{\rm d}\varphi_{i}(s).
\end{displaymath}

Usually we will also omit the limits of integration, since it is always understood that the integrals extend over the full range of values of the field functions. The structure including the functional integration element and the distribution of statistical weights, in which the observable is integrated in order to produce the expectation value,


\begin{displaymath}
\frac{\displaystyle [{\bf d}\varphi]\;e^{-S_{0}[\varphi]}}{\displaystyle \int[{\bf
d}\varphi]\;e^{-S_{0}[\varphi]}},
\end{displaymath}

defines a kind of measure over the space of configurations and is usually referred to as the measure of the model defined by the action $S_{0}$, or as the measure of $S_{0}$. As we will see later, this statistical structure, be it described as an ensemble, as a distribution or as a measure, constitutes in fact a representation of the vacuum state of the model in the context of the quantum theory.
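
As an illustration of how concrete this definition is, the ratio of integrals in equation (3.1.1) can be evaluated by brute force on a very small lattice. The following Python sketch does this for a periodic $d=1$ lattice with three sites. It assumes that $S_{0}$ has the usual form of a sum of squared nearest-neighbor differences over the links plus a mass-like term with parameter $\alpha_{0}$; the names {\tt action\_s0} and {\tt expectation}, the integration cutoff and the number of grid points are illustrative choices that do not come from the text.

```python
import numpy as np


def action_s0(phi, alpha0):
    """Assumed form of the lattice action on a periodic d=1 lattice:
    S0 = 1/2 sum_links (delta phi)^2 + alpha0/2 sum_sites phi^2.

    The last axis of phi runs over the sites.
    """
    dphi = np.roll(phi, -1, axis=-1) - phi  # phi(s+1) - phi(s), periodic
    return 0.5 * np.sum(dphi**2, axis=-1) + 0.5 * alpha0 * np.sum(phi**2, axis=-1)


def expectation(observable, n_sites=3, alpha0=1.0, cutoff=6.0, points=61):
    """Ratio of two n_sites-dimensional integrals, approximated by a
    Riemann sum on a regular grid truncated at |phi(s)| <= cutoff.
    The volume element cancels between numerator and denominator.
    """
    axis = np.linspace(-cutoff, cutoff, points)
    grids = np.meshgrid(*([axis] * n_sites), indexing="ij")
    phi = np.stack(grids, axis=-1)  # shape (points,)*n_sites + (n_sites,)
    weight = np.exp(-action_s0(phi, alpha0))  # Boltzmann factor
    return np.sum(observable(phi) * weight) / np.sum(weight)


# Three simple observables: the identity, the field at one site, its square.
print(expectation(lambda phi: np.ones(phi.shape[:-1])))
print(expectation(lambda phi: phi[..., 0]))
print(expectation(lambda phi: phi[..., 0] ** 2))
```

Under these assumptions the first value reproduces the normalization of the expectation values and the second vanishes by the symmetry of the weights under a change of sign of the field. The third should be close to $1/2$, which is the diagonal element of the inverse of the matrix of the quadratic form $S_{0}$ for $\alpha_{0}=1$ on three sites. This direct approach is only feasible for a handful of sites, since the number of grid points grows exponentially with the number of integration variables.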

Trivial examples of this kind of integration include the observation that the denominator of our definition in equation (3.1.1) guarantees that, if ${\cal
O}[\varphi]\equiv 1$, then


\begin{displaymath}
\langle{\cal O}\rangle_{N}=\frac{\displaystyle \int[{\bf d}\varphi]\;e^{-S_{0}[\varphi]}}{\displaystyle \int[{\bf d}\varphi]\;e^{-S_{0}[\varphi]}}=1,
\end{displaymath}

for all values of $N$, which establishes the normalization of the expectation values. We also have, in the free theory defined by $S_{0}$, that if ${\cal O}[\varphi]=\varphi(s_{0})$ for a certain given site $s_{0}$, then


\begin{displaymath}
\langle{\cal O}\rangle_{N}=\frac{\displaystyle \int[{\bf d}\varphi]\;\varphi(s_{0})\;e^{-S_{0}[\varphi]}}{\displaystyle \int[{\bf d}\varphi]\;e^{-S_{0}[\varphi]}}=0,
\end{displaymath}

also for all values of $N$, as can be easily verified (problem 3.1.1). Another example, and a far less trivial one, which is of great interest, would be the expectation value for the choice ${\cal O}[\varphi]=S_{0}[\varphi]$, which we will calculate in detail later on. The observables of greater interest to us will be those defined as the product of a finite number of values of the field at different sites,


\begin{displaymath}
{\cal O}[\varphi]=\varphi(\vec{n}_{1})\ldots\varphi(\vec{n}_{n}).
\end{displaymath}

The expectation values of these observables will be referred to as the $n$-point functions or as the correlation functions, which we shall denote by


\begin{displaymath}
g_{N}(\vec{n}_{1},\ldots,\vec{n}_{n})=
\langle\varphi(\vec{n}_{1})\ldots\varphi(\vec{n}_{n})\rangle.
\end{displaymath}

Their values completely define most of the physical characteristics of the models defined by each action functional. In the most general case we will be interested in functionals ${\cal O}[\varphi]$ that are finite-order polynomials in the fields. One of the examples that we gave above, ${\cal O}[\varphi]=\varphi(s)$, is the one-point function and its expectation value $\langle\varphi\rangle$ is the expectation value of the field, which will have an important role to play in a future volume, when we discuss the phenomenon of spontaneous symmetry breaking.

Figure 3.1.1: Periodical two-point correlation functions for $d=1$.
\begin{figure}\centering
\epsfig{file=c3-s01-prop-d1-2curves.eps,scale=0.6,angle=0}
\rule{\rulewidth}{\figheight}
\end{figure}

The two-point function $\langle\varphi(s_{1})\varphi(s_{2})\rangle$, which we will also call the propagator of the theory, has a particularly important role to play. It is the simplest observable that gives us relations between different sites of the lattice, which may be arbitrarily distant from one another. Hence, it is the simplest observable by means of which we may look at propagation phenomena along the lattice. As we shall see later on in specific examples, in general this function decreases when we increase the distance between the two sites involved, measured in discrete terms, that is, in terms of the minimum number of links that it is necessary to cross in order to go from one site to the other. We say that the two-point function measures the correlations between the values of the field associated to the two sites, and that these correlations decay with the distance along the lattice.

Figure 3.1.2: Periodical two-point correlation functions for $d=2$.
\begin{figure}\centering
\epsfig{file=c3-s01-prop-d2-2curves.eps,scale=0.6,angle=0}
\rule{\rulewidth}{\figheight}
\end{figure}

This decaying behavior of the two-point function may be, in general, of one of two different types, polynomial or exponential. If the decay is polynomial we say that there are in the model correlations with an infinite range, and that it does not establish a scale of distances. However, if the decay is exponential, then the rate of decay of the two-point function does establish a scale of distances that is intrinsic to the model. In this case the sites which are the immediate neighbors of a given site are significantly correlated to it but, since the value of the function decays very fast for large distances, beyond a certain distance the sites become essentially uncorrelated with the given site. Hence, this two-point correlation function establishes an intrinsic scale in the theory, given by the discrete distance within which the values of the fields at two different sites are appreciably correlated.

Figure 3.1.3: Periodical two-point correlation functions for $d=3$.
\begin{figure}\centering
\epsfig{file=c3-s01-prop-d3-2curves.eps,scale=0.6,angle=0}
\rule{\rulewidth}{\figheight}
\end{figure}

On a finite periodical lattice one can easily see this, because in this case the finite volume of the box causes the polynomial-decay cases not to decay at all over the finite extent of the lattice. For example, figure 3.1.1 shows two propagators of the free theory defined by $S_{0}$ in dimension $d=1$ on a lattice with $N=25$, one with infinite-range correlations, for which the correlations do not decay at all, and another one with finite-range correlations that clearly establishes a region of strong correlations of a given site with other sites which are close to it in terms of number of links. In these graphs the correlation functions have been normalized so as to be equal to one at the origin. The graphs were obtained by calculating the correlation function in the case $\alpha_{0}=5$ to illustrate the exponential decay, and in the case $\alpha_{0}=10^{-66}$ to illustrate the polynomial decay. The latter value is used because, due to the existence of a zero mode on the torus, we cannot use the value zero for $\alpha_{0}$. Later on we will discuss how to make such calculations.
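
A calculation of this kind can be sketched in a few lines of Python. The sketch below is not the procedure used to produce the figures, which the text describes only later. It assumes that in the free theory the two-point function on the periodic $d=1$ lattice is given in momentum space by $1/(\rho^{2}(k)+\alpha_{0})$, with $\rho^{2}(k)=4\sin^{2}(\pi k/N)$ the eigenvalues of minus the lattice Laplacian, and the function names are hypothetical.

```python
import numpy as np


def propagator_d1(n_sites, alpha0):
    """Two-point function g(n) on a periodic d=1 lattice, under the
    assumption g(n) = (1/N) sum_k exp(2 pi i k n / N) / (rho2(k) + alpha0),
    with rho2(k) = 4 sin^2(pi k / N).
    """
    k = np.arange(n_sites)
    rho2 = 4.0 * np.sin(np.pi * k / n_sites) ** 2
    g_hat = 1.0 / (rho2 + alpha0)
    # np.fft.ifft supplies both the 1/N factor and the exp(+2 pi i k n / N) phases.
    return np.fft.ifft(g_hat).real


def normalized(g):
    """Normalize the correlation function to one at the origin."""
    return g / g[0]


g_short = normalized(propagator_d1(25, 5.0))    # finite-range correlations
g_long = normalized(propagator_d1(25, 1e-66))   # dominated by the zero mode

for n in range(13):
    print(f"{n:2d}  {g_short[n]:.6e}  {g_long[n]:.6e}")
```

With these assumptions the first column falls by roughly a factor of seven for each additional link, while the second remains equal to one over the whole lattice, because the zero-mode term $1/\alpha_{0}$ overwhelms all the others.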

One can show that in $d=1$ the quantum theory of fields defined by $S_{0}$ is formally identical to the quantum mechanics of the harmonic oscillator (problem 3.1.2). The situation with the correlations in higher dimensions is nevertheless similar to this one. A similar example with $d=2$ can be found in figure 3.1.2, for the same value of $N$, where the same values of the parameter $\alpha_{0}$, and hence of the range of the correlations, were used. The difference between the two correlation functions is a bit more pronounced in this case, and it becomes even bigger in higher dimensions. In the graphs contained in figures 3.1.3 and 3.1.4 one can see similar examples for $d=3$ and $d=4$. Note the clear similarity of these graphs with the graphs of the Green functions of the classical theory, which were examined in section 2.10. In fact, as we shall see later on, in the free theory the two-point correlation function is always equal to the Green function of the classical theory, in any dimension.

Figure 3.1.4: Periodical two-point correlation functions for $d=4$.
\begin{figure}\centering
\epsfig{file=c3-s01-prop-d4-2curves.eps,scale=0.6,angle=0}
\rule{\rulewidth}{\figheight}
\end{figure}

We will refer to this distance, within which the correlations are appreciable, as the range of the correlations or as the correlation length. If the decay of the two-point function is polynomial and not exponential, we say that the correlations have an infinite range or that the model has long-range correlations. In this case no length scale intrinsic to the theory is established. This is only the case, of course, if the correlations are long-range for all the different fields that are part of a given model. It suffices that one of the fields display an exponential decay of its two-point correlations for an intrinsic length scale to be defined in the model. Usually we will have at least one field with finite-range correlations, thus providing the model with an intrinsic scale. Observe that in this case we may use the correlation length of this field as the physical unit of length, measuring in terms of it, for example, the size $L$ of the lattice and the lattice spacing $a$. In this way we can define a system of physical units that is intrinsic to the model and not external to it.
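
As a schematic illustration of these two behaviors, writing $r$ for the discrete distance between the two sites and $\xi$ for the correlation length in lattice units, one may think of the asymptotic forms

\begin{displaymath}
g(r)\sim e^{-r/\xi}
\qquad\mbox{or}\qquad
g(r)\sim r^{-p},
\end{displaymath}

where the symbols $r$, $\xi$ and the exponent $p>0$ are notation introduced only for this sketch. As a hedged example, assume that $S_{0}$ has the usual nearest-neighbor form with a mass-like parameter $\alpha_{0}$. Then in $d=1$, on an infinite lattice, an exponential solution $e^{-r/\xi}$ of the corresponding difference equation requires

\begin{displaymath}
\cosh(1/\xi)=1+\frac{\alpha_{0}}{2},
\end{displaymath}

so that $\xi\approx 1/\sqrt{\alpha_{0}}$ for small $\alpha_{0}$, and the range of the correlations grows without bound when $\alpha_{0}\rightarrow 0$.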

As we will discuss in more detail later, most of the physical content of the theory will be encoded into the nature of the fields included in the models and in the nature and behavior of the set of $n$-point correlation functions among these fields. They will determine whether or not we have particles that in fact propagate dynamically, whether or not these particles have non-zero masses, whether or not these particles interact with each other in scattering processes, whether or not there are bound states and what are their properties, in short, all the elements needed to determine both the nature of the structure of matter and the nature of the physical interactions among the elementary entities of which it is composed. Another correlation function of particular importance, besides the propagator, is the four-point function, because it will be related to the existence or not of interactions among particles within the theory. For the time being we cannot give examples of this, because the free theory we are using as an example, exactly because it is a theory of free fields, does not contain interactions between particles. This means that we may calculate the four-point function in this model, but it will decompose into sums of products of pairs of two-point functions. Later on we will present a complete analysis of the structure of the correlation functions in the free theory.
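
For reference, the decomposition of the four-point function mentioned above is expected to take the standard form valid for Gaussian distributions with zero mean. It is written here only as a sketch, with generic sites $s_{1},\ldots,s_{4}$, ahead of the complete analysis that is presented later:

\begin{displaymath}
\langle\varphi(s_{1})\varphi(s_{2})\varphi(s_{3})\varphi(s_{4})\rangle=
\langle\varphi(s_{1})\varphi(s_{2})\rangle\langle\varphi(s_{3})\varphi(s_{4})\rangle
+\langle\varphi(s_{1})\varphi(s_{3})\rangle\langle\varphi(s_{2})\varphi(s_{4})\rangle
+\langle\varphi(s_{1})\varphi(s_{4})\rangle\langle\varphi(s_{2})\varphi(s_{3})\rangle.
\end{displaymath}

Each term pairs the four sites in one of the three possible ways, so that the four-point function carries no information beyond that already contained in the propagator.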

Having defined the quantum theory of our model on each finite lattice, we are now in a position to define completely the quantum field theory associated to this action, in the continuum limit. Since it is the $n$-point functions that define the physics of the model, it would suffice to define them in this limit, but we can do this in a somewhat more general form, for an arbitrary observable. We say then that the values of all observables of the quantum field theory in the continuum limit are the values obtained by means of the limits


\begin{displaymath}
\langle{\cal O}[\varphi]\rangle= \lim_{N\rightarrow\infty}\langle{\cal
O}[\varphi]\rangle_{N}.
\end{displaymath}

To solve exactly a quantum field theory means to manage to calculate exactly these limits for all observables of physical interest. The quantum theory of the model in question will be well-defined if these limits exist and are finite. Note that it is not necessary that the limits be finite for all possible observables, but only for that set, say the $n$-point functions, that define completely the physics of the model. In addition to this, we will see later on that, in order for these limits to exist and have acceptable physical properties, in general it is necessary to impose additional conditions on the dimensionless parameters that appear in the model, regarding their behavior in the limit.

One of the especially important conditions to satisfy in the continuum limit is that the correlation length of the model have a non-zero limit, because otherwise we would have no correlations at all left in the theory after the limit, which would thus become physically meaningless. A zero correlation length in the limit corresponds to the existence of particles with infinite physical mass $m$, a case in which there is no possibility of propagation in the theory, since the movement of such particles would require infinite energy. Usually we will impose that at least one of the correlation lengths of the model have a finite and non-zero limit, since it should define in the limit the physical scale associated to the intrinsic system of physical units of the theory. All other correlation lengths must be non-zero (but possibly infinite) in the limit. To put it more precisely, if $\xi$ is the dimensionless correlation length and $\chi=a\xi=1/m$ the corresponding dimensionful correlation length, in general we will impose that, in the limit, the ratio $\chi/L$ have a finite and non-zero limit, or at least that the ratio $a/\chi$ go to zero in the limit, characterizing it as a continuum limit.

Figure 3.1.5: A sequence of lattices with decreasing lattice spacing.
\begin{figure}\centering
\epsfig{file=c3-s01-rede-1.fps,scale=0.6,angle=0}
\end{figure}

Since $\chi$ defines the unit of length, it makes no sense to impose any conditions on its value, but only on ratios between it and other lengths. The condition with the most direct physical meaning would be that, if there is more than one parameter with dimensions of mass in a particular model, then the ratios between these should have finite and non-zero limits. In this way it would also be simpler to conceive limits in which the product of $L$ by any of these parameters would go to infinity, corresponding to models defined in infinite, limitless space. In the simpler models, with only a single massive field, we have only the mass of the field and the size of the lattice to consider, of course, but conceptually the situation does not change. In figure 3.1.5 we show a sequence of superimposed lattices, with decreasing lattice spacings, together with a correlation length which is kept constant, hoping that this illustration will help the reader to visualize what should happen with the relation between the lattice spacing and the correlation length in the continuum limit.
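
The behavior of these ratios along a sequence of lattices can be sketched numerically. The Python fragment below rests on several assumptions that are not established in this section. It assumes the free theory in $d=1$, with the infinite-lattice decay rate given by $\cosh(1/\xi)=1+\alpha_{0}/2$, a lattice spacing $a=L/N$, and the dimensionless parameter scaled as $\alpha_{0}=(mL/N)^{2}$, so that the product $mL$ is held fixed. The function names are hypothetical.

```python
import math


def correlation_length_d1(alpha0):
    """Dimensionless correlation length xi, assuming cosh(1/xi) = 1 + alpha0/2.

    The equivalent form sinh(1/(2 xi)) = sqrt(alpha0)/2 is used because it
    is numerically stable for very small alpha0.
    """
    return 1.0 / (2.0 * math.asinh(math.sqrt(alpha0) / 2.0))


def continuum_sequence(m_times_l, sizes):
    """For each N return (N, a/chi, chi/L), with alpha0 = (m L / N)^2 (assumed)."""
    rows = []
    for n in sizes:
        alpha0 = (m_times_l / n) ** 2
        xi = correlation_length_d1(alpha0)
        # chi = a xi and L = N a, hence a/chi = 1/xi and chi/L = xi/N.
        rows.append((n, 1.0 / xi, xi / n))
    return rows


for n, a_over_chi, chi_over_l in continuum_sequence(5.0, [10, 100, 1000, 10000]):
    print(f"N={n:6d}  a/chi={a_over_chi:.6f}  chi/L={chi_over_l:.6f}")
```

Under these assumptions the ratio $a/\chi$ decreases towards zero as $N$ grows, while $\chi/L$ approaches the finite value $1/(mL)$, which is the situation that figure 3.1.5 is meant to illustrate.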

The calculation of these continuum limits, which are always constrained by one or more conditions over the existing parameters, consists of two steps: first the calculation of the integrals on finite lattices of arbitrary size, and then the calculation of the limits for $N\rightarrow \infty $ under the required constraints. Although these are clearly defined mathematical operations, we will see that usually neither of them is easy to carry out. As we shall see, we are able neither to calculate the integrals in exact form nor to take the limits in exact form except in the simplest model, the free theory, which we use here as an example. As we shall show in detail, the theory of the free scalar field can be solved exactly by the use of Fourier transforms. While the calculation of these high-dimensional integrals is simply a task of great complexity, which very quickly goes beyond our analytical possibilities, the calculation of the continuum limits is a mathematical operation full of subtleties and surprises.

It is important to observe here that not all elements that appear in the mathematical structure of the theory correspond to observables. In fact, the definition of physical observables as statistical averages given here should be understood as a provisional definition. While all physical observables must be statistical averages of functionals of the fields as defined here, not all the possible statistical averages of functionals of the fields will be interpretable as physical observables. For example, although statistical averages of functionals of the field, $\langle{\cal O}[\varphi]\rangle$, may be observables according to our provisional definition, the field $\varphi$ itself is not an observable. The field is a random variable whose fluctuations constitute a representation within the theory of the uncertainty principle or, to put it in a more general form, of the observability limits of nature.

These fluctuations behave exactly like thermal fluctuations in statistical mechanics, but their physical interpretation is completely different. The real quantum fluctuations of the theory are those that can be observed on the expectation values of superpositions of the fields within finite boxes with non-zero extension in all dimensions of space-time, measured in successive times. These block variables are very important for the physical interpretation of the theory, as we shall see in the next chapter. They are related to additional restrictions on the nature of the quantities that can be associated to observables of the theory. As we shall see later on, only variables associated to superpositions within these blocks can in fact be observed. Note that the Fourier transforms may be understood as a kind of weighted average over the whole lattice, hence characterizing them as a certain type of block variable. Therefore, the Fourier components of the fields are related in a more direct way with the physical observables of the theory.

The content of the remaining part of this book may be roughly classified as composed of two main parts. From the mathematical point of view it consists of the discussion and development of methods and means of calculation of these ratios between multiple integrals. From the physical point of view it consists of the development of the physical interpretation of the elements of this mathematical structure. In the remainder of this chapter the mathematical aspects will be addressed, which will enable us to review the interpretation of the structure in the next chapter. In future volumes we intend to consider the extension of these ideas to other types of fields and will examine other quantum-field-theoretical models.

Problems

  1. Show that in the theory of the free scalar field defined by $S_{0}$ the expectation value of the field $\varphi(s_{0})$ at an arbitrary site $s_{0}$ is zero. Use symmetry and parity arguments to evaluate the necessary functional integrals, in particular the fact that the action $S_{0}$ is invariant under changes of sign of the field, $\varphi\rightarrow-\varphi$, when these are made in a homogeneous way over the whole lattice.

  2. Starting from the action $S_{0}$ for the free scalar field in one dimension in the continuum limit,


    \begin{displaymath}
S_{0}[\phi]=\int{\rm d}t\left[\frac{1}{2}(\partial_{t}\phi)^{2}
+\frac{m_{0}^{2}}{2}\phi^{2}(t)\right],
\end{displaymath}

    show that it is formally identical to the Euclidean action of the one-dimensional harmonic oscillator of mass $M$ and elastic constant $K$, described by a coordinate $X$,


    \begin{displaymath}
S[X]=\int{\rm d}t\left[\frac{M}{2}(\partial_{t}X)^{2}
+\frac{K}{2}X^{2}(t)\right],
\end{displaymath}

    mapping the variables and parameters of one model on those of the other. Show from this fact that the quantum theory of the free scalar field is formally identical to the quantum mechanics of the one-dimensional harmonic oscillator, that is, that one can map all the observables of one of these theories onto the observables of the other.

  3. Recalling problem 1.3.1, where one considers making $\alpha_{0}<0$, show that in this case the integral


    \begin{displaymath}
\int[{\bf d}\varphi]\;e^{-S_{0}}
\end{displaymath}

    does not exist even on finite lattices, where it is just a finite-dimensional integral.

  4. Show that, for ultra-local actions $S$, that is, actions that do not depend on products of the fields at different sites, the correlation functions always factor out in terms of the expectation values of the fields at single sites,


    \begin{displaymath}
g_{N}(\vec{n}_{1},\ldots,\vec{n}_{n})
=\langle\varphi(\vec{n}_{1})\ldots\varphi(\vec{n}_{n})\rangle
=\langle\varphi(\vec{n}_{1})\rangle
\ldots\langle\varphi(\vec{n}_{n})\rangle.
\end{displaymath}