The Wolff Algorithm

The traditional Metropolis algorithm, when applied do the Ising models close to their critical points, suffers from severe critical slowing down problems. This is due to its single-site updating procedure, which is good for updating the short-wavelength components of the configurations, but very bad at updating the long-wavelength components. Near the critical point the long-wavelength components play an important role due to the emergence of long-distance correlations, and in these circumstances the algorithm tends to diffuse only very slowly through the configuration space of the model. This increases the autocorrelations along the stochastic sequence of configurations and severely degrades the statistical quality of the results.

This is always a problem is the case of quantum field theory, since in that theory we are interested primarily in the critical region of the statistical models. If we are to use the Ising model in the analysis of problems in quantum field theory, we must use simulation techniques that do not suffer from such problems. The cluster algorithms are the answer to our needs, and among them the Wolff algorithm is particularly well suited for the task, due to its simplicity and efficiency. Cluster algorithms are characterized by the updating of whole sets of sites, or clusters, at a time, and in doing this they solve the problem of critical slowing down. The problem of how to make collective updates with a low rate of rejection and in such a way that the required statistical distribution is still produced correctly is a very difficult one, which was solved by the inventors of the cluster algorithms.

In this section we will describe the basic algorithm and show that it satisfied the condition of detailed balance, thus converging to the correct distribution. After that we will generalize the algorithm to the case in which we introduce external sources in the systems, as well as to the case in which we use fixed rather than the more usual periodical boundary conditions. This last topic will also lead us to consider the use of truncated clusters, a variation of the algorithm which may be useful under some circumstances.

Let us start by reviewing the fundamentals of the Metropolis algorithm. In order to do that we must first establish some notation. Consider a model in quantum field theory defined by an action functional $S[\psi]$, where $\psi$ is the field. This action determines the distribution of probabilities for all the possible configurations $C$ of the field, which is given by $P_{B}[C]=Z\exp(-S[C])$, where $Z$ is a normalization constant and the dependence on the field can be represented either by $S[\psi]$ or by $S[C]$. What a simulation algorithm does is to establish a probability $W[C\rightarrow C']$ for going from a configuration $C$ to a configuration $C'$ on a single iteration of the algorithm. By iterating the algorithm indefinitely one produces a sequence of configurations with a distribution of probabilities that is supposed to approach $P_{B}[C]$.

In order for this to work, it suffices that the algorithm satisfy two conditions: ergodicity and detailed balance. Although not strictly necessary, these conditions are sufficient to ensure convergence to the correct distribution. Ergodicity means that any configuration $C'$ should be reachable from any other configuration $C$ in a finite number of iterations of the algorithm. In other words, the algorithm should never get stuck in any particular configuration that it is unable to leave, and there should be no configuration that it cannot reach from some other configuration. This means that the algorithm should be able to diffuse through the whole space of possible configurations of the field, without leaving out any points or regions within it. Note that no statement is made here about how fast it is able to diffuse. Of course, in practice the quality of the algorithm will depend on it being able to do so fast, because the space of configurations is usually very large.

The condition of detailed balance is the one which will mostly concern us here. It is a relation between the probability $W[C\rightarrow C']$ of going from the configuration $C$ to the configuration $C'$ and the probability $W[C'\rightarrow C]$ of doing the reversed update, that is, of going from $C'$ to $C$, and involves the probabilities $P[C]$ and $P[C']$ of the two configurations involved, within the ensemble of all possible configurations. It states that the ratio of these two probabilities should satisfy the condition


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}=\frac{P[C']}{P[C]}.
\end{displaymath}

Since the probabilities of the configurations can be written in terms of the action, this condition can be translated into


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}
=\frac{\exp(-S[C'])}{\exp(-S[C])}=\exp(-S[C']+S[C])=\exp(-\Delta S),
\end{displaymath}

where $\Delta S$ is the change in the action due to the update of the configuration. It is possible to show by elementary means, in a simple one-dimensional case, that this condition is sufficient to guarantee that the set of configurations produced by the algorithm converges to the correct distribution. However, that is not our focus here, and we will simply take it as a given fact. Let us now describe in detail the usual Metropolis algorithm for a single local update in the Ising model, and show that it satisfies this condition. Starting from some initial configuration, the Metropolis single-site updating algorithm is given by the following sequence of steps:

  1. Choose a single site of the lattice for trying an update, for example in a random way.

  2. Calculate the variation $\Delta S$ of the action caused by flipping the field at that site.

  3. If it happens that $\Delta S\leq 0$, then flip the field with probability equal to $1$.

  4. If $\Delta S>0$, then flip the field with probability $\exp(-\Delta
S)$, which in this case is a number in the interval $(0,1)$; if the update is rejected, repeat the current configuration in the stochastic sequence.

  5. Loop back to step $1$ and repeat the procedure.

As one can see, the algorithm is very simple and easily generalizable to more complex models. It is also possible to improve it in various rather technical ways, that help in some cases but that do not solve the central problem of critical slowing down caused by the local update. It is now easy to verify that this algorithm satisfies the condition of detailed balance. We must consider three cases, depending on whether the variation of the action is positive, zero or negative. If the update is such that $\Delta S>0$, then we have for its probability $W[C\rightarrow
C']=\exp(-\Delta S)$; in this case the reversed update is such that $\Delta S<0$, so that we have for its probability $W[C'\rightarrow C]=1$; it therefore follows that


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]} =\frac{\exp(-\Delta
S)}{1}=\exp(-\Delta S),
\end{displaymath}

as required by the condition. If the update is such that $\Delta S<0$, then the probability for the update is $W[C\rightarrow C']=1$, and the probability for the reversed update is $W[C\rightarrow
C']=\exp(-\Delta'S)$, where $\Delta'S=-\Delta S$ is the variation of the action in the reversed update; it therefore follows that


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]} =\frac{1}{\exp(\Delta
S)}=\exp(-\Delta S),
\end{displaymath}

Figure 1: A periodical lattice with the sites and links classified, showing the clusters that may be built in it.
\begin{figure}\centering
\epsfig{file=lattice-periodic.fps,scale=0.7,angle=0}
\end{figure}

just as in the previous case. If $\Delta S=0$, then we also have $\Delta'S=0$, and hence $W[C\rightarrow C']=W[C'\rightarrow C]=1$, so that the ratio of the two updating probabilities is equal to $1$; since in this case $\exp(-\Delta S)=\exp(0)$ is also equal to $1$, we see that in all three cases the condition is satisfied.

In order to discuss the Wolff algorithm, we must now establish a few more concepts and notations. The basic idea of the algorithm will be to first built a cluster of sites according to certain rules, and then to flip the whole cluster. The cluster will be built by the ``activation'' of some of the links between two neighboring sites, and requires that we classify the links according to whether it connects two sites where the field has the same orientation, or opposite orientations. At first, let us consider only lattices with periodical boundary conditions and the model in the absence of any external sources. Later on we will extend the algorithm to these other cases. We should recall here that the Ising model is defined by the action


\begin{displaymath}
S[\psi]=-\beta\sum_{\ell}\psi_{-}\psi_{+},
\end{displaymath}

Figure 2: The same periodical lattice, showing a possible cluster that might have been built in it.
\begin{figure}\centering
\epsfig{file=cluster-periodic.fps,scale=0.7,angle=0}
\end{figure}

where $0<\beta<\infty$ is the free parameter of the model, the sum is over the links and $\psi_{-}$, $\psi_{+}$ are the values of the field at the two ends of a given link. In figure 1 one can see an example of a two-dimensional periodical lattice with all its sites and links dully classified. The sites where the field is positive are shown as black circles and those where the field is negative as white circles. The links that connect two sites with the same orientation are shown as solid lines, and those connecting sites with opposite orientations as dashed lines. Note that this is true for the border links as well, which establish the periodical boundary conditions. The links represented by solid lines will be denoted by $\ell_{+}$ in the formulas, and those represented by dashed lines by $\ell_{-}$. According to the algorithm only $\ell_{+}$ links can be activated to form clusters, so the sets of sites connected by solid lines illustrate the clusters that can be formed in this particular configuration. We may now state the Wolff algorithm, which is given by the following sequence of steps:

Figure 3: A larger lattice, showing a cluster and its boundary.
\begin{figure}\centering
\epsfig{file=large-cluster.fps,scale=0.56,angle=0}
\end{figure}

  1. Choose a single site of the lattice for starting to build the cluster, in a random way.

  2. Consider all the links connected to that initial site; the $\ell_{-}$ links are never to be activated; activate the $\ell_{+}$ links with the probability $p_{+}=1-\exp(-2\beta\psi_{-}\psi_{+})$, thus forming a first cluster of sites, that is, updating the cluster from a single site to possibly a few sites.

  3. Given the set of sites added to the cluster in the previous update of the cluster (which is not the whole cluster), consider all the links that connect those sites to sites that are still outside the cluster; among these, activate the $\ell_{+}$ links with the probability $p_{+}$, thus enlarging (updating) the cluster.

  4. Loop back to step $3$ until the set of links left for activation trial in the next round of the loop is empty; this means that you stop this loop at the first instance of step $3$ that adds no new sites to the cluster.

  5. Flip the whole cluster with probability $1$, then loop back to step $1$, in order to build and flip a new cluster.

Note that in steps $2$ and $3$, since only $\ell_{+}$ links can be activated, the argument of the exponential is negative and thus $\exp(-2\beta\psi_{-}\psi_{+})$ is a number in the interval $(0,1)$. Therefore, so is the probability $p_{+}=1-\exp(-2\beta\psi_{-}\psi_{+})$. Note also that during the construction of the cluster a certain site may be approached by the growing cluster from two or more different directions, along different links. If that site does not get linked to the cluster at the first trial, it might still get linked at a later trial. The algorithm implies that the same link is never tried more than once for activation, but a single site my be tried more than once for linking to the cluster, if it has more than one link able to be activated. Finally, observe that during construction it might happen that two sites already included in the cluster are connected by a link that has not yet been tried. That link will in fact never be tried by the algorithm, and may be considered as automatically activated, that is, they can simply be activated with probability $1$ at the end of the construction. A simple cluster that could result from the algorithm is shown in figure 2, with its sites and activated links, which are shown in a thicker solid line.

Let us now show that this algorithm satisfied the condition of detailed balance. In order to do this, given a certain cluster, we must calculate the probability that it will be built and flipped. We will denote the cluster by ${\cal C}$ and the boundary of the cluster, that is, the set of links that connect the cluster with sites outside the cluster, by ${\partial\cal C}$. In figure 3 one can see a larger lattice with a cluster that could have been built by the algorithm, as well as its boundary, illustrated by the thick dotted line which crosses all the links that connect the cluster to sites outside the cluster. Since the probability of flipping a cluster is $1$, once it has been built, we must simply calculate the total probability of building a certain given cluster. The process of building the cluster might start at any of its sites, so we start the calculation with the probability that one of the sites of the cluster will be chosen in step $1$ of the algorithm. Let us call this probability $p_{i}$, and there is no need to further elaborate on it.

Irrespective of where the construction starts, the probability that the given cluster will be built is the product of the probabilities that each one of its internal links will be activated, times the product of the probabilities that each one of the links of its borders will not be activated. Now, all the internal links are $\ell_{+}$ links, so the probability that they will be activated is given by


\begin{displaymath}
\prod_{\ell_{+}\in{\cal C}}p_{+}.
\end{displaymath}

For the $\ell_{+}$ links in the border of the cluster not to be activated we have the probability


\begin{displaymath}
\prod_{\ell_{+}\in{\partial\cal C}}q_{+},
\end{displaymath}

where $q_{+}=1-p_{+}=\exp(-2\beta\psi_{-}\psi_{+})$ is the complementary probability to $p_{+}$. Since the probability for the $\ell_{-}$ links at the border not to be activated is $1$, this product is the complete probability for the construction of the boundary. We have therefore the complete probability for the construction of the cluster, and hence the complete probability for the update of the configuration,


\begin{displaymath}
W[C\rightarrow C']=p_{i}\left(\prod_{\ell_{+}\in{\cal C}}p_{+}\right)
\left(\prod_{\ell_{+}\in{\partial\cal C}}q_{+}\right).
\end{displaymath}

Next we must consider the probability for the reversed update. Since the same set of sites that makes up the cluster ${\cal C}$ in the configuration $C$ also makes up the flipped cluster ${\cal C}'$ in the configuration $C'$, it is clear that the probability $p_{i}$ is the same in either case. Now, repeating the argument used for the cluster ${\cal C}$ we get for the probability that the cluster ${\cal C}'$ will be built


\begin{displaymath}
W[C'\rightarrow C]=p_{i}\left(\prod_{\ell_{+}\in{\cal C}'}p_...
...right)
\left(\prod_{\ell_{+}\in{\partial\cal C}'}q_{+}\right).
\end{displaymath}

Observe now that when we flip the fields certain mappings between links take place. When we go from the cluster ${\cal C}$ to the flipped cluster ${\cal C}'$, we can see that all the internal $\ell_{+}$ links of ${\cal C}$ become internal $\ell_{+}$ links of ${\cal C}'$, since the fields at both ends of each link get flipped and hence their product does not change sign. This means that the first products that appears in the two formulas above are equal, and therefore that we may write the ratio of the two probabilities as


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}
=\frac{\prod_{...
...tial\cal C}}q_{+}}{\prod_{\ell_{+}\in{\partial\cal C}'}q_{+}}.
\end{displaymath}

For the links at the border, however, we have that the $\ell_{+}$ links of ${\partial\cal C}$ become the $\ell_{-}$ links of ${\partial\cal C}'$, and that the $\ell_{-}$ links of ${\partial\cal C}$ become the $\ell_{+}$ links of ${\partial\cal C}'$, since for these links only the field at one end gets flipped and therefore the sign of the product changes. In addition to this, the term of the action corresponding to each link changes sign, so that the factor $q_{+}$ gets mapped to $1/q_{+}$. If we write the ratio above in terms of the action terms,


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}
=\frac{\prod_{...
..._{\ell_{+}\in{\partial\cal C}'}\exp(-2\beta\psi_{-}\psi_{+})},
\end{displaymath}

we see that we can decompose each one of the two products in two parts,


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}=
\frac{\prod_{...
...d_{\ell_{+}\in{\partial\cal C}'}\exp(-\beta\psi_{-}\psi_{+})}.
\end{displaymath}

Now, due to the mappings of the boundary links which we just discussed, we have that the following two equations hold,

\begin{eqnarray*}
\prod_{\ell_{+}\in{\partial\cal C}}\exp(-\beta\psi_{-}\psi_{+}...
... \prod_{\ell_{-}\in{\partial\cal C}}\exp(\beta\psi_{-}\psi_{+}),
\end{eqnarray*}


so that the ratio of updating probabilities may be written as

\begin{eqnarray*}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]} & = &
\frac{\pro...
...\prod_{\ell_{-}\in{\partial\cal C}}\exp(\beta\psi_{-}\psi_{+})}.
\end{eqnarray*}


Note that in this way we have completed the set of boundary links in both the numerator and the denominator, so that we may write


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}
=\frac{\prod_{...
...}
{\exp(\sum_{\ell\in{\partial\cal C}}\beta\psi_{-}\psi_{+})}.
\end{displaymath}

What we see here in the argument of the two exponentials in the last form of this equation is the negative of the part of the action that corresponds to the boundary links, which we may denote by $S_{B}=-\sum_{\ell\in{\partial\cal C}}\beta\psi_{-}\psi_{+}$. We may therefore write the ratio of the two updating probabilities as


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}
=\frac{\exp(-S_{B}[C'])}{\exp(-S_{B}[C'])}=\exp(-\Delta S_{B}).
\end{displaymath}

Since in the flipping of a cluster the parts of the action corresponding to links that are either completely internal to the cluster or completely external to it do not change at all, the only variation of the action is that due to the boundary links, which means that $\Delta S=\Delta S_{B}$, and hence we get the final relation


\begin{displaymath}
\frac{W[C\rightarrow C']}{W[C'\rightarrow C]}=\exp(-\Delta S),
\end{displaymath}

which is indeed the condition of detailed balance, as we set out to show. Please note how important it is for this demonstration the fact that the update of a cluster changes the action exclusively at the boundary links of the cluster, and not at its bulk, that is, at its interior sites and links.