Tuesday, June 11, 2013

Simulated Annealing: Modeling Metallurgical Annealing to Solve Engineering Optimization Problems

Although my research in antenna arrays primarily focuses on genetic algorithms, I have studied other stochastic algorithms during the course of my PhD. One such algorithm is known as simulated annealing. Like genetic algorithms, SA is based on a process that occurs in nature. However, simulated annealing is not population based, and it is based on a physical process instead of a biological process.

The concept behind simulated annealing is simple. It is analogous to a blacksmith tempering steel to make a sword. The blacksmith heats the metal to a very high temperature and lets the sword cool slowly as indicated in Figure 1. This process makes the sword stronger than the untreated steel. However, if the blacksmith suddenly tosses the hot sword into a cold water bath, the steel would become brittle. This is certainly not good, as the sword could shatter when it strikes another sword or hard surface.

Figure 1: Sword Annealing Analogy to Explain Simulated Annealing (Copyright Jonathan Becker)

Of course, annealing works at the atomic level as shown in Figure 1. When the steel is heated to a very high temperature, the atoms move very fast in random directions. As the steel cools, the atoms begin to slow down. If the steel is cooled at just the right rate, the atoms will settle into a perfect crystalline lattice. However, if you cool the metal too fast, the crystalline lattice will have imperfections in it, and the steel will be of lower quality. You can cool the object too slowly and still get good results, but you'll wait too long to see the results.

From the example above, you can see the main steps behind simulated annealing:
  1. Start with an "object" that has M "atom" positions in random positions.  This is iteration n = 0, and the solution is named S(n=0).
  2. Heat the "object" to a very high temperature, say 1000 Kelvins.
  3. Cool it slowly according to a predefined temperature schedule until it reaches a low temperature, say 0 Kelvins, where the "atoms" are frozen in their current state. The algorithm runs for a total of (N + 1) iterations, i.e. 0 < n <= N.
    •  At each iteration n, generate a new solution S(n+1) with a probability Ps(n+1) defined by the temperature schedule. If a new solution is not generated, S(n+1) = S(n).
    • If S(n+1) is better than S(n), keep solution S(n+1). Otherwise, keep S(n).
  4. Compare the end result with the optimization requirements. If the result does not meet the acceptance criteria, discard it and go back to Step 1.
The trickiest part of getting a simulated annealing program to produce good results is in choosing the right temperature schedule. Typically, one would choose a sigmoid temperature schedule (shaped like the letter "S") like the one shown in Figure 2.  Of course, there are other types of sigmoid functions available (such as an arctangent function offset by 90 degrees). One chooses the parameters of the sigmoid function through multiple simulations.

Figure 2: Example Sigmoid Temperature Schedule.  Cooling rate (tau) and offset "a" need to be chosen carefully.
A sigmoid temperature schedule starts at a high temperature such that Ps(0) = 1 (or close to 1), and it decays slowly at first, so the algorithm can effectively search the parameter space. As the number of iterations "n" increase, the schedule decays faster allowing the algorithm to settle into a crystalline lattice. As the algorithm nears completion, the temperature schedule settles out such that Ps(N) = 0. This finishes the cooling cycle and ensures that the best solution is preserved.  You can note that the algorithm is purely random at the beginning of its search and becomes more deterministic as the number of iterations approaches N. The shift from complete (or nearly complete) randomness near the beginning of the with a gradual shift to complete determinism at of the run ensures that the best solution the algorithm found is preserved.

To date, my use of simulated annealing in my PhD research has been limited to antenna placement simulations. Because the test equipment used in CMU's wireless lab was old and slow, I found it impractical to optimize my beamforming array hardware in situ with a simulated annealing algorithm. However, I hope to build a new array this Fall semester that can be attached to a Universal Software Radio Protocol (USRP) device. This would allow me to optimize the new array in a non-ideal environment (i.e. lab space outside antenna chamber) with faster test times. Hence, it would be more practical for me to evaluate simulated annealing at that time.


Jonathan Becker
ECE PhD Candidate
Carnegie Mellon University

P.S. For anyone interested in learning more about simulated annealing, I highly suggest reading Laarhoven's and Aarts' text Simulated Annealing: Theory and Applications. More info can be found on Google Books.