Ecological Archives A018-043-A1

Steven J. Phillips, Paul Williams, Guy Midgley, and Aaron Archer. 2008. Optimizing dispersal corridors for the Cape Proteaceae using network flow. Ecological Applications 18:1200–1211.

Appendix A. Modeling the second definition of nonoverlapping chains.

Definition 2 of nonoverlapping is that no two chains for a species can use the same cell, even in different time periods. Optimizing the protected cells under this definition is more challenging, and requires some additional complexity to be added to the mixed integer program. We will call a cell "contiguous" for a species $ s$ if the set of time periods during which it is predicted to have suitable conditions for $ s$ is contiguous. In other words, if a contiguous cell ever changes from suitable to unsuitable, it remains unsuitable. Similarly, we will call a chain $ p$ contiguous on a cell $ c$ if the set of time periods in which $ p$ uses $ c$ is contiguous. The main observation used in this section is that there is an optimum solution for Definition 2 in which, on all contiguous cells, all chains are contiguous (proof omitted). We can model contiguous use of a cell by adding "multi-year" arcs to the model. For example, an arc from the left-most black node in Fig. 4 to the right-most black node would represent use of the example cell in two contiguous time periods, 2010 and 2020. This allows us to enforce the second definition of nonoverlapping using the following constraint: for each species $ s$ , the sum of the flow on all in-year arcs corresponding to a cell $ c$ plus the sum of the flow on all multi-year arcs corresponding to $ c$ must be at most the preserve variable $ p_c$ . We now require all flow variables to be integral (taking value only 0 or 1): because of the new constraint, making only the preserve variables integral is no longer sufficient to ensure that there is an integral maximum flow.

If all cells are contiguous, then using the new constraint and requiring the flow to be integral would be all that we need to find the optimum set of cells to protect under Definition 2 of nonoverlapping. However, we have one remaining problem: some cells may be non-contiguous for some species. We would expect this to be a rare event, because the global climate is predicted to follow a consistent warming trend. Indeed, in the Proteacea data, out of 418487 cell-species combinations for which the cell is suitable for the species in at least one time slice, there are only 106 non-contiguous combinations for species whose conservation goals are not already met by existing protected areas. These can be handled as special cases, adding complexity to the model but without making the resulting mixed integer program computationally intractable. For each each cell $ c$ and each species $ s$ for which it is non-contiguous, we introduce a special separate flow that is allowed to send either one unit of flow or none at all. Moreover, this special flow is allowed to have value 1 only if it uses cell $ c$ . Finally, at most one of the flows for species $ s$ , ordinary or special, may use any given cell, whether via in-year or multi-year arcs. This latter mutual exclusivity constraint, in other words, the fact that the ordinary and special flows compete to use the arc capacity, turns the model into a form of "multi-commodity flow", which is another widely-used class of flow problems (Ahuja et al. 1993). Integer multi-commodity flow problems tend to be hard to solve, but we are aided here by the small number of non-contiguous species-cell combinations, and by the strict constraints that we place on the special flows.

Computational Details

We wrote the mixed integer program in AMPL, a modeling language that can be used as a convenient front end for multiple integer programming solvers (Fourer et al. 2002), and solved it using a commercial software package called CPLEX (version 10.0.0). For the first definition of nonoverlapping, the mixed integer program took one minute for AMPL to process and 11 minutes for CPLEX to solve the linear relaxation on a 1.5 GHz Itanium PC. Working on the integer program with all default settings on the same PC, CPLEX found an integer solution with value 994 after 5.7 hours, at which time it had determined a lower bound of 991. To prove that the 994 solution is optimal, we raised the "gomory candidates" and "gomory passes" CPLEX parameters as high as they go; this allowed CPLEX to prove optimality in another six hours. For Definition 2 of nonoverlapping, the mixed integer program turned out to be easier to solve: CPLEX managed to find the optimal integer solution and prove its optimality in about half an hour. A simplified version of the model (not described here) was used to determine the required flow (i.e., the maximum number of nonoverlapping paths, up to a maximum of 35) for each species.

Those who do not have local access to a similar software package can submit their numerical optimization problems over the Internet to the Network-Enabled Optimization System server (www-neos.mcs.anl.gov), maintained by the U.S. Department of Energy's Argonne National Labs. For those who wish to obtain a software package, the Decision Tree for Optimization Software (plato.asu.edu/guide.html), is a good guide to non-commercial codes that are freely available for research purposes, while the NEOS Guide to Optimization Software (www-fp.mcs.anl.gov/otc/Guide/SoftwareGuide) is a good guide to commercial codes.

For larger data sets, the mixed integer program may be hard to solve exactly. Approximate solutions can be obtained by relaxing the integer constraints to get a linear program, solving the latter using any of the above software packages, and then doing rounding to get an integer solution. There are a number of rounding methods available (for example, Raghavan and Thompson 1987). On the data presented here (for Definition 1 of nonoverlapping), the linear program is quick to solve and gives a lower bound of 983 for the optimum integer program solution. A simple iterative rounding heuristic gives an integer solution consisting of 1007 cells. Solving the linear program and running the rounding heuristic together took 15 minutes, compared with the 12 hours it took CPLEX to find an optimal solution. In this way, we could avoid the computational difficulty of solving the mixed integer program, while still obtaining a solution that is guaranteed to be within 2.5% of optimal.

Ahuja, Ravindra K., Thomas L. Magnanti, and James B. Orlin. 1993. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, Upper Saddle River, New Jersey, USA.

Fourer, Robert, David M. Gay, and Brian W. Kernighan. 2002. AMPL: A Modeling Language For Mathematical Programming, Second Edition. Duxbury Press, Pacific Grove, California, USA.

Raghavan, P., and C. Thompson. 1987. Randomized rounding: A technique for provably good algorithms and algorithmic proofs. Combinatorica 7(4):365–374.


[Back to A018-043]