We obtain the optimal feedback
and the associated present value by solving the Bellman equation of stochastic
dynamic programming (Mangel and Clark 1988, Clark
and Mangel 2000). If we replace the continuous variables *P* and *M*
by sets of values defined on meshes, then the continuous dynamic model becomes
a discrete one that is amenable to computation. In contrast to Mangel and Clark,
we seek solutions that are independent of time. These solutions may be obtained
as limits of the usual time-dependent solutions as the time horizon recedes
to infinity.

We solve the Bellman equation by
a combination of so-called value iterations and policy iterations. We begin
with an initial guess for the policy and present value that is analogous to
the results in Carpenter et al. (1999). Value iteration
uses the initial guess for present value as the final payoff after *T*
years, and the value at earlier years is obtained by the usual backwards iteration
of dynamic programming, using a fixed policy. As *T* approaches infinity,
the value at the initial time approaches a steady state, which is the result
of the value iteration. This steady state is the basis of a policy iteration:
a new policy is determined to maximize the value obtained by the preceding value
iteration. The value of this new policy is then computed by value iteration,
and then the policy is updated by successive applications of this process until
there are no significant changes in policy or value.
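For a generic problem discretized to finite sets of states and actions, the alternation just described can be sketched as follows. The transition arrays, rewards, and discount factor are illustrative stand-ins, not the lake model of the main text:

```python
import numpy as np

def value_iteration(P_pi, r_pi, beta, v0, tol=1e-8, max_iter=10_000):
    """Infinite-horizon value of a FIXED policy, obtained as the limit of
    the usual backwards iteration as the horizon recedes to infinity.

    P_pi[s, t] : probability of moving from state s to t under the policy
    r_pi[s]    : one-step reward in state s under the policy
    beta       : discount factor (< 1)
    v0         : initial guess, playing the role of the final payoff
    """
    v = v0.copy()
    for _ in range(max_iter):
        v_new = r_pi + beta * (P_pi @ v)
        if np.max(np.abs(v_new - v)) < tol:   # steady state reached
            break
        v = v_new
    return v_new

def policy_iteration(P, r, beta, v0):
    """Alternate policy evaluation (a value iteration) with improvement.

    P[s, a, t] : transition probabilities; r[s, a] : one-step rewards.
    """
    n_states, n_actions = r.shape
    policy = np.zeros(n_states, dtype=int)
    v = v0.copy()
    rows = np.arange(n_states)
    while True:
        # evaluate the current policy by value iteration
        v = value_iteration(P[rows, policy], r[rows, policy], beta, v)
        # improve: maximize the right-hand side of the Bellman equation
        q = r + beta * np.einsum('sat,t->sa', P, v)
        new_policy = q.argmax(axis=1)
        if np.array_equal(new_policy, policy):  # no significant change
            return policy, v
        policy = new_policy
```

The outer loop stops exactly when the improvement step leaves the policy unchanged, mirroring the convergence criterion in the text.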

Although such a procedure might
seem cumbersome when compared with simulation, it should be kept in
mind that comparable results would require many simulations
(perhaps hundreds) for each combination of *P* and *M*.
We have used simulations to check the present method.
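Such a check can be sketched as a Monte Carlo average of discounted returns under a fixed policy; here `step`, `reward`, and `policy` are hypothetical placeholders for the discretized dynamics, not the model of the main text:

```python
import numpy as np

def simulated_value(step, reward, policy, s0, beta,
                    n_sims=500, horizon=300, seed=0):
    """Monte Carlo estimate of the present value at state s0 under a
    fixed policy.

    step(s, a, rng) -> next state;  reward(s, a) -> one-step reward.
    The horizon is truncated where beta**horizon is negligible.
    Returns the sample mean and its standard error.
    """
    rng = np.random.default_rng(seed)
    totals = np.empty(n_sims)
    for i in range(n_sims):
        s, total, disc = s0, 0.0, 1.0
        for _ in range(horizon):
            a = policy(s)
            total += disc * reward(s, a)
            s = step(s, a, rng)
            disc *= beta
        totals[i] = total
    return totals.mean(), totals.std(ddof=1) / np.sqrt(n_sims)
```

Agreement of this estimate (within a few standard errors) with the dynamic-programming value at the same state is the consistency check.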


We approximate the function *v*^{i+1}
in Eq. B.6 by a linear interpolation of its values at mesh points. Since *z*
in Eq. B.2 is unbounded, values of *v*^{i+1} beyond the
last mesh point also enter in Eq. B.6. We extrapolate *v ^{i+1}*
to decay quadratically beyond that point, proportional to

(B.1)

This quantity may be interpreted as the *P* level that would be expected in the absence of loading. Since loading is non-negative, all accessible mesh points must lie at or above

(B.2)

where

(B.3)

It is convenient to extend the definition of

if (B.4)

(B.5)

If *P*^{t+1} lies between *P _{k}* and *P _{k+1}*, then

(B.6)

From Eq. A.6 we have

(B.7)

Hence

for (B.8)

where

provided that . The definition of *C _{k}* is for later convenience.

We extrapolate *v* for *P*^{t+1}
> *P*_{N} as follows:

(B.12)

This leads to

(B.13)

where

With these definitions and approximations, it follows that

E (B.17)

The integrals may be evaluated in terms of the cumulative distribution function for the normal distribution:

(B.18)

This yields

(B.19)

These equations determine *v ^{t}* for a specified

At this point we must face the fact that the quantity in Eq. B.1 actually depends upon the current state. In general it lies between two mesh points; its index is not an integer, but the corresponding value can be obtained by linear interpolation between mesh points.
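The role of the normal cumulative distribution function can be illustrated for a value function that is piecewise linear on the mesh. The sketch below extends the function as a constant beyond the mesh, a simplification of the quadratic-decay extrapolation described above, and its symbols are generic stand-ins rather than those of Eqs. B.17–B.19:

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def expected_piecewise_linear(mesh, vals, mu, sigma):
    """Exact E[v(Z)] for Z ~ N(mu, sigma^2), with v piecewise linear on
    `mesh` and held constant outside it (a simplifying assumption)."""
    total = vals[0] * Phi((mesh[0] - mu) / sigma)             # left tail
    total += vals[-1] * (1.0 - Phi((mesh[-1] - mu) / sigma))  # right tail
    for k in range(len(mesh) - 1):
        l, u = mesh[k], mesh[k + 1]
        b = (vals[k + 1] - vals[k]) / (u - l)                 # slope on [l, u]
        a = vals[k] - b * l                                   # intercept
        al, be = (l - mu) / sigma, (u - mu) / sigma
        mass = Phi(be) - Phi(al)                              # P(l < Z < u)
        ez = mu * mass + sigma * (phi(al) - phi(be))          # E[Z; l < Z < u]
        total += a * mass + b * ez
    return total
```

Each linear piece contributes a closed-form term, so no numerical quadrature is needed.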

If the value *V ^{T}*
is specified at a final time *T*, the backwards recursion above determines the value at all earlier times.

In the case where parameter values are given by a posterior distribution, the preceding equations must be modified. We perform value iterations for each point of the posterior: the value function is then given by

(B.20)

where *v ^{p}* denotes the limit of the preceding value iterations
for a single point *p* of the posterior.
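Eq. B.20 then amounts to a posterior-weighted average of the converged per-point value functions; a minimal sketch, with hypothetical array shapes:

```python
import numpy as np

def posterior_value(v_by_point, weights):
    """Average the converged value functions v^p over the posterior.

    v_by_point : array (n_posterior_points, n_states), one row per point p
    weights    : posterior probabilities of the points (normalized here)
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize defensively
    return w @ np.asarray(v_by_point)
```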

So far the feedback policy *L _{i,j}*
was arbitrarily prescribed. We can use the Bellman equation to improve a given
policy. Given a policy and the corresponding value, we can apply Eq. B.6 to each term to obtain

(B.21)

Here

(B.22)

This maximization is carried out separately for each point *P _{i},
M_{j}*. Note that the complete sum over the posterior is required
for each maximization. We used Brent's method, first evaluating the function
on a mesh of 50 points in order to locate all local maxima. The simpler
approach of searching for a single local maximum fails because the right-hand
side of Eq. B.22 may have several local maxima.

Literature cited

Carpenter, S. R., D. Ludwig, and
W. A. Brock. 1999. Management of eutrophication for lakes subject to potentially
irreversible change. Ecological Applications **9**:751–771.

Clark, C. W., and M. Mangel. 2000. Dynamic state variable models in ecology. Oxford University Press, Oxford, UK.

Mangel, M., and C. W. Clark. 1988. Dynamic modeling in behavioral ecology. Princeton University Press, Princeton, New Jersey, USA.