Multiple Uncertainty Corrections

Consider a stock $$x_t$$ at time $$t$$ that recruits stochastically following harvest $$h_t$$ from the escaped population ($$s_t := x_t - h_t$$), $$x_{t+1} = z_g f(s_t)$$, where $$z_g$$ is a multiplicative stochastic shock to the growth. We imagine the decision maker sets a policy determining the harvest quota $$q_t$$ each year, which is implemented as the real harvest with some error $$z_i$$, $$h_t = z_i q_t$$. Before setting a quota, the decision maker assesses the stock $$y_t = z_m x_t$$, so $$y$$ is a noisy measurement of the true stock $$x$$. The decision maker seeks to maximize the expected present-value of profits $$\Pi(x_t,h_t)$$ that are a function of both the actual stock $$x$$ and the realized harvest $$h$$ over some long term period $$T_{\textrm{max}}$$ subject to some discount rate $$\delta$$,

$max_{\{q_1 \ldots q_{t_{\textrm{max}}} \}} \sum_{t=1}^{t=t_{\textrm{max}}} \operatorname{E}\left( \Pi(x_t,h_t) \right)$

Discrete formulation

We consider this problem in a discrete state space where $$y$$ and $$x$$, $$q$$ and $$h$$ all take values on grids of $$n_y$$, $$n_x$$, $$n_q$$, and $$n_h$$ points respectively.

• Let $$\mathbf{P}$$ be the $$n_x$$ by $$n_h$$ matrix of all possible profits in a time step, as given by $$\Pi(x_i, h_j)$$, where rows represent possible stock sizes and columns possible harvests.

• Let $$\mathbf{M}$$ be the row-normalized $$n_y$$ by $$n_x$$ matrix whose elements are $$z_m(y_i, x_j)$$, the probability of observing a stock size $$y_i$$ from a true size $$x_j$$.

• Let $$\mathbf{I}$$ be the row-normalized $$n_h$$ by $$n_q$$ matrix whose elements are $$z_i(h_i, q_j)$$, the probability of implementing a harvest $$h_i$$ given a quota $$q_j$$ Edit: I had been created in the code as $$n_q$$ by $$n_h$$, so had to be transposed.

• Let $$\mathbf{F}_q$$ be the tensor of row-normalized $$n_q$$ $$n_x$$ by $$n_x$$ transition matrices given by $$z_g(x_j, \sum_i \sum_j f(x_i, h_j) \mathbf{M}[y_k, x_i] \mathbf{I}[h_j, q_l] )$$. By summing over the uncertainty in measurement and implementation, $$\mathbf{F}_q$$ is in the space of observed state $$y_t$$ into the true stock at the next timestep $$x_{t+1}$$, given quota $$q$$. We transfer into the future measured state through another application of $$\mathbf{M} \mathbf{F}_q$$.

• The profits expected given a measurement of the stock $$y$$ for a quota $$q$$ are $$\sum_j \sum_i \Pi(x_i, h_j) p(x_i | y) p(h_j | q)$$. Hence the matrix of expected profits for each possible combination of $$y_i$$ and $$q_j$$ is given by the matrix product $$\mathbf{M} \mathbf{P} \mathbf{I} =: \mathbf{Q}$$

• Let $$\mathbf{V}_t$$ be the $$n_y$$ by $$n_q$$ matrix at time $$t$$ giving the value of choosing quota $$q_j$$ having observed stock $$y_i$$ and having selected the optimal $$q_{t+1} \ldots q_{t_{\textrm{max}}}$$, that is:

$\mathbf{V}_t = \mathbf{Q} + \delta \mathbf{V}_{t+1}(y_t)$

• Let $$\vec{v}_t$$ be the $$n_y$$ vector with the maximum value of the each row (i.e. for each state $$y_i$$) of the matrix $$\mathbf{V}_t$$, and
• let $$\vec{d}_t$$ be the $$n_y$$ vector whose elements give the column number (e.g. the $$j$$ determine the choice $$q_j$$) of the quota that corresponds to that value.

Thus $$\operatorname{max}_q V_t(y_t) = v_t$$. Starting from the final timepoint, we know the ending value $$v_T$$ as a function of the ending state $$y_T$$. The probability of ending up in each possible state for $$y_T$$, given a choice of quota $$q$$ and the current state $$y_{T-1}$$ is given by the transition matrix $$\mathbf{M}\mathbf{F}_q$$. The Bellman recursion becomes

$\mathbf{V}_t[,q] = \mathbf{Q}[,q] + \delta \mathbf{M} \mathbf{F}_q v_{t+1}$

R code algorithm:

Ep <- M %*% P %*% I
V <- Ep
for(t in 1:Tmax){
D[,(Tmax-t+1)] <- apply(V, 1, which.max)
v_t <- apply(V, 1, max)
V <- sapply(1:n_h, function(j){
Ep[,j] + (1-delta) * M %*% F[[j]] %*% v_t
})
}
D

Where

P <-  outer(x_grid, h_grid, profit)
M <- rownorm( outer(x_grid, x_grid, pdfn, sigma_m) )
I <- rownorm( outer(h_grid, h_grid, pdfn, sigma_i) )

F is a bit more complicated:

F <- lapply(1:n_h, function(q){
t(sapply(1:n_x, function(y){
out <- numeric(n_x)
mu <- sum(sapply(1:n_x, function(x)
f(x_grid[x],h_grid,p) %*%
I[q,] *  # Implementation error
M[x,y])) # Measurement error
if(mu==0)
out <- 1
else
out <- dlnorm(x_grid/mu, 0, sigma_g)
out/sum(out)
}))
})

Something seems to be a mistake, since the optimal policy under uncertainty does not make sense: