1 The Model

1.1 Fishers

Fishers are simple state automata. A fisher is either at port, moving to or from his fishing spot (using the standard A* pathfinder) or fishing. Each fisher has a boat (with given speed, mileage and hold size), fishing gear (with given catchability and gas consumption per hour deployed), a bank balance and a port he belongs to.
Finally a fisher has a utility function to judge his status and a set of friends he can communicate with.

A fisher has to make 3 decisions repeatedly: whether to go out fishing, where to go and when to come back. Fishers need to take these decisions without precise knowledge of the world's laws and states besides what they can glean from past experience. Knowledge decays rapidly, however, as biomass, quotas and profit opportunities are being simultaneously exploited by competitors. Additional decisions may be possible in certain situations: the fisher might need to choose his gear or his reservation price for additional catch quotas.

We model here fishers’ decisions as separate bandit problems. The “Multi-armed bandit problem” is a framework to study the exploration-exploitation trade-off faced when repeatedly choosing among a finite set of options (see the monograph on the subject by Bubeck and Cesa-Bianchi 2012 as a reference for both the mathematical problem and common algorithms to solve it; see Kuleshov and Precup 2014 for a gentler introduction). For example, when choosing where to fish, one either goes to the best known spot (exploitation) or tows at a new promising location (exploration). Often fishers face bandit problems that are adversarial (as biomass and profitability are affected by natural processes and other fishers), contextual (as information about one area might provide clues about neighbouring locations) and vast: a map gridded with 50 rows and columns forces fishers to choose 1 option out of 2500.

In this model fishers deal with bandit problems with a slightly modified \(\epsilon\)-greedy algorithm. Fishers will explore with probability \(\epsilon\) and otherwise exploit. Exploiting means repeating the last best choice. Exploring means choosing a new option in the neighbourhood of the last one. Geographical exploration means picking a random cell from the \(\delta\)-sized Von Neumann neighbourhood of the current best. Numerical exploration means shocking the current best by random noise \(\delta \sim U[\text{Minimum},\text{Maximum}]\) (these are just variants of stochastic hill-climbing, see chapter 4 of Russell and Norvig (2010)). If the explored choice improves utility it becomes the current best. The following pseudo-code summarizes this approach:

given current_best
with probability epsilon explore:
    choose new_option in neighborhood of current_best
    if utility(new_option) > utility(current_best) :
        current_best = new_option
else exploit:
    choose current_best
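The loop above can be sketched in Python for the geographical case. This is a minimal illustration and not the model's Java implementation: the function names, the integer `delta` argument and the deterministic `utility` callback are assumptions made for the example (in the model, utility is observed from actual trips and is noisy).

```python
import random


def von_neumann_neighbourhood(cell, delta, width, height):
    """Map cells within Manhattan distance `delta` of `cell`, excluding `cell` itself."""
    x0, y0 = cell
    cells = []
    for dx in range(-delta, delta + 1):
        reach = delta - abs(dx)
        for dy in range(-reach, reach + 1):
            x, y = x0 + dx, y0 + dy
            if (dx, dy) != (0, 0) and 0 <= x < width and 0 <= y < height:
                cells.append((x, y))
    return cells


def explore_exploit_step(current_best, utility, epsilon, delta, width, height, rng=random):
    """One decision: explore with probability epsilon, otherwise keep current_best."""
    if rng.random() < epsilon:
        candidates = von_neumann_neighbourhood(current_best, delta, width, height)
        if candidates:
            new_option = rng.choice(candidates)
            # The explored cell replaces the current best only if it improves utility.
            if utility(new_option) > utility(current_best):
                return new_option
    return current_best
```

Because an explored cell is adopted only when it does strictly better, the utility of the current best never decreases under a fixed, noise-free utility function.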

We show a geographical example below. The dark grey patch maximises profits but the fisher doesn’t know it. He initially tows at a random spot and from time to time explores around it. He eventually finds the best area.

Fishers compete over the same resource but they often explore different parts of the sea. We can improve the bandit algorithm by assuming a minimum of information sharing. Assuming each fisher has a set of “friends”, an exploiting fisher chooses between his own current best and that of his best-performing friend, copying the friend only when the friend is doing better.
This results in a primitive population-based optimisation (see chapter 3 of Luke 2009) which the following pseudo-code describes.

given current_best, epsilon
with probability epsilon explore:
    choose new_option in neighborhood of current_best
    if utility(new_option) > utility(current_best) :
        current_best = new_option
else exploit:
    pick friend with highest utility
    if utility(friend) > utility(mine) :
        current_best = friend's current_best
    choose current_best
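The exploit branch with imitation can be sketched as follows. The `Fisher` class, its field names and the `assign_friends` helper are hypothetical and exist only for this example; the sketch assumes each fisher can read the last utility achieved by his friends.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Fisher:
    current_best: tuple
    last_utility: float = 0.0
    friends: list = field(default_factory=list)


def assign_friends(fishers, friends_each, rng=random):
    """Give every fisher `friends_each` random friends; links are one-directional."""
    for fisher in fishers:
        others = [f for f in fishers if f is not fisher]
        fisher.friends = rng.sample(others, friends_each)


def exploit_with_imitation(fisher):
    """Exploit step: copy the best friend's choice if that friend is doing better."""
    if fisher.friends:
        best_friend = max(fisher.friends, key=lambda f: f.last_utility)
        if best_friend.last_utility > fisher.last_utility:
            fisher.current_best = best_friend.current_best
    return fisher.current_best
```

Since friendships are drawn independently for each fisher, A may imitate B without B ever observing A.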

We show below how two fishers perform when sharing information.

This is the only kind of adaptation that the fishers perform. It is rudimentary but general purpose, as it can be used to adjust any variable without even knowing its meaning. It assumes no model knowledge and is enough to generate realistic behaviour. Besides the utility function and the social network, the adaptation is driven by three parameters: the exploration probability \(\epsilon\), the exploration range \(\delta\) and how often the decision takes place (every trip, every month or every year). This fisher is simple but responds to incentives by trial and error, acting more and more optimally.

In all the following examples the fishers will use profits as their utility function: profit per trip duration, profit per month or profit per year depending on the speed of adaptation and what is being adapted. Each fisher is assigned 2 friends at random and friendships are not reciprocal. A trip’s profit is just revenue from a trip, that is biomass caught times sale price, minus gas expenditures from travelling and fishing.
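The profit definition above amounts to a one-line calculation. The sketch below is only an illustration: the function and argument names are made up for the example, and the sample trip (200 units of distance travelled, 24 hours trawled, a full hold of 100 units) is hypothetical, while prices and fuel consumption follow the baseline parameters.

```python
def trip_profit(biomass_caught, sale_price, distance_travelled, hours_trawled,
                gas_price, litres_per_distance, litres_per_trawling_hour):
    """Revenue from the catch minus gas spent on travelling and on trawling."""
    revenue = biomass_caught * sale_price
    litres = (litres_per_distance * distance_travelled
              + litres_per_trawling_hour * hours_trawled)
    return revenue - gas_price * litres


# Hypothetical trip with baseline prices and gear:
# revenue 100 * 10 = 1000, fuel 2000 + 120 = 2120 litres, gas bill about 21.2
profit = trip_profit(100, 10, 200, 24, 0.01, 10, 5)
print(round(profit, 2))
```

Dividing this figure by the trip duration gives the profit per unit of time that fishers use as utility when they adapt every trip.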

1.2 Biology

In this paper fishers deal with only 2 alternative biology layers: logistic or static. The logistic layer hosts one or two species of fish. Every year the surviving biomass in each map cell grows logistically with its own carrying capacity \(K\) and Malthusian parameter \(r\): \[ \text{Biomass}_{t+1} = \text{Biomass}_{t} + r \cdot \left( 1 - \frac{\text{Biomass}_{t}}{K} \right) \cdot \text{Biomass}_t \] There are no cross-species effects. Fish move every day between neighbouring (in the Von Neumann definition) cells if there is a difference in their biomasses. The daily movement between two cells \(i\) and \(j\) is given by: \[ m \cdot ( \text{Biomass at cell } i - \text{Biomass at cell } j ) \] where \(m\) is a parameter describing fish spreading speed. With \(m=0.001\) two undisturbed cells take about 5 years to equalize their biomass.
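The two update rules can be written as small functions. This is a sketch under simplifying assumptions: it handles a single pair of cells in isolation, the function names are invented for the example, and the positive flow is taken to run from cell \(i\) to cell \(j\) (from the fuller cell to the emptier one).

```python
def logistic_growth(biomass, r, K):
    """Yearly update: Biomass + r * (1 - Biomass / K) * Biomass."""
    return biomass + r * (1 - biomass / K) * biomass


def daily_movement(biomass_i, biomass_j, m):
    """Biomass moving from cell i to cell j in one day (negative: from j to i)."""
    return m * (biomass_i - biomass_j)


def move_between(biomass_i, biomass_j, m):
    """Apply one day of movement to a pair of neighbouring cells."""
    flow = daily_movement(biomass_i, biomass_j, m)
    return biomass_i - flow, biomass_j + flow


# A half-full cell with the baseline r = 0.7 and K = 5000 grows from 2500 to 3375.
print(logistic_growth(2500, 0.7, 5000))
```

Applied to two isolated cells, this movement rule shrinks their biomass gap by a factor of \(1-2m\) per day. With \(m=0.001\) roughly 97% of the initial gap is gone after 1825 daily steps, ignoring growth and fishing.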

In some examples, when we need to minimise the noise in human decisions caused by biomass movement, we will use a simpler biology layer where biomass is not affected by fishing and neither grows nor moves. With this static biology any shock (in prices or in policy) to fishers will not be modulated by cumulative fishing mortality and can be studied in isolation.

1.3 Fish Markets

We will model fish market prices under two alternative assumptions: exogenous or endogenously linear prices. For most examples we assume fish price is fixed and exogenous. When endogenous, prices will be driven by the linear demand: \[ p_t = a - b q_t \] where \(p_t\) is the price of a unit of biomass sold at day \(t\), \(q_t\) is the average quantity landed over the previous 5 days, and \(a\), \(b\) are demand parameters.
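A small sketch of the endogenous price rule follows. The class and method names are hypothetical, and two choices are assumptions that the text does not specify: the price is floored at zero, and the intercept \(a\) is returned before any landing has been recorded.

```python
from collections import deque


class LinearDemandMarket:
    """Price p = a - b * q, with q the mean landings over the last `window` days."""

    def __init__(self, a, b, window=5):
        self.a = a
        self.b = b
        self.landings = deque(maxlen=window)  # older days drop out automatically

    def record_landings(self, quantity):
        self.landings.append(quantity)

    def price(self):
        if not self.landings:
            return self.a  # assumption: no history yet, quote the intercept
        q = sum(self.landings) / len(self.landings)
        return max(0.0, self.a - self.b * q)  # assumption: price cannot go negative


market = LinearDemandMarket(a=10, b=0.01)
for landed in (100, 200, 300, 400, 500):
    market.record_landings(landed)
print(market.price())  # average landings of 300 give a price close to 7
```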

1.4 Events Order

The model works in discrete steps. Each step represents one unit of model time. Each step the fishers are given a budget of hours (24 by default) and use them to fish or stay home. Fishers act sequentially within a step, taking turns at consuming their time budget. Their order is random and reshuffled each step.

Each step fishers act first, then the biology (movement, growth and so on) and then the policy (market price changes, quota trading and so on).

 The order in which the model's components are activated for each step (by default a step is 24 simulated hours)
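The activation order can be sketched as a single scheduling function. This is not the MASON scheduler used by the model: the `act` and `step` method names are hypothetical interfaces chosen for the example.

```python
import random


def run_step(fishers, biology, policy, hours_per_step=24, rng=random):
    """One model step: fishers in random order, then biology, then policy."""
    order = list(fishers)
    rng.shuffle(order)  # a fresh random order every step
    for fisher in order:
        fisher.act(hours_per_step)  # each fisher spends his whole time budget in turn
    biology.step()  # movement, growth and so on
    policy.step()   # market price changes, quota trading and so on
```

Reshuffling the order every step prevents any fisher from systematically reaching a spot before the others.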

The model is implemented in Java through the MASON library (Luke et al. 2003).

2 Conceptual Results

We will now introduce some simple results from our model. These results are obvious and often the only rational decision possible under the circumstances. It is less obvious however that such simple agents as the ones populating the model could self-organize to achieve them.
These scenarios are therefore better understood as patterns (Grimm et al. 2005): realistic macro-behaviour that we use to validate our individual decision making.

2.1 Baseline Scenario

2.1.1 Results

This is our baseline scenario; all the other examples will use these parameters, deviating only in a few key areas. 100 homogeneous fishers face the logistic biology layer, a fixed and exogenous sale price for the fish and no fishing regulations. Fishers’ only significant decision is where to fish next. They rest 12 hours between trips and fish in one spot only until their hold is full or they have spent 5 days at sea.

Because biomass is initially uniformly distributed profits are higher when fishing near port. By trial and error agents quickly learn to fish near port and as they deplete the area agents spread further and further out from the coast. The macro-behaviour that emerges is one of a fishing front slowly fanning out from port. This is entirely generated by the exploration-exploitation algorithm and not hard-coded in any way.

A sample run is shown below.

The next figure shows the geographical pattern created by fishing and the resulting biomass. During the first year fishers exploit biomass near port and then move out further over the next two years.

Evolution of fishing through 3 years of simulation: normalized tow distribution on the left and the biomass remaining at the end of the year on the right. Green is land (cells with \(x>40\)).

2.1.2 Parameters Used

The following table lists all the parameters that make up the baseline scenario.

| Parameter | Value | Meaning |
|---|---|---|
| Biology | Logistic | |
| \(K\) | 5000 | max units of fish per cell |
| \(m\) | 0.001 | fish speed |
| \(r\) | 0.7 | Malthusian growth parameter |
| Fisher | Explore-Exploit-Imitate | |
| rest hours | 12 | rest at ports in hours |
| \(\epsilon\) | 0.2 | exploration rate |
| \(\delta\) | \(\sim U[1,10]\) | exploration area size |
| fishers | 100 | number of fishers |
| friendships | 2 | number of friends each fisher has |
| max days at sea | 5 | time after which boats must come home |
| Map | | |
| width | 50 | map size horizontally |
| height | 50 | map size vertically |
| port position | 40,25 | location of port |
| cell width | 10 | width (and height) of each map cell |
| Market | | |
| market price | 10 | $ per unit of fish sold |
| gas price | 0.01 | $ per litre of gas |
| Gear | | |
| catchability | 0.01 | % biomass caught per tow hour |
| speed | 5.0 | distance/h of boat |
| hold size | 100 | max units of fish storable in boat |
| litres per unit of distance | 10 | litres consumed per distance travelled |
| litres per trawling hour | 5 | litres consumed per hour trawled |

2.1.3 Sensitivity Analysis

Computational models are opaque: code is hard to read, parameters interact non-linearly and the input space is so large that most of it lies unexplored. We run the risk of focusing only on the inputs that “work” and ignoring, or hiding, the simulation runs that fail. To stay honest we run ANTs (active non-linear tests) (Miller 1998). This involves letting the computer search for parameters that “break” the examples. By knowing the parameters that fail we delineate the limits of our model.

ANTs however have limitations. Only the parameters we allow to vary will be explored and that decision still resides with the author as it involves a trade-off between test power (by searching over more dimensions) and feasibility (the more parameters one looks through the slower the optimisation). ANTs are also silent about non-parametrized model choices: we do not test, for example, if the results would hold by switching the cell shape to hexagons rather than squares. No test however, computational or otherwise, can pit a model against every other conceivable alternative. Within the confines of what is possible ANTs are useful validation tools.

While the test is automated, one important manual detail is the range over which to modify input parameters. The original ANT paper modified initial parameters by at most 10%. In our model however even changes of 20% are not enough to break any pattern. We chose to vary parameters sometimes even 1000-fold, as this allows the optimiser to always find interesting counter-examples to our results. The fact that these counter-examples are achievable and explainable reinforces our trust in the model.

In this example we showed that boats start by fishing near port and expand their range over time as nearby biomass depletes. We can plot the average distance from port of boats over time and the trend line will be positively sloped (the more time passes the farther from port one fishes). To test when this fails we instruct the computer to search for parameters where the trend line is flat or negative. Formally this is just an optimisation problem that we solve with the methodology described in section 4.
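The quantity the search tries to push to zero or below is the slope of that trend line. A plain least-squares slope is sketched here as one way to compute it; the function names and the `tolerance` argument are assumptions of the example, and the optimiser itself is not shown.

```python
def trend_slope(values):
    """Least-squares slope of `values` against their time index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    return cov / var


def front_pattern_breaks(average_distances, tolerance=0.0):
    """True when the distance-from-port trend is flat or negative."""
    return trend_slope(average_distances) <= tolerance
```

A run in which boats fish ever farther from port gives a positive slope, so `front_pattern_breaks` returns `False` for it.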

Here we manipulate 8 parameters: \(\epsilon\), \(K\), \(m\), “hold size”, “cell width”, “gas price”, “speed” and “catchability”. Two kinds of failure are identified by the tests: the trend line is either flat (fronts do not form) or negative (fronts start far from port and get closer). The flat trend can be generated by increasing gas prices 1000-fold to 10$ per litre. At that price travelling becomes too expensive and fishers react by towing only next to port regardless of fish distribution. Fishers in this scenario are prohibited from exiting the fishery so they minimise losses by travelling as little as possible.

For a negative trend to emerge all the parameters need to be manipulated to create the right conditions: it must take little time to move from port to deep sea (high speed, small cell size), fish need to be plentiful and mobile (high \(K\), high \(m\)), boats need to be highly capable (high catchability and hold size) and explore continuously (high \(\epsilon\)). When all these conditions apply the boats initially still fish near port and expand away from it. However this happens quickly and, rather than fully consuming the areas close to port, the fishers move ever farther to virgin patches. Eventually all areas are somewhat depleted, at which point a clean-up dynamic occurs where fishers go back towards port and slowly consume anything they left alive on their first pass.
Dynamically it looks as follows:

The next table summarises the changes in parameters; the following figure shows the difference in dynamics for the sample runs.

| Parameter | Default | Flat | Negative |
|---|---|---|---|
| \(\epsilon\) | 0.2 | 0.2 | 0.72 |
| \(K\) | 5000 | 5000 | 20000 |
| \(m\) | 0.001 | 0.001 | 0.018 |
| hold size | 100 | 100 | 1000 |
| cell width | 10 | 10 | 1 |
| gas price | 0.01 | 10 | 2.707518 |
| speed | 5.0 | 5.0 | 14.59 |
| catchability | 0.01 | 0.01 | 0.08 |

Square root of the average towing distance from port of agents. While the default parameters produce an upward sloping trend (fishing further from port as time progresses), there are parameter sets for which there is no trend or the trend is negative.

Overall these are good results. Fronts fail to emerge only when the whole fishery is made unprofitable or when a combination of parameters makes the fronts emerge and end so quickly that a secondary “clean-up” dynamic overshadows them. This means that the spreading-out dynamic is robust to parameter changes.

2.2 Hyperstability

2.2.1 Results

A secondary result of the baseline scenario is that its fishery is hyper-stable. Hyper-stability (Harley, Myers, and Dunn 2001) occurs when changes in biomass are not reflected in changes to the CPUE (catches per unit of effort). Since catch data guide stock assessments (Branch et al. 2006), a hyper-stable fishery would mask depletion until too late. Mathematically, fitting the following log regression: \[ \log(\text{CPUE}) = \alpha + \beta \log(\text{Biomass}) \] a fishery is hyper-stable when \(\beta<1\).
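The regression can be fitted with ordinary least squares on the logged series. The sketch below is an illustration rather than the estimation code used for the paper; the function name is hypothetical and the input series in the usage lines are synthetic.

```python
import math


def hyperstability_fit(biomass, cpue):
    """Fit log(CPUE) = alpha + beta * log(Biomass) by ordinary least squares."""
    xs = [math.log(b) for b in biomass]
    ys = [math.log(c) for c in cpue]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Synthetic series where CPUE falls with the square root of biomass.
biomass = [10000, 8000, 6000, 4000, 2000]
cpue = [2 * b ** 0.5 for b in biomass]
alpha, beta = hyperstability_fit(biomass, cpue)
print(beta < 1)  # beta is about 0.5 here, so this series counts as hyper-stable
```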

In this example we increase the number of fishers to 150 (from the baseline of 100). That’s enough fishers to slowly deplete the entire biomass. The next figure compares changes in catches per unit of effort versus biomass left in the sea. They are related, but particularly for the first 10 years CPUE is constant or increasing while biomass is depleting. The regression yields \(\beta = 0.712\), indicating hyper-stability.

The trajectory of catches per unit of effort and biomass over a 40 year run of the simulation. While CPUE falls as biomass is depleted some of the changes are masked by fishers switching their location so that while in the first 10 years two fifths of the biomass is consumed the CPUE remained steady.

2.2.2 Parameters Used

The only difference from baseline is the number of fishers, which is increased to 150.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| fishers | 150 | 100 | number of fishers |

2.2.3 Sensitivity Analysis

Again we manipulate 8 parameters: \(\epsilon\), \(K\), \(m\), “hold size”, “cell width”, “gas price”, “speed” and “catchability”. This time we instruct the optimiser to look for parameters that generate proportional (\(\beta=1\)) or hyper-depleted (\(\beta>1\)) fisheries. We force the optimiser to ignore simulations where the fit is poor (\(R^2<0.2\)) or where biomass is completely depleted (\(\text{Biomass}_{t=40}=0\)).

The key driver for hyper-depletion scenarios is high movement rate \(m\). The scenario found by the optimiser involves fish that is plentiful (high \(K\)) and boats that are powerful (high catchability and hold size) but expensive to run (high gas price). Given these incentives the fishers first deplete the biomass close to port and then settle on fishing relatively close to port and wait for the fish to move their way. This creates a low catch per unit of effort value even though fish is plentiful further out at sea where it is unprofitable to go.
The \(\beta\) generated by this scenario is \(4.82\).

It is also possible to look for cases where \(\beta < 0\). A negative \(\beta\) implies that a decrease in CPUE masks an increase in biomass. The key for this result to emerge is to force agents to catch in areas near port even as they deplete and even though biomass is growing elsewhere. This is achieved by making fish move faster than boats. With a low boat speed but high fish movement \(m\), the fishers are content to stay near port and wait for the fish to come to them rather than waste time travelling further themselves. The \(\beta\) generated by these parameters is \(-2.80\).

| Parameter | Hyper-stability | Hyper-depletion | Negative Relation |
|---|---|---|---|
| \(\epsilon\) | 0.2 | 0.30 | 0.2 |
| \(K\) | 5000 | 8689.205965 | 7500 |
| \(m\) | 0.001 | 0.095023 | 0.085 |
| hold size | 100 | 830 | 800 |
| cell width | 10 | 7.25 | 3.75 |
| gas price | 0.01 | 3.194378 | 0.38 |
| speed | 5.0 | 1.590745 | 0.01 |
| catchability | 0.01 | 0.15 | 0.15 |

We can tweak parameters to achieve both hyper-depletion and negative correlations. These are the two runs with the parameters chosen by the ANT.

2.3 Reaction to Oil Price Changes

2.3.1 Results

In this example we show how agents adapt to a yearly shock in oil prices and how the same algorithm can guide fishers with different technologies.

Much like with biomass depletion, fishers adapt to oil shocks automatically. We reuse the baseline scenario (fixed sale price, no regulations, 100 homogeneous fishers adjusting their fishing spot every trip) but here the biomass is fixed and distributed unequally so that more fish is available farther from the coast.

We compare two different technologies: in the first scenario agents consume fuel when moving but not while fishing; in the second scenario they consume fuel when fishing but not while travelling. When gas is free fishers in both scenarios trawl far from port at a bountiful location that minimises the time it takes to fill the boat’s hold (and therefore maximises profits per hour). At the end of each simulation year we increase the price of gas. Agents who consume their gas by travelling respond by fishing closer and closer to port. These locations have less biomass and therefore require the fisher to spend more time at sea while saving him some of the additional gas costs. Vice-versa, agents who consume their gas only when trawling never change their fishing spot since maximising catches per hour automatically minimises gas consumption as well.

How fishers react to oil price changes depends both on the price of oil and on the way fishers consume it: if fishers spend most (or in this case all) of their gas moving to and from port, gas price has a large effect on where they fish; if most (or in this case all) gas is spent on fishing instead, gas price has no effect on fishing location.

This example helps show how the same trial and error fisher dealing with the same incentives can react optimally given different constraints. The fishers use the same algorithm in each scenario; the only difference is in their oil consumption technology. Notably neither oil consumption nor oil price appears in the trial and error algorithm of the agent, but both are accounted for indirectly by the way they affect trip profits. This allows the same algorithm to deal with differences in technologies without any additional tuning.

The next figure shows the average distance between fishing locations and port in separate simulations varying gas prices and consumption technology. When movement is expensive and gas price differences are large, agents react by fishing closer to port; when trawling is expensive, agents keep fishing at the optimal distance.

A plot of average distance to port per simulation after fixing oil prices and technology. The stepwise shape of the `movement only` fisher depends on the interaction of fish distribution, oil prices and cell size

The following video shows a similar scenario where oil price changes continuously and the agents keep adapting to it as a consequence.

2.3.2 Parameters Used

The main changes from baseline are the biology (which is fixed with higher biomass available further from port), gear consumption, hold size and cell width (to accentuate the distance effects).

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | Fixed | Logistic | |
| cell width | 1 | 10 | width (and height) of each map cell |
| gas price | varies | 0.01 | $ per litre of gas |
| hold size | 500 | 100 | max units of fish storable in boat |
| litres per unit of distance | varies | 10 | litres consumed per distance travelled |
| litres per trawling hour | varies | 5 | litres consumed per hour trawled |

2.3.3 Sensitivity Analysis

We run the “only movement” gear example for 2 years with free gas and then one more year with gas prices of 2$. We monitor the average distance from port before and after the gas price shock. We want to find the parameters for which the difference in distance is small. This would highlight cases where agents do not adapt to gas price changes.

We manipulate 7 parameters: \(\epsilon\), \(K\), “hold size”, “cell width”, “speed”, “catchability” and “fishers”. There is no parameter set for which fishers fish farther away from port when oil prices increase. It is however relatively simple to obtain parameters for which fishers keep their fishing spots after the gas price increase. One such combination is described below; it involves high \(K\) with large maps (high “cell width” and low “speed”).

| Parameter | Original | No Adaptation |
|---|---|---|
| \(\epsilon\) | 0.2 | 0.27 |
| \(K\) | 5000 | 17004 |
| hold size | 500 | 978 |
| cell width | 1 | 13.27 |
| speed | 5.0 | 3.64 |
| catchability | 0.01 | 0.033 |
| # of fishers | 100 | 143 |

2.4 Fishing the Line

2.4.1 Results

Fishing the line means towing at the border of protected areas. This way fishers catch mobile species as they exit their sanctuaries. Fishing the line emerges in our model even though agents do not even understand the concept of boundary in their algorithms. It is a consequence of imitating those that, initially at random, have started fishing on the edge of protected areas and made money.

In this example, 300 fishers exploit a single species distributed in the standard logistic biomass model where fish spreads from areas with more biomass to those with less. An arbitrary MPA is mandated in the centre of the sea. Initially fishers simply avoid the MPA, fishing in open virgin areas instead. This behaviour coincides with the baseline scenario where fishers slowly spread out from around the port. However as biomass in open areas declines some fishers stumble on fishing just at the border of the MPAs and their profits attract everyone else. Averaging out over 20 years of simulation, as the next figure shows, most fishing occurs on the line (with higher effort on the part of the border closest to port).

The area from (15,10) to (30,40) is protected by an MPA; over 20 years fishers resort to fishing mostly at its border, so much so that the normalized tows plot shows the MPA contour. Notice also how the corners of the MPA are not as heavily fished. This is a consequence of fish movement proceeding over Von Neumann neighbourhoods so that fish doesn’t move ‘diagonally’. The trial and error agents figure out even this detail.

This pattern is also clear in its dynamics. Initially fishers simply bypass protected areas, as this first video shows:

However as virgin areas are depleted fishers start fishing the line:

Functionally the same happens if we model gear-to-habitat interplay. Assume that certain areas of the map are now considered of “rocky” habitat (this is a purely arbitrary Boolean value in the code) and further assume that fishers are supplied with a gear that doesn’t work in rocky areas (catchability is 0 there). The simulated fishers again generate fishing the line, in this case the habitat line between rocky and not-rocky. When multiple areas are rocky, multiple lines emerge as the next figure shows.

5 random rectangles on the map are designated as rocky: fishers who try to catch there fail to get any fish out. Quickly however they learn to tow just at the border, catching fish as it moves out of the rocky area.

2.4.2 Parameters Used

This example uses the baseline parameters and just adds a protected area in the centre of the map. It also increases the number of fishers to 300 as it leads to quickly depleting biomass outside of the protected areas and accentuates the emerging fishing the line behaviour.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Regulation | MPA | - | |
| MPA coordinates | [(15,10),(30,40)] | - | Endpoints of the MPA rectangle |
| fishers | 300 | 100 | number of fishers |

In the rocky habitat scenario we designated four random rectangles of the map as rocky and changed catchability so that it would be 0 at those spots. All the other parameters follow the baseline scenario.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Habitat | 4 Rocky Rectangles | - | |
| catchability | rocky: 0, non-rocky: 0.01 | 0.01 | % biomass caught per tow hour |
| fishers | 300 | 100 | number of fishers |

2.4.3 Sensitivity Analysis

There is no need to run an optimisation to find the parameters for which fishing the line doesn’t emerge. For \(m=0\) there is no fishing the line as there is no spillover from the protected area that the fisher can consume. For high enough \(K\) (or few enough fishers) there are always virgin areas to exploit that are more profitable than the MPA border itself.

2.5 Complex Biologies: OSMOSE

In this example we show how exploration-exploitation agents function over much more complicated biological layers.

We do not need to adapt agents’ behaviour even when changes in some components are drastic. In this example the same 100 fishers compete over a much more complicated biological model without any change in their algorithms. OSMOSE (Shin and Cury 2001; Shin and Cury 2004; Grüss et al. 2015) is a computational model of fish dynamics simulated at school level. We couple our agent-based model to the OSMOSE representation of the Benguela ecosystem (a 12 species conceptual example). OSMOSE provides the biological and geographical layer over which we simulate humans and fishing mortality. Agents’ behaviour is similar to our previous examples.

This is how the model looks in action while underneath OSMOSE moves biomass around.

Assume 100 fishers, all targeting the “Demersal 2” species in the area. Because OSMOSE models biological interactions between species, we can study the effect that fishing mortality of one species has on the others. In particular, as shown in the next figure, as “Demersal 2” schools are depleted, losing on average 50% of their total stock, the mesopelagic fish triple their biomass.

The time series of 50 simulation runs studying the biomass of `Demersal 2` and Mesopelagic fish over 30 years of simulation when fishers are targeting the `Demersal 2` species. The bolded line is the average biomass over the 50 runs.

2.6 Optimal Heuristic

In this series of runs we show how there is no single “optimal” heuristic.

It might be supposed that an optimal exploration-exploitation algorithm can be found to drive the trial and error of the agents, or at least an optimal exploration probability \(\epsilon\) can be defined such that agents explore and exploit efficiently. There are two main objections to this view: the first is scientific while the second is practical.

Scientifically there is no particular reason to assume that if an optimal heuristic exists then fishermen are using it. Some fishers may act as satisficers rather than maximisers and, while satisficing is usually defined in terms of outcome (see the mathematical definition of satisficing in Rubinstein 1997; see the experimental definition in Caplin, Dean, and Martin 2011), it is just as possible that fishers are satisficers in terms of the heuristic they use, exploring too little or not using their friends to the fullest. An example of this is provided in transportation studies where commuters take routes that are suboptimal until a shock, for example a strike, forces them to experiment, at which point they switch to a better route and keep using it even after the shock is over (Larcom, Rauch, and Willems 2015).

A more practical issue is that a heuristic’s optimality depends on the problem it is solving. Here we show how a value of \(\epsilon\) can be optimal for one biology model and suboptimal for others. We compare 3 heuristics: the “explorer” has a high \(\epsilon = .8\), the “exploiter” has a low \(\epsilon=.2\) and the “adaptive” heuristic starts at \(\epsilon=.8\) and increases it by multiplying it by \(1.02\) every time an exploration is successful (decreasing it when the exploration is unsuccessful).
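The adaptive rule can be sketched as a single update function. The text only gives the multiplier used after a successful exploration, so two details here are assumptions of the example: an unsuccessful exploration divides \(\epsilon\) by the same factor, and \(\epsilon\) is kept within `[minimum, maximum]` so that it remains a valid probability.

```python
def adapt_epsilon(epsilon, exploration_succeeded, factor=1.02, minimum=0.01, maximum=1.0):
    """Raise epsilon after a successful exploration, lower it after a failed one."""
    if exploration_succeeded:
        epsilon *= factor
    else:
        epsilon /= factor  # assumption: symmetric decrease, not stated in the text
    return min(maximum, max(minimum, epsilon))  # assumption: clamp to stay a probability


epsilon = 0.8
epsilon = adapt_epsilon(epsilon, exploration_succeeded=True)   # about 0.816
epsilon = adapt_epsilon(epsilon, exploration_succeeded=False)  # back to about 0.8
```

With a symmetric rule, a run of failed explorations gradually turns the adaptive fisher into an exploiter, while repeated successes keep it exploring.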

The next figure shows how the exploiter heuristic performs better when the biomass is fixed. This is because after finding the best fishing spot any further exploration is wasteful. Conversely, with the logistic biology model the exploiter performs poorly because the best fishing spot is quickly consumed and exploration is necessary. The adaptive heuristic is a close second in both scenarios. We also ran the 3 heuristics against the OSMOSE simulation described above. When targeting pelagic fish the exploiter performs slightly better than the explorer heuristic. This is because in this particular biology fish regenerate much faster than fishing consumes them, so the best fishing spot is relatively stable.

A comparison on the overall efficiency (in terms of cumulative catches) for 3 sample heuristics over 3 different biology models


2.7 Switching Gear

2.7.1 Results

In this example we show how the exploration-exploitation framework can be used by fishers to change gear and target species.

Fish come in two species: red and blue. They are co-located throughout the map with each species’ biomass growing independently and logistically in each cell. In this example fishers adapt over two variables: where to fish and what gear to use. Fishers change their fishing spot every trip as before. Fishers now also change their gear each year according to the exploration-exploitation framework: when exploring, fishers try a new random gear, while when exploiting they keep using their own gear or copy a friend’s if that friend is doing better.

Two gears are available: gear that catches only reds and gear that catches only blues (catchability \(q=0.01\)). There are 100 fishers and all of them start with red-only gear. This unbalanced mortality however depletes red fish while allowing the blue species to grow. This makes blue gear profitable and by exploration and imitation more and more fishers switch to targeting blue fish. The overall dynamic is initially one of sudden changes where the fishery as a whole targets either blue or red fish, but eventually it settles into an equilibrium where about half of the fishers catch blue and half catch red, which mirrors the equilibrium in the relative biomasses. A sample run is shown in the next figure. Fishers manage to allocate their effort relative to fish abundance in spite of not knowing what the actual biomasses are. Moreover they are able to coordinate, without any leadership or awareness, into splitting into 2 separate fleets.

A sample run where agents are allowed to switch gear and by consequence target species. Fishers tend to respond to variation in relative biomass distribution even without knowing it by following the example of those that are more profitable


2.7.2 Parameters Used

This scenario has a 2 species logistic biology layer and agents adapt their gear over time.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | Logistic 2 species | Logistic | |
| \(K\) | red: 5000, blue: 5000 | 5000 | max units of fish per cell |
| \(m\) | red: 0.001, blue: 0.001 | 0.001 | fish speed |
| \(r\) | red: 0.7, blue: 0.7 | 0.7 | Malthusian growth parameter |
| catchability | (red: 0.01, blue: 0) or (red: 0, blue: 0.01) | 0.01 | % biomass caught per tow hour |
| Gear Adaptation | 2 options | - | How agents change their gear over time |
| \(\epsilon_{\text{gear}}\) | .2 | - | exploration rate on gear |
| gear change frequency | yearly | - | how often the fishers think about changing gear |

2.7.3 Sensitivity Analysis

The major result in this section is the ability of fishers to keep up with changes in biomass by changing gear, which eventually closes the gap between red and blue biomass. We therefore instruct the optimiser to look for the largest difference between red and blue biomass in the last simulation year. We manipulate 9 parameters: \(\epsilon\), \(K\), \(m\), “hold size”, “cell width”, “speed”, “catchability” (for all gears), \(\epsilon_{\text{gear}}\) and “gas price”.

While the optimiser can find scenarios where the difference in biomass ends up being large, these results are banal. The gist is that when fish is extremely abundant and easy to catch (high \(K\), high \(m\), high catchability), when profits are mostly cost-driven (high gas price, large cell width, high \(\epsilon\)) and when gear experimentation is slow (low \(\epsilon_{\text{gear}}\)) fishers overwhelmingly keep targeting the red species as there is no great advantage in switching. The next figure shows the dynamics in biomass and target species.

The dynamics generated by the active non-linear test.


| Parameter | Small Difference | Large Difference |
|---|---|---|
| gas price | 0.01 | 10 |
| \(\epsilon\) | 0.2 | 0.8 |
| \(K\) | 5000 | 20000 |
| \(m\) | 0.001 | 0.2 |
| hold size | 100 | 160 |
| cell width | 10 | 20 |
| speed | 5.0 | 14.68 |
| catchability | 0.01 | 0.2 |
| \(\epsilon_{\text{gear}}\) | .2 | .05 |

2.8 Directed Technological Change

2.8.1 Results

In this example we show that the higher the price of gas the faster the agents adopt energy efficient gear. Fishers again adapt over both fishing location and gear. However, rather than choosing between two gears, they now face a continuum of options where catchability is fixed at \(q=0.01\) but gas consumption (units of oil consumed per hour of fishing) is randomly distributed \(\sim U[0,20]\). When exploring, fishers shock their gas consumption by random noise \(\sim U[-\delta,\delta]\), while when exploiting they copy their best performing friend (if he’s doing better). Fishers perform this adaptation every two months using bi-monthly profits as their utility function.
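A minimal sketch of this gear adaptation step follows. The function name, the `(gas_consumption, profit)` tuples used to describe friends and the clamping to \([0,20]\) are hypothetical choices made for illustration; the sketch also assumes \(\delta\) is expressed in the same units as gas consumption, which the text does not specify.

```python
import random


def adapt_gas_consumption(own, own_profit, friends, epsilon=0.2, delta=0.05,
                          low=0.0, high=20.0, rng=random):
    """One bi-monthly gear adaptation step for a single fisher.

    own        -- current gas consumption per hour of fishing
    own_profit -- profit over the last two months (the utility)
    friends    -- list of (gas_consumption, profit) tuples (hypothetical layout)
    """
    if rng.random() < epsilon:
        # Explore: shock current consumption by uniform noise in [-delta, delta].
        shocked = own + rng.uniform(-delta, delta)
        # Assumption: keep consumption inside the initial [low, high] range.
        return min(high, max(low, shocked))
    if friends:
        # Exploit: copy the best performing friend, but only if he is doing better.
        best_consumption, best_profit = max(friends, key=lambda f: f[1])
        if best_profit > own_profit:
            return best_consumption
    return own
```

With `epsilon=0` the fisher never explores and simply imitates a more profitable friend, which isolates the imitation channel through which noisy profits can spread less efficient gear.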

If gas is free, fishers are indifferent to energy efficiency and average gas consumption follows a random walk. When gas prices are low but not zero there is an incentive to switch to better gear, but other sources of noise (for example a series of unlucky explorations) may matter more and fool fishers with better gear into imitating less efficient competitors. As gas becomes expensive, however, the benefits of mileage become evident and average energy efficiency improves quickly. The next figure shows the average fuel consumption per hour of fishing over 10 runs for each of 3 scenarios: when agents face free gas, when agents face cheap gas (\(0.1\$\) per unit of fuel consumed) and when agents face expensive gas (\(1\$\) per unit of fuel consumed).

Each line represents the average fuel inefficiency for an independent simulation. When facing free gas there is no incentive to improve fuel efficiency and therefore technology on average follows a random walk. The more expensive gas gets the more pronounced the march towards better gear becomes.

2.8.2 Parameters Used

This example uses baseline parameters altering gas price and adding gear adaptation.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| litres per trawling hour | \(\sim U[0,20]\) | 10 | litres consumed per hour trawled |
| gas price | 0 or 0.1 or 1 | 0.01 | $ per litre of gas |
| Gear Adaptation | Mileage | - | How agents change their gear over time |
| \(\epsilon_{\text{gear}}\) | .2 | - | exploration rate on gear |
| \(\delta_{\text{gear}}\) | 0.05 | - | exploration area size |
| gear change frequency | bi-monthly | - | how often the fishers think about changing gear |

2.8.3 Sensitivity Analysis

We search here for the parameters that cause the least gear adaptation when gas is expensive. We manipulate 9 parameters: \(\epsilon\), \(K\), \(m\), “hold size”, “cell width”, “speed”, “catchability”, \(\epsilon_{\text{gear}}\) and \(\delta_{\text{gear}}\). We task the optimiser to find the simulation with the lowest fuel efficiency.

While it is possible to find a set of inputs for which fuel efficiency is essentially a random walk even though gas is expensive, the results are uninteresting. Tasked with flattening the fuel efficiency adoption curve, the optimiser cuts boat speed fifty-fold to 0.1 kilometres per hour while at the same time doubling cell size to 20 kilometres. As a result each boat can take up to a year to even reach its fishing spot. What the optimiser found, then, is just a way to slow down the model so much that even after 4000 days the agents have had too little time to adapt their gear. This is the result of giving the optimiser an impossible task.

Each line represents the average fuel inefficiency for an independent simulation. Even though the agents are facing high gas prices the biology and the map are structured in such a way as to make movement towards better gear unattractive.

One minor tweak we performed for this test was to remove the maximum number of days at sea as it interferes with very slow boats travelling over very large cells.

| Parameter | Adaptation | Random Walk |
|---|---|---|
| gas price | 1 | 1 |
| \(\epsilon\) | 0.2 | 0.48 |
| \(K\) | 5000 | 7242.33 |
| \(m\) | 0.001 | 0.0395 |
| hold size | 100 | 600 |
| cell width | 10 | 20 |
| speed | 5.0 | 0.1 |
| catchability | 0.01 | 0.2 |
| \(\epsilon_{\text{gear}}\) | .2 | .70 |
| \(\delta_{\text{gear}}\) | .05 | .48 |
| max days at sea | 5 | Unlimited |

3 Policies

3.1 Quotas

Modern fishery management often involves setting a maximum quota of catches allowed each year. There are broadly two ways of implementing this control: total allowed catches (TAC) and individual tradeable quotas (ITQ). A TAC is a fishery-wide quota: when the sum of all catches for any species exceeds the quota allowed, the season ends for everyone. ITQs instead are assigned to individuals, but agents can trade quotas among themselves (usually because of differences in expected profitability per unit of catch). To simulate this policy we need to build quota markets.

Each fisher needs to estimate his reservation price \(\lambda^*\): the quota price that would make him indifferent between buying and selling an additional quota. Given each fisher’s \(\lambda^*_i\) for each species \(i\) we assume the fisher will buy an additional quota if it costs \(\lambda^*_i - \mu\) and sell one for \(\lambda^*_i + \mu\), where \(\mu\) is a mark-up parameter. While in the real world matching buyers and sellers may be difficult, here we assume quotas are traded as if in a stock exchange through an order-book where every fisher places both a bid and (if they own any quota) an ask and all crossing orders fire. This is equivalent to assuming perfect matching between traders.

Assume first that there is only one species of fish for which quotas are available to trade. The fisher knows he makes \(\Pi \) from catching and selling an additional unit of fish. The price of the quota is \(\lambda\).
The monetary revenue from buying a new unit of quota is: \[ \text{Revenue from Buying}= \begin{cases} \Pi - \lambda,& \text{if quota is used} \\ -\lambda, & \text{otherwise} \end{cases} \] This is because there is the risk of buying a quota and not using it before the season ends in which case the fisher has wasted \(\lambda\).

The expected value of buying a quota (for a risk neutral fisher) is then: \[ E[\text{Value of Quota}] = \text{Pr}(\text{Needed})(\Pi - \lambda) + \text{Pr}(\overline{\text{Needed}})(- \lambda) \] If we set this expected value to \(0\) we can find the price \(\lambda^*\) that makes fishers indifferent between owning or not a quota. \[ \lambda^* = \text{Pr}(\text{Needed})\Pi \] This is the reservation price from owning an additional quota.

The fisher inside the model can compute this probability by keeping track of the following: \[ \begin{aligned} q &= \text{Quota owned} \\ c &= \text{Daily catch} \\ t &= \text{Day of the season} \\ T &= \text{Season length} \end{aligned} \] Then the probability \(\text{Pr}(\text{Needed})\) is just: \[ 1 - \text{Pr}\left(c \leq \frac{q}{T-t}\right) \] That is, the probability of needing a quota equals the probability that daily catches times the remaining season length exceed the quota currently owned. We let the fishers assume, somewhat unrealistically, that daily catches \(c\) are normally distributed. They can then compute the reservation price \(\lambda^*\) at any point from the normal CDF, using a moving average and a moving standard deviation of daily catches as its parameters.
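As an illustration, the one-species reservation price can be computed with nothing more than the error function from the standard library. The function names are hypothetical, and the handling of the edge cases (season over, zero standard deviation) is an assumption added only so the sketch never divides by zero.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a normal distribution, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def probability_needed(quota_owned, day, season_length, mean_catch, std_catch):
    """Pr(Needed) = 1 - Pr(c <= q / (T - t)) with c assumed normal."""
    days_left = season_length - day
    if days_left <= 0:
        return 0.0  # assumption: no time left to use an extra quota
    threshold = quota_owned / days_left
    if std_catch <= 0:
        # Assumption: with no observed variance, compare the mean directly.
        return 1.0 if mean_catch > threshold else 0.0
    return 1.0 - normal_cdf(threshold, mean_catch, std_catch)


def reservation_price(unit_profit, quota_owned, day, season_length,
                      mean_catch, std_catch):
    """lambda* = Pr(Needed) * Pi for a risk neutral fisher."""
    return unit_profit * probability_needed(
        quota_owned, day, season_length, mean_catch, std_catch)
```

For instance, a fisher holding 1000 units of quota with 100 days left needs 10 units of catch per day to exhaust it; if his moving average is exactly 10, then \(\text{Pr}(\text{Needed}) = 0.5\) and his reservation price is half of \(\Pi\).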

The multiple species version of this estimation is complicated by the fact that fishers in general catch a mix of species and that not owning a quota for one prevents you from catching and selling the others. Assume that each time the fisher catches 1 unit of species 1 it also catches on average \(x_2\) units of species 2. The benefit of owning an additional unit of quota of species 1 then is not just that you can sell that unit of fish 1 but also that you can sell \(x_2\) units of species 2, provided you own \(x_2\) units of quotas for species 2. This translates into the following revenue equation: \[ \text{Revenue from Buying}= \begin{cases} \Pi_1 - \lambda_1 + x_2(\Pi_2 - \lambda_2),& \text{if quota is used} \\ -\lambda_1, & \text{otherwise} \end{cases} \]

For a risk neutral fisher the quota price that makes him indifferent between buying or not is: \[ \lambda_1^* = \text{Pr}(\text{Needed})\left(\Pi_1 + x_2(\Pi_2 - \lambda_2) \right) \] Notice how, depending on the profits from the other species \(\Pi_2\) and its quota prices \(\lambda_2\), the reservation price could be higher or lower than it was for the one species example.
Similarly for species 2: \[ \lambda_2^* = \text{Pr}(\text{Needed})\left(\Pi_2 + x_1(\Pi_1 - \lambda_1) \right) \]

Now \(\text{Pr}(\text{Needed})\) can be estimated much like the one species example as long as we know \(x_2\), which we estimate by using averaged daily catches for each fish: \[ \begin{aligned} x_2 &= \frac{c_2}{c_1} \\ x_1 &= \frac{c_1}{c_2} \end{aligned} \]

And we can generalize this to more than two species into the following equation: \[ \lambda_i^* = \text{Pr}(\text{Needed})\left(\Pi_i + \sum_{j \neq i}\frac{c_j}{c_i}(\Pi_j - \lambda_j) \right) \]
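The generalised formula translates directly into code. In this sketch the function name and the list-based arguments are hypothetical, \(\text{Pr}(\text{Needed})\) is passed in as already estimated, and returning 0 when the fisher has no recorded catch of species \(i\) is an assumption made to avoid a division by zero.

```python
def multispecies_reservation_price(i, pr_needed, unit_profits, quota_prices,
                                   mean_catches):
    """Reservation price for one extra unit of quota of species i.

    unit_profits[j] -- Pi_j, profit from catching and selling a unit of species j
    quota_prices[j] -- lambda_j, current quota price of species j
    mean_catches[j] -- c_j, averaged daily catch of species j
    """
    if mean_catches[i] <= 0:
        return 0.0  # assumption: species i is never caught, ratios undefined
    total = unit_profits[i]
    for j in range(len(unit_profits)):
        if j != i:
            # c_j / c_i units of species j are landed per unit of species i.
            ratio = mean_catches[j] / mean_catches[i]
            total += ratio * (unit_profits[j] - quota_prices[j])
    return pr_needed * total
```

With a single species the sum is empty and the function reduces to \(\text{Pr}(\text{Needed})\Pi\), matching the one-species case.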

Fishers now can generate reservation prices each day as a function of their current stock of quota and predicted future catches. Fishers can then use these reservation prices to place bids and asks in the quota order-book and any crossing orders trigger an exchange.
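The order-book step can be sketched as below, under several assumptions the text does not settle: every order is for a single unit of quota, each fisher places at most one bid and one ask per round, the mark-up \(\mu\) is strictly positive, and a crossing pair trades at the midpoint between bid and ask. The function name and dictionary layout are hypothetical.

```python
def match_orders(reservation_prices, quota_owned, markup):
    """Match crossing one-unit orders and return (buyer, seller, price) tuples.

    reservation_prices -- dict: fisher id -> lambda*
    quota_owned        -- dict: fisher id -> units of quota held
    markup             -- mu, assumed > 0 so nobody crosses his own order
    """
    # Every fisher bids lambda* - mu; highest bids are served first.
    bids = sorted(((price - markup, fisher)
                   for fisher, price in reservation_prices.items()),
                  reverse=True)
    # Only fishers who own quota ask lambda* + mu; lowest asks are served first.
    asks = sorted((price + markup, fisher)
                  for fisher, price in reservation_prices.items()
                  if quota_owned.get(fisher, 0) > 0)
    trades = []
    while bids and asks and bids[0][0] >= asks[0][0]:
        bid, buyer = bids.pop(0)
        ask, seller = asks.pop(0)
        # Assumption: the trade clears at the midpoint of bid and ask.
        trades.append((buyer, seller, (bid + ask) / 2.0))
    return trades
```

Because a fisher's own bid is always below his own ask when \(\mu > 0\), the best bid and best ask can only cross when they belong to different fishers.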

As the market progresses and trades are made we can track the daily closing price of quotas. The fishers use these quota prices in their profit function to value the quotas consumed (a proper opportunity cost). This way the quota price affects profits and shapes the fishers’ exploration-exploitation behaviour.

3.2 One Species Quota: Winners and Losers

3.2.1 Results

In this example we compare how TAC and ITQ policies affect the allocation of catches among heterogeneous fishers. While both policies reward fishers with better fishing gear, only ITQs allocate more catches to energy efficient fishers.

100 agents fish in a one-species fixed biology world. The fishers are heterogeneous in terms of catchability which is randomly distributed \(q \sim U[0.01,0.1]\); once assigned, catchability is fixed and fishers only adapt over where to fish. In the ITQ scenario each fisher is given yearly quotas worth 4000 units of catch, in the TAC scenario there is a yearly fishery-wide quota of 400000 units of catch.

The next figure shows how higher catchability translates into more catches both when quotas are traded over an ITQ and when a single untradeable TAC is applied to the whole fishery. In the TAC scenario fishers with higher catchability have an edge over their competitors as they are able to complete more trips faster before the quota is exhausted. In the ITQ scenario fishers with better gear are more profitable which allows them to buy more quotas and therefore catch more.

Each dot in this graph represents a simulated fisher. The generated scatterplot shows a clear positive correlation between catchability and catches in both scenarios. For these particular parameters the relationship is stronger in the ITQ example than the TAC one but this difference is not generalisable.

The next figure shows what happens when fishers are identical over catchability (\(q=0.1\)) but heterogeneous over gas consumption per hour of fishing (which is \(\sim U[0,20]\)). Only the ITQ scenario has a clear correlation between catches and energy efficiency. This is because better mileage makes fishers more profitable, which in the ITQ scenario translates into more quotas being bought, while in the TAC scenario higher profits do not translate into more catches as efficiency provides no benefit in the race to fish. From a fishery perspective ITQs are more efficient as the average gas consumption per hour of fishing is lower.

Each dot in this graph represents a simulated fisher. The generated scatterplot shows how ITQs reward more energy efficient fishers with higher catches while in the TAC scenario there is no correlation between efficiency and catches


This example showed how different ways of implementing quotas can reward different fishers even when the overall quota allocated is the same. As we will show in the next examples these differences become more pronounced as fishers are allowed to adapt more variables than just where to fish.

3.2.2 Parameters Used

In this scenario we introduced two new regulation objects: ITQ and TAC. We also varied the catchability and mileage of the fishers. We kept the biology fixed (no death, no growth, no movement) so that local depletion wouldn’t add noise to the results. All the other parameters follow from the baseline scenario.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | Fixed | Logistic | |
| Regulation | TAC or ITQ | - | |
| total quota | 400000 | - | maximum yearly catch |
| catchability | \(\sim U[0.01,0.1]\) | 0.01 | % biomass caught per tow hour |

In the mileage scenario we also modified gas consumption so that it is driven only by trawling (movement costs no gas). This reduces the exploration noise on profits and strengthens the correlation between catches and efficiency.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | Fixed | Logistic | |
| Regulation | TAC or ITQ | - | |
| total quota | 400000 | - | maximum yearly catch |
| litres per trawling hour | \(\sim U[0,50]\) | 5 | litres consumed per hour trawled |
| litres per unit of distance | 0 | 10 | litres consumed per distance travelled |

3.2.3 Sensitivity Analysis

We look for parameters that reduce the correlation between fuel efficiency and catches when an ITQ is in place. Here we manipulate 7 parameters: \(\epsilon\), \(K\), “hold size”, “cell width”, “gas price”, “speed” and “catchability”.

The original correlation between catches and consumption per trawling hour is -.95. If we run the optimiser to minimise correlation by only allowing 20% changes from the original parameters it achieves the minimum of -.20 correlation. It achieves this indirectly by reducing daily catches (large cell size, low speed) and reducing the amount of time spent towing (high catchability, low hold size) which lowers both the demand for buying quotas and the advantage of having better gear.
If we let the optimiser change parameters over much larger ranges it finds the obvious solution of setting gas price to 0 which removes any advantage from better mileage.

3.3 Race to Fish

3.3.1 Results

In this example we show how ITQs solve the “race to fish” phenomenon that TAC and open-access generate.

We populate the simulation with 100 homogeneous agents fishing over a single species fixed biological model. Besides adapting where to fish in this scenario agents will also adapt when to go out fishing. Each agent is willing to go out fishing only for some specific months, initially set at random. Each year he adapts this set of months by exploration-exploitation: exploring means a 5% chance of changing any preferred month to not-preferred and vice-versa, when exploiting the agent checks his best friend and copies his preferred months if that friend is doing better.
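A minimal sketch of this yearly effort decision is given below. The representation of preferences as a list of 12 booleans and the `(months, profit)` tuples for friends are hypothetical; the default `epsilon` and `mutation_rate` follow the \(\epsilon_{\text{effort}}\) and mutation rate values listed in the parameters for this example.

```python
import random


def adapt_months(own_months, own_profit, friends, epsilon=0.2,
                 mutation_rate=0.05, rng=random):
    """Yearly update of the months in which a fisher is willing to go out.

    own_months -- list of 12 booleans, True if the month is preferred
    own_profit -- the fisher's profit over the last year
    friends    -- list of (months, profit) tuples (hypothetical layout)
    """
    if rng.random() < epsilon:
        # Explore: each month flips independently with probability mutation_rate.
        return [(not month) if rng.random() < mutation_rate else month
                for month in own_months]
    if friends:
        # Exploit: copy the best friend's months if that friend is doing better.
        best_months, best_profit = max(friends, key=lambda f: f[1])
        if best_profit > own_profit:
            return list(best_months)
    return list(own_months)
```

Nothing in this rule refers to prices or to the number of competitors active in a month; any coordination has to emerge from profits alone.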

We also make the price of fish endogenous as previously described in the “Fish Markets” section. This creates a profit dynamic where if too many fishers go out in the same month they all make a loss. Practically this is a fishery version of the El Farol Bar problem (Arthur 1994). While this game theoretical problem usually requires agents to be intelligent enough to predict competitors’ choices (this is true even for the agent-based version in chapter 3 of Wilensky and Rand 2015), fishers in this simulation are able to “solve” it just by blind trial and error.

The next figure shows the number of fishers who go out every month averaged over 50 model runs after 20 simulated years of trial and error. The market parameters are set so that about 50 fishers make up the zero-profit threshold. The no-regulations scenario exhibits the classic race to fish behaviour: every month enough fishers show up to drive profits to about 0. The TAC scenario is similar except that the fishery closes after the quota is exhausted. Because no fish is sold for months between one season and the next, prices are artificially high for the first few days of the season, which explains why for the first month the number of fishers is actually even higher than in the no-regulation scenario. In the ITQ scenario instead fishers spread out so that each month only a small part of the quota is consumed. This provides a steady supply of fish throughout the year at a higher price than the ones achieved by a TAC. It is how a monopolist would spread production if it controlled the whole fishery. Interestingly our trial-and-error fishers coordinate on a quasi-monopolist behaviour without any knowledge of the world or the market or each other.

The effort distribution in terms of fishers active each month at the last year of simulation averaged over 50 simulations. TAC and open-access regulations generate race to fish while ITQ allocates effort efficiently.


The following figure shows the distribution of profits per fisher per run. TACs and open-access have fishers struggling to make a profit while ITQs reliably generate them. This is a result of spreading effort and the consequent high sale price per unit of fish caught.

The distribution of average profits made on the last year of simulation for each run made, subdivided by regulation. While ITQs generate profits on average, open-access and TACs do not.

3.3.2 Parameters Used

The parameters used for this example were discovered by the optimiser: it searched for market parameters that would achieve 0 profits with approximately 50 fishers.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | Fixed | Logistic | |
| \(K\) | 7433 | 5000 | - |
| gas price | 4.578909 | 0.01 | $ per litre of gas |
| Regulation | None or TAC or ITQ | None | |
| total quota | - or 500000 | - | maximum yearly catch |
| Market | Linear Smoothed | Fixed Price | |
| market price | - | 10 | $ per unit of fish sold |
| demand intercept | 1464 | - | fish demand intercept in $ |
| demand slope | 0.377625 | - | fish demand slope in $ |
| demand moving average | 5 | - | number of days over which to average quantity supplied |
| Effort Decision | Monthly Preference | - | |
| \(\epsilon_{\text{effort}}\) | .2 | - | probability to randomize preferred months |
| mutation rate | .05 | - | when exploring, probability of changing each month preference |

3.4 Two Species Quotas: Locational Effects

3.4.1 Results

Miller and Deacon (2014) claim that fishers can avoid by-catch in only two ways: they either fish elsewhere or they change gear. Here we generate the first behaviour; the next example deals with the second. In both cases the incentives from trading drive fishers to avoid by-catch.

Here there are two species of fish: red and blue. Red fish live on the north side of the map, blue fish live on the south. Both species are equally profitable and sell for the same fixed sale price. There are 100 fishers who have non-selective gear with catchability \(q=0.01\) for both the red and the blue species. Fishers only adapt over where to fish. We compare two different policies: the first is an ITQ where each fisher is given quotas for 4000 units of red fish and only 500 for the blue fish, while the second has the same total amount of quota shared by the whole fishery as a TAC.

Because their quotas are rare, blues are the by-catch species. We’d expect fishers to focus their effort north where blue species do not exist. Fishers aren’t told what the geographical distribution of the two species is nor are they able to perform long term planning to judge what species to target next. They proceed again by trial and error. As the next figure shows, trial and error is enough in the ITQ scenario to push fishers to put most effort into the north part of the sea. This behaviour is entirely due to the quota prices generated endogenously within the model: the rarity of blue quotas makes them valuable enough that most profits are made by people who tow north. This in turn convinces their friends to imitate them. Red fish are exploited further and further from port as their stock is depleted. The few fishers targeting the blue species, by contrast, can tow next to port as the restrictive number of quotas protects the blue fishing grounds. Because there is no personal incentive to avoid by-catch in the TAC scenario, fishing is evenly distributed north and south of the sea. Once the blue quotas are exhausted everyone’s season ends. The TAC scenario then has little fishing actually occurring and most red quotas left unused; the towing that does happen tends to be next to port as both species’ stocks remain undepleted.

The normalized number of tows for each map cell over 5 simulated years for both the scenario with ITQ and TAC policy in place. The dashed line represents the divide between blue and red species at y=24. Any cell on the dashed line and below contains only blue fish (the by-catch) while the cells strictly above the dashed line contain only red fish.

The following is how an ITQ run works in motion.

3.4.2 Parameters Used

This example uses the baseline parameters except for the geographically split biomass and the regulations.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | 2 Species Logistic - Geographically Separated | Logistic | |
| \(K\) | \(y>25\):\((5000,0)\) else \((0,5000)\) | 5000 | max units of fish per cell |
| Regulation | TAC or ITQ | None | |
| total quota | 4000 red, 500 blue | - | maximum yearly catch |

3.4.3 Sensitivity Analysis

We search here for the parameters that cause the least spatial adaptation when an ITQ is in place. We manipulate 8 parameters: \(\epsilon\), \(K\), \(m\), “hold size”, “cell width”, “speed”, “catchability” and “gas prices”. We task the optimiser to find the smallest ratio \(\frac{\text{red landings}}{\text{blue landings} + \text{red landings} }\).

If we only allow a 20% deviation from baseline parameters the share of red landings over the total never drops below 85%. The ITQ incentives are unaffected by most parameter changes. If we allow full variation then it’s possible to generate a scenario with weak boats (low catchability, low hold size), bountiful nature (high \(K\), high \(m\)) and expensive travel (high oil prices) where agents only fish just outside of port, where the fish happens to be blue. For normal parameters however the ITQ incentivises red targeting.

| Parameter | Original | No targeting |
|---|---|---|
| \(\epsilon\) | 0.2 | 0.05 |
| \(K\) | 5000 | 17314 |
| hold size | 100 | 10 |
| cell width | 10 | 1 |
| speed | 5.0 | 20 |
| catchability | 0.01 | 0.001 |
| gas price | 0.01 | 10 |

3.5 Two Species Quotas: Gear Effects

3.5.1 Results

The alternative way to avoid by-catch is to switch to more selective gear. In this example agents in an ITQ are induced to use gear less likely to consume rare quotas.

There are 2 species of fish: red and blue. They command the same fixed sale price at market. The 2 species are homogeneously distributed throughout the map so that in each map cell 50% of the biomass is red and 50% is blue. This way fishers are unable to avoid catching one species simply by selecting where to tow. We simulate two quota scenarios, a TAC and an ITQ; in both, 90% of the quotas allocated are red (4500 units per fisher) and only 10% are blue (500 units per fisher).

Besides choosing their fishing location agents in this scenario can also change their gear. Each fisher’s gear is defined over two variables: blue catchability \(q_b\) and red catchability \(q_r\). These are different for each fisher and initially randomly distributed \(\sim U[0.001,0.02]\). Each year fishers who explore generate a new random gear and start using it, while fishers who exploit copy the gear of their most successful friend.

The next two figures summarise gear evolution over 50 years of simulation when the policy is a single fishery-wide non-tradeable quota (TAC). Over time fishers move toward gear that has higher catchability for both blue and red species (but blue in particular as it drives the race to fish). Because blue quotas are rare and fishing gear non-selective, throughout the simulation blue fish are the choke species while most of the allocated red quotas remain unused each year.

The distribution of catchability for each fisher for both species on the first and last year of simulation when a TAC policy is in place.


The total number of yearly landings per year of simulation when a TAC is in place. The dashed line represents the total number of quotas available each year. Because blue quotas are rare, the blue fish is the choke species of this simulation and remains so throughout the 20 simulated years.

The next two figures show how different gear evolution is with the ITQ incentives in place. After 50 years of simulation fishers have moved towards gear that has high red catchability and low blue catchability. This allows them to catch more reds for each unit of blue, which relaxes the “choke” status of the blue species. The end result is a more efficient use of quotas due to the more selective gear used. Again this dynamic is generated purely by fishers randomly exploring and copying those that do best rather than by any long term planning algorithm.

The distribution of catchability for each fisher for both species on the first and last year of simulation when an ITQ policy is in place.

The total landings per year of simulation when an ITQ is in place. The dashed line represents the total number of quotas available each year. Because of gear evolution blue fish do not stay the choke species for long and a more efficient use of quotas emerges.

3.5.2 Parameters Used

This scenario uses a 2 species mixed biology, quota regulations and gear selection. Moreover agents explore but do not imitate their friends (with respect to locational choices). This is because fishers who imitate location of those with different gear are often worse off than they’d be had they stayed put.

| Parameter | New Value | Baseline Value | Meaning |
|---|---|---|---|
| Biology | 2 Species Logistic - Well Mixed | Logistic | |
| \(K\) | \((5000,10000)\) | 5000 | max units of fish per cell |
| Regulation | TAC or ITQ | None | |
| total quota | 4000 red, 500 blue | - | maximum yearly catch |
| Fisher | Explore-Exploit-Imitate | Explore-Exploit | |
| Gear Adaptation | Catchability | - | How agents change their gear over time |
| catchability | \(\sim U[0.001,0.02]\) | 0.01 | % biomass caught per tow hour |
| \(\epsilon_{\text{gear}}\) | 0.2 | - | exploration rate on gear |
| \(\delta_{\text{gear}}\) | 0.2 | - | exploration area size |
| gear change frequency | yearly | - | how often the fishers think about changing gear |

3.5.3 Sensitivity Analysis

We search for an ITQ case where fishers switch to gear that has higher blue catchability than red. This is not possible if we allow parameters to change by only 20% around their original values. If we allow the optimiser to search a much larger parameter space, however, we do find a corner case where agents switch to blue gear. This happens when fishers lose money on every trip. Because they are not allowed to exit the market, they realise they can cut their losses by catching their own yearly blue quota, at which point they are forced to stay home, which saves them money.
If the hold size is small and moving is expensive (high gas prices, large cells) but the boat can make many trips (high speed), then catching red is a nuisance as it does not help fill the blue quota, and agents “smartly” switch to gear that has low red catchability.
Albeit implausible, this is a good example of how resilient the agents’ adaptation is: even in this impossible situation the fishers manage to cut their losses.

| Parameter | Switch to red gear | Switch to blue gear |
|---|---|---|
| \(\epsilon\) | 0.2 | 0.05 |
| \(K\) | 5000 | 20000 |
| \(m\) | 0.001 | 0.07 |
| hold size | 100 | 10 |
| cell width | 10 | 20 |
| speed | 5.0 | 15 |
| gas price | 0.01 | 0.85 |

4 Policy Optimisation

Fishers will respond to any policy. The next step is to look for the policy that elicits the “best” response. The model takes policies as inputs and outputs the simulation. Finding the “best” policy is a black-box function optimisation looking for the highest scoring simulation.

We maximise through Bayesian optimisation (Shahriari et al. 2016). This is a model-fitting optimisation: the algorithm iteratively tries new policies and fits the outcomes to a statistical meta-model that suggests which policy to try next. In agent-based models the search for optimality is tied to the question of how many simulations to run in order to be confident in the result. The great advantage of Bayesian optimisation is that it answers both questions at once: the posterior generated by the Bayesian optimiser provides not only the expected value of running a new simulation but also its confidence interval.
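The loop described above can be sketched with NumPy alone. The sketch makes several assumptions that are not taken from the model: `simulate` is a hypothetical one-dimensional stand-in for a noisy simulation run, the meta-model is a zero-mean Gaussian process with a squared-exponential kernel, and the next policy is chosen with a simple upper-confidence-bound rule. It does not use or reproduce Spearmint's API.

```python
import numpy as np


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of policies."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)


def gp_posterior(x_seen, y_seen, x_new, noise=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_ss = rbf(x_seen, x_seen) + noise ** 2 * np.eye(len(x_seen))
    k_sn = rbf(x_seen, x_new)
    solved = np.linalg.solve(k_ss, k_sn)          # K^-1 k(x_seen, x_new)
    mean = solved.T @ y_seen
    var = 1.0 - np.sum(k_sn * solved, axis=0)     # prior variance is 1
    return mean, np.sqrt(np.maximum(var, 0.0))


def simulate(policy, rng):
    """Hypothetical noisy score of one run; NOT the fishery model."""
    return -(policy - 0.3) ** 2 + rng.normal(scale=0.05)


def optimise(n_runs=30, seed=0):
    """Run the try-fit-suggest loop and return (policy, mean, std)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 101)             # candidate policies
    x = list(rng.uniform(0.0, 1.0, size=3))       # a few random initial runs
    y = [simulate(p, rng) for p in x]
    for _ in range(n_runs - len(x)):
        mean, std = gp_posterior(np.array(x), np.array(y), grid)
        candidate = grid[np.argmax(mean + 2.0 * std)]  # upper confidence bound
        x.append(candidate)
        y.append(simulate(candidate, rng))
    mean, std = gp_posterior(np.array(x), np.array(y), grid)
    best = np.argmax(mean)
    return grid[best], mean[best], std[best]
```

The value returned by `optimise` pairs the suggested policy with both its expected score and the uncertainty around it, which is the property the text relies on when deciding whether more simulations are needed.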

In this section we provide a few examples of how to use the Bayesian optimiser to find optimal policies in our agent-based model. This is implemented through the Python package Spearmint (Snoek, Larochelle, and Adams 2012; J. M. Hernández-Lobato et al. 2015).

4.1 The Optimal MPA

In this example we use the Bayesian optimiser to find the optimal MPA in a set of simple scenarios. The results are trivial but they highlight the importance of the scoring function.

There are 300 fishers in this scenario whose only decision is where to fish. There are 2 species of fish: reds live on the north side of the map and blues live on the south. Assume we only care about the conservation of the blue species so that we score a 20 year simulation run as: \[ \text{Score} = \text{Blue Biomass}_{t=20} \] We want to find the single best MPA rectangle that achieves the highest score.

Since blue fish live only in the south of the map and since we care only about their conservation, any MPA that covers the entirety of the south will be optimal. As reds do not appear in the score function, the optimiser will be indifferent between protecting them or not. The first grid in the next figure shows the optimal MPA discovered by the Bayesian optimiser. Everywhere the blue fish live is protected, as is a large part of the northern map. This is optimal.

Keep the same scenario but change the score function to a linear combination of two indicators: the conservation of blues and the exploitation of the reds \[ \text{Score} = \text{Blue Biomass}_{t=20} + \sum_{i=1}^{20} \text{Red Landings}_{t=i}\] Again we expect the optimal rectangle to cover the south of the map where blue fish live. Protecting none of the north sea will lead to over-exploitation and immediate depletion of the red species with limited cumulative landings while protecting too much will prevent most of the stock from being harvested. The Bayesian optimiser solves the trade-off by protecting a small area of the red habitat at the centre of the map as shown in the second grid of the figure below.
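The two score functions used so far can be written as small Python helpers. This is only a sketch: the function names are hypothetical and the inputs are assumed to be yearly series with one entry per simulated year.

```python
def conservation_score(blue_biomass):
    """Score = blue biomass in the final simulated year."""
    return blue_biomass[-1]


def combined_score(blue_biomass, red_landings):
    """Score = final blue biomass + cumulative red landings."""
    return blue_biomass[-1] + sum(red_landings)
```

Because the combined score is an unweighted sum, one unit of red fish landed in any year is worth exactly as much as one unit of blue biomass left in the water at the end of the run.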

Finally we run the same optimisation over a new map where blue and red fish are homogeneously distributed in each map cell in proportion 80% red to 20% blue. Since there is no geographical separation between red and blue the trade-off between conserving blue and exploiting red cannot be avoided. The Bayesian optimiser defines the best policy as a tall but narrow rectangle near port, as shown in the last grid of the figure below. In this scenario fishers first consume both blue and red fish that are unprotected and then continue by fishing the line catching a continuous stream of red and blue fish while the stock replenishes within the MPA. The Bayesian optimiser maximises the score by balancing the stream of landings outside the MPA with the stock of biomass growing within it.

The 3 optimal MPAs discovered by the Bayesian optimiser; the red dotted line represents the demarcation line between blue and red fish with the blue living on the line and south of it.

4.2 Optimal Quotas

In this scenario we show that moving from a TAC to an ITQ by simply parcelling and distributing the global quota to each fisher is suboptimal as it doesn’t take into consideration individual incentives. For the same fishers catching the same biomass and maximising the same score the TAC and ITQ optimal quotas are different.

There are 300 fishers whose only decision is where to fish. There are 2 species of fish: reds live on the north side of the map and blues live on the south. We want to find the yearly quota allocation such that we maximise: \[ \text{Score} = \text{Blue Biomass}_{t=20} + \sum_{i=1}^{20} \text{Red Landings}_{t=i}\] We run the optimisation twice, once with a TAC policy enabled and once with the ITQ policy. Each has 2 parameters to set: yearly red and blue quotas.

The next figure shows the Bayesian optimiser output when looking for the best TAC policy. The optimal global quotas are 325,897 red units and 763,361 blue. This looks counter-intuitive since the score function wants to preserve blue and catch red. However, because TACs provide no personal incentive to target either species, fishers tend to catch whatever is close to port, resulting in well-mixed yearly landings. Whenever the red quota is exhausted the season ends, protecting blue fish as well. Moreover, since the policy score is just a sum, we are scoring catching reds just as much as saving blues: the optimiser then discovers that it is best to target about 350,000 red landings, the maximum sustainable red yield, and ignore blue conservation altogether. Blue quotas, as long as they are not binding, do not matter much. This means that the posterior is maximised around red quotas of about 350,000 and any blue quota value high enough to be non-binding.

The Bayesian posterior after running 200 simulations through the optimiser. Each black dot represents an actual run. The red plot represents the expected score of running a simulation while the blue plot shows the standard deviation associated with the expected score.

As shown in the next figure the optimal ITQ quotas differ. The optimiser generates a “one strike and you are out” rule: the optimal blue quota suggested is 0. The reason is that fishers dealing with ITQs have an incentive to avoid species whose quotas are rare. The Bayesian optimiser notices this and proposes, logically, that if you want to save blue fish you should not provide any quota for them. Fishers who by experimentation land in the south of the map and catch blue fish are unable to fish again for the entire season, and their friends learn not to imitate them.

The posterior mean and standard deviation of policy score when looking for the best quotas to give in an ITQ.

To prove that the difference is caused exclusively by fishers’ reaction to policy we run the same optimisations on a well-mixed map where red and blue fish occur homogeneously in every map cell in a 50% red and 50% blue proportion. Now agents are unable to modulate the ratio of blue to red catches just by fishing elsewhere. In this case, as shown in the next figure, optimal ITQ and TAC quotas are identical and the posterior has a Leontief-shaped optimum, since increasing the quota for one species without increasing the other has no real effect as landings will always be in a 1-1 ratio.
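The Leontief shape can be made explicit under two simplifying assumptions that are ours rather than part of the model description: fishing stops as soon as either quota is exhausted, and every unit of effort lands red and blue in exactly equal amounts. Writing \(Q_r\) and \(Q_b\) for the yearly red and blue quotas (symbols introduced here for illustration), yearly landings then satisfy \[ \text{Red Landings} = \text{Blue Landings} \le \min(Q_r, Q_b) \] so the score depends on the two quotas only through \(\min(Q_r, Q_b)\), and raising one quota above the other leaves the outcome unchanged.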

In a scenario where fishers are unable to respond to incentives the optimal quotas under TACs and ITQs are exactly the same

4.3 Multiple Objectives and Pareto Fronts

A weakness of policy optimisation in the previous sections is that we need to specify a score function. If we value multiple variables, for example conserving some species while catching others, we need to define the score as a combination of them and weight their importance. This is particularly hard if it is unclear how to weigh the variables correctly: for example how many more pounds of by-catch landed are we willing to tolerate for a 1% decrease in the Gini coefficient?

An alternative is to assign multiple objectives to the simulation and present a set of policies that represents the best possible trade-offs between them. This is multi-objective optimisation (see Deb 2001; chapter 8 of Luke 2009), which outputs a Pareto front: the set of options that are not dominated, meaning that no other option does at least as well on every objective and strictly better on at least one.
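The dominance test behind a Pareto front is short enough to write out. The sketch below assumes every objective is to be maximised and that each option is a tuple of objective values; it is only the non-dominated filter, not a multi-objective optimiser.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximised)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the made-up outcomes `[(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]` the last two are dominated by `(2, 4)`, so only the first three remain on the front.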

One way to understand the Pareto front is to consider it as the “budget constraint” for the policy-maker: it expresses what must be given up for one objective in order to improve another. The score function is the “utility function” and the Bayesian optimiser is the method to obtain the policy that maximises this utility subject to the constraint of being on the edge of the Pareto front. Knowing the Pareto front has value of its own, as it shows clearly what the trade-offs between objectives are.
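For the additive score used in the previous sections this analogy can be written out. As a sketch, assume the front can be described by a smooth, decreasing and concave function \(R = g(B)\), where \(B\) is the final blue biomass and \(R\) the cumulative red landings (the symbols \(g\), \(B\) and \(R\) are introduced here for illustration only). Maximising the score along the front gives \[ \max_{B} \; B + g(B) \quad \Rightarrow \quad g'(B^{*}) = -1 \] so the score-maximising policy sits where the front has slope \(-1\), that is, where it is tangent to the highest attainable line \(B + R = \text{constant}\).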

Because Bayesian multi-objective optimisation is still at its germinal stage (see however D. Hernández-Lobato et al. 2015), for this section we use the NSGA-II genetic algorithm (Deb et al. 2002). While genetic algorithms have a long history in agent-based models (since at least Axelrod 1986) and NSGA-II is a popular algorithm, by using it we forfeit the advantage of meta-modelling and the Bayesian approach to noise. Moreover NSGA-II is an elitist algorithm and “lucky” simulations may prevail over parameters that are better on average. We implemented this algorithm through the DEAP Python package (Fortin et al. 2012).

We replicate here the homogeneous TAC optimisation from the previous section: blue and red fish are well mixed geographically, there are 300 fishers and yearly quotas must be set for both blue and red. While in the previous section we maximised the sum of red catches and remaining blue biomass, this time we split the two into separate objectives and run the NSGA-II algorithm. The next figure shows the Pareto front generated by the optimiser. The front shows that one has to give up relatively little in terms of blue conservation to increase red catches substantially (and vice-versa). Running the Bayesian optimiser directly would have used this information implicitly but never reported it.

The Pareto set produced by 10 generations of optimisation through the NSGA-II algorithm looking for the best policy that would protect blue fish and land red fish in a world where the two aren't geographically separated. Notice that both axes are in terms of objectives, not policies. We superimposed the Bayesian optimum we found in the previous section which is, as expected, around where the Pareto front would be tangent to the highest line with a slope of negative 45 degrees.

4.4 Fairness between Heterogeneous Fleets

Pareto fronts are particularly useful when it is hard to weigh a priori the trade-off between two objectives. In this example there are 2 fleets of 50 fishers each. One fleet is highly efficient while the second one is made up of small inefficient boats. We have two objectives: maximise overall catches over 20 years and maximise small boats’ income.

The scenario follows the baseline parameters except that fishers are split into two fleets, the small one which cannot go far from port and the large one which has unlimited range, large hold size and high catchability.

| Parameter | Small Boat | Large Boat |
|---|---|---|
| hold size | 10 | 500 |
| catchability | 0.001 | 0.05 |
| range | 10 cells | infinite |

The only policy available is the creation of a single MPA rectangle that prevents large boats from fishing in it but allows small boats.
We look for the optimal rectangle with the NSGA-II algorithm. What follows is the Pareto front after 10 “generations” of policies.

The Pareto set produced by 10 generations of optimisation through the NSGA-II algorithm looking for the best MPA line that simultaneously protects small boats and maximises total catches.

The Pareto front shows how one can triple total catches by decimating small boats’ incomes. It also shows how most efficiency gain can be achieved with minimal costs to the small fishers. This can be useful to understand the trade-offs between the two objectives.

The next video shows the policy effects of the right-most solution (maximising small boats’ income at the expense of efficiency). The optimiser built an MPA around port. Small boats fish within the MPA while large boats are forced to steam out and fish far away. The MPA is large enough that some fish inside it are too far from port for small boats to catch. It is this intermediate biomass that slowly repopulates the ocean and allows for long-term catches for both large and small boats.

The solution at the opposite end of the front instead provides no protection for small boats. The optimiser builds an MPA but it is far enough from port that small boats have no chance of using it. Its purpose is to preserve enough biomass that large boats can fish the line around it when all the other virgin areas are depleted. Small boats make very little profit, all of it during the earliest stage of the fishery as they compete over biomass close to port against the larger boats. Once this reservoir is depleted small boats struggle to make any revenue.

4.5 Hybrid Policies

Rather than looking for one optimal policy (say, the optimal MPA), we could optimise the score over every combination of simple policies. Finding the optimal hybrid policy involves maximising within the Cartesian product of the parameter spaces of each sub-policy. This optimisation is computationally harder (because the space to search is larger) but it frees the researcher from having to pick the right policy class.

4.5.1 Mixed 2 Species - Hybrid Policy

Take the optimal 2-species TAC problem where blue and red fish live in the same areas and fishers cannot avoid catching both at once. We tried to maximise the score: \[ \text{Score} = \sum_{t=1}^{20} \text{Red Landings}_t + \text{Blue Biomass}_{20} \] using a global two-quota system. Its posterior is L-shaped both because agents have no way to avoid blue while targeting red (since the two live together) and because global TACs provide no incentive to target either species anyway.

Combining instead three different policies:

  • TAC quota
  • MPA
  • Fishing Season

can we obtain a better score?
This is an 8-parameter optimisation problem:

| Parameter | Meaning |
|---|---|
| Quota Red | Fishery-wide maximum of red fish landed before fishery is closed |
| Quota Blue | Fishery-wide maximum of blue fish landed before fishery is closed |
| Season Length | Maximum number of days before fishery is closed |
| \(\text{MPA}_x\) | X coordinate of top left MPA corner on map |
| \(\text{MPA}_y\) | Y coordinate of top left MPA corner on map |
| MPA height | Height (in cells) of the MPA |
| MPA width | Width (in cells) of the MPA |
| MPA duration | Number of days within a year where the MPA is active |
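The eight parameters form the Cartesian product of three sub-policy spaces. The Python sketch below makes that concrete with coarse, made-up grids; the grid values and the names `tac_space`, `season_space`, `mpa_space`, `hybrid_space` and `candidates` are hypothetical, and a real optimiser would search continuous ranges rather than enumerate a grid.

```python
from itertools import product

# Hypothetical coarse grids, for illustration only.
tac_space = {
    "quota_red": [0, 500_000, 1_000_000],
    "quota_blue": [0, 500_000, 1_000_000],
}
season_space = {"season_length": [100, 200, 366]}
mpa_space = {
    "mpa_x": [0, 25],
    "mpa_y": [0, 25],
    "mpa_height": [10, 40],
    "mpa_width": [10, 40],
    "mpa_duration": [0, 180, 365],
}


def hybrid_space(*spaces):
    """Merge sub-policy parameter spaces into one hybrid space."""
    merged = {}
    for space in spaces:
        merged.update(space)
    return merged


def candidates(space):
    """Yield every combination of parameter values as a dict."""
    names = list(space)
    for values in product(*space.values()):
        yield dict(zip(names, values))


hybrid = hybrid_space(tac_space, season_space, mpa_space)
n_hybrid = sum(1 for _ in candidates(hybrid))
```

Even on these tiny grids the hybrid space holds 9 × 3 × 48 = 1296 candidate policies, against 9, 3 and 48 for the sub-policies taken separately, which is why the hybrid search is the computationally harder one.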

While having more parameters should help in improving the score, we might intuitively assume policy hybrids are not helpful in this scenario for two reasons. First, quotas are already a way to define season length that is more flexible and accurate than day limits and second, both species are uniformly distributed throughout the map making MPAs look pointless.

Surprisingly then, the best hybrid policy scores on average 50% above the optimal TAC alone. Even more surprisingly the hybrid policy optimisation collapses into selecting an MPA-only policy. The optimal hybrid parameters found by the optimiser are:

| Parameter | Value |
|---|---|
| Quota Red | 1,221,258 |
| Quota Blue | 1,997,840 |
| Season Length | 366 |
| \(\text{MPA}_x\) | 16 |
| \(\text{MPA}_y\) | 5 |
| MPA height | 40 |
| MPA width | 40 |
| MPA duration | 283 |

The optimiser effectively turned off the TAC by setting quota levels too high to ever bind. It turned off the fixed-season policy as well by setting the season length longer than a year. The MPA is very large but there are about 80 days each year (365 − 283 = 82) when boats can fish in it. The next figure shows the position of the MPA:

The position of the temporary MPA when using the optimal hybrid policy

The reason a spatial policy works so well is that it exploits, without being explicitly coded to do so, the logistic growth of the biology cells. Without regulations boats generate fishing fronts, depleting the cells closest to port. Were the effort more dispersed, fewer cells would be emptied and recruitment would be higher for the same amount of global biomass.
A temporary MPA is a way to somewhat disperse effort. When the MPA is open agents exploit it (since it is close to port and more profitable) but since the MPA is only ever open for a few months, the effect is never enough to cause major depletion. For the rest of the year agents fish the line around the MPA which causes some local depletion but this is tempered by the days agents spend fishing within the protected area.
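A small numerical example shows why dispersing effort raises recruitment. It assumes the textbook logistic form, growth \(= r B (1 - B/K)\) in each cell, with the baseline \(K = 5000\) and a hypothetical growth rate \(r = 0.5\); the function names are ours and the model's actual biology may differ in detail.

```python
def logistic_growth(biomass, carrying_capacity=5000.0, rate=0.5):
    """Yearly recruitment of one cell under logistic growth."""
    return rate * biomass * (1.0 - biomass / carrying_capacity)


def total_recruitment(cells, **kwargs):
    """Sum recruitment over a list of per-cell biomasses."""
    return sum(logistic_growth(b, **kwargs) for b in cells)


# Same global biomass (10,000 units) arranged in two different ways.
fishing_front = [0.0, 0.0, 5000.0, 5000.0]     # cells near port emptied, far cells untouched
dispersed = [2500.0, 2500.0, 2500.0, 2500.0]   # effort spread evenly

front_growth = total_recruitment(fishing_front)
dispersed_growth = total_recruitment(dispersed)
```

Emptied cells and cells at carrying capacity both produce no recruitment, so the fishing-front layout grows by 0 units while the dispersed layout, holding the same total biomass, grows by 4 × 625 = 2500 units.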

The next figure shows red fish dynamics comparing a sample run using the optimal TAC alone versus using the optimal hybrid policy. There are always more red landings with the hybrid policy than with the optimal TAC. This, for the first 8 years, results in biomass being higher in the TAC-only scenario. However because effort is dispersed in the hybrid scenario, recruitment doubles (as most cells have some biomass spawning there). The result is that while more red fish is landed, red biomass is 1M units higher by the end of the simulation when using the hybrid policy.

Select dynamics comparing the optimal TAC and the optimal hybrid policy sample runs

The “hybrid” policy has a large advantage in terms of score against other policies as shown in the next figure.

The average score obtained by running the same mixed scenario 100 times with each policy.

4.5.2 Separated 2 Species - Hybrid Policy

We want to maximize the same score: \[ \text{Score} = \sum_{t=1}^{20} \text{Red Landings}_t + \text{Blue Biomass}_{20} \] where blue fish live in the south and red fish live in the north.
We have seen this example when studying optimal ITQs and we know the posterior is maximized by setting blue quotas to 0 and aggregate red quotas to about 350,000 units.

Can we do better by using a hybrid policy? The answer is yes, and interestingly we can do so without quotas, just by mixing an MPA and a fixed-term season. The optimal parameters found by the Bayesian optimiser are:

| Parameter | Value |
|---|---|
| Quota Red | 1,121,503 |
| Quota Blue | 1,228,868 |
| Season Length | 118 |
| \(\text{MPA}_x\) | 0 |
| \(\text{MPA}_y\) | 24 |
| MPA height | 40 |
| MPA width | 40 |
| MPA duration | 365 |

These correspond to a year-round MPA covering the entire area where blue fish live and a short season in which to fish the remaining (red) areas. The quotas are set at levels so high that they never bind.
This hybrid solution combines the obvious benefit of protecting blue fish through an MPA with the finding that effort control in days returns a better long-term yield than fixed yearly quotas.

The average score obtained by running the same scenario 100 times with each policy.

MPA-only rules do not perform as well in the geographically separated scenario as they did in mixed species simulations. The optimal policy suggested is similar, however, with a large MPA that is open for the last 100 days of the season. Boats fish the line for the remaining 265 days of the year.

| Parameter | Value |
|---|---|
| \(\text{MPA}_x\) | 1 |
| \(\text{MPA}_y\) | 10 |
| MPA height | 40 |
| MPA width | 40 |
| MPA duration | 265 |

4.5.3 Penalized Hybrid Policy

While both examples above showed the optimiser turning off some policies to achieve the best score, this is not usually the case. The policy suggested by the optimiser could be an unworkable mix of policies that would be too hard to implement in the real world. We can tweak this procedure to “penalize” hybrid policies that are too complex. There are fundamentally 4 ways to do so:

  1. Add a theoretical penalty to the score function to lower the size of the parameters. This procedure would work much like regularization in statistics.
  2. If a cost function exists for the implementation of each policy, the cost can be added to the score function.
  3. Turn the problem into a multi-objective optimisation where one of the dimensions is the complexity of the hybrid policy.
  4. Turn the problem into a multi-objective optimisation where one of the dimensions is the actual cost of implementing the hybrid policy.
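The four options can be sketched as small wrappers around an existing score. Everything here is illustrative: the function names are hypothetical, complexity is assumed to be the number of sub-policies switched on (penalising parameter magnitudes would be another choice), and the costs are assumed to be expressed in the same units as the score.

```python
def complexity(policy_flags):
    """Assumed complexity measure: number of active sub-policies."""
    return sum(bool(on) for on in policy_flags.values())


def implementation_cost(policy_flags, costs):
    """Total cost of the sub-policies that are switched on."""
    return sum(costs[name] for name, on in policy_flags.items() if on)


def penalised_score(score, policy_flags, weight):
    """Option 1: regularisation-style penalty on complexity."""
    return score - weight * complexity(policy_flags)


def cost_adjusted_score(score, policy_flags, costs):
    """Option 2: charge the implementation cost against the score."""
    return score - implementation_cost(policy_flags, costs)


def objectives(score, policy_flags, costs=None):
    """Options 3 and 4: keep the burden as a separate objective.

    Returns (score, -burden) so that both entries are maximised.
    """
    if costs is None:
        burden = complexity(policy_flags)                   # option 3
    else:
        burden = implementation_cost(policy_flags, costs)   # option 4
    return score, -burden
```

With the first two options a single-objective optimiser can be used unchanged, at the price of choosing the weight or the costs in advance; the last two leave that choice to whoever reads the resulting Pareto front.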

Bibliography

Arthur, W. Brian. 1994. “Inductive Reasoning and Bounded Rationality.” The American Economic Review 84 (2): 406–11.

Axelrod, Robert. 1986. “An Evolutionary Approach to Norms.” American Political Science Review 80 (04): 1095–1111.

Branch, Trevor A, Ray Hilborn, Alan C Haynie, Gavin Fay, Lucy Flynn, Jennifer Griffiths, Kristin N Marshall, et al. 2006. “Fleet Dynamics and Fishermen Behavior: Lessons for Fisheries Managers.” Canadian Journal of Fisheries and Aquatic Sciences 63 (7): 1647–68.

Bubeck, Sébastien, and Nicolo Cesa-Bianchi. 2012. “Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems.” arXiv Preprint arXiv:1204.5721.

Caplin, Andrew, Mark Dean, and Daniel Martin. 2011. “Search and Satisficing.” The American Economic Review 101 (7): 2899–2922.

Deb, K., A. Pratap, S. Agarwal, and T. Meyarivan. 2002. “A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II.” IEEE Transactions on Evolutionary Computation 6 (2): 182–97. doi:10.1109/4235.996017.

Deb, Kalyanmoy. 2001. Multi-Objective Optimization Using Evolutionary Algorithms. Wiley-Interscience Series in Systems and Optimization. Chichester: Wiley.

Fortin, Félix-Antoine, François-Michel De Rainville, Marc-André Gardner, Marc Parizeau, and Christian Gagné. 2012. “DEAP: Evolutionary Algorithms Made Easy.” The Journal of Machine Learning Research 13 (1): 2171–5.

Grimm, Volker, Eloy Revilla, Uta Berger, Florian Jeltsch, Wolf M. Mooij, Steven F. Railsback, Hans-Hermann Thulke, Jacob Weiner, Thorsten Wiegand, and Donald L. DeAngelis. 2005. “Pattern-Oriented Modeling of Agent-Based Complex Systems: Lessons from Ecology.” Science 310 (5750): 987–91. doi:10.1126/science.1116681.

Grüss, Arnaud, Michael J. Schirripa, David Chagaris, Michael Drexler, James Simons, Philippe Verley, Yunne-Jai Shin, Mandy Karnauskas, Ricardo Oliveros-Ramos, and Cameron H. Ainsworth. 2015. “Evaluation of the Trophic Structure of the West Florida Shelf in the 2000s Using the Ecosystem Model OSMOSE.” Journal of Marine Systems 144 (April): 30–47. doi:10.1016/j.jmarsys.2014.11.004.

Harley, Shelton J, Ransom A Myers, and Alistair Dunn. 2001. “Is Catch-Per-Unit-Effort Proportional to Abundance?” Canadian Journal of Fisheries and Aquatic Sciences 58 (9): 1760–72.

Hernández-Lobato, Daniel, José Miguel Hernández-Lobato, Amar Shah, and Ryan P. Adams. 2015. “Predictive Entropy Search for Multi-Objective Bayesian Optimization,” November. http://arxiv.org/abs/1511.05467.

Hernández-Lobato, José Miguel, Michael A. Gelbart, Matthew W. Hoffman, Ryan P. Adams, and Zoubin Ghahramani. 2015. “Predictive Entropy Search for Bayesian Optimization with Unknown Constraints,” February. http://arxiv.org/abs/1502.05312.

Kuleshov, Volodymyr, and Doina Precup. 2014. “Algorithms for Multi-Armed Bandit Problems.” arXiv Preprint arXiv:1402.6028.

Larcom, Shaun, Ferdinand Rauch, and Tim Willems. 2015. “The Benefits of Forced Experimentation: Striking Evidence from the London Underground Network.”

Luke, Sean. 2009. Essentials of Metaheuristics: A Set of Undergraduate Lecture Notes. [S.l.]: Lulu.

Luke, Sean, Gabriel Catalin Balan, Liviu Panait, Claudio Cioffi-Revilla, and Sean Paus. 2003. “MASON: A Java Multi-Agent Simulation Library.” In Proceedings of Agent 2003 Conference on Challenges in Social Simulation. Vol. 9.

Miller, John H. 1998. “Active Nonlinear Tests (ANTs) of Complex Simulation Models.” Management Science 44 (6): 820–30.

Miller, Steve J, and Robert T Deacon. 2014. “Protecting Marine Ecosystems: Prescriptive Regulation Versus Market Incentives.”

Rubinstein, Ariel. 1997. Modeling Bounded Rationality. MIT Press.

Russell, Stuart Jonathan, and Peter Norvig. 2010. Artificial Intelligence: A Modern Approach. Prentice Hall.

Shahriari, B., K. Swersky, Ziyu Wang, R.P. Adams, and N. de Freitas. 2016. “Taking the Human Out of the Loop: A Review of Bayesian Optimization.” Proceedings of the IEEE 104 (1): 148–75. doi:10.1109/JPROC.2015.2494218.

Shin, Yunne-Jai, and Philippe Cury. 2001. “Exploring Fish Community Dynamics Through Size-Dependent Trophic Interactions Using a Spatialized Individual-Based Model.” Aquatic Living Resources 14 (2): 65–80. doi:10.1016/S0990-7440(01)01106-8.

———. 2004. “Using an Individual-Based Model of Fish Assemblages to Study the Response of Size Spectra to Changes in Fishing.” Canadian Journal of Fisheries and Aquatic Sciences 61 (3): 414–31. doi:10.1139/f03-154.

Snoek, Jasper, Hugo Larochelle, and Ryan P Adams. 2012. “Practical Bayesian Optimization of Machine Learning Algorithms.” In Advances in Neural Information Processing Systems 25, edited by F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, 2951–9. Curran Associates, Inc. http://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf.

Wilensky, Uri, and William Rand. 2015. An Introduction to Agent-Based Modeling: Modeling Natural, Social, and Engineered Complex Systems with NetLogo. MIT Press.