Box 2: Learning and Movement Processes
Movement is the spatial consequence of a number of different behaviors by animals. For example, a predator searching for predictable but mobile prey must change its location in space to increase the chances it will encounter a prey item. In many situations (e.g., predictable environments or regularly available prey), learning can reduce uncertainty and increase success in such spatial behaviors. We outline a selection of these below:
Search and attack in predation – When prey live in a complex and heterogeneous environment, predators may benefit by adjusting their search and attack behavior over time (Stephens et al. 2007a). When predators detect their prey through visual, auditory, or olfactory cues, they can use associative learning to refine their ‘search image’ and improve their ability to detect and attack prey (Ishii & Shimada 2010). For instance, desert ants (Cataglyphis fortis) use associative learning to connect specific odors to food, and then use this food-odor memory to assist their next foraging journey (Huber & Knaden 2018).
Escape from a predator – Spending time in familiar space allows animals to learn motor programs that enhance the efficiency of movement within that space (Stamps 1995). For instance, in response to a pursuing human, Eastern Chipmunks (Tamias striatus) within their home range (i.e., familiar space) take half as much time and travel half as far to reach a refuge compared to when outside their home range (Clarke et al. 1993).
Foraging bouts – An animal can increase its rate of energy gain while foraging by collecting information about the environment (Stephens & Krebs 1987), provided the environment changes in an (at least somewhat) predictable way. In most of these cases, animals use associative learning to connect the reward of a food source with some aspect (e.g., color, nearby landmark) of that food source (see the sketch at the end of this box). For instance, Rufous Hummingbirds learned the locations of flowers that they had emptied in a foraging trial, and in subsequent trials did not waste time visiting them again (Healy & Hurly 1995).
Navigation and migration – Migratory movements occur at spatial scales that greatly exceed the perceptual abilities of animals (mammals: Teitelbaum et al. 2015; birds: Alerstam et al. 2003). Animal migration is therefore expected to rely on memory of past experience, and learning is likely used to improve migratory performance. For instance, social learning of migration helps ungulates improve energy gain (Jesmer et al. 2018) and helps birds reduce costs (Mueller et al. 2013).
Home range or territory selection – The process of choosing the size and location of a home range or territory can be thought of as a learning process of integrating new information about the distribution of resources across a landscape (Mitchell & Powell 2004). For instance, home range size is often smaller in areas where more resources are available (e.g., Morellet et al. 2013; Viana et al. 2018). Further, increased exploration, presumably to sample new locations when familiar ones are unavailable, can result in larger home ranges (Merkle et al. 2015).
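As a concrete illustration of the associative learning invoked in the examples above (e.g., linking a food reward to a flower color during foraging), the following minimal sketch uses a standard delta-rule (Rescorla-Wagner-style) update; the cue names, learning rate, and reward values are illustrative assumptions rather than quantities from the cited studies.

    # Minimal sketch (assumptions only): associative learning as a delta-rule update,
    # in which the learned value of a cue moves toward the reward it predicts.

    def update_association(value, reward, learning_rate=0.1):
        """Move the learned value of a cue toward the observed reward."""
        return value + learning_rate * (reward - value)

    # A forager's learned values for two hypothetical cues (flower colors).
    learned_value = {"red_flower": 0.0, "white_flower": 0.0}

    # Repeated visits: red flowers are rewarding, white flowers are not.
    for _ in range(20):
        learned_value["red_flower"] = update_association(learned_value["red_flower"], reward=1.0)
        learned_value["white_flower"] = update_association(learned_value["white_flower"], reward=0.0)

    # After learning, movement is directed preferentially toward the cue with the
    # higher learned value.
    preferred_cue = max(learned_value, key=learned_value.get)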
Box 3. Robotics: Learning by Mobile Autonomous Agents
Robots that move and act autonomously, learning as they go, are confronted with tasks that parallel, in some ways, the life needs faced by moving animals. As in living animals, future decisions by a mobile autonomous robot hinge on what the learning robot experiences and encounters. Consequently, it is interesting to investigate how animal decision making about movement (Fig. 1) may be understood through concepts commonly used in robotics and control theory (Jordan & Mitchell 2015).
The basic model of an autonomous learner includes the following ingredients (illustrated in the sketch after this list):
  1. The external environment.
  2. An internal state representation (sometimes termed a world representation).
  3. A set of possible actions.
  4. A policy map that relates state representations to actions.
  5. Information acquisition, which is a consequence of actions interacting with the environment and the state representations.
  6. Value functions that quantify the benefits and consequences of actions as represented by the internal states.
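To make these ingredients concrete, the following is a minimal sketch of how they might fit together in a sense-decide-act loop; it is an illustration under our own assumptions, not an implementation taken from the robotics literature cited here, and all names and parameter values are hypothetical.

    # Sketch (assumptions only) of the six ingredients of an autonomous learner.
    import random

    ACTIONS = ["north", "south", "east", "west", "stay"]     # (3) possible actions

    class Environment:                                        # (1) external environment
        def step(self, action):
            """Apply an action; return a raw observation and a reward signal."""
            observation = {"position": (random.random(), random.random())}
            reward = random.random()
            return observation, reward

    def represent(observation):                               # (2) internal state representation
        """Prune a raw observation down to a small, relevant state."""
        x, y = observation["position"]
        return (round(x, 1), round(y, 1))                     # coarse summary of location

    policy_map = {}                                           # (4) maps states to actions
    value = {}                                                # (6) value of (state, action) pairs

    env = Environment()
    obs, _ = env.step("stay")
    for _ in range(100):                                      # (5) information acquisition loop
        state = represent(obs)
        action = policy_map.get(state, random.choice(ACTIONS))
        obs, reward = env.step(action)
        key = (state, action)
        # Update the value function, then greedily refresh the policy map.
        value[key] = value.get(key, 0.0) + 0.1 * (reward - value.get(key, 0.0))
        policy_map[state] = max(ACTIONS, key=lambda a: value.get((state, a), 0.0))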
A robot’s state representation simplifies all the information in the environment to a manageable (pruned and stylized) subset of relevant information that can eventually be linked to actions. Unsupervised state representations (Lesort et al. 2018), in which there are no performance measures, may be particularly relevant as constructs for how learning operates in animals. State representations allow the policy map to act on a dimensionally reduced decision space (the collection of states), which dramatically simplifies the task of learning individual policies.
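One way to picture an unsupervised state representation (a hedged sketch, not the approach of Lesort et al. 2018) is to cluster raw observations without any reward or performance measure: each high-dimensional observation is mapped to one of a few cluster labels, and the policy map then needs only one entry per label. The library choice (scikit-learn) and all parameter values below are assumptions made for illustration.

    # Sketch (assumptions only): an unsupervised state representation built by
    # clustering raw observations; no reward or performance measure is involved.
    import numpy as np
    from sklearn.cluster import KMeans

    # Raw observations: many high-dimensional sensor readings (here, 8 features each).
    raw_observations = np.random.rand(500, 8)

    # Learn a compressed representation: every observation maps to one of five
    # discrete states (cluster labels), a dimensionally reduced decision space.
    state_model = KMeans(n_clusters=5, n_init=10, random_state=0).fit(raw_observations)

    def represent(observation):
        """Map a raw observation onto the learned, low-dimensional state space."""
        return int(state_model.predict(observation.reshape(1, -1))[0])

    # The policy map now needs only five entries (one per discrete state) rather
    # than one entry per possible raw observation.
    policy_map = {state: "stay" for state in range(5)}
    action = policy_map[represent(np.random.rand(8))]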
A policy map structures the relationship of the robot’s state representation to possible actions. A policy map may be complete, mapping all possible states to actions, or computed on the fly. Monte Carlo tree search, as used in the Go program AlphaGo from Google DeepMind (Silver et al. 2017), determines the next move via an extensive stochastic search. As an additional complication, a robot may possess several policy maps and then select among the alternatives in a rule-based fashion.
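The distinction between a complete, precomputed policy map and one computed on the fly can be sketched as below. This is a toy illustration under our own assumptions: the on-the-fly policy uses a simple one-step lookahead on a stand-in value estimate rather than a full Monte Carlo tree search, and all function and variable names are hypothetical.

    # Sketch (assumptions only): two ways to turn a state into an action, plus
    # rule-based selection among alternative policies.
    ACTIONS = ["north", "south", "east", "west"]

    # (a) A complete policy map: an action is precomputed for every possible state.
    complete_policy = {state: "north" for state in range(5)}

    # (b) A policy computed on the fly: the action is chosen only when a state is
    # visited, here by a one-step lookahead on an estimated value (a lightweight
    # stand-in for heavier searches such as Monte Carlo tree search).
    def estimated_value(state, action):
        """Hypothetical stand-in for a learned value estimate."""
        return ((state + 1) * len(action)) % 7 / 10.0

    def policy_on_the_fly(state):
        return max(ACTIONS, key=lambda a: estimated_value(state, a))

    # (c) Several policies with rule-based selection among them, e.g. a precomputed
    # cautious policy when a predator is detected and an on-the-fly policy otherwise.
    def select_action(state, predator_detected):
        if predator_detected:
            return complete_policy[state]
        return policy_on_the_fly(state)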
Specified in this way, the basic details of a mobile autonomous robot map quite closely onto a formal conceptualization of the learning process in the context of animal movement (Fig. 1).