Neural representation of future states in the mouse cortex
The ability to predict long-term future rewards is crucial for survival. Some animals must endure long periods without any reward, and must therefore plan for and predict the future. How does the brain predict possible future states? We can conceive of the above question within a reinforcement learning...
Saved in:
Main Author: | Ahmad Suhaimi Bin Ahmad Ishak |
---|---|
Other Authors: | Hiroshi Makino |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Science::Medicine::Computer applications |
Online Access: | https://hdl.handle.net/10356/171355 |
Institution: | Nanyang Technological University |
description |
The ability to predict long-term future rewards is crucial for survival. Some animals must endure long periods without any reward, and must therefore plan for and predict the future. How does the brain predict possible future states?
We can conceive of the above question within a reinforcement learning framework. Reinforcement learning is the problem of mapping states to actions so as to maximize cumulative reward. It can be solved by deriving the value function, that is, the lasting appeal of being in each state. Two prominent families of algorithms are frequently used to estimate value: model-based and model-free methods. Model-based methods estimate value by incorporating an internal model of the environment; they require substantial computational resources but are flexible and adaptable to environmental changes. Model-free algorithms, on the other hand, estimate value directly from experience; they are computationally cheaper but inflexible and less adaptable to environmental changes.
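For reference, a standard textbook formulation (assumed here, not quoted from the thesis) writes the value function as the expected discounted return under a policy π; the Bellman equation below shows why a model-based method needs an explicit transition and reward model, while a model-free method can estimate the same quantity from sampled experience:

```latex
% Value of state s under policy \pi (standard definition, assumed here; not text from the thesis).
\[
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\middle|\, s_{0}=s\right],
\qquad 0 \le \gamma < 1 .
\]
% Bellman expectation equation: a model-based method evaluates the right-hand side with an
% explicit model (P, R); a model-free method estimates V^{\pi} directly from experience.
\[
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma V^{\pi}(s') \bigr].
\]
```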
Neural correlates of model-free algorithms have been extensively studied. The phasic activity of dopaminergic neurons in the basal ganglia has been directly linked to the reward prediction error computed in temporal difference learning algorithms. The neural signatures of model-based methods, however, remain elusive. Yet several lesion studies have suggested that the dopamine pathway implicated in model-free methods may also play a role in model-based behaviours. This has led to the hypothesis that the brain might use a third approach, namely the successor representation.
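As a minimal illustration of the temporal difference learning mentioned above (my own sketch, not code or notation from the thesis; the function name and parameters are hypothetical), the reward prediction error that phasic dopamine activity has been linked to is the quantity `delta` below:

```python
import numpy as np

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One tabular TD(0) step; V is an array of state values."""
    # Reward prediction error: how much better or worse the outcome was
    # than the current value estimate predicted.
    delta = r + gamma * V[s_next] - V[s]
    # Model-free update: nudge the value of the visited state towards the target.
    V[s] += alpha * delta
    return V, delta

# Hypothetical usage on a toy 5-state chain with reward delivered at the last state.
V = np.zeros(5)
V, delta = td0_update(V, s=3, r=1.0, s_next=4)
```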
The successor representation is a tabular, discriminative representation of discounted future state occupancies conditioned on a current state. In other words, it is a predictive map that gives the probability of visiting a future state, given the current state. The successor representation is learned through a model-free method yet has behavioural flexibility akin to a model-based approach. Recent studies have proposed neural substrates for the successor representation in the hippocampus, theorizing that hippocampal place cells do not necessarily encode place per se but rather a predictive map derived from the successor representation matrix. However, the tabular nature of the successor representation makes it unlikely that the brain adopts such an algorithm, as the cost of building the table grows exponentially with the dimensionality of the state space. Here, I propose using the γ-model, a recently developed generative and continuous analogue of the successor representation that outputs a probability map of future states given a current state and action. Unlike the successor representation, the γ-model does not rely on a tabular representation.
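To make the two objects concrete, the successor representation and the γ-model are commonly written as follows (standard formulations that I am assuming here; the thesis may use different notation):

```latex
% Successor representation: expected discounted future occupancy of s' starting from s,
% and the resulting value decomposition.
\[
M^{\pi}(s, s') = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\,
    \mathbb{1}\!\left(s_{t}=s'\right) \,\middle|\, s_{0}=s\right],
\qquad
V^{\pi}(s) = \sum_{s'} M^{\pi}(s, s')\, R(s').
\]
% \gamma-model: a generative model of the discounted state-occupancy distribution,
% conditioned on the current state and action, with no table over states.
\[
\mu_{\gamma}(s_{e} \mid s, a) = (1-\gamma) \sum_{\Delta t = 1}^{\infty}
    \gamma^{\,\Delta t - 1}\, p\!\left(s_{t+\Delta t} = s_{e} \mid s_{t}=s,\ a_{t}=a\right).
\]
```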
The γ-model amalgamates the advantages of model-based and model-free reinforcement learning algorithms. Like a model-based method, the γ-model adapts flexibly to environmental changes (e.g., a change in the reward function). Like a model-free method, the γ-model is conditioned on the policy and contains information about the long-term future. The γ-model is therefore an excellent candidate for studying how the brain predicts future states.
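A minimal sketch of the tabular case may help show both how the successor representation supports flexible value computation and why it scales poorly with the state space; this is an illustrative toy example of mine, not the model used in the thesis, and all names and parameters are hypothetical:

```python
import numpy as np

n_states, alpha, gamma = 25, 0.1, 0.95

# Tabular successor representation: one row per current state, one column per
# future state, so memory grows with n_states**2 (the scaling problem noted above).
M = np.zeros((n_states, n_states))

def sr_td_update(M, s, s_next):
    """TD-style update of the successor representation after a transition s -> s_next."""
    target = np.eye(n_states)[s] + gamma * M[s_next]
    M[s] += alpha * (target - M[s])
    return M

def value_from_sr(M, reward_per_state):
    """Flexible value readout: if the reward function changes, only this product changes."""
    return M @ reward_per_state

# Hypothetical usage: learn from one observed transition, then evaluate values.
M = sr_td_update(M, s=0, s_next=1)
rewards = np.zeros(n_states)
rewards[-1] = 1.0
V = value_from_sr(M, rewards)
```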
In Chapter 1, I review the literature on reinforcement learning and on solving reinforcement learning problems by estimating value with various methods, outline the literature on the neural correlates of reinforcement learning in the brain and the open problems in the field, and describe how I aim to address these issues with my hypothesis and specific aims. Chapter 2 gives a detailed account of the methods and materials used in this study. Chapter 3 details the findings. I first introduce a novel behavioural paradigm compatible with the γ-model in the context of reinforcement learning. Next, I show that the γ-model can be developed and used to predict future states. Finally, I attempt to identify neural correlates in the mouse cortex that correspond to the predictive features of the γ-model, and I show that spatial representations in the cortex of mice correlate with these predictive features, similar to the theories postulated for the hippocampus. I discuss the findings in Chapter 4 and conclude the thesis in Chapter 5. The conclusions of this thesis will shed light on the neural mechanism of prediction in the mammalian brain. |
school |
Lee Kong Chian School of Medicine (LKCMedicine) |
supervisor |
Hiroshi Makino |
contact |
ahmadsuhaimi1992@gmail.com, hmakino@ntu.edu.sg |
degree |
Doctor of Philosophy |
citation |
Ahmad Suhaimi Bin Ahmad Ishak (2022). Neural representation of future states in the mouse cortex. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171355 |
doi |
10.32657/10356/171355 |
rights |
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). |