Evolution of the Brain

600m ya - Bilaterians and Steering

Valence

Bilaterians are the only animals that have brains
Nematodes (legless wormlike creatures about the size of a grain of rice) emerge in the Ediacaran period, from 635 to 539m ya.
The nematode brain has 302 neurons (against 85 billion in the human brain today).
Initial steering works by assessing the valence (goodness or badness) of a stimulus, then moving toward things that smell good and away from things that smell bad.
There were negative- and positive-valence sensing neurons, move-forward neurons, and turning neurons.
The various sensory inputs acted as votes for going one way or another, and the first brains evolved as a mega-integration center that took in all these votes, decided which had won, and thus where to steer.
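The vote-counting idea can be sketched in a few lines of Python. This is only an illustration: the neuron names, the signed strengths, and the rule "net positive means forward, otherwise turn" are assumptions made for the example, not a model of real nematode wiring.

```python
def steer(votes):
    """Sum signed votes from sensory neurons and pick an action.

    votes maps a (hypothetical) sensory neuron name to a signed strength:
    positive for good-smelling stimuli, negative for bad-smelling ones.
    """
    total = sum(votes.values())
    # Net positive valence: keep moving forward. Otherwise: turn away.
    return "forward" if total > 0 else "turn"


print(steer({"food_smell": 0.8, "toxin_smell": -0.3}))  # forward
print(steer({"food_smell": 0.2, "toxin_smell": -0.9}))  # turn
```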

Emotions

Affect is the name for the unifying foundation of emotions
In addition to valence (good or bad), there is arousal (high or low)
A primitive good mood encourages feeding, digesting, and sexual activity
A primitive bad mood inhibits these activities
An aroused good mood leads to exploiting nearby food sources or sexual partners
An aroused bad mood leads to escaping from bad feelings - hunger, fear
The brain generates affective states using neuromodulators like dopamine and serotonin.
In the nematode, dopamine is released to create arousal and drive the search for food and serotonin is released to suppress arousal and drive the enjoyment of digesting it.
Dopamine is less about liking things and more about wanting them.
Other neuromodulators (norepinephrine, octopamine, and epinephrine) drive escape behavior by suppressing the effectiveness of serotonin, stopping an animal from being able to rest and feel safe. This is the acute stress response.
Opioids initiate recovery processes and inhibit negative valence neurons to stop and recover from stress episodes.
Chronic stress turns off arousal and motivation, activates serotonin, and leads to numbness and depression (anhedonia). It can cause learned helplessness.
Affect answers two questions:
- Do I want to expend energy by moving?
- Do I want to stay here or leave?

Associating, Predicting, Learning

The digestive organs are under the control of the nervous system
Conditional reactions are involuntary associations - associative learning happens automatically without conscious involvement.
At the same time as valence, the ability to use experience to change what is considered good and bad also emerges.
Learning in biological brains has always been continual.
Pavlov’s conditional reflexes are always strengthening (acquisition) or weakening (extinction) with each new experience. Extinction may be followed by spontaneous recovery (instantaneous) or reacquisition (faster than the first time).
Spontaneous recovery is a primitive form of long-term memory.
The credit assignment problem - which cue really predicted what happened next? Four mechanisms help:
- Eligibility traces - Pick the cue that the event immediately followed.
- Overshadowing - Pick the strongest cue.
- Latent inhibition - Pick the cue you haven’t seen before.
- Blocking - Stick with cues that already predict the event and ignore the others.
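Acquisition, extinction, and blocking can be reproduced with a very small learning rule. The sketch below uses the Rescorla-Wagner rule, a standard textbook model that these notes do not name; it is swapped in here for illustration only. The cue names, learning rate, and trial counts are assumptions. The rule does not capture spontaneous recovery or latent inhibition.

```python
def train(strengths, cues, reward, trials, rate=0.3):
    """Update association strengths in place with the Rescorla-Wagner rule."""
    for _ in range(trials):
        prediction = sum(strengths[c] for c in cues)
        error = reward - prediction  # surprise: actual minus predicted outcome
        for c in cues:
            strengths[c] += rate * error
    return strengths


# Acquisition, then extinction, of a single cue.
v = {"bell": 0.0}
train(v, ["bell"], reward=1.0, trials=20)
acquired = v["bell"]        # close to 1: the bell now predicts food
train(v, ["bell"], reward=0.0, trials=20)
extinguished = v["bell"]    # close to 0: the association has faded

# Blocking: the bell is learned first, so a light added later gains almost nothing.
w = {"bell": 0.0, "light": 0.0}
train(w, ["bell"], reward=1.0, trials=30)
train(w, ["bell", "light"], reward=1.0, trials=30)
print(acquired, extinguished, w)
```

Because the bell already predicts the food, there is almost no prediction error left for the light to absorb, which is the blocking effect.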
Learning occurs when synapses change strength or when new ones are formed or old ones are removed.
Association, prediction, and learning emerged to tweak the goodness and badness of things

500m ya - Vertebrates and Reinforcing

Reinforcement Learning

Cambrian period (Cambrian Explosion) is 540-485m ya.
The brains of all vertebrates, from fish to humans, develop in the same initial steps:
- Brains differentiate into three bulbs - a forebrain, midbrain, and hindbrain
- The forebrain unfolds into two subsystems:
  - The cortex and the basal ganglia
  - The thalamus and the hypothalamus
Animals learn by first performing random exploratory actions and then adjusting future actions based on valence outcomes.
Reinforcement learning is the ability to learn arbitrary sequences of actions through trial and error with reinforcing and punishing depending on the valence outcomes.
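A minimal sketch of trial-and-error learning, reduced to a single choice rather than a sequence of actions. The action names, the fixed valence outcomes, and the exploration and learning rates are all hypothetical values chosen for the example.

```python
import random


def learn_by_trial_and_error(outcomes, trials=500, explore=0.2, rate=0.5, seed=0):
    """Estimate the value of each action from experienced valence outcomes."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in outcomes}
    for _ in range(trials):
        if rng.random() < explore:
            action = rng.choice(list(values))      # random exploratory action
        else:
            action = max(values, key=values.get)   # repeat what has worked so far
        # Nudge the estimate toward the valence outcome just experienced.
        values[action] += rate * (outcomes[action] - values[action])
    return values


values = learn_by_trial_and_error({"left": -1.0, "stay": 0.0, "right": 1.0})
best = max(values, key=values.get)
print(best, values)
```

Actions followed by positive valence are reinforced and chosen more often; actions followed by negative valence are punished and avoided.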

Temporal Difference Learning

Most drugs of abuse - alcohol, cocaine, nicotine - work by triggering the release of dopamine. All vertebrates, from fish to rats to monkeys to humans, are susceptible to becoming addicted to such dopamine-enhancing chemicals.
Discounting drives AI systems (or animals) to choose actions that lead to rewards sooner rather than later.
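Discounting is usually implemented by multiplying a reward by a factor between 0 and 1 for every step of delay. The sketch below assumes a discount factor of 0.9 and a reward of 10; both numbers are illustrative.

```python
def discounted_value(reward, delay, gamma=0.9):
    """Present value of a reward that arrives `delay` steps in the future."""
    return reward * gamma ** delay


sooner = discounted_value(10, delay=1)  # about 9.0
later = discounted_value(10, delay=5)   # about 5.9
print(sooner, later)
```

The same reward is worth less the longer it takes to arrive, so an agent maximizing discounted value prefers the action that pays off sooner.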
Dopamine is not a signal for reward but for reinforcement. Reinforcement and reward must be decoupled for reinforcement learning to work. To solve the temporal credit assignment problem, brains must reinforce behaviors based on changes in predicted future rewards, not actual rewards. This is why animals get addicted to dopamine-releasing behaviors even when those behaviors are not pleasurable, and this is why dopamine responses quickly shift their activations to the moments when animals predict upcoming reward and away from the rewards themselves.
Dopamine was originally a signal for good things nearby - a primitive version of wanting. Evolution reshaped it into a temporal difference learning signal, from a fuzzy average of recently detected food to an ever fluctuating, precisely measured, and meticulously computed predicted-future-reward signal.
Disappointment and relief are emergent properties of a brain designed to learn by predicting future rewards.
The omission of an expected punishment is itself reinforcing; it is relieving. And the omission of an expected reward is itself punishing; it is disappointing.
Vertebrates are unique in the precision with which they can measure time.
Temporal difference learning, disappointment, relief, and the perception of time are all related.
The basal ganglia is in a perpetual state of gating and ungating specific actions, operating as a global puppeteer of an animal's behavior.
- It learns to repeat actions that maximize dopamine release.
- It is a system designed to repeat behaviors that lead to reinforcement and inhibit behaviors that lead to punishment.
The hypothalamus houses valence neurons inherited from the valence sensory apparatus of ancestral bilaterians.
- It is, in principle, a more sophisticated version of the steering brain of early bilaterians; it reduces external stimuli to good and bad and triggers reflexive responses to each.
- When the hypothalamus is happy, it floods the basal ganglia with dopamine, and when it is upset, it deprives the basal ganglia of dopamine.
- The basal ganglia is a student, always trying to satisfy its vague but stern hypothalamic judge.
- The hypothalamus is the decider of actual rewards.
How is dopamine transformed from a valence signal for actual rewards to a temporal difference signal for changes in predicted future reward? The basal ganglia student initially learns solely from the hypothalamic judge, but over time learns to judge itself, knowing when it makes a mistake before the hypothalamus gives any feedback.
This is why dopamine neurons initially respond when rewards are delivered, but over time shift their activation toward predictive cues.
This is also why receiving a reward that you knew you were going to receive doesn't trigger dopamine release; predictions from the basal ganglia cancel out the excitement from the hypothalamus.
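The shift of the dopamine response from the reward to the predictive cue falls out of a basic temporal difference (TD) update. The sketch below is a toy TD(0) model under stated assumptions: a trial is a cue followed by a few time steps and then a reward, the cue arrives unpredictably (so the prediction before the cue is taken to be 0), and the TD error stands in for the dopamine signal. Function and parameter names are hypothetical.

```python
def run_trials(n_trials, n_steps=3, rate=0.2, gamma=1.0, reward=1.0):
    """Return (error_at_cue, error_at_reward) for each trial."""
    values = [0.0] * n_steps  # predicted future reward at each step after the cue
    history = []
    for _ in range(n_trials):
        # Cue onset is unpredictable, so the pre-cue prediction is assumed to be 0.
        cue_error = gamma * values[0] - 0.0
        for t in range(n_steps):
            is_last = t == n_steps - 1
            r = reward if is_last else 0.0
            next_value = 0.0 if is_last else values[t + 1]
            # TD error: change in predicted future reward, plus any actual reward.
            error = r + gamma * next_value - values[t]
            values[t] += rate * error
            if is_last:
                reward_error = error
        history.append((cue_error, reward_error))
    return history


history = run_trials(200)
first, last = history[0], history[-1]
print(first)  # (0.0, 1.0): the signal fires at the reward, not at the cue
print(last)   # cue error near 1, reward error near 0
```

Early on, the error appears only when the reward is delivered. After training, the fully predicted reward produces almost no error, while the cue, which signals a jump in predicted future reward, produces nearly all of it.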

Pattern Recognition

Sometime around 500m ya, our ancestors evolved pattern recognition to remember the smell of that dangerous arthropod, to remember the sight of its eyes peeking through the sand.
Early vertebrates could recognize things using brain structures that decoded patterns of neurons. Within a small mosaic of only fifty types of olfactory neurons lived a universe of different patterns that could be recognized. Fifty cells can represent over one hundred trillion patterns.
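The size of that universe can be checked with one line of arithmetic, assuming the simplest possible code in which each of the fifty neuron types is either active or silent (real neurons fire at graded rates, so this is a lower bound on the assumption, not a measurement).

```python
n_types = 50
patterns = 2 ** n_types  # each type is either on or off
print(patterns)  # 1125899906842624
```

That is roughly 1.1 quadrillion on/off combinations, comfortably over one hundred trillion.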
Patterns can be similar but not the same.
Your iPhone needs to be able to tell the difference between your face and other people's faces, despite the fact that faces have overlapping features (discrimination). It must also be able to identify your face despite changes in shading, angle, facial hair, and more (generalization).
In the first cortex evolved a new morphology of neuron:
- Pyramidal neurons have hundreds of dendrites and receive inputs across thousands of synapses.
- These were the first neurons designed for the purpose of recognizing patterns.
- A small number of olfactory neurons connect to a much larger number of cortical neurons. They connect sparsely - a given olfactory cell will connect to only a subset of these cortical cells. This leads to pattern separation, decorrelation, or orthogonalization.
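A rough sketch of sparse expansion is given here. Each cortical cell samples a few random olfactory inputs, and only the most strongly driven cells stay active. The winner-take-all step, the cell counts, and the two example odors are assumptions added for illustration; the notes above only describe the sparse, expanding connectivity.

```python
import random


def make_wiring(n_inputs, n_cortical, fan_in, seed=0):
    """Give each cortical cell a small random subset of input cells."""
    rng = random.Random(seed)
    return [rng.sample(range(n_inputs), fan_in) for _ in range(n_cortical)]


def encode(active_inputs, wiring, k):
    """Return the set of the k cortical cells receiving the most active inputs."""
    drive = [sum(1 for i in sources if i in active_inputs) for sources in wiring]
    ranked = sorted(range(len(wiring)), key=lambda c: (-drive[c], c))
    return set(ranked[:k])


def overlap(a, b):
    """Shared fraction of two sets of active cells (Jaccard index)."""
    return len(a & b) / len(a | b)


wiring = make_wiring(n_inputs=50, n_cortical=1000, fan_in=5)
odor_a = set(range(0, 20))   # two hypothetical odors sharing 15 of 20 inputs
odor_b = set(range(5, 25))
code_a = encode(odor_a, wiring, k=50)
code_b = encode(odor_b, wiring, k=50)
print("input overlap:", overlap(odor_a, odor_b))
print("cortical overlap:", overlap(code_a, code_b))
```

With settings like these the cortical codes should usually overlap less than the input patterns do, which is the pattern separation effect, though the exact figures depend on the chosen parameters and random seed.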
The problem of "catastrophic forgetting" is why we don't let AI systems learn things sequentially; they learn everything at once and then stop learning. But even early bilaterians learned continually.
The retina contains over 100m neurons of five different types. The visual cortex decodes and memorizes the visual pattern the same way the olfactory cortex decodes and memorizes smell patterns.
But the same visual object can activate different patterns depending on its rotation, distance, or location in your visual field. This creates the invariance problem - how to recognize a pattern as the same despite large variance in its inputs.
The same issue arises with words spoken by a child and an adult or in different accents. Your brain is somehow recognizing a common pattern despite huge variances in the sensory input.
Visual (and audio) processing in mammals is hierarchical:
- The lateral geniculate nucleus (LGN) is a small, ovoid, ventral projection of the thalamus where the thalamus connects with the optic nerve.
- The V1 area decomposes the complex patterns of visual input into simpler features like lines and edges
- V1 sends its output to V2, which then sends information to an area called V4, both of which are sensitive to more complex shapes and objects
- V4 sends its output to the inferior temporal gyrus or IT cortex, which is sensitive to complex whole objects like specific faces.
In the predatory arms race of the Cambrian, evolution shifted from arming animals with new sensory neurons for detecting specific things to general mechanisms for recognizing anything. This drove the evolution of new sensory organs, and each incremental improvement in the brain's pattern recognition expanded the benefits to be gained from more detailed sensory organs:
- Noses evolved to detect chemicals
- Inner ears evolved to detect frequencies of sound
- Eyes evolved to detect sights
In the brain, the result was the vertebrate cortex, which somehow recognizes patterns without supervision, accurately discriminates overlapping patterns, and generalizes patterns to new experiences. It somehow continually learns patterns without catastrophic forgetting and despite large variances in its inputs.
Pattern recognition and reinforcement learning evolved simultaneously. The greater the brain's ability to learn arbitrary actions in response to things in the world, the greater the benefit to be gained from recognizing more things in the world. The more unique objects and places a brain can recognize, the more unique actions it can learn to take.
And so the cortex, basal ganglia, and sensory organs evolved together, all emerging from the same machinations of reinforcement learning.

Curiosity

It was early vertebrates that first became curious.
In vertebrates, surprise itself triggers the release of dopamine, even if there is no "real" reward.
To make animals curious, we evolved to find surprising and novel things reinforcing, which drives us to pursue and explore them. Even if the reward of an activity is negative, if it is novel, we might pursue it anyway.
Games of gambling are designed to exploit this: with a 48% chance of winning, a win is likely enough to seem possible but uncertain enough to be surprising.
Social networks also hack into our 500-million-year-old preference for surprise by showing us surprising things, but only sometimes.
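The numbers behind a 48% game can be worked out directly. The sketch assumes an even-money bet (win or lose one unit of stake), which the notes do not specify, and uses Shannon entropy as a stand-in measure of how surprising each outcome is.

```python
import math


def expected_gain(p_win, stake=1.0):
    """Average gain per play on an even-money bet (an assumed payout rule)."""
    return p_win * stake - (1 - p_win) * stake


def outcome_uncertainty(p_win):
    """Shannon entropy of the win/lose outcome, in bits (1.0 is the maximum)."""
    p_lose = 1 - p_win
    return -(p_win * math.log2(p_win) + p_lose * math.log2(p_lose))


gain = expected_gain(0.48)              # about -0.04 per unit staked
uncertainty = outcome_uncertainty(0.48)  # just under 1 bit
print(gain, uncertainty)
```

The reward is negative on average, yet the outcome is almost as unpredictable as a fair coin flip, so every play stays surprising.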
Curiosity is a requirement for reinforcement learning to work. For the first time, learning became, in and of itself, an extremely valuable activity.

Modeling the World

Your brain has built a spatial map of your home so that you can make your way around (with a few stubbed toes) in the dark.
All vertebrates can learn spatial maps, but simple bilaterians cannot.
The vestibular sense feels the direction of head movement through "head-direction neurons".
The cortex of early vertebrates had three subareas:
- Lateral cortex - Recognizes smells and will evolve into the olfactory cortex in early mammals.
- Ventral cortex - Recognizes patterns of sights and sounds and will evolve into the amygdala.
- Medial cortex - Visual, vestibular, and head direction signals propagate here, where they are mixed together and converted into a spatial map. Later became the hippocampus. Contains place cells that activate when an animal is in a specific location.
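One way to picture how heading and movement signals could become a map is dead reckoning: add up each movement along its heading to track position, then let a "place cell" fire when that position falls inside its field. This is a deliberately simplified sketch; the step format, the field radius, and both function names are assumptions, and it ignores the visual input mentioned above.

```python
import math


def integrate_path(steps, start=(0.0, 0.0)):
    """Track position from a list of (heading in degrees, distance) steps."""
    x, y = start
    for heading_deg, distance in steps:
        x += distance * math.cos(math.radians(heading_deg))
        y += distance * math.sin(math.radians(heading_deg))
    return x, y


def place_cell_active(position, field_center, radius=0.5):
    """A toy place cell: active only when the animal is inside its field."""
    return math.dist(position, field_center) <= radius


position = integrate_path([(0, 2.0), (90, 1.0)])  # 2 units east, then 1 unit north
print(position)
print(place_cell_active(position, (2.0, 1.0)))  # True
print(place_cell_active(position, (0.0, 0.0)))  # False
```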
This was the first time that an organism could recognize where it was.
The first time a brain differentiated the self from the world.
The first time that a brain constructed an internal model - a representation of the external world.

200m ya - Mammals and Simulation

The Devonian and Permian Eras

420m to 375m ya is called the Devonian period - arthropods walked out of the oceans to populate the land, plants first evolved leaves for better absorption of sunlight and seeds for spreading, and trees first developed.
During the Late Devonian Extinction, carbon dioxide levels plummeted and the climate cooled, freezing the oceans.
Reptiles and therapsids evolved, with the therapsids (our ancestors) developing warm-bloodedness - the ability to use energy to generate their own internal heat. During the Permian period (300-250m ya) they became the most successful land animals.
During the Permian-Triassic mass extinction event, 250m ya, over the course of 5-10m years, 96% of all marine life and 70% of land life died.
The reptiles became dominant while the bigger therapsids died out, and only small therapsids, like the cynodont, survived.
These burrowing or arboreal four-inch mammals, like birds or squirrels, had one advantage: they could make the first move.
The neocortex evolved to give these mammals the ability to simulate actions before they occurred.
Early vertebrates learned by doing, while these mammals could learn before doing, by imagining.
There were two requirements needed for simulation to evolve:
- Far-ranging vision - To see much of your surroundings and simulate various paths through them
- Warm-bloodedness - To let mammal brains operate much faster than fish or reptile brains.
The ventral cortex of the vertebrates became the associative amygdala in mammals - learning to recognize patterns that were predictive of valence outcomes.

Generative Models and the Neocortex

Imagination

Model-Based Reinforcement Learning

The Secret to Dishwashing Robots

15m ya - Primates and Metathinking

Political Savvy

Modeling Other Minds

Tools, Teaching, and Imitation

Modeling the Future

100k ya - Humans and Speech

The Search for Human Uniqueness

Language in the Brain

The Perfect Storm

ChatGPT