The Little Fish
Emergence of intelligence through evolution
by Jun Zhuang
2024.08.01
This is a hobby project that explores how the intelligence of a neural network (i.e. a primitive brain) might be optimized (i.e. evolved) across generations without using backpropagation or real-time feedback.
It simulates a situation in which a "fish" with a primitive "brain" lives in a habitat with land, water, and food. It explores whether intelligence can emerge from pure evolution (random mutation, overbreeding, and natural selection), without giving the individuals any teaching signals of the kind usually used in supervised learning and reinforcement learning.
All simulations were run on silicon; no actual fish were used in this project. All the code is in this GitHub repository: littlefish.
Let's say there was a primitive fish living on the Earth hundreds of millions of years ago. Its habitat consisted of land and water, with food scattered around in the water. We can squeeze this world into a 2D map (green represents land, blue means water, dark red means food, and yellow means the fish itself). The fish occupies a 3x3 space. Each food item is 1x1, and food exists only in the water.
Basic terrain for the simulation. A fish (3x3, yellow) lives in a 2D environment (64 x 64) with water (blue) and land (green). Food (1x1, red) is scattered around in the water.
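As a rough illustration, a habitat like this could be generated as follows. This is a minimal sketch, not the repository's actual code; the land fraction, food count, and function name are all made up.

```python
import numpy as np

def random_terrain(size=64, land_fraction=0.2, n_food=50, seed=0):
    """Illustrative habitat: 1 = land, 0 = water, with food scattered
    only on water pixels. (The real project generates contiguous land
    patches; this sketch just scatters land pixels at random.)"""
    rng = np.random.default_rng(seed)
    terrain = (rng.random((size, size)) < land_fraction).astype(int)
    water_pixels = np.argwhere(terrain == 0)          # food must be in water
    chosen = rng.choice(len(water_pixels), size=n_food, replace=False)
    food_positions = [tuple(water_pixels[i]) for i in chosen]
    return terrain, food_positions
```

Each food position is guaranteed to land on a water pixel, matching the rule that food only exists in the water.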
The fish's task is simple:
Stay in the water, avoiding land.
Catch as much food as possible (to make its life easier, the food does not move).
To achieve these goals, the fish needs some properties and abilities.
Health point (HP): The fish starts with a certain amount of health points, some sort of energy reserve, let's say 100. If it does nothing, the HP decays over time, say 0.01 per time point. This gives a default lifespan of 10000 time points.
If the fish hits land, its HP decays 100x faster, at 1 per time point.
If the fish overlaps with a food item, it eats the food and gains 10 HP.
If the fish's HP reaches 0, it dies and the simulation ends.
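The HP rules above boil down to a tiny update function. This is a minimal sketch; the constant and function names are illustrative, not the repository's actual identifiers.

```python
# Constants from the rules above (values as stated in the text).
HP_DECAY = 0.01      # baseline decay per time point
LAND_PENALTY = 100   # decay multiplier while on land
FOOD_GAIN = 10       # HP gained per food eaten

def update_hp(hp, on_land, ate_food):
    """Return the fish's HP after one time point."""
    hp -= HP_DECAY * (LAND_PENALTY if on_land else 1)
    if ate_food:
        hp += FOOD_GAIN
    return hp

# A fish that does nothing lasts 100 / 0.01 = 10000 time points.
```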
Eye: To survive longer, the fish obviously needs to sense its environment. For this, it has 8 eyes (2 sets of 4 eyes) around its body, each looking in a different direction (this is quite unrealistic, but it simplifies the simulation greatly: there is no head direction, and the fish never needs to turn). One set of 4 eyes senses the terrain while the other set senses food. Each eye has an "eyesight" of 2 pixels (figure below).
The receptive field of each of the four eyes of the fish. Each eye (dark red) is facing one direction from the fish's body (yellow) with a receptive field marked as the shaded red pixels. The darkness of the receptive field represents the weight of the given pixel. The weighted sum will be the eye's input.
A fish has two sets of these four eyes, one set sensing the terrain (water or land) and the other sensing the food.
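The weighted-sum input described in the caption above could be computed roughly as follows. This is a sketch under assumptions: the function name, the example receptive-field offsets, and the weights (nearer pixel weighted more heavily) are mine, not the project's.

```python
import numpy as np

def eye_input(value_map, body_center, offsets, weights):
    """Weighted sum over an eye's receptive field.

    value_map   : 2D array (e.g. 1 = land, 0 = water, or a food map)
    body_center : (row, col) of the fish's 3x3 body center
    offsets     : (drow, dcol) pixels in the receptive field
    weights     : one weight per pixel; pixels off the map contribute 0
    """
    r, c = body_center
    total = 0.0
    for (dr, dc), w in zip(offsets, weights):
        rr, cc = r + dr, c + dc
        if 0 <= rr < value_map.shape[0] and 0 <= cc < value_map.shape[1]:
            total += w * value_map[rr, cc]
    return total

# Example: a north-facing eye with 2-pixel eyesight. The body spans
# rows r-1..r+1, so its receptive field starts 2 rows above the center.
NORTH_OFFSETS = [(-2, 0), (-3, 0)]
NORTH_WEIGHTS = [1.0, 0.5]   # illustrative: nearer pixel weighted more
```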
Muscle: Even if the fish can sense its environment, that makes no difference if it cannot move. So I gave it four muscles, each of which lets it move in one of the four cardinal directions.
With the environment and the fish's sensory and motor systems prepared, it is now ready to implement the brain: the neural network that links the sensor (eye) to the motor (muscle). The brain's role is to enable the fish to move in response to sensory inputs. The goal of this simulation is to see if an evolutionary process, namely random mutation, overbreeding, and natural selection, can optimize the fish's reactions, i.e. the brain's intelligence.
Only a very simple brain/neural network was used in this simulation. It consisted of only three layers (a sensory layer, a hidden layer, and a motor layer) with full connections between adjacent layers.
A sensory layer consists of the 8 eyes (2 sets of 4 eyes, one set looks at terrain and the other looks at food).
A hidden layer with 8 neurons connecting the sensory layer and the motor layer.
A motor layer with 4 neurons, each commanding a single muscle in one cardinal direction.
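A random brain of this shape boils down to two weight matrices and a vector of baseline firing rates. The sketch below is illustrative: the variable names and the weight/baseline distributions are mine, not the project's actual initialization.

```python
import numpy as np

rng = np.random.default_rng(42)

N_EYES, N_HIDDEN, N_MUSCLES = 8, 8, 4

# Full connections between adjacent layers: 8*8 + 8*4 = 96 weights total.
w_eye_to_hidden = rng.normal(scale=0.5, size=(N_HIDDEN, N_EYES))
w_hidden_to_motor = rng.normal(scale=0.5, size=(N_MUSCLES, N_HIDDEN))

# Every neuron (eyes included) carries its own baseline firing probability.
baselines = rng.uniform(0.0, 0.1, size=N_EYES + N_HIDDEN + N_MUSCLES)
```

A "random fish" is then just one draw of these parameters; evolution will mutate them across generations.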
The neural network of a random brain.
Each circle is a neuron. The face color of each neuron shows its baseline firing rate (cyan: low; purple: high).
Each line is a connection. The color of the line shows the connection strength (blue: negative; red: positive; dark: strong; light: weak).
The sensory layer is the left column, marked "EYE". It has 8 eyes (an eye is a subclass of neuron). The letter inside each eye shows the direction it looks. A green outline means the eye senses the terrain (water or land); a red outline means the eye senses food.
The hidden layer is in the middle with 8 neurons. It receives full input connections from the sensory layer and outputs full connections to the motor layer.
The motor layer is on the right with 4 neurons. The letter inside each shows the direction of the muscle it connects to.
The following is a detailed description of the implementation of the brain. If you find it too nuanced, you can jump to the next section, "Simulation". You can pretty much grasp the principles of the brain's mechanism from the plot above.
Neurons and connections: A neuron in this simulation is more realistic than a common node in modern machine learning (e.g. a ReLU unit) but less realistic than a leaky integrate-and-fire neuron (LIF neuron, commonly used in theoretical neuroscience research).
Basically, a neuron has a baseline firing probability. Once it receives an input, with some delay, a small blip of a certain shape (either positive or negative, depending on the connection weight) and duration appears on the otherwise flat firing probability.
At any given time point, whether the neuron fires is drawn from its current firing probability. The output of any neuron is just a binary firing sequence on the continuous time line.
A neuron also has a refractory period: once it fires, it cannot fire again for a short period afterward, regardless of its inputs.
The reason for this particular implementation is that it is somewhat biologically realistic, operating on a continuous time line with temporal jitter and firing stochasticity. This is probably overkill and might be replaced by something like an RNN.
For a motor neuron (a.k.a. a muscle), each firing moves the fish one pixel in the corresponding direction (unless that would take it over the boundaries of the map).
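The firing rule described above can be sketched as a single per-time-point check. The names and the refractory length are illustrative; the actual implementation also shapes each blip over time rather than taking a precomputed sum.

```python
import random

def neuron_fires(baseline, psp_sum, t, last_fire_t, refractory=5):
    """One time-point of the probabilistic neuron described above.

    baseline    : flat baseline firing probability
    psp_sum     : sum of the positive/negative 'blips' currently active
                  from recent presynaptic firings
    refractory  : no firing allowed within this many time points of the
                  previous spike (illustrative value)
    """
    if t - last_fire_t <= refractory:
        return False                       # refractory period: never fire
    p = min(max(baseline + psp_sum, 0.0), 1.0)   # clip to a probability
    return random.random() < p             # draw the binary firing event
```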
Example of the information flow mechanism. The left plot shows a sketch of a simple network: two presynaptic neurons (one excitatory, red, and one inhibitory, blue) connect to a single postsynaptic neuron (gray). The right plot shows the internal information flow of this network. Each tick represents an activation of the neuron (analogous to an action potential). The continuous lines show the fluctuation of the firing probability of the postsynaptic neuron corresponding to the input (presynaptic neuron firing). The presynaptic neuron_0 is excitatory, meaning that after each of its firings, the firing probability of the postsynaptic neuron briefly increases (the red blips), while the presynaptic neuron_1 is inhibitory, and its firings have the opposite effect (blue blips). The sum of the red line, the blue line, and the baseline probability is the final firing probability of the postsynaptic neuron (the gray line). From these probabilities, the firings can be drawn, as shown by the gray ticks. As the plot shows, the postsynaptic neuron often fires a short high-frequency burst in response to each firing of the presynaptic neuron_0.
Now, with everything in place, the fun begins. Let's throw a random fish (one with a brain of random connection weights and random baseline firing rates) into its habitat to see what happens. Once the simulation is done, using the simulation viewer included in the repository, we can see that the fish roamed around aimlessly, got some food by luck, but then hit land. Due to the high land penalty, it died very quickly. Its total lifespan was 1224 time points, much less than the default lifespan of 10000 (i.e. how long it would have lived if it had not moved at all).
Firing histories (spike trains) of all neurons in the above simulation. Different colors represent different types of neurons. Some neurons were active all the time and some were completely silent. This is not surprising, since the neural network is random.
It's not an exaggeration to say that this fish isn't very smart. This is not surprising, because it has no reason to be smart. In the current implementation, the fish has no incentive to avoid land, find food, or live longer. The neurons in its brain fire with random baseline probabilities through random network connections.
Now we finally reach the central question of this project: can we make the fish more intelligent, not by backpropagation (supervised learning) or real-time value feedback (reinforcement learning), but by evolution?
The key ingredients of evolution are random mutation, overbreeding, and natural selection. To apply these principles to this simulation, I did the following:
Define the first generation of fish. Every generation consists of 1000 fish; the first generation consists of 1000 fish with random brains.
Simulate the lifespan of each fish in the generation in random terrains with constant food resources. Currently I only simulate one fish at a time, since the code does not handle the situation when two fish bump into each other.
Based on the lifespan of the current generation, generate the next generation using random mutation.
The 400 fish with the longest lifespans in the current generation are copied to the next generation. The remaining 600 are discarded.
Generate 600 new fish for the next generation by applying random mutations to the brains of the copied 400. The number of "children" a given "mother fish" spawns depends on how long it lived in the current simulation: the longer it lived, the more "child fish" it spawns.
Repeat steps 2 and 3.
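The selection-and-mutation step above can be sketched as a single function. This is a sketch under assumptions: `mutate` is assumed to return a randomly perturbed copy of a brain, and lifespan-weighted sampling is one plausible reading of "the longer it lived, the more children it spawns".

```python
import random

def next_generation(fish_and_lifespans, mutate, keep_fraction=0.4):
    """One evolution step: keep the top 40% by lifespan, then refill the
    population with mutated children of the survivors."""
    pop_size = len(fish_and_lifespans)
    ranked = sorted(fish_and_lifespans, key=lambda fl: fl[1], reverse=True)
    mothers = ranked[: int(pop_size * keep_fraction)]   # top 40% survive
    new_gen = [fish for fish, _ in mothers]
    # Longer-lived mothers get proportionally more children.
    weights = [lifespan for _, lifespan in mothers]
    while len(new_gen) < pop_size:
        mother, _ = random.choices(mothers, weights=weights, k=1)[0]
        new_gen.append(mutate(mother))
    return new_gen
```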
Evolution between two generations. In the (n)th generation, the 40% of the population with the longest lifespans, called the mother fish, are copied to the (n+1)th generation; the remaining 60% are discarded. From these mother fish, another 60% of the population, called the child fish, are generated by random mutation and added to the (n+1)th generation.
As you can see, each individual still has no incentive to avoid land or find food. They just roam around with the brains they were given at birth. The "learning" occurs only when a new generation is created, through the interplay between natural selection and random mutation. The hypothesis is that, over many generations, only the fittest will survive and the population lifespan will increase.
First, let's look at the distribution of lifespans in the 0th (a.k.a. random) generation. As shown in the figure below, the majority of the 0th generation had lifespans much shorter than the default lifespan, meaning they hit land at some point during the simulation.
Lifespan distribution of the 0th generation (a.k.a. the random generation). The majority did not reach the default lifespan (i.e. how long the fish would have lived if it did not move at all).
What about their descendants after several generations? Our hypothesis predicts that they should do much better than their ancestors. And they did! By the 50th generation, lifespans had increased significantly, with about half of the population living at least as long as the default lifespan. The large peak at the default lifespan in the 50th-generation distribution (orange curve below) shows that many fish had acquired the ability of "avoiding land". The two smaller peaks to its right represent fish that ate 1 or 2 pieces of food without hitting land, meaning they may have acquired both "avoiding land" and "seeking food".
Lifespan distribution of the 0th and the 50th generations. The lifespans of the 50th generation are significantly longer than those of the 0th generation. The major peak at the default lifespan shows that many fish in the 50th generation acquired the ability of "avoiding land". The two smaller peaks on the right represent fish that ate 1 or 2 pieces of food without hitting land.
Now let's take a look at the behavior of a "good" fish from the 50th generation (video below). It seemed to perform a peculiar circular dance while drifting slowly in certain directions. It captured several pieces of food along the way and avoided land. The circular movement is evident from the motor neuron firing histories (muscle spike trains in the figure below, green ticks). The muscles in all four directions fired almost synchronously with a certain rhythm, which let the fish run a 1-pixel circle and return to its original location very quickly.
At first, I thought it was some sort of network artifact, some unwanted oscillatory activity. But then I realized that this is quite a viable strategy! Because the fish always spawns in the water, the starting location is always safe. By running in a circle, the fish always returns to that safe location. Even if it hits land during the circular dance, the penalty will be small because the contact is very brief. On the other hand, if there is food around (precisely 1 pixel away), the fish can take advantage of it and gain a substantial amount of HP, way more than the potential "land penalty". Thus this is a pretty cool "low risk, high reward" strategy!
I had never thought of this strategy before implementing the simulation. And, woo-hoo, the neural network and evolution surprised me with their creativity!
Firing histories (spike trains) of all neurons of a "good fish" from the 50th generation during the simulation (shown in the video above). The neurons were far more active than those of the random fish example (in the "Simulation" section). The synchronous, oscillatory activity of the muscle neurons (green ticks) reflects a potential "low risk, high reward" strategy in such an environment (see the main text above).
Looking at individuals might be informative, but one needs to look at the population across generations to see how the "learning" happened. So, finally, let's look at the average population lifespan across generations.
As shown in the figure below, the population lifespan increased quickly in the first 10 generations and then became stable with some fluctuations, probably due to the random mutations introduced in each generation. If we look at the 400 fish with the longest lifespans in each generation (a.k.a. the "mother fish", which had at least one chance to pass themselves and their offspring to the next generation), the trend was similar but much more stable, and after 10 generations it consistently beat the default lifespan. These results show that the "learning" happened quickly and the "acquired abilities" were preserved in later generations.
Mean population lifespan over generations. The population lifespan increased quickly in the first 10 generations and then plateaued.
So, this is all about "the little fish" project. It demonstrates that a neural network can be optimized to fit its external environment by applying only evolutionary principles/algorithms. Each individual is oblivious to the goal and incapable of learning, yet the population, over generations, becomes fitter to the environment.
Clearly this is just a tiny toy example. But it shows a way to explore many more questions along this line. Some of these questions that I found interesting are listed in the next section "Afterthoughts".
Is it really a fish?
By looking at the behavior, I can hardly convince myself that it is a fish I am simulating😅. They are more like microbes roaming around in the water, running in circles. Actually, the microbe analogy may fit this project better, since it is designed to probe the origin and evolution of the nervous system; intelligent vertebrates with highly structured brains (a.k.a. fish) came much later. But, urrrr, it is too late to change ...
Co-evolution between the nervous system and everything else
This was actually my original motivation for starting this project. It is inspired by the fact that no animal's nervous system is designed from scratch; each has always evolved from a previous version. Through evolution, the nervous system changes gradually in response to evolutionary events over many generations:
the change of the sensory system: what if the fish's peripheral visual system suddenly improves and its eyesight increases? What if the number of eyes or the visual resolution increases? What if the fish's eyes can sense depth? How will the brain evolve to work with these new sensory systems?
the change of the motor system: what if the fish can move quickly? Or jump?
the change of the external environment: what if food becomes scarcer? Or starts to run away from the fish? What if the sea level rises, creating more water surface?
the change of the structure and properties of the brain itself: what if the number of layers increases? Or the number of neurons in each layer increases? What if the brain doubles itself? How will it evolve to utilize these increased capabilities?
All these changes mimic equivalent evolutionary events in real natural history. By gradually introducing such events, one can expect a complex neural network to evolve to fit a specific environment and ability set.
All these changes can be simulated in the current framework, but I feel they might grow into massive projects beyond my current bandwidth.
The fish has no long-term memory.
Due to the absence of recurrent connectivity, the fish's memory relies solely on the duration of the postsynaptic activities traveling through the network, which is very short. Thus, when the fish moves to a new location, it forgets what the previous location was like. This is probably one of the reasons it often displays "back and forth" movement patterns. It will be interesting to see whether, if we add recurrent connections to the brain, the fish will acquire "memory" and start to swim continuously over generations.
Learning and reaction within an individual's life.
There are many aspects of this simulation that are unrealistic. One might find two important aspects missing: learning and real-time reaction within each individual.
For individual learning, however, I would argue that the neural connections in this simulation are more equivalent to the long-range connections between different brain structures, which are largely defined by genes and not shaped by adaptive learning. Plus, connection weights obtained by learning usually do not pass to the next generation (i.e. no inheritance of acquired traits), unless the species has discovered "teaching", which comes much later and requires a very complex brain.
On the other hand, the real-time reactions of individuals might be worth simulating. If a fish hits land, I can hardly imagine that it does not feel the shortage of breath and approaching death. Actually, it would be quite simple to implement such a "proprioceptive" sensory system: one just needs to use the fish's current HP as an input to some additional sensory neurons.
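Such a "proprioceptive" HP sensor could be as simple as the sketch below. The function name and the normalization to [0, 1] are my assumptions; any monotonic mapping of HP to a sensory input would do.

```python
def hp_sensor_input(hp, max_hp=100.0):
    """Feed the fish's own HP, normalized to [0, 1], into an extra
    sensory neuron (illustrative normalization scheme)."""
    return max(0.0, min(hp / max_hp, 1.0))
```

This value would then enter the brain exactly like an eye's input, letting the network react to the fish's own condition.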
Can communication ability be evolved?
The fish in these simulations are lonely: in each simulation there is only one fish. What if there were more? Would they talk to each other? For them to communicate, they would need to sense each other. What if they evolved another set of eyes dedicated to sensing other fish? What if they could change the color of their body to send out different messages? To me, all of these are fascinating questions. Just throw in those conditions and abilities, without learning rules, and we might be surprised by what emerges.
Can this type of simulation be implemented with modern machine learning infrastructure?
Currently the implementation of the simulation is not efficient at all. As mentioned above, replacing the neural network with a modern ML architecture such as an RNN may let us take advantage of current ML infrastructure. (But I just don't like that every neuron would have to operate on the same clock, with no temporal jitter at all; this is just my personal feeling.)
Two things work in my favor in terms of implementation efficiency, though:
Since there is no back propagation, there is no need to track derivatives and gradients.
Parallelization comes naturally. Each individual is an independent instance; running a large-scale simulation would be just like running a massive online video game.