Notes from “A Thousand Brains: A New Theory of Intelligence”

Here are some quotes and my thoughts from Jeff Hawkins’ book A Thousand Brains: A New Theory of Intelligence. I particularly enjoyed the first half of the book, where Hawkins lays out his theory of how the neocortex works. Below, the passages in quotation marks are from the book, sometimes surrounded by my own ideas.

prediction is the core of intelligence, just as prediction is the fundamental goal of physics (understanding and describing our world). this is function approximation, and it arises in ways that go beyond supervised learning tasks. “We are not aware of the vast majority of these predictions unless the input to the brain does not match. As I casually reach out to grab my coffee cup, I am not aware that my brain is predicting what each finger will feel, how heavy the cup should be, the temperature of the cup, and the sound the cup will make when I place it back on my desk. But if the cup was suddenly heavier, or cold, or squeaked, I would notice the change. We can be certain that these predictions are occurring because even a small change in any of these inputs will be noticed. But when a prediction is correct, as most will be, we won’t be aware that it ever occurred.” reinforcement learning is one paradigm of how prediction can be harnessed to create higher-level intelligent behaviors… what is beyond reinforcement learning?

in a standard neural network, a neuron combines its inputs as a weighted sum before applying its activation function, so the signal from one input does not change the weight applied to another input. this is unlike the way the brain works. what if a single neuron’s input could nonlinearly affect its response to another input?
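To make the question concrete, here is a minimal sketch comparing a plain weighted sum with a hypothetical "gated" unit in which one input multiplicatively scales the others. The names `additive_pre`, `gated_pre`, and `effect_of_bump`, the sigmoid gate, and the example numbers are all my own assumptions for illustration; nothing here comes from the book.

```python
import math

def additive_pre(x, w):
    """Pre-activation of a standard unit: a plain weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))

def gated_pre(x, w, gate_index):
    """Hypothetical variant: the input at gate_index scales all the others.

    That input is squashed to (0, 1) with a sigmoid and multiplies the
    weighted sum of the remaining inputs (its own weight is unused).
    """
    gate = 1.0 / (1.0 + math.exp(-x[gate_index]))
    rest = sum(wi * xi for i, (wi, xi) in enumerate(zip(w, x)) if i != gate_index)
    return gate * rest

def effect_of_bump(pre_fn, x, index, delta=1.0):
    """Change in pre-activation when x[index] is increased by delta."""
    bumped = list(x)
    bumped[index] += delta
    return pre_fn(bumped) - pre_fn(x)

w = [0.5, 2.0]
x_low, x_high = [1.0, -3.0], [1.0, 3.0]  # same x[0], different x[1]

add_low = effect_of_bump(lambda x: additive_pre(x, w), x_low, 0)
add_high = effect_of_bump(lambda x: additive_pre(x, w), x_high, 0)
gate_low = effect_of_bump(lambda x: gated_pre(x, w, 1), x_low, 0)
gate_high = effect_of_bump(lambda x: gated_pre(x, w, 1), x_high, 0)

print(add_low, add_high)    # equal: x[1] does not alter the effect of x[0]
print(gate_low, gate_high)  # differ: x[1] controls how much x[0] matters
```

In the additive unit, bumping the first input shifts the sum by the same amount whatever the second input is. In the gated unit, the second input decides how much that bump counts.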

“Recall that predictions made by the neocortex come in two forms. One type occurs because the world is changing around you… Predicting the next note in a melody, also known as sequence memory, is the simpler of the two problems, so we worked on it first. Sequence memory is used for a lot more than just learning melodies; it is also used in creating behaviors. For example, when I dry myself off with a towel after showering, I typically follow a nearly identical pattern of movements, which is a form of sequence memory. Sequence memory is also used in language. Recognizing a spoken word is like recognizing a short melody. The word is defined by a sequence of phonemes, whereas a melody is defined by a sequence of musical intervals. There are many more examples, but for simplicity I will stick to melodies. By deducing how neurons in a cortical column learn sequences, we hoped to discover basic principles of how neurons make predictions about everything. We worked on the melody-prediction problem for several years before we were able to deduce the solution, which had to exhibit numerous capabilities. For example, melodies often have repeating sections, such as a chorus or the da da da dum of Beethoven’s Fifth Symphony. To predict the next note, you can’t just look at the previous note or the previous five notes. The correct prediction may rely on notes that occurred a long time ago. Neurons have to figure out how much context is necessary to make the right prediction. Another requirement is that neurons have to play Name That Tune. The first few notes you hear might belong to several different melodies. The neurons have to keep track of all possible melodies consistent with what has been heard so far, until enough notes have been heard to eliminate all but one melody.”
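The "Name That Tune" requirement can be sketched as simple candidate elimination: keep every melody consistent with the notes heard so far and drop the ones that stop matching. This is only a toy under strong assumptions (melodies are matched from their first note, and the tune names and note sequences are made up). It is not the neuron-level mechanism Hawkins describes.

```python
def track_candidates(melodies, stream):
    """Return the list of surviving melody names after each heard note.

    melodies: dict mapping a name to its sequence of notes.
    stream:   the notes heard so far, in order.
    Stops early once at most one candidate remains.
    """
    candidates = set(melodies)
    history = []
    for i, note in enumerate(stream):
        candidates = {
            name for name in candidates
            if i < len(melodies[name]) and melodies[name][i] == note
        }
        history.append(sorted(candidates))
        if len(candidates) <= 1:
            break
    return history

# Hypothetical toy melodies, written as note names.
melodies = {
    "tune_a": ["C", "C", "G", "G", "A", "A", "G"],
    "tune_b": ["C", "C", "G", "G", "F", "F", "E"],
    "tune_c": ["C", "D", "E", "C"],
}

history = track_candidates(melodies, ["C", "C", "G", "G", "F"])
for step, names in enumerate(history, start=1):
    print(step, names)
```

Here the first four notes leave two tunes in play, and only the fifth note settles it. That is the point of the passage: the right prediction can depend on context that arrived several notes earlier.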

“we understood that most predictions occur inside neurons. A prediction occurs when a neuron recognizes a pattern, creates a dendrite spike, and is primed to spike earlier than other neurons. With thousands of distal synapses, each neuron can recognize hundreds of patterns that predict when the neuron should become active. Prediction is built into the fabric of the neocortex, the neuron.” this is like a more advanced attention mechanism
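A rough sketch of that idea, as I read it: a neuron owns several dendritic segments, each segment stores one pattern of presynaptic cells, and the neuron becomes "predictive" when enough of any single segment's cells are active. The class name, the threshold value, and the cell ids below are assumptions of mine for illustration, not the book's or Numenta's implementation.

```python
class Neuron:
    """Toy neuron with several dendritic segments (hypothetical sketch)."""

    def __init__(self, segments, threshold):
        # Each segment is the set of presynaptic cell ids it has learned.
        self.segments = [frozenset(s) for s in segments]
        # Minimum number of active cells on one segment to trigger a prediction.
        self.threshold = threshold

    def is_predictive(self, active_cells):
        """True if any single segment sees at least `threshold` active cells."""
        active = set(active_cells)
        return any(len(seg & active) >= self.threshold for seg in self.segments)

# One neuron that has learned two separate patterns.
neuron = Neuron(segments=[{1, 2, 3, 4}, {10, 11, 12, 13}], threshold=3)

print(neuron.is_predictive({1, 2, 3, 99}))   # three cells of the first pattern
print(neuron.is_predictive({1, 2, 10, 11}))  # activity split across patterns
print(neuron.is_predictive({10, 12, 13}))    # three cells of the second pattern
```

Activity spread across two segments does not add up. Each segment acts as its own pattern detector, which is how one neuron could recognize many distinct contexts.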

a cortical column takes input from grid cells, which give information about movement or change in the environment, and the lower-layer cells then use associative memory to match that input to maps. learning a new object is like connecting the layers. this looks like a network made up of an MLP + associative memory. what’s the loss function?

thinking involves moving through reference frames: “The method of loci uses a previously learned map, the map of your house, to store items for later recall. In the bird example, the neocortex created a new map, a map that was suited for the task of remembering birds with different necks and legs. In both examples, the process of storing items in a reference frame and recalling them via “movement” is the same. If all knowledge is stored this way, then what we commonly call thinking is actually moving through a space, through a reference frame. Your current thought, the thing that is in your head at any moment, is determined by the current location in the reference frame. As the location changes, the items stored at each location are recalled one at a time. Our thoughts are continually changing, but they are not random. What we think next depends on which direction we mentally move through a reference frame, in the same way that what we see next in a town depends on which direction we move from our current location.”
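The "thinking as movement" picture can be illustrated with a toy reference frame: items are stored at locations, and what gets recalled depends on the path taken from the current location. The 2D grid, the `walk` function, and the stored items are assumptions of mine, in the spirit of the method-of-loci example.

```python
def walk(frame, start, moves):
    """Recall the item at each location visited while moving through a frame.

    frame: dict mapping (x, y) locations to stored items.
    start: starting (x, y) location.
    moves: list of (dx, dy) steps.
    Returns the recalled items in order (None where nothing is stored).
    """
    x, y = start
    recalled = [frame.get((x, y))]
    for dx, dy in moves:
        x, y = x + dx, y + dy
        recalled.append(frame.get((x, y)))
    return recalled

# Hypothetical "house" map with a shopping list stored at three locations.
house = {(0, 0): "milk", (1, 0): "eggs", (1, 1): "bread"}

print(walk(house, (0, 0), [(1, 0), (0, 1)]))  # one route through the house
print(walk(house, (0, 0), [(0, 1)]))          # a different direction
```

The same starting point gives a different sequence of recalled items depending on the direction of movement, which is the sense in which the next thought depends on where you move in the frame.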

learning is about finding the right reference frame with which to incorporate new information about the world.

“Reference frames in the old brain learn maps of environments. Reference frames in the what columns of the neocortex learn maps of physical objects. Reference frames in the where columns of the neocortex learn maps of the space around our body. And, finally, reference frames in the non-sensory columns of the neocortex learn maps of concepts.”

“Reference frames provide the substrate for learning the structure of the world, where things are, and how they move and change. Reference frames can do this not just for the physical objects that we can directly sense, but also for objects we cannot see or feel and even for concepts that have no physical form. Your brain has 150,000 cortical columns. Each column is a learning machine. Each column learns a predictive model of its inputs by observing how they change over time. Columns don’t know what they are learning; they don’t know what their models represent. The entire enterprise and the resultant models are built on reference frames. The correct reference frame to understand how the brain works is reference frames.”

“Our proposal of reference frames in cortical columns suggests a different way of thinking about how the neocortex works. It says that all cortical columns, even in low-level sensory regions, are capable of learning and recognizing complete objects. A column that senses only a small part of an object can learn a model of the entire object by integrating its inputs over time, in the same way that you and I learn a new town by visiting one location after another. Therefore, a hierarchy of cortical regions is not strictly needed to learn models of objects. Our theory explains how a mouse, with a mostly one-level visual system, can see and recognize objects in the world.”
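A toy version of a single column recognizing a whole object by integrating its inputs over time: each object model maps locations to features, and every new sensation rules out the objects that disagree with it. This assumes the sensor's location in the object's reference frame is already known, which is a large simplification. The object names, features, and the `recognize` function are made up for illustration.

```python
def recognize(models, sensations):
    """Return the object names consistent with every (location, feature) pair.

    models:     dict mapping object name -> {location: feature}.
    sensations: sequence of (location, feature) pairs gathered over time.
    """
    candidates = set(models)
    for location, feature in sensations:
        candidates = {
            name for name in candidates
            if models[name].get(location) == feature
        }
    return sorted(candidates)

# Hypothetical object models, with locations on a small grid.
models = {
    "cup": {(0, 0): "curved", (1, 0): "handle", (0, 1): "rim"},
    "can": {(0, 0): "curved", (1, 0): "curved", (0, 1): "rim"},
}

one_touch = recognize(models, [((0, 0), "curved")])
two_touches = recognize(models, [((0, 0), "curved"), ((1, 0), "handle")])
print(one_touch)    # a single touch is ambiguous
print(two_touches)  # moving and touching again resolves it
```

One touch cannot tell the two objects apart, but a second touch at a new location can, with no hierarchy of regions involved.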

“Knowledge in the brain is distributed. Nothing we know is stored in one place, such as one cell or one column. Nor is anything stored everywhere, like in a hologram. Knowledge of something is distributed in thousands of columns, but these are a small subset of all the columns.”

“Truly intelligent machines, AGI, will learn models of the world using maplike reference frames just like the neocortex.”

look at Geoffrey Hinton’s work on capsules

“I don’t see why the path of unguided evolution is preferable over a path of our own choosing. We can be thankful that evolutionary processes got us here. But now that we are here, we have the option to use our intelligence to take control of the future”
