Symbols are the lifeblood of computation. Load the Turing tape up with ones and zeros, turn the crank, and as the symbols flow through the machine the solution is etched out by the ticker in due time. I remember my fascination upon first learning that mere zeros and ones, suitably arranged, could encode images, manage finances, represent arbitrary information, and enable seemingly endless interactive applications. In the Age of Digitization, it is tempting to think that everything could be digitized, that all is symbol, and that even human reason is little more than the operation of a program, and a seriously buggy program at that.
Let us call this development the Triumph of the Symbol, a triumph so thorough that it has conquered even the symbol itself. For originally, a symbol was meant to stand as proxy for something else, something real that it would represent. The word dog would evoke a non-digital experience of a furry, sociable mammal incapable of keeping its slobber to itself. Gradually, however, symbols have been stacked on other symbols in a sort of Jenga tower rising ever away from the ground. At this point in modernity, the ground is so far below that global society seems to suffer a form of insanity, born of our inability to adequately perceive a reality so far removed from the world of our symbols. But I digress.
Let me explain what I mean by symbols stacked on symbols. In object-oriented computer programming, the fundamental concept is that of an object which is an instance of a class. So, for example, the class might be named Dog and it might have attributes such as name, age, weight, breed, and owner. An instance would fill in the actual name, age, weight, breed, and owner for a particular dog. And indeed, such a representation might be adequate for a record in a vet’s computer system. The instance itself would be an address (location) in computer memory, and this address then serves as a symbol for the record. The record itself consists only of more symbols: the name is a sequence of letters, the age a sequence of digits, and so on. Thus the computer record for Fluffy is a symbol referring to symbols. The list of the vet’s patients is yet another symbol pointing to a list containing Fluffy. And the veterinary franchise run by this particular vet is one more symbol in the computer records of a larger corporation. Symbols upon symbols.
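To make the stacking concrete, here is a minimal sketch in Python (the class, names, and values are invented for illustration, not drawn from any real veterinary system):

```python
class Dog:
    """One patient record: every attribute is itself just a symbol."""
    def __init__(self, name, age, weight, breed, owner):
        self.name = name      # a sequence of letters
        self.age = age        # a number, itself a sequence of digits
        self.weight = weight
        self.breed = breed
        self.owner = owner

fluffy = Dog("Fluffy", 4, 12.5, "Bichon Frise", "A. Jones")
patients = [fluffy]           # another symbol, pointing to a list containing Fluffy

# The instance is just an address in memory: a symbol referring to symbols.
print(hex(id(fluffy)))        # e.g. 0x7f3a2c1d5e50 (varies per run)
print(patients[0].name)       # 'Fluffy'
```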
So far, so good, as long as one could eventually encounter the very real slobber of the very real Fluffy. But a funny thing happened on the way to the 20th century. A subset of philosophers and logicians began to advocate the idea of extensional formal models. Extensionality is a principle with its origin in set theory, a WYSIWYG principle to be specific (What You See Is What You Get). It says that if two sets contain the same things, they are equal, i.e., the same. When applied to a semantic model, extensionality means that all relationships are explicit: there is no hidden (that is, intensional) meaning beyond the relationships enumerated by the model. That is, the fact that our model of Dog above does not include slobber only means that our model is incomplete; a correct model would include not only slobber but much more.
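Stated in the notation of set theory, the axiom of extensionality reads:

$$\forall A\,\forall B\;\bigl(\forall x\,(x \in A \leftrightarrow x \in B)\rightarrow A = B\bigr)$$

If every member of A is a member of B and vice versa, then A and B are the same set: what you see (the members) really is all you get (the set).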
One of my mentors, Russian Jewish émigré and Purdue professor Victor Raskin, taught that all thought is effable, that is, that anything we can conceive of we can also say. He believed that all knowledge could be represented by a sufficiently large and complete formal model. He further believed that such a model could be coded into a computer manually if we would just do the work, and felt that research in statistical machine learning for NLP was a futile attempt to avoid doing the work.
Raskin and many others did the work — Doug Lenat at Cycorp and the Watson effort at IBM come to mind — but the complete formal knowledge model never did materialize. There are many reasons why, but what I always found most curious was that two knowledge modelers given the same task would generate models that were incommensurate and incomparable, often due to issues of scale or resolution. If there is a single true formal model of knowledge, why should it be so elusive, even for tightly circumscribed domains?
One possible explanation is that there is no single true formal model, and that we humans form models ad hoc to solve specific problems or to perform particular tasks. These models might differ from task to task in granularity or resolution, but there must be something shared among tasks, given that we so rapidly transfer information from one task to another. If that shared source is not a formal model, what might it be?
Leaving this explanation to the side, suppose that there does exist an underlying formal model of all human knowledge. Within the individual human, there are then two potential sources for this model: nature or nurture. Either the model is inherited genetically (nature) or learned individually during each lifetime (nurture). The idea that all concepts are purely genetic is absurd on the grounds that humans continually invent new concepts. At the other extreme, the idea that all concepts are learned fares no better: at least some concepts must be physically grounded in the body, and hence at least some concepts describing our physical, embodied experience must emerge from our genetic heritage. It necessarily follows that the human model of knowledge, formal or otherwise, must emerge as a hybrid of nature and nurture, grounded in embodied experience but with a facility for generating new concepts as needed to adapt to experience and to innovate.
The reasoning above implies that at least some concepts must be learned during one’s lifetime. How, exactly? More specifically, if we assume that the relationship of dog with attributes like breed and owner reflects some meaningful aspect of the human knowledge model, then how do these associations come to be recognized?
I propose that the foundational origins of symbolic reference lie in the simultaneous processing of objects both as a whole and as an assemblage of parts. In his book The Master and His Emissary, Iain McGilchrist has argued that analysis of objects in terms of parts has its origin in the left hemisphere of the brain, whereas the right hemisphere processes objects (or parts) in their broader, holistic context. I have argued for a fundamental distinction between a holistic, allocentric navigational system, centered on the hippocampus in cooperation with the frontal cortex, and a part-focused, egocentric navigational system, centered in the parietal lobe and driving the premotor cortex. Perhaps reconciling the two perspectives, we find that among humans the right frontal lobe is larger than the left frontal lobe, whereas the left parietal lobe is larger than the right, a biological fact known as the Yakovlevian twist. Thus the right hemisphere may have a predilection for allocentric processing of whole scenes, whereas the left hemisphere prefers egocentric processing of focused parts. This argument is bolstered by the evidence cited by McGilchrist that the right hemisphere is possessed of broader dendritic trees that can incorporate information from a wider context than the left hemisphere.
Processing a scene as a whole in one part of the brain and as an assemblage of parts in another means that the neural state can juxtapose a representation of the whole with a representation of its parts. Is such a juxtaposition not exactly what ontological classes seek to provide? Thus one observes a dog, but also its legs, paws, tail, head, tongue, ears, and eyes, all at once. The concept of dog captures the whole, integrated experience, whereas each part plays a role in recognizing and understanding the whole. We have here in nascent form the genesis of the symbol: the holistic sign and its part-based decomposition.
And yet what we have is radically different from a symbol. For, as de Saussure taught, the sign dog is arbitrary: there is nothing in the spelling d-o-g that necessitates the experience of a dog. The part-whole association just described is not arbitrary in this way, since the perception of dog as a whole contains dogness to some meaningful, non-arbitrary degree. It is therefore not what we would traditionally call a symbol, but something else, something more.
I call this representation of the whole a pseudosymbol. It is like a symbol, in that it stands as a representation of parts, referring to them by inference. If we see a dog, we expect to see legs, a tail, ears, eyes, and yes, a tongue that will lather us in slobber. Thus a pseudosymbol has referents and stands in their place. Unlike symbols, however, the pseudosymbol dog contains information, namely, its predictions of the partwise interactions and experiences that will emerge from sensory contact with a dog.
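As a rough sketch (the structure and field names here are mine, purely for exposition), the difference is that a bare symbol is an empty token, while a pseudosymbol carries its own predictions about its referents:

```python
from dataclasses import dataclass, field

SYMBOL = "dog"  # a bare symbol: an arbitrary token that merely points elsewhere

@dataclass
class Pseudosymbol:
    """Stands for its parts AND predicts them (illustrative only)."""
    whole: str
    expected_parts: dict = field(default_factory=dict)  # part -> predicted likelihood

dog = Pseudosymbol(
    whole="dog",
    expected_parts={"legs": 0.99, "tail": 0.97, "ears": 0.98,
                    "eyes": 0.99, "slobbery tongue": 0.90},
)
```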
Elaborating on this last point: learning in neural systems seems to depend substantially on cross-prediction, as we have seen many times. Each sense predicts what is found in another sense, so that the image of a dog can be predicted from the sound of a bark. The future is predicted from the past, a place from its contents, an object from its parts. But the relationship is reciprocal, so that the past is inferred from the present, the contents from a place, and the parts from an object. It is this reciprocal prediction that takes on an explicitly symbolic quality due to its one-to-many nature. The present implies many pasts, a place many contents, and an object many parts.
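To make reciprocal prediction concrete, here is a toy sketch, in no way a model of real neural circuitry: two "sensory" systems, linked by a hidden regularity in the world, each learn by a simple delta rule to predict the other's code from paired experience.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Hidden structure of the world: the sight and sound of a dog co-vary.
M = rng.normal(size=(dim, dim)) / np.sqrt(dim)

W_sound_to_sight = np.zeros((dim, dim))  # one system predicts the other...
W_sight_to_sound = np.zeros((dim, dim))  # ...and the prediction is reciprocal

lr = 0.05
for _ in range(2000):
    sound = rng.normal(size=dim)                      # a bark
    sight = M @ sound + 0.05 * rng.normal(size=dim)   # the dog that barked

    # Each system learns from its own cross-prediction error (delta rule).
    err_sight = sight - W_sound_to_sight @ sound
    err_sound = sound - W_sight_to_sound @ sight
    W_sound_to_sight += lr * np.outer(err_sight, sound)
    W_sight_to_sound += lr * np.outer(err_sound, sight)

# After learning, the bark predicts the image, and the image the bark.
sound = rng.normal(size=dim)
print(np.linalg.norm(M @ sound - W_sound_to_sight @ sound))  # small residual
```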
Still, every cross-prediction is a kind of symbol. The bark is a symbol for the dog in exactly the same way that the sounds d-o-g are a symbol for the barking animal. This comparison is not just an analogy; many linguists have suggested a significant role for onomatopoeia in the origin of language. If we broaden the term pseudosymbol to cover not just representations of wholes but cross-system predictive representations more generally, then we see that there might be no symbols at all in the brain, but only pseudosymbols. The distinction is that symbols are arbitrary, whereas pseudosymbols contain predictive information about their referents.
Symbols, then, only exist outside of the brain. For the brain to process symbols, it must first perceive them as pseudosymbols that contain an intention as to their meaning. To a computer, the variable names in a program are arbitrary. But as any programmer knows, meaningful variable names are critical to writing computer code that can be updated and maintained, precisely because the maintainer is a human who always perceives intention and can only suppress that intention with effort.
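A small illustration (the function names below are invented for the purpose): to the Python interpreter, the two definitions are interchangeable, but only one announces its intention to the human maintainer.

```python
def f(a, b, c):
    return a * b * c

def package_volume(length, width, height):
    """Same computation; the names carry the meaning the machine ignores."""
    return length * width * height

assert f(2, 3, 4) == package_volume(2, 3, 4)  # identical to the machine
```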
To summarize, the brain is not a symbol processor but rather an aggregate of many interacting systems that are constantly trying to predict each other’s behavior. These predictions attain a symbolic quality that is not arbitrary, to which I assign the name pseudosymbolic. Unlike symbols, pseudosymbols are learnable precisely because they are not arbitrary but are formulated as predictions.
In the next post, I will explore the part-whole relationship further, reaching towards its role in language, reminiscent of frame semantics. Please leave any comments below, and let me know if you find this interesting!