In the previous post, I introduced the idea that language is understood through a kind of simulation. As I will eventually discuss, I am not the first to suggest this. At the same time, I am not yet aware of any researchers or philosophers who have discussed the implications of this claim in detail (please post in the comments if you would like to point me towards someone). Thus I think it is important to begin to clarify what such a simulatory semantics might entail and what evidence might exist to support the claim.
The term semantics refers to the study of meaning in language, derived from the Greek word sema, which translates as ‘sign’ or ‘symbol’. The word semantics has a checkered history in common English, often deployed to dismiss an argument as sophistic magnification of minute, irrelevant distinctions (that is, as drawing distinctions without a difference). But the fundamental questions of semantics are, first, what is meaning in a linguistic context and, second, how is meaning derived or extracted from linguistic utterances?
These questions have a long history involving some of the most eminent philosophers, logicians, mathematicians, scientists, and linguists. Over time, I hope to explore the relationship between the claims I am making and the various historical proposals, but for now I will simply say that the perspective of simulation I am presenting is consistent with certain schools of thought on semantics while contradicting others.
Most notably, the idea of semantics as simulation contradicts the tradition of truth semantics, in which linguistic utterances are to be interpreted as statements about the world that are either true or false. Truth semantics could only ever account for the portion of language that pertains to facts. That portion includes informational statements, such as “John goes running every morning at 5:30 a.m.” as well as questions requesting information, such as “When does John go running each morning?”
But there are wide swaths of language that are not particularly concerned with truth or falsehood. If someone tells me that “John feels angry”, to what degree is that a claim of truth? How is the speaker justified in judging John’s emotional state? If I issue a command, such as telling a child “Go to your room!”, in what sense can that utterance be true or false? If the weatherman says “it might rain tomorrow”, how are we to assess the truth or falsehood of the statement? Although logicians have developed a variety of tools to deal with such scenarios, it seems doubtful that the speakers or hearers of utterances like these are primarily concerned with their truthfulness.
The idea of simulatory semantics falls instead within the long tradition of philosophical thought that meaning in language arises from internal mental experience. Criticisms of this tradition point to the fact that such internal experiences are not shared among individuals, whereas people expect language to refer to shared, external experiences. My response to this criticism is that shared meaning arises out of a negotiation between the individual and the external world. The individual enters this negotiation with a set of goals that ultimately emanate from the individual’s basic needs and drives. The success of this negotiation depends on leveraging shared or conventional meanings of words in order to influence the mental state of the listener and thus obtain their cooperation. Because many individuals are mutually engaged in these negotiations, the process enforces shared references within a community and brings external order to our internal experiences. Thus individual experience within a social context is sufficient to bring about shared linguistic conventions that ultimately rely on our common biology and acquired culture as a shared basis for understanding.
Simulatory semantics, then, answers the question of meaning with the claim that the meaning of a linguistic utterance is a simulated internal experience. This claim is immediately problematic because it does not assign a unique meaning to each utterance. Rather, there is the meaning that the speaker wished to convey, that is, the internal experience of the speaker that inspired the utterance, and there is the meaning that the listener derived, that is, the internal experience of the listener upon interpreting the utterance. Not only may the internal experiences of the speaker and listener differ, but there may be as many potential internal experiences as there are potential listeners, and perhaps even more than that.
Yet the particular words of the utterance do constrain or limit the range of resulting internal experiences that may be constructed, particularly through the context of a shared culture. Thus the statement that George Washington was the first president of the United States is relatively unambiguous within an American cultural frame. The referent of George Washington is a person known to all Americans through history, the United States similarly refers to a shared entity, and the role of president within the United States is also commonly understood. These commonalities arise from experiences in the world that Americans share.
I chose the phrase ‘relatively unambiguous’ rather than simply ‘unambiguous’ because no statement is ever truly unambiguous. For example, there might be another individual named George Washington who merely shares a name with the first president of the United States. More importantly, the qualia of George Washington and the United States will differ widely among individuals. To some, Washington was an eminent and wise leader who refused to be made king and began a tradition of the peaceful transfer of power. To others, he was a slave-owner and an agent of racial oppression. Thus, taken as a whole, the simple statement of fact will trigger significantly different internal experiences.
Ultimately, this plurality of qualitative meanings is not a weakness of the claim that meaning is an internal experience but rather a strength. It is the job of both speaker and listener to integrate context in order to remove ambiguities and arrive at a negotiated common understanding. For example, in a history class on the foundation of the United States, test questions are unlikely to refer to the George Washington who lives next door and works at the local hardware store. In a course on ethnic studies, portrayals of George Washington as a modern-day Cincinnatus are unlikely to be well received. Part of human intelligence is the integration of context, and in fact, this plays out as a consequence of the structure of simulation machinery in the human mind. Simulatory semantics can account for divergent responses to a single utterance as a process of interpretation wherein the individual’s particular life experience is combined with the social norms of linguistic meaning.
Readers paying close attention will note that most of the foregoing statements regard internal experience without being particularly concerned with simulation. I will now be more specific in describing this simulation.
To describe simulation, it will help to break the mind down into parts. In point of biological fact, the human mind does not seem to be particularly modular: neural components are interconnected in rich, non-modular ways, and neural functions are distributed widely among brain regions. Nonetheless, I will present a layout of modules of intelligence that will explicate the concept of simulated experience. These modules should not be taken as definitive; I might, at another time, break them down in another way.
So, at a minimal level, a mind might include a basic set of functional modules:
A perception system that extracts a working model of the current environment, including episodic and static features, through the various sensory modalities, including vision, hearing, touch, taste, smell, balance, and proprioception (that is, detection of body configuration);
A drive system whose lower elements regulate primitive needs such as hunger, shelter, safety, reproduction and potentially also the particularly human drives such as curiosity and prosociality and whose higher elements encode a hierarchy of goals and values and establish a current set of priorities given the state of the body and the environment;
A behavioral system that proposes and implements action plans in response to the working model from the perceptual system and the priorities established by the drive system;
An executive system that elaborates and refines the working model of the perceptual system, coordinates with the drive system to establish a plan of action that implements the current priorities in response to the working model, and negotiates that plan of action with the behavioral system, all utilizing top-down feedback to influence perception, drives, and behaviors; and
A content-associative memory of episodes, objects, places, and agents that links experiences to each other and is used by the executive system to elaborate its working model and to establish subgoals in accordance with the priorities of the drive system.
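The five modules above can be caricatured as a toy program. Everything here (the class names, the interfaces, the ranking of needs) is my own illustrative invention, not a claim about how the brain implements these functions:

```python
from dataclasses import dataclass

# A deliberately crude caricature of the five modules described above.
# All names and interfaces are hypothetical illustrations.

@dataclass
class WorldModel:
    features: dict  # working model of the current environment

class Perception:
    def sense(self, environment: dict) -> WorldModel:
        # Extract a working model from raw sensory input.
        return WorldModel(features=dict(environment))

class Drives:
    def priorities(self, body_state: dict) -> list:
        # Rank needs so that the most pressing comes first.
        return sorted(body_state, key=body_state.get, reverse=True)

class Memory:
    def __init__(self):
        self.episodes = []  # content-associative store of past experience

    def recall(self, cue: str) -> list:
        return [e for e in self.episodes if cue in e]

class Behavior:
    def execute(self, plan: str) -> str:
        # Implement an approved action plan.
        return f"doing: {plan}"

class Executive:
    def plan(self, model: WorldModel, priorities: list, memory: Memory) -> str:
        # Address the top priority using whatever the world model offers.
        # (In a fuller sketch, memory would elaborate the working model.)
        goal = priorities[0]
        resource = next(iter(model.features))
        return f"satisfy {goal} using {resource}"

# One pass through the perceive / prioritize / plan / act loop.
perception, drives, memory = Perception(), Drives(), Memory()
executive, behavior = Executive(), Behavior()

model = perception.sense({"banana": 1})
prios = drives.priorities({"hunger": 0.9, "rest": 0.2})
plan = executive.plan(model, prios, memory)
print(behavior.execute(plan))  # doing: satisfy hunger using banana
```

The point of the sketch is only the data flow: perception feeds the executive a working model, drives supply a ranking of needs, and the behavioral system acts last.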
As should be obvious from the tight control that the executive system exerts over the behavioral system, it seems reasonable to think of the executive system as growing out of the behavioral system. If I may be permitted some leeway to speculate, there is biological support for this view. Among other roles, the motor cortex supervises muscular behavior and is located roughly at the center and top of the brain, stretching across the left and right hemispheres. The premotor cortex is an evolutionarily later portion of the brain in front of the motor cortex, shared among mammals, that apparently aids in motor planning and shares many features of the motor cortex. Finally, the prefrontal cortex sits at the very front of the brain and appears responsible for executive functions; this portion of the brain is far more developed in humans than in other mammals. Together, these three regions make up the frontal lobe.
It has been suggested elsewhere that these regions emerged as successive evolutionary outgrowths, with the premotor cortex emerging to examine the effects of actions before undertaking them and thus regulating the motor cortex, and with the prefrontal cortex emerging subsequently to allow for still more complex analysis and planning. The picture that then emerges is a three-fold system of behavioral control: the prefrontal cortex for long-range planning, the premotor cortex for short-term planning, and the motor cortex for immediate action implementation. In the modular breakdown above, the executive system would reside primarily in the prefrontal cortex, the behavioral system would reside primarily in the motor cortex, and the premotor cortex would participate in both behavioral and executive functions.
By simulation I am referring to the planning operations of the executive system, which I presume to involve an abstract generation of the consequences of various courses of action and a comparison of those consequences with the priorities arising from the drive system. ‘Abstract generation of consequences’ is the essence of simulation. In order to determine these consequences, the executive system must play forward the working model from the perceptual system based on the contents of the content-associative memory. The capacity to project future consequences based on memory constitutes a kind of world model. The executive system must build and maintain this model so that its simulations are accurate, and that accuracy is tested through the action of the body in the world. The quality of the executive model is revealed by the quality of its simulations and the efficacy of the action plans it proposes.
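The phrase ‘play forward the working model’ can be made concrete with a toy planner. The transition table and scoring function below are invented for illustration; the only point is the loop of simulate, score, and choose:

```python
# Minimal sketch of "abstract generation of consequences": roll the
# working model forward under each candidate action, without acting,
# and keep the action whose simulated outcome best satisfies the
# drive system's priorities. All specifics are invented.

def simulate(state: dict, action: str, transitions: dict) -> dict:
    """Predict the next state; nothing happens in the world."""
    next_state = dict(state)
    next_state.update(transitions.get(action, {}))
    return next_state

def choose_action(state, actions, transitions, score):
    # score() stands in for the drive system's priorities.
    return max(actions, key=lambda a: score(simulate(state, a, transitions)))

transitions = {
    "eat":   {"hunger": 0.0},
    "sleep": {"fatigue": 0.0},
}
state = {"hunger": 0.9, "fatigue": 0.3}

def drive_score(s):
    return -(s["hunger"] + s["fatigue"])  # smaller unmet needs are better

best = choose_action(state, ["eat", "sleep"], transitions, drive_score)
print(best)  # eat: simulated eating removes the larger unmet need
```

A model like this is ‘accurate’ exactly when its transition table matches what the world actually does, which is the sense in which action of the body in the world calibrates the simulation.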
When the executive system runs a simulation, it is unclear how much of the neural machinery is included in the simulation. The fact that people can become angry just by imagining a situation suggests that a significant portion of the brain is utilized in simulation, given that the amygdala, which regulates emotion, is anatomically distant from the prefrontal cortex. Likewise, the existence of top-down feedback to the perceptual and behavioral systems from the prefrontal cortex is well established. Of note, there are many inhibitory neurons in the premotor cortex responsible for disengaging the motor system while possible actions are being considered, so that plans are not implemented until “approved”.
These simulations, then, are biological facts, though their exact content and mechanism are not at all understood. But the ability to simulate the possible actions of the self suggests the ability to also simulate the actual actions of others. It is only a slight shift from imagining ourselves doing something to imagining the same thing done by another. Thus one might expect that the same circuitry used to plan one’s own actions might be used to understand the actions of others.
For an agent that models the world in order to guide its own behavior, one of the most difficult tasks is to incorporate the likely responses of other agents into its world model. Actions are likely to provoke reactions. If I am angry at someone, I could punch them, but they could also punch me back. A planning system that does not model the likely responses of other agents will fail quickly. The most compact way for a planning system to model the behavior of other agents is to assume that the others are using the same planning system as oneself. More nuanced models might start with a self-model and modify it minimally so that it matches observed behavior of others.
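The ‘compact’ strategy can be shown in a few lines: run one’s own planner on the other agent’s assumed internal state, optionally with minimal adjustments. Every detail here is an invented illustration:

```python
# Sketch: predict another agent's behavior by reusing the self-model,
# i.e., running my own planner on their (assumed) internal state.

def my_planner(state: dict) -> str:
    """The planner I use for myself: address the most pressing need."""
    return "eat" if state["hunger"] >= state["fatigue"] else "sleep"

def predict_other(observed_state: dict, adjustments: dict = None) -> str:
    # Start from the self-model, then modify it minimally so that it
    # matches the other's observed behavior (here, just state tweaks).
    assumed = {**observed_state, **(adjustments or {})}
    return my_planner(assumed)

print(predict_other({"hunger": 0.8, "fatigue": 0.1}))  # eat
print(predict_other({"hunger": 0.1, "fatigue": 0.8}))  # sleep
```

The design choice worth noticing is that only one planner exists: predicting the other costs nothing beyond estimating their state.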
If a model of the self is also used as a model of others, we expect to find neural systems in the brain that respond equivalently regardless of whether an action is performed by the self or by another. And in fact we do have evidence for such systems in the form of mirror neurons. Mirror neurons have been studied extensively in monkeys. They are located in the premotor cortex, but also in the motor cortex and elsewhere. Mirror neurons fire both when a monkey performs a task and when it observes another monkey or person performing the same task. As stated in a 2013 review:
“the key characteristics of mirror neurons are that their activity is modulated both by action execution and action observation, and that this activity shows a degree of action specificity” (Kilner & Lemon, 2013)
That is, mirror neurons are active whether the self is performing a task (action execution), or whether the self is observing the task (action observation), and the particular neurons that fire are specific to the task being performed (action specificity). This behavior is exactly what we would expect in a planning system that efficiently models the action of others by presuming that they are similar in behavior and intent to the self.
I do not want to belabor the role of mirror neurons too much; experiments in humans using fMRI have been less suggestive than those involving monkeys. Nonetheless, a core aspect of the simulation thesis is the existence of a capacity to model other agents in similar ways as one models oneself.
Returning to semantics, I propose that the subjective meaning of a linguistic utterance is a particular simulation. Thus if I say, “the monkey grasped the banana”, then I understand this utterance by simulating the act of grasping with the monkey as the agent and a banana as the target. Since I personally have grasped bananas, and since I have observed monkeys holding bananas in videos, I can understand the experience of grasping a banana myself through a kind of remembering, and I can overlay this experience onto a presumed similar experience with the grasping done by a monkey, producing a blended memory. The subjective meaning of the statement is nothing other than the sequence of neural firings that occur in my planning machinery as I imagine a monkey grasping a banana.
As emphasized in the last paragraph, this meaning is subjective. I assert that there is no such thing as an objective meaning of the statement. Within a society, we as individuals negotiate subjective meanings that allow us to cooperate effectively. And if 100 sketch artists from our society were each asked to draw a realistic sketch of a monkey grasping a banana, nearly every member of that society would be able to describe the resulting images as a monkey grasping a banana, even though no two drawings would be identical and despite the absence of a concrete, unambiguous, objective meaning of the words.
Philosophically, it is important to note that the idea that language has no objective meaning is quite distinct from denying the existence of an objective reality. Although it is plain that we are all living out of our subjective experience, the negotiation of our experience with other agents and with the external world substantially constrains our experiences and hence we would not be wise to say that all experience is purely subjective or, as some say, that perception is reality.
The claim that language is understood through simulation might be restated as a claim that humans have a simulation facility, and that this facility is active in the interpretation of language. This claim results in a testable hypothesis. If the claim is correct, then we ought to be able to find an analogue of mirror neurons that fire equally under three distinct scenarios: (1) when an action is performed; (2) when the action is observed; (3) when the action is verbally described. That is, if in fact language is understood through simulation, then some components of the simulation must be shared whenever an action is simulated, whether as a precursor to execution, as an observation of another agent, or as the interpretation of an utterance. Thus there should be some analogous neural activity that occurs when grasping a banana, when observing a monkey grasping a banana, or when talking about grasping bananas. Although it may prove difficult to find these neural activity patterns, they must be present if the hypothesis is correct.
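The shape of the prediction, though not of any real experiment, can be stated numerically: activity patterns recorded under the three scenarios for the same action should resemble one another more than patterns for different actions. The activity vectors below are fabricated stand-ins for measurements:

```python
import math

# Toy rendering of the hypothesis: execute, observe, and describe
# conditions for the SAME action should yield overlapping activity
# patterns. These vectors are fabricated, not data.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

grasp_execute  = [0.9, 0.8, 0.1, 0.0]
grasp_observe  = [0.8, 0.9, 0.2, 0.1]
grasp_describe = [0.7, 0.7, 0.3, 0.1]  # hypothesis: still similar
kick_execute   = [0.1, 0.0, 0.9, 0.8]  # different action, different pattern

same_action  = cosine(grasp_execute, grasp_describe)
cross_action = cosine(grasp_execute, kick_execute)
print(same_action > cross_action)  # True, by construction of the toy data
```

A real test would of course compare measured patterns, not invented ones; the sketch only shows what ‘analogous neural activity’ would mean as a comparison of similarities.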
I would make two clarifying observations about the hypothesis in the prior paragraph. First, it must be tested in humans, since only humans have language as we understand it. Second, the neural activity in question need not reside in the premotor cortex. Indeed, it seems more likely that it will be found in the prefrontal cortex instead.
I will venture some final thoughts about the evolutionary development of this simulation capability. One generally supposes that an organism is oriented towards actions that will increase its survival capability. To the degree that the organism lives in a complex world, it is beneficial for that organism to have a model of the world sufficiently complex to guide its action. The complexity of the human world primarily comes from other humans. Thus there is a significant survival benefit that accrues to those humans who can effectively model the behavior of other humans.
We can conclude based on the existence of mirror neurons in monkeys that primates generally can model the behavior of others. One might then conjecture that this ability led to a kind of evolutionary arms race in which primates were more effective at reproducing when they could better model other individuals both within their own troops and in outside troops. That is, behavioral models of others are useful both for cooperation and competition, leading to progressively increasing capacity of these models over evolutionary time. Thus one might expect the continuous development of an ever more robust predictive model of how other people behave in a sort of evolutionary feedback loop, and one might further expect the dominance hierarchy in humans, as a proxy for reproductive strength, to reflect this development, as in fact we do typically see. The most “successful” humans are those who can effectively organize or inspire the efforts of others.
Similar statements could also be made along parallel lines for an evolutionary feedback loop based on tool use, and the two facilities — ability to predict social behavior and ability to create increasingly abstract tools — are two hallmarks of human development. But it seems to be the social facility that is primarily implicated in the evolution of language. And it is the mechanisms behind this social prediction facility that I am calling a simulation facility.
The reasoning of the last few paragraphs provides an evolutionary basis for the claim in my previous post that the purpose of language is to influence the mental states of others, so that the listener will cooperate with the speaker, or so that the speaker may dominate the listener, depending on the intent and goals of the speaker. The evolutionary basis is that speech as a tool of control for both cooperation and competition is beneficial to the reproductive prospects of the individual.
Wrapping up, I claim that the proper foundation for semantics resides within the structure of a simulation facility that humans possess, and that the evolutionary origins of this facility lie in an evolutionary arms race over the ability to effectively model other individuals within a social context.
As mentioned in the introduction, these ideas did not emerge from a vacuum. The concept of language as simulation is suggested in a wonderful little book, From Molecule to Metaphor, by Berkeley professor Jerome Feldman. Mechanisms of the simulation facility might be based, for example, on the techniques of Gilles Fauconnier’s Mental Spaces, which Fauconnier eventually developed with Mark Turner in their joint book The Way We Think. That book introduces the concept of blending mental spaces as a method of analogical thought and propounds the notion of living in the blend, which in essence provides a formulation of how a simulation might function. Daniel Everett’s Dark Matter of the Mind addresses how much of the interpretation of language is based on assumed cultural context. The concept of the premotor cortex providing a simulation prior to execution in the motor cortex as a survival mechanism has been popularized by Jordan Peterson in his podcasts and lectures. For neural structure, I have relied on Larry Swanson’s Brain Architecture. I have been strongly influenced in my perspective by the school of cognitive linguistics, including Leonard Talmy and George Lakoff among others, whose ideas will become more relevant in later posts.
In the next post, I will discuss the functional nature of the simulation facility in more detail, beginning with the universality of the ternary subject, verb, object construct across human languages and eventually progressing to discussion of mental spaces and embodied analogy. Please share any comments you may have below, as I’d love to refine, clarify, and correct my views based on feedback.
References
J. M. Kilner and R. N. Lemon, “What We Know Currently About Mirror Neurons,” Curr. Biol. 2013 Dec 2; 23(23): R1057–R1062. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3898692/
J. A. Feldman, From Molecule to Metaphor: A Neural Theory of Language, MIT Press, 2006.
G. Fauconnier, Mental Spaces: Aspects of Meaning Construction in Natural Language, MIT Press, 1985; later published by Cambridge University Press, 1994.
G. Fauconnier and M. Turner, The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities, Basic Books, 2002.
D. Everett, Dark Matter of the Mind: The Culturally Articulated Unconscious, University of Chicago Press, 2016.
L. W. Swanson, Brain Architecture: Understanding the Basic Plan, 2nd ed., Oxford University Press, 2012.
It's interesting to consider mirror neurons, or, for that matter, any biological mechanisms that enable mimicry or empathy, as part of a simulation function supporting language and meaning more broadly.
Phenomena that stand out as simulation on an individual, ontological level might include the "call to the void," dreams, nightmares, fears, aspirations, and other activities of the imagination. Whether they emerge from limbic functions or from top-down executive orders of the prefrontal cortex is an interesting question and likely varies from case to case, but the fact that they could emerge from one place or another is an interesting thought in itself.
And a true sense of meaning would seemingly only arise by the presence of the other. How do they perceive it? Do they perceive it at all? Are we anything at all without the other?
Indeed, mirror neurons and various forms of communication and collective activity would seemingly enable a kind of "simulating together", "group/population simulation", or, borrowing a term from the world of edge compute, "federated learning."
Now there's an interesting thought. Could a group in communication about a shared imagining, say, a dream or aspiration, be a "federated learning" event simulating toward a model of reality that makes such a vision reality?
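Since the term is borrowed above, it may help to show what federated averaging actually does: each participant refines a local copy of a shared model, and only the models, never the raw experiences, are pooled. This toy version (the learning rate, data, and model shape are all invented) follows the spirit of the standard scheme:

```python
# Toy federated averaging: each individual nudges a local copy of the
# shared model toward private experience, then the copies are pooled.
# Learning rate, data, and model shape are invented for illustration.

def local_update(model, experience, lr=0.5):
    # Move the local model part of the way toward this experience.
    return [m + lr * (e - m) for m, e in zip(model, experience)]

def federated_average(models):
    # Pool the individual models into a new shared model.
    n = len(models)
    return [sum(col) / n for col in zip(*models)]

shared = [0.0, 0.0]
experiences = [[1.0, 0.0], [0.8, 0.4], [1.2, -0.2]]  # three individuals

local_models = [local_update(shared, e) for e in experiences]
shared = federated_average(local_models)
print(shared)  # pooled model, roughly [0.5, 0.03]
```

The analogy is loose, of course: no one claims brains exchange model updates, only that shared meaning might emerge from pooling individually refined models rather than from sharing raw experience itself.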