When John Donne intoned that no man is an island, entire to himself, he might have taken the thesis even farther, for not only men but objects too are connected in a web of cause and effect that propagates through space and time. In Mix-and-Match Creativity, I discussed how the perception of an object arises through a bundling of attributes that are tied to a point in space, based on the fact that the major external senses — sight, hearing, smell, and touch — are equipped to process location and direction. By tracking predictable movement in these sensory spaces, as described in Observing Processes of Change, these objects are also bound together in time. But the very nature of identifying an object plucks it out of the sensory scene. How, then, shall we account for the web-like strands we must cut to isolate an object from its environment?
The human perceptual system does not perceive objects in isolation but instead relates them through their interactions. Many of these interactions are perceived as spatial relationships in the visual system: recall from Mix-and-Match Creativity the example of the red rose on a yellow taxi. Both the yellow taxi and the red rose are perceived as separate objects, but their relationship is represented by the preposition on. I would argue that the perception of the relationship on is just as fundamental as the perception of the objects that it connects. I would be in good company making this argument; Leonard Talmy (1983, 2000), Barbara Landau (1993, 2016), and many other linguists and cognitive scientists have written extensively on the subject.
The brain seems to have a built-in model of geometric and spatial relationships. Even infants express surprise when object movements defy normal interactions, as when a ball suddenly disappears for no reason or rolls behind a book but does not emerge from the other side (Dehaene, 2020, p. 55). Given that basic, human-scale kinematics are so essential for survival, it should be no surprise that all mammals share some level of preexisting capability to predict how objects will move and interact. Furthermore, as Landau and Jackendoff (1993) point out, we can locate objects in space through vision, hearing, and touch, which means that our model of space is not specific to any one sensory modality but rather derives from representations that integrate information from all these senses.
I previously discussed how the brain models space but skipped over how spatial relationships are encoded and processed in the brain. Spatial relationships in language are represented at a basic level by prepositions. Certainly there are many verbs and adjectives that have a spatial quality, but the case of prepositions is curious. For one thing, prepositions as a category could be entirely replaced by verbs, as they largely are in Chinese, where, for example, yòng (用) = “use, with”, gěi (给) = “give, for”, and zài (在) = “live, exist, at, in, on”. Yet the vast majority of languages do have prepositions, postpositions, case endings, or some mixture of the three. Evidently, these linguistic categories are particularly convenient for representing some aspect of how the mind works.
Barbara Landau and Ray Jackendoff wrote an influential 1993 article on prepositions and the nature of spatial relationships. The first observation they made is that of the 91 prepositions or preposition-like elements they identified in English, 80 were spatial in nature, while of the remaining 11, five were temporal, two involved logical consequence (because of, despite), and two were comparative (as, like), leaving only for and of. I could make an argument that all eleven of these have a direct or indirect connection to space as well. Firstly, time is often conceptualized as a one-dimensional space, accounting for the temporal prepositions (and the single English postposition, ago). Secondly, the mental conception of causation is perhaps just an abstraction of a predictable temporal sequence as distinct from an unpredictable one, which folds causation back onto time and thence onto space; notice, for example, how since can be either causal or temporal. Thirdly, comparison could be viewed as an abstraction of setting two items beside each other in space. Finally, for and of can be considered in origin as spatial prepositions indicating destination and origin, whose roles have broadened to other purposes by analogy. Thus there is some basis to say that the nature of prepositions originates in spatial relationships.
Perhaps the most striking feature of prepositions is their asymmetric, binary nature. Leonard Talmy (1983) described prepositions as relating a first object, the figure, to a second object, the ground. In particular, the figure tends to be more mobile than the ground, as in the elephant is next to the rock, or else smaller than the ground, as in the house is on the cliff. Although this asymmetry can be reversed, as in the cliff is under the house, it is strange to do so, and Landau and Jackendoff (1993) reference several experimental studies showing that, in fact, people prefer expressions in which the ground is either larger or more stable.
The roots of the asymmetry are twofold. Firstly, as we will discuss in detail below, prepositions likely reflect the structure of sensory feedback for motor control. But there are perhaps also perceptual reasons for the asymmetry. When we perceive an object, we perceive it in a context that serves as an anchor for it. Prepositions serve to identify the anchoring context and thus reflect perception. Recalling the proposal in Recursive Mapping of Reality that perception over time may be conceptualized as a traversal of a conceptual space, we might notice that, as I described in that post, traversal proceeds from smaller to larger and back to smaller. Hence one arrives at smaller things from larger things, and the spatial preposition then references the larger anchoring context. So our vision traverses a face and then scans the nose, and hence we say that the nose is on the face rather than the face is behind the nose or the face is under the nose. In fact, there doesn't seem to be any preposition at all that quite suits the phrase the face is ____ the nose.
Landau and Jackendoff also observed that spatial prepositions generally have meanings constrained to object location as opposed to object shape. That is, there are no English prepositions that refer to the parts of an object or require an object to have a certain shape or structure, and the few that appear to do so refer to particular types of location, such as astern and abaft for ships. We do not have words to replace the word on when the figure is on different parts of the ground. Instead, we have to add a specifying noun to represent parts, as in the fork is on the edge of the plate versus the fork is in the center of the plate. Talmy (1983, 2000) describes this property by saying that spatial prepositions are topological in nature, that is, they observe so-called “rubber sheet” geometry, in which two shapes are called the same if either one can be squashed or stretched into the other without tearing, as though they were made of rubber. Thus we can say something is on a car or on a plate without caring about the exact shape of the car or the plate.
We have already seen that the brain has separate pathways for processing object identity as opposed to object location. This distinction is known as the what/where distinction, with the what pathway running along the ventral (lower) side of the brain toward the temporal lobe, near the hippocampus, and the where pathway running along the dorsal (upper) side toward the parietal lobe (Mishkin, Ungerleider, and Macko, 1983). Based on the observation that spatial prepositions do not seem to take object shape into account, and presuming that the primary visual marker of object identity is shape, Landau and Jackendoff proposed that spatial prepositions reflect processing along the where pathway, where object shape would be less available.
These ideas have been tested experimentally, with mixed support. Damasio et al. (2001) found that recognizing spatial relations preferentially activated the inferior parietal lobe as compared with object naming. More recently, Amorapanth et al. (2010) sought to learn which brain regions are active when processing object identity and spatial relationships. They showed subjects pairs of scenes containing two objects and asked them to decide whether the scenes matched. In some cases, the participants were asked to decide whether the scenes contained the same types of objects, and in others they were asked whether the scenes illustrated the same spatial relationship. They concluded that identifying categorical spatial relationships increases resource use in both the superior and inferior parietal cortex and in the middle temporal gyrus. The parietal regions are indeed along the where pathway, but the middle temporal gyrus is not.
In macaque monkeys, the middle temporal gyrus is mainly involved in higher-order visual processing, but in humans the left middle temporal gyrus has taken on additional roles in semantic processing and language. This region is also necessary for recognizing faces, a capability which is degraded in schizophrenic patients due to lower activity in this region, and for recognizing complex symbols. One interesting study showed that Japanese speakers with damage to this region suffered no loss in reading symbols for syllables (Katakana) but a significant loss in the ability to read the more complex Chinese-origin ideograms (Kanji), which represent whole words and are more numerous. So the middle temporal gyrus seems to play a role in putting complex perceptual scenes together into a whole.
Thus, as Landau (2016) acknowledges, the original thesis of Landau and Jackendoff (1993) that spatial prepositions are representative of the where pathway requires modification, but their observation that prepositions do not seem to take specific shapes into account remains valid, whatever the cause. By contrast, many verbs and complex phrases can refer to spatial relationships in ways that do take composition and shape into account, as in hang, smear, lean on, balance on, etc. (Landau, 2016). Thus prepositions do seem to reflect something fundamental about how humans represent space, abstracted away from shape.
If prepositions are by nature spatial, then it is rather curious that they can relate only two things to each other. After all, spatial relationships can involve many objects. For example, we might place three objects at the vertices of a triangle. Prepositions are then inadequate to tell us which objects are at which vertices, and language sufficient to communicate this situation can be rather complex (e.g., the first object is at the closest vertex, the second object is on the vertex to the left of that, and the third object is on the last vertex at the right). The closest we come to non-binary prepositions are between, amidst, and among, and even for these, the object of the preposition is conceived of as forming a single group.
The question raised by this observation is whether the binary nature of prepositions is a purely linguistic feature, or whether it reflects something about how the brain recognizes spatial relationships in the first place. There is some evidence that the binary relationship is a consequence of how the brain processes space. Conder et al. (2017) carefully tested neural activations in response to spatial versus non-spatial sentences, where the spatial sentences involved prepositional relationships such as above, below, to the left of, and to the right of. They found that spatial processing differentially activated the superior parietal lobe as well as regions at the occipital-parietal junction, with symmetric activation in both the left and right hemispheres of the brain. They concluded that these regions are active precisely because they process binary relationships between percepts. To quote:
We ... believe that the involvement of S[uperior] P[arietal] L[obe] in processing spatial language reflects its role in maintaining and integrating multiple representations that are characterized by a configurational relation in which one representation is defined in terms of its configurational relation to another.
Recall that the parietal lobe is responsible for integrating sensory percepts across multiple senses in order to translate them into a feedback signal to drive motor control. Specific actions, such as grasping or reaching for an object, in essence require comparing the position of an effector (the body part doing the moving) with the position of a target (the goal of the motion). Notice that the same asymmetry between figure and ground is present here; there is a moving body part (the figure) and a potentially stationary target (the ground). Of course, the target can be moving too, but then the goal of spatial tracking is to establish a reference frame that makes the target as stationary as possible so that a motor plan can intercept it. You can observe these reference frame adjustments in certain real-life scenarios. For example, if I spend an hour riding a particularly difficult mountain-biking trail, then for some period after stopping, my whole visual field feels as though it is moving back and forth in the way the trail itself was winding during the ride. I view this experience as reflecting the influence of the stabilizing mechanisms that enable motor planning, and I have had similar experiences after playing fast-moving video games that require intensely tracking objects across a screen. Spatial relationships, then, may be modeled on self-motion with respect to a target, resulting in both the binary and asymmetric nature of prepositions in language.
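To make the reference-frame idea concrete, here is a minimal sketch in Python of what such a stabilizing computation might look like, assuming simple 2D kinematics. The function name, the example numbers, and the feedback gain of 0.5 are all illustrative assumptions of mine, not anything proposed by the studies cited above.

```python
import numpy as np

def to_ground_frame(figure_pos, figure_vel, ground_pos, ground_vel):
    """Re-express the figure's state in a reference frame anchored to the
    ground (the target), so that the ground appears stationary."""
    rel_pos = np.asarray(figure_pos) - np.asarray(ground_pos)  # figure location relative to ground
    rel_vel = np.asarray(figure_vel) - np.asarray(ground_vel)  # figure motion relative to ground
    return rel_pos, rel_vel

# Example: a hand (figure) reaching for a cup sliding along a tray (ground).
hand_pos, hand_vel = [0.0, 0.0], [0.2, 0.0]
cup_pos, cup_vel = [0.5, 0.1], [0.1, 0.0]

rel_pos, rel_vel = to_ground_frame(hand_pos, hand_vel, cup_pos, cup_vel)

# A crude feedback signal: move so as to shrink the relative displacement.
# The asymmetry is built in: the figure (hand) is what gets corrected,
# while the ground (cup) merely defines the frame.
correction = -0.5 * rel_pos
print(rel_pos, rel_vel, correction)
```

The point of the sketch is only that the computation is inherently binary and asymmetric: one thing is corrected, the other defines the frame.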
How, then, are spatial scenes with many relationships integrated into a whole? Conder et al. noted the involvement of a region at the center of the parietal lobe called the precuneus, which has connections to the frontal and motor cortices as well as to both the inferior and superior parietal lobes mentioned above. The precuneus is known to be involved in generating mental imagery, as when you imagine lying on the beach sipping a piña colada. It would seem that the precuneus provides the bridge between linguistic recognition of prepositions and their spatial interpretation.
But the totality of a visual scene, as in the image schemata of Lakoff and Johnson (2010) or the concept diagrams I introduced in Recursive Mapping of Reality, is probably assembled elsewhere. Amorapanth et al. (2012) examined whether schematic diagrams were processed similarly to linguistic descriptions of the same diagram and to corresponding photographs. They generated schemas representing a variety of spatial scenarios, such as a red square in a black square, or a red circle above a black square. Participants had damage to either the right or the left hemisphere of the brain, but not both. They were asked to match schemas to photographs, prepositions to schemas, and photographs to other photographs with the same spatial relationship. People with right hemisphere damage succeeded at matching prepositions to schemas but failed at matching schemas to photographs and photographs to photographs. Conversely, people with left hemisphere damage succeeded at those two tasks but failed at matching prepositions to schemas. Examination of the damaged regions suggested that the left middle temporal gyrus was critical to success at matching schemas to prepositions, whereas the right middle temporal gyrus was critical for matching photographs to schemas and to other photographs.
The underlying question that remains unanswered by these experiments is whether the failure of the patients with left hemisphere damage to associate the correct prepositions with their schemas is purely a linguistic failure or a failure to understand the underlying categorical spatial relationships. In general, there seems to be a gradient across the brain from right to left, with the left brain representing categorical relationships and the right brain representing continuous ones. I personally wonder whether this gradient might be realized as the steepness of a neural activation function, so that neurons at the far left of the brain show sharp transitions in firing rate in response to stimuli, whereas neurons at the far right of the brain exhibit more gradual transitions, allowing the right brain to reflect stimuli more faithfully at the cost of making less useful distinctions. If this view is correct, then I would argue that failure at the task of matching spatial prepositions to schemas does indeed reflect a loss of categorical spatial knowledge, not just a loss of linguistic knowledge.
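As a toy illustration of what I mean by activation steepness, consider a sigmoid whose gain parameter controls how categorical its output is. This is purely a sketch of my own speculation above, not a model drawn from any of the cited studies; the detector, the gain values, and the offsets are all hypothetical.

```python
import numpy as np

def response(vertical_offset, gain):
    """Firing rate of a hypothetical 'above' detector as a function of how far
    the figure sits above the ground. High gain gives a sharp, categorical
    transition (left-hemisphere-like); low gain gives a graded, continuous
    one (right-hemisphere-like)."""
    return 1.0 / (1.0 + np.exp(-gain * vertical_offset))

offsets = np.linspace(-1.0, 1.0, 9)          # figure below (-) ... above (+) the ground
categorical = response(offsets, gain=20.0)   # nearly a step function: 'above' vs. 'not above'
continuous = response(offsets, gain=2.0)     # varies smoothly with the actual offset

for x, c, g in zip(offsets, categorical, continuous):
    print(f"offset={x:+.2f}  sharp={c:.2f}  graded={g:.2f}")
```

With high gain, the detector only reports whether the relationship holds; with low gain, its output tracks how far above the figure actually is, trading crisp distinctions for fidelity.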
What is perhaps most interesting, though, is that the schemas seem to span the gap between the categorical relationships and the purely spatial ones. Notice that all participants tended to succeed at one or the other of the tasks involving schemas, either matching them to prepositions (right-damaged) or to photographs (left-damaged). Thus these schemas straddle the line between left- and right-brain representations, potentially representing some intermediate state. The finding of Conder et al. (2017) that spatial processing occurs on both sides of the brain also supports this claim.
It is worth remarking that when we put together the results showing that the left middle temporal gyrus is involved in recognizing Chinese ideograms with the observation that the right middle temporal gyrus is involved in associating schemas to photographs, we find some interesting implications for left-right asymmetry in the brain. Firstly, ideograms are quite detailed, and recognizing them requires fine spatial distinctions with respect to object structure. At some level of complexity, object shape is a spatial concept in that object parts must be integrated into a whole. Complex scenes are likewise an assemblage of various spatial relationships. It would seem that the middle temporal gyrus performs this aggregation of information into a whole picture, with object shapes being aggregated categorically in the left hemisphere and spatial scenes being aggregated in the right hemisphere. In either case, although we do not yet have an answer for how this aggregation is performed, we have potentially answered the question of where the brain aggregates objects and relationships into holistic scenes.
Secondly, if we consider hearing as opposed to vision, the left brain is more associated with language whereas the right brain is more associated with music. Timbre refers to the shape of a sound: two different instruments playing the same note, such as a saxophone and a violin, will have the same pitch but different timbres. Pitch refers to the frequency of the sound. Language sounds are distinguished more by timbre than by pitch (Patel, 2008), whereas music depends more on pitch than timbre, though timbre remains important. It seems there is an analogy here with the visual system. The left visual system tends to assemble complex objects from their parts, while the right visual system integrates complex scenes from simpler, potentially binary spatial relationships. By analogy, the left auditory system processes sound to assemble complex timbres, while the right auditory system integrates notes to form pitch relationships and melodies.
One question is how the understanding of spatial relationships relates to the brain's understanding of space more generally. Places and Maps introduced the two distinct spatial systems known to neuroscience, the allocentric navigation system and the egocentric motor control system. Landau (2016) updated the earlier work of Landau and Jackendoff to incorporate this distinction by proposing two categories of spatial prepositions. On the one hand are the purely geometric spatial relationships, such as above, below, beside, left, right, in front, behind, inside, and outside, which Landau associated with the allocentric navigation system; on the other hand are the prepositions that involve motion or contact as a core element, including in, on, against, across, along, off of, away from, and towards. Landau calls this latter category force-dynamic, indicating that the meaning of these prepositions involves some kind of force. The force may be implicit, as in cases of contact where one object is perceived as holding up the other: if a book is on a table and the table is removed, the book will fall.
Landau (2016) hypothesized that whereas the geometric prepositions might be processed by navigation systems, the force-dynamic prepositions would engage the egocentric spatial system in the parietal lobe of the brain, which enables sensory feedback in motor control. The work of Conder et al. (2017), discussed above, contradicts this view somewhat, because their tests involved the purely geometric prepositions and yet activated the egocentric parietal systems. Additionally, there is no reason why the allocentric spatial system should be limited to the binary relationships that limit prepositions.
Rather than proposing a fundamental distinction between force-dynamic and geometric spatial relationships, it makes more sense to me to suppose that all prepositions are processed at least in part by egocentric parietal spatial systems. An interesting consequence flows from this assumption, namely, that our representation of spatial relationships coopts how we perceive the relationship of our bodies to the world. That is, when we say the book is on the table, we understand this using the same neural circuits with which we understand the sentence my hand is on the table. Taking it further, we perceive the table as an object in the world on which we can use our hands, and then we perceive the book as relating to the table in the same way that a hand might.
Thus while force is certainly part of the meaning of many spatial prepositions, it does not distinguish where spatial relationships are processed neurally. Rather, just as we saw in Observing Processes of Change, motion is a part of how spatial relationships are perceived. If my hand is on the table, then it exerts ongoing force on the table, whereas if my hand is above the table, then it exerts no force on the table but has the potential to do so. In each case, different actions are available. The hand on the table might push, press, or slide on the table. The hand above the table might drop, slap, or fall on the table. If my hand is already moving towards the table, it can slap or hit the table, but it cannot fall or drop because it is already moving.
To reiterate, motion and force are a part of how spatial relationships are perceived. The perception of motion is not separate from the perception of space; rather, the two are tied together. Even further, this motion is relativized by selecting a reference frame such that it is the figure that moves with respect to the ground, which is stable within the reference frame. There are no prepositions, even among the force-dynamic prepositions, for which the ground moves while the figure stays still.
As a final topic, it remains unclear what role the allocentric spatial system might play in spatial relationships. John O'Keefe, discoverer of place cells, presented an interesting theory of vector matching that is worthy of further examination (O'Keefe, 2003). In his model, the relative positions of figure and ground would be represented as vectors, with the difference of the vectors indicating their binary relationship. The meaning of a particular spatial relationship could then be associated with a vector, and operations comparing these vectors would determine whether the relationship is satisfied. There is more depth in this proposal, but I will not examine it further here. One important point to notice is that Landau's set of geometric prepositions actually has a ternary (threefold) rather than a binary interpretation. When we say that the tree is to the left of the house, there are two objects being related, but there is also an observer acting as a third participant by providing a reference frame. Without the observer, there is no left or right, no front or behind. This ternary aspect likewise applies to some force-dynamic prepositions, such as leftward or rightward.
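To give a flavor of how such vector matching might work, here is a minimal sketch. The prototype vectors, the cosine comparison, and the viewer-centered axes are my own illustrative simplifications rather than O'Keefe's actual grammar; note that the choice of axes is also where the observer enters as the third, frame-giving participant.

```python
import numpy as np

# Prototype difference vectors (figure minus ground) for a few geometric
# prepositions, in a viewer-centered frame where +y is 'up' and +x is the
# observer's 'right'.
PROTOTYPES = {
    "above": np.array([0.0, 1.0]),
    "below": np.array([0.0, -1.0]),
    "to the right of": np.array([1.0, 0.0]),
    "to the left of": np.array([-1.0, 0.0]),
}

def best_preposition(figure_pos, ground_pos):
    """Return the preposition whose prototype best matches (by cosine
    similarity) the figure-minus-ground difference vector."""
    diff = np.asarray(figure_pos, dtype=float) - np.asarray(ground_pos, dtype=float)
    diff = diff / np.linalg.norm(diff)
    scores = {prep: float(np.dot(diff, proto)) for prep, proto in PROTOTYPES.items()}
    return max(scores, key=scores.get), scores

# The tree at (2, 5) and the house at (6, 5): the tree is to the left of the house.
prep, scores = best_preposition([2.0, 5.0], [6.0, 5.0])
print(prep)
```

Even in this toy form, the representation is binary (one difference vector per figure-ground pair) and asymmetric (swapping figure and ground flips the vector and hence the preposition).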
Summing up the content so far, the semantic simulation of spatial relationships happens in the superior parietal lobe of the brain, where binary relationships are modeled as an interaction between a potentially moving figure and a stable ground. These relationships map naturally onto the class of prepositions in English and their analogues in other languages. The interaction of figure and ground is likely modeled by analogy with how we understand the relationship of our body parts to objects we are trying to use or control, and probably developed to support sensory feedback in motor control.
In contrast to binary spatial relationships, the understanding of a spatial scene as a whole is likely assembled in the middle temporal gyrus, which lies further along the what pathway, if I have understood correctly. The right middle temporal gyrus appears to take more responsibility for representing scenes in a photorealistic way, while the left middle temporal gyrus might produce a stylized representation more suited to language. These whole scenes potentially communicate with the binary representations in the parietal lobe by passing through the temporal and frontal regions and ultimately through the precuneus, whose known role in mental imagery supports the claim that the understanding of language is a kind of simulation.
Regarding linguistic descriptions of whole scenes, we describe a scene by jumping from object to object, naming the relationships as we go. It is as though we talk our way through a scene by traversing its constituent objects, walking the web of relationships between them. In the same way, we generate an understanding of a scene by traversing it, both with our eyes and with our bodies. Thus again, traversal seems to lie at the center of understanding.
This review of spatial relationships is incomplete, but the topic is large. There is much more to say on the details, but I want to turn my attention to the behavioral system for the next few posts. Please leave comments or questions below!
B. Landau and R. Jackendoff, "What" and "Where" in Spatial Language and Spatial Cognition, Behavioral and Brain Sciences 16, 1993, pp 217-265.
B. Landau, Update on "What" and "Where" in Spatial Language: A New Division of Labor for Spatial Terms, Cognitive Science 41(S2), 2017, pp 321-350. https://onlinelibrary.wiley.com/doi/full/10.1111/cogs.12410
B. Landau and J. Hoffman, Parallels between spatial cognition and spatial language: Evidence from Williams syndrome, Journal of Memory and Language 53(2), 2005, pp 163-185. https://www.sciencedirect.com/science/article/abs/pii/S0749596X05000501
L. Talmy, How Language Structures Space, in: H. L. Pick and L. P. Acredolo (eds), Spatial Orientation, Springer, Boston, MA, 1983.
L. Talmy, Toward a Cognitive Semantics, MIT Press, 2000.
S. Pinker, The Stuff of Thought, Penguin Books, 2007.
S. Dehaene, How We Learn: Why Brains Learn Better Than Any Machine, For Now, Viking Press, 2020.
M. Mishkin, L. G. Ungerleider, and K. A. Macko, Object vision and spatial vision: two cortical pathways, Trends in Neurosciences 6, 1983, pp 414-417. https://www.sciencedirect.com/science/article/abs/pii/016622368390190X
H. Damasio, T. J. Grabowski, D. Tranel, L. L. Ponto, R. D. Hichwa, and A. R. Damasio, Neural correlates of naming actions and of naming spatial relations, NeuroImage 13(6 Pt 1), 2001, pp 1053-1064. https://pubmed.ncbi.nlm.nih.gov/11352611/
P. Amorapanth, P. Widick, and A. Chatterjee, The neural basis for spatial relations, Journal of Cognitive Neuroscience 22(8), 2010, pp 1739-1753. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2933471/
P. Amorapanth, A. Kranjec, B. Bromberger, M. Lehet, P. Widick, A. J. Woods, D. Y. Kimberg, and A. Chatterjee, Language, perception, and the schematic representation of spatial relations, Brain and Language 120(3), 2012. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3299879/
J. Conder, J. Fridriksson, G. C. Baylis, C. M. Smith, T. W. Boiteau, and A. Almor, Bilateral parietal contributions to spatial language, Brain and Language 164, 2017, pp 16-24. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5179296/
A. D. Patel, Music, Language, and the Brain, Oxford University Press, 2008.
J. O'Keefe, Vector Grammar, Places, and the Functional Role of the Spatial Prepositions in English, in: E. van der Zee and J. Slack (eds), Representing Direction in Language and Space, Oxford University Press, 2003, pp 69-85.
Very interesting to consider these Landau studies, particularly the points on 91 prepositions and the idea of motion and force processing with allocentric and egocentric brain regions.
I had a metaphysics professor in college who said that in the history of philosophy, if one were to distill all the arguments to their core with symbolic logic, you'd have only approximately 200 arguments. I've often thought the next step for large language model technology would be to fully work through all 200 of those arguments in a variety of linguistic "timbres" (so to speak) ... Perhaps there is a nice opportunity to perform similar experiments on these philosophical arguments.
I suppose reasoning itself is a kind of motion. You are probably going to speak directly to reasoning in later posts, but I'm curious how you're thinking of building up to it.
Regarding the left-right asymmetrical nature of the brain, it's worth noting that there are lots of exceptions, particularly in left-handed people. fMRI studies usually select against left-handed participants because enough functionality is flip-flopped (or different in some way) that they can completely throw off the experiment. (Which is why I, a lefty, was never able to be a subject in any of my classmates' fMRI studies.) I wonder whether these left/right temporal gyrus responsibilities are flipped in left-handed populations or not.