I have divided subcognition into three parts: perceptual, motivational, and behavioral. Previous posts have examined the components of the perceptual and motivational systems. This post will begin to address the architecture of the behavioral system by presenting a set of principles for motor control.
When speaking about the behavioral system, I am focusing primarily on motor control. I will use the term action to indicate a complete, coordinated set of movements at human scale, such as walking, reaching, grasping, throwing, laughing, speaking, jumping, and so on. Based on the prior discussion in my posts on behavior and spatial relationships, there are seven principles of motor control that I will discuss here:
Perception focused: Perception, especially spatial perception, defines and drives action.
Causally organized: Actions can be represented as causal relationships among individual perceptual attributes.
Spatially oriented: Movement consists of maintaining a dynamic spatial relationship and a manner of execution.
Planned through waypoints: Movement implies a path that is specified according to spatial relationships with waypoints; handling multiple waypoints requires planning.
Varying in complexity: Movements can be categorized on a gradient from simple to complex, from those that involve a few neighboring muscles to those using the whole body.
Event-timed: Action timing can be coordinated with respect to perceptual events, internal or external, that serve as conditioning factors for movement.
Spatially dichotomous: Actions can be egocentric or allocentric in nature, resulting in a fundamental dichotomy of object / trajectory vs. place / route.
Perception defines and drives actions
Said otherwise, a basic conscious action is defined with reference to some high-level aspect of perception that it manages. The converse is also true: perception is organized for the purpose of supporting action. Together, perception and action form a cohesive loop. We do not perceive the world in itself; rather, we perceive the world as an arena for human-scale actions. Our actions then do not need to operate on minute percepts, but can rather use the gross, high-level outputs of the perceptual system. We do not move our hands at speed v along path p to attain a target object x. Rather, we reach for the object quite naturally, and reaching is precisely what we perceive it to be. There is no hidden mathematical code to the movement, no secret calculations to be unlocked. The perception of the action is the control signal for the action itself.
Imagine implementing a reaching task in robotics. The typical approach would be to define a three-dimensional coordinate system, figure out the x,y,z coordinates of both the reach target and the robot hand, and then calculate the motor controls using inverse kinematics. The key characteristic is that the entire task is mediated through x,y,z coordinates: the position of the hand, the position of the object, the location of any obstacles, the path of each robot part through space — all of these are computed with reference to the mathematical, measurement-oriented, three-dimensional notion of space. To accomplish the task, the topology of the robot and the physics of its motors are explicitly and precisely taken into account. The roboticist cannot move the robot without a lot of complex math.
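To make the contrast concrete, here is a minimal sketch of that coordinate-mediated approach for a hypothetical two-link planar arm (the link lengths and function name are illustrative, not any particular robot's API):

```python
import math

def two_link_ik(x, y, l1=0.3, l2=0.25):
    """Analytic inverse kinematics for a two-link planar arm.

    Every step is mediated through metric quantities: the target
    (x, y) in meters, the link lengths, and the exact joint angles
    that place the end effector on the target.
    """
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= c2 <= 1.0:
        raise ValueError("target out of reach")
    elbow = math.acos(c2)  # law of cosines
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
    return shoulder, elbow  # radians; the controller drives to these

# The robot cannot move until this math has been done somewhere.
angles = two_link_ik(0.35, 0.20)
```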
As we have previously discussed, the human perceptual system does not treat space in this measurement-oriented fashion. And I contend that the human behavioral system does not use a metric-like system either. Instead, motor control is tightly driven by visual or other sensory feedback. Of course we can make movements without seeing what we are doing, but it is somewhat harder. To succeed we must either intensively use spatial memory (if you dare, try looking at a glass of water, turning your back to it, and picking it up in a rapid, smooth motion without spilling water) or else fall back on laboriously learned routines (as we do in dancing or when playing an instrument).
To pick up a cup of water, we look at it, and then move our hand towards the cup at a reasonable speed with the hand opening to face the cup as we do it. The maintenance of perceived towards-ness, combined with the speed of the motion and the shaping of the hand, is the action itself. There is no further depth to it. There are subcalculations in the form of neural activity, but this activity is subconscious. The operation of the action, at a conscious level, is nothing more and nothing less than the first sentence of this paragraph. Language is an expression of how we perceive and manage high-level motor control. Furthermore, the subconscious calculations that do occur are concerned with maintaining the high-level percepts of towards, reasonable speed, and open hand facing the cup. You could not extract the task Jacobian from the underlying neural firing if you wanted to; the neural firing in the neocortex is not organized around partial derivatives. The spatial relationship, motion included, is the action itself. Perception drives action, and action drives perception.
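As a sketch of what I mean, here is the same reach written as a feedback loop over coarse, categorical percepts rather than coordinates. The functions perceive, move_hand, and shape_hand are hypothetical placeholders for the perceptual and motor systems, not a real API:

```python
def reach_for(cup, perceive, move_hand, shape_hand):
    """Maintain 'towards', 'moderate speed', and 'open hand facing
    the cup' until 'on' is perceived. No coordinates, no kinematics;
    the percepts themselves are the control signals."""
    shape_hand("open", facing=cup)                 # manner of execution
    while perceive("relation", "palm", cup) != "on":
        if perceive("relation", "palm", cup) != "towards":
            move_hand(towards=cup)                 # restore towards-ness
        if perceive("speed", "palm") != "moderate":
            move_hand(speed="moderate")            # categorical, not metric
```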
Actions are causal relationships between individual perceptual attributes
Cause and effect are central topics of philosophy, especially Western philosophy, and for good reason. But I believe that, at its essence, cause and effect is nothing other than a cognitive generalization of how human action intervenes in the real world in order to change outcomes. I discussed this view at length here. I use the word causal in this sense.
More broadly, actions provide a path to change the world. But actions are driven by perception, and change in the world is encapsulated through perception as a transition in the state of perceptual attributes. I have referred to the perception of change as a change process. Any sequence of percepts can imply an action that we may recognize as a causal process. So, we perceive a closed door followed by an open door, and we conclude that someone opened the door. The action of opening connects the closed door to the open door. Conversely, if we have a goal of entering a room and encounter a closed door, the action of opening suggests itself. In this sense the action is the causal relationship between the closed door and the open door.
Within this causal relationship, we notice a tendency for actions to focus on a single perceptual attribute: open vs. closed, towards vs. away. Our actions generally do not transform multiple attributes unless those attributes are inextricably linked.
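One way to picture this, purely as illustration, is a table of causal links over single perceptual attributes. Run forward, a perceived transition implies an action; run backward, a goal state suggests one. The attribute and action names below are mine, not a claim about how the brain stores these relations:

```python
# Causal links between values of one perceptual attribute,
# indexed as (attribute, state before, state after).
CAUSAL_LINKS = {
    ("door", "closed", "open"): "open",
    ("door", "open", "closed"): "close",
    ("hand-to-object", "away", "against"): "hit",
    ("hand-to-object", "away", "holding"): "grasp",
}

def explain(attribute, before, after):
    """Perceived transition -> inferred action (someone opened the door)."""
    return CAUSAL_LINKS.get((attribute, before, after))

def suggest(attribute, current, goal):
    """Goal state -> suggested action (a closed door suggests opening)."""
    return CAUSAL_LINKS.get((attribute, current, goal))

assert explain("door", "closed", "open") == "open"
assert suggest("door", "closed", "open") == "open"
```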
Movement = Spatial Relation + Manner
As just mentioned, a reaching task can be broken down into the maintenance of towards-ness, the rough categorical speed of the arm movement, and the shaping and orientation of the hand. I suggest that most natural physical movements can be described in a similar manner. Consider a few examples:
To hold is to maintain in-ness or with-ness
To grasp is to hold with the addition of around-ness
To hit is to maintain the hand at towards-ness at high speed until against-ness
To punch is to hit with the hand closed into a fist
To slap is to hit with the hand open and facing the target
To walk is to maintain forward-ness using cyclic motion of the legs
To touch is to maintain towards-ness at low speed until on-ness
To hug is to maintain around-ness with the arms while squeezing moderately
Thus motor control for physical movements can be achieved through sensory feedback on perceptual qualities, especially spatial relationships, whether binary or unary. Recall that spatial relationships as perceived by the brain include motion. So towards can only be perceived when there is motion, and this motion includes, and sometimes originates in, self-motion. Maintaining the perception of towards-ness requires movement, and I claim that this percept (as opposed to some lower-level percept) drives the sensory feedback loop for motor control.
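A rough data model for this decomposition, with an illustrative (and entirely hypothetical) vocabulary of relations and manners, might look like this:

```python
from dataclasses import dataclass, field

@dataclass
class Movement:
    """Movement = spatial relation to maintain + manner of execution.
    All field values are coarse perceptual categories, never
    measurements; the vocabulary here is illustrative."""
    relation: str                                # towards, around, forward, ...
    effector: str                                # hand, arms, legs, ...
    manner: dict = field(default_factory=dict)   # speed, shape, cycle, ...
    until: str = ""                              # terminating relation, if any

hit   = Movement("towards", "hand", {"speed": "high"}, until="against")
touch = Movement("towards", "hand", {"speed": "low"},  until="on")
walk  = Movement("forward", "legs", {"cycle": "gait"})
hug   = Movement("around",  "arms", {"squeeze": "moderate"})
```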
Obstacle avoidance uses multistep motor control
To reach for an object requires the hand and arm to traverse a region in space. The standard reaching motion follows a minimal energy path that I will refer to as the natural path. The reach is feasible if the natural path is clear of obstacles, and then the reach can be performed by maintaining the towards relationship between the palm of the hand and the target object. But if there are obstacles, then the obstacles must be removed or avoided. If the choice is made to remove the obstacles, then a multi-step action plan is obviously required. But what if the object can be reached simply by following a path other than the natural path?
The problem with introducing another path is that we then require a mechanism for judging adherence to the path. In many robotics applications, we would solve this problem by fully planning the path with a motion planner, ensuring the entire path is free of obstacles, and then issuing motor commands that follow the planned path as exactly as possible. It should come as no surprise that I think the human mind uses a simpler method relying on binary spatial relationships.
Suppose that we recognize each obstacle as a waypoint. Then we can make an avoiding motion that goes around, between, through, or away from the obstacle. Once the obstacle has been avoided, we can go towards the original reach target. Of course, the manner in which we avoid the obstacle must be consistent with the ultimate target, but that can be handled through high-level planning.
Thus, if we wish to pick up a phone that has fallen under the bed, we do not need to plan a complex trajectory for our arm that avoids the bed. Instead, we simply reach under the bed and towards the phone. Again, the language reveals the underlying structure of the motor plan, which has two stages: first, passing the hand under the bed such that it is pointed towards the phone, and then reaching naturally towards the phone after the hand has achieved the under relationship with respect to the bed. If we have left our shoes beside the bed at that location, then there is a third stage: we must go around the shoes, under the bed, and towards the phone. If this account is correct, then the obstacle-avoiding path to retrieve the phone can be staged as a sequence of motions defined by the maintenance or establishment of binary spatial relationships recognized by perception.
More complex paths can be planned in the same way, as a sequence of movements with respect to a series of waypoints, each intended to establish a new spatial relationship that enables the next action. This statement holds for both egocentric motions towards objects and allocentric motions towards places (walking a path). As an aside, this sort of waypoint-oriented planning seems to involve the lateral posterior parietal cortex.
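Under this account, the phone-under-the-bed reach might be staged as nothing more than a short list of (relation, landmark) pairs, each pursued by feedback until perception reports that the relation holds. As before, perceive and move are hypothetical placeholders for the perceptual and motor systems:

```python
# Each stage establishes one binary spatial relation; no metric
# trajectory is ever computed.
plan = [
    ("around",  "shoes"),   # avoid the obstacle beside the bed
    ("under",   "bed"),     # establish under-ness with the bed
    ("towards", "phone"),   # then reach naturally for the target
]

def execute(plan, perceive, move):
    for relation, landmark in plan:
        while not perceive(relation, "hand", landmark):
            move("hand", relation, landmark)   # feedback until it holds
```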
Movements range from simple to complex
The complexity of a movement can be assessed according to the number and location of distinct muscle groups that must be engaged. Simple movements involve only one limb or group of muscles; these are executed by the primary motor cortex, at the back of the frontal lobe, just in front of the somatosensory cortex. Simple movements include bringing the hand to the mouth, grasping an object with one hand, or reaching towards an unobstructed point. Control of individual motor groups is mapped topographically onto the primary motor cortex according to the motor homunculus, which I have shown before, and which has extra capacity for fine-scale movements of the hand and face.
Complex movements, on the other hand, engage multiple motor groups and typically involve left-right coordination. These movements are generally coordinated through the premotor cortex, which has outputs both to the primary motor cortex and directly to the spinal cord. The premotor cortex is responsible for coordinating actions. Thus, during a complex or extended reach, the primary motor cortex controls the arm while the premotor cortex directly adjusts the muscles of the trunk and shoulder to assist the reach.
Complex movements require greater motor planning and by nature involve multiple simultaneous movements, each one supporting the other.
Actions are timed by events, internal or external
The brain has no reference to an absolute clock and does not time actions against an external standard. It does, however, have a relative, periodic notion of timing available to it in two ways.
First, it is well known that an electrode attached to the brain will detect an oscillating electrical signal; these are the brainwaves, and they are observed at various speeds, from the fastest gamma waves at 40 Hertz or more during intense concentration down to the slowest delta waves below 4 Hertz during deep sleep. Brainwave frequencies vary considerably, and this variation helps us synchronize ourselves with the timing and importance of the tasks we are performing. In a sense, these brainwaves act like the clock-rate of a computer, governing the speed at which instructions are executed. But unlike computers, there is no counter or absolute time associated with the base clock-rate, and hence these waves cannot provide a reliable timing mechanism for managing action execution at timescales beyond the sub-second level.
Second, we are aware of our heartbeat and blood flow. Go out for a brisk walk, then stop and pay attention to yourself. You can feel the blood coursing through your veins. This feeling provides a slower clock-rate than the brainwaves: roughly 1-2 Hertz (about a second per period). Since we can consciously sense the circulatory rate, it is likely available for motor timing. Again, however, the heartbeat is variable.
The brain has no obvious innate tools to implement action timing at scales much longer than a second. If you do not believe me, try playing an instrument at an extremely slow tempo (say, 20 beats per minute) without internally counting subintervals. You will find it extraordinarily difficult.
To get around this difficulty, actions can be timed based on external events. So, for example, to help with the slow tempo, you could use a metronome, which provides an externally perceptible indication of the right timing. Or you could internally count subintervals at a rate of about one per second, roughly the heart rate. In either case, you will have engaged event-based timing, which is the only timing available at the timescale of conscious thought.
Event-based timing is one type of conditional action execution. When you execute the steps of an action plan to coincide with or quickly follow a perceptible event, you are dependent on the percept for the action to occur. If the event does not occur, then the action does not occur. We perceive this conditional execution as waiting. Conditional execution of actions is regulated by the premotor cortex. The lateral premotor cortex responds to externally perceived events, such as the metronome, whereas the medial (central) premotor cortex responds to internally perceived events, such as the internal counting. This division of responsibilities follows a general organizational principle of the brain whereby the medial surface of the neocortex is responsible for self-perception and social perception, while the lateral surface manages interfaces with the external world.
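In program form, event-based timing is just conditional execution: the next step of the plan waits on a percept. Here is a minimal sketch of the 20-beats-per-minute exercise, counting roughly 1 Hertz subintervals, with time.sleep standing in for the internal or external event source:

```python
import time

def play_slow_beats(n_beats, subintervals=3, period=1.0):
    """Play at ~20 bpm by counting three ~1 Hz sub-events per beat
    (roughly the heart rate) instead of estimating a three-second
    interval directly."""
    for beat in range(n_beats):
        for _ in range(subintervals):
            time.sleep(period)        # wait for the next sub-event
        print("beat", beat + 1)       # the action fires on the event
```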
Actions can be egocentric or allocentric
As we have seen repeatedly, the brain contains a fundamental dichotomy between an egocentric positioning system converging on the parietal lobe and an allocentric navigation system centered on the hippocampus, with both systems integrated around the retrosplenial cortex. Actions may be performed relative to objects (grasping, reaching) or places (going, leaving), reflecting the divide between the egocentric and allocentric systems, respectively.
Interestingly, actions with respect to places require full-body coordination in tasks such as walking, running, swimming, or driving. We would then expect such actions to be complex rather than simple by nature, and hence we would look for them to be coordinated in brain regions even further removed from the primary motor cortex, say, in the prefrontal cortex or the most forward parts of the premotor cortex.
Actions with respect to objects may sometimes be simple in the sense of using fewer motor groups, but they are complex in another way, because they rely more heavily on visual feedback. Thus in more egocentric tasks we have a greater dependence on hand-eye coordination, which involves the frontal eye fields and implies strong connections between these fields and the parietal lobe; such connections do in fact exist.
Either way, actions with respect to objects and actions with respect to places exhibit different patterns of interaction and dependence, and we expect these differences to be expressed in the architecture of motor control.
Conclusion
In the next post, I will attempt to present an architecture for a motor control system that is perception focused, causally organized, spatially oriented, planned through waypoints, varying in complexity, event-timed, and spatially dichotomous.
Thanks for reading, and please leave any comments below!