A Quick Note On Causation
Unifying Pearl with Lakoff and Johnson Through Object-Oriented Motor Control
While I am developing the next post on the architecture of the motivational system, I wanted to pause to address some observations I had regarding two books I am reading simultaneously. The first is Causality by Judea Pearl, and the second is Philosophy in the Flesh by George Lakoff and Mark Johnson, which has a long chapter on cause and effect from an embodied perspective. My observations pertain to how motor planning could induce a functional perspective on objects that would enable graph-based planning in an abstract object space. This planning would resemble Pearl’s causal reasoning, though with some important differences.
Judea Pearl on Causation
Judea Pearl may not be a familiar name to most, but he is without doubt one of the most influential AI researchers of the past forty years. His most significant work is the development of a probabilistic logic of actions called the do-calculus. Most current machine learning, including most deep learning, is based on statistical correlations as expressed using conditional probabilities. Pearl vociferously argued that models based solely on conditional probabilities were inadequate to account for human reasoning, and that more human-like intelligence would require models that could express cause and effect.
To explain briefly why Pearl argued as he did, notice that conditional probabilities are in a certain sense symmetric. If an observation A is correlated with an observation B, then B is also correlated with A. For example, if drunk driving is related to car accidents, then at least to some degree, car accidents are related to drunk driving. The quantitative details of this symmetry are encoded in a formula known as Bayes’ rule, which is at the foundations of probabilistic reasoning in machine learning. But cause and effect are asymmetric almost by definition and certainly in the way we conceive of them. Drunk driving causes car accidents, but car accidents do not cause drunk driving.
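For reference, the symmetry described above can be written out. For two observations A and B with nonzero probability, Bayes’ rule in its standard form is:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\qquad\Longleftrightarrow\qquad
P(A \mid B)\, P(B) = P(B \mid A)\, P(A) = P(A, B)
```

Either conditional can be recovered from the other, so nothing in the formula marks A or B as the cause.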
Humans are interested in identifying causes, not correlations, for the simple reason that if we can identify the cause of an ongoing problem, we might be able to intervene in the world to change the outcome. Pearl’s do-calculus considers the probabilistic effects of such interventions and formulates a complete logical system for describing what can be known about such effects. The name comes from the fact that the logic is a logic of action, that is, of doing. The models that implement this logic are called causal models.
Interestingly, Pearl concluded that cause and effect are independent of statistics precisely due to symmetry, in that the exact same statistics can be generated by two contradictory causal models in which the direction of cause and effect is reversed. Causation, therefore, is not a statistical property, that is, it cannot be elicited purely from counting the occurrences of events or from reasoning about symmetric probabilistic connections. Pearl did discover many situations where causes could be extracted from statistics (identified, in Pearl’s language) and managed to establish necessary and sufficient conditions that describe these situations. But importantly, in situations where these conditions do not apply, statistics cannot identify cause.
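A small numerical sketch may help here. It builds two toy models over binary variables A and B, one in which A causes B and one in which B causes A, and checks that both produce the same joint distribution while disagreeing about the effect of forcing A to be true. The probabilities and all function names are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical numbers, for illustration only.
# Model 1: A causes B, specified by P(A) and P(B | A).
p_a = {1: 0.3, 0: 0.7}
p_b_given_a = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.1, 0: 0.9}}  # indexed [a][b]

def joint_a_causes_b():
    """Joint P(A, B) generated by the model A -> B."""
    return {(a, b): p_a[a] * p_b_given_a[a][b]
            for a, b in product((0, 1), repeat=2)}

def reverse_model(joint):
    """Derive P(B) and P(A | B) for the reversed model B -> A."""
    p_b = {b: sum(joint[(a, b)] for a in (0, 1)) for b in (0, 1)}
    p_a_given_b = {b: {a: joint[(a, b)] / p_b[b] for a in (0, 1)}
                   for b in (0, 1)}
    return p_b, p_a_given_b

def joint_b_causes_a(p_b, p_a_given_b):
    """Joint P(A, B) generated by the model B -> A."""
    return {(a, b): p_b[b] * p_a_given_b[b][a]
            for a, b in product((0, 1), repeat=2)}

joint1 = joint_a_causes_b()
p_b, p_a_given_b = reverse_model(joint1)
joint2 = joint_b_causes_a(p_b, p_a_given_b)

# Observationally, the two models are indistinguishable.
same_statistics = all(abs(joint1[k] - joint2[k]) < 1e-12 for k in joint1)

# Under the intervention do(A = 1), they disagree about B.
p_b1_do_a1_model1 = p_b_given_a[1][1]  # A -> B: B responds to A
p_b1_do_a1_model2 = p_b[1]             # B -> A: B keeps its marginal
```

With these numbers, the first model predicts that B becomes true with probability 0.8 once A is forced, while the second predicts that B is unaffected. No amount of passive counting separates the two.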
But this conclusion raises the question: what is causation? And can causation be observed in any meaningful way? Pearl highlights a statement by the eighteenth-century philosopher David Hume that one event causes another whenever the second event would never have happened but for the first. This expression connects causation to counterfactuals, which have traditionally caused a number of problems for logicians.
Pearl’s Approach to Counterfactuals
Counterfactuals are problematic from a logical point of view because they require one to reason about situations that have not occurred and hence whose truth semantics cannot be assessed. The most well-known philosophical attempts to account for counterfactuals rely on the idea of possible worlds, which regards counterfactuals as a statement not about the real world but about some other world that could exist.
The trouble with possible worlds, of course, is how to distinguish what could exist from what could not possibly exist on the basis of what does exist. We are, after all, starting with counterfactuals, which discuss situations that in point of fact have not happened.
Consider the counterfactual statement that if Newton had not discovered the calculus, then everyone would have heard of Leibniz. In what sense is this a statement about possible worlds? The simplistic interpretation says that in all possible worlds where Newton did not discover the calculus it is also true that everyone has heard of Leibniz. Yet all possible worlds would include worlds in which Leibniz was never born, so this simplistic interpretation is simply too strict.
A less stringent interpretation holds that we should not consider all possible worlds but only those which are similar to the world that exists. In this view, counterfactuals deny some aspects of reality while keeping most of reality intact. Thus when we presume counterfactually that Newton did not discover the calculus, we do not deny that Leibniz also discovered it concurrently. Then we reason that since everyone has heard of Newton, everyone would have heard of the inventor of the calculus, and hence in our alternative world everyone might have heard of Leibniz instead of Newton in the same role.
The invocation of similar possible worlds does not solve the problem of counterfactuals, however, but merely replaces one difficult problem with another. How are we to determine that two worlds are similar? Where is the threshold for similarity that would allow us to conclude that one counterfactual statement is true but another is false because of some sufficiently similar possible world that denies the consequent?
Pearl resolves this difficulty by treating counterfactuals as an intervention on a causal model that can be computed using the do-calculus. His underlying assumption is that probabilistic causal models can represent human reasoning. Thus a human mind would have a representation of reality that recognizes that recent inventors of famous ideas are known by name, but that in cases of multiple invention, people remember only the most prominent inventor. Hence (ignoring his physics discoveries) the fact that Newton discovered the calculus explains and therefore causes the observation that everyone has heard of Newton. By intervening in the model to deny explicitly that Newton discovered the calculus while holding everything else constant, we arrive at the conclusion that Leibniz’s concurrent discovery would therefore make him a household name. Thus Pearl solves the problem of identifying similar worlds by defining them as minimal changes to an underlying causal model, where these changes can be represented formally and often computed. Of note, a similar solution was proposed earlier in the context of linguistics by Fauconnier and Turner.
Pearl’s view of causation and counterfactuals is formally elegant but has a serious flaw in that it provides no reliable mechanism for learning causality. Causality cannot be inferred from statistics except in limited cases, and even then identification of cause requires the assumption of stability or minimality. Yet causation seems to be a core part of how people view the world.
Causation as Abstracted Motor Control
So how could causality relate to actual human brains? In Philosophy in the Flesh, Lakoff and Johnson approach the issue of cause and effect by pointing out from linguistic evidence that there are numerous possible meanings of causation, none of which cohere as a single, overriding formal idea. They then conclude that a concept with such an unstable and variable nature cannot be real in an objective sense, yet it is clearly real in an experiential sense.
From my point of view, the most interesting claim made by Lakoff and Johnson is that one of the sources of causal reasoning, perhaps the primary source, lies in motor control. To quote:
At the heart of causation is its most fundamental case: the manipulation of objects by force, the volitional use of bodily force to change something physically by direct contact in one's immediate environment. (Lakoff & Johnson, Philosophy in the Flesh, pg. 177)
To explain how our sense of causation might arise in this way, we begin from our own perspective as agents in a world that is objectively separate from us, but which we can influence through our actions. We then recognize our own ability to change things in the world, as when we walk down a path (changing our position), kick a ball (changing its position), or open a door (creating new possible actions). We further recognize that not only ourselves but also other agents, human and animal, can effect similar changes, and we thus regard them and ourselves as prototypical causes. Through another transfer, we treat the kinds of movement that could have been caused by a human or animal as having been caused by something, and we name this cause, as in the wind blew the door open. These transferences yield a full range of causes arising from certain physical changes that we can achieve using motion.
Sometimes we only observe the effects of an action and infer the cause, as when we leave a full trashcan by the curb, go to work, return to find the trashcan empty, and conclude that garbagemen have taken the trash during the day. It is not a great step from accepting the existence of hidden causes to the human predilection to impute a cause to any undesired or unexpected change in the world. A little self-reflection easily reveals our own tendency to invent causes for any significant effect, as when we credit our own industriousness as a cause for our material possessions, or when we attribute the incompetence or malevolence of others as a cause for a divorce or a lost job.
When motor control is abstracted to cover all the actions that we represent as transitive verbs, as I tend to believe, causation goes along for the ride. In fact, one might suggest that most of what we call causation is in fact precisely the abstraction of motor control to include cognitive planning and social behavior.
The story above is by no means proven, but it nicely matches the linguistic evidence presented by Lakoff and Johnson. They summarize their observations in a collection of metaphors, including Causes are Forces, Causation is Forced Movement, and Causation is Transfer of Possession. The strongest element of their argument is that actions that can be applied to movement, including blocking, preventing, accelerating, interrupting, and so forth, can also be applied to causes. Thus a cause may try to bring about an effect, but be stopped in mid-course. Their chapter on causation is full of such examples that I will not repeat here.
Objects as Enablers of Motor Control
In my post on motor behavior, I conceptualized motor control as operating on the basis of start and end conditions, but I did not in any way justify the existence of such conditions or describe how these conditions could be identified. An insight that struck me upon reading Lakoff and Johnson was that objects, or at least objects in their relationships to each other, could likely serve as the necessary conditions. This insight arose in part out of some comments from Jordan Peterson on perception: that we do not perceive objects through their structure so much as through the behavioral opportunities they afford us, so that a door is not so much a rectangular object as it is a means of passage from one place to another.
The particular assertion Peterson makes is that some region of the frontal cortex inhibits actions that would otherwise occur automatically upon perception of an object. If this region is damaged, then subjects will apparently walk through an open door immediately upon perceiving it without being able to stop themselves. It is as though the presence of the conditions for action leads inexorably to the action itself.
I will name the idea that objects, or more specifically their representations and binary interrelations, serve as the conditions for motor control the functional object hypothesis for short. The claim is that objects and their binary relationships provide the functional conditions under which actions can be performed. The functional object hypothesis fits nicely into the subcognitive architecture I am designing, because in this architecture, the motivational system must identify which objects from the perceptual system can satisfy the needs of an organism. Need satisfaction is a function, and thus this architecture presumes at the core that objects are in fact functionally identified.
As I discussed, an encapsulation of motor control between start and end conditions permits graph-like motor planning on the conditions. If you want to eat, then you must first approach an eatable thing. To approach an eatable thing, you must identify the place where eatable things can be found; let us call this an eatable place. And so the satisfaction of needs becomes possible through planning on objects and places that are organized according to the needs they satisfy and the actions they afford.
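The eating example can be sketched as a search over conditions. The snippet below is a minimal illustration, not the architecture itself: each hypothetical action is reduced to a single start condition and a single end condition, and a breadth-first search finds a shortest chain of actions from the current condition to the goal. The action and condition names are invented for the example.

```python
from collections import deque

# Hypothetical actions: name -> (start condition, end condition).
ACTIONS = {
    "go_to_eatable_place": ("hungry", "at_eatable_place"),
    "approach_eatable_thing": ("at_eatable_place", "near_eatable_thing"),
    "eat": ("near_eatable_thing", "fed"),
    "wander": ("hungry", "lost"),
}

def plan(start, goal, actions=ACTIONS):
    """Breadth-first search for a shortest action sequence from start to goal.

    Returns a list of action names, or None if the goal is unreachable.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        condition, path = queue.popleft()
        if condition == goal:
            return path
        for name, (pre, post) in actions.items():
            if pre == condition and post not in visited:
                visited.add(post)
                queue.append((post, path + [name]))
    return None

meal_plan = plan("hungry", "fed")
```

Here `meal_plan` is the three-step sequence of finding an eatable place, approaching an eatable thing, and eating, while the dead-end `wander` action is explored and discarded.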
The conditions for an action can sometimes regard the state of an object by itself. For example, a rag must be wet before it can be used to clean a surface, and a bottle must be open before it can be used to drink from. But quite often, the conditions for action regard a relationship. To continue the examples, we must hold the rag in order to wipe with it, and the bottle must be on our lips to drink. As discussed before (here and especially here), human spatial relationships appear to have a binary character that is ultimately represented in the posterior parietal lobe, and this binary character is asymmetric, distinguishing the figure, a smaller and often mobile object, from the ground, a larger and often immobile object. In formulating the functional object hypothesis, I consider the binary spatial relationships that bind an object to other objects as either figure or ground to be a part of the object’s functional representation. Thus an action condition could include the attributes of the object (that is, its adjectival properties) or the binary relationships of the object (its prepositional relationships). To take a drink from a bottle, we require the bottle to be open (adjective) and at the lips (preposition).
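One possible way to encode such a condition is sketched below, assuming that attributes are plain strings and that each binary relationship is a (preposition, ground) pair with the object itself as the figure. The class and field names are hypothetical and serve only to illustrate the adjective and preposition split.

```python
from dataclasses import dataclass, field

@dataclass
class FunctionalObject:
    """An object described by what it affords, not by its shape."""
    name: str
    attributes: set = field(default_factory=set)  # adjectival properties
    relations: set = field(default_factory=set)   # (preposition, ground) pairs

@dataclass(frozen=True)
class ActionCondition:
    """Attributes and relations an object must have for an action to apply."""
    required_attributes: frozenset
    required_relations: frozenset

    def satisfied_by(self, obj):
        return (self.required_attributes <= obj.attributes
                and self.required_relations <= obj.relations)

# Drinking requires the bottle to be open (adjective) and at the lips (preposition).
drink = ActionCondition(frozenset({"open"}), frozenset({("at", "lips")}))

bottle = FunctionalObject("bottle", {"open"}, {("on", "table")})
can_drink_before = drink.satisfied_by(bottle)  # open, but still on the table

bottle.relations = {("at", "lips")}
can_drink_after = drink.satisfied_by(bottle)   # open and at the lips
```

The condition fails while the bottle sits on the table and succeeds once the relationship changes, even though the bottle's own attributes never change.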
Causal-ish Reasoning in Humans
The obvious idea flowing from the discussion above is that perhaps causal reasoning in brains could be implemented as an abstraction of motor planning on functional objects. That is, the ideas about which we reason would be treated as the conditions of abstract actions, and a path of reasoning would be computed analogously to how a path is selected for physical movement. Such reasoning strategies would resemble causal reasoning in Pearl’s systems. These strategies would not in any way resemble Bayesian reasoning with conditional probabilities, which is a good thing, because study after study shows that humans have little intuitive understanding of conditional probabilities. To keep this post from becoming too long, I will only give for now a rough outline of how this reasoning would work.
Pearl gives the example of a causal model for whether a sidewalk is slippery (Causality, Figure 1.2). The model considers the season, whether the sprinkler system has run, whether it has rained, whether the sidewalk is wet, and whether it is slippery, in that causal order. If we observe that the sidewalk is slippery, we can infer from the model that it is wet and hence either the sprinkler has run, or else it has rained. The probability of rain versus the sprinkler system having run will change with the season; it is more likely to have rained in Spring or Fall. If we ask the model whether it rained given that the sidewalk is slippery and the season is Spring, it will compute a probability that integrates this information and assigns a reasonably high value to rain.
But if we turn on the sprinkler and then ask the model the same question, we expect the probability of rain to be reduced to its background state. The fact that the sidewalk is wet no longer provides evidence of rain. Our intervention in the system has changed the outcome. This is an example of how Pearl’s interventions work.
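The difference between observing and intervening can be checked by brute-force enumeration. The sketch below is a simplification of Pearl's model with loudly hypothetical assumptions: the season is fixed to Spring, the probabilities are invented rather than taken from the book, and the sidewalk is wet exactly when it has rained or the sprinkler has run.

```python
from itertools import product

# Hypothetical Spring parameters; not taken from Pearl's book.
P_RAIN = 0.4
P_SPRINKLER = 0.3
P_SLIPPERY_GIVEN_WET = 0.9  # a dry sidewalk is assumed never slippery

def slippery_table(do_sprinkler=None):
    """Return P(rain, sprinkler, slippery=True) for each (rain, sprinkler).

    If do_sprinkler is given, the sprinkler is forced to that value and
    its own probability is dropped, which is the effect of an intervention.
    """
    table = {}
    for rain, sprinkler in product((False, True), repeat=2):
        p = P_RAIN if rain else 1 - P_RAIN
        if do_sprinkler is None:
            p *= P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
        elif sprinkler != do_sprinkler:
            continue  # impossible under the intervention
        wet = rain or sprinkler  # assumed deterministic
        table[(rain, sprinkler)] = p * (P_SLIPPERY_GIVEN_WET if wet else 0.0)
    return table

def p_rain_given_slippery(do_sprinkler=None):
    """P(rain | slippery), optionally under do(sprinkler)."""
    table = slippery_table(do_sprinkler)
    total = sum(table.values())
    rainy = sum(p for (rain, _), p in table.items() if rain)
    return rainy / total

observed = p_rain_given_slippery()                      # about 0.69
intervened = p_rain_given_slippery(do_sprinkler=True)   # back to 0.4
```

Under these assumptions, seeing a slippery sidewalk raises the probability of rain from 0.4 to roughly 0.69, but once we turn the sprinkler on ourselves, the slippery sidewalk carries no information about rain and the probability returns to 0.4.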
Now consider how such reasoning would work under the functional object hypothesis. We observe the slippery sidewalk as associated with a wet sidewalk, such that the two representations should overlap or coincide. A wet sidewalk is the functional result of falling water. Falling water is a functional result of either clouds or sprinklers, or numerous other conditions. Thus we can reason from the slippery sidewalk to either rain or sprinklers or something else using purely associational or functional links. But there the similarity ends, because the human mind is not a probabilistic model.
A human will not naturally assign a probability to either rain or sprinklers, nor will a human naturally exclude all possibilities other than rain or sprinklers at that stage. If asked whether the sidewalk is wet because of rain or sprinklers, a human would typically look around for more evidence. The human would specifically know that sprinklers and rain each dispense water in patterns that could be observed from the environment and would seek to observe those patterns. In this way, a conclusion would be reached that rules in or out each of sprinklers and rain based on observation. A human would balk at being asked to treat sprinklers and rain as mutually exclusive possibilities, because the human would know the two are independent. And if you or I, being human, only heard that the sidewalk was slippery but could not observe the environment and were then asked whether it rained or the sprinklers were on, we would most likely respond with one of the following responses:
“I don’t know.”
“What time of day was it?”
“What time of year was it?”
“Maybe it rained, or maybe the sprinklers were on, or maybe something else.”
“Do I have to pick one?”
“What kind of canned question is this?”
“How am I supposed to know?”
“Why are you asking me?”
The point is that causal models make the closed world assumption that no further information is available, whereas humans will require the information they need in order to give a reasonable answer (though we will happily assume information without proof once an answer seems reasonable).
Summary
Causation is a fundamentally important aspect of human reasoning that should not be ignored. Pearl’s causal models provide a formal foundation for causal reasoning, but they fail to account for how causes can be learned from sensory observations. Lakoff and Johnson identify self-motion in the world as the source of many linguistic formulations of causation, opening the possibility that causation could be learned as an expanded metaphor for human or animal action. If so, then causal reasoning could be performed in the brain as an abstraction of motor path planning based on functional representations of objects and their binary interrelationships. There is far more to develop here, and I will return to this topic in future posts to address how cognition could arise through paths in abstract spaces.
For now, I will continue in the next post with an account of the motivational system and its connections to perception and action through the medium of spatial maps. Thanks for reading, and please leave any comments below!
This reminds me of Aristotle's Metaphysics and the cosmological argument derived from his meditation on causality - the idea that if C was caused by B, and B was caused by A ... and so on ... eventually there must be an 'unmoved mover'. I'm likely missing some nuance, but it almost seems like Pearl's formulation is a solipsistic metaphysics where human consciousness is the unmoved mover - what do you think?