The origin of language in gesture–speech unity

Part 3: Mead’s Loop (1).

by Professor David McNeill

Part 1 of this series put forth the idea that language is inseparable from imagery, in particular the imagery of gesture, and that theories of language origin can be judged by how well they predict this gesture–speech unity.  The second part applied the test to a widely held origin theory, gesture-first, and found it wanting – doubly so, in fact. This part applies the test to a new hypothesis, which I call “Mead’s Loop.”

Mead’s Loop holds that gesture was essential in the origin of language.  In this it agrees with gesture-first, but it differs in holding that gesture and speech had to be naturally selected together.  Rather than one preceding the other, gesture and speech were what Liesbet Quaeghebeur, philosopher at the University of Antwerp, has called “equiprimordial,” the antithesis of gesture- or speech-first.

Mead’s Loop rests upon an idea from the early 20th Century philosopher, George Herbert Mead, formulated as an origin hypothesis to portray what, some one-half to one million years ago, emerged in the evolution of the human brain. It posits the mirror neuron circuits that gesture-first also assumes, but again with a difference. Mirror neurons in Mead’s Loop were “twisted” to respond to one’s own gestures as if they were from someone else.

Mirror neurons have been directly recorded in monkeys and reside supposedly in all primate brains, including ours.  Part 2 quoted Rizzolatti’s and Arbib’s definition. A Wikipedia article also defines it succinctly: “[a] mirror neuron is a neuron that fires both when an animal acts and when the animal observes the same action performed by another.” I call these mirror neurons “straight,” to distinguish them from Mead’s Loop.  Note what they provide.  The significance of the straight mirror neuron response is that of the action it mimics.  For example, seeing someone picking up a treat, the mirror neuron repeats this action, with its meaning. The action of another is repeated (not necessarily overtly but at the orchestration level) and it becomes one’s own. If the mirror neuron circuit produces a gesture it will be a mimicked action like the one perceived. It in fact resembles pantomime, a gesture, as we saw in Part 2, that systematically blocks gesture–speech unities.

Mead’s Loop refers to a posited new adaptation, a thought-language-hand link, located at least in part in the area now called Broca’s Area (other brain areas also must have been involved). Here is the twist:  G. H. Mead said that a gesture is meaningful when it evokes the same response in the one making it as it evokes in the one receiving it. For evolution, this suggests that mirror neurons came to bring one’s own gesture imagery and its significance into Broca’s Area, the motor area for orchestrating actions including speech and gesture. While straight mirror neurons reproduce the actions of another, with meanings that are those of the actions, the Mead’s Loop twist responds to one’s own gestures as if from another, and brings different meanings into the action-orchestration areas of the brain, those of the gestures.

The Mead’s Loop twist, because it brings the gesture’s meaning into the orchestration process, merges and synchronizes speech with gesture at points where they co-express the same idea. Hence the unity: it is built in. In all of this the gesture is fundamental. Mead’s Loop creates “new actions,” actions orchestrated under significances other than their practical goal-directed meanings – those of the gestures that Mead’s Loop imports.  Because Mead’s Loop gave gestures the power to orchestrate speech, Mead’s Loop was the beginning of everything in language.

These achievements opened a door to language dynamically. Mead’s Loop had both semiotic and motor effects:

  • Semiotically, it brought the gesture’s meaning into the mirror neuron area. Mirror neurons no longer were confined to the semiosis of actions. One’s own gestures entered, opening action control to the imagery of gesture. Extended by metaphoricity, the significance of imagery is unlimited. So from this one change, the meaning potential of language moved away from only action and expanded vastly.
  • At the motor level, in the areas of the brain where speech movements are orchestrated, Mead’s Loop enabled significant imagery – gesture – to “chunk” motor control of the vocal tract and diaphragm, and laid the foundation of the growth point (GP).

How does Mead’s Loop produce gesture–speech unity?  As mentioned, it was built in from the start.  The evolutionary step was a self-response by mirror neurons.  Mirror neurons complete Mead’s Loop in a part of the brain where action sequences are organized – two kinds of sequential actions, speech and gesture, converging, with meaningful imagery the integral component. Co-opting sequential actions by a socially referenced stimulus (imagery) provides a new kind of action in the vocal tract – speech, with its own movements, timing, tongue postures, and breathing. Mead’s Loop thus explains what gesture-first could not: why gesture and speech are unified.

By treating imagery as a social stimulus, Mead’s Loop also explains why gestures occur preferentially in a social context of some kind (face-to-face, on the phone, but not alone talking to a tape recorder).

But was the twist needed?  It was, because the gesture, although emanating with full meaning from the same brain area as speech, does not unite with it. It is neither synchronous nor co-expressive. It is incomplete. Gesture–speech unity happens only when the gesture gets a self-response via Mead’s Loop and becomes able to orchestrate speech movements (not sequentially, but self-response is an essential aspect of the gesture’s meaning, the meaning under which speech and gesture combine). This was Mead’s insight. He recognized that gesture (and speech) have fundamentally a social character and, to be meaningful, must have a social/public presence: the gesture, he said, evokes the same response in the one making it as in the one receiving it. With Mead’s Loop, this occurs when the one making and the one receiving are the same; this is the “twist”; then the gesture is a meaningful and socially pertinent event, with the potential to connect to everything else in language dynamically. It orchestrates vocal and manual movements.

Then speech passes from “display” (of which chimps are capable) to “communicating messages to the other” (a phrase from Werner & Kaplan).

Straight mirror neurons do not respond because there is no external action; only the “twist” can self-respond in this way.

A self-response to the gesture can pick up other meanings as well, and these can further cement gesture–speech unity.  In the process the gesture also changes to meet its role of forming a unit with speech. In Part 4 of this series I’ll show gestures reshaped by gesture–speech unity.

Mead’s Loop also gains substance for a reason identified by Merleau-Ponty: “Language … presents or rather it is the subject’s taking up of a position in the world of his meanings” (p. 193).  Via Mead’s Loop and its social reference, the gesture takes up its position in the world of meanings as well.  This move equally reshapes the gesture, in keeping with gesture–speech unity.

That Mead’s Loop gave one’s gestures a public, social significance had importance for another reason, natural selection. It meant that the Mead’s Loop twist was adaptive in social-interactive situations (so those favorites, “man-the-tool-maker” and “man-the-hunter,” would be incidental to language origin, effective insofar as they are also social but not significant in themselves). The social reference gave adults, in particular mothers inculcating cultural norms in infants, the sense of being an instructor as opposed to being just a doer with an onlooker (which is what happens with chimpanzees).  Entire cultural practices of human childrearing depend upon this sense. The adult must be sensitive to her own gestures as social/public actions. Hence the adaptiveness of Mead’s Loop.  Sensing actions as social impacts the next generation of children who, as a result of it, do better at coping, and pass it on.

Origin of syntax.  To many the origin of patterned language, of syntax, is the crux of the origin of language as a whole. How, when, or even why syntax emerged is far from obvious. Proposals range from a “big bang” single mutation, through cultural practices such as ritual or grooming, to no special sources at all, just a natural by-product of human intelligence in general.  Whatever it was, over eons it has led to vast crosslinguistic diversity. I follow Eric Lenneberg and affirm that syntax rests on a biological foundation, hence is a topic in the origin of language.

The basic idea stemming from Mead’s Loop is that words and syntax are continuations of GPs. They and GPs are linked organically. I seek the natural selection of syntax (the general ability, not specific constructions, although some constructions also could have been naturally selected) in three places – the nature of the GP and its unpacking; the new paths this opened; and shareability. These in turn suggest three kinds of adaptive advantages.

First, syntax is crucial for a GP dialectic. Without morphs and combinations of morphs there cannot be a semiotic opposition to gesture imagery.

Second and linked, syntax stabilizes the dialectic. It is the resting point par excellence.

Third, syntax helps make language shareable in sociocultural encounters.

Any or all of these factors could have favored an ability to form syntactic patterns, defined generally as creating meaningful wholes out of segmented elements (morphs); meeting standards of form; providing cultural identity; learning this system, and transmitting and maintaining it over space and time. We are focusing on the dynamic dimension of language. This dimension crosscuts the static and is not reducible to it (nor vice versa, the static is not reducible to the dynamic; they are two dimensions, not one dimension in two forms).

That the static and dynamic arose together, were equiprimordial, is explained by Mead’s Loop’s built-in social referencing, combined with gesture imagery. From this vantage point, we can claim that words and sentences continue the evolution of the GP. Contrary to traditions both philological and Biblical, language did not begin with a “first word.” Words emerged from GPs. There was an emerging ability to differentiate newsworthy points in contexts; a first gesture–speech unit but not a first word.

The paradox of an emerging syntax is that it is almost invisible in current humans. Children learn their language with speed, but they are given a language, not inventing one. When gestures are forced to be the sole medium of communication in experiments, however, they quickly develop linguistic values original to the speaker and the situation, not borrowed from an existing language, including novel axes of selection (paradigmatic values) and combination (syntagmatic values), suggesting a faculty for syntactic innovation in current-day humans. It is this hidden ability, we propose, that arose out of GPs at the origin.

Whence such a faculty for syntactic invention?  An important insight is “shareability” from a 1983 paper by Jennifer Freyd. To share information imposes a “discreteness filter” such that the semiotic properties of words (discreteness) and word combinations arise.  Shareability would have existed at the dawn.  It also existed in the gesture-communication experiments, so conditions for new word forms and combinations existed in both. Words and combinations of words are part of the GP’s imagery-language dialectic, both opposing codified linguistic form to gesture semiotically, and providing a dialectic stop-order through unpacking.

GPs and syntax thus emerged together according to Mead’s Loop, and could do so because in Mead’s Loop the gesture assumes the guise of a social other and invites shareability from the beginning. Here began the static dimension of language.

Most of the static dimension, however, is not biological but socio-cultural and historical, shaped over time. To have forms that are repeatable, standardized, and non-context-bound makes them durable and portable from encounter to encounter, where they can be reshaped by intragroup and intergroup encounters, including migrations where newcomers encounter existing populations (there may be spontaneous “mutations” beyond encounters as well).

This in itself would have given syntactic innovation adaptive value and replaced temporal order syntax with morph elaborations, releasing static structure meanings from temporal sequence. The primordial syntax according to Mead’s Loop was mapping meanings onto temporal sequences. The orchestration of actions under some significance with shareability allots meaning fragments to ordered segments of time.  The response to encounters is to shake up this temporal syntax. Given gesture–speech unity, gestures change too (the examples in the fourth part of this series illustrate this).

The cumulative effect would have been to liberate temporal sequence for other expressive functions, some of which may also take part in imagery–language dialectics on multiple levels. Edward Sapir long ago divided the world’s languages according to how they combine meanings into single words – analytic or isolating, relying on temporal sequence (e.g., Chinese), synthetic, with some liberation (e.g., Latin), or polysynthetic, with much freedom (e.g., Inuit), which reflect degrees of adornment of the basic brain orchestration plan (and English, with its relatively fixed word orders, is one of the less adorned).

Rethinking language as action control. Speech according to Mead’s Loop, among other things, is thus a culturally mandated action, orchestrated by imagery. Action is a target of natural selection in any case, and in the selection scenario where Mead’s Loop had adaptability, adults inculcating cultural norms in infants, the overt actions of the adults fed natural selection. Linguistic standards are not only about “good forms” but also about “good actions.”  The discovery of the FOXP2 gene points to the centrality of action control at the foundation of language. The mutation in the KE family that led to its discovery affects fine motor control, speech articulation and other actions, as well as syntax. As a gene affecting fine-tuned action control, it would influence the raw material on which Mead’s Loop and its new form of action worked (the Mead’s Loop innovation itself would be something else genetically). The gene (actually, a transcription factor, a genetic “on-off” switch), which differs in the human version compared to that in chimps, has undergone accelerated evolution and when implanted into engineered mice changes vocalization. Taking this lead, we can consider syntax as a form of culturally authorized action control of the vocal organs, the hands and other body parts.

Brain model. The language centers of the brain have classically been regarded as just two, Wernicke’s and Broca’s areas. But if we are on the right track with Mead’s Loop, many other areas of the brain are involved and are equally “language areas.” Broca’s Area itself is not a “language area” but a region for complex action orchestration under various significances. Typical item-recognition, memory and production tests would not tap these other brain regions, but discourse, conversation, play, work, and the exigencies of language in daily life (where language originated) would.  Broca’s area may be the convergence point of Mead’s Loop and the imagery–language dialectic, including unpacking, but other areas – the left rear hemisphere (categorial content in GPs), the right hemisphere (imagery and metaphor), and the prefrontal cortex (the alternatives a GP differentiates) – can equally be called the “language areas” of the brain. Thought-language-hand links tie them together when the dynamic dimension of language is engaged.

Selection scenario. The family, particularly in its child-rearing aspects, is an environment where the social/public value of one’s own gestures is adaptive, and where Mead’s Loop could have been naturally selected (no doubt Mead’s Loop was adaptive in other contexts as well). Archeologists date the dawn of family life (with cooking hearths) to about one million years ago, implying a stable family membership and a division of labor. So it was possibly back then that the natural selection of Mead’s Loop also began.

The focus of this selection pressure was adults, women in particular. In this scenario language began in adults, in the form of mothers instructing infants. Their infants, both female and male, would benefit from superior cultural inculcation, and so become more able to carry on any genetic disposition for Mead’s Loop themselves.

Did Neanderthals speak?  The Neanderthal genome project has shown that this extinct form of human also had FOXP2, and also may have been capable of fine motor control. Whether this control covered the vocal tract is unknown but speech seems not impossible.  Some have suggested that the Neanderthal brain, although large, had a different developmental time course from that of human children (much briefer) and did not sustain robust activity of the prefrontal cortex.  A short ontogenesis meant less time for any GP-like development. The prefrontal cortex, among its other functions, arranges and selects alternatives. The formation of the contexts that GPs differentiate is a place in language where this ability is tapped.  Weakened contexts would have yielded cognitive inflexibility and gesture–speech redundancy rather than unity. Any GP-like dynamics is thus also likely to have been muted.

Even if Neanderthals could speak, their speech is likely to have been temporally sequenced, and limited to what Derek Bickerton posited as proto-language and what Martin Braine called pivot grammars, each pivot a separate “grammar” unto itself. A collection of disparate pivot grammars, lacking an overall system, may have been their highest achievement. Possible gestures would be gesture-first-like pantomimes and pointing (available to today’s sub-two-year-olds).  Kindly opinion is that our ancestors had nothing directly to do with the Neanderthal extinction but we may have out-competed them. A cultural superiority over cognitive inflexibility and a limited, single semiotic (a profile not unlike Down syndrome) could have been fatal, if unintended.

Further Reading

Adult–Infant inculcation

Hrdy, Sarah Blaffer. 2009. Mothers and others: The evolutionary origins of mutual understanding.  Harvard.

Tomasello, Michael. 1999. The Cultural Origins of Human Cognition. Harvard.

Brain model

McNeill, David, & Pedelty, Laura. 1995. Right brain and gesture.  In K. Emmorey & J. Reilly (eds.), Sign, Gesture, and Space, pp. 63-85.  Erlbaum.

Nishitani, Nobuyuki, Schürmann, Martin, Amunts, Katrin and Hari, Riitta. 2005. ‘Broca’s region: from action to language.’ Physiology 20: 60-69.

Where language began

Atkinson, Quentin D. 2011. ‘Phonemic diversity supports a serial founder effect model of language expansion from Africa.’ Science 332: 346-349.

Mead’s Loop “twist”

Cohen, Akiba A. 1977. ‘The communicative function of hand illustrators.’ Journal of Communication 27: 54-63.

McNeill, David, Duncan, Susan D., Cole, Jonathan, Gallagher, Shaun and Bertenthal, Bennett. 2008. ‘Growth points from the very beginning.’ Interaction Studies (special issue on proto-language, D. Bickerton and M. Arbib, eds.) 9: 117-132.

Mead, George Herbert. 1974. Mind, self, and society from the standpoint of a social behaviorist (C. W. Morris ed. and introduction).  Chicago.

Merleau-Ponty, Maurice. 1962.  Phenomenology of Perception (C. Smith, trans.). Routledge.

Neanderthals

Bickerton, Derek. 1990. Language and Species. Chicago.

Braine, Martin D. S. 1963. ‘The ontogeny of English phrase structure: the first phase.’ Language 39: s1-13.

Pääbo, S. and colleagues. 2009. News focus in Science 323: 866-871.

Rozzi, Fernando V. Ramirez and de Castro, José Maria Bermudez. 2004. ‘Surprisingly rapid growth in Neanderthals.’ Nature 428: 936-939.

Wynn, Thomas & Coolidge, Frederick. 2011. How to Think Like a Neandertal. Oxford.

Speech as action control

MacAndrew, Alec. ‘FOXP2 and the evolution of language.’

MacNeilage, Peter F. 2008. The Origin of Speech. Oxford.

“Straight” mirror neurons

Rizzolatti, Giacomo and Arbib, Michael. 1998.  ‘Language within our grasp.’  Trends in Neurosciences 21: 188-194.

Wikipedia article on the Mirror Neuron.

Syntax and shareability

Freyd, Jennifer J.  1983.  ‘Shareability:  The social psychology of epistemology.’  Cognitive Science 7: 191-210.

Lenneberg, Eric. 1967. Biological Foundations of Language. Wiley.

McNeill, David and Sowa, Claudia. 2011.  ‘Birth of a morph.’ In G. Stam and M. Ishino (eds.), Integrating Gestures: The Interdisciplinary Nature of Gesture, pp. 27-47. Benjamins.

Sapir, Edward. 1921. Language: An Introduction to the Study of Speech. Harcourt, Brace & World.

Thomason, Sarah. 2011. ‘Does language contact simplify grammars? (No).’ Talk given at the University of Chicago, April 12.


David McNeill is a professor in the Departments of Linguistics and Psychology at the University of Chicago.

His new title, How Language Began: Gesture and Speech in Human Evolution, is now available from Cambridge University Press.

