The origin of language in gesture–speech unity

Part 4:  Mead’s Loop (2). Wider consequences.

David McNeill, University of Chicago

As it evolved Mead’s Loop created “new actions,” as mentioned previously.  New actions are one of the “wider consequences” of Mead’s Loop. Action itself was a target of natural selection, and the new actions emerged organically. They did not need a separate evolution. A second consequence is metaphoricity. A third is the emblem, a culturally established gesture with metaphoricity at the core. A fourth is how children acquire language – twice, the first of which goes through the equivalent of extinction.  A fifth (many more can be identified) is what phenomenologist philosophy calls “being” – “inhabiting” gesture and speech, rather than only displaying them as elements of communication.

1. New actions and old. What remains of “old actions” in the new world? Many consider actions to be the source of gestures: on an action-based semiotic, a gesture for something being flat is a truncated version of “making something flat.”

The idea that gestures derive from actions is plausible at first glance but there is more (or less) than meets the eye. A gesture may look like a pragmatic action but the action has changed at its core. To describe a gesture as “outlining” or “shaping” is useful as a description but to also say that such a practical action is still within the gesture is to disregard what makes the gesture a human sign.

In the illustration, a “rising hollowness” gesture looks like the action of lifting something in the hand, but it is not lifting at all. It is an image of the character rising, of the interior of the pipe through which he rose, and of the direction of his motion upward – all compacted into one symbolic form to differentiate a field of meaningful equivalents having to do with HOW TO CLIMB A PIPE: ON THE INSIDE. This complex idea, as a unity, orchestrated the hand shape and movement; it is the same motor response but it is not the same action as lifting up an object.

While a gesture may engage some of the same movements and tap in part the same motor schemas as an “action-action,” it has its own thought–language–hand link. In keeping with the unity of speech and gesture, the manual movements of gestures as well as the actions of speech are co-opted by Mead’s Loop and orchestrated in new ways by significances other than those of the original actions. We observe the separation of “action-actions” and “gesture-actions” directly in the IW case, where there has been a complete deafferentation. Without vision, action-actions and gesture-actions dissociate, the first being impossible while the second are normal (IW is described fully in a later post).

Some gestures do depict actions; they ritualize the actions they depict. They are more than actions – a kind of performance, a replication of the action, and may also include posture, spatial location, voice, etc. as well as the manual action.

These gestures are of two kinds. Some are pantomimes at their own locus on the continuum of gesture types. Others are “character viewpoint” (C-VPT) gestures, gesticulations with the viewpoint of the character that is being recounted. And here again the difference between pantomimes and gesticulations applies.

The C-VPTs pass through the thought–language–hand link. Unlike pantomimes they are co-expressive with speech. Their character’s viewpoint is part of the semiotic which speech opposes in a dialectic. A C-VPT is, among other things, not an O-VPT or “observer viewpoint,” a contrast that it has but that is not part of a pantomime.

Part of the scope, creativity and distinctiveness of human thought lies precisely in its freedom from pragmatic action constraints. When the hands make a gesture, it is thought that controls them and not a hidden action with its own purposes in relation to the physical world.

2. Metaphoricity is the semiotic basis of metaphor, and it also arises out of Mead’s Loop’s “new actions.” It is the mental ability (apparently unique to humans) to experience one thing in terms of something else (“metaphors” = specific cultural-linguistic packages, or temporary individual impromptu ones, that rest on this semiotic). Thinking with metaphors is natural and irresistible, and is explained if metaphoricity (not necessarily any specific metaphor) was a product of how language began. According to Mead’s Loop metaphoricity has existed from the very beginning.

The metaphoricity semiotic came about when the orchestration of actions of the vocal tract and hands was undertaken by something other than those actions, by meaningful gesture imagery, one thing (voice, hands) gaining significance in terms of something else that it is not.

Examples are the hands rotating in the Part 1 illustration – a process as a rotation (co-expressive with “barreling up,” a spoken version of the same metaphor). A novel, not culturally shared, impromptu example is a gesture for inaccessibility – hands separated, one above, the other below, Tweety on his perch, Sylvester on the street below.  The speaker first struck this pose as she said, “part of the problem is that Tweety Bird’s inaccessible,” and then struck it six more times as she listed the various attempts that Sylvester had made to reach Tweety, each gesture starting as an iconic depiction of the attempt and seguing into the metaphor, conveying the futility of the attempt because of this inaccessibility.

3. Emblems as metaphors. Metaphoricity shows up spontaneously in these impromptu gestures made on the fly by individual speakers but it also appears in gestures of the seemingly opposite kind – culturally ratified gestures of the kinds listed in dictionaries of “Neapolitan Gestures” or “Persian Gestures” and the like – the so-called emblems. Many if not all cultural emblems probably contain, at the core, metaphors that were originally impromptu. The very possibility of an emblem can be considered another consequence of Mead’s Loop, by way of metaphoricity.

Cultures imbue certain metaphors (arguably ones reflecting cultural values and history) with standards of form and specific functions.  This moves them from impromptu gesticulation toward (but not quite reaching) the sign language end of the gesture continuum.

The “OK” sign (sometimes called “the ring”) is a convenient example that appears in many cultures. It is a culturally mandated version of gestures for precision. In performing them one experiences the abstract idea of precision as a feeling of minimization of the space between surfaces; surfaces not automatically in contact but which in the gesture touch. The OK sign’s meaning as an emblem reflects this precision origin, something “being OK,” earning approbation, because it “precisely” meets the requirements at hand. In raw form a precision gesture can be made in various ways, and it has variety even with a single hand, any finger in contact with the thumb (the thumb is invariable for anatomical reasons, but the different fingers contacting it may have their own significances).

The “OK” emblem restricts the handshape – only the forefinger makes contact, the other fingers extending outward. Meaning is likewise restricted – approbation because something is precise, is just so (like the spoken, “that’s it!”). Reflecting its precision source “OK” differs from another approbation emblem, thumbs-up, which has its own metaphor – “up is good.”  Thumbs-up (or -down) uses pointing to indicate the metaphor proper. The upturned thumb indicates the location of the good in “up-is-good” (conversely, thumbs-down indicates the location of the bad in “down-is-bad”). (The “precision” meaning of “OK” also rules out another theory that the gesture has nothing to do with metaphor but reproduces the letters “O” and “K” – then why is precision the core?)

Emblems do not reach the sign language end of the gesture continuum because, while they can co-occur with other emblems, the combinations lack any stable syntagmatic value: waving the “OK” sign back and forth could be “not OK,” or “everything is OK,” or “look, it’s OK!”, a range so broad and replete with contradictions that it is fundamentally non-languagelike.

Children pick up some emblems as early as the first birthday (waving “bye-bye” and others) but it is hugely doubtful that metaphoricity plays any part. A metaphor source of any kind is unlikely (to say the least, if the source of “bye-bye” is something like wiping a situation or oneself away).

So-called “child metaphors” likewise probably do not involve metaphoricity – a 24-month-old saying “cup swimming” as he pushed a cup along in his bath, or “I’m a big waterfall” as he slid down his father’s side while wrestling, presumably is not experiencing the cup’s motion as swimming or his own motion as a waterfall. Instead, there is a piling on of descriptions as they are “shared with the other,” which is quite a different thing (cf. Werner & Kaplan’s remark: the speech of children this young has “the character of ‘sharing’ experiences with the other rather than of ‘communicating’ messages to the other”).

4. What it means if children acquire language twice. Of all the possible indicators of gesture-first it is early ontogenesis that most convincingly suggests it may once have existed. Something like a recapitulation of it arises and performs a scaffolding function like that envisioned by gesture-first advocates. But it dies out and is followed by a kind of extinction during a transitional period from 2 to 3 years roughly.

GPs then emerge “late,” at age 3 or 4 years, with several indications of dual semiosis emerging at the same time, suggesting that gesture-first had once existed (both in children and in the ancient past) but went extinct and a new form of language followed, where speech and gesture imagery merged into the unified packages inhabited by thought and being that we see in ourselves.

That is, language appears to emerge in the child twice, with the first emergence extinguishing – children first acquiring a single semiotic language of which a gesture-first creature also would have been capable; later developing the dual semiotic language we all carry with us.

This style of argument – resting on ontogeny-recapitulates-phylogeny – has often been derided but there has been a recent revival of interest in it. It can be useful and heuristic for sorting out steps in phylogenesis.

For current-day children, the argument implies, contrary to a longstanding assumption that children develop more or less continuously (perhaps with stages, but earlier acquisitions still carrying forward), that ontogenesis is not cumulative; it is a mixture of continuity and discontinuity.

Discontinuities come from the recapitulation of the two origins. Continuities come from an autonomous development of speech control. Speech in the child separates from Mead’s Loop, its evolutionary origin according to theory, which could be due to other evolutionary pressures that adapted speech to garner parental attachment (“baby-talk” is the adult half of the same adaptation).

The early single-semiotic acquisition is limited, much as Bannard et al. remark:

“…children’s speech for at least the first 2 years of multiword speech is remarkably restricted, with constructions being seen with only a small set of frequent verbs … and many utterances being built from lexically-specific frames.”

Limitation is seen in what Braine called “pivot grammars” and Lieven et al. called “templates.”  Pivots could have been the highest reach of gesture-first.  The table below is an example from Braine.  There would be as many “grammars” as there are pivots, possibly in the hundreds for gesture-first creatures (such a language would not have one of the major features of human language, the infinite productivity in which there is no last sentence).

In other words, the first steps children take toward language may not lead to language, but to something coming from a long-extinct creature; then comes a second origin, that of the language we take for granted, but not until the first has extinguished. The single-semiotic gestures without co-expressive speech of the first phase – pointing, pantomime, emblems, action-stubs, diffuse motor responses – are quite different from those of the last – dual semiotic and unified with speech.

A pivot “grammar” with  “want”

want     + { baby

We can take the recapitulation argument a step further: when something emerges in current-day ontogenesis only at a certain stage we reason (in this way of arguing) that the original natural selection of the feature (if any) took place in a similar psychological milieu in phylogenesis. We exploit the fact that children’s intellectual status is not fixed; it is changing. Thus we look for new states that seem pegged to steps in the ontogenesis of growth points and Mead’s Loop underlying them, and consider these steps as possible windows onto phylogenesis.

Using this argument, we are able to look at the ontogenesis of the GP and formulate possible phylogenetic landmarks.  Most importantly, the GP’s emergence seems tied to the development of the child’s self-aware agency, appearing first at age 4 or so, suggesting that in phylogenesis a similar sense of one’s own agency was a condition for Mead’s Loop, a plausible hypothesis given that Mead’s Loop is adaptive when the adult sees her own gesture–speech as social/public. It made “instruction” possible as opposed to “doing” with an onlooker.

5. Material carriers, “inhabitance,” and cognitive being. Another consequence of Mead’s Loop is what Vygotsky termed the material carrier – the embodiment of meaning in enactments or material experiences. Having a material carrier enhances the symbolization’s experiential potency. The speaker/hearer “inhabits” the materialized symbols.

Experiential enhancement of language is possible if the gesture is the image in an imagery–language dialectic, not an “expression” or “representation” of it, but is it. From this viewpoint, a gesture, the global-synthetic whole, is an image in its most developed – that is, in its most materially, naturally embodied – form. The absence of a gesture is the converse, an image in its least material form.

The material carrier concept thus explains how an imagery–language dialectic still is possible in the absence of visible gestural movement. When there is no overt gesture there is still imagery and with linguistic categorization a dialectic, still a simultaneous rendering of meaning in opposite semiotic modes – the dialectic in its essentials – but bleached and at the lowest level of materialization. This leads us to expect that gestures are more elaborate, more materialized and more frequent – more “existent” – when the gesture has greater newsworthiness, as we shall see in the next post of this series.

The source of the material carrier effect is ultimately Mead’s Loop with gesture-actions orchestrated under significances other than action-actions: materialization follows ineluctably. Materialization implies that the gesture, this natural material carrier, the actual motion of the gesture itself, is a dimension of meaning.

The concept of a material carrier is brought to a whole new level when we turn to Merleau-Ponty for insight into the unity of gesture and language and what we expect of gesture in a dual semiotic process.

Gesture, the instantaneous, global, nonconventional component, is “not an external accompaniment” of speech, which is the sequential, analytic, combinatoric component; it is not a “representation” of meaning, but instead meaning “inhabits” it (partly quoted in Part 3):

The link between the word and its living meaning is not an external accompaniment to intellectual processes, the meaning inhabits the word, and language “is not an external accompaniment to intellectual processes” (Merleau-Ponty’s quotation is from Gelb and Goldstein 1925). We are therefore led to recognize a gestural or existential significance to speech . . . Language certainly has inner content, but this is not self-subsistent and self-conscious thought. What then does language express, if it does not express thoughts? It presents or rather it is the subject’s taking up of a position in the world of his meanings. (p. 193).

The GP is geared to this “existential content” of speech – this “taking up a position in the world.”  Gesture, as part of the GP, is inhabited by the same “living meaning” that inhabits the word (and beyond, the discourse).

A deeper answer to the query – when we see a gesture, what are we seeing? – is that it is part of the speaker’s current cognitive being, her very mental existence, at the moment it occurs. This extends the material carrier and ultimately rests on a gesture–speech unit. By performing the gesture, a core idea is brought into concrete existence and becomes part of the speaker’s existence at that moment.

The Heideggerian echo in this statement is not accidental. Following Heidegger’s emphasis on being, a gesture is not a representation, or is not only such: it is a form of being. From a first-person perspective, the gesture is part of the immediate existence of the speaker. Gestures (and words, etc., as well) are themselves thinking in one of its many forms – not only expressions of thought, but thought, i.e., cognitive being, itself. To the speaker, gesture and speech are not only “messages” or communications, but are a way of cognitively existing, of cognitively being, at the moment of speaking.

The speaker who creates a gesture of Sylvester rising up fused with the pipe’s hollowness is, according to this interpretation, embodying thought in gesture, and this action – thought in gesture-action over the thought–language–hand link – is part of the person’s being cognitively at that moment.

To make a gesture, from this perspective, is to bring thought into existence on a concrete plane, just as writing out a word can have a similar effect. There is not a causal sequence: thought → speech/gesture. Speech and gesture are the thought coming into being at that instant. The greater the felt departure of the thought from the immediate context, the more likely its materialization as a gesture, because effort adds to being. Thus, gestures are more or less elaborated depending on the importance of material realization to the existence of the thought.

6. The theater of the mind, closed. The “H-model” avoids the homunculus problem encountered by the third person perspective inherent to the concept of a “representation” and with it the “theater of the mind” problem.  The theater of the mind is the presumed central thinking area in which representations are “presented” to a receiving intelligence. The possibilities for homunculi – each with its own theater and receiving intelligence – spiraling down inside other homunculi are well known.  In the H-model, there is no theater and no extra being; the gesture is, rather, part of the speaker’s momentary mode of being itself, and is not “watched.”  The theater is closed or, rather, it never opened.

Further Reading


Kendon, Adam. 2009. ‘Manual actions, speech and the nature of language.’ In Gambarara, Daniele and Givigliano, Alfredo (eds.). Origine e sviluppo del linguaggio, fra teoria e storia. Pubblicazioni della Società di Filosofia del Linguaggio, pp. 19-33. Rome: Aracne editrice s.r.l.

Kendon, Adam. 2010. ‘Accounting for forelimb actions as a component of utterance: An evolutionary approach.’ Plenary Lecture. International Society for Gesture Studies, Frankfurt/Oder, July 25, 2010. (abstract at, accessed 09/10/12).

LeBaron, Curtis and Streeck, Jürgen. 2000. ‘Gestures, knowledge, and the world,’ in D. McNeill (ed.) Language and Gesture, pp. 118-138. Cambridge.

Streeck, Jürgen. 2010. Gesturecraft: The manu-facture of meaning. Benjamins.

Child language and ontogeny recapitulates phylogeny

Bannard, C., Lieven, E., and Tomasello, M. 2009. ‘Evaluating constructivist theory via Bayesian modeling of children’s early grammatical development.’ Abstract posted on the International Cognitive Linguistics Conference website, accessed 03/30/09.

Braine, Martin D. S. 1963. ‘The ontogeny of English phrase structure: the first phase.’ Language 39:1-13.

Butcher, Cynthia & Goldin-Meadow, Susan. 2000. Gesture and the transition from one- to two-word speech: When hand and mouth come together. In D. McNeill (ed.), Language and Gesture, pp. 235-257. Cambridge.

Goldin-Meadow, Susan & Butcher, Cynthia. 2003. Pointing toward two-word speech in young children. In S. Kita (ed.), Pointing: Where language, culture, and cognition meet, pp. 85-107. Erlbaum.

Levy, Elena. 2011. ‘A new study of the co-emergence of speech and gestures: Towards an embodied account of early narrative development.’ Language Fest, University of Connecticut, Storrs, CT.

Lieven, Elena, Salomo, Dorothé and Tomasello, Michael. 2009. ‘Two-year-old children’s production of multiword utterances: A usage-based analysis’, Cognitive Linguistics. 20: 461-507.

MacNeilage, Peter F. 2008. The Origin of Speech. Oxford.

Werner, Heinz and Kaplan, Bernard. 1963. Symbol Formation. Wiley.

Material carriers, inhabitance, and cognitive being

Dreyfus, H. 1994. Being-in-the-World: A Commentary on Heidegger’s Being and Time, Division I. MIT.

Gallagher, Shaun.  2005. How the Body Shapes the Mind. Oxford.

Merleau-Ponty, Maurice. 1962.  Phenomenology of Perception (C. Smith, trans.). Routledge.

Quaeghebeur, Liesbet. 2012. The ‘All-at-Onceness’ of embodied, face-to-face interaction. Journal of Cognitive Semiotics 4: 167-188.


Cienki, Alan and Müller, Cornelia. 2008. Metaphor and Gesture. Benjamins.

Lakoff, George and Johnson, Mark. 1980. Metaphors We Live By. Chicago.

Müller, Cornelia. 2008. Metaphors – Dead and Alive, Sleeping and Waking. A Dynamic View. Chicago.


Carlson, Patricia and Anisfeld, Moshe. 1969. ‘Some observations on the linguistic competence of a two-year-old child.’ Child Development 40:569-575.

Theater of the mind

Dennett, Daniel C. 1991. Consciousness Explained. Little, Brown.

David McNeill is a professor in the Departments of Linguistics and Psychology at the University of Chicago.

His new title How Language Began: Gesture and Speech in Human Evolution is now available from Cambridge University Press at £19.99/$36.99

