The origin of language in gesture–speech unity

Part 6. Gumbo: The thought–language–hand link, social interactive growth points, the timeline of Mead’s Loop, and bionic language.

David McNeill, University of Chicago

To end this series, I address four questions regarding Mead’s Loop: 1) what evidence is there for the thought-language-hand link that in theory it established; 2) how did it change face-to-face social interaction; 3) when did it emerge; and 4) how far can it be duplicated artificially?  The questions, disparate as they are, are connected through the concept of the growth point, which is the linchpin of each.

 

The “IW case” reveals the thought–language–hand link   

Natural selection of a thought–language–hand link, chiefly in Broca’s Area but also with links to the other “language areas” indicated in Part 3, was part of Mead’s Loop’s evolution.  This thought–language–hand link is usually submerged among other actions, but in certain forms of neuropathy, where sensory feedback is eliminated, it becomes visible.  Then action is disrupted but gestures are unaffected.  It is this dissociation that reveals a specific path from thought and language to gesture in the human brain.

Mr. Ian Waterman, sometimes referred to as “IW,” whose character and achievements are captured in the title of Jonathan Cole’s book about him, Pride and a Daily Marathon, at age 19, suffered a sudden, total deafferentation of his body from the neck down – the near total loss of all the touch, proprioception, and limb spatial position senses that tell you, without looking, where your body is and what it is doing.  The loss followed a never-diagnosed fever that Jonathan Cole believes set off an auto-immune reaction.  The immediate behavioral effect was immobility, even though IW’s motor system was unaffected and there was no paralysis. The problem was not lack of movement per se but lack of control. Upon awakening after three days, IW nightmarishly found that he had no control over what his body did – he was unable to sit up, walk, feed himself or manipulate objects; none of the ordinary actions of everyday life, let alone the precise actions required for his vocation.

To imagine what deafferentation is like, try this experiment suggested by Shaun Gallagher: sit down at a table (something IW could not have done at first) and place your hands below the surface; open and close one hand, close the other and extend a finger; open the first hand and put it over the closed hand, and so forth. You know at all times what your hands are doing and where they are but IW would not know any of this – he would know that he had willed his hands to move but, without vision, would have no idea of what they are doing or where they are located.

After years of constant self-drill, IW has mastered movement in an entirely new way – he plans movements in advance and visually monitors them as they occur.  It is remarkable to watch him since these movements look so normal – accurate, at speed, and seemingly effortless (although actually the result of great concentration).

IW also performs gestures in the same planned, monitored way.  He refers to them as “constructed,” and distinguishes them from others he terms “throw-aways,” which just happen without planning or monitoring.  For us, of course, the focus is on precisely these “throw-aways.”

Thanks to the BBC, filming for a Horizon program about IW (“The Man Who Lost His Body,” 1998), IW, Jonathan Cole, Shaun Gallagher and the University of Chicago gesture researchers gathered at our Gesture and Speech Lab for several days of filming. We wanted to record IW under a variety of conditions, both with and without vision. IW cannot simply be blindfolded.  He would be unable to orient himself and be at risk of falling over.  Taking up an idea of Nobuhiro Furuyama, we devised a tray-like blind, that could be pulled down in front of IW, blocking vision of his hands, while allowing him space to move and preserving visual contact with his surroundings. IW was videotaped retelling our usual animated cartoon.  He also was recorded under the blind in casual conversation with Jonathan Cole.

IW’s gestures without vision. The first pair of illustrations shows a coordinated two-handed tableau (a “throw-away”) in which the left hand is Sylvester and the right hand is a streetcar pursuing him.  IW was saying, “[and the (a)tram (b)caught him up]” ((a) and (b) referring to the illustration’s first and second panels). His right hand moved to the left in exact synchrony with the co-expressive “caught” (boldface), although slightly out of alignment (reflecting a lack of topokinetic control, which requires feedback, versus morphokinetic control, which IW achieves).  Moreover, a poststroke hold (underlining) extended the stroke image through “him” and “up,” capturing more of the co-expressive speech.  It is important to recall that this synchrony and co-expressivity were achieved without proprioceptive or spatial feedback.

This kind of performance by IW – coordinated gestures without feedback (there are many other examples) – is part of the evidence we need of a thought–language–hand link.

The other part is that his gestures are separate from practical actions, which without vision are impossible for him. The second pair of illustrations shows two steps in his attempt to remove the cap of a thermos bottle.  The first is immediately after Jonathan Cole has placed the thermos into his right hand and placed his left hand on the cap (IW is strongly left handed); the second is a second later, when IW has begun to twist the cap off.  As can be seen, his left hand has fallen off and is turning in midair.  Similar disconnects occurred during other instrumental actions (threading a cloth through a ring, hitting a toy xylophone, etc. – this last of interest since IW could have made use of acoustic feedback or its absence to know when his hand had drifted off target, but still he could not perform the action).

[Illustrations: (1) coordinated two-handed iconic gesture without vision; (2) action; (3) action with speech.]

In keeping with the thought–language–hand link, IW was able to “remove” the cap of an imaginary thermos in gesture (third illustration) and, although not asked to speak, spontaneously produced synchronous, co-expressive speech as he performed it.

IW, without vision, changes speech and gesture in tandem. Another manifestation of the thought–language–hand link is that, without vision, IW modulated the speed at which he presented meanings in both speech and gesture, and did this in tandem. As his speech slowed, his gesture slowed, and to the same extent, so that synchrony and speech–gesture unity were preserved. (The Warlpiri speaker described in Part 2, who was producing not impromptu gestures like IW’s but a sign language, specifically could not do this while also producing speech.)

If IW is forming gesture–speech units, this joint modulation of speed is explicable. He does it based on a sense (available to him) of how long a given joint imagery-linguistic unit remains “alive,” and the peripheral sensory feedback he lacks need not play a part.

During a conversation with Jonathan Cole, while still under the blind, IW reduced his speech rate at one point by about one-half (paralinguistic emphasis). Speech and gesture remained in synchrony:

Normal Speed “and I’m startin’ t’use m’hands and that’s be-” (bold = hands rotating)

Slow Speed “-cause I’m startin’ t’get into” (bold = hands rotating)

The gestures are of a familiar metaphoric type in which a process is depicted as a rotation in space.  IW executes the metaphor twice; first at normal speed, then at slow speed.

The crucial observation is that the hand rotations are locked to the same landmarks in speech at the two speeds (across nearly the same syllable counts). If we look at where his hands orbit inward and outward we see that the rotations at both speeds coincide with the same lexical words and the same stress peaks.

It is important to recall that this tandem slowing was produced without any proprioceptive or spatial feedback; IW could not tell what his hands were doing, yet they were in perfect synchrony with speech.

Whatever controlled the slowdown, it was exactly the same for speech and gesture.

As the rotating hands were metaphors for the idea of a process, the pacesetter accordingly was activated by a thought–language–hand link and was co-opted by significances other than the action of rotation itself.  This co-opting is shown in the timing, since the hands rotated only while IW was presenting the metaphor, “I’m starting to…,” and there was actually a cessation of the gesture between the first (normal speed) and second (reduced speed) rotations when he said, “and that’s because.”

That is, the rotation and any phonetic linkages it claimed were organized specifically around the idea of a process as rotation. This is gesture–speech unity over the thought–language–hand link.

Overall significance of the IW case for the origin of language. The IW case suggests that control of the hands and the relevant motor neurons is possible directly from the thought-linguistic system. It does not pass through any pantomime (so it is yet another piece of evidence that pantomime is not connected to human speech). Without vision, IW’s dissociation of gesture, which remains intact, from instrumental action, which is impaired, implies that the “know-how” of gesture is not the same as the “know-how” of instrumental movement.  In terms of brain function, it implies that at some point gesture enters a circuit of its own and hooks into speech.  A likely locus of this thought–language–hand link, at least in part, is in areas 44 and 45, or Broca’s area.

Mimicry, interpersonal synchrony of growth points.

The Mead’s Loop “twist” sustains a host of social interactions including turn-exchanges in conversations and two-person mind-merging.

Irene Kimbara studied gestural mimicry as an interactive phenomenon. Mimicry she describes as a process of “interpersonal synchrony,” which creates a sense of solidarity and is prominent when the interlocutors are personally close.

Mimicry can merge GPs and contexts between speakers.  If through mimicry people approximate similar growth points they come to some common ground.  It works through embodiment. Recreating a gesture in mimicry is more than imitating a movement; it is the envelopment of the mimic in the other’s world of meanings. Imagine experiencing an interpersonal misconstrual. One can overcome it by mind-merging the other’s GP and, from this, finding its context, the only context in which this mimicked GP is a possible differentiation. By their nature, growth points are not independent of the context, which means that if a speaker is generating a GP contexts also tend to emerge.

The mimicry need not be overt. We focus on the mimicry of growth points. It is accomplished at a level of orchestration, with or without overt movement.

Two-body GPs are joint constructions, collaborative GPs wherein Mind 2 mimics the gesture and speech of Mind 1. A psychological predicate and field of equivalents seemingly belonging to Mind 1 arise as if by magic (but it is not magic – it is because the original gesture had absorbed this context and mimicking it recreates it at least in part).

Mimicry imports the GP of the other (or rather, recreates it over one’s own thought–language–hand link). It is a kind of borrowed embodiment.  It recreates the other’s gesture–speech unit as if it were one’s own.  The many experimental demonstrations of sympathetic responses to verbs that denote actions (“grab” accompanied by a listener’s incipient grabbing) are, in this version, mimicry of “new actions,” of GPs.

Two-body GPs appeared in experiments devised by Nobuhiro Furuyama.  The setting was one person teaching a second person, a stranger, how to create an origami box but without actual paper in hand. In one version shared embodiment occurred when the learner mimicked the teacher’s gesture without the learner speaking. The gesture instead was synchronized with the teacher’s speech – “[pull down] the corner,” the learner performing a gesture during the bracketed portion. One person, that is, appropriated the other’s speech, combining it with her gesture, as if student and teacher were jointly creating a single GP.

The reverse also occurs.  The learner appropriates the teacher’s gesture by combining it with her speech. In one such case the learner (female) said, “[you bend this down?]” and during the bracketed speech seized and moved the (male) teacher’s hand down. It is striking that the taboo normally prohibiting strangers, especially of opposite genders, from non-accidental physical contact was overridden, possibly because both the learner’s and tutor’s hands were no longer “hands,” actual body-parts subject to the taboo, but pure symbols.

Turn-taking at momentary overlaps of GPs depends on this process, and creates yet another interactive discourse unit.  Turn-taking is typically analyzed as the coordinated activity of one speaker authorizing the next speaker to speak. But the process also involves joint GPs at the exchange point, with gestures playing a critical role. A GP starts with one speaker and passes over to the next speaker.  Emanuel Schegloff, in an early gesture study, used gesture to forecast what would be “in play” in the next round of a conversation.  We follow his lead, supplemented with the concept of a GP, and look for joint GPs and the contexts they differentiate.  A new joint discourse unit is formed when the listener mimics the gesture of the speaker; or when two individuals participate in one GP, one providing the speech, the other the gesture.

Shared tip of the tongue. Mimicry also offers an explanation (found by Liesbet Quaeghebeur, not published) of the curious phenomenon of tip of the tongue contagion – one person cannot recall a common word whose meaning is clear to all, and you, the interlocutor, suddenly also cannot recall it.  If conversation includes “mind merging,” it also could include “tip-of-the-tongue merging,” through spontaneous mimicry.

Gesture-coder mimicry. Coders frequently mimic gestures and speech as they work.  Mimicry brings the speaker’s differentiation and context into the coder’s own momentary cognitive being; she inhabits the other’s gesture and speech. It is mimicry of a stranger visible in video (or, as here, in a screenshot). The following illustrations demonstrate the experience.

In Panel 2 the field of equivalents is something like EXPECTED TWEETY and the differentiated newsworthy point is GRANNY.

In Panel 3 it is something like HOW THIS ESCAPADE ENDS, with the point of differentiation NOT WHAT HE THOUGHT.  Each speaker forms his own story, and with speech the gestures tell it.

 

[Illustration: mimicking of fields of oppositions; a self-test, much as gesture coders do spontaneously.]

 

Thanks ultimately to the social references built into Mead’s Loop, we find discourse units in conversations formed by two persons, their gestures and contexts realized in common, through mimicry. Mimicry can take place in conversations, in deception or during instruction, or in the virtual interaction of a gesture coder with video images of another person’s gestures.

Mead’s Loop timeline.

The phrase, “the dawn of language,” suggests that language burst forth at some definite point, say 150–200 kya (thousand years ago), when the prefrontal expansion of the human brain was complete.

But the origin of language has elements that began long before – 5 mya (million years ago) for bipedalism, on which things gestural depend. I would pick 2 mya, based on humanlike family life dated to then, as the start of the expansion of the forebrain and of the selection of self-responsiveness of mirror neurons and the resulting reconfiguration of Areas 44/45. I imagine this form of living was itself the product of changes in reproduction patterns, female fertility cycles, child rearing, and neoteny, all of which must have been emerging over long periods before.

So this says that language as we know it emerged over 1 to 2 million years and that not much has changed since the 150–200 kya landmark of reconfiguring Broca’s area with the mirror neurons/Mead’s Loop circuit (although this date could overlook continuing evolution: there are hints that the brain has changed since the dawn of agriculture and urban living).

The Mead’s Loop model doesn’t say what might have been a protolanguage before 2 mya – Lucy and all. It would have been something an apelike brain is capable of. There are many proposals about this – Kendon, for example, proposed that signs emerged out of ritualized incipient actions (or incomplete actions). Natural gesture signals in modern apes have an incipient quality as well, the characteristic of which is that an action is cut short and the resulting action-stub becomes a signifier. The figure in Part 2 shows a truncated shove by one bonobo signaling a demand for a second bonobo to move in a certain direction.

The slow-to-emerge precursor from 5 mya to 2 mya may have built up a gesture language from instrumental actions, a gesture-first type language. It would have been an evolution track leading to pantomime.

But the human brain evolved a new system in which gesture fused with vocalization.

Mead’s Loop also does not say where language evolved (an argument by Atkinson suggests the southwestern corner of Africa), but it does “predict” that wherever it was the languages there now would tend to be of the isolating type (and this appears to be the case in SW Africa; see Part 3 for the “isolating type”). In any case the origin point would have been an area where human family life also was emerging.

A proposed time line for the origin of Mead’s Loop is as follows:

  1. To pick a date, the evolution of a thought–language–hand link started 5 mya with the emergence of habitual bipedalism in Australopithecus. This freed the hands for manipulative work and gesture, but it would have been only the beginning. Even earlier there were preadaptations such as an ability to combine vocal and manual gestures, to perform rapid sequences of meaningful hand movements, and the sorts of iconic/pantomimic gestures we see in bonobos, but not yet an ability to orchestrate movements of the vocal tract by gestures.
  2. The period from 5 to 3–2 mya – Lucy and the long reign of Australopithecus – would have seen the emergence of various precursors of language, such as the protolanguage Bickerton attributes to apes, very young children and aphasics; also, ritualized incipient actions becoming signs as described by Kendon.
  3. At some point after the 3–2 mya advent of H. habilis and later H. erectus, there commenced the crucial selection of self-responsive mirror neurons and the reconfiguring of areas 44 and 45, with a growing co-opting of actions by language to form speech-integrated gestures, this emergence being grounded in the appearance of a humanlike family life with a host of other factors shaping the change (including cultural innovations like the domestication of fire and cooking). The timing of this stage is not clear but recent archeological findings strongly suggest that hominids had control of fire, had hearths, and cooked 800 kya.
  4. Thus, the family as the scenario for evolving the thought–language–hand link we see in the IW case seems plausible, commencing no more recently than 800 kya.

Another crucial factor would have been the physical immaturity of human infants at birth and the resulting prolonged period of dependency giving time for cultural exposure and GPs to emerge, an essential delay pegged to the emergence of self-aware agency (Neanderthals, in contrast, may have had a short period of development).

Along with this sociocultural revolution was the expansion of the forebrain from 2 mya, and a complete reconfiguring of areas 44 and 45, including Mead’s Loop, into what we now call Broca’s area. This development was an exclusively human phenomenon and was completed with H. sapiens about 200–100 kya. If a “dawn” occurred, it was here.

At least two other human species have existed, Neanderthals and the recently discovered Denisova hominin; each may have had a gesture-only form of communication but our species also developed Mead’s Loop and GPs. These other humans went extinct, one factor in which could have been a confinement to pantomime and consequent inability to reach a new form of language, inhabitance with thought and action, just as, in our case having evolved this ability, we were spared the same fate (but it is also possible that Mead’s Loop emerged earlier and Neanderthals also had speech–gesture units and went extinct for other reasons; see Part 3 for more).

Language with dual semiosis came into being over the last 1 or 2 million years. Considering protolanguage and then language itself the time-line seems to be over five million years (low hum more than big bang). Meaning-controlled manual and vocal gestures that combine under imagery, emerged over the last two million years. The entire process may have been completed not more than 100 kya, a mere 5,000 human generations, if it is not continuing.
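The generation count above can be checked with a line of arithmetic. The sketch below uses only the two figures given in the text; the generation length it prints is an inference from those figures, not something the text states.

```python
# Figures taken from the text: about 100,000 years and 5,000 generations.
YEARS_SINCE_COMPLETION = 100_000
GENERATIONS = 5_000

# Implied average generation length (inferred from the two figures above).
years_per_generation = YEARS_SINCE_COMPLETION / GENERATIONS
print(years_per_generation)  # 20.0
```

In other words, the “5,000 generations” figure assumes roughly 20 years per generation.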

Bionic language.

A bionic version of human language is one possible continuation.  Language in this vision of the future is extended with artificial enhancements. In a book published in 1968 Herbert A. Simon made the case for the sciences of what he called “the artificial.” He was careful not only to explain these sciences but also to distinguish the artificial from the natural.  He seems to have believed that language was “man-made.”  Over the years enthusiasm for the artificial has grown while a sense of its limits has shrunk. However, this enthusiasm underestimates the gulf between the artificial and the natural in the case of gesture–speech unity.

The artificial is not natural.  To begin with, the origin of language was not artificial.  It was “man-made” in one way – it was made in “man” (actually, probably in “woman”) as the product of natural selection but was not artificial in Simon’s sense, the outcome of human purpose and goal-directedness.

Simulations such as automatic speech “recognition,” while constantly improving, do not recreate, nor do they aim to recreate, the human inhabitance of language.

And for good reason. Purpose-designed artificial devices cannot model GPs and their evolved global-synthetic semiotics from Mead’s Loop.  Even gestures, while inputting them might improve recognition, could not lead to an imagery–language dialectic. This is because systems that model gestures (as in conversational agents and physical robots) do it in a bottom-up, features-to-whole, static dimension language-like way that, even if synchronized with speech, is inherently incapable of forming a dialectic.

The problem is not just adjusting models to include imagery. Mead’s Loop is beyond their reach basically because action, which speech fundamentally is and which the linguistic system evolved in part to orchestrate, does not exist as a unit in these artificial systems. They instead construct actions using a feature-based mode wherein the features are the units and the actions the outcomes (in a GP, features are outcomes, actions are the units).

Foremost of these difficulties is the global-synthetic imagery of the GP, essential for the dynamic dimension as a whole. The problem is that the use of features in computational models forces the process of gesture creation to be combinatoric, to move from parts to whole rather than whole to parts; and this loses the semiotic opposition.

Once created we can usually identify form and meaning features, e.g., enclosure means interiority, and so forth. But we must not conclude that composition was the process of creation; it is the result of our analysis. Features are products. This is the paradox of natural gestures – they work in the opposite direction from modeling based on features.

Coordinative structures, drawn to ideas as attractors, may avoid the bottom-up problem but they create a new problem. (An anonymous Yale linguistics handout defines coordinative structures as “flexible patterns of cooperation among a set of articulators to accomplish some functional goal.”) The weakness is that they impose a distinction between “image” and “gesture” (the attractor is the image and coordinative structures fashion a gesture to embody it).

This creates a new contradiction with the concept of a gesture as a material carrier (see part 4). The gesture is the image – the image in its most material form; it is not a copy of it. Thus we have merely exchanged one contradiction for another, and are no closer to a model of the GP and imagery–language dialectic.

Analog machines. One may think that a hybrid analog–digital machine with self-defining, self-segregating imagery would do the trick. The most effective approach would be to build in a self-responding Mead’s Loop and then attempt to have the machine evolve a new language. Robots capable of limb and hand motion may be the nearest approximation to such a machine. To do this, information needs to be (or be simulated to be):

  • 3D, that is, embody variation as in gesture space.
  • With correct orientation, as in gesture space.
  • In the correct direction, as in gesture space.
  • With texture, as in gesture space.
  • As a spatial array, as in gesture space.
  • With local identity (in all 3Ds), as set up in gesture space.
  • With memory of past configurations, as in catchment space.
  • And organized by action.

No doubt the list can be extended but it is already a substantial departure from what I understand to be modeling practice. Its feasibility is far from assured and a global analog device is at present more a deus ex machina than a realizable thing.

But there is a more profound difficulty. None of this is actually imagery, global-synthetic, and meaningful. Meaningful imagery is totally absent, so a hybrid machine is no closer to the imagery–language dialectic than the digital one.

Synchrony.  My co-worker Susan Duncan once contrasted an autonomous agent, “Max,” to the GP in how it synchronizes gesture and speech.  In a GP the synchrony is a condition of the dialectic, and achieving it is a matter of thought, not of external signals tying speech and gesture together. However, as Duncan writes: “Max works as follows – looks ahead, sees what the linguistic resource will be, calculates how far back the preparation will have to be in order for the stroke to coincide with this.  Then speech and gesture are generated on their own tracks, and the two assembled into a multimodal utterance.  In contrast, in the GP the gesture image and linguistic categorization constitute one idea unit, and timing is inherent part of how this thought is created. The start of preparation is the dawn of the idea unit, which is kept intact and is unpacked, as a unit, into a full utterance.”
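The look-ahead procedure Duncan attributes to Max can be sketched in a few lines. This is only an illustration of the back-calculation she describes, not Max’s actual implementation; the names `GesturePlan` and `schedule_gesture`, the fixed preparation duration, and the word timings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GesturePlan:
    prep_start: float    # seconds from utterance onset when preparation begins
    stroke_start: float  # seconds from utterance onset when the stroke begins


def schedule_gesture(word_onsets, target_word, prep_duration):
    """Look ahead to the target word, then back-calculate the preparation onset."""
    stroke_start = word_onsets[target_word]
    # Preparation cannot begin before the utterance itself starts.
    prep_start = max(0.0, stroke_start - prep_duration)
    return GesturePlan(prep_start=prep_start, stroke_start=stroke_start)


# Invented onsets (seconds) for an example utterance; these are not measurements.
onsets = {"and": 0.0, "the": 0.2, "tram": 0.35, "caught": 0.75, "him": 1.05, "up": 1.25}
plan = schedule_gesture(onsets, "caught", prep_duration=0.5)
print(plan)
```

The point of the sketch is the contrast drawn in the quotation: here the timing is computed from outside, after the words are already fixed, and speech and gesture run on separate tracks. In the GP the timing is not computed at all; it is part of how the idea unit comes into being.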

The natural is not artificial. We learn from these thought experiments that artificial models do not match an evolved biological/psychological process, or head in the right direction to reach the GP – most crucially, that the semiosis must include a global component (to drive the dialectic), that there is a dialectic, and that finally the process is embodied in and tied to action, and requires accordingly a “body” that is the embodiment of meaning. Further, the gesture–speech unit differentiates a context and the context and its differentiation are one “thing.” Finally, all these models conflict with Quaeghebeur’s “all-at-onceness,” in that in their logic they are sequential. Conversational agents can simulate many of these properties, but the basic difference between an artificial system, designed by rational intelligence, and what has naturally evolved remains a root fact in the contrast of the GP with modeling schemes.

Why the fascination with the artificial?  A machine that thinks (or seems to), speaks, or evolves a human language or something like one, strikes us as uncanny.  It captures life in the making, a new existence or being; and for scientific interest, the elements and history of this being.  Uncanniness is one reason for fascination. The fascination (similar to the fascination with chimps schooled in sign language) is actually with our own existence; the bionic has an existence close to but not quite ours, one that can be dismantled and regarded objectively.

And it is this fascination the jeremiad here must disappoint. Machines that attempt to “inhabit” language, as Merleau-Ponty would have agreed, seem blocked from the possible.

We learn (or recall) the uniqueness of human evolution.  It was this that gave us language. Bionic man tries to make it artificial, and here lies hubris.

**********

And here ends our series on how language began in gesture–speech unity. To all who have participated, I express my thanks and admiration.  Comments are more than welcome at dmcneill@uchicago.edu.  I thank R.B. McNeill, N.B. McNeill and E.T. Levy for very helpful comments.

Further Reading

Atkinson, Quentin D. 2011. ‘Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa.’ Science 332: 346-349.

Bickerton, Derek. 1990. Language and Species. Chicago.

Cole, Jonathan. 1995.  Pride and a Daily Marathon.  MIT.

Deacon, Terrence W. 1997. The Symbolic Species: The Co-evolution of Language and the Brain. Norton.

Donald, Merlin. 1991. Origins of the Modern Mind: Three Stages in the Evolution of Culture and Cognition. Harvard.

Evans, Patrick D., Gilbert, Sandra L., Mekel-Bobrow, Nitzan, Vallender, Eric J., Anderson, Jeffrey R., Vaez-Azizi, Leila M., Tishkoff, Sarah A., Hudson, Richard R. and Lahn, Bruce T. 2005. ‘Microcephalin, a gene regulating brain size, continues to evolve adaptively in humans.’ Science 309: 1717-1720.

Freud, Sigmund. 2003. The “Uncanny” (Part two), David McLintock, trans., Intro Hugh Haughton. Penguin.

Furuyama, Nobuhiro. 2000a. Gestural interaction between the instructor and the learner in origami instruction, in McNeill D. (ed.), Language and Gesture, pp. 99–117.  Cambridge.

Gill, Satinder. 2007. ‘Entrainment and musicality in the human system interface.’ AI & Society. 21:567–605.

Goren-Inbar, Naama, Alperson, Nira, Kislev, Mordechai E., Simchoni, Orit, Melamed, Yoel, Ben-Nun, Adi and Werker, Ella. 2004.  ‘Evidence of hominid control of fire at Gesher Benot Ya’aqov, Israel.’  Science 304:725-727.

Kimbara, Irene. 2006. On gestural mimicry. Gesture 6: 39–61.

Lieberman, Philip. 2002.  On the nature and evolution of the neural bases of human language. Yearbook of Physical Anthropology 45: 36-63.

McNeill, David, Duncan, Susan, Franklin, Amy, Goss, James, Kimbara, Irene, Parrill, Fey, Welji, Haleema, Chen, Lei, Harper, Mary, Quek, Francis, Rose, Travis, and Tuttle, Ronald. 2009. ‘Mind merging,’ in Morsella, E. (ed.). Expressing Oneself / Expressing One’s Self:  Communication, Language, Cognition, and Identity, pp. 143-164. Taylor and Francis.

McNeill, David. 2010. ‘Gesten der Macht und die Macht der Gesten’, in Wulf, Christoph & Fischer-Lichte, Erika (eds.). Gesten, pp. 42-57. Munich: Wilhelm Fink (translation of ‘Power of Gestures and the Gestures of Power,’ available under Writings: Essays at http://mcneilllab.uchicago.edu/).

Merleau-Ponty, Maurice. 1962. Phenomenology of Perception, Colin Smith (trans.), Routledge.

Pika, Simone and Bugnyar, Thomas. 2011. ‘The use of referential gestures in ravens (Corvus corax) in the wild.’ Nature Communications 29 November.

Quaeghebeur, Liesbet. 2012. The ‘All-at-Onceness’ of embodied, face-to-face interaction. Journal of Cognitive Semiotics 4: 167-188.

Schegloff, Emanuel A. 1984. On some gestures’ relation to talk, in Atkinson J. M. and Heritage J. (eds.). Structures of Social Action, pp. 266–298. Cambridge.

Simon, Herbert A. 1968. The Sciences of the Artificial.  MIT.

Wachsmuth, I., Lenzen, M. and Knoblich, G. (eds.). 2008. Embodied Communication in Humans and Machines.  Oxford.

Wrangham, Richard W. 2001. ‘Out of the pan, into the fire: How our ancestors’ evolution depended on what they ate.’ In F. de Waal (ed.), Tree of Origin: What Primate Behavior Can Tell Us about Human Social Evolution, pp. 119-143. Harvard.

David McNeill is a professor in the Departments of Linguistics and Psychology at the University of Chicago.

His new title How Language Began: Gesture and Speech in Human Evolution is now available from Cambridge University Press at £19.99/$36.99

 

The origin of language in gesture–speech unity

Part 5: The dynamic dimension, modes of consciousness.

David McNeill, University of Chicago

The dual semiosis of global-synthetic gesture, merging with analytic-combinatoric speech, synchronizing at points where they are co-expressive – namely, gesture–speech unity – led to other dynamic properties: the imagery–language dialectic, and three others, “psychological predicates,” “communicative dynamism,” and that GPs self-unpack by “calling” constructions to do it.

Collectively these properties comprise “the sentence” viewed dynamically.  Dynamic properties arose organically out of Mead’s Loop. They would not have been separately selected. They are among the “new actions” mentioned in Part 4, are themselves linked and are inseparable from context. The context, dynamic in itself, penetrates GPs and leads ineluctably to dynamic properties.

We will also see that Wundt’s two consciousnesses of the sentence, the “simultaneous” and the “sequential,” plus a third to be introduced, “metapragmatic consciousness,” relate to the dynamic dimension while focusing on different aspects of it.

1. The dialectic. Mead’s Loop was the author of the dialectic, by bringing the gesture, where the semiotic is global and synthetic, into unity with speech, where cultural ratification and shareability create the opposite semiotic of analysis and syntagmatic value. It also explains why gesture-speech synchrony occurs: synchrony is a matter of thought in the dialectic; it is not an accident or the result of signals back and forth, but an intrinsic part of thinking and language.

2. The psychological predicate (as opposed to a grammatical predicate) is the newsworthy content in a field of contextual possibilities or “equivalents” (Jakobson).  The gesture–speech unit is equally a discourse unit.  It has absorbed its context as a matter of its formation. A psychological predicate:

  • Marks a significant departure in the immediate context; and
  • Implies this context as background.

One of Vygotsky’s examples is a crashing clock (p. 250): There is a crash in the next room – someone asks: “what fell?” (the answer: “the clock”), or: “what happened to the clock?” (“it fell”). Depending on the context – here crystallized in the questions – the newsworthy reply (the psychological predicate) highlights different elements.

This logic also applies to the GP. In forming a GP, the speaker shapes the background in a certain way, in order to highlight an intended differentiation within it, much as the questioner about the falling clock shaped the context of the replies.  We trace this logic using the following examples:

The (a) illustration (also shown in Part 4) is a GP in mid-flight (just as the speaker was saying the extra-stressed vowel of “through”), “he goes up thróugh the pipe this time,” with the gesture during the boldfaced portion.

The speaker had just before described how Sylvester had climbed the pipe on the outside (with the usual catastrophic outcome), so the fact that he was now going up it on the inside was newsworthy. Her psychological predicate was this information.

The context of equivalents it differentiated after the previous mention of the pipe was approximately WAYS OF CLIMBING A PIPE TO REACH TWEETY. Regarded dynamically, stemming from Mead’s Loop, this context and psychological predicate, ON THE INSIDE, were as much component parts of “he goes up thróugh the pipe this time” as were the words themselves.

As before, gesture and speech are co-expressive and semiotically opposites. The gesture depicted the rising character, his direction of movement, and that his path was inside the pipe; a collection of meanings we labeled “rising hollowness,” and depicted them in a single symbol.  Speech at the same time depicted the content analytically – broken into segments (he, goes, up, and through) and relating the segments by combination. The stress on “through” with extra effort was co-expressive with the added effort of the gesture’s spread open hand. In this way, the GP, as a unit, was comprised of synchronous and co-expressive semiotic opposites.

It was this content, differentiated in the psychological predicate and alive throughout the utterance, that filled in part Wundt’s “simultaneous” consciousness.

3. Communicative dynamism (a concept from Firbas) is the force pushing communication forward. As a dynamic feature it too emerged from Mead’s Loop and is also part of simultaneous consciousness.

In fact, the psychological predicate and communicative dynamism are deeply connected, actually are the same thing, a GP looking backwards or forwards at its context.  It is a psychological predicate as the GP looks at the context that it differentiates, a context possibly shaped to make the differentiation possible; it is communicative dynamism as the GP looks forward at the new context that by its own differentiation it reshapes.

This is possible because a GP absorbs the context as a matter of its formation. The context is continuously running through it, being both differentiated and reconfigured as the GP absorbs it.

For the psychological predicate the maxim is “significance” = “two things.”  “One” meaning is (a) a point of differentiation and (b) the context in which it is differentiated; a GP is both.  Merely having an association or intending a meaning, etc. is not enough.

For communicative dynamism the maxim is “more significant” = “more effort,” effort being the material carrier of the force pushing the communication forward (see Part 4 for the material carrier). It is not that the gesture expands as speech shrinks, as one might think if gesture took over from speech; gesture and speech are unified. The most elaborate linguistic units are accompanied by the most developed gestures, the least with the least. The more discontinuous an utterance from the previous context, the more probable a gesture, the more internally complex it will be, the more complex the synchronous speech, and the more all of it adds to communicative dynamism.

Communicative dynamism is the opposite of an economic, costs-minimizing/benefits-maximizing, “least effort” = “best result” dictum that some claim to be the optimal form of action. The economic model, for those fond of eponymous acronyms, can be called “Summer Lazing on the Beach” or SLOB: minimal effort (“Ugh!”) brings maximal benefit (Jeeves: “Would you care for a cool drink, sir?”).  The communicative dynamism of the GP, however, is the opposite of SLOB.

A natural experiment featuring psychological predicates and communicative dynamism shows these two faces of the GP at work.

Fortuitously, in the cartoon stimulus, Sylvester uses the drainpipe twice to reach Tweety. So far we’ve concentrated on the second of his attempts, the inside ascent. Just before it, in the cartoon, he climbed the pipe on the outside, like a ladder.  Comparing descriptions of the two ascents shows how the context is divided between psychological predicate and communicative dynamism.  Whatever a psychological predicate differentiates, communicative dynamism pushes forward, and these are the GP’s two faces (the natural experiment was discovered by my colleague, Susan Duncan).

If a speaker recalls both attempts, in the correct outside-inside order, as did (a), the psychological predicate relating to the second attempt should focus on interiority, reflecting the effects on context of the previous GP’s communicative dynamism. This follows from the psychological predicate concept. With the second attempt, climbing itself is no longer newsworthy; interiority is now the newsworthy point, and we get ON THE INSIDE, the psychological predicate orchestrating Speaker (a)’s “rising hollowness” growth point.

However, if a speaker recalls only the inside attempt and fails to recall the outside attempt, or recalls both attempts but reverses their order, the context would not have been reshaped by the outside ascent.  Then the interiority of the inside ascent would not be newsworthy: it would lack equivalents with which to contrast and would not be part of a new psychological predicate.

Speaker (b) illustrates this situation. She had forgotten the outside ascent and, for her, the context of the inside ascent was not WAYS OF CLIMBING THE PIPE, with a psychological predicate of ON THE INSIDE, but WAYS OF USING A DRAINPIPE, with CLIMB IT the newsworthy point.  That is, not having the communicative dynamism of the outside’s CLIMB IT, this very context and psychological predicate now arise for the inside climb.

This also follows from the psychological predicate concept. Lacking discourse significance, interiority would be just another detail from the cartoon (no one in any experiment has ever recalled only the outside attempt).

And as the (b) illustration shows, the gesture for the inside ascent depicted a simple ascent, with no interiority. Speech likewise, co-expressively, contained only CLIMB IT (“he tries climbing …”).

We can be certain Speaker (b) knew that Sylvester had climbed the pipe on the inside since she continued her description with an allusion to it, saying “Tweety … drops a bowling ball down the rain barrel”.

This natural experiment shows that Speakers (a) and (b) had different communicative dynamisms for the same cartoon event, correlated with their two psychological predicates, and, although the event was the same, were experiencing different simultaneous consciousnesses while describing it.

4. GP’s self-unpacking.  The essence of unpacking is: (1) without contradiction, to render a GP into a communicable, socioculturally mandated construction; and (2) to use the sense of well-formedness this construction brings as the GP’s stop-order.  A construction is the stable form on which a GP comes to rest.

Wundt’s “sequential consciousness” is awareness of this unpacking process while it is going on.

The GP core and its unpacking are on different functional levels, yet are intimately tied. This is because unpacking is a kind of self-unpacking by a GP.  A GP does it by calling forth a construction and this can take place at any point, while or after the GP forms.

Something similar could have existed at the dawn. Primordial GPs, seeking dialectic resolutions, produced pressures to develop ways of unpacking them. GPs and self-unpacking could have evolved together.

Both successes and failures of unpacking can be traced to a common source, the need to find constructions with semantic values that mesh with (or at least do not violate) the intended point of differentiation of a GP.

An example of an unpacking going awry is what Nancy Dray and I called the “nurturing example”: a would-be unpacking that contradicted the intended meaning. A speaker in a conversation with a friend was attempting to convey a nuanced idea, that a third person she was describing was given to performing nurturing acts, but these good deeds were also intrusive, cloying, and unwelcome. Initial false starts were based on trying to unpack this idea with “nurture” as a transitive verb (she would “nurture” someone). The effort was repeatedly rejected.

Ultimately an oblique construction that circumvented transitivity was found in (2):

(1) The fact [that she’s . . . ] [she’s nu- uh]

(2) [ . . . she’s somehow. . . she’s] [done this nurtur] [ing] [thing and here you]

The problem was a mismatch at (1) of the transitive construction to the context being differentiated. Transitivity means, roughly, that the woman described has a direct transformative impact via nurturing on the recipient of her action. However, this meaning distorted the idea the speaker intended to convey, which was something more like HER OTIOSE ACTS. The slight but successful updating in (2) separated effect from act and gave an unpacking that differentiated this context without clashing with it.

An example of a success, by yet another speaker retelling the Sylvester and Tweety cartoon, is illustrated in the following:

The drawings show two phases of a downward thrusting gesture for Tweety’s launch of the bowling ball: first (left), the beginning of the gesture (the start of its preparation), which we conclude was the GP’s onflash (there is no other reason the hands took this shape and position than to get ready to perform the gesture). Then (middle), the gesture stroke surfacing with its co-expressive speech:

  1. And Tweety Bird runs and gets a bowling ba[ll (bracket where preparation starts)
  2. And drops it down the drainpipe (boldface = the gesture stroke).

The “it down” plus gesture GP core unpacked itself by calling a construction in which the Tweety Good-force could be transferred to the bowling ball. This was “caused-motion,” where Tweety appears as agent and causes a change of state of the bowling ball. The change was two-fold, both causing the bowling ball to move and, metaphorically, turning it into Tweety’s surrogate.

A construction has its own meaning, pre-analyzed and broken into slots where words or other constructions are inserted with pre-determined syntagmatic values.

With caused-motion there is a slot for the agent of change (Tweety) and for the object undergoing the change (the bowling ball).  Thus the GP had points of contact in the construction by which it could “call” it.

To do this however the speaker had to transform the cartoon to make Tweety, not gravity, the agent.  We see this transformation at the very start. The gesture was shaped, at the onset of preparation, to “take up a position in the world of meanings” (Merleau-Ponty’s remark, quoted in Part 3), a meaning-world having to do with caused-motion, agency and unpacking.

The handshape was not an iconic depiction of the cartoon event – there, as seen in the last panel, Tweety supported the bowling ball with his “hands” facing upward and released it, letting gravity do the rest.  In the gesture the handshape was rather that of an agent thrusting a curved surface down. It was with this agent- and object-readiness the GP reached out to caused-motion.

5. Metapragmatic awareness. The simultaneous and sequential are consciousness of the sentence on the inside. Metapragmatic awareness is consciousness of the external forces shaping, guiding and motivating it, including context.

The term metapragmatics is from Michael Silverstein.  Its essential feature is awareness of the pragmatic effects of one’s speech in the social/communicative situation.

Metapragmatic consciousness stems from the social reference that Mead’s Loop provides. Without a sense of one’s own actions as social and public, metapragmatic indicators would not arise.

To bring Silverstein’s emphasis on structural indicators into our discussion of the dynamic dimension, I slightly modify his terminology to “metapragmatic orchestration.”

Caused-motion. To understand caused-motion metaphorically, not only to effect a change of the bowling ball’s location but to change its character from object to a moral force, metapragmatic orchestration of the construction by the speaker’s paradigm of opposed forces took place, Tweety–the Good–down versus Sylvester–the Bad–up.

The plurifunctional “it” had the expected cohesive effect of tying together the two references to the bowling ball in (1) and (2) but it also reflected a metapragmatic presupposition that objects introduced at one point, bowling balls, do not vanish a moment later. This presupposition orchestrated the pronoun’s double role in the unpacking.

The “it” reflected, not just anaphora, which it does (and is important for how the “it” could carry the assumption that the bowling ball did not vanish but continued in new form), but also that the bowling ball was changing from an object into the force for Good against Sylvester. All of this reflects the speaker’s construal of the episode as a paradigm of antagonistic forces and, more broadly, as Silverstein, after Halliday and Hasan, terms it, its “delicacy.”

What to say. Metapragmatics told the speaker that the bowling ball had to be mentioned, that the GP of one event (the outside ascent) had created a context the next GP (the inside ascent) was able to differentiate, etc.

It also accounts for “drops,” which, while part of caused-motion, allowed the motion of the bowling ball to stem from gravity, over “throws” or “thrusts,” which did not.

And, finally, this current-day sentence, with all its richness beyond denotative content, models the many selection pressures (obviously not the content) that GPs could have radiated at the dawn.

Summary: How Mead’s Loop engendered GP properties.

I conclude by listing the chief properties of the dynamic dimension and GP, and how Mead’s Loop could have engendered them.

1. The dual semiotic dialectic.  The new form of mirror neuron response in Mead’s Loop merged vocal movements and gesture, and synchronized them at points where they were co-expressive of the same underlying idea, laying the ground for the imagery–language dialectic and the dynamic dimension of language. Once codified linguistic forms evolved (and this would have been immediate, along with GPs) a dialectic would be the response. Without Mead’s Loop, gesture and speech would have had only the loose connection seen with pantomime, and language could not have escaped the single-semiosis box.

2. The social reference.  The social orientation of mirror neurons with the Mead’s Loop “twist” gave gestures and GPs a social reference character. Without Mead’s Loop gestures could have social reference only if directed at an interlocutor. But speech-unified gesticulations are not necessarily aimed at someone (and except for emblems and points, both culturally mandated to include interpersonal phatic connections, they rarely are) yet gestures and their GPs are social (“public”) entities. Also, crucially, a foundation in Mead’s Loop opened a route over which the social conventions of speech and thought could form, GPs absorbing social-interactive content. An important effect of the inherent sociality of GPs due to Mead’s Loop arose in the origin of syntax, namely, “shareability.”

3. Origin of psychological predicates. The psychological predicate, the differentiation of what the speaker deems newsworthy in the immediate context, was inherent to Mead’s Loop by virtue of how it brought gesture in as a speech-orchestrating force. The inseparability of the psychological predicate from context resides in Mead’s Loop’s self-response. Similarly for the other direction of the GP’s regard, communicative dynamism.

4. Origin of catchments. Catchments are threads of consistent imagery attached to a discourse theme. They also arose with Mead’s Loop as a matter of course. Mead’s Loop binds imagery with speech and brings the meaning of the image into the (“twisted”) mirror neuron circuit. Although not shown, each time the “it down” speaker regarded the bowling ball as an antagonistic force its ball-shaped imagery returned along with the theme (a half dozen such occurrences).

5. Metapragmatic indicators.  Mead’s Loop engendered gesture–speech unities and also the metapragmatic scaffolding that encompasses the GP dialectic.  The social reference of gesture–speech unity makes the pragmatic effects of one’s own symbolic actions stand out and sets up conditions for metapragmatic indicators.

6. The equiprimordiality of speech and gesture. To select Mead’s Loop, speech and gesture had to evolve together. One could not have come first and the other later. This basic difference from gesture-first is perhaps the one step, if we try to single out one, that sets human linguistic evolution apart. Some avian species (crows, ravens) have evolved surprisingly elaborate vocal and gestural repertoires but have not taken the step that led to language, evolving a unit that is both sound production and gesture integrally. The entire Mead’s Loop process involves one’s own gesture impinging on the same area of the brain where vocal actions are orchestrated, so both are necessary.

7. Origin of unpacking. The GP is the core idea at the moment of speaking, differentiating a context; unpacking cradles it and intuitions of well-formedness comprise the stop-order. Even if simultaneous, the GP and unpacking are functionally distinct, with unpacking dependent on the GP.  This was the key to the origin of unpacking as well.  By combining imagery with codified form, a self-created selection pressure for the static dimension also arose.

But not theory of mind. Mead’s Loop however is not theory of mind. They are in fact opposites. The Mead’s Loop adaptation brings self-awareness of one’s own behavior as social, not a theory of the cognitions and intentions of another (also, a theory of mind would tap straight mirror neurons; it could work with Mead’s Loop but could not duplicate it).

Both theory of mind and Mead’s Loop depend on a more fundamental faculty, self-aware agency, a topic the late Susan Hurley explored in depth. The origin of Mead’s Loop depended on this sense. It is through awareness of one’s own agency that gestures can be responded to as if from another. Awareness of self as agent develops in the child around age 4, and both Mead’s Loop and theory of mind show themselves then as well (children’s language was discussed in Part 4).

Further reading

Dray, N. L. and McNeill, D. 1990.  ‘Gestures during discourse: The contextual structuring of thought,’ in S. L. Tsohatzidis (ed.), Meanings and prototypes: Studies in linguistic categorization, pp. 465-487. Routledge.

Firbas, Jan. 1971.  ‘On the concept of communicative dynamism in the theory of functional sentence perspective.’  Philologica Pragensia 8: 135-144.

Givón, Talmy. 1985. ‘Iconicity, isomorphism and non-arbitrary coding in syntax’, in J. Haiman (ed.). Iconicity in Syntax, pp. 187-219. Benjamins.

Goldberg, Adele. 1995.  Constructions: A construction approach to argument structure. Chicago.

Halliday, M.A.K. and Hasan, Ruqaiya. 1976. Cohesion in English. Longman.

Hurley, Susan. 1998. Consciousness in Action. Harvard.

Jakobson, Roman. 1960.  Concluding statement: Linguistics and poetics, In Sebeok, T. (ed.). Style in Language, pp. 350-377. MIT.

Pika, Simone and Bugnyar, Thomas. 2011. ‘The use of referential gestures in ravens (Corvus corax) in the wild.’ Nature Communications 29 November.

Silverstein, Michael. 2003. ‘Indexical order and the dialectics of sociolinguistic life.’   Language & Communication 23:193–229.

van Eijck, Jan and Visser, Albert, ‘Dynamic semantics,’ The Stanford Encyclopedia of Philosophy (Fall 2010 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/fall2010/entries/dynamic-semantics/> (accessed 19 Nov. 2012).

Vygotsky, Lev S. 1987. Thought and Language. Edited and translated by E. Hanfmann and G. Vakar (revised and edited by A. Kozulin). MIT.

 


The origin of language in gesture–speech unity

Part 4:  Mead’s Loop (2). Wider consequences.

David McNeill, University of Chicago

As it evolved Mead’s Loop created “new actions,” as mentioned previously.  New actions are one of the “wider consequences” of Mead’s Loop. Action itself was a target of natural selection, and the new actions emerged organically. They did not need a separate evolution. A second consequence is metaphoricity. A third is the emblem, a culturally established gesture with metaphoricity at the core. A fourth is how children acquire language – twice, the first of which goes through the equivalent of extinction.  A fifth (many more can be identified) is what phenomenologist philosophy calls “being” – “inhabiting” gesture and speech, rather than only displaying them as elements of communication.

1. New actions and old. What remains of “old actions” in the new world? Many consider actions to be the source of gestures: on an action-based semiotic, a gesture of something being flat is a truncated version of “making something flat.”

The idea that gestures derive from actions is plausible at first glance but there is more (or less) than meets the eye. A gesture may look like a pragmatic action but the action has changed at its core. To describe a gesture as “outlining” or “shaping” is useful as a description but to also say that such a practical action is still within the gesture is to disregard what makes the gesture a human sign.

In the illustration, a “rising hollowness” gesture looks like the action of lifting something in the hand, but it is not lifting at all. It is an image of the character rising, of the interior of the pipe through which he rose, and of the direction of his motion upward – all compacted into one symbolic form to differentiate a field of meaningful equivalents having to do with HOW TO CLIMB A PIPE: ON THE INSIDE. This complex idea, as a unity, orchestrated the hand shape and movement; it is the same motor response but it is not the same action as lifting up an object.

While a gesture may engage some of the same movements and tap in part the same motor schemas as an “action-action” it has its own thought–language–hand link. In keeping with the unity of speech and gesture, the manual movements of gestures as well as the actions of speech are co-opted by Mead’s Loop and orchestrated in new ways by significances other than those of the original actions. We observe the separation of “action-actions” and “gesture-actions” directly in the IW case, where there has been a complete deafferentation.  Without vision, action-actions and gesture-actions dissociate, the first being impossible while the second are normal (IW is described fully in a later post).

Some gestures do depict actions; they ritualize the actions they depict. They are more than actions – a kind of performance, a replication of the action, and may also include posture, spatial location, voice, etc. as well as the manual action.

These gestures are of two kinds. Some are pantomimes at their own locus on the continuum of gesture types. Others are “character viewpoint” (C-VPT) gestures, gesticulations with the viewpoint of the character that is being recounted. And here again the difference of pantomimes and gesticulations applies.

The C-VPTs pass through the thought–language–hand link. Unlike pantomimes they are co-expressive with speech. Their character’s viewpoint is part of the semiotic which speech opposes in a dialectic. A C-VPT is, among other things, not an O-VPT or “observer viewpoint,” a contrast that it has but that is not part of a pantomime.

Part of the scope, creativity and distinctiveness of human thought lies precisely in its freedom from pragmatic action constraints. When the hands make a gesture, it is thought that controls them and not a hidden action with its own purposes in relation to the physical world.

2. Metaphoricity is the semiotic basis of metaphor, and it also arises out of Mead’s Loop’s “new actions.” It is the mental ability (apparently unique to humans) to experience one thing in terms of something else (“metaphors” = specific cultural-linguistic packages, or temporary individual impromptu ones, that rest on this semiotic). Thinking with metaphors is natural and irresistible, and is explained if metaphoricity (not necessarily any specific metaphor) was a product of how language began. According to Mead’s Loop metaphoricity has existed from the very beginning.

The metaphoricity semiotic came about when the orchestration of actions of the vocal tract and hands was undertaken by something other than those actions, by meaningful gesture imagery, one thing (voice, hands) gaining significance in terms of something else that it is not.

Examples are the hands rotating in the Part 1 illustration – a process as a rotation (co-expressive with “barreling up,” a spoken version of the same metaphor). A novel, not culturally shared, impromptu example is a gesture for inaccessibility – hands separated, one above, the other below, Tweety on his perch, Sylvester on the street below.  The speaker first struck this pose as she said, “part of the problem is that Tweety Bird’s inaccessible,” and then struck it six more times as she listed the various attempts that Sylvester had made to reach Tweety, each gesture starting as an iconic depiction of the attempt and seguing into the metaphor, conveying the futility of the attempt because of this inaccessibility.

3. Emblems as metaphors.  Metaphoricity shows up spontaneously in these impromptu gestures made on the fly by individual speakers but it also appears in gestures of the seemingly opposite kind – culturally ratified gestures of the kinds listed in dictionaries of “Neapolitan Gestures” or “Persian Gestures” and the like – the so-called emblems.  Many if not all cultural emblems contain probably originally impromptu metaphors at the core. The very possibility of an emblem can be considered another consequence of Mead’s Loop, by way of metaphoricity.

Cultures imbue certain metaphors (arguably ones reflecting cultural values and history) with standards of form and specific functions.  This moves them from impromptu gesticulation toward (but not quite reaching) the sign language end of the gesture continuum.

The “OK” sign (sometimes called “the ring”) is a convenient example that appears in many cultures. It is a culturally mandated version of gestures for precision. In performing them one experiences the abstract idea of precision as a feeling of minimization of the space between surfaces; surfaces not automatically in contact but which in the gesture touch. The OK sign’s meaning as an emblem reflects this precision origin, something “being OK,” earning approbation, because it “precisely” meets the requirements at hand. In raw form a precision gesture can be made in various ways, and it has variety even with a single hand, any finger in contact with the thumb (the thumb is invariable for anatomical reasons, but the different fingers contacting it may have their own significances).

The “OK” emblem restricts the handshape – only the forefinger makes contact, the other fingers extending outward. Meaning is likewise restricted – approbation because something is precise, is just so (like the spoken, “that’s it!”). Reflecting its precision source “OK” differs from another approbation emblem, thumbs-up, which has its own metaphor – “up is good.”  Thumbs-up (or -down) uses pointing to indicate the metaphor proper. The upturned thumb indicates the location of the good in “up-is-good” (conversely, thumbs-down indicates the location of the bad in “down-is-bad”). (The “precision” meaning of “OK” also rules out another theory that the gesture has nothing to do with metaphor but reproduces the letters “O” and “K” – then why is precision the core?)

Emblems do not reach the sign language end of the gesture continuum because, while they can co-occur with other emblems, the combinations lack any stable syntagmatic value: waving the “OK” sign back and forth could be “not OK,” or “everything is OK,” or “look, it’s OK!” a range so broad and replete with contradictions that it is fundamentally non-languagelike.

Children pick up some emblems as early as the first birthday (waving “bye-bye” and others) but it is hugely doubtful that metaphoricity plays any part. A metaphor source of any kind is unlikely (to say the least, if the source of “bye-bye” is something like wiping a situation or oneself away).

So-called “child metaphors” likewise probably do not involve metaphoricity – a 24-month-old saying “cup swimming” as he pushes a cup along in his bath, or “I’m a big waterfall” as he slides down his father’s side while wrestling, presumably is not experiencing the cup’s motion as swimming or his own motion as a waterfall.  Instead, there is a piling on of descriptions as they are “shared with the other,” which is quite a different thing (cf. Werner & Kaplan’s remark: the speech of children this young has “the character of ‘sharing’ experiences with the other rather than of ‘communicating’ messages to the other”).

4. What it means if children acquire language twice. Of all the possible indicators of gesture-first it is early ontogenesis that most convincingly suggests it may once have existed. Something like a recapitulation of it arises and performs a scaffolding function like that envisioned by gesture-first advocates. But it dies out, in a kind of extinction, during a transitional period from roughly 2 to 3 years of age.

GPs then emerge “late,” at age 3 or 4 years, with several indications of dual semiosis emerging at the same time, suggesting that gesture-first had once existed (both in children and in the ancient past) but went extinct and a new form of language followed, where speech and gesture imagery merged into the unified packages inhabited by thought and being that we see in ourselves.

That is, language appears to emerge in the child twice, with the first emergence extinguishing – children first acquiring a single semiotic language of which a gesture-first creature also would have been capable; later developing the dual semiotic language we all carry with us.

This style of argument – resting on ontogeny-recapitulates-phylogeny – has often been derided but there has been a recent revival of interest in it. It can be useful and heuristic for sorting out steps in phylogenesis.

For current-day children, the argument implies, contrary to a longstanding assumption that children develop more or less continuously (perhaps with stages, but earlier acquisitions still carrying forward), that ontogenesis is not cumulative; it is a mixture of continuity and discontinuity.

Discontinuities come from the recapitulation of the two origins. Continuities come from an autonomous development of speech control. Speech in the child separates from Mead’s Loop, its evolutionary origin according to theory, which could be due to other evolutionary pressures that adapted speech to garner parental attachment (“baby-talk” is the adult half of the same adaptation).

The early single-semiotic acquisition is limited, much as Bannard et al. remark:

“…children’s speech for at least the first 2 years of multiword speech is remarkably restricted, with constructions being seen with only a small set of frequent verbs … and many utterances being built from lexically-specific frames.”

Limitation is seen in what Braine called “pivot grammars” and Lieven et al “templates.”  Pivots could have been the highest reach of gesture-first.  The table below is an example from Braine.  There would be as many “grammars” as there are pivots, possibly in the hundreds for gesture-first creatures (such a language would not have one of the major features of human language, the infinite productivity in which there is no last sentence).

In other words, the first steps children take toward language may not lead to language, but to something coming from a long-extinct creature; then comes a second origin of the language that we take for granted, but this is not until the first has extinguished. The single-semiotic gestures without co-expressive speech of the first phase – pointing, pantomime, emblems, action-stubs, diffuse motor responses – are quite different from those of the last – dual semiotic and unified with speech.

A pivot “grammar” with “want”

want + { baby, car, do, get, glasses, hand, purse, ride, up, etc. }
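
The limit on productivity can be made concrete with a small sketch. It is only an illustration under stated assumptions, not Braine’s own analysis: the function name pivot_utterances and the dictionary layout are hypothetical, each pivot is assumed to combine with a finite stock of words, and the word list is just the portion of the “want” example shown above (the original ends with “etc.”). Because a pivot pairs with one companion word and nothing recurses, the set of utterances is finite and there is always a “last sentence.”

```python
from typing import Dict, List

# Hypothetical representation: each pivot word maps to the companion words
# it has been seen with. Only the words shown in the "want" example are
# listed; the source list ends with "etc.", so this is not exhaustive.
PIVOT_GRAMMARS: Dict[str, List[str]] = {
    "want": ["baby", "car", "do", "get", "glasses", "hand", "purse", "ride", "up"],
}


def pivot_utterances(grammars: Dict[str, List[str]]) -> List[str]:
    """List every two-word utterance the given pivot grammars can produce."""
    utterances = []
    for pivot, companions in grammars.items():
        for word in companions:
            # One pivot plus one companion: no embedding and no recursion,
            # so the number of utterances is bounded by the word lists.
            utterances.append(f"{pivot} {word}")
    return utterances


if __name__ == "__main__":
    for utterance in pivot_utterances(PIVOT_GRAMMARS):
        print(utterance)
```

Adding more pivots to the dictionary adds more separate “grammars,” but the total stays a finite sum of list lengths, which is the contrast with the open-ended productivity of human language drawn above.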

We can take the recapitulation argument a step further: when something emerges in current-day ontogenesis only at a certain stage we reason (in this way of arguing) that the original natural selection of the feature (if any) took place in a similar psychological milieu in phylogenesis. We exploit the fact that children’s intellectual status is not fixed; it is changing. Thus we look for new states that seem pegged to steps in the ontogenesis of growth points and Mead’s Loop underlying them, and consider these steps as possible windows onto phylogenesis.

Using this argument, we are able to look at the ontogenesis of the GP and formulate possible phylogenetic landmarks.  Most importantly, the GP’s emergence seems tied to the development of the child’s self-aware agency, appearing first at age 4 or so, suggesting that in phylogenesis a similar sense of one’s own agency was a condition for Mead’s Loop, a plausible hypothesis given that Mead’s Loop is adaptive when the adult sees her own gesture–speech as social/public. It made “instruction” possible as opposed to “doing” with an onlooker.

5. Material carriers, “inhabitance,” and cognitive being. Another consequence of Mead’s Loop is what Vygotsky termed the material carrier – the embodiment of meaning in enactments or material experiences. Having a material carrier enhances the symbolization’s experiential potency. The speaker/hearer “inhabits” the materialized symbols.

Experiential enhancement of language is possible if the gesture is the image in an imagery–language dialectic, not an “expression” or “representation” of it, but is it. From this viewpoint, a gesture, the global-synthetic whole, is an image in its most developed – that is, in its most materially, naturally embodied – form. The absence of a gesture is the converse, an image in its least material form.

The material carrier concept thus explains how an imagery–language dialectic still is possible in the absence of visible gestural movement. When there is no overt gesture there is still imagery and with linguistic categorization a dialectic, still a simultaneous rendering of meaning in opposite semiotic modes – the dialectic in its essentials – but bleached and at the lowest level of materialization. This leads us to expect that gestures are more elaborate, more materialized and more frequent – more “existent” – when the gesture has greater newsworthiness, as we shall see in the next post of this series.

The source of the material carrier effect is ultimately Mead’s Loop with gesture-actions orchestrated under significances other than action-actions: materialization follows ineluctably. Materialization implies that the gesture, this natural material carrier, the actual motion of the gesture itself, is a dimension of meaning.

The concept of a material carrier is brought to a whole new level when we turn to Merleau-Ponty for insight into the unity of gesture and language and what we expect of gesture in a dual semiotic process.

Gesture, the instantaneous, global, nonconventional component, is “not an external accompaniment” of speech, which is the sequential, analytic, combinatoric component; it is not a “representation” of meaning, but instead meaning “inhabits” it (partly quoted in Part 3):

The link between the word and its living meaning is not an external accompaniment to intellectual processes, the meaning inhabits the word, and language “is not an external accompaniment to intellectual processes” (Merleau-Ponty’s quotation is from Gelb and Goldstein 1925). We are therefore led to recognize a gestural or existential significance to speech . . . Language certainly has inner content, but this is not self-subsistent and self-conscious thought. What then does language express, if it does not express thoughts? It presents or rather it is the subject’s taking up of a position in the world of his meanings. (p. 193).

The GP is geared to this “existential content” of speech – this “taking up a position in the world.”  Gesture, as part of the GP, is inhabited by the same “living meaning” that inhabits the word (and beyond, the discourse).

A deeper answer to the query – when we see a gesture, what are we seeing? – is that it is part of the speaker’s current cognitive being, her very mental existence, at the moment it occurs. This extends the material carrier and ultimately rests on a gesture–speech unit. By performing the gesture, a core idea is brought into concrete existence and becomes part of the speaker’s existence at that moment.

The Heideggerian echo in this statement is not accidental. Following Heidegger’s emphasis on being, a gesture is not a representation, or is not only such: it is a form of being. From a first-person perspective, the gesture is part of the immediate existence of the speaker. Gestures (and words, etc., as well) are themselves thinking in one of its many forms – not only expressions of thought, but thought, i.e., cognitive being, itself. To the speaker, gesture and speech are not only “messages” or communications, but are a way of cognitively existing, of cognitively being, at the moment of speaking.

The speaker who creates a gesture of Sylvester rising up fused with the pipe’s hollowness is, according to this interpretation, embodying thought in gesture, and this action – thought in gesture-action over the thought–language–hand link – is part of the person’s being cognitively at that moment.

To make a gesture, from this perspective, is to bring thought into existence on a concrete plane, just as writing out a word can have a similar effect. There is not a causal sequence: thought → speech/gesture. Speech and gesture are the thought coming into being at that instant. The greater the felt departure of the thought from the immediate context, the more likely its materialization as a gesture, because effort adds to being. Thus, gestures are more or less elaborated depending on the importance of material realization to the existence of the thought.

6. The theater of the mind, closed. The “H-model” avoids the homunculus problem encountered by the third person perspective inherent to the concept of a “representation” and with it the “theater of the mind” problem.  The theater of the mind is the presumed central thinking area in which representations are “presented” to a receiving intelligence. The possibilities for homunculi – each with its own theater and receiving intelligence – spiraling down inside other homunculi are well known.  In the H-model, there is no theater and no extra being; the gesture is, rather, part of the speaker’s momentary mode of being itself, and is not “watched.”  The theater is closed or, rather, it never opened.

Further Reading

Action

Kendon, Adam. 2009. ‘Manual actions, speech and the nature of language.’ In Gambarara, Daniele and Givigliano, Alfredo (eds.). Origine e sviluppo del linguaggio, fra teoria e storia. Pubblicazioni della Società di Filosofia del Linguaggio, pp. 19-33. Rome: Aracne editrice s.r.l.

Kendon, Adam. 2010. ‘Accounting for forelimb actions as a component of utterance: An evolutionary approach.’ Plenary Lecture. International Society for Gesture Studies, Frankfurt/Oder, July 25, 2010. (abstract at http://www.gesturestudies.com/past.php, accessed 09/10/12).

LeBaron, Curtis and Streeck, Jürgen. 2000. ‘Gestures, knowledge, and the world,’ in D. McNeill (ed.) Language and Gesture, pp. 118-138. Cambridge.

Streeck, Jürgen. 2010. Gesturecraft: The manu-facture of meaning. Benjamins.

Child language and ontogeny recapitulates phylogeny

Bannard, C., Lieven, E., and Tomasello, M. 2009. ‘Evaluating constructivist theory via Bayesian modeling of children’s early grammatical development.’ Abstract posted on the International Cognitive Linguistics Conference website, accessed 03/30/09.

Braine, Martin D. S. 1963. ‘The ontogeny of English phrase structure: the first phase.’ Language 39:1-13.

Butcher, Cynthia and Goldin-Meadow, Susan. 2000. ‘Gesture and the transition from one- to two-word speech: When hand and mouth come together.’ In D. McNeill (ed.), Language and Gesture, pp. 235-257. Cambridge.

Goldin-Meadow, Susan and Butcher, Cynthia. 2003. ‘Pointing toward two-word speech in young children.’ In S. Kita (ed.), Pointing: Where language, culture, and cognition meet, pp. 85-107. Erlbaum.

Levy, Elena. 2011. ‘A new study of the co-emergence of speech and gestures: Towards an embodied account of early narrative development.’ Language Fest, University of Connecticut, Storrs, CT.

Lieven, Elena, Salomo, Dorothé and Tomasello, Michael. 2009. ‘Two-year-old children’s production of multiword utterances: A usage-based analysis’, Cognitive Linguistics. 20: 461-507.

MacNeilage, Peter F. 2008. The Origin of Speech. Oxford.

Werner, Heinz and Kaplan, Bernard. 1963. Symbol Formation. Wiley.

Material carriers, inhabitance, and cognitive being

Dreyfus, H. 1994. Being-in-the-World: A Commentary on Heidegger’s Being and Time, Division I. MIT.

Gallagher, Shaun.  2005. How the Body Shapes the Mind. Oxford.

Merleau-Ponty, Maurice. 1962.  Phenomenology of Perception (C. Smith, trans.). Routledge.

Quaeghebeur, Liesbet. 2012. The ‘All-at-Onceness’ of embodied, face-to-face interaction. Journal of Cognitive Semiotics 4: 167-188.

Metaphor

Cienki, Alan and Müller, Cornelia. 2008. Metaphor and Gesture. Benjamins.

Lakoff, George and Johnson, Mark. 1980. Metaphors We Live By. Chicago.

Müller, Cornelia. 2008. Metaphors – Dead and Alive, Sleeping and Waking. A Dynamic View. Chicago.

(children’s)

Carlson, Patricia and Anisfeld, Moshe. 1969. ‘Some observations on the linguistic competence of a two-year-old child.’ Child Development 40:569-575.

Theater of the mind

Dennett, Daniel C. 1991. Consciousness Explained. Little, Brown.

David McNeill is a professor in the Departments of Linguistics and Psychology at the University of Chicago.

His new title How Language Began: Gesture and Speech in Human Evolution is now available from Cambridge University Press at £19.99/$36.99

 

The origin of language in gesture–speech unity

Part 3: Mead’s Loop (1).

by Professor David McNeill

Part 1 of this series put forth the idea that language is inseparable from imagery, in particular the imagery of gesture, and that theories of language origin can be judged by how well they predict this gesture–speech unity.  The second part applied the test to a widely held origin theory, gesture-first, and found it wanting – doubly so, in fact. This part applies the test to a new hypothesis, which I call “Mead’s Loop.” 

Mead’s Loop holds that gesture was essential in the origin of language.  In this it agrees with gesture-first, but differs in that, it says, gesture and speech had to be naturally selected together.  Rather than gesture-first (or speech-first), gesture and speech were what Liesbet Quaeghebeur, philosopher at the University of Antwerp, has called “equiprimordial,” the antithesis of gesture- or speech-first.

Mead’s Loop rests upon an idea from the early 20th Century philosopher, George Herbert Mead, formulated as an origin hypothesis to portray what, some one-half to one million years ago, emerged in the evolution of the human brain. It posits the mirror neuron circuits that gesture-first also assumes, but again with a difference. Mirror neurons in Mead’s Loop were “twisted” to respond to one’s own gestures as if they were from someone else.

Mirror neurons have been directly recorded in monkeys and supposedly reside in all primate brains, including ours.  Part 2 quoted Rizzolatti and Arbib’s definition. A Wikipedia article also defines them succinctly: “[a] mirror neuron is a neuron that fires both when an animal acts and when the animal observes the same action performed by another.” I call these mirror neurons “straight,” to distinguish them from Mead’s Loop.  Note what they provide.  The significance of the straight mirror neuron response is that of the action it mimics.  For example, when one sees someone picking up a treat, the mirror neuron repeats this action, with its meaning. The action of another is repeated (not necessarily overtly but at the orchestration level) and it becomes one’s own. If the mirror neuron circuit produces a gesture it will be a mimicked action like the one perceived. It in fact resembles pantomime, a gesture, as we saw in Part 2, that systematically blocks gesture–speech unities.

Mead’s Loop refers to a posited new adaptation, a thought-language-hand link, located at least in part in the area now called Broca’s Area (other brain areas also must have been involved). Here is the twist:  G. H. Mead said that a gesture is meaningful when it evokes the same response in the one making it as it evokes in the one receiving it. For evolution, this suggests that mirror neurons came to bring one’s own gesture imagery and its significance into Broca’s Area, the motor area for orchestrating actions including speech and gesture. While straight mirror neurons reproduce the actions of another, with meanings that are those of the actions, the Mead’s Loop twist responds to one’s own gestures as if from another, and brings different meanings into the action-orchestration areas of the brain, those of the gestures.

The Mead’s Loop twist, because it brings the gesture’s meaning into the orchestration process, merges and synchronizes speech with gesture at points where they co-express the same idea. Hence the unity: it is built in. In all of this the gesture is fundamental. Mead’s Loop creates “new actions,” actions orchestrated under significances other than their practical goal-directed meanings – those of the gestures that Mead’s Loop imports.  Because Mead’s Loop gave gestures the power to orchestrate speech, Mead’s Loop was the beginning of everything in language.

These achievements opened a door to language dynamically. Mead’s Loop had both semiotic and motor effects:

  • Semiotically, it brought the gesture’s meaning into the mirror neuron area. Mirror neurons no longer were confined to the semiosis of actions. One’s own gestures entered, opening action control to the imagery of gesture. Extended by metaphoricity, the significance of imagery is unlimited. So from this one change, the meaning potential of language moved away from only action and expanded vastly.
  • At the motor level, in the areas of the brain where speech movements are orchestrated, Mead’s Loop enabled significant imagery – gesture – to “chunk” motor control of the vocal tract and diaphragm, and laid the foundation of the GP.

How does Mead’s Loop produce gesture-speech unity?  As mentioned, it was built in from the start.  The evolutionary step was a self-response by mirror neurons.  Mirror neurons complete Mead’s Loop in a part of the brain where action sequences are organized – two kinds of sequential actions, speech and gesture, converging, with meaningful imagery the integral component. Co-opting sequential actions by a socially referenced stimulus (imagery) provides a new kind of action in the vocal tract – speech, with its own movements, timing, tongue postures, and breathing. It thus explains, as gesture-first could not, why gesture and speech are unified.

By treating imagery as a social stimulus, Mead’s Loop also explains why gestures occur preferentially in a social context of some kind (face-to-face, on the phone, but not alone talking to a tape recorder).

But was the twist needed?  It was, because the gesture, although emanating with full meaning from the same brain area as speech, does not unite with it. It is neither synchronous nor co-expressive. It is incomplete. Gesture–speech unity happens only when the gesture gets a self-response via Mead’s Loop and becomes able to orchestrate speech movements (not sequentially, but self-response is an essential aspect of the gesture’s meaning, the meaning under which speech and gesture combine). This was Mead’s insight. He recognized that gesture (and speech) have fundamentally a social character and, to be meaningful, must have a social/public presence: the gesture, he said, evokes the same response in the one making it as in the one receiving it. With Mead’s Loop, this occurs when the one making and the one receiving are the same; this is the “twist”; then the gesture is a meaningful and socially pertinent event, with the potential to connect to everything else in language dynamically. It orchestrates vocal and manual movements.

Then speech passes from “display” (of which chimps are capable) to “communicating messages to the other” (a phrase from Werner & Kaplan).

Straight mirror neurons do not respond because there is no external action; only the “twist” can self-respond in this way.

A self-response to the gesture can pick up other meanings as well, and these can further cement gesture–speech unity.  In the process the gesture also changes to meet its role of forming a unit with speech. In Part 4 of this series I’ll show gestures reshaped by gesture–speech unity.

Mead’s Loop also gains substance for a reason identified by Merleau-Ponty: “Language … presents or rather it is the subject’s taking up of a position in the world of his meanings” (p. 193).  Via Mead’s Loop and its social reference, the gesture takes up its position in the world of meanings as well.  This move equally reshapes the gesture, in keeping with gesture–speech unity.

That Mead’s Loop gave one’s gestures a public, social significance had importance for another reason, natural selection. It meant that the Mead’s Loop twist was adaptive in social-interactive situations (so those favorites, “man-the-tool-maker” and “man-the-hunter,” would be incidental to language origin, effective insofar as they are also social but not significant in themselves). The social reference gave adults, in particular mothers inculcating cultural norms in infants, the sense of being an instructor as opposed to being just a doer with an onlooker (which is what happens with chimpanzees).  Entire cultural practices of human childrearing depend upon this sense. The adult must be sensitive to her own gestures as social/public actions. Hence the adaptiveness of Mead’s Loop.  Sensing actions as social impacts the next generation of children who, as a result of it, do better at coping, and pass it on.

Origin of syntax.  To many the origin of patterned language, of syntax, is the crux of the origin of language as a whole. How, when, or even why syntax emerged is far from obvious. Proposals range from a “big bang” single mutation, through cultural practices such as ritual or grooming, to no special sources at all, just a natural by-product of human intelligence in general.  Whatever it was, over eons it has led to vast crosslinguistic diversity. I follow Eric Lenneberg and affirm that syntax rests on a biological foundation, hence is a topic in the origin of language.

The basic idea stemming from Mead’s Loop is that words and syntax are continuations of GPs. They and GPs are linked organically. I seek the natural selection of syntax (the general ability, not specific constructions, although some constructions also could have been naturally selected) in three places – the nature of the GP and its unpacking; the new paths this opened; and shareability. These in turn suggest three kinds of adaptive advantages.

First, syntax is crucial for a GP dialectic. Without morphs and combinations of morphs there cannot be a semiotic opposition to gesture imagery.

Second and linked, syntax stabilizes the dialectic. It is the resting point par excellence.

Third, syntax helps make language shareable in sociocultural encounters.

Any or all of these factors could have favored an ability to form syntactic patterns, defined generally as creating meaningful wholes out of segmented elements (morphs); meeting standards of form; providing cultural identity; learning this system, and transmitting and maintaining it over space and time. We are focusing on the dynamic dimension of language. This dimension crosscuts the static and is not reducible to it (nor vice versa, the static is not reducible to the dynamic; they are two dimensions, not one dimension in two forms).

That the static and dynamic arose together, were equiprimordial, is explained by Mead’s Loop’s built-in social referencing, combined with gesture imagery. From this vantage point, we can claim that words and sentences continue the evolution of the GP. Contrary to traditions both philological and Biblical, language did not begin with a “first word.” Words emerged from GPs. There was an emerging ability to differentiate newsworthy points in contexts; a first gesture–speech unit but not a first word.

The paradox of an emerging syntax is that it is almost invisible in current humans. Children learn their language with speed, but they are given a language, not inventing one. When gestures are forced to be the sole medium of communication in experiments, however, they quickly develop linguistic values original to the speaker and the situation, not borrowed from an existing language, including novel axes of selection (paradigmatic values) and combination (syntagmatic values), suggesting a faculty for syntactic innovation in current-day humans. It is this hidden ability, we propose, that arose out of GPs at the origin.

Whence such a faculty for syntactic invention?  An important insight is “shareability” from a 1983 paper by Jennifer Freyd. To share information imposes a “discreteness filter” such that the semiotic properties of words (discreteness) and word combinations arise.  Shareability would have existed at the dawn.  It also existed in the gesture-communication experiments, so conditions for new word forms and combinations existed in both. Words and combinations of words are part of the GP’s imagery-language dialectic, both opposing codified linguistic form to gesture semiotically, and providing a dialectic stop-order through unpacking.

GPs and syntax thus emerged together according to Mead’s Loop, and could do so because in Mead’s Loop the gesture assumes the guise of a social other and invites shareability from the beginning. Here began the static dimension of language.

Most of the static dimension, however, is not biological but socio-cultural and historical, shaped over time. To have forms that are repeatable, standardized, and non-context-bound makes them durable and portable from encounter to encounter, where they can be reshaped by intragroup and intergroup encounters, including migrations where newcomers encounter existing populations (there may be spontaneous “mutations” beyond encounters as well).

This in itself would have given syntactic innovation adaptive value and replaced temporal order syntax with morph elaborations, releasing static structure meanings from temporal sequence. The primordial syntax according to Mead’s Loop was mapping meanings onto temporal sequences. The orchestration of actions under some significance with shareability allots meaning fragments to ordered segments of time.  The response to encounters is to shake up this temporal syntax. Given gesture–speech unity, gestures change as well (the examples in the fourth part of this series illustrate this).

The cumulative effect would have been to liberate temporal sequence for other expressive functions, some of which may also take part in imagery–language dialectics on multiple levels. Edward Sapir long ago divided the world’s languages according to how they combine meanings into single words – analytic or isolating, relying on temporal sequence (e.g., Chinese), synthetic, with some liberation (e.g., Latin), or polysynthetic, with much freedom (e.g., Inuit), which reflect degrees of adornment of the basic brain orchestration plan (and English, with its relatively fixed word orders, is one of the less adorned).

Rethinking language as action control. Speech according to Mead’s Loop, among other things, is thus a culturally mandated action, orchestrated by imagery. Action is a target of natural selection in any case, and in the selection scenario where Mead’s Loop had adaptability, adults inculcating cultural norms in infants, the overt actions of the adults fed natural selection. Linguistic standards are not only about “good forms” but also about “good actions.”  The discovery of the FOXP2 gene points to the centrality of action control at the foundation of language. The mutation in the KE family that led to its discovery affects fine motor control, speech articulation and other actions, as well as syntax. As a gene affecting fine-tuned action control, it would influence the raw material on which Mead’s Loop and its new form of action worked (the Mead’s Loop innovation itself would be something else genetically). The gene (actually, a transcription factor, a genetic “on-off” switch), which differs in the human version compared to that in chimps, has undergone accelerated evolution and when implanted into engineered mice changes vocalization. Taking this lead, we can consider syntax as a form of culturally authorized action control of the vocal organs, the hands and other body parts.

Brain model. The language centers of the brain have classically been regarded as just two, Wernicke’s and Broca’s areas. But if we are on the right track with Mead’s Loop, many other areas of the brain are involved and are equally “language areas.” Broca’s Area itself is not a “language area” but a region for complex action orchestration under various significances. Typical item-recognition, memory and production tests would not tap these other brain regions, but discourse, conversation, play, work, and the exigencies of language in daily life (where language originated) would.  Broca’s area may be the convergence point of Mead’s Loop and the imagery–language dialectic, including unpacking, but other areas – the left rear hemisphere (categorial content in GPs), the right hemisphere (imagery and metaphor), and the prefrontal cortex (the alternatives a GP differentiates) – can equally be called the “language areas” of the brain. Thought-language-hand links tie them together when the dynamic dimension of language is engaged.

Selection scenario. The family, particularly in its child-rearing aspects, is an environment where the social/public value of one’s own gestures is adaptive, and where Mead’s Loop could have been naturally selected (no doubt Mead’s Loop was adaptive in other contexts as well). Archeologists date the dawn of family life (with cooking hearths) to about one million years ago, implying a stable family membership and a division of labor. So it was possibly back then that the natural selection of Mead’s Loop also began.

The focus of this selection pressure was adults, women in particular. In this scenario language began in adults, in the form of mothers instructing infants. Their infants, both female and male, would benefit from superior cultural inculcation, and so become more able to carry on any genetic disposition for Mead’s Loop themselves.

Did Neanderthals speak?  The Neanderthal genome project has shown that this extinct form of human also had FOXP2, and also may have been capable of fine motor control. Whether this control covered the vocal tract is unknown but speech seems not impossible.  Some have suggested that the Neanderthal brain, although large, had a different developmental time course from that of human children (much briefer) and did not sustain robust activity of the prefrontal cortex.  A short ontogenesis meant less time for any GP-like development. The prefrontal cortex, among its other functions, arranges and selects alternatives. The formation of the contexts that GPs differentiate is a place in language where this ability is tapped.  Weakened contexts would have yielded cognitive inflexibility and gesture–speech redundancy rather than unity. Any GP-like dynamics is thus also likely to have been muted.

Even if Neanderthals could speak, their speech is likely to have been temporally sequenced, and limited to what Derek Bickerton posited as proto-language and what Martin Braine called pivot grammars, each pivot a separate “grammar” unto itself. A collection of disparate pivot grammars, lacking an overall system, may have been their highest achievement. Possible gestures would be gesture-first-like pantomimes and pointing (available to today’s sub-two-year-olds).  Kindly opinion is that our ancestors had nothing directly to do with the Neanderthal extinction but we may have out-competed them. A cultural superiority over cognitive inflexibility and a limited, single semiotic (a profile not unlike Down syndrome) could have been fatal, if unintended.

Further Reading

Adult–Infant inculcation

Hrdy, Sarah Blaffer. 2009. Mothers and others: The evolutionary origins of mutual understanding.  Harvard.

Tomasello, Michael. 1999. The Cultural Origins of Human Cognition. Harvard.

Brain model

McNeill, David, & Pedelty, Laura. 1995. Right brain and gesture.  In K. Emmorey & J. Reilly (eds.), Sign, Gesture, and Space, pp. 63-85.  Erlbaum.

Nishitani, Nobuyuki, Schürmann, Martin, Amunts, Katrin and Hari, Riitta. 2005. ‘Broca’s region: from action to language.’ Physiology 20: 60-69.

Where language began

Atkinson, Quentin D. 2011. ‘Phonemic diversity supports a serial founder effect model of language expansion from Africa.’ Science 332: 346-349.

Mead’s Loop “twist”

Cohen, Akiba A. 1977. ‘The communicative function of hand illustrators.’ Journal of Communication 27: 54-63.

McNeill, David, Duncan, Susan D., Cole, Jonathan, Gallagher, Shaun and Bertenthal, Bennett. 2008. ‘Growth points from the very beginning.’ Interaction Studies (special issue on proto-language, D. Bickerton and M. Arbib, eds.) 9: 117-132.

Mead, George Herbert. 1974. Mind, self, and society from the standpoint of a social behaviorist (C. W. Morris ed. and introduction).  Chicago.

Merleau-Ponty, Maurice. 1962.  Phenomenology of Perception (C. Smith, trans.). Routledge.

Neanderthals

Bickerton, Derek. 1990. Language and Species. Chicago.

Braine, Martin D. S. 1963. ‘The ontogeny of English phrase structure: the first phase.’ Language 39: s1-13.

Pääbo, S. and colleagues. 2009. News focus in Science 323: 866-871.

Rozzi, Fernando V. Ramirez and de Castro, José Maria Bermudez. 2004. ‘Surprisingly rapid growth in Neanderthals.’ Nature 428: 936-939.

Wynn, Thomas & Coolidge, Frederick. 2011. How to Think Like a Neandertal. Oxford.

Speech as action control

MacAndrew, Alec. ‘FOXP2 and the evolution of language.’ http://www.evolutionpages.com/FOXP2_language.htm

MacNeilage, Peter F. 2008. The Origin of Speech. Oxford.

“Straight” mirror neurons

Rizzolatti, Giacomo and Arbib, Michael. 1998.  ‘Language within our grasp.’  Trends in Neurosciences 21: 188-194.

Wikipedia article on the Mirror Neuron.

Syntax and shareability

Freyd, Jennifer J.  1983.  ‘Shareability:  The social psychology of epistemology.’  Cognitive Science 7: 191-210.

Lenneberg, Eric. 1967. Biological Foundations of Language. Wiley.

McNeill, David and Sowa, Claudia. 2011.  ‘Birth of a morph.’ In G. Stam and M. Ishino (eds.), Integrating Gestures: The Interdisciplinary Nature of Gesture, pp. 27-47. Benjamins.

Sapir, Edward 1921. Language: An Introduction to the Study of Speech. Harcourt, Brace & World.

Thomason, Sarah. 2011. ‘Does language contact simplify grammars? (No).’ Talk given at the University of Chicago, April 12.

 

David McNeill is a professor in the Departments of Linguistics and Psychology at the University of Chicago.

His new title How Language Began: Gesture and Speech in Human Evolution is now available from Cambridge University Press at £19.99/$36.99

 

The origin of language in gesture–speech unity

Part 2: Gesture-first

By Professor David McNeill

This popular hypothesis says that the first steps of language phylogenetically were not speech, nor speech with gesture, but were gestures alone.  In some versions, it was a sign language. In any case, it was a language of recurring gesture forms in place of spoken forms. Vocalizations in non-human primates, the presumed precursors of speech without gesture’s assistance, are too restricted in their functions to offer a plausible platform for language, but primate gestures appear to offer the desired flexibility. Thus, the argument goes, gesture could have been the linguistic launching pad (speech evolving later). The gestures in this theory are regarded as the mimicry of real actions, a kind of pantomime, hence the appeal of mirror neurons as the mechanism. To quote Rizzolatti and Arbib (1998), in their exposition of gesture-first, mirror neurons are “neurons that discharge not only when the monkey grasped or manipulated the objects, but also when the monkey observed the experimenter making a similar gesture” (p. 188).  Current chimps show this kind of action mimicry (see illustration later in this post).

Did gesture scaffold speech, then speech supplant it?  Even if mirror neurons were a factor in the origin of language, our basic claim is that a primitive phase in which communication was by gesture or sign alone, if it existed, could not have evolved into the kind of speech–gesture combinations that we observe in ourselves today. We see two problems. First, gesture-first must claim that speech, when it emerged, supplanted gesture; second, the gestures would be pantomimes, that is, gestures that simulate actions and events. However, such gestures do not combine with co-expressive speech but rather fall into other slots on the continuum of gestures: supplements to speech (rather than co-expressive with it) and pantomime.

Looking over a roster of gesture-first advocates, including several writing before the mirror neuron discovery, all say at some point that speech supplants the original gesture language, which then is marginalized. For example, Henry Sweet (said to be Shaw’s model in Pygmalion for Henry Higgins) wrote, “…gesture which later would be dropped as superfluous” (pp. 3-4).  More recently, Rizzolatti and Arbib said,  “… gesture became purely an accessory factor to sound communication.”  In all cases, as in these quotes and many others, gesture withers to the status of an “add-on.” 

This is the first wrong assertion. Gesture-first commits one to the false prediction that speech replaced gesture rather than, as we see in ourselves, speech and gesture united as one “thing.”  We say that gesture-first incorrectly predicts that speech would have supplanted gesture, and fails to predict that speech and gesture became a single system. It thus is falsified – twice in fact. The contradiction of gesture-first is that speech supplants gesture, it says, yet ends up integrated with it. The logic of gesture-first, at its very core, means that supplantation, overt or hidden, is inescapable. This is why every advocate naturally posits it.

Empirically, there is a perfect correlation between advocating gesture-first and positing the supplantation step. Moreover, there is a conceptual point that explains it. It is important to see that gesture-first is a theory about the origin of speech (not gesture). Given that aim, it must logically consider that from gesture one gets to speech; and here supplantation enters: it is unavoidable. Even Sweet, who envisions a transition from hand gestures to tongue gestures, and with them to speech, wants to leave hand gestures out at the end as “superfluous.” He has no way to say from his several transitions that gestures in the end are other than left-overs.

When it emerged, why did speech not gradually integrate with gesture? This is possibly what “scaffolding” intends in part. But even if scaffolding took place it could only have been a temporary arrangement. For speech to become an autonomous system, sooner or later gesture and speech must have separated. The reason again lies in the gesture-first tenet. The whole logic of gesture-first is to picture one code coming after another. The models of supplantation immediately below show the effects. The most that can happen is that the codes divide the labor of communication, as will be seen with the second model. Even if speech integrates with a gesture-language (as a kind of vocal gesture) it must sooner or later become an encoded system of its own, and the would-be integration is lost. The first of the models shows this happening – two codes, one for gesture and one for speech refusing to synchronize. It does not help to point to gestures in non-linguistic primates. There is nothing in them to show how they could lead to language without encountering the same roadblock of supplantation.

Models of supplanting and scaffolding.  To see what may happen when two codes co-occur, as they would at the hypothetical gesture-first/speech supplantation crossover, we have two models: Aboriginal signs performed with speech, and hearing bilingual ASL signs with spoken English. In neither case is there the formation of packages of semiotic opposites, as the example in post 1 illustrated and the growth point explains. When a pairing of semantically equivalent gesture and speech is examined in these models, the two actively avoid speech–gesture combinations at co-expressive points. They repel each other in time or functionality or both, and do not coincide at points of co-expressivity.

1. Warlpiri sign language. Women use the Warlpiri sign language of Aboriginal Australia when they are under (apparently quite frequent) speech bans and also, casually, when speech is not prohibited. When the latter happens, signs and speech co-occur, which lets us see what may have occurred at the hypothetical gesture or sign-speech crossover. Here is one example from Kendon:

 

The spacing is meant to show relative durations, not that signs and speech were performed with temporal gaps (both were performed continuously). Speech and sign start out together at the beginning of each phrase but, since signing is slower, they immediately fall out of step. Each is on a track of its own and they do not unify. Speech does not slow down to keep pace with gesture, as would be expected if speech and gesture were unified (mutual speech–gesture slowing is shown by the deafferented patient, IW, “the man who lost his body,” described in post 3). They then reset (there is one reset in the example) and immediately separate again. So, according to this model, co-expressive speech–gesture synchrony would be systematically interrupted at the crossover point of gesture and speech codes. Yet synchrony of co-expressive speech and gesture is what evolved.

2. English-ASL bilinguals. The second model is Emmorey et al.’s observation of the pairings of signs and speech by hearing ASL/English bilinguals. While 94% of such pairings are signs and words translating each other, 6% are not mutual translations. In the latter, sign and speech collaborate to form sentences, half in speech, half in sign. For example, a bilingual says, “all of a sudden [LOOKS-AT-ME]” (from a Sylvester and Tweety narration; capitals signify signs simultaneous with speech). This could be “scaffolding” but it does not create the combinations of unlike semiotic modes at co-expressive points that we are looking for. First, signs and words are of the same semiotic type – segmented, analytic, repeatable, listable, and so on. Second, there is no global-synthetic component, no built-in merging of analytic/combinatoric forms with gesture’s global synthesis, and the spoken and gestured elements are not co-expressive but are the different constituents of a sentence. Of course, ASL/English bilinguals have the ability to form GP-style cognitive units. But if we imagine a transitional species evolving this ability, the bilingual ASL-spoken English model suggests that scaffolding did not lead to GP-style cognition; on the contrary, it implies two analytic/combinatoric codes dividing the work. If we surmise that an old pantomime/sign system did scaffold speech and then withered away, this leaves us unable to explain how gesticulation emerged and became engaged with speech. We conclude that scaffolding, even if it occurred, would not have led to current-day speech-gesticulation linkages.

Corballis, in his 2002 argument for speech supplanting a gesture-first system of communication, points out the advantages of speech over gesture. There is the ability to communicate while manipulating objects and to communicate in the dark. Less obviously, speech reduces demands on attention since interlocutors do not have to look at one another (p. 191). While valid, these qualities are not necessary. There are also positive reasons for gestures not being language-like, and they would be so even if gesture and speech co-evolved as a single adaptation. All across the world, languages are spoken/auditory unless there is some interference to the channel (deafness, acoustic incompatibility, religious practice, etc.), and no culture has a visual/gestural primary language. Susan Goldin-Meadow, Jenny Singleton and I once proposed that gesture is the non-linguistic side of the speech–gesture dual semiotic because it is better than speech for imagery: gesture has multiple dimensions on which to vary, while speech has only the one dimension of time.  Given this asymmetry, even if speech and gesture were jointly selected, as proposed in this series, it would work out that speech is the medium of linguistic segmentation.

Problems with pantomime. The second problem is that the gestures of gesture-first would be pantomimes. Gesture-first claims the initial communicative actions were symbolic replications of actions of self, others and entities, and these pantomimes later scaffolded speech. This process appeals because it so clearly taps the mirror neuron response. Merlin Donald likewise posited mimesis as an early stage in the evolution of human intelligence. It is conceivable that pantomime is something that an apelike brain is capable of and was already in place in the last common chimp–human ancestor, some 8 million years back. Contemporary bonobos are capable of it, supporting this idea:

[Illustration: Bonobo gestures]

The problem is not a lack of pantomime precursors but that pantomime repels speech. The distinguishing mark of pantomime compared to gesticulation is that the latter is integrated with speech; it is an aspect of speaking. In pantomime this does not occur. There is no co-construction with speech and no co-expressiveness; timing is different (if there is speech at all), and there are no dual semiotic modes. Pantomime, if it relates to speaking at all, does so, as Susan Duncan points out, as a “gap filler” – appearing where speech does not, for example completing a sentence (“the parents were OK but the kids were [pantomime of knocking things over]”). Movement by itself offers no clue to whether a gesture is “gesticulation” or “pantomime”; what matters is whether or not two modes of semiosis combine to co-express one idea unit simultaneously. Pantomime does not have this dual semiosis.

Last word on gesture-first.  Whether you are persuaded by these arguments depends, ultimately, on taking seriously gesture–speech unity, that gesture and speech comprise a single multimodal system, and that gesture is not an accompaniment, ornament, supplement or “add-on” to speech but is actually part of it. Gesture-first does not predict this language–gesture integration. When we look at models of speech–gesture crossovers of the kind that, in theory, gesture-first would have encountered when speech supplanted an original gesture language, we do not find conditions for gesture–speech unity, but instead non-co-expressiveness or mutual speech–gesture exclusion.

Joining the damage is Woll’s (2005/2006) argument that not only does gesture-first leave gestures unable to integrate with speech but it also blocks, within speech itself, the arbitrary pairing of signifiers with signifieds that is characteristic of (or, Saussure says, defining of) a linguistic code.

Michael Arbib, in his gesture-first theory, envisions an “‘expanding spiral’ of increasingly sophisticated protosign and protospeech,” a spiral moving from gesture-first to speech.  A spiral pictures gradual changes from gesture (or protosign) to speech (or protospeech). This appears not to be the “crossover” modeled above, but the models still apply. Pantomime and signs push synchrony and co-expressiveness with speech away, and do not break out of this self-defeating pattern (despite the spiral’s openness, as Arbib also argues, to sign and speech shaping each other). Nothing in the spiral forms co-expressiveness and gesture–speech unity. With each turn gesture spins off (“scaffolds”) a bit more of itself into speech; but then speech, far from shaping gesture or being shaped by it, repels it and/or divides the labor between itself and its former gesture master.

Michael Corballis likewise continues to advocate gesture-first in a new work, which takes as its central theme a posited linguistic universal, recursion.  However, recursion is equally beyond gesture-first. This is because recursion enters into gesture–speech unities. It co-expressively appears in both gesture and speech simultaneously. In one example, a speaker outlined what she took to be an ambiguity in the bowling ball episode of the cartoon stimulus described in post 1.  She first states the perceived ambiguity (“you can’t tell if the bowling ball”) and then, recursively, states the alternatives (“is under Sylvester or inside of him”); concurrently and co-expressively, she moves her left hand to a certain space for the ambiguity itself, and then opposes two spaces within it for the two poles of the ambiguity (two further gestures in the “ambiguity” space  – first the hand moves forward with “is under”, then inward with “or inside of him”); so there is recursion on both sides of the dialectic. The recursions, spoken and gestured, partake of the usual semiotic oppositions: while speech is codified, comprised of recurrent elements with constraints of meaning and form, gesture is global and synthetic and the meaning of the whole (ambiguity) determines the meanings of the parts (the “under” pole, in particular, being anti-iconic for the meaning of being under something).  None of this can gesture-first explain.

I do not deny that gesture-first may once have existed, and in fact I assume that it did exist once. But if it did it could not have led to human language.  It would have created pantomime, a type of gesture that does not unify with speech. Gesture-first was either extinguished or shunted off into a dead end.  I propose in a later post that it was a dead end seen now only in children’s earliest language.

The upshot is that gesture-first has little light to shed on the origin of language, as we know it; at best it explains the evolution of pantomime as a stage of phylogenesis that, if it once occurred, went extinct as a code and landed at a different point on the continuum of gestures.

 

Further Reading

Gesture-first:

Arbib, M. A. 2005. ‘From monkey-like action recognition to human language:  An evolutionary framework for neurolinguistics.’  Behavioral and Brain Sciences, 28: 105-124.

Armstrong, David F. and Wilcox, Sherman E. The Gestural Origins of Language. Oxford.

Armstrong, David F., Stokoe, William C. and Wilcox, Sherman E. 1995. Gesture and the Nature of Language. Cambridge.

Corballis, Michael C. 2002. From hand to mouth: the origins of language. Harvard.

Corballis, Michael C. 2011.  The Recursive Mind: The Origin of Human Language, Thought, and Civilization. Princeton.

Donald, Merlin. 1991. Origins of the Modern Mind: Three Stages in the Evolution of Culture and Cognition.  Harvard.

Henderson, E. (ed.). 1971. The Indispensable Foundation: a selection from the writings of Henry Sweet. Oxford.

Hewes, Gordon W. 1973.  ‘Primate communication and the gestural origins of language.’  Current Anthropology 14:5-24.

Rizzolatti, Giacomo and Arbib, Michael. 1998. ‘Language within our grasp.’ Trends in Neurosciences 21: 188-194.

Critiques:

McNeill, David, Duncan, Susan D., Cole, Jonathan, Gallagher, Shaun & Bertenthal, Bennett. 2008.  ‘Growth points from the very beginning.’  Interaction Studies 9: 117-132.

Goldin-Meadow, Susan, McNeill, David, and Singleton, Jenny. 1996. ‘Silence is liberating: Removing the handcuffs on grammatical expression in the manual modality.’ Psychological Review 103: 34-55.

Woll, Bencie. 2005/2006. ‘Do mouths sign? Do hands speak?’ in Botha, Rudie & de Swart, Henriette (eds.), Restricted Linguistic Systems as Windows on Language Evolution. Utrecht: LOT (Netherlands Graduate School of Linguistics Occasional Series, Utrecht University). http://lotos.library.uu.nl/publish/articles/000287/bookpart.pdf (accessed 05/02/11).

Sign languages with speech:

Emmorey, Karen, Borinstein, Helsa B. and Thompson, Robin. 2005. ‘Bimodal bilingualism: Code-blending between spoken English and American Sign Language’, in Cohen, Rolstad and MacSwan (eds.) Proceedings of the 4th International Symposium on Bilingualism, pp. 663-673.  Somerville, MA: Cascadilla Press.

Kendon, Adam. 1988. Sign languages of aboriginal Australia: cultural, semiotic and communicative perspectives. Cambridge.

Gestures of Apes:

Call, Josep and Tomasello, Michael (eds.). 2007. The gestural communication of apes and monkeys. Erlbaum.

 

 

 


A Layman’s Guide to “Roots of English”

by Professor Sali A. Tagliamonte
 

Have you ever wondered about the weird ways of speaking of someone you know? In 1995, I moved to England from Canada, taking up a position at the University of York in Yorkshire. My colleagues came from all over Britain, the south, the north, Scotland and Northern Ireland as well as other parts of Europe. The topic of dialect differences was in the air all the time as we compared our varieties of English. Surprisingly, despite the obvious phonological differences in my speech compared to all my colleagues, there were unexpected correspondences between myself and my Scots, Northern Irish and Northern English colleagues. In some cases, we had the same vowel merger or we had the same lexical item or some odd bit of syntax was similar or we used the same form of one adverb or another. The correspondences came from all levels of grammar and sometimes in unexpected ways. It was curious to me that there were so many similarities and I wondered, why? I discovered that the northern regions of Britain were among the most prominent dialect regions from which people migrated to other parts of the world in the late 18th century, particularly to my own country of origin, Canada. Could it be that the roots of my way of speaking could be tracked back to these founding dialects? In 1999, I embarked upon a research project to study the varieties of English in these dialect regions.

 

Linguistic Woolly Mammoths. The research traditions of dialectology, historical linguistics and sociolinguistics have demonstrated that researchers can gain access to earlier points in time. In the absence of a time machine, how is this possible? Consider a woolly mammoth frozen in a glacier. We can gain remarkable insight into past time by studying its characteristics. Linguists employ a similar method.

Places that are geographically remote, socially isolated or set apart from the rest are slow to adopt new changes, or are missed by them entirely. Such areas tend to preserve older features. In this way remote, inaccessible, or otherwise isolated locations provide prime evidence about an earlier stage (or ancestor) of a language and play a key role in reconstructing earlier stages of a language’s development. There is perhaps no place more akin to these descriptions than the British and Northern Irish north country.

 

Dialects galore!  What I refer to as the Roots Archive is a rich compendium of oral histories from dozens of elderly people that I collected between 2001 and 2003. The materials contain rich language data with a wealth of rarely heard features of the English language. There are innumerable dialect words and expressions, e.g. fuzzok, peery, thrang. There are unusual sounds, e.g. och, aye. There are unexpected twists in the arrangement of sentences and in the way sentences begin and end, e.g. and that, you know. There are unusual conversational rituals. There are many things that are unusual and exotic; there are some things that are entirely unknown and yet others are hauntingly familiar. In many cases, features long gone from mainstream varieties of English endure.  In order to give readers a profound sense of the dialects, I have sprinkled the chapters with quips, stories and interchanges from the conversations, e.g. weans and it’s a good job, as in:

 

Weans

Aye, they just come on the phone- “Morag could you come out the night there’s somebody, ken. Such and such a body can nae manage yin”. “Aye, Aye, I’ll just come out aye”. She’s just leaving the dogs. Says I, it’s a good job it’s no weans you’ve got for you would nae- could nae go!

 

These quotes expose innumerable dialect features. I have made note of some of them in footnotes so that readers can try to spot the features themselves and then verify whether they have found them all. Here is the footnote to the ‘weans’ quote.

 

Note the use of aye as a discourse marker; ken as a discourse particle; somebody rather than someone followed by use of a body in the generic; yin for ‘one’; inverted says I; the expression it’s a good job; the syntactic structure it’s no weans you’ve got ‘you’ve got no children’; use of can nae, would nae, could nae for ‘can’t’, ‘wouldn’t’, ‘couldn’t’.

 

Many of the features I discuss in the book are well known across English vernaculars, including regularized pasts, e.g. knowed, come, past tense seen and done among others. Others are typical of the northern UK dialects and often reported in compendia of varieties of English. However, a few have rarely been reported.

 

Linguistic detectives. Each chapter of Roots of English offers readers a “Dialect Puzzle” so that they can get a taste of what it is like to be a sociolinguist.

 

Dialects are the storehouse of the heart and soul of culture, history and identity. For analysts of language, dialects are a tremendous resource for understanding the grammatical mechanisms of linguistic change. Delving deep into the nuts and bolts of language, deeper than words and phrases and expressions, down into the grammar, we discover a treasure trove. Beneath the anecdotes and nonce tales are hidden patterns and constraints that are a system unto themselves reflecting the legacy of regional factions, social groups and human relationships. As language evolves through history its inner mechanisms are evolving incrementally, but not in the same way in every place nor at the same rate in all circumstances. One of my goals is to leave the reader with new ideas about the roots of his or her own dialect and how its particular socio-geographic co-ordinates might offer a ‘goldmine’ for ongoing study.

 

Sali A. Tagliamonte is a professor in the Department of Linguistics at the University of Toronto.

Her new title, Roots of English, is now available from Cambridge University Press at £19.99/$34.99

The origin of language in gesture–speech unity

Part 1: Language and Imagery

By Professor David McNeill

Why do we gesture? Many would say that it brings emphasis, energy, and ornamentation to speech (which is assumed to be the core of what is taking place); in short, as Adam Kendon says, also arguing against the view, gesture is an “add-on.”  However, the evidence is against this.  The reasons we gesture are more profound. Language is inseparable from imagery. The natural form of imagery with language is gesture, with the hands especially.  While gestures can enhance communication, the core is gesture and speech together. They are bound more tightly than saying the gesture is an “add-on” or “ornament” implies. Even if for some reason a gesture is not made (social inappropriateness, physical difficulty, etc.), its imagery is still present, hidden but part of the speech process (it may surface in some other part of the body, the feet for example).

To answer the question of why we gesture: it is because gesture was built in from the start. Language could not have evolved without it.  If a theory of language origin is to predict the nature of language, it must among other things predict this gesture-speech unity.  But if a theory says that gesture-speech unity did not evolve, and/or predicts that something incompatible with it did evolve, the theory cannot be correct.  A widespread theory that I will call “gesture-first” fails this test. In this first post I explain gesture–speech unity. In later posts I apply the test and propose a new theory, called “Mead’s Loop,” that meets it.

The smallest unit of gesture–speech unity is called a growth point, or GP.  Growth points are inferred from the totality of communicative events, with special focus on speech–gesture synchrony and co-expressivity. They are called growth points because they are meant to be the initial pulses of thinking-for-and-while-speaking, a dialectic (or “multilectic”) from which a dynamic process of organization emerges.  The result is what Wundt described as the two modes of consciousness in speech:

 “From a psychological point of view, the sentence is both a simultaneous and a sequential structure.  It is simultaneous because at each moment it is present in consciousness as a totality even though the individual subordinate elements may occasionally disappear from it.  It is sequential because the configuration changes from moment to moment in its cognitive condition as individual constituents move into the focus of attention and out again one after another.” (Blumenthal 1970 translation of Wundt 1900, p. 21).

In a GP dialectic Wundt’s two modes are a natural outcome. The “simultaneous” is consciousness of the GP and dialectic itself; the “sequential” is awareness of what I call unpacking. The model is that a GP differentiates what for the speaker is the point of newsworthiness in the immediate context of speaking. This differentiation, partly linguistic form, partly imagery, is then “unpacked” into a construction, both rendering it communicable as a social effort and putting a stop-order to the dialectic.  (It’s the nature of this post that I must introduce a number of new terms, several of them binary opposites; I’ve placed a diagram at the end that lists them and shows how they relate.)

The figure below is an example of the kind of gesture we focus on (sometimes called “gesticulation”), illustrating the two simultaneous modes (the gestures were spontaneous occurrences, recorded during an experiment in which speakers were retelling an animated Tweety and Sylvester cartoon they had just watched; in this episode, Sylvester (an ever-seeking cat) attempts to reach Tweety (a pugnacious canary), who is perched on a windowsill several stories above the street, in a stealth approach by climbing a drainpipe on the inside). The gestures are iconic signs but the iconicity is semantic, not photo-like. They are images of concepts of the events in the story:

  • Pointing upward, she says, “he tries to go up inside,” localizing the character in gesture space.
  • Then, making a spiraling upward movement, she says, “barreling up through it,” depicting the character’s presumed spinning inside the pipe – an inference (not shown in the film).
  • Her left hand is shaped as a cup and embodies her concept of the inside of the drainpipe, timed exactly to go with the “barreling” part of her description.

[Figure: Photos of a man gesturing]

It is important to note that gesture and speech cover the same idea units. It is not that gesture holds hidden messages. Statements that such and such a percentage of meaning is “non-verbal” fly against the reality of gesture-speech unity. In nearly every case, speech and gesture convey the same meaning, but they do it in opposite ways. We see in the gesture visuospatial thinking – not only about space as such but about the same content also expressed verbally. Note the use of the left hand for inside the pipe. It is her concept of interiority, not a depiction of the actual pipe, which enclosed Sylvester and was vertical, not horizontal. Both the verbal and imagistic modes capture interiority but in opposite ways.  Unlike the sentence, the ‘parts’ of the gesture (the shape, the direction, the motion, etc.) do not have their own meanings; they are meaningful only in the context of the gesture as a whole.  This is called the global property: the meanings of the parts depend on the meaning of the whole. It is the opposite in speech.  There the parts (words) have their own meanings and build up the meaning of the whole through combination.  This is called the syntagmatic property.

So in gesture-speech unity different modes of semiosis (“semiosis” and “semiotic” refer to the nature of symbols) are presenting the same meanings at the same time – global whole-to-part in gesture, syntagmatic part-to-whole in speech, and they are synchronous. Throughness is visualized as a hollow space – not an iconic replica of the pipe but the concept realized imagistically with its own location. It goes not with ‘inside’ but with its conceptually parallel ‘barreling up through.’ When gesture and speech synchronize (as they do in the vast majority of utterances), one idea – here, Sylvester’s ascent via the pipe – is simultaneously in two semiotic modes, imagery and language. The result is an idea unit in which imagery and words combine, and this is an inherently dynamic situation. Such a system of language, an imagery-language dialectic, would be explained if we found, independently, that language began in a gesture-speech unity. We shall later see how this may have happened (post 3).

Seeing the gesture and the co-expressive speech it synchronizes with, we witness a moment of an ongoing imagery–language dialectic. Gesture-speech unity is the nexus at which imagery and the codified forms of language intersect – two dimensions of language with equal weight. The picture is not unlike Humboldt’s distinction of Ergon and Energeia (language viewed as structure and language as an “embodied moment of meaning located both in the organism and in the medium that the organism uses for expression”). The latter is language at the moment of its use, “alive, in an actor” (Joseph Glick describing seminars by Heinz Werner; from Elena Levy).

The larger picture – 1. Historically, the dynamic and static have been approached separately – each with its own traditions, methodologies, sciences, and institutional practices (and prejudices). Each tradition describes something of substance:

Static = language is a thing, not a process. This is the Saussurian tradition and it bears on Wundt’s sequential mode. The academic field of linguistics has specialized in the static dimension.

Dynamic = language is a process, not a thing. This is the Vygotsky tradition and it bears on Wundt’s simultaneous mode.  The budding field of gesture studies focuses on this dimension.

However, we must combine them. The dynamic does not replace the static. Gesture gives us access to the dynamic mode. Linguistic form gives the static (no particular synchronic description is favored: we go with whatever best fits the dynamic picture we are trying to paint). The important point is that both modes are present.

The larger picture – 2. An imagery–language dialectic implies:

  • A conflict or opposition of some kind, in our case between the two semiotic modes, a dual semiosis.
  • Resolution of the conflict through change, its unpacking.

A dialectic is inherently dynamic and a good model of the psycholinguistics of speaking.

A dialectic presupposes Vygotsky’s concept of a unit as the smallest component that retains the quality of a whole.  This whole is the imagery–language dialectic.  A GP of unified gesture and speech is the smallest unit in which an imagery-language dialectic takes place. Further reduction to a gesture and a linguistic segment separately destroys the unit itself, leaving only a gesture or linguistic segment but not a dynamic process.

A quick list of GP’s properties:

  • It is proposed as the minimal unit of the imagery-language dialectic. 
  • It is a dialectic package that has both linguistic categorial and imagistic components. 
  • Growth points are inferred from the totality of communicative events with special focus on speech-gesture synchrony and co-expressivity.
  • By focusing on these properties we bring out the modes of cognition envisioned by Wundt.

All of this is why we gesture.  Gesture is an integral part of speaking. And language could not have begun without it.  The next post in this series will take up the evolutionary precursors of this dual semiotic system of gesture-speech unity.

The many binaries. I have made the following diagram to sort out the several distinct but related binary oppositions, plus a few other critical terms in this posting.  The numbers are the order in which the first mention of the term occurred:

Further Reading

Kendon, Adam. 2008. ‘Some reflections on the relationship between “gesture” and “sign.”’ Gesture 8:348-366.

McNeill, D. and Duncan, S. D. 2000.  Growth points in thinking for speaking.  In D. McNeill (ed.), Gesture and Language, pp. 141-161. Cambridge University Press.

Saussure, Ferdinand de. 1959. Course in General Linguistics (Charles Bally and Albert Sechehaye, eds., Wade Baskin, trans.).  New York: The Philosophical Library.

Vygotsky, Lev S. 1987. Thought and Language. Edited and translated by E. Hanfmann and G. Vakar (revised and edited by A. Kozulin). MIT Press.

Wundt, Wilhelm. 1970.  ‘The psychology of the sentence.’ In A. Blumenthal (ed. and trans.). Language and Psychology: Historical Aspects of Psycholinguistics, pp. 20-33. Wiley.


David McNeill is a professor in the Departments of Linguistics and Psychology at the University of Chicago. 

His new title How Language Began: Gesture and Speech in Human Evolution is now available from Cambridge University Press at £19.99/$36.99

Christopher Brumfit Award prize winners announced

The Editor and Board of Language Teaching are pleased to announce that there were two tied winners of the 2011 Christopher Brumfit thesis award: Dr Cecilia Guanfang Zhao and Dr Catherine van Beuningen. Both theses were selected by an external panel of judges based on their significance to the field of second language acquisition and second or foreign language learning and teaching, their originality and creativity, and their quality of presentation. This year’s runner-up was Dr Rebecca Sachs, whose work was singled out for praise as ‘an exceptional thesis, which clearly involved an immense amount of work in its conceptualization, implementation and analysis.’

The annual Christopher Brumfit Thesis Award commemorates the work of one of the world’s most renowned applied linguists and one of Language Teaching’s most active contributors. Since 2008, the award has recognized doctoral thesis research that makes a significant and original contribution to the field of Second Language Acquisition and/or foreign/second language teaching and learning. Previous winners of the prize include Irina Elgort, Andrea Borbély Hellman, Okim Kang and Susan Mary Macqueen.

Applications for the award are open from November each year, and examiners pay particular attention in any submission to a number of categories including significance to the field, originality and creativity, presentation, use of the background literature, methods of enquiry, analysis of data and the discussion of the outcomes.

In subsequent posts we will be profiling the work of all three prize winners and providing an insight into what made them prize-winning pieces of work.

Friday 27th July - Dr Cecilia Guanfang Zhao ‘The Role of Voice in High-Stakes Second Language Writing Assessment’

Friday 3rd August - Dr Catherine van Beuningen ‘The effectiveness of comprehensive corrective feedback in second language writing’

Friday 10th August - Dr Rebecca Sachs ‘Individual differences and the effectiveness of visual feedback on reflexive binding in L2 Japanese’

The Semantics of Colour: a Historical Approach

a blog by Dr C. P. Biggam

For several decades now, anthropological linguists have probed and investigated the various ways in which humans describe colour. This is not simply a matter of translation. We can translate English green into German (grün), into French (vert), into Spanish (verde) and so on, but do they all mean the same? Do they all include greenish yellows, for example? How much of turquoise do they include, if any? Do they all have metaphorical overtones of immaturity and inexperience? We may find that these four words have only minor differences in meaning, since they all belong to the Indo-European language family, but would their rough equivalents in Asian or African languages be so close?

The colour concepts common to members of a particular society, and the words they use to ‘label’ them can be surprising and even bizarre to speakers of an unrelated language. English speakers think of colour as the surface appearance of an object, such as blue, red, green, burgundy, taupe and many others, but, technically speaking, these are hues. Hue is just one element in colour studies, and some of the others are brightness, dullness, vividness, paleness, darkness, surface texture, moisture, dryness, size and shape. Moisture? Shape? What have they got to do with colour?

Some societies use words which combine two or more elements, such as a hue and dryness, to describe the surface appearance of objects. This is not the same as me describing dry cereal grains as ‘yellow’, because I can also describe a moist fruit as ‘yellow’. Some societies, however, can only use certain words for particular combinations, for example, for yellowness plus dryness. They may or may not have another yellow word for moist things. Consider Victoria Bricker’s findings in the Yucatec language of Mexico that (to take one example) colours could not be described as glossy or gleaming if they were associated with small, rounded objects. To uncover the colour systems of other societies, the anthropologist has to keep a (very) open mind.

Anthropologists can listen to native speakers, and record how they refer to colour as they go about their normal lives. My book, however, is concerned with historical languages, for which no native speakers survive. It suggests how historical linguists can extract as much colour information as possible from written evidence, and how to spot the likely presence of unfamiliar systems. The book introduces information which is essential for colour studies in general, such as how to recognize basic colour concepts and terms (in English, red is basic but crimson is not); how to deal with sub-sets (perhaps horse-colours or hair-colours); how to suspect the use of macro-categories (where, for example, blue and green are considered a single colour), and much more. It also considers the means employed by various scholars to record and explain colour systems, such as Anna Wierzbicka’s Natural Semantic Metalanguage and Robert MacLaury’s Vantage Theory. All these ideas, illustrated by examples from around the world, are presented as the essential background to understanding historical references to colour. And it must be pointed out that the past is definitely a foreign country: our familiar English-language colour system was not always the same, as we can see from the following two cases.

Old English (the language up to c.1100) includes the word græg which developed into Modern English grey, but did it mean ‘grey’, that is, a mixture of black and white? To answer this question, the researcher looks at, among other things, the Latin words which Anglo-Saxons translated with græg. We find that græg has been used to translate croceus ‘saffron-coloured, yellow (or ruddy)’ and cycneus ‘swan-like (in whiteness)’. It would seem apparent that, when Anglo-Saxons used græg, they were not always referring to a mixture of black and white.

Even when we move nearer the present time, by looking at Middle English (from c.1100 to c.1500), we find that colour matters are still not entirely familiar to us. The word bleu, whose modern descendant is blue, was a new introduction to the language from Norman French, and some of its early meanings included ‘blonde’ and ‘pale’. Its almost exclusive connection with the blue hue developed gradually over many years. The researcher, of course, wants to know why and how, and it is hoped that, whatever the language under investigation, The Semantics of Colour will help him or her to investigate such mysteries.

Dr Biggam is Honorary Senior Research Fellow, English Language in the School of Critical Studies at the University of Glasgow. Her new book The Semantics of Colour is now available from Cambridge University Press, including a Kindle edition.

Meaning and Humour

A blog by Andrew Goatly

To what extent is humour a liberating force? According to the theory advanced in Meaning and Humour, humour defeats expectations or introduces incongruities. And, linguistically speaking, this can be analysed as an overriding of lexical priming (Hoey), or as surprising foregrounding (Leech). For example, consider this joke:

“Give a man a fish – feed him for a day. Give a man two fish – feed him for two days”.

Internally the second sentence is not foregrounded – it is entirely predictable, to the point of near redundancy. Externally, however, according to the expectations of this epigrammatic genre, where we anticipate something clever, unpredictable, entropic, the second sentence is foregrounded. The fact that most humour depends upon the overriding of lexical priming or startling departures from discoursal norms would appear to make it a linguistically disruptive or rebellious force. However, humour does little to change priming patterns. One might contrast it with original metaphor in this respect, which can change the lexical meanings of a language more permanently. It is no accident that most metaphorical humour, at least metaphorical punning, depends upon conventional, lexicalised metaphors. For instance:

“You could walk through George W Bush’s deepest thoughts and not get your ankles wet”


This depends upon the conceptual metaphor UNDERSTANDING/INSIGHT IS DEPTH realised in the two meanings of deep.

In terms of its social uses and functions humour is also rather ambiguous as a liberating force. On the one hand it can be a tool for rebellion against authority. On the other, it could be regarded as a means of social control, often through embarrassment (Billig). Either way, as well as its uses as a social lubricant, humour can be seen as a weapon.

This example of unintentional humour encapsulates some of these ideas.

Dale Martin, an entertainer, has been ordered by a provincial court judge to avoid making anyone pregnant for the next three years. The order not to impregnate any girls came from Judge Leslie Bewley, who gave Martin a suspended sentence and three years probation for possession of an offensive weapon. (Toronto Globe Tibballs 2006: 490)

In the context of the conceptual metaphor SEX IS VIOLENCE, and due to the textual priming by “pregnant”, we are likely to interpret “weapon” as a sexual metaphor. However, an alternative priming, by the genre of (news reports of) legal judgements, rules out this meaning, and predicts the literal one. By the way, this conceptual metaphor association of sex with violence underlies the theories of humour as both aggressive and sexual (Freud, Koestler, Fonagy). Jokes have “punch” lines, and may “misfire”, as well as being associated with sex. In the Collins Cobuild Wordbanks Online corpus, sex is the most common (T-score) collocate of joke, and dirty is the 25th most common.


Andrew Goatly is a professor in the Department of English at Lingnan University, Hong Kong. His new title Meaning and Humour is now available from Cambridge University Press at £22.99 / $35.99