Scissors cut paper.
Paper wraps rock.
Rock blunts scissors.
These sentences are not only the ‘unspoken rules’ of the game; they are the game.
The players also understand other unspoken rules. Let us consider them as a ‘rule of three’.
(i) Each sentence must contain two items and an action.
(ii) The order of words shows which item is acting (‘the subject’) and which is acted upon (‘the object’).
(iii) The rules alone are not enough; the meaning must make sense to both players.
As we speak, we create ‘story’. We do this using the unspoken rules of our language (the grammar and syntax) to ‘make sense’ of our words and those of others. The rules used are specific to each language; there is no underlying structure of language (a ‘universal grammar’) as Chomsky has suggested. Instead, a universal principle applies; all languages, even the simplest of creoles, quickly evolve their own patterns of grammar and syntax. These patterns produce the attributes of ‘story’, the structure which allows a language to work.
Stories have a structure.
Stories should have a beginning, a middle and an end…
But not necessarily in that order
(Jean-Luc Godard)
The mimed game of ‘rock, paper, scissors’ tells the story of how these objects interact. The sentences build relationships between the objects. The rules of grammar and syntax allow us to organise the words into sequences that show what these relationships are. We use this structure to code meaning into our speech.
As we learn words, so we ‘imprint’ the structure of our language. We also acquire common expressions, such as sayings and other phrases or ideas with metaphorical meanings (e.g. characters from fairy tales) that are understood by our cultural group.
Many songbirds and other animals also produce imprinted learned calls made of ordered sequences (phrases) of sounds. We recognise these repeating patterns in the speech we hear and ‘imprint’ as children. The differences in our childhood learning environments give rise to variations in the way we produce sounds, such as regional accents.
Speaking involves learning and repeating these sound-generating movement patterns, along with other movements such as facial expressions and hand gestures. In this sense, our speech is formed from a set of stylised vocal and non-vocal movement patterns. We learn this ‘dance’ from our cultural group.
Most of our words, phrases, sayings, metaphors and inferences result from inherited patterns in our cultural context. This ‘local effect’ has also been observed in ‘dialects’ (local pattern variations) of birdsong.
Stories ‘move us’ physically and through emotions.
Movement is part of all types of communication. In the course of the action (the verb) within a speech phrase, the ‘subject’ (the ‘character’ which is active) moves between physical states as a result of having made this action.
We understand that this change has taken place through the emotional shift we associate with this change. The very word ‘emotion’ contains the Latin root movere, to move, and the prefix ‘e-’ means ‘out of’.
We code our words with symbolic meaning by tagging our understanding of what they represent with how we feel. Our words are therefore symbols that contain both semantic and emotional content. This content is conveyed into our speech through the pitch, timbre, tone and pace of our voice, as well as the rhythm and stress patterns within our speech.
Vital as this is, usually we are not conscious of this musical content, known as ‘prosody’, which transmits emotional information. Beyond ourselves, we can also discern and infer emotional cues in the involuntary vocal calls and facial expressions (‘facial gestures’) of other primates.
Producing speech with mouths, hands, or even through remote means such as writing, involves movement. Indeed, the same neural pathways, involving the so-called ‘mirror neurons’, are activated both when we think of or hear a word and when we encounter the experience which the word symbolises. This neural network extends into the emotional centres of our brains, allowing us to code the meaning of words by relating emotions to our experiences.
Spoken words are rhythmical ‘vocal gestures’ made with our eating apparatus (the mouth and throat), and are coordinated with the rhythm of our breathing. Controlling these movements involves coupling and modifying rhythmic outputs from multiple Central Pattern Generator circuits.
These ‘neural metronomes’ generate autonomous rhythmic nerve impulses that drive repetitive movements like chewing and walking. Higher brain centres initiate and coordinate these signals, integrating them via the basal ganglia into ordered sequences of finely controlled motor movements. The order, pace and musicality of our speech result from combining sets of these neural patterns with pattern modulations and interruptions, in a way which works with the rules that structure our language.
Our body movements, like our language use and accents, follow patterns that we learn as children from ‘our tribe’. The unique rules and patterns of a language, like its vocabulary, are indicators of the cultural content and perspective they express. A speaker and listener must agree the meaning of the words and phrase structure they use. Communication, then, involves creating ‘story’ using movements which are simultaneously a whole-body and a whole-society activity.
Stories have something to say.
In the game ‘scissors, rock, paper’, both players understand the meaning of the symbolic gestures used and the relationship between them. Their interaction is understood as an event in time and space. In language, we ‘tell stories’ using ordered words and phrases in order to convey an intended meaning.
‘Scissors cut paper’
The sentence describes an event which may have happened, is happening or could happen. An object acts upon another object in time and space, until there is a resolution. The objects are represented as gestures. Considering words as gestures in sound, when we hear or read a simple sentence from the game, we internally reproduce (‘represent’) the associated objects, actions and representing gestures in our mirror neuron circuits along with the experiences they symbolise. The mirror system, associated with areas of motor activity, allows us to revisit the embodied experience of movements associated with these ideas.
We perceive the word symbols and emotional content of speech as patterns. Mammals may be particularly competent at recognising patterns and edges, but our pattern recognition ability is exceptional. When we speak, we use sound pattern motifs (words) coded with symbolic ideas, and then order these into sequences.
Forming our words and phrases involves making a series of movement patterns. Our speech and body posture reveal how we understand ourselves. As our children progress in learning to speak, they reveal the progress of their capacity to take charge of their thoughts. The story that emerges through our spoken words reveals what we think and how we feel. We craft this narrative from the meaning we assign to our observations, and use music and movement to mirror this back to ourselves and others. How could this story mechanism have begun, and how has it evolved?
Did our stories begin with singing?
Amongst the apes, our ability to sing is unique. The musicality of our speech has many components, including variations in rhythm, phrasing, pitch and tone.
Speech is inherently rhythmical. Indeed in English, we even shift the stress of a word to maintain that rhythm. Andrew Carstairs-McCarthy has suggested that syntax evolved from the same basic (open-close) ‘oscillating motor’ mechanism by which we also organise our articulated sounds into ‘consonant-vowel-consonant’ syllables. Alternation is discernible in the rhythm of speech, and appears also in word order, such as the object-action-object structure of ‘paper wraps rock’.
Every human culture has some form of music with a beat. Rhythm determines how we perceive and process musical information. Syllables and small clauses have a bimodal rhythmic structure which helps to articulate the sounds within the phrase; for example:
‘PAper wraps ROCK’.
Asif Ghazanfar suggests that rhythmic communication is found throughout the higher primates, in the form of behaviours such as chimpanzee pant-hoots and rhythmical facial gestures such as lipsmacks. This implies that a similar bimodal rhythm mechanism was present in the communication behaviours of our shared ancestors.
William Tecumseh Fitch suggests that speaking began with prosody; he proposes that our hominin ancestors’ ‘proto-language’ may have initially used intonation (controlling the pitch of sounds) rather than word-based syllables. Many current world languages are based on tones rather than word forms.
Fitch argues that the vocal control which allowed our ancestors to sing may have been later ‘exapted’ to produce such pitch-based proto-syllables, and that articulation of vowel and consonant sounds came later as a means of expanding the diversity of this sound repertoire. This makes a plausible case for the origins of our ability to articulate syllables arising from song.
The human male larynx descends during puberty, giving a deeper formant (resonant frequency) which enables us to distinguish male from female voices mainly by their pitch and tone. In contrast to most songbirds, our ability to speak (and sing) is balanced between the sexes. This suggests that however human speech arose, it was not primarily to attract mates, as is the case in most songbirds and vocal mammals.
This may be true, but David Puts and colleagues have found that the pitch of a man’s voice alters in the presence of other males, implying an element of between-male competition in the establishment of dominance. The second descent of the human male larynx may therefore have evolved as a cue to signal the individual’s status within the tribe, or to defend territorial boundaries.
In addition, low sounds travel further. Deeper male voices may have proven more effective at coordinating the tribe when hunting in low visibility conditions such as dense forest or over long distances in the open savannah.
(iii) Musical phrasing
In English, we use variations in relative pitch to shape our phrases and code emotional meanings. These ‘melodies’ overlay variations in pitch on to the rhythmic patterns within words. Shifts in pitch give stress and emphasis, so shaping these phrases. This musicality is crucial to reveal our intended meaning to others, while intonation makes it easier for a listener to distinguish the endings of our phrases. As children we learn to make these musical (prosodic) sound patterns alongside our capacity to articulate the vowels and consonants.
Both young songbirds and human children have sensitive periods of vocal learning that require social feedback from adults. Like human babies, songbirds have a ‘babbling’ phase where they try out sounds prior to rehearsing and imprinting their adult call. These birds build phrased sequences from repeating syllable elements, punctuated with explicit pauses.
Repeated elements may add emphasis, although modifying the order does not seem to alter the meaning of the call, which advertises their suitability as a mate. Sequence variations may however serve to identify individuals within the flock for some species, such as starlings.
Steven Brown shows that the most complex song forms in birds and in other primates arise amongst monogamous pairs of duetting tropical songbirds and gibbons. These calls appear to be significant in the defence of territories and maintaining social bonds. He notes that courtship calls are rare to non-existent in our sub-clade of the higher primates; neither chimps nor bonobos vocalise complex learned song. Although in theory their vocal tract anatomy would enable them to produce some vowel and consonant sounds, no chimps have ever done so.
In contrast, territorial calls are found throughout the entire primate clade. A deeper male voice may permit a means to distinguish authority and hierarchy within the tribe, and may have been influential in ‘vocal grooming’ amongst our hominin ancestors.
In order to produce sequences of symbolically coded, articulated sounds, however, our ancestors must have been capable of organising their proposed actions into sequences. Neurological studies show that organising our thoughts into sequences requires a high degree of brain connectivity.
This would mean that the developmental shift in the degree of connectedness between neurons, which allowed hominins to enact organised sequences of behaviours, may have provided the circuitry that was later ‘exapted’ for language syntax.
How did our storytelling behaviour evolve?
Other animals’ behaviours are driven broadly by instinctual need. Whilst humans do operate at this level, what is distinctive about our behaviour is the capacity to produce actions to achieve an intended purpose. Our speech is unique because we can intentionally use it to order our thoughts and tasks in time.
At the cognitive level, this is in essence the same type of task as ordering a sequence of actions using manual tools to achieve a goal. We use words as tools to share information, constructing an idea before we share it with others.
In both cases, we construct the tasks in a sequence, formulating the goal (the completed manual task or the delivering of the information) before we begin. This structuring is an active process that takes place even for a remembered event.
As we revisit our memories, we include only certain details in our narrative; these details trigger pattern recognition in our own thinking (and that of our listener). This processing and editing of information is in essence a structuring of patterned information. The establishment of a ‘narrative’ (a sequence of events) applies equally to using a manual tool and telling a ‘story’.
Our repertoire of stories, folk tales and fairy tales is amongst the tools by which we share culturally what is important to us and how we think. Using story in this way reveals patterns; we discern deeper meanings and lessons from the presented information, which is a mechanism to share understanding.
Being able to project different interpretations to evaluate cues from the environment provides a means of assessing risk, and results in different types of behaviour choices. The means to effectively evaluate a cue such as rustling in the undergrowth as either an opportunity (a potential food item such as a small mammal) or a threat (a wolf) would quickly prove a selectable advantage.
Many animals recognise cues which index the presence of another animal (e.g. a scent, or footprints). It is easy to imagine how our ancestors’ capacity to evaluate these cues strategically using their ‘thinking tools’ would quickly change their chances of survival.
Being able to share these thoughts and coordinate their responses with others would radically shift the ecology of the tribe into a mode where sensory awareness and experience were held and understood collectively by the ‘quorum’.
Words then are tools that define boundaries between ideas, and help to structure our collective thinking. The making and using of tools requires the execution of sequences of patterns, in the form of intentional fine motor movements.
Philip Lieberman suggests that our upright posture may have pre-adapted our ancestors for the enhanced motor control needed for tool-making, tool using and speech. Walking itself is a simple patterned, repetitive movement.
Our basic ‘walking instinct’ initially activates a Central Pattern Generator circuit driving movement in all four limbs. Our newborns’ initial locomotion usually involves crawling. The subcortical basal ganglia of the language network in the brain also regulate the muscles controlling our upright posture.
Walking, and other sequences of behaviour such as speech, are learned gradually over years. Heel strike, which marks efficient bipedal locomotion, takes years to develop.
Learning to use words as symbols requires repetition of the word and an internal coding of it as a pattern motif. The characteristic of any pattern is that it contains repeating elements, and that repetition stimulates our memory in proportion to the strength of (i.e. our familiarity with) that pattern. The sequence of word order determined by a language’s grammar and syntax provides in essence a framework for creating patterns.
This ability to associate gestures with ideas and order them into sequences is present to a limited degree in our primate relatives. Wild bonobos combine call types together into longer mixed sequences upon finding a food cache. Other tribe members understand information about the quality of this find from these different sound sequences, and respond with appropriate types of foraging behaviour.
Higher primates have a mirror neuron network which is triggered by facial expressions and grasping movements associated with obtaining food, although they cannot mirror mimed actions.
Why do we tell stories?
Communication involves transmitting and receiving a message that is understood the same way by sender and recipient. Human speech is the same, but is driven by the intention to share a meaning. As well as a repertoire of pre-arranged signals with agreed meanings, we need to have something to say. We tell stories therefore to communicate an intention. This means that our ancestors must have felt compelled to convey the contents of their thoughts to others.
We speak to communicate the contents of our minds. It is perhaps then the patterns of our thinking, rather than our speaking, which are the truly unique feature of human communication. Our ability to combine ideas into a syntactic structure and create new associations demarcates a boundary between our thinking and that of our closest relatives, the chimps.
Captive chimpanzees and bonobos can be taught to understand some signed words or use pictorial symbols, and the most accomplished of these learners can combine certain of their vocabulary into ‘small clause’ forms, for example ‘agent + action’ or ‘action + object’.
Ljiljana Progovac argues that these ‘small clauses’ are the basic units of word combination that all languages share. She proposes these forms as the ‘proto-syntax’ from which our more complex structures have evolved. If this is correct, then this suggests that the most basic components of our ability to combine words into language were present in our common ancestors with chimpanzees.
The way we acquire language reveals the process by which we develop a sense of self-identity. Our children’s awareness of ‘self’ and ‘other’ is expressed in their language, although it is not dependent upon it. An understanding that others share similar experiences (known as a ‘theory of mind’) develops gradually during their first five years. The ability to organise words into small clauses such as ‘scissors cut’, or ‘cut paper’ appears in human children at between 1 and 2 years of age.
The more accomplished language-learning primates, such as the bonobo Kanzi, associate sounds and symbols with objects and even seem to understand abstract concepts such as ‘happy’. Bonobos at the Georgia State University primate language research project use a lexigram (symbols board) to create two-word small clauses. None of these animals, however, has ever attempted a self-description such as ‘I think…’, ‘I feel…’ or ‘I want…’ This ability develops in human children between 2 and 3 years of age. The bonobo’s use of human symbol-based language, in contrast, arrests at a stage roughly equivalent to a human child of around two years.
All young mammals play, but perhaps the most intriguing part of human communication is that play is incorporated into our information sharing. The simple story of scissors, rock and paper is fiction in that the items do not need to exist to be ‘present’ and interact in the game. Fiction captivates our attention and holds it far more easily than factual narratives.
Within a story, our thoughts project information based upon past experience into the future, allowing us to ‘play out’ the actions in drama or mime before attempting the task. Studies on primate mirror neuron responses report that these animals typically do not respond to such mimed gestures.
The act of telling stories shifts our ecology into the ‘cognitive niche’; we operate in a world of endlessly combined and recombined ideas. Emotions are the source of meaning for these ideas. Perhaps our evolved story mechanism can be considered as a means of evolved emotional language, conveying higher orders of meaning and inferred understanding to our experiences. This combining of ideas, putting one into another to make a new meaning which is different from that of the ideas on their own, is known as ‘recursion’. It is considered to be a defining characteristic of our capacity for creativity in our thoughts.
In summary then, our ‘story mechanism’ translates thoughts into actions, and enables us to bring new things and events into being. Coding information in story form makes it possible to share past experiences, ‘reword’ them into new sequences, and by communicating with others, project these imaginings into the future.
It is clear that our hominin ancestors developed neural pathways that allowed them to copy and learn movement sequences, including vocal movements, and used these skills to share their intentions with others. What is less clear is how they came to understand that others had minds like their own, prompting their yearning for connection.
Whatever its cause, however, an expansion of conscious awareness drove our ancestors to share their ideas and understanding, and to begin to tell each other their story.
- Our speech comprises a complexity of coded sounds that we are able to order and organise. This organisation has ‘rules’ that a listener can use to perceive and translate what they hear into meaning.
- Using speech-based language allows us to put our thoughts in an order, and then control our actions in a defined and directed way. Neurologically there is no difference between using manual tools and word tools in an ordered sequence; the brain codes both of these as a set of gesture-based motor movements.
- Some birds and other animals are able to learn complex sound sequence patterns, mostly as a display signal for sexual selection. In contrast, human speech and other forms of communication are gender balanced. We use our communication sequences to bond socially and carry and transmit collectively held ideas.
- The behavioural choices of other animals and birds are the result of reactions to their circumstances. Humans put their thinking into words, and voice their intentions.
- There is nothing biologically unique about the behaviours that allow us to speak. Our language function suggests instead a greater level of connection between these abilities than is found in other animals. This has allowed us to assume fine control over our movements that produce vocalisations and other actions, and also by implication our thoughts. This may be a result of the physical changes needed to allow our ancestors to walk upright.
- What human language enables us to express is a sense of self-identity; it is a means of defining ‘our story’.
Text copyright © 2015 Mags Leighton. All rights reserved.
Amador, A. et al. (2013) Elemental gesture dynamics are encoded by song premotor cortical neurons. Nature 495, 59-64.
Astington, J. W. and Edward, M.J. (2010) The development of theory of mind in early childhood. Encyclopedia on Early Childhood Development, 1-6. Edward M.J (Ed) Published online at www.child-encyclopedia.com
Atran, S. (1982) Constraints on a theory of hominid tool-making behaviour. L’Homme 22, 35-68.
Bouwer, F.L. et al. (2014) Beat processing is pre-attentive for metrically simple rhythms with clear accents: an ERP study. PLoS ONE 9, e97467.
Boyd, B. (2009) On the Origin of Stories: Evolution, Cognition and Fiction. Harvard.
Brown, R. (1973) Development of the first language in the human species. American Psychologist 28, 97-106.
Brown, S. (2000) Evolutionary models of music: from sexual selection to group selection. Perspectives in Ethology 13, 231-281.
Bruner, J.S. (1975) From communication to language – a psychological perspective. Cognition 3, 255-287.
Clark, K.B. and Clark, M.K. (1939) The development of consciousness of self and the emergence of racial identification in negro preschool children. Journal of Social Psychology, S.P.S.S.I. Bulletin 10, 591-599.
Corballis, M.C. (2007) Recursion, language, and starlings. Cognitive Science 31, 697-704.
Dittrich, F. et al. (2013) Maximized song learning of juvenile male zebra finches following BDNF expression in the HVC. European Journal of Neuroscience 38, 3338-3344.
Donald, M. (2001) A Mind So Rare: The Evolution of Human Consciousness. Norton.
Dunbar, R.I.M. (2003) The social brain: mind, language, and society in evolutionary perspective. Annual Review of Anthropology 32, 163-181.
Eisen, A. et al. (2014) Tools and talk: an evolutionary perspective on the functional deficits associated with amyotrophic lateral sclerosis. Muscle & Nerve 49, 469-477.
Evans, N. and Levinson, S. (2009) The myth of language universals: language diversity and its importance for cognitive science. Behavioral and Brain Sciences 32, 429-448.
Everett, D. L. (2009) Don't Sleep, There are Snakes: Life and Language in the Amazonian Jungle. Random House.
Everett, D. L. (2012) Language: The Cultural Tool. Random House.
Everett, D. L. (in press) The role of culture in the emergence of language 1. In The Handbook of Language Emergence (W. O’Grady and B. MacWhinney, eds). Wiley-Blackwell.
Everett, D. L. (in press) Sculpting language: A review of the David McNeill Gesture Trilogy. In The Handbook of Language Emergence (W. O’Grady and B. MacWhinney, eds). Wiley-Blackwell.
Fitch, W.T. (2005) The evolution of language: a comparative review. Biology and Philosophy 20, 193-230.
Fitch, W.T. (2011) The evolution of syntax: an exaptationist perspective. Frontiers in Evolutionary Neuroscience 3, article 9.
Fitch, W.T. (2012) Evolutionary developmental biology and human language evolution: constraints on adaptation. Evolutionary Biology 39, 613-637.
Gallistel, C.R. (2011) Prelinguistic thought. Language Learning and Development 7, 253–262.
Gentner, T.Q. et al. (2006) Recursive syntactic pattern learning by songbirds. Nature 440, 1204-1207.
Ghazanfar, A. (2013) Multisensory communication in primates and the evolution of rhythmic speech. Behavioural Ecology and Sociobiology 67, 1441-1448.
Gould, S.J. and Vrba, E.S. (1982) Exaptation – a missing term in the science of form. Paleobiology 8, 4-15.
Iverson, J.M. and Thelen, E. (1999) Hand, mouth and brain. Journal of Consciousness Studies 6, 19-40.
Jürgens, U. (2002) Neural pathways underlying vocal control. Neuroscience and Biobehavioural Reviews 26, 235–258.
Lai, J. and Poletiek, F.H. (2011) The impact of adjacent-dependencies and staged-input on the learnability of center-embedded hierarchical structures. Cognition 118, 265-273.
Lieberman, P. (1984) The Biology and Evolution of Language. Harvard.
Lieberman, P. (2001) Human language and our reptilian brain: the subcortical bases of speech, syntax, and thought. Perspectives in Biology and Medicine 44, 32-51.
Lieberman, P. (2006) Toward an Evolutionary Biology of Language. Harvard.
Lieberman, P. (2009) Human language and our reptilian brain: The subcortical bases of speech, syntax, and thought. Harvard.
Naoi, N. et al. (2012) Prosody discrimination by songbirds (Padda oryzivora). PLoS ONE 7, e47446.
Patel, A.D. and Iversen, J.R. (2014) The evolutionary neuroscience of musical beat perception: the Action Simulation for Auditory Prediction (ASAP) hypothesis. Frontiers in Systems Neuroscience 8, article 57.
Petitto, L.A. et al. (2001) Language rhythms in baby hand movements. Nature 413, 35-36.
Petkov, C.I. and Wilson, B. (2012) On the pursuit of the brain network for proto-syntactic learning in non-human primates: conceptual issues and neurobiological hypotheses. Philosophical Transactions of the Royal Society of London, B 367, 2077-2088.
Progovac, L. (2010) Syntax: its evolution and its representation in the brain. Biolinguistics 4, 234-254.
Puts, D.A. et al. (2006) Dominance and the evolution of sexual dimorphism in human voice pitch. Evolution and Human Behavior 27, 283-296.
Puts, D.A. et al. (2007) Men’s voices as dominance signals: vocal fundamental and formant frequencies influence dominance attributions among men. Evolution and Human Behavior 28, 340-344.
Rey, A. et al. (2012) Centre-embedded structures are a by-product of associative learning and working memory constraints: evidence from baboons (Papio papio). Cognition 123, 180-184.
Roy, A.C. et al. (2013) Syntax at hand: common syntactic structures for actions and language. PLoS ONE 8, e72677.
Savage-Rumbaugh, S. et al. (1998) Apes, Language, and the Human Mind. Oxford.
Selezneva, E. et al. (2013) Rhythm sensitivity in macaque monkeys. Frontiers in Systems Neuroscience 7, article 49.
Suddendorf, T. and Corballis, M.C. (2007) The evolution of foresight: what is mental time travel, and is it unique to humans? Behavioral and Brain Sciences 30, 299-351.
Vaesen, K. (2012) The cognitive bases of human tool use. Behavioral and Brain Sciences 35, 203-262.
Yip, M.J. (2006) The search for phonology in other species. Trends in Cognitive Sciences 10, 442-446.
Zawidzki, T.W. (2006) Sexual selection for syntax and kin selection for semantics: problems and prospects. Biology and Philosophy 21, 453-470.
Ziegler, W. (2013) The rhythmic organisation of speech gestures and the sense of it. Language, Cognition and Neuroscience 29, 38-40.