Telling Tales

Scissors cut paper.

Paper wraps rock.

Rock blunts scissors.

These sentences are not only the ‘unspoken rules’ of the game; they are the game.

The players also understand other unspoken rules. Let us consider them as a ‘rule of three’.

(i) Each sentence must contain two items and an action.

(ii) The order of words shows which item is acting (‘the subject’) and which is acted upon (‘the object’).

(iii) The rules alone are not enough; the meaning must make sense to both players.
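
The three sentences and the first two rules above can be sketched as a small data structure. This is only an illustrative sketch: the names RULES, describe and winner are assumptions made for this example and are not part of the game itself. Each rule is stored as a (subject, verb, object) triple, so the word order alone decides which item acts and which is acted upon.

```python
# Each sentence of the game stored as a (subject, verb, object) triple.
# Hypothetical names, chosen for illustration only.
RULES = [
    ("scissors", "cut", "paper"),
    ("paper", "wraps", "rock"),
    ("rock", "blunts", "scissors"),
]


def describe(first, second):
    """Return the sentence that resolves a round, or None for a draw."""
    for subject, verb, obj in RULES:
        # The pair matches a rule regardless of which player showed which item.
        if {subject, obj} == {first, second}:
            return f"{subject.capitalize()} {verb} {obj}."
    return None


def winner(first, second):
    """Return the acting item (the subject) of the matching rule, or None."""
    for subject, verb, obj in RULES:
        if (subject, obj) in ((first, second), (second, first)):
            return subject
    return None


print(describe("rock", "paper"))
print(winner("rock", "scissors"))
```

Swapping subject and object in a triple, for example ("paper", "cut", "scissors"), would produce a sentence that is grammatical but is not in the list, which is the point of rule (iii): the structure alone does not guarantee a meaning that both players accept.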

As we speak, we create ‘story’. We do this using the unspoken rules of our language (the grammar and syntax) to ‘make sense’ of our words and those of others. The rules used are specific to each language; there is no underlying structure of language (a ‘universal grammar’) as Chomsky has suggested. Instead, a universal principle applies; all languages, even the simplest of creoles, quickly evolve their own patterns of grammar and syntax. These patterns produce the attributes of ‘story’, the structure which allows a language to work.

Stories have a structure.

Stories should have a beginning, a middle and an end…

But not necessarily in that order

(Jean-Luc Godard)

The mimed game of ‘rock, paper, scissors’ tells the story of how these objects interact. The sentences build relationships between the objects. The rules of grammar and syntax allow us to organise the words into sequences that show what these relationships are. We use this structure to code meaning into our speech.

As we learn words, so we ‘imprint’ the structure of our language. We also acquire common expressions, such as sayings and other phrases or ideas with metaphorical meanings (e.g. characters from fairy tales) that are understood by our cultural group.

Many songbirds and other animals also produce imprinted learned calls made of ordered sequences (phrases) of sounds. We recognise these repeating patterns in the speech we hear and ‘imprint’ as children. The differences in our childhood learning environments give rise to variations in the way we produce sounds, such as regional accents.

A male Zebra Finch (Taeniopygia guttata) at Dundee Wildlife Park, South Australia. The complex calls of Zebra Finches and many other songbirds are learned. Juvenile birds have a period of imprinting, in which they mimic the adult calls of their ‘tribe’. This learning window closes as they reach adult age, at which point their song pattern becomes ‘fixed’ (Image: Wikimedia Commons)

Speaking involves learning and repeating these sound-generating movement patterns, along with other movements such as facial expressions and hand gestures. In this sense, our speech is formed from a set of stylised vocal and non-vocal movement patterns. We learn this ‘dance’ from our cultural group.

Most of our words, phrases, sayings, metaphors and inferences result from inherited patterns in our cultural context. This ‘local effect’ has also been observed in ‘dialects’ (local pattern variations) of birdsong.

Stories ‘move us’ physically and through emotions.

Movement is part of all types of communication. In the course of the action (the verb) within a speech phrase, the ‘subject’ (the ‘character’ which is active) moves between physical states as a result of having made this action.

We understand that this change has taken place through the emotional shift we associate with this change. The very word ‘emotion’ contains the Latin root movere, to move, and the prefix ‘e-’ means ‘out of’.

We find that we are able to ‘read’ emotion in the face of this male Barbary Macaque (Macaca sylvanus) and his offspring. All primates use facial expressions and involuntary vocal calls to communicate emotions. Macaques and other monkeys change their facial expression involuntarily, revealing information about their emotional state even when these animals are not interacting directly with each other. For instance, these and other primates are sensitive to audible rhythms, and upon hearing them, produce changes in facial expression which are matched by a shift in the neural circuitry of the brain (Image: Wikimedia Commons).

We code our words with symbolic meaning by tagging our understanding of what they represent with how we feel. Our words are therefore symbols that contain both semantic and emotional content. This content is conveyed into our speech through the pitch, timbre, tone and pace of our voice, as well as the rhythm and stress patterns within our speech.

Vital as this is, usually we are not conscious of this musical content, known as ‘prosody’, which transmits emotional information. Beyond ourselves, we can also discern and infer emotional cues in the involuntary vocal calls and facial expressions (‘facial gestures’) of other primates.

Producing speech with mouths, hands, or even through remote means such as writing, involves movement. Indeed, the same neural pathways, involving the so-called ‘mirror neurons’, are activated both when we either think of or hear a word, and when we encounter the experience which the word symbolises. This neural network extends into the emotional centres of our brains, allowing us to code the meaning of words by relating emotions to our experiences.

Spoken words are rhythmical ‘vocal gestures’ made with our eating apparatus (the mouth and throat), and are coordinated with the rhythm of our breathing. Controlling these movements involves coupling and modifying rhythmic outputs from multiple Central Pattern Generator circuits.

These ‘neural metronomes’ generate autonomous rhythmic nerve impulses that drive repetitive movements like chewing and walking. Higher brain centres initiate and coordinate these signals, integrating them via the basal ganglia into ordered sequences of finely controlled motor movements. The order, pace and musicality of our speech result from combining sets of these neural patterns with pattern modulations and interruptions, in a way which works with the rules that structure our language.

We walk as we speak; with intention. The rhythmic motor movements that both of these processes involve are learned patterns with a cultural basis. The distinctive walking gait of the Maasai of East Africa has a very low impact on the body. Their traditional nomadic life involves walking with their cattle for hundreds of miles, between grazing areas. Their walking style means that they suffer from little or none of the wear-and-tear we would expect from this level of activity (Image: Wikimedia Commons)

Our body movements, like our language use and accents, follow patterns that we learn as children from ‘our tribe’. The unique rules and patterns of a language, like its vocabulary, are indicators of the cultural content and perspective they express. A speaker and listener must agree on the meaning of the words and phrase structure they use. Communication, then, involves creating ‘story’ using movements which are simultaneously a whole body and a whole society activity.

Stories have something to say.

In the game ‘scissors, rock, paper’, both players understand the meaning of the symbolic gestures used and the relationship between them. Their interaction is understood as an event in time and space. In language, we ‘tell stories’ using ordered words and phrases in order to convey an intended meaning.

‘Scissors cut paper’

The sentence describes an event which may have happened, is happening or could happen. An object acts upon another object in time and space, until there is a resolution. The objects are represented as gestures. Considering words as gestures in sound, when we hear or read a simple sentence from the game, we internally reproduce (‘represent’) the associated objects, actions and representing gestures in our mirror neuron circuits along with the experiences they symbolise. The mirror system, associated with areas of motor activity, allows us to revisit the embodied experience of movements associated with these ideas.

What stories have to say involves a journey (a beginning, a middle and an end). The characters in a story are animated. They appear in the plot with a discernible ‘motivation’ revealed by their actions. This action results in a change of state for that character. This movement is true at different levels of resolution, such as the object acting within a single phrase, or a character in an epic tale. The narrative created by grammar and syntax ‘animates’ objects into a change of state in the phrase. This codes for a change of meaning (Image: Wikimedia Commons)

We perceive the word symbols and emotional content of speech as patterns. Mammals may be particularly competent at recognising patterns and edges, but our pattern recognition ability is exceptional. When we speak, we use sound pattern motifs (words) coded with symbolic ideas, and then order these into sequences.

Forming our words and phrases involves making a series of movement patterns. Our speech and body posture reveal how we understand ourselves. As our children progress in learning to speak, they reveal the progress of their capacity to take charge of their thoughts. The story that emerges through our spoken words reveals what we think and how we feel. We craft this narrative from the meaning we assign to our observations, and use music and movement to mirror this back to ourselves and others. How could this story mechanism have begun, and how has it evolved?

Did our stories begin with singing?

Amongst the apes, our ability to sing is unique. The musicality of our speech has many components, including variations in rhythm, phrasing, pitch and tone.

(i) Rhythm

Basic ‘rock’ drum rhythm pattern, notated for bass, snare and cymbal. Rhythm is a basic component of the music of our speech, and in English this rhythm often has priority over other factors in the way we pronounce words. For example, we typically pronounce the word thirteen as ‘thir-TEEN’, with stress on the second syllable. If this word comes ahead of another word with stress on the first syllable, such as ‘WO-men’, we pronounce this ‘THIR-teen WO-men’. This shift in the stress peak maintains a ‘beat, offbeat, beat, offbeat’ rhythm pattern in our speech, similar to this rock drum riff.

Speech is inherently rhythmical. Indeed in English, we even shift the stress of a word to maintain that rhythm. Andrew Carstairs-McCarthy has suggested that syntax evolved from the same basic (open-close) ‘oscillating motor’ mechanism by which we also organise our articulated sounds into ‘consonant-vowel-consonant’ syllables. Alternation is discernible in the rhythm of speech, and appears also in word order, such as the object-action-object structure of ‘paper wraps rock’.
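
The stress shift described above (‘thir-TEEN’ becoming ‘THIR-teen’ before ‘WO-men’) can be written as a simple rule. What follows is a toy sketch under stated assumptions, not a model of English phonology: words are given as lists of syllables with the stressed syllable in upper case, and the function names shift_stress and render are invented for this example.

```python
def shift_stress(words):
    """Move a word-final stress to the first syllable when the next word
    begins with a stressed syllable, so two beats never fall side by side.

    `words` is a list of words; each word is a list of syllables, with the
    stressed syllable written in upper case (an assumption of this sketch).
    """
    result = [list(word) for word in words]  # copy, so the input is untouched
    for i in range(len(result) - 1):
        current, following = result[i], result[i + 1]
        clash = (
            len(current) > 1
            and current[-1].isupper()
            and following[0].isupper()
        )
        if clash:
            current[-1] = current[-1].lower()
            current[0] = current[0].upper()
    return result


def render(words):
    """Join syllables with hyphens and words with spaces."""
    return " ".join("-".join(word) for word in words)


print(render(shift_stress([["thir", "TEEN"]])))
print(render(shift_stress([["thir", "TEEN"], ["WO", "men"]])))
```

Spoken alone, the word keeps its stress on the second syllable; placed before ‘WO-men’, the rule restores the alternating ‘beat, offbeat’ pattern.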

Every human culture has some form of music with a beat. Rhythm determines how we perceive and process musical information. Syllables and small clauses have a bimodal rhythmic structure which helps to articulate the sounds within the phrase; for example:

‘PAper wraps ROCK’.

Asif Ghazanfar suggests that rhythmic communication is found throughout the higher primates, in the form of behaviours such as chimpanzee pant-hoots and rhythmical facial gestures such as lipsmacks. This implies that a similar bimodal rhythm mechanism was present in the communication behaviours of our shared ancestors.

(ii) Pitch

William Tecumseh Fitch suggests that speaking began with prosody; he proposes that our hominin ancestors’ ‘proto-language’ may have initially used intonation (controlling the pitch of sounds) rather than word-based syllables. Many current world languages are based on tones rather than word forms.

Organ pipes from the old church at Pellworm, Schleswig-Holstein. Organ pipes produce their note as air passes across the pipe and resonates the air in the cylinder. Longer pipes produce lower pitch notes. Male mammals drop their larynx to vocalise, producing a lower pitched call. The second descent of the human male larynx during puberty deepens a man’s voice (Image: Wikimedia Commons)

Fitch argues that the vocal control which allowed our ancestors to sing may have been later ‘exapted’ to produce such pitch-based proto-syllables, and that articulation of vowel and consonant sounds came later as a means of expanding the diversity of this sound repertoire. This makes a plausible case for the origins of our ability to articulate syllables arising from song.

The human male larynx descends during puberty, giving a deeper formant (resonant frequency) which enables us to distinguish male from female voices mainly by their pitch and tone. In contrast to most songbirds, our ability to speak (and sing) is balanced between the sexes. This suggests that however human speech arose, it was not primarily to attract mates, as is the case in most songbirds and vocal mammals.

Our mode of communication is adaptable to our context. The Yoruba from west Africa traditionally use ‘talking drums’ to communicate with villages up to 5 miles away. The pitch of these drums can be varied when played, mimicking words from the Yoruba language, which is based on tonal shifts (Image: Wikimedia Commons)

This may be true, but David Puts and colleagues have found that the pitch of a man’s voice alters in the presence of other males, implying an element of between-male competition in the establishment of dominance. The second descent of the human male larynx may therefore have evolved as a cue to signal the individual’s status within the tribe, or to defend territorial boundaries.

In addition, low sounds travel further. Deeper male voices may have proven more effective at coordinating the tribe when hunting in low visibility conditions such as dense forest or over long distances in the open savannah.

(iii) Musical phrasing

In English, we use variations in relative pitch to shape our phrases and code emotional meanings. These ‘melodies’ overlay variations in pitch on to the rhythmic patterns within words. Shifts in pitch give stress and emphasis, so shaping these phrases. This musicality is crucial to reveal our intended meaning to others, while intonation makes it easier for a listener to distinguish the endings of our phrases. As children we learn to make these musical (prosodic) sound patterns alongside our capacity to articulate the vowels and consonants.

European Starlings (Sturnus vulgaris) taking an opportunity to feed. These birds are accomplished vocal mimics, and can add new sounds to their song repertoire throughout life. Their calls comprise repeating syllable sequences; each bird’s song is distinctive, and seems to enable individuals to recognise others from their flock (Image: Wikimedia Commons)

Both young songbirds and human children have sensitive periods of vocal learning that require social feedback from adults. Like human babies, songbirds have a ‘babbling’ phase where they try out sounds prior to rehearsing and imprinting their adult call. These birds build phrased sequences from repeating syllable elements, punctuated with explicit pauses.

Repeated elements may add emphasis, although modifying the order does not seem to alter the meaning of the call, which advertises their suitability as a mate. Sequence variations may however serve to identify individuals within the flock for some species, such as starlings.

Steven Brown shows that the most complex song forms in birds and in other primates arise amongst monogamous pairs of duetting tropical songbirds and gibbons. These calls appear to be significant in the defence of territories and maintaining social bonds. He notes that courtship calls are rare to non-existent in our sub-clade of the higher primates; neither chimps nor bonobos vocalise complex learned song. Although in theory their vocal tract anatomy would enable them to produce some vowel and consonant sounds, no chimps have ever done so.

In contrast, territorial calls are found throughout the entire primate clade. A deeper male voice may permit a means to distinguish authority and hierarchy within the tribe, and may have been influential in ‘vocal grooming’ amongst our hominin ancestors.

Babies progress from early babbling sounds through repeating simple naming words, to simple abstracted words such as pronouns (he, she, they) and simple sentences (e.g. ‘carry me’) by age two, and talk simply about their day by age four. Chimpanzees growing up with close human contact cannot speak, although they can be taught sign gestures. However, their ability to combine these gestures into sequences is limited (Image: Wikimedia Commons)

In order to produce sequences of symbolically coded, articulated sounds, however, our ancestors must have been capable of organising their proposed actions into sequences. Neurological studies show that organising our thoughts into sequences requires a high degree of brain connectivity.

This would mean that the developmental shift in the degree of connectedness between neurons that allowed hominins to enact organised sequences of behaviours, may have provided the circuitry that was later ‘exapted’ for language syntax.

How did our storytelling behaviour evolve?

Other animals’ behaviours are driven broadly by instinctual need. Whilst humans do operate at this level, what is distinctive about our behaviour is the capacity to produce actions to achieve an intended purpose. Our speech is unique because we can intentionally use it to order our thoughts and tasks in time.

Illustration by Grandville (1803-1847) for one of the Fables of Aesop. This is one of a number of tales credited to Aesop, a slave and story-teller believed to have lived in Ancient Greece between 620 and 560 BCE. These cautionary tales use metaphors of unbelievable scenarios (here a fox and a crow having a conversation) to code much bigger meanings. This tale cautions about being susceptible to flattery. A fox was walking through the forest when he saw a crow sitting on a tree branch with a fine piece of cheese in her beak. The fox wanted the cheese and decided he would be clever enough to outwit the bird. “What a noble and gracious bird I see in the tree!” proclaimed the fox, “What exquisite beauty! What fair plumage! If her voice is as lovely as her beauty, she would no doubt be the jewel of all birds.” The crow was so flattered by all this talk that she opened her beak and gave a cry to show the fox her voice. “Caw! Caw!” she cried, as the cheese dropped to the ground for the fox to grab (Image: Wikimedia Commons)

At the cognitive level, this is in essence the same type of task as ordering a sequence of actions using manual tools to achieve a goal. We use words as tools to share information by constructing an idea before we can share this information with others.

In both cases, we construct the tasks in a sequence, formulating the goal (the completed manual task or the delivering of the information) before we begin. This structuring is an active process that takes place even for a remembered event.

As we revisit our memories, we include only certain details in our narrative; these details trigger pattern recognition in our own thinking (and that of our listener). This processing and editing of information is in essence a structuring of patterned information. The establishment of a ‘narrative’ (a sequence of events) applies equally to using a manual tool and telling a ‘story’.

Our repertoire of stories, folk tales and fairy tales are amongst the tools by which we share culturally what is important to us and how we think. Using story in this way reveals patterns; we discern deeper meanings and lessons from the presented information, which is a mechanism to share understanding.

Being able to project different interpretations in order to evaluate cues from the environment provides a means of assessing risk, and results in different types of behaviour choices. The means to effectively evaluate a cue such as rustling in the undergrowth as either an opportunity (a potential food item such as a small mammal) or a threat (a wolf) would quickly prove a selectable advantage.

These marks in sand reveal the recent passage of a grey wolf (Canis lupus). Understanding such cues from the environment would have been useful to the survival of our hominin ancestors (Image: Wikimedia Commons)

Many animals recognise cues which index the presence of another animal (e.g. a scent, or footprints). It is easy to imagine how our ancestors’ capacity to evaluate these cues strategically using their ‘thinking tools’ would quickly change their chances of survival.

Being able to share these thoughts and coordinate their responses with others would radically shift the ecology of the tribe into a mode where sensory awareness and experience were held and understood collectively by the ‘quorum’.

Words then are tools that define boundaries between ideas, and help to structure our collective thinking. The making and using of tools requires the execution of sequences of patterns, in the form of intentional fine motor movements.

The gymnast Jade Barbosa, competing in the Mediterraneo Gym Cup in Rome, July 5th 2008. To work on the bar requires highly developed balance and precise coordination. Balancing whilst standing upright is a whole body activity. From Lieberman’s research, it appears that this posture has remodelled our ancestors’ entire physiology from breathing to childbirth. In addition, it frees up the hands, allowing us to perform new manual tasks with tools. The precise motor movement used to manipulate tools may have established the neurological networks through which our ancestors were better able to sequence their thoughts as well as their actions (Image: Wikimedia Commons)

Philip Lieberman suggests that our upright posture may have pre-adapted our ancestors for the enhanced motor control needed for tool-making, tool using and speech. Walking itself is a simple patterned, repetitive movement.

Our basic ‘walking instinct’ initially activates a Central Pattern Generator circuit driving movement in all four limbs. Our newborns’ initial locomotion usually involves crawling. The subcortical basal ganglia of the language network in the brain also regulate the muscles controlling our upright posture.

Walking, and other sequences of behaviour such as speech, are learned gradually over years. Heel strike, which marks efficient bipedal locomotion, takes years to develop.

Learning to use words as symbols requires repetition of the word and an internal coding of it as a pattern motif. The characteristic of any pattern is that it contains repeating elements, and that repetition stimulates our memory in proportion to the strength of (i.e. our familiarity with) that pattern. The sequence of word order determined by a language’s grammar and syntax provides in essence a framework for creating patterns.

This ability to associate gestures with ideas and order them into sequences is present to a limited degree in our primate relatives. Wild bonobos combine call types together into longer mixed sequences upon finding a food cache. Other tribe members understand information about the quality of this find from these different sound sequences, and respond with appropriate types of foraging behaviour.

Bonobos (Pan paniscus) use four tonal calls when finding food (barks, peeps, peep-yelps, and yelps). These animals forage in dense forest. Upon finding a cache of fruit, their calls bring the group together to feed. Playing recordings of calls made upon encountering a good quality food find in a location where only a poor quality crop was available prompted these animals to forage as though in the presence of the better food source (Image: Wikimedia Commons)

Higher primates have a mirror neuron network which is triggered by facial expressions and grasping movements associated with obtaining food, although they cannot mirror mimed actions.

Why do we tell stories?

Communication involves transmitting and receiving a message that is understood the same way by sender and recipient. Human speech is the same, but is driven by the intention to share a meaning. As well as a repertoire of pre-arranged signals with agreed meanings, we need to have something to say. We tell stories therefore to communicate an intention. This means that our ancestors must have felt compelled to convey the contents of their thoughts to others.

This diagram is adapted from Claude Shannon’s paper, ‘A mathematical theory of communication’, produced whilst working for the Bell Telephone company in 1948. In conjunction with Warren Weaver, this idea was popularised into a book (The mathematical theory of communication), published in 1949. This is now known as the Shannon-Weaver communication model. In this model, information is transformed multiple times between the source and destination.
• An information source that produces a message.
• A transmitter that operates on the message to create a signal which can be sent through a channel.
• A channel, through which the message travels (and during which process it may be distorted by ‘noise’ interference or otherwise modified by the environment).
• A receiver, which transforms the signal back into the message intended for delivery.
• A destination, which can be a person or a machine, for whom or which the message is intended.
Messages are transmitted through the body by some means (for example the spoken word, mimed gestures or writing an email) and then through an external medium (e.g. the air, by sight, or cyberspace) until they are received by another human through their senses.
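
The stages of the Shannon-Weaver model listed above can be sketched as a chain of small functions. This is a minimal illustration under stated assumptions, not Shannon’s own formulation: the signal is assumed to be a list of bits, ‘noise’ is assumed to be a fixed chance of flipping each bit, and the function names are invented for this example.

```python
import random


def transmitter(message):
    """Turn the message into a signal: eight bits per UTF-8 byte."""
    return [
        (byte >> shift) & 1
        for byte in message.encode("utf-8")
        for shift in range(7, -1, -1)
    ]


def channel(signal, noise=0.0, rng=None):
    """Carry the signal, flipping each bit with probability `noise`."""
    rng = rng or random.Random(0)  # seeded so that runs are repeatable
    return [bit ^ 1 if rng.random() < noise else bit for bit in signal]


def receiver(signal):
    """Turn the received signal back into a message for the destination."""
    data = bytes(
        int("".join(str(bit) for bit in signal[i:i + 8]), 2)
        for i in range(0, len(signal), 8)
    )
    # Damaged bytes are replaced rather than raising an error.
    return data.decode("utf-8", errors="replace")


def communicate(message, noise=0.0, rng=None):
    """Source -> transmitter -> channel -> receiver -> destination."""
    return receiver(channel(transmitter(message), noise, rng))


print(communicate("Scissors cut paper"))
print(communicate("Scissors cut paper", noise=0.05))
```

With no noise the destination receives the message exactly as sent; as the noise level rises, the received message drifts away from what the source intended, even though every stage has followed its own rules.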

We speak to communicate the contents of our minds. It is perhaps then the patterns of our thinking, rather than our speaking, which are the truly unique feature of human communication. Our ability to combine ideas into a syntactic structure and create new associations demarcates a boundary between our thinking and that of our closest relatives, the chimps.

Captive chimpanzees and bonobos can be taught to understand some signed words or use pictorial symbols, and the most accomplished of these learners can combine certain of their vocabulary into ‘small clause’ forms, for example ‘agent + action’ or ‘action + object’.

Ljiljana Progovac argues that these ‘small clauses’ are the basic units of word combination that all languages share. She proposes these forms as the ‘proto-syntax’ from which our more complex structures have evolved. If this is correct, then this suggests that the most basic components of our ability to combine words into language were present in our common ancestors with chimpanzees.

A bonobo (Pan paniscus) at Cincinnati Zoo, communicates using a gesture. Arguably the most accomplished of the language trained chimps was the male bonobo Kanzi, raised by Sue Savage-Rumbaugh and others at Georgia State University. Kanzi learned to use some 400 words, using a board of pictorial (abstract) symbols. He regularly used around 30 to 40 of these in a typical day, and combined them occasionally into pairs. However, although Kanzi was able to understand simple phrases (such as ‘subject + verb + object’), could follow simple instructions, and could identify animals and humans by name, he did not use these language structures to talk about himself (Image: Wikimedia Commons)

The way we acquire language reveals the process by which we develop a sense of self-identity. Our children’s awareness of ‘self’ and ‘other’ is expressed in their language, although it is not dependent upon it. An understanding that others share similar experiences (known as a ‘theory of mind’) develops gradually during their first five years. The ability to organise words into small clauses such as ‘scissors cut’, or ‘cut paper’ appears in human children at between 1 and 2 years of age.

The more accomplished language learning primates, such as the bonobo Kanzi, associate sounds and symbols with objects and even seem to understand abstract concepts such as ‘happy’. Bonobos at the Georgia State University primate language research project use a lexigram (symbols board) to create two-word small clauses. None of these animals, however, has ever attempted a self-description such as ‘I think…’, ‘I feel…’ or ‘I want…’. This ability develops in human children between 2 and 3 years of age. The bonobo’s use of human symbol-based language, in contrast, arrests at a stage roughly equivalent to that of a human child of around two years.

Adults and children pretending to be bears and following a trail of ‘footprints’ (Image: Wikimedia Commons)

All young mammals play, but perhaps the most intriguing part of human communication is that play is incorporated into our information sharing. The simple story of scissors, rock and paper is fiction in that the items do not need to exist to be ‘present’ and interact in the game. Fiction captivates our attention and holds it far more easily than factual narratives.

Within a story, our thoughts project information based upon past experience into the future, allowing us to ‘play out’ the actions in drama or mime before attempting the task. Studies on primate mirror neuron responses report that these animals typically do not respond to such mimed gestures.

The act of telling stories shifts our ecology into the ‘cognitive niche’; we operate in a world of endlessly combined and recombined ideas. Emotions are the source of meaning for these ideas. Perhaps our evolved story mechanism can be considered as a means of evolved emotional language, conveying higher orders of meaning and inferred understanding to our experiences. This combining of ideas, putting one into another to make a new meaning which is different from that of the ideas on their own, is known as ‘recursion’. It is considered to be a defining characteristic of our capacity for creativity in our thoughts.

In summary then, our ‘story mechanism’ translates thoughts into actions, and enables us to bring new things and events into being. Coding information in story form makes it possible to share past experiences, ‘reword’ them into new sequences, and by communicating with others, project these imaginings into the future.

The West Tofts handaxe (also called a ‘biface’) is believed to date to around 400 thousand years ago. This particular handaxe is intriguing because it has a shell fashioned into one side. Given that the cortex (the outer layer of the stone) around the fossil is intact, it seems likely that it was intentionally fashioned into the centre. This is significant because it could represent one of the earliest examples of an appreciation of aesthetics which extends beyond utilitarian function (Image: Reproduced by the permission of the University of Cambridge Museum of Archaeology & Anthropology. Accession no. 1916.82)

It is clear that our hominin ancestors developed neural pathways that allowed them to copy and learn movement sequences, including vocal movements, and used these skills to share their intentions with others. What is less clear is how they came to understand that others had minds like their own, prompting their yearning for connection.

Whatever its cause, however, an expansion of conscious awareness drove our ancestors to share their ideas and understanding, and to begin to tell each other their story.

Conclusions

Our speech comprises a complexity of coded sounds that we are able to order and organise. This organisation has ‘rules’ that a listener can use to perceive and translate what they hear into meaning.
Using speech-based language allows us to put our thoughts in order, and then control our actions in a defined and directed way. Neurologically there is no difference between using manual tools and word tools in an ordered sequence; the brain codes both of these as a set of gesture-based motor movements.
Some birds and other animals are able to learn complex sound sequence patterns, mostly as a display signal for sexual selection. In contrast, human speech and other forms of communication are gender balanced. We use our communication sequences to bond socially and carry and transmit collectively held ideas.
The behavioural choices of other animals and birds are the result of reactions to their circumstances. Humans put their thinking into words, and voice their intentions.
There is nothing biologically unique about the behaviours that allow us to speak. Our language function suggests instead a greater level of connection between these abilities than is found in other animals. This has allowed us to assume fine control over the movements that produce vocalisations and other actions and, by implication, over our thoughts. This may be a result of the physical changes needed to allow our ancestors to walk upright.
What human language enables us to express is a sense of self-identity; it is a means of defining ‘our story’.
