Evolution of language

From Slow Like Wiki
Revision as of 17:39, 16 February 2025 by 193.16.224.14 (talk)

These notes are initially drawn from "The Origins of Language" by James R Hurford

The Prehistory of a Very Special Ape

  • 7m years ago - the line leading to humans split off from that leading to bonobos and chimpanzees
  • 4-2m years ago - Australopithecus is the first habitually bipedal ape. Bipedalism allowed us to separate the rhythm of breathing from that of walking and running and freed hands for meaningful gestures
  • 2.5-1.5m years ago - Homo Habilis (handy man) was the first to make stone tools. This indicates patience, postponement of gratification, a mind capable of foresight into future needs, and constructive planning.
  • 1.5m years ago - Homo Erectus, a tall, robust ape, made more complex tools and may have had a "protolanguage", a meaningful learned vocabulary but no grammar - just words strung together. They made the first migration of hominins out of Africa.
  • 1m years ago - Homo Erectus can use controlled fire, which allows cooking and a reduction of teeth and gut size, which may have freed up resources for bigger brains. These are the first hunter-gatherers, living and working to hunt and forage cooperatively in small groups.
  • 500,000 years ago - We diverge from Neanderthals
  • 500,000 years ago - Full language? Earliest estimate
  • 300,000 years ago - Homo Sapiens emerges
  • 170,000 years ago - Homo Sapiens starts wearing clothing - a significant moment in the emergence of culture, as clothing carried information about the status of individuals
  • 140,000 years ago - Mitochondrial Eve (the most recent woman that all living humans share as an ancestor on their mother's side) and Y-chromosome Adam (the most recent male ancestor that all living men share on their entirely paternal line) are alive.
  • 100,000 years ago - Homo Sapiens comes out of Africa
  • 70,000 years ago - Settles Asia, Australia
  • 40,000 years ago - Settles Europe
  • 40,000 years ago - Complex behavior - more refined and task-specific tools, carved and painted art
  • 30,000 years ago - Bows and arrows, flutes
  • 15,000 years ago - Settles Americas
  • 5,000 years ago - A single mother language of the Germanic, Romance, Slavic, and Indic languages existed on the boundaries of Europe and Asia.

If any two individuals living today research their family trees, they will find at least one person in common in both trees who was alive at this time.

Brain structure and the evolution of language:

  • The principal parts of human brains are structurally similar to other ape brains
  • Brain size correlates well with the typical size of a social group, with the occurrence of tactical deception, and the complexity of a communication system
  • All animals communicate but only humans have the elaborate learned systems that we call language
  • There are over 20 major language families on a scale with Indo-European and no proposals as to how they may be historically related
  • The rise of human language may have been very fast, but it's hard to imagine it taking less than a few centuries

Nature, Nurture, and Language

  • The 'FOXP2' gene is implicated in language
  • Humans constructed for themselves a "symbolic niche", and language-facilitating genes went to fixation in the human population
  • Language has undergone a process of "grammaticalization", which makes it more complex
  • Social centripetal forces make languages, and people in the group try to conform to the language norms of the group and to show "competence" in using the language
  • Humans, uniquely, are able to introspect about their own behavior, to talk about talk, and to think about thinking.
  • The language of a community is inevitably fuzzy at the edges, because nobody conforms 100% to the norms. But a core of norms, for any language, does exist, and individual speakers tacitly know them. 'Tacit knowledge' is an acceptable description of what is in speakers' heads, making them behave in regular conforming ways.

Instinct and Learning

  • Some behavior is clearly instinctive and some is obviously learned.
  • Instinctive behavior ranges from simple to relatively complex
  • Emotions, such as fear and anger, are themselves instinctive
  • Learning is acquiring different behaviors in response to events in the environment, plasticity.
  • First-language learning is reasonably called 'involuntary' (or instinctive), and adult attempts to learn new languages, 'voluntary'.
  • There are involuntary drives that put one in a state ready for learning. Babies instinctively coo and babble, children have an instinct to follow eye gaze and pointing and a 'mind-reading' instinctive understanding of the goals of others (theory of mind?).
  • No learning is bias-free and any bias affecting learning is itself instinctive. Children are disposed to attend selectively to objects and we have an instinctive bias to seek sense in what people say to us and a bias toward learning complex grammar.
  • Humans have evolved to be the most plastic species.
  • A behavior not at first automatic can be learned in a deliberate way and practised until it becomes 'second nature'. Then it is reasonable to talk of a learned instinct - something learned becomes instinctual.
  • We can consciously learn to suppress or inhibit instinctive behavior as in processes of socialization.
  • Social learning is based on copying the behavior of others while non-social learning is solo trial and error.
  • Good solutions to practical tasks can be discovered by non-social learning, and then spread socially.
  • Evolving systems need innovation, inventions of new social behaviors that others can then replicate. Such inventions need not be conscious.
  • With a disposition to imitate, one can socially learn behaviors that have no obvious practical benefit. This instinct to imitate, not necessarily with any insight into meaning, is the driver for children learning language. They gain social reinforcement for playing the community's language games.

Iterated learning:

  • We assume that semantically compositional language was preceded by a 'protolanguage' stage, with meaningful words but with no grammatical organization.
  • While in simulations, compositional syntax emerges gradually in stages, real people have a disposition to impose order on chaos, and over successive generations, order emerges.

How Trusted Talk Started

  • All animals communicate in some way, but here we're talking about behavior that influences the behavior of others, where they respond as if recognizing a communicative intention on the part of the sender of the signal.
  • Animals have ritualized behavior, like the teeth-baring of dogs or birds' courtship behavior.
  • Dyadic communication is when animals do stuff to each other, while triadic communication is communication about something, and triadic communication in animals shows the evolutionary seeds of reference in language.
  • Whenever we talk to each other, we intend to do something to our hearer.
  • What human language added to animal communication was huge potential for joint engagement of speaker and hearer with situations beyond themselves
  • In the lineage of humans, and apes generally, there has been a progressive shift toward more learned behavior. Instinct never goes away completely, of course, as learning is always guided by some instinctive bases.
  • Children learn ritual behavior, such as raising their arms when they want their mother to pick them up.
  • In the evolution of the human capacity for language, there was a transition from purely innate instinctive communication to learned conventions over which the communicators have a high degree of voluntary control.
  • Spontaneous smiles and deliberate smiles are initiated by different parts of the brain.
  • Mutual communicators need to be able to maintain joint attention to whatever is being communicated about for at least as long as the communication lasts - there seems to be a co-evolutionary spiral between an increasing capacity for joint attention and increasing communicative success with language.
  • In the history of languages, there is a trend for frequently made inferences to become conventionalized.
  • Language signals are cheap to emit, though fairly costly to learn.
  • Humans are simultaneously in cooperation and competition with other members of their social group. We walk a delicate tightrope, maintaining trustful reciprocal cooperative relationships while also making sure we are not taken advantage of and get our fair share of resources.

One for all, and all for one

  • The basis of human language is a disposition to communicate cooperatively.
  • Humans are conspicuously altruistic.
  • Tit for tat - Help those who have helped you, and anyone you meet for the first time, and don't help a person who declined to help you in the past. It is successful because of its built-in memory of past collaborators and non-collaborators.
  • Reciprocal altruism in a group requires certain advanced cognitive traits including memory for past good and bad deeds, some way of recognizing members of your own group, and mechanisms for detecting and punishing cheaters who take the benefits of group membership without paying their dues by occasional altruism.
  • As the expressive power of language evolved, so did its potential strength in forming social groups reaching beyond the bounds of close kin and humans have ruthlessly outcompeted other species with less cohesive group action.
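
The tit-for-tat rule described above is simple enough to sketch in a few lines of Python. This is only an illustrative sketch: the names `TitForTat`, `play`, `always_decline`, and `reciprocator` are invented for this example, and it assumes the agent remembers only each partner's most recent behavior, which is one common reading of the rule rather than something these notes specify.

```python
class TitForTat:
    """Helps strangers, then mirrors what each partner did last time."""

    def __init__(self):
        # Assumption: memory holds only the latest move of each partner.
        self.partner_helped_last_time = {}

    def decide(self, partner):
        # Unknown partners get the benefit of the doubt (help first).
        return self.partner_helped_last_time.get(partner, True)

    def observe(self, partner, partner_helped):
        self.partner_helped_last_time[partner] = partner_helped


def always_decline(agent_previous_move):
    """A cheater: takes help but never gives any."""
    return False


def reciprocator(agent_previous_move):
    """Another tit-for-tat player: helps first, then copies the agent."""
    return True if agent_previous_move is None else agent_previous_move


def play(agent, partner_name, partner_strategy, rounds):
    """Return a list of (agent_helped, partner_helped) pairs, one per round."""
    history = []
    agent_previous_move = None
    for _ in range(rounds):
        agent_move = agent.decide(partner_name)
        partner_move = partner_strategy(agent_previous_move)
        agent.observe(partner_name, partner_move)
        history.append((agent_move, partner_move))
        agent_previous_move = agent_move
    return history
```

Against `always_decline` the agent helps once and then stops, while against `reciprocator` both sides keep helping in every round. The per-partner dictionary is the "built-in memory" the notes credit for the strategy's success.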

Concepts Before Language

Meaning is no mystery:

  • Most of the time, we try to say what we mean. And our hearers try to work out what we mean from the stream of sounds that hit their eardrums.
  • When a person detects a clash of meanings in an anomalous sentence like 'He buttered his toast with shoe-leather.', there is a so-called N400 effect, a pattern of electrical activity in the brain, which may help with understanding how the brain stores and processes meanings (neurosemantics).
  • The things in the world that we talk about are our common point of reference when we communicate (shared truth).
  • The relationship between words and things (ie meaning) is indirect, mediated by concepts in the heads of language users:
    • Linguistic entities - such as words and sentences
    • Mental entities - concepts that are constructed somehow in the brain and are the links between language and the world.
    • Worldly objects and relations - such as dogs and clouds and eating and 'being higher than'.
  • Only humans have words that they can attach to concepts.

Beyond here and now:

  • How did full human concepts come into being?
  • We needed to develop a systematic response to classes of things in the world.
  • Some things we encounter are 'significant' or 'meaningful' to us and we react in a systematic way to them, like a frog reacting to a fly
  • For the frog, the impulses go straight through from perception to motor response without stopping to register on any long-term memory - this is just an instinctive reaction, uncontrollable
  • The frog may have a 'percept' of a fly, a fleeting conscious experience of it, and this is the first step on the evolutionary road to concepts.
  • There are no pictures or symbols in the head, just electrical and chemical effects
  • More complex representations are roughly the sum of other simpler representations.
  • This all starts as 'cued' representations, fleeting reactions to current input from the senses, though not so ephemeral as to have no effect on an animal.
  • In humans we can think of two types of representations or thinking:
    • Online thinking/cued representations - depend for their existence on long-term potentials to fire in certain ways consistent with certain types of external input - instinctual reactions
    • Offline thinking/detached representations - can be triggered internally, in a way detached from any external stimulus.
  • While animals can have some sense of object permanence, remembering an object that has just disappeared into the grass, humans can remember events for much longer, and they also remember a far wider class of events, not just involving food. This reflects humans' generally greater curiosity about the world, not directly related to survival or reproduction.
  • When we look at something, we are using two different brain mechanisms:
    • Dorsal stream - "where pathway" Attend to a location in space and
    • Ventral stream - "what pathway" Register the properties coming from it
  • Many animals that live in groups can remember, and systematically respond to, a large number of other individuals in their group (friend vs foe).
  • Any animal that remembers individual things in their absence must form an index, an internal pointer associated with all the properties of the remembered object, keeping it apart from other remembered objects.
  • Many species not only remember individuals, but can also remember past events and plan future events.
  • The next step in evolution toward full human-like concepts is a capacity to store, perhaps for only a short time, some memory of an experienced event, and act in response to that inner representation, without immediate stimulus from outside.
  • Episodic memory for very distant events has been called 'mental time travel', and only humans can remember details of things that have happened to them years before.
  • Episodic memory in animals for events is not as strong as memories of particular individuals, suggesting that objects, especially group members, are more easily and permanently remembered.
  • Rats dream and relive their waking experiences while asleep. They even have:
    • Retrospective memory - for mental representations of past experience, and
    • Prospective memory - for representations used in planning future actions

Going Public with Thoughts:

  • All communication by non-human animals is about the here and now, never about things distant in time or place. Only human language allows "displaced reference".

More Abstract Thinking:

  • Apes can learn a task more quickly by realizing that it is, in some sense, the opposite of a previous learned task. This suggests that they have stored the first task as a rule and are able to apply a reversing or oppositeness rule to the stored rule.
  • Survival depends on classifying somewhat dissimilar things into classes, so that all members of a class can be treated the same, so same/different judgements are useful.
  • Alex, a talking parrot, used concrete terms as props for his thinking out abstract answers. Animals have kinds of abstract 'proto-concepts'.

We Began to Speak, and to Hear Differently

  • Other animals have fairly complex mental representations of the world
  • The foundations of the motivation to communicate are mind-reading, trust, and cooperation.
  • By having both, we have something to communicate and a motivation to communicate it

Human and Non-Human Vocal Anatomy

  • All parts of the human vocal tract are exapted (evolved from traits serving other functions)
  • The larynx, or 'Adam's apple', sits on top of the trachea (windpipe) and houses the vocal cords, whose vibration produces the basic buzz of the voice
  • The pharynx provides an extra shape-shiftable chamber through which the air passes.
  • Because they have larynxes that are higher up, other primates cannot produce the range of different vowels that humans can make, and which are so crucial in conveying meaning. Here is natural selection for more effective communication
  • The L-shape of the human vocal tract, which enables us to make vowels of different qualities is unique.
  • Our vocal tract had evolved to its modern shape already in Homo heidelbergensis, over 500,000 years ago.
  • Human breathing is remarkably controlled. Other animals have in-breaths and out-breaths of roughly the same duration. When speaking, we have 90% exhalation with only about 10% of time saved for quick in-breaths.
  • Also, when walking, we do not maintain any close coordination between our paces and our breathing

Hearing Speech

  • The range of sounds that we produce when speaking lies within the range of sounds the human ear can detect.
  • When a normal adult human hears speech, a train of events occurs, penetrating further and further into the head from the outside. The early processes are mechanical and the later ones neurological or electrical
  • We need to draw a line between speech perception, the delivery of phonological units such as phonemes, tones, rhythm, and intonation patterns, and subsequent lexical and grammatical processing, which interprets the input as words and decodes sequences of words into their meanings
  • Conversion of acoustic waves to information for the brain happens in the middle and inner ear. The cochlea is a complex spiral structure with thousands of tiny hairs which respond to different frequencies in the incoming vibrations.
  • Harmonic sounds like the ringing of a bell are relatively simple, while more complex sounds, like consonants in speech, involve vibrations at many different frequencies, not spaced out in the neatly arithmetical way of higher harmonics
  • Between 2 kHz and 4 kHz the typical young adult human ear can detect sounds down to a quietness of 0 dB

Recognizing Speech

  • Much of human language processing happens in the left hemisphere
  • We engage in 'auditory scene analysis', identifying which sounds come from which objects in the surroundings, and this begins the sorting of speech from non-speech
  • We have routinized the singling out of a human voice from other sounds
  • All cultures have music and probably music and language share some common processing but also have their own dedicated mechanisms

Coining Words

  • Language-ready children learn vocabulary voraciously before they begin to make grammatical utterances several words long.
  • Protolanguage is vocabulary with no grammar

First Words

  • First words were probably used for dyadic purposes - Hi, Sorry, Thanks, Ugh, Phew
  • Deictic words are rooted in the situation in which they are used - this, that, you
  • Conceptual words - dog, mountain, house - do not necessarily refer to something present
  • The essence of communication is to relate present situations to past experience
  • Probably the earliest forms of language had both deictic and conceptual words
  • Holistic languages use words to identify a whole situation - man-give-meat-to-woman
  • Atomistic languages use words to identify single objects or actions - man, meat, give - more likely
  • Children tend first to learn atomistic words to identify objects in episodes where both child and caregiver pay joint attention to the same object
  • We believe that children make "Whole Object Assumptions" about words, where they distil out names for specific object types - teddy, apple - in "Cross-Situational Learning".
  • They also have a "Taxonomic Assumption" where they expect words to classify the entities they see in the world.
  • The majority of the first 100 words a child learns refer to types of object
  • We assume that the earliest words were a mix of dyadic (hello), deictic (this), and a lot of individual object words (child, cave, bear), and that at some stage, words began to have affective connotations (urinate vs piss)
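
Cross-situational learning, as mentioned above, can be pictured as narrowing down the possible referents of a word each time it is heard. The following is a minimal sketch under strong simplifying assumptions: every episode is reduced to one word plus the set of whole objects in joint attention, and the function name and toy episodes are invented for illustration.

```python
def cross_situational_learn(episodes):
    """Map each word to the objects present in every episode where it was heard.

    episodes: iterable of (word, objects_in_view) pairs.
    """
    candidates = {}
    for word, objects_in_view in episodes:
        seen = set(objects_in_view)
        if word in candidates:
            # Keep only the objects that were also present on earlier hearings.
            candidates[word] &= seen
        else:
            candidates[word] = seen
    return candidates


# Hypothetical episodes: the word the caregiver says, and what is in view.
episodes = [
    ("teddy", {"teddy bear", "cup", "blanket"}),
    ("apple", {"apple", "cup"}),
    ("teddy", {"teddy bear", "chair"}),
    ("apple", {"apple", "blanket", "teddy bear"}),
]
learned = cross_situational_learn(episodes)
```

After these four episodes only the teddy bear has been present on every hearing of "teddy", and only the apple on every hearing of "apple". Treating each candidate as a whole object, rather than a part or a property, corresponds to the Whole Object Assumption.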

Visible Gestures or Audible Speech

  • Spoken languages have probably been around for over 100,000 years
  • Writing emerged about 5,000 years ago
  • Language is a system independent of the medium
  • The meanings of verbs for different types of movement in spoken languages are usually not transparent, but in sign languages some iconicity has been retained.
  • There are also instinctive facial expressions as for pleasure or disgust
  • Humans are predominantly right-handed, and the left hemisphere also houses the major sites of language processing, like Broca's and Wernicke's areas.
  • In humans there is still some overlap between the brain's responses to meaningful speech and meaningful gestures
  • A transition from mainly gestural language to predominantly spoken language could have been gradual.
  • The spoken medium has several advantages, once it is up and running:
    • Can be used in the dark
    • Addressed to people behind you, around corners
    • Frees up the hands to do other things
    • Not obviously practically useful for anything other than communication - ie signalling that I am signalling

Articulate Sounds Emerge

  • There are just over a hundred phonetic 'segments' recognised by the International Phonetic Association in their alphabet, which cover all the sounds in all the world's languages
  • Speech sounds have evolved to be relatively easy to produce as a speaker and to distinguish as a hearer, all in a continuous joined-up stream of sound
  • Speech is like the coordination of a small orchestra: each instrument in the vocal tract - lips, jaw, tip, blade, body, and back of the tongue, velum (soft palate), vocal cords, and larynx - must work together in controlled ways to produce articulate speech
  • All the languages in the world long ago settled into using fairly stable subsets of the sounds listed on the IPA chart
  • Populations of communicators need to find stable points in the phonetic landscape, conveniently produced and reliably recognized combinations of the articulators, as building blocks for their vocabulary
  • Words in languages are formed so that sounds are as distinct from each other as possible along the chain axis
  • Consonants are easier to distinguish when they are surrounded by vowels
  • The most basic syllable structure is a single consonant followed by a single vowel - CV
  • A mouth opening gesture accompanied by vocalization is one of the simplest things the vocal tract can do
  • We can think of speech as a sequence, quite rhythmic, of successive openings and closings of the mouth - primitive syllables - onto which modern speech has superimposed exquisite levels of differentiated control
  • Children start with CV (Doggie = CVCV), which is easier than Dog (CVC) and aphasics may regress to it:
    • First CV (Papa, Mama)
    • Then V (Oh) or CVC (Dad)
    • Last VC (am) or CCV (Bra) or CVCC (past)
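
The syllable shapes listed above can be computed mechanically once a word is written as a sequence of sounds. This is a rough sketch only: it assumes a toy vowel inventory of five letters, takes words already split into syllables, and uses made-up broad transcriptions such as "do-gi" for Doggie. The names `shape`, `word_shape`, and `acquisition_stage` are invented for the example.

```python
VOWELS = set("aeiou")  # assumption: a toy vowel inventory, one letter per sound

# Order of acquisition as listed in the notes above (1 = earliest).
STAGE_OF_SHAPE = {"CV": 1, "V": 2, "CVC": 2, "VC": 3, "CCV": 3, "CVCC": 3}


def shape(syllable):
    """Return the consonant/vowel skeleton of one syllable, e.g. 'dog' -> 'CVC'."""
    return "".join("V" if sound in VOWELS else "C" for sound in syllable)


def word_shape(syllables):
    """Return the skeleton of a whole word given as a list of syllables."""
    return "".join(shape(syllable) for syllable in syllables)


def acquisition_stage(syllables):
    """A word is as late as its latest-acquired syllable shape."""
    return max(STAGE_OF_SHAPE[shape(syllable)] for syllable in syllables)
```

With these assumptions, `word_shape(["do", "gi"])` gives `"CVCV"` and `acquisition_stage(["do", "gi"])` gives `1`, while `acquisition_stage(["dog"])` gives `2`. Syllable shapes not in the list above raise a `KeyError`, since the notes do not say where they fall.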

Groups converge on arbitrary signs

  • How does a vocabulary emerge from nothing?
  • Pairings between meanings and forms are arbitrary
  • Some naturally occurring 'synaesthetic' connections between sounds and the properties of objects may have been exploited:
    • a tendency for words describing small size, weakness, lightness, thinness to use a high front vowel such as [i] in bee
    • low or back vowels (man, moon) are used for large size, heaviness, strength
    • near things have high front vowels
    • far things have vowels made with a lower tongue position
  • Forms get eroded and stylized for ease of use, and this takes them further from any original natural (eg onomatopoeic) connection with their meaning
  • With a will to communicate and the necessary memory and mental processing capacity, getting a shared code up and running is apparently not difficult
  • Once it becomes possible to learn large numbers of arbitrary meaning-form connections, the way is open for the vocabulary to expand enormously.
  • Modern humans can store tens of thousands of words, whole pairings of meanings with forms, and this is a uniquely human capacity.
  • We don't need focused training, we just absorb new words like a sponge

Words affect thought

  • The very application of a public label will apparently affect how a child mentally sorts the things they are dealing with.
  • The overt use of a word can draw attention to an aspect of a practical problem that might otherwise be overlooked
  • Simply knowing the words left and right enhances a child's ability to carry out a searching task, even if the task doesn't mention them
  • Words in a public language act as a mental prop, helping us, and in harder cases even enabling us, to think more abstract thoughts, suggesting a co-evolutionary spiral between the rise of public language and the capacity for more complex thought
  • Originally private concepts, once attached to a public label used by other people, are no longer completely one's own. They become standardized and tweaked and refined in subtle ways.
  • Concepts preceded words in evolution, but once words enter the scene, words and thoughts become intertwined
  • Concepts of basic emotional facial expressions such as anger and disgust are not affected by words and there is good translatability for them between languages.
  • But other concepts are specific to certain cultures. There is no exact German word for English 'kindness', and no exact Arabic word for English 'interesting'.

Building Powerful Grammar Engines

  • Grammar is like your digestive system or breathing - complicated and unconscious
  • Any system with rules for combining elements has a syntax
  • Phonemes don't have meanings, but there are rules of combination of phonemes - you can't string them together in any conceivable order. Only the morphosyntactic level (combinations of words and affixes) is semantically important.
  • Having syntactic structure at two levels, one semantically compositional and the other not, is a characteristic of all languages and is called 'duality of patterning'.
  • It is functionally efficient to have these two levels of patterning. We can only consistently produce a limited inventory of separate speech sounds and our ears can only detect distinctions down to a certain level of subtlety. So if we can put these sounds into memorized sequences, and have enough memory capacity to store thousands of such sequences (ie words), that is an efficient solution for the task of expressing vast numbers of meanings, given a semantically compositional syntax.
  • The rules for combining the basic elements into strings must offer different options for plugging substrings into an overall string. Human languages are extremely productive. They have tens of thousands of meaningful elements, the words and affixes, and they combine these very freely so that in principle billions of different sentences are possible.
  • Syntax of human languages is best seen as putting 'constructions' together, sometimes side by side but more often one inside another, and sometimes even interwoven in more intricate ways.
  • The simplest constructions are single words.
  • Hierarchical embedding of constructions within each other in human languages is almost entirely motivated by semantics, ie by matters of meaning.
  • Long-distance dependencies are a unique characteristic of human languages. We are able to take in a string of words and hold some of them in working memory waiting to be resolved by later incoming bits of the sentence.
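
The efficiency argument for duality of patterning above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are invented, the inventory sizes (18 consonants, 5 vowels, a 10,000-word vocabulary) are assumptions chosen for the example, and counting every word string gives an upper bound, since most strings are not grammatical sentences.

```python
def cv_word_forms(n_consonants, n_vowels, n_syllables):
    """Number of distinct word forms built from n_syllables CV syllables."""
    syllables = n_consonants * n_vowels
    return syllables ** n_syllables


def word_strings(vocabulary_size, length):
    """Upper bound on the number of word sequences of a given length."""
    return vocabulary_size ** length


one_syllable = cv_word_forms(18, 5, 1)    # 90
two_syllables = cv_word_forms(18, 5, 2)   # 8100
three_syllables = cv_word_forms(18, 5, 3) # 729000
three_word_strings = word_strings(10_000, 3)  # 10**12
```

A small, reliably distinguishable sound inventory therefore yields far more memorizable forms than a vocabulary of tens of thousands of words needs, and combining those words in turn yields billions of possible strings.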

Did humans start by singing like birds?

  • A typical simple sentence contains several noun phrases as members. Human sentences are organized with meaningful building blocks that are smaller than the whole but larger than the atomic elements, the words. These middle-sized chunks are phrases. Birdsong and whale song also have phrases or motifs.

Packaging messages in clauses and sentences

  • Trivial compositionality, where two or more meaningful words are strung together, likely preceded more complex forms
  • A clause has a main verb and between one and three arguments (usually noun phrases). Nothing else is allowed:
    • Intransitive verbs like sleep take only one argument, a subject
    • Transitive verbs like see take two arguments, a subject and a direct object
    • Ditransitive verbs like give take three arguments, a subject, a direct object, and an indirect object
  • Clauses describe events or states of affairs, involving some participants, which are typically objects including people
  • In surveying the world around us we take in a scene, perhaps analyse it, and then move on to taking in another scene. The number of tracked participants is limited to no more than four.
  • This is a 'minimal subscene' - the eyes typically flit between a very small number of locations in the scene, seldom more than four. The human perceptual apparatus packages the world into small units with up to four participants and this influences language structure.
  • Grammar may have started with simple juxtaposition - Man push. Woman fall. Then we started to mark the intended relation with some overt coordinating conjunction, such as "and"
  • Next step is to embed one clause inside another, using a subordinating conjunction, such as that, if, or because.

Making grammar

  • Noun and verb are the basic major word classes or syntactic categories in any language because the central function of communication is giving information about identified objects
  • In the most ordinary kind of simple sentence, the subject expression, with a noun at its core is the topic, and the predicate expression, with a verb at its core is the comment (Aristotle)
  • In modern languages, the most common position for the verb in a sentence is near the end, after its subject and object - SOV. English is SVO and Welsh is VSO. There is no known case of a language changing to SOV, suggesting that it is the earliest form.
  • Content words include nouns, verbs, adjectives, and adverbs
  • Function words are typically little words, unstressed in the stream of speech and very frequent in texts, signalling relations between parts of sentences: determiners (the, a, my, your), pronouns (I, me, you, she, them), various auxiliary verbs (have, is, can, will, do), coordinating conjunctions (and, or, but), and subordinating conjunctions (if, when, as, because). Prepositions (of, under) are a borderline case
  • Grammaticalization is a process by which a language gets more grammatically organized:
    • A language with few or no auxiliary verbs is less grammaticalized. Auxiliary verbs are often historically traceable to main verbs, eg "have" is still used as a main verb but has also become specialized as a function word indicating recent relevant pastness, and to indicate obligation (to have to). Similarly "can" is derived from "cunnan" to know, be acquainted with
    • Prepositions are derived from nouns or verbs.
    • Definite articles are often derived from demonstrative pronouns
    • Subordinators like which and who are derived from question particles
    • Indefinite articles often come from the numeral for 1
  • Languages started off without functional words at all
  • Inflections/conjugations give information about the person and number of the subject of the verb and also about the time of the event. Isolating languages like Mandarin have no word inflections at all. English is near the isolating end of the scale, and languages with little inflection tend to be stricter in their word order
  • In some languages, inflections were once isolated words that have been squashed into neighboring words, eg -ed may come from 'did'
  • Making an adverb out of an adjective is very productive
  • We see a two stage grammaticalization process:
    • From content words to function words
    • From function words to inflections

Civilization and grammar

  • Cooperation between people speaking diverse languages has tended to simplify grammar
  • Writing has allowed for the use of more complicated grammar, at least in quantitative terms
  • Writing uses the same constructions as spoken language, but pushes their combination further and to greater depths of embedding
  • People in everyday speech don't talk so complexly, using several sentences instead of one complex one

Pronunciation Gets Complex

The earliest vowel and consonant systems:

  • The emergence of syllables makes the first big division of speech sounds, between vowels and consonants - almost always, each syllable has at least one vowel and it is typically flanked by consonants.
  • You can't, generally speaking, replace a vowel with a consonant or vice versa

Vowels:

  • The range of possible spoken vowels is continuous, like the color spectrum. Each language carves up the vowel space in its own way. The most common partition is a five-vowel system, like Latin.
  • English, in its many dialects, is fairly extravagant in the vowels it uses. Standard Southern British English has fourteen different vowels: pit, pet, pat, putt, put, pot, peat, pa, bought, boot, pate, bite, quoit, pout.
  • Economy and distinctiveness are the key drivers of the evolution of vowel systems. Languages find a balance between maximum perceptual distinctiveness and minimum articulatory cost:
    • Don't have more vowels than you can realistically keep apart.
    • Make sure your vowels are different enough from each other to be usable in conveying distinctions of meaning.
  • There is a distinction between:
    • A primary type of vowel system, which exploits the more easily controlled parameters, such as front vowels with spread lips and back vowels with rounded lips, all oral (not nasalized) and without the complication of length contrasts.
    • Secondary features, such as nasalization, length contrasts, and less common combinations of tongue and lip position.
  • The vowel systems of the earliest languages were probably simple with i, a, and u.
  • Particular vowels emerge as targets in the context of the parallel rise of other vowels concurrently emerging. Each vowel is sensitive to the presence of the others in the system, keeping as far as possible from them
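
The idea that each vowel keeps as far as possible from the others can be illustrated with a small brute-force search. This is a toy sketch, not a model from the source: the seven candidate vowels and their (height, backness) coordinates are invented for illustration, the names `CANDIDATES`, `min_distance`, and `most_dispersed` are hypothetical, and the search considers only perceptual distinctiveness, ignoring articulatory cost.

```python
from itertools import combinations
from math import dist

# Assumption: abstract (height, backness) positions, 0 = high/front, 1 = low/back.
CANDIDATES = {
    "i": (0.0, 0.0),
    "ɨ": (0.0, 0.5),
    "u": (0.0, 1.0),
    "e": (0.5, 0.2),
    "ə": (0.5, 0.5),
    "o": (0.5, 0.8),
    "a": (1.0, 0.5),
}


def min_distance(system):
    """Smallest distance between any two vowels in the system."""
    return min(
        dist(CANDIDATES[first], CANDIDATES[second])
        for first, second in combinations(system, 2)
    )


def most_dispersed(n_vowels):
    """Pick the n_vowels candidates whose closest pair is as far apart as possible."""
    return max(combinations(CANDIDATES, n_vowels), key=min_distance)
```

With these made-up coordinates, `most_dispersed(3)` selects i, u, and a, the corners of the space, and `most_dispersed(5)` selects i, e, a, o, and u. The function needs at least two vowels, since a single vowel has no pair to measure.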

Consonants:

  • The acoustic/articulatory space of possible consonants is more complex and lumpily structured than the vowel space.
  • The same competing pressures of distinctiveness and ease of articulation play the most significant role in shaping the consonant systems of languages, with a particular emphasis on ease of articulation.
  • There are basic, elaborated, and complex consonants.
  • Basics are as follows:
    • p as in pip
    • t as in tit
    • k as in kick
    • glottal stop as in Cockney butter
    • b as in bib
    • d as in did
    • g as in gig
    • f as in faff
    • s as in sis
    • tʃ as in church
    • m as in mum
    • n as in nun
    • ŋ as at the end of sing
    • l as in lull
    • r as in roar
    • w as in why
    • h as in hi
    • j as in you
  • The weakening, or lessening of articulatory distinctiveness, happens often in the histories of languages, but the reverse process is very rare, if it happens at all
  • The only way h appears in languages is through a process of weakening from other more distinctive sounds
  • Fricatives, like the sibilant s, appear as a result of lenition (literally softening). As this process continues, voiceless fricatives like f, s, can become voiced to v, z
  • Lenition is common and fortition, an opposite process, from weak sounds to stronger sounds, is much rarer.
  • Some sloppiness in speech is universal in languages, motivated by minimization of effort, and is universally adjusted to by hearers. Communities strike a balance between ease of pronunciation and getting a message across.