24. The Probable Language Brain [2013, extended 2015]

Abstract: Let us sup­pose that you are a research lin­guist, tor­mented by some doubts and ques­tions about the state of your pro­fes­sion, and not con­strained by hav­ing to repeat a cat­e­chism of “known truths” to Lin­guis­tics 101 stu­dents, and not wor­ried about employ­ment tenure. How would you actu­ally go about tack­ling “the cen­tral prob­lem of lin­guis­tics”, namely how we acquire and main­tain knowl­edge of the prob­a­bil­ity of sys­temic rela­tion­ships in a lan­guage?

Here are two sim­ple prag­matic truths :

a) if you ask me the pro­duct of 9×8 I can tell you instantly : 72

b) if you ask me the pro­duct of 9×14 I have to cal­cu­late out each digit, then remem­ber to add the results. It is slow and I might eas­ily make a mis­take. That is because in my pri­mary school they only made us mem­o­rize up to 12×12.

The first act, a) is per­formed cour­tesy of my pro­ce­du­ral mem­ory and as a pro­duct of a phys­i­cal neu­ronal rela­tion­ship. (Pro­ce­du­ral mem­o­ries are rou­ti­nes acquired by prac­tice until they become sub­con­scious, such as the skill of dri­ving a car. Psy­chol­o­gists would prob­a­bly call the neu­ronal rela­tion­ship some kind of “long term mem­ory”). I am unlikely to ever for­get the answer to 9×8, but grow­ing that asso­ci­a­tion was hard. It took a lot of child­hood prac­tice.

The sec­ond act, b) is per­formed by the con­scious appli­ca­tion of rules I have learned. Delib­er­ate mul­ti­pli­ca­tion and addi­tion seems to take place in a work­room next to my declar­a­tive mem­ory. (Declar­a­tive mem­o­ries are learned facts acces­si­ble to con­scious recall. Psy­chol­o­gists would prob­a­bly call the work­room “short term mem­ory”). On a bad day I might stum­ble try­ing to apply the rules of arith­metic. Large num­bers of peo­ple never become any good at it.
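The contrast between the two acts can be caricatured in a few lines of Python. This is only an analogy, not a claim about how neurons store anything. The names `TIMES_TABLE`, `recall` and `long_multiply` are invented for the illustration: a dictionary stands in for the memorized 12×12 table, and a digit-by-digit routine stands in for consciously applied rules.

```python
# A dictionary stands in for the memorized table (an analogy, not a brain model).
TIMES_TABLE = {(a, b): a * b for a in range(1, 13) for b in range(1, 13)}


def recall(a, b):
    """Act a): a single lookup. Returns None if the pair was never memorized."""
    return TIMES_TABLE.get((a, b))


def long_multiply(a, b):
    """Act b): multiply a (1 to 12) by each digit of b, shift by place value, then add."""
    total = 0
    for place, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        if digit == 0:
            continue  # a zero digit contributes nothing
        total += recall(a, digit) * 10 ** place
    return total


print(recall(9, 8))          # 72, straight from the table
print(recall(9, 14))         # None, because 14 is beyond the memorized range
print(long_multiply(9, 14))  # 126, built from 9×4 and 9×1 shifted one place
```

The second routine has more steps than the first, and each step is a chance to slip, which is the point of the comparison.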

In one way, my knowl­edge of a) is some­what sim­i­lar to my knowl­edge of my native lan­guage. I don’t have to sit there try­ing to apply “gram­mar rules” before I can talk. Rather, the flow of words, like the result of mul­ti­ply­ing 9×8 emerges instantly.

How­ever, the math­e­mat­i­cal out­come of mul­ti­ply­ing 9×8 is 100% cer­tain. In this, the behav­iour of nat­u­ral lan­guage is rather dif­fer­ent. Vis­i­ble or audi­ble lan­guage deals with long lists of tokens (mor­phemes, words) added together. Although these lan­guage lists seem lin­ear, it need not fol­low that their actual cre­ation or inter­pre­ta­tion is a strictly lin­ear process. In fact I will argue that such strict lin­ear­ity is unlikely. A nicer metaphor than ele­men­tary math or for­mal logic might be to think of a child wak­ing and stum­bling to the bath­room. The fam­ily is dis­turbed, lights go on, then grad­u­ally a whole city wakes up as all kinds of rou­ti­nes and dis­tur­bances more and more densely prod each other into life. These events are broadly pre­dictable, but the details always vary. The math­e­mat­ics describ­ing such processes would be more akin to com­plex­ity the­ory than arith­metic. In other words, the rela­tion­ship between a par­tic­u­lar word occur­ring in an utter­ance and all the words which came before it is rarely 100% cer­tain in either pro­duc­tion or decod­ing. Rather that rela­tion­ship is a prob­a­bil­ity. That prob­a­bil­ity is influ­enced by two dif­fer­ent kinds of things : one inter­nal to mem­ory, and one exter­nal to “the world” (the con­text of the talk­ing or writ­ing).

The power of the per­sonal mem­ory in word rela­tion­ships is very strong. It is called the col­lo­ca­tional prob­a­bil­ity. That is, when we hear the word stream “We never knew what he was going to …”, as native Eng­lish speak­ers we know that the fol­low­ing words will prob­a­bly be “do next” or “say next”, even though “pro­pose” or even “ingest” would be per­fectly gram­mat­i­cal in the sequence. Know­ing a lan­guage means in some way know­ing the prob­a­ble rela­tion­ship between each word in the lan­guage and all the other words, and groups of words in your vocab­u­lary. Although a native speaker can no more utter the list of his com­plete vocab­u­lary than he can “explain gram­mar”, his prob­a­bil­ity knowl­edge of word rela­tion­ships is quite secure, like the 9×8 asso­ci­a­tion. Access­ing it is fast and not usu­ally error prone. Know­ing col­lo­ca­tional prob­a­bil­ity helps the lan­guage user to guess mean­ings at great speed, even under poor con­di­tions. When we add our inter­nal knowl­edge of prob­a­ble word asso­ci­a­tions to the prompts of an exter­nal social sit­u­a­tion, the out­come for mean­ing is usu­ally highly pre­dictable. Of course, speak­ers of a sec­ond lan­guage learned later in life have no such advan­tage.
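For readers who like to see an idea in concrete form, here is a minimal sketch of the crudest possible estimate of collocational probability: counting which word follows which in a handful of sentences. The four-sentence corpus is invented for the illustration, and the function name `next_word_probabilities` is hypothetical. Looking only at adjacent word pairs is a deliberate simplification. The knowledge a native speaker holds relates each word to many other words and groups of words, and to the situation, so this should be read as a toy, not as a model of that knowledge.

```python
from collections import Counter, defaultdict


def next_word_probabilities(sentences):
    """Estimate P(next word | previous word) from raw adjacent-pair counts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    probabilities = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        probabilities[prev] = {word: n / total for word, n in followers.items()}
    return probabilities


# An invented toy corpus; a real speaker has heard millions of such strings.
toy_corpus = [
    "we never knew what he was going to do next",
    "we never knew what he was going to say next",
    "nobody knew what she was going to do next",
    "he was going to propose",
]

probs = next_word_probabilities(toy_corpus)
for word, p in sorted(probs["to"].items(), key=lambda item: -item[1]):
    print(f"to -> {word}: {p:.2f}")
```

In this toy corpus “do” follows “to” more often than “say” or “propose” does, and “ingest” never appears at all, even though it would be perfectly grammatical in that position.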

How we acquire and maintain knowledge of the probability of systemic relationships in a language is, in my view, the central problem of linguistics. None of the solutions I have seen suggested for this problem has looked convincing to me. That is very interesting. We have a challenge (and I certainly don’t have any pat answers either!). For your ordinary workaday language teacher this may also mean that the expensive textbooks and courses which qualified her, and which gave her a “scientific narrative” to explain what happens in the minds of her students, were, well, a kind of humbug. That is sure to be a very unpopular proposal, but not an outrageous one. For example, the training that a 16th Century medical doctor undertook was also a kind of humbug in this sense (and 21st Century medical doctors …?).

“Grammar” as found in books by linguists and language teachers, is a very crude abstracted summary of some common patterns of collocational relationship, albeit extremely probable ones. Applying such grammar rules consciously to single words to make sentences is even slower than multiplying 9×14, and much more likely to generate mistakes. Of course, some people enjoy doing this just as some people enjoy doing arithmetic, but even they need the luxury of time for such a game. In speech you don’t have the luxury of time. It is not a chess game or a math game. It is more like dancing. If you stop to think about where your feet should go at each step, the music will have moved on. In short, most people find that consciously applying grammar rules to words to make sentences is slow, boring and unreliable. For them, trying to use the summary of relationships found in a pedagogical grammar is an incredibly inefficient way to make language. Indeed, few people ever master it, even if they can parrot the grammar rules.

In my view, the “generative grammars” first popularized by Noam Chomsky in his monograph “Syntactic Structures” (1957) are also a misguided way to explain the relationship between the human brain and language. The generative paradigm seems directly traceable to Chomsky’s training in formal logic. The ability of some humans to excel at performing such logic consciously is a very different proposition from claiming that all humans excel at it subconsciously and to the exclusion of other possible solutions to the mystery of how meaningful language is made.

It is worth nail­ing gen­er­a­tive gram­mars in this con­text because they have become memes, mutat­ing social viruses, and a career vehi­cle for large num­bers of very clever aca­d­e­mics who will defend them to the death, though we hope not for the 1500 years it took to dis­pose of Ptolemy’s uni­verse which had the sun revolv­ing around the earth. For­tu­nately for lin­guis­tics, the high tide of ratio­nal­ist the­o­riz­ing seems to be in retreat at last, with empir­i­cal, evi­dence based research regain­ing cred­i­bil­ity within more sophis­ti­cated par­a­digms of neu­ral com­plex­ity. For exam­ple, see Evans 2014a, 2014b, 2015; LaPolla 2015. Golumbia (2015) very neatly traces the nature of Chomsky’s Carte­sian mind­set and sci­en­tific inco­her­ence. Now both Ptolemy’s heirs and Chomsky’s heirs learned lots of inter­est­ing, even use­ful things on the way to being wrong. How­ever, the gen­er­a­tive gram­mar par­a­digm seems warped for a sim­i­lar rea­son that delib­er­ately try­ing to make lan­guage using ped­a­gog­i­cal gram­mar rules is stu­pid. That is, gen­er­a­tive gram­mars, while they may hint at some broad men­tal con­straints, are crude, miss­ing thou­sands if not mil­lions of micro rela­tion­ships. They have never been demon­strated to actu­ally “gen­er­ate” real, cred­i­ble lan­guage strings free of garbage because they can’t. For the live human user gen­er­a­tive gram­mars are in prin­ci­ple insuf­fi­cient and inef­fi­cient. They also lack explana­tory power as mod­els.

If any kind of computing analogy is appropriate at all, our brains might be thought of as probability machines, working as parallel processors, albeit with outcomes which are frequently co-emergent with many other social elements. Certainly human brains are, on the whole, not logical machines working with binary arithmetic, and even recursive “generative” language programs could not change that.

While gen­er­a­tive lin­guists claim to be mod­el­ing human men­tal processes, oth­ers with a more social dis­po­si­tion ignore the mind alto­gether and insist that nearly all use­ful knowl­edge about lan­guage can be extracted from social sit­u­a­tions. Surely this is unbal­anced too. In one sense the social envi­ron­ment is often lit­tle more than a stage for our dis­guised mono­logues.

It is nevertheless true that the probability of word relationships on any particular occasion can also be strongly influenced by the situation. The “situation” is the social context of the language, and the combination of words which I consider appropriate to apply in that situation. For the listener, this social context influences our attempts to understand the meaning of a string of words so strongly that even the wrong words can sometimes be “given” a meaning which that listener considers appropriate. There is good evidence that many people listen very carefully to the “way” in which words are said (the connotation) when estimating probable meaning, rather than noting the formal probabilities (i.e. the denotation of lexis in the context of grammar) which the word combinations are supposed to yield. That is one reason that politicians saying bad things in soothing ways can still get voted for.

One kind of evi­dence for what real lan­guage users DON’T do men­tally comes from their attempted expla­na­tions to L2 learn­ers of that lan­guage. Part of a lan­guage teacher’s train­ing is to acquire a reper­toire of expla­na­tions to soothe the puz­zle­ment of learn­ers, espe­cially adult learn­ers. The anal­ogy with what trainee doc­tors acquire is very close here. The expla­na­tions of both the teach­ers and the doc­tors might or might not be based on cred­i­ble sci­ence. The expla­na­tions might or might not be help­ful to their clients. Even in the 16th Cen­tury doc­tors were val­ued, and so were witch doc­tors at other times, not to men­tion priests. Human beings in their natures seek expla­na­tions, and the placebo effect can be pow­er­ful. How­ever the naïf who asks an untu­tored “native speaker” to explain the mys­ter­ies of their lan­guage is more likely to excite humil­i­a­tion. What the native speaker can do, quite use­fully, is to say whether a phrase is “right” or “wrong”. That is, they can respond like any lan­guage user to their inner knowl­edge of col­lo­ca­tional prob­a­bil­i­ties.

Let us sup­pose then that you are a research lin­guist, tor­mented by some of the doubts and ques­tions which have been raised above, and not con­strained by hav­ing to repeat a cat­e­chism of “known truths” to Lin­guis­tics 101 stu­dents, and not wor­ried about employ­ment tenure. How would you actu­ally go about tack­ling “the cen­tral prob­lem of lin­guis­tics”, namely how we acquire and main­tain knowl­edge of the prob­a­bil­ity of sys­temic rela­tion­ships in a lan­guage?

Well, firstly you might need a very long life ahead of you, and a very sharp brain, to get on top of the maze of con­tribut­ing vari­ables. Sec­ondly, you might won­der how so many other very sharp researchers seem to have already gone astray, and why. Thirdly, con­sid­er­ing the sec­ond mat­ter, you might need a taste for the sui­cide mis­sion of stand­ing apart on the precipice of chal­leng­ing “truth from author­ity”, that is, the Ptolemy trap. If, in a dar­ing mind exper­i­ment, this lin­guist imag­i­nes leap­ing all the bar­ri­ers, where will he tread next?

One option for a fresh look at linguistics might be to step outside of the post-Enlightenment paradigm of experimental research in some ways. Any such attempt requires caution and clear thinking. The positivist paradigm of research in the hard sciences has a number of bedrock requirements. Above all, experiments must be independently repeatable, which requires a strict control of contributing variables and a completely explicit methodology. (In real life there have been repeated and continuing scandals on both fronts when the stakes are high). Social science research has been forced to fuzz this pure approach in various ways, then pretend that it hasn’t by throwing up a smokescreen of statistical trickery. We know now that social science research can be useful in a confusing world, but that certainty is rarely one of its virtues. We also know that social science research can be, and has been, done to “prove” almost every favoured political posture.

Chom­sky and his cohort of gen­er­a­tive gram­mar aca­d­e­mics did attempt another vari­ant of vio­lat­ing “pure” research con­ven­tion by hark­ing back to 17th Cen­tury Euro­pean ratio­nal­ism (Golumbia 2015). Their lim­i­ta­tion how­ever was pre­cisely the old lim­i­ta­tion of clev­erly argu­ing how many angels can dance on the head of a pin: con­clu­sions are only as good as the premises which sus­tain them. Exclude incon­ve­nient premises and you have screwed the “sci­ence”. That is the road to scholas­tic steril­ity, which is sup­posed to be a medieval mem­ory, but which actu­ally per­me­ates insti­tu­tions of learn­ing to this day.

The for­mal research approach of this school of gen­er­a­tive lin­guis­tics was actu­ally to cre­ate avatars, known as “ideal speaker-hear­ers” who took on nom­i­nal prop­er­ties of real speaker-hear­ers, but only as defined by the model and only in selected envi­ron­ments. This lin­guis­tic avatar game had much in com­mon with online vir­tual com­puter games. The way to “kill” an avatar was to have them use a lin­guis­tic string which some native speaker, some­where, judged to be “wrong” or “non-felic­i­tous”. Then you res­ur­rected the avatar to use a slightly dif­fer­ent lin­guis­tic string which, hope­fully, would get a pass by the natives. The plan was to even­tu­ally col­lect a large enough num­ber of these con­sciously man­u­fac­tured felic­i­tous strings to pre­dict which rules would gen­er­ate them (and only them) reli­ably. Then the model could cred­i­bly rep­re­sent the gram­mar of a “real” lan­guage. Except that it couldn’t. Fifty years of play­ing this game has shown that it can’t suc­ceed, some­thing that crit­ics have long main­tained, but these things have their own momen­tum.

One fatal flaw of the lin­guis­tic “ideal speaker-hearer” avatar kind of research has always been that the avatars are arbi­trary and change­able at whim. The sam­ples which each researcher’s avatar col­lected as valid strings were essen­tially ran­dom frag­ments from mul­ti­ple sources.

This is the old, old story of seven wise men describ­ing dif­fer­ent parts of an ele­phant and imag­in­ing uni­ver­sal truths about the whole ele­phant. After many heated argu­ments, the seven wise men might even agree about the form of a com­pro­mise imag­i­nary ele­phant which incor­po­rates their sep­a­rate obser­va­tions, but it is still an imag­i­nary ele­phant. Of course, describ­ing the ele­phant, or Nature if you like, is the quin­tes­sen­tial prob­lem of big-let­ter Sci­ence. There is no final solu­tion to this dilemma, but we can say that in prin­ci­ple the smaller and more arbi­trary the observed frag­ments of a sys­tem are, the less reli­able any pre­dic­tion about the whole sys­tem is likely to be. In the case of the lin­guis­tic ele­phant, there is clearly scope for improv­ing the qual­ity of obser­va­tions which are made of the real lan­guage sys­tems used by real speaker-lis­ten­ers.

The sin­gle largest qual­i­ta­tive improve­ment which could be made to lin­guis­tic research, in my view, would be to nar­row the num­ber of observed speaker-lis­ten­ers to ONE, that is one idi­olect, and to mul­ti­ply the range of obser­va­tion of that idi­olect. In other words, the researcher would be max­i­miz­ing his com­pre­hen­sion of com­plex­ity in that sin­gle sys­tem.

Once the form and dynam­ics of one idi­olect was under­stood in max­i­mal detail, only then would it be appro­pri­ate to look at other idi­olects for the com­mon sys­temic ele­ments of dialect and lan­guage.

A cou­ple more metaphors come to mind here (one per­haps a bit too homely: I can’t help myself). When I visit a doc­tor occa­sion­ally, I have learned to pre­dict on aver­age that this per­son will have about a 20% chance of offer­ing use­ful infor­ma­tion, and an 80% chance of being use­less to dan­ger­ous. Usu­ally the man has never seen me before, he is focused on some small part of my anatomy which is caus­ing a prob­lem, he is dis­posed by train­ing and lim­ited time towards a quick, drug based “solu­tion”, and his under­stand­ing of the human body as an inte­grated sys­tem in motion (I am a dis­tance run­ner) is usu­ally laugh­able. That is, the ele­phant for this doc­tor is a very imag­i­nary beast which he tam­pers with at peril. And so it is with the doc­tor of lin­guis­tics …

Now per­haps a more sober anal­ogy. Engi­neers gen­er­ally under­stand that some sys­tems are scal­able, and some sys­tems are non-scal­able. An engi­neer can build a small model of, say, a bridge, test it in a wind tun­nel and pre­dict with fair accu­racy the stresses which will apply to an actual full sized bridge.

How­ever, com­puter sci­en­tists know very well (to their cha­grin) that although they can write a com­puter pro­gram of impres­sive com­plex­ity, even mil­lions of lines of code, it is sim­ply not pos­si­ble to write a smaller, sim­pler com­puter pro­gram to model the behav­iour of a larger, more com­plex com­puter pro­gram. They also know that every com­puter pro­gram ever writ­ten has had bugs which can only be elim­i­nated by trial and error, and fre­quently gen­er­ate new bugs in the process of cor­rec­tion.

There is a math­e­mat­i­cal rea­son for the exas­per­at­ing char­ac­ter­is­tics of com­puter pro­grams: they are ran­domly dis­con­tin­u­ous phe­nom­ena. The parts can­not reli­ably pre­dict the behav­iour of the whole.

Now when it comes to the dynamic behaviour of natural languages, they are definitely much closer to the computer science end of engineering than they are to the neatly scalable behaviour of mechanical engineering. However, to this point vast libraries of linguistic research have pretended that small, random fragments of observed linguistic behaviour from strangers can be assembled as scalable components of some imaginary linguistic elephant, and be used for predicting the form and behaviour of the massively complex linguistic system in my head or in your head.

Can’t we do bet­ter than this?

Perhaps we can. This essay has been suggesting that generations of work by very clever people have been misdirected. That would be a hard complaint to take seriously if there were no alternative paradigm to measure the evidence against. As it happens there is such a paradigm in the broad fields of scientific endeavour. It relates to what has become the science of complexity, together with a whole complementary branch of mathematics. Complexity research turns out to be full of difficult challenges, so it may not be surprising that very few linguists have staked a career in it. However, there are some general principles in complex systems which relate, front and centre, to the phenomenon of natural languages. I can only mention them in the briefest way in an essay like this.

Com­plex sys­tems are emer­gent. The term emer­gent sug­gests the absence of a super­or­di­nate causative agent. That is, such sys­tems tend to be self-orga­niz­ing, or in some con­texts can be appro­pri­ately described as self-teach­ing (Ran­som 2013). Hol­land (2014) points out that emer­gence is a prop­erty with­out sharp demar­ca­tion. There are degrees of emer­gence. Nev­er­the­less, when such sys­tems do go through a process of emerg­ing, their inter­nal rela­tion­ships become math­e­mat­i­cally non-lin­ear. In plain lan­guage, the whole is more than the sum of the parts. One of Holland’s exam­ples is that indi­vid­ual mol­e­cules of water are not “wet”. The qual­ity of wet­ness only emerges with a cer­tain aggre­ga­tion of water mol­e­cules. A sec­ond qual­ity of emer­gent com­plex sys­tems is that they con­tain inde­pen­dently func­tion­ing but related hier­ar­chies:

“Hierarchical organization is … closely tied to emergence. Each level of a hierarchy typically is governed by its own set of laws. For example, the laws of the periodic table govern the combination of hydrogen and oxygen to form H2O molecules, while the laws of fluid flow (such as the Navier-Stokes equations) govern the behaviour of water. The laws of a new level must not violate the laws of earlier levels.” [Holland 2014, p.4]

If you have any feel­ing for the mul­ti­ple sys­tems of lan­guage and their lev­els at all, the char­ac­ter­is­tics of emer­gent sys­tems will surely strike a clear echo. A word is more than the sum of its mor­phemes, a sen­tence more than the sum of its words, a novel more than the sum of its sen­tences. The super­or­di­nate emer­gent qual­ity at each level is what, in com­mon par­lance, we call mean­ing.

In our minds, if we reverse engineer the apparent constituents of a novel, a sentence, a word, a morpheme (or phoneme) and try to identify them as clearly defined classes we, or at least the linguists amongst us, are apt to find that the classes are indeterminate at the margins. Some nouns are more noun-like than other nouns (e.g. dog vs. swimming), just as some dogs are more dog-like than other dogs. As it happens, some sentences are more sentence-like than other sentences, and some novels more novel-like than other novels. A number of researchers (Eleanor Rosch, George Lakoff and others) have called this effect prototype theory and done some excellent work. But prototype qualities are another of the common properties of emergent systems.

The underlying assumption of linear generative models of linguistics was that “well-formed sentences”, or well-formed sub-systems at other levels of hierarchy, were constituents with sharp category margins which could be atomized and reassembled according to rather simple and explicit rules. If that assumption held, it would in principle be possible to tip a soup of words and a handbook of the right syntactic rules into a proverbial computer and expect well-formed natural language to come out the other end.

The concept of natural language as a (very) complex emergent system renders generative models of linguistics incoherent. The underlying rules of the game are not linear, but exhibit the very different mathematics of non-linear behaviour. The outcomes of language creation are greater than the sum of the individual words which make up the language.

At the begin­ning of this essay I said that learn­ing a lan­guage was learn­ing to pre­dict col­lo­ca­tions. I said that lan­guage use was a prob­a­bil­ity game. On the face of it, pre­dict­ing the prob­a­bil­ity of a col­lo­ca­tion would be per­fectly com­pat­i­ble with a lin­ear gen­er­a­tive model, even if the task with an enor­mous num­ber of words in play was sta­tis­ti­cally over­whelm­ing. Yet on the face of it, pre­dict­ing the prob­a­bil­ity of a col­lo­ca­tion within the non-lin­ear hier­ar­chies of lan­guage accord­ing to a com­plex­ity model might seem impos­si­ble. After all, another prop­erty of com­plex sys­tems is that out­comes are inher­ently unpre­dictable. In such sys­tems, each iter­a­tion is a bit dif­fer­ent.

There is an answer to the appar­ent con­tra­dic­tion implicit in pre­dict­ing col­lo­ca­tions within a com­plex­ity based sys­tem. The solu­tion is made pos­si­ble by the con­strained inde­ter­mi­nacy of cat­e­gories and occur­rences them­selves. That is, inde­ter­mi­nacy in com­plex sys­tems is bounded. Mete­o­rol­o­gists can pre­dict with pass­able accu­racy that a cer­tain num­ber of storms will strike your city in a given sea­son. They can­not pre­dict when and where those storms will strike. A lis­tener can pre­dict with use­ful accu­racy what his inter­locu­tor is likely to say, what words he is likely to use, and in which gen­eral syn­tac­tic con­fig­u­ra­tion. His mind pre­pares resources to man­age this. The lis­tener how­ever can­not be cer­tain when, where and quite how a speaker will use par­tic­u­lar words, only their like­li­hood within the social bounds of the sit­u­a­tion.
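The storm example can be played out as a small simulation. This is a sketch under stated assumptions, not meteorology: the season length, the daily storm chance and the name `simulate_season` are all invented for the illustration, and each day is treated as independent of the others, which real weather is not. The point is only to show a total that stays within a narrow band while the particular days almost never repeat.

```python
import random

DAYS_IN_SEASON = 90        # assumed season length
DAILY_STORM_CHANCE = 0.1   # assumed chance of a storm on any given day


def simulate_season(rng):
    """Return the set of days (0 to 89) on which a storm struck in one simulated season."""
    return frozenset(
        day for day in range(DAYS_IN_SEASON) if rng.random() < DAILY_STORM_CHANCE
    )


rng = random.Random(2015)  # fixed seed so the run is repeatable
seasons = [simulate_season(rng) for _ in range(200)]
counts = [len(season) for season in seasons]

mean_count = sum(counts) / len(counts)
distinct_patterns = len(set(seasons))

print(f"average storms per season: {mean_count:.1f}")
print(f"fewest and most storms in a season: {min(counts)} and {max(counts)}")
print(f"distinct storm calendars in {len(seasons)} seasons: {distinct_patterns}")
```

The average number of storms per season is easy to forecast, while the calendar of any one season is not. A listener’s expectations about an interlocutor’s words are, on this essay’s argument, bounded in a similar way.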

The con­fig­u­ra­tion of a pos­si­ble lan­guage brain is one of life’s most intrigu­ing mys­ter­ies. For most peo­ple it remains an invis­i­ble mir­a­cle within plain sight. I noticed the mir­a­cle long ago, went in search of some answers, then fol­lowed paths of expla­na­tion set out by those who had some con­fi­dence they under­stood (and pub­lished books to prove it). In the end it seemed that these sages were largely talk­ing to them­selves, in spite of some use­ful hints along the way.

I wondered at my own incompetence at second language learning, why language teachers as a species mostly seemed to loathe analytic linguistics, why the success or not of students I taught English to as a second language seemed to bear no correlation to talents for formal, linear analytic thought. My conclusion was a deep suspicion that the narratives about grammars which were lectured to “applied linguistics” students hoping to be teachers contained a large mix of academic fantasy. Yet I was not wise or clever enough to invent a better narrative myself.

The task ahead of us is to find a cred­i­ble nar­ra­tive to explain just how the lan­guages we learn and teach can pos­si­bly come into being, then func­tion in work­able ways. My hope­ful sus­pi­cion is that the study of nat­u­ral lan­guages as com­plex emer­gent sys­tems can set us on a pro­duc­tive path to that under­stand­ing.



After­note: As with many of the arti­cles I have put into pub­lic forums, The Prob­a­ble Lan­guage Brain is not a pro­duc­tion of beaver-like schol­ar­ship, scat­ter­ing the names of illus­tri­ous researchers and long lists of respectable ref­er­ences. There is a place for all of that.

My pur­pose here has been to syn­the­size some very inter­est­ing ques­tions about lan­guage in a way which encour­ages think­ing and debate. I have tried to make the argu­ments as acces­si­ble as pos­si­ble.

My own con­clu­sions may of course be wrong on many lev­els, but by pre­sent­ing issues in this cut-down for­mat the hope is that both the pro­po­nents and antag­o­nists to var­i­ous propo­si­tions may be moved to present their own cases with more per­sua­sive clar­ity.

For what it is worth, my own encoun­ters with for­mal lin­guis­tics have fluc­tu­ated in inten­sity, but stretch back as far as the 1970s, begin­ning with the lin­ear mod­els of gen­er­a­tive lin­guis­tics, and finally with me walk­ing away from two ear­lier doc­toral can­di­da­tures after years of mount­ing skep­ti­cism.




Golumbia, David (2015) “The Lan­guage of Sci­ence and the Sci­ence of Lan­guage: Chomsky’s Carte­sian­ism”. Dia­crit­ics, Vol­ume 43, Num­ber 1, 2015, pp. 38–62. Online at Academia.edu @ https://www.academia.edu/17808815/The_Language_of_Science_and_the_Science_of_Language_Chomsky_s_Cartesianism

Evans, Vyvyan (2014a) “The language myth: Why Language is not an instinct”. UK: Cambridge University Press

Evans, Vyvyan. (2014b) “Real talk: For decades, the idea of a lan­guage instinct has dom­i­nated lin­guis­tics. It is sim­ple, pow­er­ful and com­pletely wrong”. Aeon mag­a­zine online @ http://aeon.co/magazine/culture/there-is-no-language-instinct

Evans, Vyvyan (2015) “The struc­ture of sci­en­tific rev­o­lu­tions: reflec­tions on rad­i­cal fun­da­men­tal­ism in lan­guage sci­ence”. Lan­guage in the Mind blog, Psy­chol­ogy Today, online @

Hol­land, John H. (Sep­tem­ber 2014) “Com­plex­ity: A Very Short Intro­duc­tion”. Kindle edi­tion online @ http://www.amazon.com/Complexity-Very-Short-Introduction-Introductions-ebook/dp/B00L4CK0M6/ref=tmm_kin_swatch_0?_encoding=UTF8&sr=&qid=

Lakoff, George (2008–08-08). Women, Fire, and Dan­ger­ous Things (p. 24). Uni­ver­sity of Chicago Press. Kindle Edi­tion. Ama­zon online @ http://www.amazon.com/Women-Dangerous-Things-George-Lakoff-ebook/dp/B009PS2RXG/ref=tmm_kin_title_0?_encoding=UTF8&sr=1–7&qid=1419218998

LaPolla, Randy J. (2015) “Review of The language myth: Why Language is not an instinct”. Academia.edu online @ https://www.academia.edu/16650761/Review_of_The_language_myth_Why_language_is_not_an_instinct._

Larsen-Free­man, Diane (29 March 2014) “Com­plex­ity The­ory: Renew­ing Our Under­stand­ing of Lan­guage, Learn­ing, and Teach­ing”. TESOL Inter­na­tional Con­fer­ence, Ore­gon USA. online @ http://www.tesol.org/attend-and-learn/international-convention/convention2014/featured-speakers/diane-larsen-freeman-keynote-video

Valiant, Leslie (2013) “Prob­a­bly Approx­i­mately Cor­rect: Nature’s Algo­rithms for Learn­ing and Pros­per­ing in a Com­plex World”. Basic Books. Kindle edi­tion avail­able online @ http://www.amazon.com/Probably-Approximately-Correct-Algorithms-Prospering-ebook/dp/B00BE650IQ/ref=tmm_kin_title_0?_encoding=UTF8&sr=&qid=


Pro­fes­sional bio: Thor May’s PhD dis­ser­ta­tion, Lan­guage Tan­gle, dealt with lan­guage teach­ing pro­duc­tiv­ity. Thor has been teach­ing Eng­lish to non-native speak­ers, train­ing teach­ers and lec­tur­ing lin­guis­tics, since 1976. This work has taken him to seven coun­tries in Ocea­nia and East Asia, mostly with ter­tiary stu­dents, but with a cou­ple of detours to teach sec­ondary stu­dents and young chil­dren. He has trained teach­ers in Aus­tralia, Fiji and South Korea. In an ear­lier life, prior to becom­ing a teacher, he had a decade of drift­ing through unskilled jobs in Aus­tralia, New Zealand and finally Eng­land (after back­pack­ing across Asia in 1972).


con­tact    http://thormay.net    thormay@yahoo.com

All opin­ions expressed here are entirely those of the author, who has no aim to influ­ence, pros­e­ly­tize or per­suade oth­ers to a point of view. He is pleased if his writ­ing gen­er­ates reflec­tion in read­ers, either for or against the sen­ti­ment of the argu­ment.


The Prob­a­ble Lan­guage Brain ©Thor May 2013, 2015; all rights reserved
