23. Testing for Teaching; Teaching to What?

The out­line which fol­lows analy­ses the two halves of a lan­guage teacher’s pro­fes­sion:

a) The first half is daily class­room prac­tice : what is taught and how is it eval­u­ated?

b) The sec­ond half of a teacher’s pro­fes­sion is to know or at least esti­mate what is going on in the brains of her stu­dents : what is learned and how is it learned?

Teach­ing is a sim­u­la­tion machine. Learn­ing is for life. The implicit pro­fes­sional chal­lenge is in mak­ing the sim­u­la­tion use­ful for liv­ing.

Note: The dis­cus­sion here reflects a teacher’s inter­est in actual lan­guage learn­ing, rather than that spe­cial game which sets out to man­u­fac­ture “the IELTS/TOEFL per­form­ing clone”. Also, I have ter­med these notes an “out­line”. It would be an abuse of lan­guage to call them an aca­d­e­mic paper in any fin­ished sense, and the absence of ref­er­enc­ing rein­forces that. There are, after all, whole aca­d­e­mic fac­ul­ties devoted to the study of test­ing, though unfor­tu­nately most teach­ers have never heard of them. Still, for those in a hurry, these reflec­tions of my own may crys­tal­lize some of the ques­tions which, sooner or later, will trou­ble any thought­ful teacher.

[A] Teach­ing Prac­tice

1. Mass edu­ca­tion is gen­er­ally a poor medium for lan­guage learn­ing, lan­guage teach­ing or lan­guage test­ing.

2. We are stuck with mass edu­ca­tion, so what is dis­cussed here relates to that. Indi­vid­u­als, espe­cially intel­li­gent indi­vid­u­als, plan­ning their own learn­ing can do much fancier things and suc­ceed, or more basic things and still suc­ceed. The sec­ond part of this short dis­cus­sion explores some prop­er­ties of mind that may partly explain why indi­vid­u­als rather than class­rooms full of cap­tive stu­dents are bet­ter at real lan­guage learn­ing.

3. In mass education, what is tested mostly determines what is learned. Ideally, therefore, well-designed testing will encourage desirable learning. Unfortunately, tests in mass education are almost universally designed for administrative convenience, with little thought given to their backwash effects on learning.

4. Since admin­is­tra­tive struc­tures, agen­das, require­ments and hier­ar­chies are insep­a­ra­ble from mass edu­ca­tion, they have to be catered for in test­ing. One also aims for a design that facil­i­tates learn­ing, some­times a lit­tle for­lornly.

5. So what CAN be tested? At the class­room level in lan­guage edu­ca­tion the teacher faces excep­tional dif­fi­culty in eval­u­at­ing SPOKEN lan­guage reli­ably.

Read­ing can be tested indi­rectly by com­pre­hen­sion tests. Writ­ing can at least be taken away to eval­u­ate at leisure (although holis­tic eval­u­a­tion is time con­sum­ing and prone to incon­sis­tency). Lis­ten­ing can be eval­u­ated indi­rectly by dic­ta­tion (if stu­dents can write), or by task com­ple­tion.

Spoken language, though, requires one-to-one instant evaluation unless it is recorded. Relatively few classes have the facilities to record large numbers of students simultaneously, or to invigilate them for honest recording via computer or internet. One-to-one work leaves the rest of the class at a loose end (especially undesirable in a testing situation). As with writing, the evaluation of speaking is reducible to a large number of atomistic components which might be scored, but evaluation of the holistic outcome is liable to much subjective variation.

6. If you can’t HEAR a second language reliably, the chances are that your construction, pronunciation and appropriateness in spoken language will not be adequate either. With this rationale, some teachers are content to test listening competence and forgo formal tests of speaking competence. Indeed there is some statistical support for the correlation of speaking and listening presupposed here. It is true that listening cues language recall while speaking involves the much more difficult process of active (i.e. not cued) recall. Nevertheless, passive vocabulary far exceeds active vocabulary, even in a first language, and evaluation tasks can calibrate for this.

7. Speak­ing is a com­pos­ite of struc­tur­ing a mes­sage that is

a) coher­ent, cohe­sive and with com­pre­hen­si­ble dis­course pre­sup­po­si­tions;

b) socially appro­pri­ate;

c) decod­able for pro­nun­ci­a­tion and into­na­tion.

a) and b) can be more or less sim­u­lated by a writ­ten response. We have spe­cial reg­is­ters of writ­ing for record­ing speech. There­fore one solu­tion for the teacher in mass edu­ca­tion is to test sim­u­lated responses for a) and b), while partly or wholly for­go­ing for­mal eval­u­a­tion of c).

This is indeed a very clumsy simulation. A dance without the music is not a dance. A graduate from a curriculum based on this kind of testing might well be a tongue-tied speaker. However, while the atomization of a), b) and c) will not yield an adequate whole language experience, it might be accepted in classrooms as PART of the process. Such evaluation would have to be combined with other things, such as participation in tasks, role play and discussion. Participation can be scored on a fairly crude scale such as enthusiastic/average/lax/nil. The idea is to give students some motive for polishing the whole as well as the parts.

8. An exam­ple of the sim­u­la­tion described in 7. might be a com­bined test of lis­ten­ing accuracy/comprehension and writ­ten dia­logue response by the stu­dent. That is, the stu­dent would have to write down a dic­tated dia­logue which pro­vided the lan­guage for speaker A, but in which the responses of speaker B were not pro­vided. He would then add responses for speaker B to what he had under­stood of speaker A.

Of course this remains a long way from live speech which, apart from its oral format, is constructed in real time and assisted by body language and other extralinguistic context. Still, the teacher is unavoidably restricted. Formal language testing never evaluates the whole enchilada. The simulation compromise is better than nothing, and does have the advantage of administrative manageability.

8b. The real problem with 8 is that for the learner himself, simulation is a pretty unrewarding process. It is the spontaneity and instant feedback of live speech which give it an emotional kick. When the learner is able to “fly” a little – respond to genuine communication – he has a sense of achievement that contributes greatly to both motivation and memory. Mass education of course is all about simulation.

9. Talking of whole enchiladas, language teachers are always condemned to tasting only a bit of the meal. For one thing, we are only entitled, really, to test what we teach. What we teach is only ever a fragment of language, regardless of how systematic or how chaotic the overall curriculum is. To take another analogy, this time from martial arts, we can teach a few punches and blocks. When the learner tries to put those together in a real match with a rapid choreography of moves, things are apt to fall apart pretty quickly. We can study fragments of anything, but when it comes to complex dynamic systems, the only real description of them is actual, total performance. The teaching profession is therefore one with modest aims.


[B] The Men­tal Envi­ron­ment of Learn­ing

Here the dis­cus­sion will turn from class­room test­ing tech­niques to the men­tal con­di­tions that stu­dents bring to a class­room, and the bear­ing of that on what can be achieved in class­room envi­ron­ments.

10. The learn­ing of frag­ments is superbly han­dled by a uniquely human facil­ity called DECLARATIVE MEMORY. Declar­a­tive mem­ory is what you use when you recite a list of “facts”. Adults are bet­ter at this than chil­dren, so not sur­pris­ingly a fair bit of research claims that adults are bet­ter than chil­dren at most aspects of lan­guage learn­ing, into­na­tion and pro­nun­ci­a­tion excepted. I sus­pect that what this research really shows is that most for­mal lan­guage eval­u­a­tion draws on declar­a­tive mem­ory.

11. The performance of complex, dynamic skills is not unique to humans. The wolf closing on a fleeing antelope is making breathtaking mathematical calculations of speed and distance. Of course, the wolf is not “aware” of these calculations – it is all automatic and subconscious. Similarly, when we speak our native language, the construction proceeds with astounding speed and accuracy at a level entirely beyond our conscious management. The extent to which the wolf has learned its skills is open to debate. There is no doubt however that children learn their first language socially (even though there is heated argument about the level of detail in the inherited mental template which makes this possible). The memory that we use to speak language natively, or to exercise the skill of riding a bicycle, is called PROCEDURAL MEMORY.

12. Much about the nature of procedural memory remains obscure. It is subconscious, and very resistant to conscious formulation by introspection (which is why linguistics is so arcane for most people). Studies of brain-damaged individuals, and now MRI scanning, suggest that it is distributed differently in the brain from centers of declarative memory. The input of declarative memory to procedural memory is, at the moment, not well understood at all: there is much disagreement and confusion.

13. One likely expla­na­tion of the nature of pro­ce­du­ral mem­ory is that it is a kind of prob­a­bil­ity machine. This fits well with what is known about the phys­i­ol­ogy of ner­vous sys­tems, which depend upon cas­cad­ing elec­tri­cal sig­nals.

Language is certainly a probability game. You listen to a stream of sound or read a sentence. As you perceive one word, your brain is furiously calculating the probabilities of what can follow it. The more you hear of a sentence and know of a context, the better your guesses get. Thus “what am I going to say …” might continue with next/now/when/believing. “Believing” is grammatically possible, but surely far less probable than “next”. This knowledge of collocation probabilities is what makes it possible for us to speak and listen so quickly. It is knowledge which comes from a huge amount of live experience. It is the hardest part of language learning, yet few teachers, let alone students, are really aware of the nature of this task.
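
For readers who like to see such things spelled out, the guessing game can be sketched with a toy word-pair (bigram) counter. This is only an illustration under loud assumptions: the five-sentence corpus is invented, the function names are hypothetical, and nothing here claims that a brain stores counts in this way. It simply shows how experience of what follows "say" turns into rough probabilities for the next word.

```python
from collections import Counter, defaultdict

# Invented toy "experience" of English; a real listener has heard millions of sentences.
CORPUS = [
    "what am i going to say next",
    "what am i going to say now",
    "what am i going to say next",
    "what am i going to say when she asks",
    "i know what to say next",
]

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn the raw follow-on counts for `prev` into relative frequencies."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in followers.items()}

counts = train_bigrams(CORPUS)
probs = next_word_probs(counts, "say")

# List the candidates from the text; "believing" was never heard after "say",
# so in this toy model its estimated probability is zero.
for candidate in ["next", "now", "when", "believing"]:
    print(candidate, round(probs.get(candidate, 0.0), 2))
```

In this tiny corpus "next" comes out as the most likely continuation and "believing" as the least, which is the intuition in the paragraph above. A learner's real difficulty is that reliable estimates of this kind need far more exposure than any classroom supplies.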

14. It is clear that the acqui­si­tion of declar­a­tive knowl­edge is rel­a­tively effi­cient and mea­sur­able in humans. Thus our mass edu­ca­tion sys­tems thrive on it. It is also evi­dent that the acqui­si­tion of pro­ce­du­ral knowl­edge, espe­cially relat­ing to very com­plex dynamic sys­tems like lan­guage, tends to be very slow and requires a great deal of holis­tic prac­tice. It is also very dif­fi­cult to mea­sure objec­tively, and hence tends to be down­played or ignored in mass edu­ca­tion sys­tems.

15. Since everyone learns a first language as an infant, but adults are much less uniformly successful in mastering a second language at a procedural level (as opposed to regurgitating facts), it would seem that the infant brain is more accessible to fairly rapid procedural learning than the adult brain. Given the numerous automatic skills (e.g. walking) that infants have to learn apart from language, this makes some sense.

16. It is also appar­ent that those adults who do mas­ter a sec­ond lan­guage are often (even typ­i­cally) very poor at ped­a­gog­i­cal gram­mar. That is, fre­quently they do not excel at the declar­a­tive enu­mer­a­tion of gram­mat­i­cal ‘facts’, and abhor con­scious lin­guis­tic analy­sis. Some­how they have found a way to develop pro­ce­du­ral knowl­edge of a lan­guage with­out being able to con­sciously analyse the process.

17. Pro­ce­du­ral learn­ing appar­ently depends, above all, on rep­e­ti­tion, and some suc­cess­ful adult learn­ers have evi­dently been able to main­tain a kind of open alert­ness under con­di­tions that would drive many oth­ers to revolt, dis­trac­tion or just sleep. Maybe a lack of imag­i­na­tion helps some­times! There is obvi­ously scope for design­ing cur­ricu­lums and meth­ods that involve a great deal of holis­tic rep­e­ti­tion, but in ways which don’t lead the larger body of learn­ers to reject the process out of bore­dom.

18. The men­tal prepa­ra­tion of stu­dents is also clearly crit­i­cal in deal­ing with pro­ce­du­ral mem­ory, but not nec­es­sar­ily in ways that are ideal for declar­a­tive mem­ory. Per­haps the Lozanov idea of quasi hyp­no­sis (Sug­gesto­pe­dia) has a place here. Man­ag­ing the sub­con­scious is a slip­pery busi­ness, more stud­ied in East­ern tra­di­tions of med­i­ta­tion than West­ern philoso­phies. Nev­er­the­less, it has pop­u­lar accep­tance as a kind of magic learn­able by some fic­tional indi­vid­u­als : Harry Pot­ter, Luke Sky­walker etc. It will not trans­late eas­ily into the crass­ness of mass edu­ca­tion sys­tems.


The Great Conun­drum

Teach­ers test their own sim­u­la­tions of life. Stu­dent brains learn to per­form these declar­a­tive sim­u­la­tions. In life how­ever, what is use­ful for the lan­guage learner is auto­matic pro­ce­du­ral per­for­mance. In lan­guage class­rooms this is only acci­den­tally learned and poorly mea­sured.

It seems that we need twin objec­tives for the devel­op­ment of lan­guage teach­ing in mass edu­ca­tion sys­tems :

a) a way to develop pro­ce­du­ral mem­ory and pro­ce­du­ral skills in a sec­ond lan­guage for large num­bers of peo­ple;

b) a way to mea­sure this acqui­si­tion of pro­ce­du­ral skills that is admin­is­tra­tively man­age­able, reli­able and actu­ally encour­ages learn­ing.

At the moment I do not know of anyone who has found a credible way to achieve these objectives on a large scale. The failure rate for procedural foreign language acquisition in most mass education settings is very high (one famous estimate puts the failure in America to achieve useful competence in L2 at around 95%). We remain an accidental profession.


Testing for Teaching; Teaching to What? © Thor May 2012 all rights reserved

Bio: Thor May’s PhD dissertation, Language Tangle, dealt with language teaching productivity. Thor has been teaching English to non-native speakers, training teachers and lecturing in linguistics since 1976. This work has taken him to seven countries in Oceania and East Asia, mostly with tertiary students, but with a couple of detours to teach secondary students and young children. He has trained teachers in Australia, Fiji and South Korea. Many of his papers, essays and stories may be seen on his website at http://thormay.net ; e-mail thormay AT yahoo.com .
