How does the new TOEFL Writing assess test-takers’ use of grammar? Joanna Buckle examines the test and gives her take.
The TOEFL has gone from being a test with an extremely heavy grammar content—in fact, a grammar test that seemed almost impossible to all but the trained linguist or near-native speaker—to one with very little explicit assessment of test-takers’ ability to use English grammar accurately.
The feedback given to teachers and test-takers is non-specific and extremely brief. For example, an essay that scores a three, or ‘low-intermediate’, which is taken to mean ‘good’, is described as a piece of writing that: ‘may demonstrate inconsistent facility in sentence formation and word choice that may result in lack of clarity and occasionally obscure meaning; may display accurate but limited range of syntactic structures and vocabulary’.
However, it is very difficult to find an answer to the question of whether a score of three will get you onto an undergraduate course. Guesswork may lead you to the conclusion that a ‘four’ is the minimum.
To add to the confusion, there are two differing sets of writing descriptors currently available on the ETS website. The alternative set of marking criteria states that a low-intermediate scorer can ‘write with limited facility, with language errors obscuring connections or meaning at key junctures between ideas in the text’. Instead of a scale of 1 to 5, this set of descriptors uses a scale of 0 to 30.
This seems designed to cause confusion and conflict between teachers and learners. Which set of scores and descriptors should the teacher explain to their TOEFL class? Should they tell their learners that they have received a low-intermediate, or a ‘good’ score? What does ‘good’ mean? Will this score gain the learner access to a pre-sessional course, or not?
To complicate matters further, the threshold for acceptance onto undergraduate and postgraduate courses varies from institution to institution in the US, the UK and Australia. For example, a prestigious British institution such as The University of Bristol accepts a score of 70, whereas The University of Cambridge asks for a score of 100.
It is very time-consuming, and ultimately confusing, to find out how the TOEFL Writing is marked. Adult learners do not usually appreciate it when the teacher cannot give a definitive answer to such questions. This is a high-stakes test that determines their future and requires a lot of financial planning in terms of university application for a course in a foreign country.
While it is a good thing to focus more on the content of a foreign language learner’s writing, there needs to be a more specific way to assess grammar that is not debatable. The TOEFL descriptors are very much open to interpretation.
For example, we might take the TOEFL three to be roughly equivalent to an IELTS five, the standard for admission to an English language university preparation course for undergraduates.
For the IELTS, we can see that marking will be more consistent, as more specific detail is given: ‘The range of structures is limited and rather repetitive. Although complex sentences are attempted, they tend to be faulty, and the greatest accuracy is achieved on simple sentences. Grammatical errors may be frequent and cause some difficulty for the reader. Punctuation may be faulty.’
The key point here is that ‘complex sentences are attempted’ but tend to be faulty, while simple sentences are more accurate. The TOEFL descriptors seem rather throwaway in comparison overall, as they are only one quarter of the length.
In terms of the usefulness and meaning of the TOEFL scoring, it seems that the top US universities ask for 100 out of 120 over all four skills, whereas less prestigious colleges will accept 50. On the other hand, IELTS candidates can easily understand that they need a 6.5 overall to enter a postgraduate degree at Masters level in the UK or Australia, and so on.
This greater consistency does suggest higher academic standards in the UK. IELTS scores are holistic, so there are no points to convert, let alone a scale that runs to 120 rather than 100 and is therefore more difficult to calculate. A perfect score, meaning a test-taker is a native speaker of excellent competence, is a 9, but as a rule no one is expected to score this highly. PhD study requires a score of 7.5.
As mentioned in my previous article on this topic, published in EFL Magazine in September 2021, a writing test that devotes 40 minutes to listening to a lecture and reading a summary, and only 20 minutes to writing a paraphrase of the contents, has low content validity.
Listening to a lecture, reading an article and contrasting the main points made in the two has high context validity, in that this is certainly something students are expected to do in academia. However, it is a test of reading and listening as much as it actually assesses the learners’ writing. As the actual writing period is so short, it is not very demanding on the test-takers’ ability to write grammatically correct sentences.
In contrast, the IELTS gives an hour for essay writing, with only a two-sentence prompt, which leaves ample time to brainstorm and plan your essay and matches the tasks actually performed in academia more closely.
Therefore, the level of grammatical accuracy and range tested is very much higher in the IELTS exam, and the test more stringent. To produce two essays, the first of 150 words, and the second of 250 words on a previously unseen question in one hour is obviously a far more difficult writing task than to write for only 20 minutes on visual and auditory prompts.
Another point in favour of the International English Language Testing System from Cambridge is that, although the IELTS keeps its full marking descriptors under wraps, for examiners only, most of the information on how candidates are assessed can be found in the partial descriptors available to teachers and students.
After examining the TOEFL Writing scoring system and descriptors, it remains unclear how the new TOEFL assesses test-takers’ grammar from their work in the writing exam. Perhaps the teacher should keep their explanation ironically simple.
Grammar is scored ‘good,’ or ‘very good.’ ‘Very good’ will get you into university, ‘good’ needs a bit more work. As such, it may be difficult for teachers and learners to have confidence in the TOEFL examination, an unnecessarily stressful situation for prospective undergraduates whose whole financial future is at stake.
This article was originally published in the March/April issue of EL Gazette.