Friday, August 11, 2017

Word Booster vocabulary tool: a review

This morning a new online vocabulary tool popped up on my Twitter feed that claimed to create vocabulary activities from authentic texts in minutes. Of course, I couldn’t resist, so I went to check it out.

Word Booster is a website that allows you to input the URL of an online text, such as a news article; it then extracts a list of words a learner is “likely to look up in the dictionary”. From this, it generates a wordlist that includes dictionary definitions and follows up with a quiz based on these words, definitions and example sentences. It is incredibly easy to use and creates a reformatted version of the text, the wordlist with definitions and the quiz as three pdfs in just a couple of minutes. The results are nicely presented and, I was pleased to see, give appropriate acknowledgements for both the text and the source of the dictionary definitions. The reformatted version of the text, which you’d print out and give to students, gives not just the details of the source, author, etc. at the top of the page, but it also has a QR code which students can scan to take them directly to the original - a nice touch.

You can probably sense there’s a ‘but’ coming though … I was immediately wary for several key reasons:
1 Selecting the vocabulary that’s most useful for students to look up and then practise from a text isn’t a simple task. Do you just want to pick out the potentially ‘unknown’ words which will help them to process the text for comprehension? In which case, the definitions might be useful, but these might not be especially useful words to focus on and practise. Do you want to choose higher-frequency words which might be more useful to learn? In which case, you need to know something about the learner (their level, context, etc.) in order to make those choices.
2 To my knowledge, no one’s yet come up with a reliable computational method of distinguishing the correct sense of words in context.
3 Activities based purely on dictionary definitions are, at best, very limited and at worst, downright unhelpful, especially without any human editing.

Because the link to the site came via a couple of respected sources, I was hopeful that this tool might have found a way of getting round these problems. I tried it out with a news article from the Guardian that I used in a talk I gave earlier this year about choosing which vocabulary in a text to focus on. Because it was an article I’d already worked with, I had an idea which vocabulary I might expect to crop up.

Text length: Because you enter the URL of the text, the tool automatically uses the whole text. In an ELT class, you often want to shorten a text to make it fit more easily into a lesson, and because the results are produced as pdfs, they’re not easily edited. Not ideal, but something I could live with.

Word selection: This was decidedly odd. The list generated for this text was as below, EVP CEFR levels shown in brackets where relevant.

weather (A1), autumn (A2), freezing (B1), shocking (B1), scary (B1), flood (B1), sunlight (B2), alarming (C1), continual (C1), retreat (C2), unprecedented (C2), vicious (C2), Danish, peak, anomaly, magnitude, moisture, perpetuate, polar, hiccup

Even if you have niggles with the EVP classifications, I think this is clearly a slightly strange spread of vocabulary. For students tackling an authentic text of this kind, I would say the first 6-8 words would be unlikely look-ups and the first handful would probably be unhelpful even as revision.
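To make the selection point concrete, here is a minimal sketch of how a simple level filter might treat this list. It is an illustration only, not a description of how Word Booster actually picks its words. The function name `likely_lookups` and the idea of passing in a learner level are assumptions, and words with no EVP level are simply treated as off-list and kept.

```python
# CEFR bands in ascending order of difficulty.
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]

# The list generated for the article, with EVP levels where available.
# None means no EVP level was shown for that word in the list above.
WORDLIST = {
    "weather": "A1", "autumn": "A2", "freezing": "B1", "shocking": "B1",
    "scary": "B1", "flood": "B1", "sunlight": "B2", "alarming": "C1",
    "continual": "C1", "retreat": "C2", "unprecedented": "C2",
    "vicious": "C2", "Danish": None, "peak": None, "anomaly": None,
    "magnitude": None, "moisture": None, "perpetuate": None,
    "polar": None, "hiccup": None,
}


def likely_lookups(wordlist, learner_level):
    """Hypothetical filter: keep words above the learner's level,
    plus any word that has no level at all."""
    threshold = CEFR_ORDER.index(learner_level)
    return [
        word
        for word, level in wordlist.items()
        if level is None or CEFR_ORDER.index(level) > threshold
    ]


# Example: a learner at B1.
print(likely_lookups(WORDLIST, "B1"))
```

For a B1 learner this drops the first six words (weather through to flood) and keeps the other 14. It only works because the learner's level is supplied from outside, which is exactly the information the tool doesn't have.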

Sense identification: I was unsurprised to find the tool didn’t always manage to assign the correct dictionary sense to words in context. The website acknowledges that it might sometimes get this wrong, but for this text the error rate was 5 words out of 20; that’s 25% of the words wrongly assigned. That seems quite high to me and is not only potentially pretty misleading for a learner, but awkward for the teacher who finds themselves in class trying to explain the mismatches. The clear errors here were:
vicious: this appears in the text in the idiom ‘a vicious circle’, but sees the individual adjective defined as ‘deliberately cruel and violent’
retreat: the text talks about retreating polar ice, but the definition is specifically about armies. The more general, usually second sense about something ‘moving back’ would have fitted here.
flood: again, the definition picked here is the first, most frequent sense involving water, but the text is actually talking about warm air moving in suddenly, a slightly different sense.
weather: this is perhaps the biggest gaffe as it fails to even identify the correct part of speech. It gives a verb definition for what’s actually a simple noun in the text. (The software is confused by the fact that it follows ‘to’: “Ice is very sensitive to weather.”)
hiccup: once more, we get the most literal definition here when the text is using the word in a metaphorical sense, to mean ‘a minor problem’

Dictionary choice: Even setting those issues aside, my really big bone of contention comes with the choice of dictionary. The definitions used by Word Booster come from Oxford Dictionaries, a reputable source, yes, but with definitions taken from one of their native speaker dictionaries, not a learner’s dictionary **big sigh, shoulders sag** I’m not going to go through the reasons for using learner’s dictionaries in ELT again, but let’s just look at one definition from this text for ‘moisture’:
NS def used by Word Booster: water or other liquid diffused in a small quantity as vapour, within a solid, or condensed on a surface
Oxford Advanced Learner’s Dictionary: very small drops of water that are present in the air, on a surface or in a substance
Not only does this make many of the definitions incredibly unhelpful for the average learner (because many contain words that are far above the level of the word being defined), but it makes many of the items in the follow-up quiz completely incomprehensible.
[STOP PRESS: It looks like if you register with Word Booster and download the Chrome extension, it may be possible to choose between the OED and the Cambridge Advanced Learner's Dictionary, which would potentially solve this particular issue - although the option certainly isn't obvious at first sight. When I've had a chance to investigate further, I'll report back with another update ...]

Quiz: I could go on to critique the quiz format, but to be honest, there’s no point. Starting off with inappropriate definitions and example sentences aimed at native speakers rather than learners, the confusion just gets compounded. Then add to that a multiple-choice activity with randomly chosen definitions as distractors and I can no longer even bear to look.

If …: I was really hoping that this was going to be an exciting new tool, and I think the intention is good, but it just falls at too many key hurdles. I think it could maybe work if:
- it used a learner’s dictionary as its source
- it allowed some intervention from the teacher at the level of word selection (it could maybe suggest a wordlist that the teacher could then edit)
- it allowed teacher intervention again to check the sense selections
- it allowed teachers to edit the quiz

Of course, all that would make it much less of an instant tool providing a quickie, ready-made lesson. But with the tool doing a lot of the legwork and the teacher just needing to intervene where necessary to tidy things up, I think it would still be useful and it would certainly produce much more credible results.
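The kind of teacher intervention described above could be sketched roughly like this. Everything here is hypothetical: `Entry`, `apply_teacher_edits` and its `drop` and `redefine` parameters are names made up for illustration and are not part of Word Booster. The bracketed definitions are placeholders, not real dictionary text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    """One suggested wordlist item: a word and the sense the tool picked."""
    word: str
    definition: str


def apply_teacher_edits(suggested, drop=(), redefine=None):
    """Return the suggested list minus dropped words, with any
    teacher-corrected definitions swapped in."""
    redefine = redefine or {}
    dropped = set(drop)
    edited = []
    for entry in suggested:
        if entry.word in dropped:
            continue  # the teacher decided this word isn't worth teaching
        if entry.word in redefine:
            entry = replace(entry, definition=redefine[entry.word])
        edited.append(entry)
    return edited


# What the tool might suggest (definitions quoted from the review above,
# except the bracketed placeholder).
suggested = [
    Entry("weather", "(placeholder for the verb sense the tool picked)"),
    Entry("vicious", "deliberately cruel and violent"),
    Entry("moisture", "very small drops of water that are present in the air, "
                      "on a surface or in a substance"),
]

# The teacher removes a too-easy word and fixes one wrong sense.
final = apply_teacher_edits(
    suggested,
    drop=["weather"],
    redefine={"vicious": "(teacher's own gloss for 'a vicious circle')"},
)

for entry in final:
    print(f"{entry.word}: {entry.definition}")
```

The tool still does the legwork of producing the first draft; the teacher only touches the items that need it, and the quiz would then be built from the edited list rather than the raw one.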

Diane Nicholls said...

Hi Julie,

A great review of a new website for me. Thanks! I agree with you, of course, on everything.

I was interested to read about the developer and how the app came about as a competition winner for a competition to see what apps could be made using the Oxford Dictionaries API. It won first prize, and I must say it's an impressive achievement for (what sounds like) just one person making a free app!

I think there's loads of potential here - especially with your last three 'Ifs' - all definitely doable here for someone with a good development team behind them.
Your first 'If', though, is a real problem - getting appropriate data for products aimed at learners. This was a competition to use the Oxford Dictionaries API, so that's what it used. The data is free. Where are lone app developers with excellent ideas to get hold of free learner dictionary data? Now that would be a breakthrough!
Thanks again!

1:41 pm  
Diane Nicholls said...

Stop Press!!

I checked in the FAQs and it says this:

"We use two dictionaries: (1) Oxford English Dictionary (Yes, I mean OED) along with (2) Cambridge Advanced Learner's Dictionary 4th edition. You can choose which dictionary to use in the user profile menu."

Have you seen that?
I created an account but didn't find it before running out of time.

1:48 pm  
The Toblerone Twins said...

No, I hadn't seen that. I tried it out without signing up and didn't get any options, but will go back and try to sign up for an account to see ... watch this space!

2:03 pm  
Andrew B. Kim said...

Hello Julie! I am Andrew, the guy who made Word Booster. I deeply thank you very much for the detailed review. I feel honored and overwhelmed with such attention from many experts in the field with this simple tool.

I agree with you on all of the points you made here, and we have been working on them for some time, but with only a single guy working on it, they have been challenging. The good news is that I got funds from an investor and we are hiring to refine the product. We are currently a team of two persons - one is me, the requirement guy and the other is the programmer.

I am planning for the next update around end of October, and I hope to have implemented all of the points you described.
Again, thank you very much for the insightful review!

2:45 pm  
Andrew B. Kim said...

I forgot to mention something about the learner's dictionary: we had a license for the Cambridge Advanced Learner's Dictionary, but the license expired a few months ago. We are working on obtaining a learner's dictionary soon.

2:48 pm  
Andrew B. Kim said...

Diane, thank you very much for the discussion, too.
Allow me to respond to your points.
1. I can get a solid learner's dictionary. That costs money but I've done that before.
2. You are right in describing the issues as all doable. Yes, my team is only two persons: me and the programmer guy who has done almost everything for this in full time, plus a couple of occasional freelancing programmers. Very luckily for my project, I got investment money from a reputable publisher in South Korea and some other investors are interested too, for further developing this project. I really appreciate a review as this and discussions that follow. I can very clearly see what are lacking. Coincidentally, I have implemented all of the features in the internal version but haven't made them public due to slow performance issues.
Please wait for the next update. We are getting there.
Anyways, thank you so much for the discussion!

3:37 pm  
The Toblerone Twins said...

Hi Andrew,

I'm so glad you found the review and commented. I'm away for a few days, so just a quick reply via my phone for the moment, but I'll reply properly next week.


10:58 pm  

