Lexicoblog

The occasional ramblings of a freelance lexicographer

Tuesday, June 30, 2020

10 ways to tackle coronavocab: #2 Trending terms


As well as the new coinages that I looked at in my first post, lots of existing words and phrases are suddenly being used far more frequently than before. Many of these have become part of our everyday usage over the past few months, but might not be familiar to learners.

Some dramatic surges in usage (from Timestamped JSI web corpus)

Some of the newly-popular items may wane in frequency as the pandemic subsides (quarantine, isolation); others, though, are generally useful for learners to have as part of their vocabulary in other contexts (household, gathering, distance). Remember to choose just a few items that you think will be most appropriate for your students, and think about exploring collocations and patterns as well as just the words themselves.



A few examples in context:

The three people who tested positive will go into a 14-day quarantine at home.
We were able to quarantine people very quickly and the outbreak was quite quickly contained.
Alice is currently isolating in her flat with her partner Tim.
If you are in isolation, you cannot go to public places, even if it is for essential items.
We are living in unprecedented times.
Businesses will need to provide increased hand sanitization* for customers.
Public health experts have emphasized the importance of frequent hand washing.
You can only exercise outdoors on your own or with members of the same household.
Travel control is very important to prevent the spread of infection.
We've put signs out asking people to keep their distance.
Any gathering must be limited to a maximum of 10 people in compliance with social distancing guidelines.

*Can be spelled -iz- or -is-

Activities:
  • Find images of signs or instructions featuring some of these words (feel free to reuse the images below, all taken by me out and about in Bristol) – Where would you see these? What are people being asked to do? Have students seen similar signs in their language?









  • Lots of the words in this group are being commonly used in both their verb and noun forms, some without a change of form (quarantine, distance), many in different forms (isolate/isolation, comply with/compliance, sanitize/sanitizer/sanitization, gather/gathering, prevent/prevention) – these are ripe for some work on typical noun endings and/or some sentence transformations. And of course, the nouns will need appropriate collocations too.
Complete the sentences with the correct verb or noun form of the word:
1 ISOLATE
a. Alice is currently ______ in her flat with her partner Tim.

b. If you are in ______, you cannot go to public places, even if it is for essential items.

Rewrite the sentence using the noun form of the underlined verb (quarantine).
We were able to quarantine people very quickly.
We were able to _________________________. 
  • Especially with students from European language backgrounds, there may be scope for comparing similar and different terms from their L1. For example, several languages have been using a cognate of confinement to describe what’s been called lockdown or isolation in the UK (confinement in French, confinamiento in Spanish, confinamento in Italian). The word confinement does, of course, exist in English with a similar meaning, but hasn’t been widely used in the context of the pandemic.

 
For more ideas and activities for teaching vocab generally, take a look at ETpedia Vocabulary.


Sunday, December 02, 2018

Corpus insider #4: The problem with polysemy


It's a bit of a standing joke that every talk I give includes the word polysemy, but it's such an important concept to bear in mind when you're looking at language in any context and especially for any corpus research. Recently, I gave a talk to students at Goldsmiths, University of London, about careers in linguistics. I wanted to give them a taste of both corpus research and lexicography, so I put together a small set of corpus lines for them to look at to tease out the different senses of a word and organize them into a dictionary entry.

Whilst it's possible to do a corpus search for a specific lemma (e.g. rest as a verb: rest, rests, rested, resting; or rest as a noun: rest, rests), with reasonably reliable (if not 100%) results, corpus tools can't distinguish between the different senses or uses of a polysemous word. If you think about the noun rest, which sense immediately springs to mind? It's one of those words that highlights the difference between our intuitions and the realities of usage. Quite likely, the first sense you thought of was to do with 'taking a break or time to relax'. In fact, the rest (of) meaning 'what's remaining' or 'the others' is something like three times as frequent.
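To make that limitation concrete, here's a minimal Python sketch, offered as an illustration rather than a picture of how any particular corpus tool works. The six sentences are invented, and the forms of the lemma are simply listed by hand; a real corpus query would rely on lemmatization and part-of-speech tagging, so unlike this sketch it could at least separate verb from noun.

```python
import re
from collections import Counter

# Invented mini-corpus, for illustration only (not real corpus data).
CORPUS = [
    "You should rest for a few days.",
    "She rested her head on the pillow.",
    "The rest of the team arrived later.",
    "He rests after lunch.",
    "I need a rest.",
    "We spent the rest of the day resting.",
]

# Surface forms of the lemma "rest", listed by hand (an assumption of this sketch).
FORMS = {"rest", "rests", "rested", "resting"}


def find_forms(lines, forms):
    """Count how often each listed surface form occurs across all lines."""
    counts = Counter()
    for line in lines:
        for token in re.findall(r"[a-z']+", line.lower()):
            if token in forms:
                counts[token] += 1
    return counts


counts = find_forms(CORPUS, FORMS)
for form, n in counts.most_common():
    print(form, n)
print("total hits:", sum(counts.values()))
```

The search finds seven hits for the lemma, but nothing in the output says whether a given rest is the 'break' sense or the 'what's remaining' sense.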

When lexicographers are working with a corpus to put together a dictionary entry, determining the sense division and ordering of senses is a manual process. You can get a flavour of a word by looking at its collocates (for example, using WordSketch in SketchEngine), but that only tells part of the story - you'll find the ‘relax’ sense of rest has far more strong collocates than the duller, more functional the rest of.

Section of a WordSketch for rest (noun) - English Web 2015 via Sketch Engine

You can sort concordance lines to the left and right of the node word and you start to see the patterns emerge (here, the rest of becomes very obvious). But ultimately, you just have to go through a sample of cites manually to establish the different senses and uses (including as part of phrases), and the frequency order. The actual statistical frequency of a particular sense is almost impossible to determine in most cases, not least because, for many words, there are senses which overlap and examples that are ambiguous.
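As a rough illustration of sorting to the right of the node word, the Python sketch below builds simple keyword-in-context lines and sorts them on the right-hand context. The sentences are invented, and kwic and sort_right are hypothetical helper names, not functions from Sketch Engine or any other tool.

```python
import re

# Invented example sentences, for illustration only.
LINES = [
    "The rest of the team arrived later.",
    "I need a rest before the match.",
    "We spent the rest of the day indoors.",
    "Take a rest if you feel tired.",
    "She kept the rest for herself.",
]


def kwic(lines, node, width=3):
    """Return (left, node, right) for each occurrence of `node`,
    keeping up to `width` words of context on each side."""
    hits = []
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for i, tok in enumerate(tokens):
            if tok == node:
                left = tokens[max(0, i - width):i]
                right = tokens[i + 1:i + 1 + width]
                hits.append((left, tok, right))
    return hits


def sort_right(hits):
    """Sort concordance lines alphabetically by the words to the right of the node."""
    return sorted(hits, key=lambda hit: hit[2])


hits = kwic(LINES, "rest")
sorted_hits = sort_right(hits)
for left, node, right in sorted_hits:
    print(f"{' '.join(left):>20}  {node}  {' '.join(right)}")
```

After sorting, the two rest of lines sit next to each other, which is the kind of pattern that jumps out in a real concordance. Deciding which sense each line belongs to is still a manual job.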

So what are the practical implications of this?

Dictionary frequency information: A number of learner’s dictionaries (Collins COBUILD, Macmillan, Longman) provide information about the frequency of a word using a system of stars or dots. Whilst this is useful in giving you a ball-park guide to more and less frequent words, the ratings are based on the frequency of the whole word, not the individual senses. For some words, all the senses may be relatively high frequency, while in other cases, the first sense(s) may be high frequency and others quite obscure.

Phrases: It is possible to find the frequency of many phrases with carefully constructed corpus searches, but phrases with variable elements and those containing very common words (such as phrasal verbs) which could co-occur in different ways are much trickier to pin down. For that reason, they’re not generally allocated their own frequency information and just get lumped in with the individual headwords.

Word lists: Many frequency-based word lists also don’t take into account the different senses of a word and their relative frequency.  Unless words on the list come with definitions attached, it’s difficult to know whether they just refer to the most frequent sense or to other senses as well.

Text analysis tools: Tools such as Text Inspector or Lextutor, which allow you to input a text and get a breakdown of the words by frequency or, for instance, by their ranking in EVP, will generally allocate words according to their overall frequency or most frequent sense. So, an obscure sense of a common word, such as leg in the context of a cricket match (see sense 5 here), will likely be labelled as high frequency. The paid version of Text Inspector does allow the user to choose the relevant sense of a word from a drop-down menu when looking at EVP labels, but it doesn’t offer off-list options (including the cricketing sense of leg, which it just labels as A1) or allow you to allocate words to phrases that haven’t been automatically detected.

So, does this mean that all these tools are completely useless? Of course not. In many cases, we’re using frequency information as a rough guide, so finer sense distinctions don’t come into play. Like anything though, it’s important to know the limitations of the sources and tools you use and to be on the look-out for anything that doesn’t seem quite right.
