BNC SAMPLER CORPUS:
GUIDELINES TO WORDCLASS TAGGING

Updated : 16/09/97

CONTENTS

  1. PRELIMINARIES, including the Tagset and Tokenization

  2. INTRODUCTION TO WORD CLASSES

    1. Nouns
    2. Verbs
    3. Adjectives
    4. Adverbs
    5. Articles, Determiners and Pronouns
    6. Prepositions and prepositional adverbs
    7. Conjunctions
    8. Numerals
    9. Miscellaneous

  3. GUIDE TO DISAMBIGUATION, by TAG PAIR

  4. GUIDE TO DISAMBIGUATION, by WORD


1. PRELIMINARIES

SECTION 1A

SAMPLER CORPUS (C7) TAGSET

APPGE possessive pronoun, pre-nominal (e.g. my, your, our)
AT article (e.g. the, no)
AT1 singular article (e.g. a, an, every)
BCL before-clause marker (e.g. in order (that), in order (to))
CC coordinating conjunction (e.g. and, or)
CCB adversative coordinating conjunction (but)
CS subordinating conjunction (e.g. if, because, unless, so, for)
CSA as (as conjunction)
CSN than (as conjunction)
CST that (as conjunction)
CSW whether (as conjunction)
DA "after-determiner", or post-determiner capable of pronominal function (e.g. such, former, same)
DA1 singular post-determiner (e.g. little, much)
DA2 plural post-determiner (e.g. few, several, many)
DAR comparative post-determiner (e.g. more, less, fewer)
DAT superlative post-determiner (e.g. most, least, fewest)
DB "before-determiner", or pre-determiner capable of pronominal function (all, half)
DB2 plural before-determiner (both)
DD central determiner (capable of pronominal function) (e.g. any, some)
DD1 singular determiner (e.g. this, that, another)
DD2 plural determiner (these, those)
DDQ wh-determiner (which, what)
DDQGE wh-determiner, genitive (whose)
DDQV wh-ever determiner (whichever, whatever)
EX existential there
FO formula
FU unclassified word
FW foreign word
GE germanic genitive marker - (' or 's)
IF for (as preposition)
II general preposition
IO of (as preposition)
IW with, without (as prepositions)
JJ general adjective
JJR general comparative adjective (e.g. older, better, stronger)
JJT general superlative adjective (e.g. oldest, best, strongest)
JK catenative adjective (able in be able to, willing in be willing to)
MC cardinal number, neutral for number (two, three, ...)
MC1 singular cardinal number (one)
MC2 plural cardinal number (e.g. sixes, sevens)
MCGE genitive cardinal number, neutral for number (two's, 100's)
MCMC hyphenated number (40-50, 1770-1827)
MD ordinal number (e.g. first, second, next, last)
MF fraction, neutral for number (e.g. quarters, two-thirds)
ND1 singular noun of direction (e.g. north, southeast)
NN common noun, neutral for number (e.g. sheep, cod, headquarters)
NN1 singular common noun (e.g. book, girl)
NN2 plural common noun (e.g. books, girls)
NNA following noun of title (e.g. M.A.)
NNB preceding noun of title (e.g. Mr., Prof.)
NNL1 singular locative noun (e.g. Island, Street)
NNL2 plural locative noun (e.g. Islands, Streets)
NNO numeral noun, neutral for number (e.g. dozen, hundred)
NNO2 numeral noun, plural (e.g. hundreds, thousands)
NNT1 temporal noun, singular (e.g. day, week, year)
NNT2 temporal noun, plural (e.g. days, weeks, years)
NNU unit of measurement, neutral for number (e.g. in, cc)
NNU1 singular unit of measurement (e.g. inch, centimetre)
NNU2 plural unit of measurement (e.g. ins., feet)
NP proper noun, neutral for number (e.g. IBM, Andes)
NP1 singular proper noun (e.g. London, Jane, Frederick)
NP2 plural proper noun (e.g. Browns, Reagans, Koreas)
NPD1 singular weekday noun (e.g. Sunday)
NPD2 plural weekday noun (e.g. Sundays)
NPM1 singular month noun (e.g. October)
NPM2 plural month noun (e.g. Octobers)
PN indefinite pronoun, neutral for number (none)
PN1 indefinite pronoun, singular (e.g. anyone, everything, nobody, one)
PNQO objective wh-pronoun (whom)
PNQS subjective wh-pronoun (who)
PNQV wh-ever pronoun (whoever)
PNX1 reflexive indefinite pronoun (oneself)
PPGE nominal possessive personal pronoun (e.g. mine, yours)
PPH1 3rd person sing. neuter personal pronoun (it)
PPHO1 3rd person sing. objective personal pronoun (him, her)
PPHO2 3rd person plural objective personal pronoun (them)
PPHS1 3rd person sing. subjective personal pronoun (he, she)
PPHS2 3rd person plural subjective personal pronoun (they)
PPIO1 1st person sing. objective personal pronoun (me)
PPIO2 1st person plural objective personal pronoun (us)
PPIS1 1st person sing. subjective personal pronoun (I)
PPIS2 1st person plural subjective personal pronoun (we)
PPX1 singular reflexive personal pronoun (e.g. yourself, itself)
PPX2 plural reflexive personal pronoun (e.g. yourselves, themselves)
PPY 2nd person personal pronoun (you)
RA adverb, after nominal head (e.g. else, galore)
REX adverb introducing appositional constructions (namely, e.g.)
RG degree adverb (very, so, too)
RGQ wh- degree adverb (how)
RGQV wh-ever degree adverb (however)
RGR comparative degree adverb (more, less)
RGT superlative degree adverb (most, least)
RL locative adverb (e.g. alongside, forward)
RP prep. adverb, particle (e.g. about, in)
RPK prep. adv., catenative (about in be about to)
RR general adverb
RRQ wh- general adverb (where, when, why, how)
RRQV wh-ever general adverb (wherever, whenever)
RRR comparative general adverb (e.g. better, longer)
RRT superlative general adverb (e.g. best, longest)
RT quasi-nominal adverb of time (e.g. now, tomorrow)
TO infinitive marker (to)
UH interjection (e.g. oh, yes, um)
VB0 be, base form (finite i.e. imperative, subjunctive)
VBDR were
VBDZ was
VBG being
VBI be, infinitive (To be or not..., It will be ..)
VBM am
VBN been
VBR are
VBZ is
VD0 do, base form (finite)
VDD did
VDG doing
VDI do, infinitive (I may do... To do...)
VDN done
VDZ does
VH0 have, base form (finite)
VHD had (past tense)
VHG having
VHI have, infinitive
VHN had (past participle)
VHZ has
VM modal auxiliary (can, will, would, etc.)
VMK modal catenative (ought, used)
VV0 base form of lexical verb (e.g. give, work)
VVD past tense of lexical verb (e.g. gave, worked)
VVG -ing participle of lexical verb (e.g. giving, working)
VVGK -ing participle catenative (going in be going to)
VVI infinitive (e.g. to give... It will work...)
VVN past participle of lexical verb (e.g. given, worked)
VVNK past participle catenative (e.g. bound in be bound to)
VVZ -s form of lexical verb (e.g. gives, works)
XX not, n't
ZZ1 singular letter of the alphabet (e.g. A, b)
ZZ2 plural letter of the alphabet (e.g. A's, b's)
YBL punctuation tag - left bracket
YBR punctuation tag - right bracket
YCOL punctuation tag - colon
YCOM punctuation tag - comma
YDSH punctuation tag - dash
YEX punctuation tag - exclamation mark
YLIP punctuation tag - ellipsis
YQUE punctuation tag - question mark
YQUO punctuation tag - quotes
YSCOL punctuation tag - semicolon
YSTP punctuation tag - full-stop



Notes on Tokenization and Display of tags

In the Sampler Corpus each orthographic word is normally preceded by its wordclass tag, an SGML w element enclosed in angle brackets. A single whitespace follows each word.

Punctuation codes are enclosed in SGML c elements.

The first sentence of file A7V reads:

<w NP1>Lebanon <w NN1>leader <w VVZ>builds <w NN1>cabinet<c YSTP>.

In the text citations in this document, for the purposes of illustrating the choice of tag in particular contexts, we omit all POS-tags except those on the item under discussion. Thus if we are exemplifying lexical verbs (VV-) we would render the above sentence as:

Lebanon leader <w VVZ>builds cabinet.
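
The display convention above can be sketched in a few lines of Python. This is only an illustration, not part of the Sampler tooling: the function names parse_tagged and cite are hypothetical, and the regular expression assumes that tags consist of capital letters and digits, as in the tagset listed above.

```python
import re

# One SGML-style element: <w TAG> or <c TAG>, followed by its text
# up to the next "<" (or the end of the line).
TOKEN_RE = re.compile(r"<([wc])\s+([A-Z0-9]+)>([^<]*)")

def parse_tagged(line):
    """Return a list of (kind, tag, text) triples for one tagged line."""
    return [(kind, tag, text.strip()) for kind, tag, text in TOKEN_RE.findall(line)]

def cite(line, prefix):
    """Render a citation that keeps only the w tags starting with `prefix`."""
    parts = []
    for kind, tag, text in TOKEN_RE.findall(line):
        keep = kind == "w" and tag.startswith(prefix)
        parts.append(f"<w {tag}>{text}" if keep else text)
    return "".join(parts)

sentence = "<w NP1>Lebanon <w NN1>leader <w VVZ>builds <w NN1>cabinet<c YSTP>."
print(parse_tagged(sentence))
print(cite(sentence, "VV"))  # Lebanon leader <w VVZ>builds cabinet.
```

Because the original whitespace is kept with each unit, the citation form is obtained simply by dropping the unwanted tags.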

Contracted forms

Contracted forms include enclitics and 'fused words', such as he's, she'll, don't, wanna and gimme.

In the BNC (both main and sampler corpus) the CLAWS automatic tagger breaks these forms down into separate syntactic units, giving each unit its own tag. Although this policy has in some cases produced strange-looking word divisions, we nevertheless feel it to be preferable to assigning a single tag to the whole orthographic word. Examples include:

<w VM>could<w VHI>'ve
<w VDZ>does<w XX>n't
<w VD0>du<w XX>n<w VVI>no
<w VV0>wan<w TO>na
<w VV0>gim<w PPIO1>me

Click here for full list of contracted forms

The lack of whitespace between the forms shows that these components form a single orthographic unit.

Note that in the case of ain't we have not found a suitable POS-tag to give to the first element ( ai ), and in all cases have used the unclassified tag FU for the whole orthographic word.
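
Since the absence of whitespace is what marks a contracted form, the orthographic words can in principle be recovered from the tagged text. The following Python sketch shows one way to do this under simplifying assumptions: it looks only at w elements (punctuation is ignored), the function name orthographic_words is hypothetical, and the example sentence is constructed for illustration rather than taken from the corpus.

```python
import re

# A w element and the text that follows it, up to the next "<".
UNIT_RE = re.compile(r"<w\s+([A-Z0-9]+)>([^<]*)")

def orthographic_words(line):
    """Group tagged units into orthographic words.

    A unit whose text ends in whitespace closes the current word; a unit
    with no trailing whitespace is joined to the unit that follows it.
    """
    words, current = [], []
    for tag, text in UNIT_RE.findall(line):
        current.append((tag, text.strip()))
        if text != text.rstrip():  # trailing whitespace: word boundary
            words.append(current)
            current = []
    if current:  # last word on the line
        words.append(current)
    return words

line = "<w PPHS1>He <w VDZ>does<w XX>n't <w VVI>care"
for word in orthographic_words(line):
    surface = "".join(text for _, text in word)
    tags = "+".join(tag for tag, _ in word)
    print(surface, tags)
```

Each orthographic word is returned as a list of (tag, text) pairs, so a contracted form such as doesn't keeps both of its tags.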

Multi-words

`Multi-words' are, in a sense, the reverse of contracted forms. They indicate multiple word combinations which function as one wordclass - for example, a complex preposition, an adverbial, or a foreign expression naturalised into English as a compound noun. It seems linguistically preferable to assign one wordclass tag to the whole unit rather than give a separate tag to each component part. In the Sampler, the tag appears on the first word in the multi-word sequence, for example:

<w RR>of course (adverb)

To make multiword units clearer in citations given here we link the component parts by means of the underscore ( _ ) character, like this:

<w RR>of_course (adverb)
<w II>according_to (preposition)
<w NN1>persona_non_grata ('naturalised' compound noun)
<w CS>except_that (conjunction)

Click here for full list of multi-word forms and their associated tags.

Note that some multi-words can represent different categories according to context, e.g. rather than in:

Someone else should have done it <w II>rather_than me.
Her disability had enriched <w CS>rather_than restricted her life.

Moreover, sometimes it is more appropriate to tag a word combination as consisting of ordinary words than as a multi-word sequence, as in the case of 'know how' below:

You <w VV0>know <w RRQ>how we used to always fight
cf. The creation of the <w NN1>know_how fund for the former Soviet Union
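
The underscore convention used in these citations is easy to undo mechanically. The Python sketch below is an illustration only: the function name multiword_units is hypothetical, and it assumes that a tagged item runs up to the next whitespace, so any punctuation attached to the item would be kept with its last component.

```python
import re

# A cited item: <w TAG> followed by a run of non-whitespace characters.
CITED_RE = re.compile(r"<w\s+([A-Z0-9]+)>(\S+)")

def multiword_units(citation):
    """Return (tag, component words) for each tagged item in a citation."""
    return [(tag, form.split("_")) for tag, form in CITED_RE.findall(citation)]

print(multiword_units("Someone else should have done it <w II>rather_than me."))
print(multiword_units("You <w VV0>know <w RRQ>how we used to always fight"))
print(multiword_units("The creation of the <w NN1>know_how fund"))
```

An item with more than one component is a multi-word; an item with a single component is an ordinary word, as with know and how in the second example.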

Words separated by slash

Words which are joined together and delimited by a slash ( / ) are not split up in tagged versions of the text.

We have adopted the following simple principle for handling such items:

Examples
A title and/or_CC an author's name
You should be a graduate in Electrical/Electronic_JJ Engineering, Physics , Mathematics , Computing or a related discipline .
A time-space matrix for each rural/social/age_FU group.


2. INTRODUCTION TO WORD CLASSES

NOUNS

choice of tags:
ND1 NN NN1 NN2
NNO NNO2 NNT1 NNT2 NNU NNU1 NNU2
NNA NNB NNL1 NNL2
NP NP1 NP2 NPD1 NPD2 NPM1 NPM2

There are two main categories of noun in the Sampler Corpus: common nouns (mostly beginning NN-) and proper nouns (NP-). We also distinguish number (-1, -2 or no suffix), and mark an additional feature for locative (NNL-), titular (NNA, NNB), temporal (NNT-, NPM-, NPD-) and directional (ND1) nouns, and for units of measurement (NNU-). While this level of detail can be highly informative, it also means that borderlines have had to be drawn between the various categories to achieve consistency of application.

Number

Singular noun tags end in -1 and plural noun tags in -2. Nouns that are 'neutral for number' carry no numeric suffix. These include fish, which is morphologically invariant for number, and government, which can take either a singular or a plural verb:

Take a <w NN1>shower and two <w NN2>glasses of ice-cold <w NN1>water
The <w NP2>Jones arrived yesterday
The <w NN>government is (/ are) recommending changes in higher education.
Mmm... the <w NN>fish is excellent.
We were fifteen <w NNU2>miles away from the nearest town.
The <w NP2>Alps
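
The number convention can be read off a noun tag mechanically, as in the following Python sketch. The function name noun_number is hypothetical. The set of tags treated as neutral for number follows the tagset descriptions in Section 1A; NNA and NNB carry no number information there, so the sketch returns None for them, which is an assumption of this example.

```python
# Noun tags described in the tagset as "neutral for number".
NEUTRAL_NOUN_TAGS = {"NN", "NNO", "NNU", "NP"}

def noun_number(tag):
    """Return 'singular', 'plural', 'neutral', or None for a noun tag."""
    if not tag.startswith("N"):
        raise ValueError(f"not a noun tag: {tag}")
    if tag.endswith("1"):
        return "singular"
    if tag.endswith("2"):
        return "plural"
    if tag in NEUTRAL_NOUN_TAGS:
        return "neutral"
    return None  # e.g. NNA, NNB: no number marking in the tagset

for tag in ("NN1", "NN2", "NN", "NP2", "NNU2", "NNB"):
    print(tag, noun_number(tag))
```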

Common nouns

We make no special distinction between common nouns (eg water, cheese) that can be mass (or 'non-count') nouns, and other common nouns. All are tagged NN1 when singular and NN2 when plural:

<w NN1>Cheese is good for you.
One <w NN1>car is enough for a family.

Three <w NN2>cheeses.
Three <w NN2>cars.

Proper nouns

  1. Personal, company and place names

    The three main categories of Proper noun we apply are PERSONAL, COMPANY and GEOGRAPHICAL names.

    <w NP1>Harold, <w NP1>Jane, <w NP1>Turner
    <w NP1>London, <w NP1>New <w NP1>York, <w NP1>Africa
    <w NP1>IBM, <w NP1>Glaxo, <w NP1>Minolta

    A person's initials preceding a surname are tagged NP1, just as the surname itself:

    <w NP1>E.M. <w NP1>Forster

  2. Names of countries and states

    The following compound names are treated as individual proper nouns:

    <w NP1>United <w NP1>States
    <w NP1>United <w NP1>Kingdom

    BUT we retain ordinary tags for.....?

    Soviet Union , Union of Soviet Socialist Republics ... Republic (Dominican Republic?) British Isles

  3. Names of seas and oceans

    Baltic Atlantic Pacific
    BUT Indian Ocean , Irish Sea

  4. Other geographical features

    The <w NP2>Alps

  5. Nouns of style

    Preceding a proper noun, or sequence of proper nouns, these are tagged NNB.
    <w NNB>Miss <w NP1>Pamela <w NP1>S <w NP1>Jones
    <w NNB>Archbishop <w NP1>Runcie
    <w NNB>Pastor <w NP1>Tukes
    <w NNB>Chairman <w NP1>Mao
    <w NNB>Sub-Lieutenant <w NP1>A <w NP1>J <w NP1>Morris

  6. Product names



VERBS

choice of tags:
VB0 VBDR VBDZ VBG VBI VBM VBN VBR VBZ
VD0 VDD VDG VDI VDN VDZ
VH0 VHD VHG VHI VHN VHZ
VM VMK
VV0 VVD VVG VVGK VVI VVN VVNK VVZ

  1. Inflection is marked by the final character of the tag (the third, or in the case of was and were, the fourth).
    -0 base form finite      -Z 3rd person sing       -M 1sg (BE only)
    -D past tense            -N past participle       -R 1pl/2pl/3pl (BE only)
    -I infinitive            -G present participle
  2. All forms of BE, HAVE and DO receive tags beginning VB-, VH- and VD- respectively.
    We do not differentiate between auxiliary and main uses of such verbs:

    She <w VBZ>is writing to her MP.
    <w VB0>Be calm, as I <w VBM>'m sure he will come.
    John <w VHZ>'s sent three letters.
    We <w VH0>have a problem.
    He <w VDD>did n't care.
    They <w VD0>do nice chocolates.

    Note that subjunctives receive base form tags:
    She ordered that they <w VB0>be taken away.

  3. Modals

    All modals are tagged VM. We make no distinction between so-called past and present forms:

    We <w VM>can go
    We <w VM>could go.

  4. Lexical verbs

    Tags beginning VV- apply to all other (lexical) verbs.

    She <w VVZ>goes ; They <w VV0>want to take the bus
    After <w VVG>sitting here for hours, we <w VVD>left
    <w VVN>Left to our own devices, we decided to <w VVI>get on with it.

  5. Catenative or semi-auxiliary verb forms have a -K suffix.
    In the Sampler Corpus, catenative verbs are limited to going + to, ought + to, and used + to.

    We're <w VVGK>going to fight this all the way
    They <w VMK>used to play rugby

  6. Contracted forms (can't, won't, gimme, dunno etc) can be problematic, because it is not always obvious if and where they should be divided. See above on contracted forms.

    Click here for full list of contracted forms

See further - Guide to Disambiguation
Section 3 VV0 vs VM
Section 4 got, dare, let, used
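
The verb tags described in this section are compositional, which the following Python sketch tries to make explicit. It is an illustration under stated assumptions rather than a definitive decoder: the function name describe_verb_tag is hypothetical, the inflection labels are taken from the list in item 1, and VBDR and VBDZ (the two four-character BE tags in the tagset) are handled as special cases.

```python
VERB_CLASS = {"VB": "BE", "VD": "DO", "VH": "HAVE", "VV": "lexical verb"}

INFLECTION = {
    "0": "base form (finite)",
    "Z": "3rd person singular",
    "M": "1sg (BE only)",
    "D": "past tense",
    "N": "past participle",
    "R": "1pl/2pl/3pl (BE only)",
    "I": "infinitive",
    "G": "present participle",
}

# The two past-tense BE tags have four characters and are listed explicitly.
PAST_BE = {"VBDR": "past tense (were)", "VBDZ": "past tense (was)"}

def describe_verb_tag(tag):
    """Return (verb class, inflection, is_catenative) for a verb tag."""
    if tag in ("VM", "VMK"):
        return ("modal", None, tag == "VMK")
    if tag in PAST_BE:
        return ("BE", PAST_BE[tag], False)
    catenative = len(tag) == 4 and tag.endswith("K")  # VVGK, VVNK
    core = tag[:-1] if catenative else tag
    return (VERB_CLASS[core[:2]], INFLECTION[core[2]], catenative)

for tag in ("VBZ", "VH0", "VDD", "VVGK", "VMK", "VBDZ"):
    print(tag, describe_verb_tag(tag))
```

A tag outside the verb tagset raises a KeyError, which seems preferable here to guessing at a description.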



ADJECTIVES

choice of tags: JJ JJR JJT JK

The number of tag distinctions made for adjectives is limited to four. However, ambiguities frequently arise between adjectives and other wordclasses, in particular adverbs, nouns and participles.

  1. General adjectives (JJ-)

    The general tag for adjective is JJ. We make no distinction between predicative and attributive uses:

    The ground was <w JJ>dry and <w JJ>dusty.
    The colonel prodded the <w JJ>dry ground.


    Comparative adjectives receive the tag JJR;
    Superlative adjectives receive JJT.

  2. Quasi-comparatives and quasi-superlatives

    Adjectives which have a heightening or downtoning effect rather like that of comparatives and superlatives, but which do not behave syntactically like comparatives or superlatives, are treated as ordinary adjectives. Examples include utter, upper and uppermost, which are acceptable in these examples:

    Events in Eastern Europe were still <w JJ>uppermost in Mr Li's mind.
    This won't affect the <w JJ>upper classes

    BUT not these:

    * It was an utter shambles than I have ever seen.
    (cf It was a worse shambles than I have ever seen.)
    * The salmon pool is upper than the dam.
    (cf The salmon pool is lower than the dam.)

  3. JK is used for able and unable in a catenative context (Cf VMK, VVGK)

    Will you be <w JK>able to manage?

    In other contexts able and unable are classed as general adjectives, eg:

    Your son is very <w JJ>able

See further - Guide to Disambiguation
Section 3 ADJECTIVE vs PARTICIPLE JJ vs VVG, JJ vs VVN
Section 3 ADJECTIVE vs NOUN JJ vs NN1
Section 3 ADJECTIVE vs ADVERB JJ vs RR, JJR vs RRR
Section 4 double, well, right



ADVERBS

choice of tags: RA REX RG RGQ RGQV RGR RGT RL RR RRQ RRQV RRR RRT RT

Adverbs constitute one of the most heterogeneous lexical categories in English, and to some extent this is reflected in the wide range of tags included.

As well as the general adverb RR, we provide tags for degree adverbs (very, too etc.) (RG), prepositional adverbs/particles (RP), locative adverbs (RL), adverbs of time (RT), and post-nominal adverbs (inclusive, 79BC, galore etc.) (RA). For RR and RG, comparative and superlative tags also exist (RRR, RRT, RGR, RGT). Further adverb-tags are listed in Section 1b.

Examples
From 1922 to 1977 <w RA>inclusive
<w RL>Here, <w RL>there and <w RL>everywhere
They drove <w RRR>faster

  1. RP Prepositional Adverb - see Prepositions

  2. RG Degree Adverb - in the main this is limited to a small set of words which do not occur also as general adverbs.

Within the adverb class one sometimes encounters a difficult choice between a more general and a more specific tag, eg: RG vs RR, RGQ vs RRQ, RL vs RR, RR vs MD.
In most instances we default to the more general tag, but some notable exceptions are the following words: so, too, quite, rather.

See further - Guide to Disambiguation
Section 2 ADVERB vs ORDINAL
Section 2 DEGREE ADVERB vs GENERAL ADVERB
Section 3 ADVERB vs ADJECTIVE RR vs JJ, RRR vs JJR
Section 3 COMPARATIVE ADVERB vs DETERMINER
Section 3 ADVERB vs PREPOSITION
Section 4 about, as, but, much, no, so, when



ARTICLES, DETERMINERS & PRONOUNS

choice of tags:
AT AT1 APPGE
DA DA1 DA2 DAR DAT DB DB2 DD DD1 DD2 DDQ DDQGE DDQV
PN PN1 PNQO PNQS PNQV PNX1 PPGE PPH1 PPHO1 PPHO2 PPHS1 PPHS2 PPIO1 PPIO2 PPIS1 PPIS2 PPX1 PPX2 PPY

  1. Determiner-Pronoun: Tags beginning D-

    Recognising that there is a large amount of formal and functional overlap between determiners and pronouns, we have conflated under the D- heading words that are capable of either function, such as that, few, both, another.

    Examples:
    at <w DB>all times of the year
    free secondary education for <w DB>all
    <w DA2>Few diseases are incurable
    for the benefit of the <w DA2>few

    The D- type words are subdivided according to the positions in which they would occur in a complex noun phrase.
    DD- indicates a central determiner which appears in ordinary position, ie before other modifiers and the head noun (eg <w DD>some children, <w DD>some new plates).
    DB- indicates a pre-determiner, ie a determiner coming before the noun phrase and any other determiners (eg <w DB>all the children, <w DB>all of their recommendations).
    DA- indicates a post-determiner, ie a determiner coming after any other determiners (eg few in a <w DA2>few biscuits, many in the <w DA2>many plates).

    Click here for full list of D- tagged words.

  2. `Pronoun'-only words

    Tags beginning P- indicate pronouns which do not share the determiner function, eg I, it, anyone. The main attributes we recognise with regard to pronouns are: personal or indefinite (PP- or PN-), case (nominative = -S-, accusative = -O-) and number (singular = 1, plural = 2). Examples include:

    Click here for full list of P- tagged words

    APPGE is the prenominal possessive pronoun (my, your, etc).
    The initial A- signals that it shares the position of articles.

  3. Articles, tagged AT, AT1

    The, a/an, no and every are given separate status as articles rather than determiner-pronouns, since they never function pronominally.

  4. Relative pronouns

    Which as a relative or interrogative pronoun is grouped with the other determiner-pronouns, and tagged DDQ:
    <w DDQ>Which flavour do you want?
    The details <w DDQ>which I have been able to gather are inconclusive.

    Meanwhile, that as a relative clause complementizer is treated with that as a complement clause complementizer, and tagged CST:
    This is the news <w CST>that we dreaded.
    Jim decided <w CST>that enough was enough.

    Note however that that does take a D- tag, namely DD1, when it functions as a demonstrative pronoun or a determiner.
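
The attributes encoded in the personal pronoun tags (PP-) can be unpacked character by character, as the following Python sketch shows. It is a hedged illustration only: the function name personal_pronoun_features is hypothetical, the person letters (I, Y, H) are inferred from the tag descriptions in Section 1A, and the case labels follow the nominative/accusative wording used in this section.

```python
PERSON = {"I": "1st", "Y": "2nd", "H": "3rd"}
CASE = {"S": "nominative", "O": "accusative"}
NUMBER = {"1": "singular", "2": "plural"}

def personal_pronoun_features(tag):
    """Decode a PP- tag into a dictionary of features."""
    if not tag.startswith("PP"):
        raise ValueError(f"not a personal pronoun tag: {tag}")
    body = tag[2:]
    if body == "GE":  # PPGE: nominal possessive (mine, yours)
        return {"possessive": True}
    features = {}
    for ch in body:
        if ch in PERSON:
            features["person"] = PERSON[ch]
        elif ch in CASE:
            features["case"] = CASE[ch]
        elif ch in NUMBER:
            features["number"] = NUMBER[ch]
        elif ch == "X":  # PPX1, PPX2: reflexive
            features["reflexive"] = True
    return features

for tag in ("PPIS1", "PPHO2", "PPH1", "PPY", "PPX2", "PPGE"):
    print(tag, personal_pronoun_features(tag))
```

Features that a tag does not encode are simply absent from the result: PPY (you), for instance, yields only the person.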

For D-tagged words, the main source of ambiguity is between determiners and adverbs. See
Section 3: DAR vs RRR ('more' and 'less') and
Section 4: much; no; that

Note also the expressions:
a little
a great/good many
a lot



PREPOSITIONS AND PREPOSITIONAL ADVERBS

choice of tags:
IF II IO IW
RP RPK

  1. Prepositions

    Most prepositions are tagged II, including a large number of complex prepositions.

    Examples
    <w II>in Paris
    <w II>as a rule

    <w II>according_to the Bible
    <w II>other_than that, I would agree with you
    That's certainly something to think <w II>about.

    Click here for full list of II words

    More specific tags are used as follows:

    IW denotes with (or wi'), without
    IO denotes of
    IF denotes for

  2. Prepositional adverbs/particles

    We assign the tag RP to a preposition-type word which has no complement. Typical uses of RP are in phrasal verb constructions, or when it functions as a place adjunct. e.g.
    there's a lot of it <w RP>about these days
    Don't give <w RP>up on us just yet.

    The following is a full list of possible RP words:

     'bout about along around back by down in off 
    on out over round through thru to under up 
    

    All the words in the above list except back also allow a prepositional reading.
    Thus there are many instances of ambiguity between II and RP. (See below)

    Note the special use of about in the catenative construction be about to:

    We were <w RPK>about to climb on the bus when suddenly it shot away.

See further - Guide to Disambiguation
Section 3 Preposition vs Adverb Particle vs Locative Adverb (II vs RP)
Section 4 but, about
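
The II/RP ambiguity described in this section can be summarised as a simple lookup. The Python sketch below lists the candidate tags for a word from the closed RP list; it does not attempt to disambiguate, and it deliberately ignores other readings such as RPK for about or TO for to. The function name particle_candidates is hypothetical.

```python
# The closed list of possible RP words given in this section.
RP_WORDS = frozenset(
    "'bout about along around back by down in off "
    "on out over round through thru to under up".split()
)

def particle_candidates(word):
    """Return the set of candidate tags (RP and/or II) for a word."""
    w = word.lower()
    if w not in RP_WORDS:
        return set()  # never tagged RP
    # Every RP word except "back" also allows a prepositional reading.
    return {"RP"} if w == "back" else {"RP", "II"}

for word in ("up", "back", "About", "with"):
    print(word, sorted(particle_candidates(word)))
```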



CONJUNCTIONS

choice of tags: CCB CC CS CSA CSN CST CSW

We have maintained the traditional division of conjunctions into coordinating and subordinating types.

  1. The tags CC (and, or) and CCB (but) denote coordinators.

  2. Tags beginning CS- denote subordinators.

    CS is the default tag (eg before, since, because, and the compound as soon as.)
    CSA applies to as when it introduces a subordinate clause, or the second operator in a comparative (as...as.., same...as) construction.
    CSN applies to than (in any context except certain multiwords, eg other than, more than)
    CST applies to that, introducing reported speech (and also relative clauses)
    CSW applies to whether and if when they appear in indirect questions.

Examples:

<w CS>When you have finished, give me a call.
Alexander rejoiced <w CS>after he heard the news.
<w CSA>As the war is nearly over, we should start thinking about truce.
Come over <w CS>as_soon_as you can.
It was a much bigger catch <w CSN>than she could handle.
Glaxo announced <w CST>that half-year profits were up from 1989.
Tell me <w CSW>whether (or <w CSW>if) you want to come along.

Click here for full list of CS-tagged words and compounds.

See further - Guide to Disambiguation
Section 4 so, as, that



NUMERALS

choice of tags:
MC MC1 MC2 MCMC MCGE MF
MD
NNO NNO2

  1. Cardinal numbers and fractions
    Cardinal numbers receive tags beginning MC-, and fractions receive MF.
    Even when a number has nominal function we tag it as MC or MC1. Examples:

    put a '<w MC>3' in the box.
    <w MC1>one in <w MC>ten students
    in <w MC>1991
    scored <w MC>11.05

  2. Numeral nouns (eg hundred, thousands, dozen, gross)
    receive the tag NNO when singular in form, or NNO2 when plural.

    These are treated differently from other numbers because they are nouns both morphologically (taking plural endings) and syntactically (acting as head of a noun phrase - eg being preceded by determiners or even adjectives, eg a good hundred).

  3. Ordinal numbers are assigned MD in all syntactic positions, including adverbial positions, as in
    He came fourth in the race.
    Note that MD is also assigned to the less overtly numeric words like next and last, even in clear adverbial, adjectival or nominal contexts. This is because next and last function like ordinals both syntactically and semantically.

  4. Currency expressions

    Currency expressions, consisting of numbers and a unit of measurement of some kind, are assigned a nominal tag, NNU1 (singular), NNU2 (plural), or NNU (neutral for number).

    <w NNU>6kg
    <w NNU>£600
    <w NNU>12.5%
    <w NNU2>12ins

  5. Other mixtures of numeric and alphabetic characters are assigned FO (formulaic) tags

    Page <w FO>7a
    Serial no. <w FO>909X44T
    <w FO>A4 sheets
    Just drive up the <w FO>M1

The main ambiguity in this category is between one functioning as a cardinal number (MC1) and as a pronoun (PN1).
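
The rules in this section for tokens containing digits can be approximated with a few pattern checks. The Python sketch below is a rough heuristic under stated assumptions, not the CLAWS procedure: the function name numeral_tag_guess is hypothetical, the unit and currency lists are tiny samples chosen to cover the examples above, and the NNU/NNU1/NNU2 number distinction is not attempted (every unit expression is returned as NNU).

```python
import re

NUMBER = r"\d+(\.\d+)?"
UNITS = ("kg", "%")   # assumption: small sample list, for illustration only
CURRENCY = ("£",)     # assumption: small sample list, for illustration only

def numeral_tag_guess(token):
    """Guess MCMC, MC, NNU or FO for a token containing digits, else None."""
    if re.fullmatch(r"\d+-\d+", token):
        return "MCMC"  # hyphenated number, e.g. 40-50
    if re.fullmatch(NUMBER, token):
        return "MC"    # plain cardinal, e.g. 1991, 11.05
    for sign in CURRENCY:
        if token.startswith(sign) and re.fullmatch(NUMBER, token[len(sign):]):
            return "NNU"
    for unit in UNITS:
        if token.endswith(unit) and re.fullmatch(NUMBER, token[:-len(unit)]):
            return "NNU"
    if re.search(r"\d", token) and re.search(r"[A-Za-z]", token):
        return "FO"    # other mixtures of digits and letters
    return None

for token in ("3", "1991", "40-50", "6kg", "£600", "12.5%", "7a", "M1"):
    print(token, numeral_tag_guess(token))
```

Spelled-out numbers, including the MC1/PN1 ambiguity of one, fall outside this sketch and would need lexical and contextual information.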



MISCELLANEOUS OTHER TAGS

The following tags are included here: GE XX TO BCL ZZ1 ZZ2 FO FW FU