Developed at the University of Lisbon, Dept. of Informatics, by the NLX-Natural Language and Speech Group.
Lx-Tagger is a part-of-speech tagger for Portuguese that assigns a single morpho-syntactic tag, from the tagset below, to every token.
The tag is attached to the token, using a /
(slash) symbol as separator:
um exemplo → um/IA exemplo/CN
Each individual token in multi-token expressions of closed POS classes
gets the tag of that expression prefixed by "L" and followed by the number of its position within the expression:
de maneira a que → de/LCJ1 maneira/LCJ2 a/LCJ3 que/LCJ4
For an online demo of this tool, check here
Tag | Category | Examples |
---|---|---|
ADJ | Adjectives | bom, brilhante, eficaz, … |
ADV | Adverbs | hoje, já, sim, felizmente, … |
CARD | Cardinals | zero, dez, cem, mil, … |
CJ | Conjunctions | e, ou, tal como, … |
CL | Clitics | o, lhe, se, … |
CN | Common Nouns | computador, cidade, ideia, … |
DA | Definite Articles | o, os, … |
DEM | Demonstratives | este, esses, aquele, … |
DFR | Denominators of Fractions | meio, terço, décimo, %, … |
DGTR | Roman Numerals | VI, LX, MMIII, MCMXCIX, … |
DGT | Digits | 0, 1, 42, 12345, 67890, … |
DM | Discourse Marker | olá, … |
EADR | Electronic Addresses | http://www.di.fc.ul.pt, … |
EOE | End of Enumeration | etc |
EXC | Exclamative | ah, ei, etc. |
GER | Gerunds | sendo, afirmando, vivendo, … |
GERAUX | Gerund "ter"/"haver" in compound tenses | tendo, havendo … |
IA | Indefinite Articles | uns, umas, … |
IND | Indefinites | tudo, alguém, ninguém, … |
INF | Infinitive | ser, afirmar, viver, … |
INFAUX | Infinitive "ter"/"haver" in compound tenses | ter, haver … |
INT | Interrogatives | quem, como, quando, … |
ITJ | Interjection | bolas, caramba, … |
LTR | Letters | a, b, c, … |
MGT | Magnitude Classes | unidade, dezena, dúzia, resma, … |
MTH | Months | Janeiro, Dezembro, … |
NP | Noun Phrases | idem, … |
ORD | Ordinals | primeiro, centésimo, penúltimo, … |
PADR | Part of Address | Rua, av., rot., … |
PNM | Part of Name | Lisboa, António, João, … |
PNT | Punctuation Marks | ., ?, (, … |
POSS | Possessives | meu, teu, seu, … |
PPA | Past Participles not in compound tenses | afirmados, vivida, … |
PP | Prepositional Phrases | algures, … |
PPT | Past Participle in compound tenses | sido, afirmado, vivido, … |
PREP | Prepositions | de, para, em redor de, … |
PRS | Personals | eu, tu, ele, … |
QNT | Quantifiers | todos, muitos, nenhum, … |
REL | Relatives | que, cujo, tal que, … |
STT | Social Titles | Presidente, drª., prof., … |
SYB | Symbols | @, #, &, … |
TERMN | Optional Terminations | (s), (as), … |
UM | "um" or "uma" | um, uma |
UNIT | Abbreviated Measurement Units | kg., km., … |
VAUX | Finite "ter" or "haver" in compound tenses | temos, haveriam, … |
V | Verbs (other than PPA, PPT, INF or GER) | falou, falaria, … |
WD | Week Days | segunda, terça-feira, sábado, … |
Multi-Word Expressions | ||
LADV1…LADVn | Multi-Word Adverbs | de facto, em suma, um pouco, … |
LCJ1…LCJn | Multi-Word Conjunctions | assim como, já que, … |
LDEM1…LDEMn | Multi-Word Demonstratives | o mesmo, … |
LDFR1…LDFRn | Multi-Word Denominators of Fractions | por cento |
LDM1…LDMn | Multi-Word Discourse Markers | pois não, até logo, … |
LITJ1…LITJn | Multi-Word Interjections | meu Deus |
LPRS1…LPRSn | Multi-Word Personals | a gente, si mesmo, V. Exa., … |
LPREP1…LPREPn | Multi-Word Prepositions | através de, a partir de, … |
LQD1…LQDn | Multi-Word Quantifiers | uns quantos, … |
LREL1…LRELn | Multi-Word Relatives | tal como, … |
LX-Tagger was developed and is maintained at University of Lisbon by the NLX-Natural Language and Speech Group of the Department of Informatics.
LX-Tagger builds on MXPOST.
This work was partly supported by FCT-Fundation of Science and Technology.
When mentioning this tagger, this is the reference to be used:
To use LX-Tagger you must accept the terms of this license.
You can download the system here.
Contact us using the following email address: 'nlxgroup' concatenated with 'at' concatenated with 'di.fc.ul.pt'.
LX because LX is the "code" name Lisboners like to use to refer to their hometown.