Developed at the University of Lisbon, Dept. of Informatics, by the NLX-Natural Language and Speech Group.
LX-Parser (beta version) is a freely available online service for constituency parsing of Portuguese sentences. This service was developed and is maintained at the University of Lisbon by the NLX-Natural Language and Speech Group of the Department of Informatics.
LX-Parser performs a syntactic analysis of Portuguese sentences in terms of their constituency structure.
LX-Parser is built on the Stanford Parser, a statistical parser developed at Stanford University that is trained on a previously annotated corpus. A total of 22,118 sentences from the CINTIL Treebank were used for training. This treebank is being developed and maintained at the University of Lisbon by the NLX-Natural Language and Speech Group of the Department of Informatics.
The parser uses probabilistic grammars. Under the Parseval metric, it achieves an F-score of 89% (value obtained through 10-fold cross-validation).
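For readers unfamiliar with the Parseval metric, the following is a minimal sketch of how labeled bracket precision, recall and F-score can be computed for one sentence. It is not the evaluation code used for LX-Parser. It assumes that trees are written in Penn-style bracketed notation and that part-of-speech nodes are left out of the count, as is common in Parseval-style scoring; the function names labeled_spans and parseval and the example sentence are hypothetical.

```python
from collections import Counter


def labeled_spans(bracketed):
    """Collect (label, start, end) for every phrasal node of a bracketed tree.

    Nodes that directly dominate a word (part-of-speech nodes) are skipped,
    which is an assumption of this sketch.
    """
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []  # open nodes as [label, start position, has child node]
    pos = 0     # number of words read so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            if stack:
                stack[-1][2] = True  # the enclosing node has a child node
            stack.append([tokens[i + 1], pos, False])
            i += 2  # skip the label that follows the bracket
        elif tok == ")":
            label, start, has_child = stack.pop()
            if has_child:
                spans[(label, start, pos)] += 1
            i += 1
        else:
            pos += 1  # a word
            i += 1
    return spans


def parseval(gold, predicted):
    """Return (precision, recall, f_score) over labeled spans."""
    gold_spans = labeled_spans(gold)
    pred_spans = labeled_spans(predicted)
    matched = sum((gold_spans & pred_spans).values())
    n_pred = sum(pred_spans.values())
    n_gold = sum(gold_spans.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical example: the predicted tree attaches the object NP too high.
gold = "(S (NP (D O) (N gato)) (VP (V viu) (NP (D o) (N rato))))"
pred = "(S (NP (D O) (N gato)) (VP (V viu)) (NP (D o) (N rato)))"
print(parseval(gold, pred))
```

In this example three of the four predicted constituents match the four gold constituents, so precision, recall and F-score are all 0.75. A corpus-level figure such as the one reported above would typically be obtained by summing the matched, predicted and gold counts over all test sentences before computing the scores.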
Tag     Meaning
A       Adjective
AP      Adjective Phrase
ADV     Adverb
ADVP    Adverb Phrase
C       Complementizer
CL      Clitics
CP      Complementizer Phrase
CARD    Cardinal
CONJ    Conjunction
CONJP   Conjunction Phrase
D       Determiner
DEM     Demonstrative
N       Noun
NP      Noun Phrase
O       Ordinals
P       Preposition
PP      Preposition Phrase
PPA     Past Participles/Adjectives
POSS    Possessive
PRS     Personals
QNT     Predeterminer
REL     Relatives
S       Sentence
SNS     Sentence with null subject
V       Verb
VP      Verb Phrase
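As an illustration of how the tag set can be used when reading a parse, the sketch below maps each node label in a tree to its meaning from the table. It assumes that the parse is available as a Penn-style bracketed string, which is an assumption of this example and not a documented output format of the service; the names TAG_MEANINGS and describe_labels and the example tree are hypothetical.

```python
import re

# Tag set copied from the table above.
TAG_MEANINGS = {
    "A": "Adjective", "AP": "Adjective Phrase",
    "ADV": "Adverb", "ADVP": "Adverb Phrase",
    "C": "Complementizer", "CL": "Clitics", "CP": "Complementizer Phrase",
    "CARD": "Cardinal",
    "CONJ": "Conjunction", "CONJP": "Conjunction Phrase",
    "D": "Determiner", "DEM": "Demonstrative",
    "N": "Noun", "NP": "Noun Phrase",
    "O": "Ordinals",
    "P": "Preposition", "PP": "Preposition Phrase",
    "PPA": "Past Participles/Adjectives",
    "POSS": "Possessive", "PRS": "Personals",
    "QNT": "Predeterminer", "REL": "Relatives",
    "S": "Sentence", "SNS": "Sentence with null subject",
    "V": "Verb", "VP": "Verb Phrase",
}


def describe_labels(bracketed):
    """Return (tag, meaning) for each node label, in left-to-right order."""
    # A node label is the first token after an opening bracket.
    labels = re.findall(r"\(\s*([^\s()]+)", bracketed)
    return [(tag, TAG_MEANINGS.get(tag, "unknown tag")) for tag in labels]


# Hypothetical tree for the sentence "O gato dorme".
tree = "(S (NP (D O) (N gato)) (VP (V dorme)))"
for tag, meaning in describe_labels(tree):
    print(tag, meaning)
```

Labels that are not in the table are reported as "unknown tag" rather than raising an error, so the lookup also works on trees that contain other labels.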
The syntactic analyses produced by LX-Parser are similar to the analyses found in the treebank on which LX-Parser was trained. This treebank was designed along the principles described in the following handbook:
Branco, António, João Silva, Francisco Costa, Sérgio Castro, 2011, CINTIL TreeBank Handbook: Design options for the representation of syntactic constituency. Department of Informatics, University of Lisbon, Technical Reports series, nb. di-fcul-tp-11-02.
LX-Parser is being developed at the NLX-Natural Language and Speech Group by Patrícia Gonçalves and João Silva, managed by António Branco, partly within the scope of the SemanticShare project, funded by FCT-Fundação para a Ciência e Tecnologia.
Contact us using the following email address: 'nlx' concatenated with 'at' concatenated with 'di.fc.ul.pt'.
This work was partly supported by FCT-Foundation for Science and Technology under grant FCT/PTDC/PLP/81157/2006 for the SemanticShare project. The system uses the PHPSyntaxTree Visualizer and the Stanford Parser.
Why LX? "LX" is the "code" name that Lisboners like to use for their hometown.