The LT Corpus (Literary Corpus) contains approximately 1,781,083 running words of European and Brazilian Portuguese. It includes 70 copyright-free classics (61 from Portugal and 9 from Brazil) published before 1940.
This resource includes a spoken corpus of approximately 300,000 words, covering both formal (152,755 words) and informal (165,838 words) speech, with aligned sound and orthographic transcription and POS-tag information.
LX-Suite is a freely available online service for the shallow processing of Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the Department of Informatics, University of Lisbon. LX-Suite is composed of a set of shallow processing tools: - LX Sente...