LT Corpus

The LT Corpus (Literary Corpus) contains approximately 1,781,083 running words of European and Brazilian Portuguese. It includes 70 copyright-free classics (61 from Portugal and 9 from Brazil) published before 1940.

Resource Type: Corpus
Media Type: Text
Language: Portuguese
LogicalFormBankPT

The LogicalFormBankPT (Branco, 2009, and Branco et al., 2011) is a corpus of semantic dependencies of translated texts composed of 3,406 sentences and 44,598 tokens taken from the Wall Street Journal. The LogicalFormBankPT is composed of MRS representations of each sentence’s semantic relation...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Lince - Conversor para a Nova Ortografia

Lince is a multi-platform stand-alone application that updates the textual contents of documents in a range of popular formats to the spelling prescribed by the 1990 Portuguese language reform. It works with both previously existing Portuguese language orthographic standards (1943, previously val...

Resource Type: Tool / Service
Language: Portuguese
LEX-MWE-PT: Word Combination in Portuguese Language

This lexicon includes multiword expressions (MWE) of European Portuguese extracted from a balanced 50.8M-word written corpus, a subcorpus of the Reference Corpus of Contemporary Portuguese (CRPC). This corpus covers different genres, being mainly constituted by journalistic texts (59%), but it a...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese
LexMan-POSTagger

LexMan-POSTagger is a morphological analyser that assigns morphological tags to every word. Size: verb lemmas: 12,995; noun and adjective lemmas: 38,180; adverb lemmas: 7,250; compound words: 35,201. Language: Portuguese.

Resource Type: Tool / Service
Language: Portuguese
LexMan-ChunkerTokenizer

LexMan-ChunkerTokenizer is a tokenizer and sentence splitter. It marks sentence boundaries and multi-word boundaries. Size: verb lemmas: 12,995; noun and adjective lemmas: 38,180; adverb lemmas: 7,250; compound words: 35,201. Language: Portuguese.

Resource Type: Tool / Service
Language: Portuguese
Lexicon of discourse markers for European Portuguese

The lexicon of discourse markers for European Portuguese contains 252 pairs of discourse marker/rhetorical sense. The lexicon covers conjunctions, prepositions, adverbs, adverbial phrases and alternative lexicalizations with a connective function, as in the PDTB (Prasad et al., 2008; Prasad et al...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese
Lemmatizer for Portuguese

Based on the MXPOST part-of-speech tagger and the UNITEX dictionaries for Portuguese, this tool produces the lemmas of the words in a text stored as a plain text file. The source code is also provided.

Resource Type: Tool / Service
Language: Portuguese
Inquirições reais

Royal inquiries of 1258 (primarily published in the Portugaliae Monumenta Historica).

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Hesita-POS

Hesita-POS is an annotated corpus of TV news.

Resource Type: Corpus
Media Type: Text
Language: Portuguese
