1 parent 4338d14 commit 964ca09
Showing 45 changed files with 638 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "1000 Novels Corpus", | ||
"URL": "http://hdl.handle.net/11321/312", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus is available for download from CLARIN-PL.", | ||
"Languages": ["pol"], | ||
"License": "CC-BY 4.0", | ||
"Size": ["1000 texts"], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.handle.net/11321/312" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "1000PLUS Novels Corpus (1.0)", | ||
"URL": "http://hdl.handle.net/11321/699", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus is available for download from CLARIN-PL.", | ||
"Languages": ["pol"], | ||
"License": "CC-BY-SA 3.0", | ||
"Size": ["1000 texts", "17,352,826 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.handle.net/11321/699" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Electronic corpus of 15th-century Castilian cancionero manuscripts", | ||
"URL": "http://hdl.handle.net/11372/LRT-873", | ||
"Family": "Literary corpora", | ||
"Description": "This is a lyric corpus of 15th-century cancioneros.\nThe corpus is available for online browsing through an external interface.", | ||
"Languages": ["spa"], | ||
"License": "", | ||
"Size": [], | ||
"Annotation": [], | ||
"Access": { | ||
"Browse": "http://cancionerovirtual.liv.ac.uk/" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Late 19th- and Early 20th-Century Polish Novels", | ||
"URL": "http://hdl.handle.net/11321/57", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus is available for download from CLARIN-PL.", | ||
"Languages": ["pol"], | ||
"License": "CC-BY 3.0", | ||
"Size": [], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.handle.net/11321/57" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "aformes", | ||
"URL": "http://hdl.grnet.gr/11500/UOA-0000-0000-2575-3", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains fiction texts from a journal of undergraduate creative writing at the Faculty of English Language and Literature.\nThe corpus is available for download from clarin:el.", | ||
"Languages": ["ell","eng"], | ||
"License": "CC-BY-NC", | ||
"Size": ["376,250 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.grnet.gr/11500/UOA-0000-0000-2575-3" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Complete Corpus of Anglo-Saxon Poetry", | ||
"URL": "http://hdl.handle.net/11372/LRT-867", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus is available for online browsing through an external interface.", | ||
"Languages": ["ang"], | ||
"License": "", | ||
"Size": [], | ||
"Annotation": ["none"], | ||
"Access": { | ||
"Browse": "https://www.sacred-texts.com/neu/ascp/" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Anthology of Middle English texts / Santiago Gonzalez y Fernandez-Corugedo", | ||
"URL": "http://hdl.handle.net/20.500.14106/1398", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains literary texts from 1100 to 1400.\nThe corpus is available for download from the Oxford Text Archive.", | ||
"Languages": ["enm", "heb"], | ||
"License": "Oxford Text Archive Licence", | ||
"Size": ["4,000 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.handle.net/20.500.14106/1398" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{ | ||
"Name": "Bonnier novels I (1976/77) (2017-10-04)", | ||
"URL": "http://hdl.handle.net/10794/115", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus presents 69 Bonnier novels from 1976-77.\nThe corpus is available for download from SWE-CLARIN and for online browsing through Korp.", | ||
"Languages": ["swe"], | ||
"License": "CC-BY 4.0", | ||
"Size": ["6,578,675 tokens", "462,625 sentences"], | ||
"Annotation": ["sentence scrambling"], | ||
"Access": { | ||
"Browse": "https://spraakbanken.gu.se/korp/#corpus=romi", | ||
"Download": "http://hdl.handle.net/10794/115" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{ | ||
"Name": "Bonnier novels II (1980/81) (2017-03-17)", | ||
"URL": "http://hdl.handle.net/10794/116", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus presents 60 Bonnier novels from 1980-81.\nThe corpus is available for download from SWE-CLARIN and for online browsing through Korp.", | ||
"Languages": ["swe"], | ||
"License": "CC-BY 4.0", | ||
"Size": ["4,304,271 tokens", "298,361 sentences"], | ||
"Annotation": ["sentence scrambling"], | ||
"Access": { | ||
"Browse": "https://spraakbanken.gu.se/korp/#corpus=romii", | ||
"Download": "http://hdl.handle.net/10794/116" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{ | ||
"Name": "Classics of English and American Literature in Finnish (CEAL)", | ||
"URL": "http://urn.fi/urn:nbn:fi:lb-2016110901", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains Finnish translations of the following three texts: Jane Austen: Ylpeys ja ennakkoluulo (Pride and Prejudice), translated by Kersti Juva, Teos 2013; Henry James: Washingtonin aukio (Washington Square), translated by Kersti Juva, Otava 2003; Charles Dickens: Kolea talo (Bleak House), translated by Kersti Juva, Tammi, 2006.\nThe corpus is available for online browsing through Korp in two versions - Version 1 (Sentences and Paragraphs in the Original Order) and Version 2 (Scrambled Paragraphs).", | ||
"Languages": ["fin"], | ||
"License": "CLARIN RES + NC", | ||
"Size": ["3 novels", "484,010 tokens"], | ||
"Annotation": ["MSD-tagged", "syntactically parsed"], | ||
"Access": { | ||
"Browse (original)": "http://urn.fi/urn:nbn:fi:lb-2018011201", | ||
"Browse (scrambled)": "http://urn.fi/urn:nbn:fi:lb-2018011202" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Corpus of Finnish Literary Classics", | ||
"URL": "http://hdl.handle.net/11372/LRT-773", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains works by established Finnish fiction writers from the 1880s to the 1930s. There are different types of prose and plays, as well as lyrics and aphorisms.\nThis corpus is available for online browsing through an external interface.", | ||
"Languages": ["fin"], | ||
"License": "", | ||
"Size": ["1,456,658 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Browse": "http://kaino.kotus.fi/korpus/klassikot/meta/klassikot_coll_rdf.xml" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Classics Library of the National Library of Finland - Kielipankki version", | ||
"URL": "http://urn.fi/urn:nbn:fi:lb-2018051701", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains literary texts from 1549 to 1944.\nThe corpus is available for online browsing through FIN-CLARIN.", | ||
"Languages": ["fin","swe"], | ||
"License": "CC-BY", | ||
"Size": [], | ||
"Annotation": [], | ||
"Access": { | ||
"Browse": "https://www.kielipankki.fi/corpora/nlfcl-fi-authors/" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Corpus of Early Literary Finnish", | ||
"URL": "http://hdl.handle.net/11372/LRT-772", | ||
"Family": "Literary corpora", | ||
"Description": "The corpus of Early Modern Finnish contains Finnish-language works in various fields published during the 19th century, annual issues of the oldest periodicals and newspapers, almanac and decree texts, and some dictionaries. An effort has been made to include the earliest, most important and (based on the number of reprints, for example) most widely distributed works. The selection of publications has also been made with a view to achieving the widest possible thematic coverage, although more works originally written in Finnish have been included than translations. These have been alphabetised by the name of their translator, seasonal publications by their title, and other works by their author. The Finnish translations of unknown authors are in the Anonymous folder, the texts of unknown authors in the Other folder. The materials cover the period between Old and Modern Finnish and a little beyond. The earliest book dates from 1809, the latest from 1891, but there are texts of the regulations right up to the end of the century. However, most of the material is from 1810-1880. This later material can also be found in the Classics corpus.", | ||
"Languages": ["fin"], | ||
"License": "", | ||
"Size": [], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Corpus of Estonian fiction", | ||
"URL": "http://hdl.handle.net/10.15155/1-00-0000-0000-0000-0007EL", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains texts from 1990 onwards.\nThe corpus is available for download from CELR.", | ||
"Languages": ["est"], | ||
"License": "CLARIN ACA - NC", | ||
"Size": ["5,768,504 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.handle.net/10.15155/1-00-0000-0000-0000-0007EL" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Estonian Runic Songs' Database", | ||
"URL": "http://hdl.handle.net/10.15155/9-00-0000-0000-0000-0008FL", | ||
"Family": "Literary corpora", | ||
"Description": "These are the oldest text recordings of Estonian runic songs (the text recordings were created in the 19th century and in the first decades of the 20th century). In addition to the runic songs, the database also has songs of transitional form and end-rhymed songs (about 6000).\nThe corpus is available for online browsing through an external interface.", | ||
"Languages": ["est"], | ||
"License": "CLARIN ACA", | ||
"Size": ["92,134 texts"], | ||
"Annotation": [], | ||
"Access": { | ||
"Browse": "http://www.folklore.ee/regilaul/" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Electronic text corpus of Sumerian literature (ETCSL)", | ||
"URL": "http://hdl.handle.net/11372/LRT-874", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus presents a selection of nearly 400 literary compositions recorded on sources which come from ancient Mesopotamia and date to the late third and early second millennia BCE.\nThe corpus is available for online browsing through an external interface.", | ||
"Languages": ["sux"], | ||
"License": "", | ||
"Size": ["400 literary compositions"], | ||
"Annotation": [], | ||
"Access": { | ||
"Browse": "http://etcsl.orinst.ox.ac.uk/" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Finnish Folk Poetry", | ||
"URL": "http://urn.fi/urn:nbn:fi:lb-2014052712", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains poems from 1564 to 1939.\nThe corpus is available for online browsing through Korp.", | ||
"Languages": ["fin", "krl", "lud", "lat", "swe", "olo", "izh", "vot"], | ||
"License": "CC-BY-NC", | ||
"Size": ["7.1 million words"], | ||
"Annotation": ["unannotated"], | ||
"Access": { | ||
"Browse": "http://urn.fi/urn:nbn:fi:lb-2014052711" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "The Finnish Gutenberg Corpus", | ||
"URL": "http://urn.fi/urn:nbn:fi:lb-2014100301", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains Finnish books made available by the Gutenberg project. The texts have not been linguistically annotated.\nThe corpus is available for online browsing through Korp.", | ||
"Languages": ["fin"], | ||
"License": "CC-BY", | ||
"Size": ["34,487,420 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Browse": "http://urn.fi/urn:nbn:fi:lb-2014102101" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Classics of Finnish Literature, Kielipankki Version", | ||
"URL": "http://urn.fi/urn:nbn:fi:lb-20140730186", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains prose fiction, plays, poetry and aphorisms (some written originally in Swedish) by established Finnish authors, published from the 1880s to 1949.\nThe corpus is available for online browsing through Korp.", | ||
"Languages": ["fin"], | ||
"License": "EUPL v.1.1 SA", | ||
"Size": ["1,500,000 words"], | ||
"Annotation": ["syntactically parsed (TDT alpha)", "named entities (FiNER)", "MSD-tagged", "lemmatized"], | ||
"Access": { | ||
"Browse": "http://urn.fi/urn:nbn:fi:lb-2016081601" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Greek Medieval Texts", | ||
"URL": "http://hdl.grnet.gr/11500/AEGEAN-0000-0000-251D-7", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains medieval written material covering the period from the 4th to the 16th century A.D. The texts can be classified into the following categories: religious, poetical-literary, political-historical, hymns, epigrams.\nThe corpus is available for download from clarin:el.", | ||
"Languages": ["ell","grc"], | ||
"License": "CC-BY-NC", | ||
"Size": ["3,419,553 words"], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "http://hdl.grnet.gr/11500/AEGEAN-0000-0000-251D-7" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Cultural Thesaurus of the Greek Language", | ||
"URL": "http://hdl.grnet.gr/11500/ATHENA-0000-0000-23E3-8", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains prose, poetry, drama, and essays from the 18th century onwards.\nThe corpus is available for online browsing through a dedicated webpage.", | ||
"Languages": ["ell"], | ||
"License": "proprietary", | ||
"Size": ["1 million tokens"], | ||
"Annotation": ["semantic"], | ||
"Access": { | ||
"Browse": "http://www.potheg.gr/Intro.aspx?lan=2" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
{ | ||
"Name": "Johannes V. Jensen Corpus", | ||
"URL": "http://hdl.handle.net/20.500.12115/20", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus presents the collected works of the Danish author Johannes V. Jensen.\nThe corpus is available for download from CLARIN-DK and for online browsing through a dedicated concordancer.", | ||
"Languages": ["dan"], | ||
"License": "CC BY-SA 4.0", | ||
"Size": ["1,760,093 words", "8,489 pages"], | ||
"Annotation": ["unannotated"], | ||
"Access": { | ||
"Browse": "http://johannesvjensen.dk/jensenonline/liste-over-vaerker/", | ||
"Download": "http://hdl.handle.net/20.500.12115/20" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
{ | ||
"Name": "Corpus of longer narrative Slovenian prose KDSP 1.0", | ||
"URL": "http://hdl.handle.net/11356/1823", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains 262 texts of longer older Slovenian narrative prose. The texts were published between 1836 and 1918 and are at least 20,000 words long.\nThe texts have bibliographical metadata (author name, title, year of publication, length) and are classified according to the decade of publication, length, text type, text subtype, theme, and level of canonicity (texts by authors included in school textbooks after 1980 and/or in the Collected writings of Slovenian poets and writers are marked with a high degree of canonicity). The metadata about the authors of the texts include their gender, occupation, and years of birth and death. The corpus texts come from three digital sources, and each text is marked for its source. They are <a href=\"https://sl.wikisource.org/wiki/\">Wikisource</a> (145 texts), the <a href=\"https://github.com/COST-ELTeC/ELTeC-slv\">ELTeC corpus</a> (96 texts), and the <a href=\"https://www.dlib.si/\">dLib digital library</a> (21 texts). The corpus is provided in two variants, one containing running text and the other with added linguistic analyses. These comprise tokens, sentences, lemmas, MULTEXT-East morphosyntactic descriptions and Universal Dependencies morphological features. The linguistic annotation was performed with the <a href=\"https://github.com/clarinsi/classla\">CLASSLA program</a>. The source format of the corpus is TEI/XML, with two derived formats also available: one is plain text, and the other is vertical files, as used by concordancers such as CWB.\nThe corpus is available for download from CLARIN.SI as well as through the noSketchEngine and KonText concordancers.", | ||
"Languages": ["slv"], | ||
"License": "CC-BY 4.0", | ||
"Size": ["262 texts", "11 million words", "14 million tokens"], | ||
"Annotation": ["MSD-tagged (MULTEXT-East & UD)", "lemmatised", "annotated with author and text metadata"], | ||
"Access": { | ||
"Browse (noSketchEngine)": "https://www.clarin.si/ske/#dashboard?corpname=kdsp", | ||
"Browse (KonText)": "https://www.clarin.si/kontext/query?corpname=kdsp", | ||
"Download": "http://hdl.handle.net/11356/1823" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Aleksis Kivi Corpus (SKS)", | ||
"URL": "http://urn.fi/urn:nbn:fi:lb-201405274", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains all the known letters, manuscripts and published works by Finnish author Aleksis Kivi (1834–1872). Most of the texts were written in Finnish while some of the letters and manuscripts are in Swedish. The time coverage of the texts: 1855-1871.\nThe corpus is available for online browsing through Korp.", | ||
"Languages": ["fin","swe"], | ||
"License": "CC-BY-NC", | ||
"Size": ["413,735 words"], | ||
"Annotation": ["MSD-tagged", "syntactically parsed"], | ||
"Access": { | ||
"Browse": "http://urn.fi/urn:nbn:fi:lb-2016121604" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "Latvian literature classics", | ||
"URL": "http://hdl.handle.net/11372/LRT-184", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus presents classics from the end of the 19th century to the beginning of the 20th century.", | ||
"Languages": ["lav"], | ||
"License": "", | ||
"Size": [], | ||
"Annotation": [], | ||
"Access": { | ||
"Download": "" | ||
}, | ||
"Publication": "" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"Name": "LT Corpus", | ||
"URL": "http://hdl.handle.net/21.11115/0000-000B-D33D-3", | ||
"Family": "Literary corpora", | ||
"Description": "This corpus contains 70 copyright-free classics (61 from Portugal and 9 from Brazil) published before 1940.\nThe corpus is available for download from PORTULAN.", | ||
"Languages": ["por"], | ||
"License": "CLARIN RES", | ||
"Size": ["1,781,083 words"], | ||
"Annotation": ["PoS-tagged", "lemmatized"], | ||
"Access": { | ||
"Download": "http://hdl.handle.net/21.11115/0000-000B-D33D-3" | ||
}, | ||
"Publication": "" | ||
} |
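
Several records in this commit share the same ten keys, and a missing comma inside `Access` or a stray character in `Languages` is easy to overlook in review. A minimal sketch of a pre-merge check follows. It assumes the ten keys seen in these records are all required and that language entries are three-letter lowercase codes; neither is a documented schema, and `check_record` and `REQUIRED_FIELDS` are hypothetical names introduced only for illustration.

```python
import json

# Keys observed in the records of this commit. Treating all of them as
# required is an assumption, not a documented schema.
REQUIRED_FIELDS = (
    "Name", "URL", "Family", "Description", "Languages",
    "License", "Size", "Annotation", "Access", "Publication",
)


def check_record(text):
    """Return a list of problems found in one JSON record (empty if none)."""
    try:
        record = json.loads(text)
    except json.JSONDecodeError as err:
        # Syntax errors such as a missing comma end up here.
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]

    if not isinstance(record, dict):
        return ["top-level value is not an object"]

    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in record]

    languages = record.get("Languages", [])
    if not isinstance(languages, list):
        problems.append("Languages is not a list")
    else:
        for code in languages:
            # Assumed convention: three lowercase letters, as in "pol" or "fin".
            ok = (isinstance(code, str) and len(code) == 3
                  and code.isalpha() and code.islower())
            if not ok:
                problems.append(f"unexpected language code: {code!r}")

    return problems
```

Running `check_record` over the text of each added file and failing the review when any list is non-empty would catch syntax slips before they are merged. It does not verify that a language code matches the corpus it describes.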