Fenno-Ugrica – Kielipankki downloadable version

IPR holder: National Library of Finland

Please read the end-user licence of the corpus in the file LICENCE.txt.

More information on the corpus is available in META-SHARE:
http://urn.fi/urn:nbn:fi:lb-201902261

The languages in the corpus and their three-letter ISO 639-3 codes are the following:
– Eastern Mari: mhr
– Erzya: myv
– Ingrian: izh
– Khanty: kca
– Mansi: mns
– Moksha: mdf
– Nenets: yrk
– Selkup: sel
– Veps: vep
– Western Mari: mrj

The data is available in VRT format (VeRticalized Text):
– Each token on its own line; texts, paragraphs and sentences marked with XML-style tags with attributes. Please note that each text element corresponds to a single page of the original work, not a single work.
– The input format for Corpus Workbench (and Korp). The data corresponds to the Korp version of Fenno-Ugrica.
– A zip file containing a single VRT file for each language. In file and directory names, the languages are denoted by their three-letter ISO 639-3 codes.

For further information, please contact kielipankki@csc.fi.
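
Example: reading the VRT data

The VRT layout described above (one token per line, structures marked with XML-style tags) can be read with a few lines of Python. The following is a minimal sketch, not an official tool. It assumes that the structural tags are named "text" and "sentence", that token lines are tab-separated with the word form in the first column, and that the characters <, > and & inside tokens are written as XML entities. The attribute names in the sample data (title, page) are hypothetical; check the actual VRT files for the real tag and attribute names.

```python
import re
from xml.sax.saxutils import unescape

# Matches an opening or closing structural tag such as <text a="b"> or </text>.
TAG_RE = re.compile(r'<(/?)([\w.-]+)\s*(.*?)/?>$')
# Matches name="value" pairs inside an opening tag.
ATTR_RE = re.compile(r'([\w.-]+)="([^"]*)"')


def iter_sentences(lines, sentence_tag="sentence", text_tag="text"):
    """Yield (text_attributes, word_forms) for each sentence in a VRT stream.

    The tag names are assumptions; pass other names if the files differ.
    """
    text_attrs = {}
    tokens = None  # None means "not inside a sentence"
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("<!--"):
            continue  # skip empty lines and comment lines
        match = TAG_RE.match(line) if line.startswith("<") else None
        if match:
            closing, name, rest = match.groups()
            if name == text_tag and not closing:
                # One text element corresponds to one page of the original work.
                text_attrs = {
                    key: unescape(value, {"&quot;": '"'})
                    for key, value in ATTR_RE.findall(rest)
                }
            elif name == sentence_tag and not closing:
                tokens = []
            elif name == sentence_tag and closing and tokens is not None:
                yield dict(text_attrs), tokens
                tokens = None
            continue
        if tokens is not None:
            # Assumed: the first tab-separated column is the word form.
            tokens.append(unescape(line.split("\t")[0]))


# Hypothetical sample data, made up for illustration only.
sample = [
    '<text title="Example" page="1">',
    '<paragraph id="1">',
    '<sentence id="1">',
    'Foo\tfoo',
    'bar\tbar',
    '.\t.',
    '</sentence>',
    '</paragraph>',
    '</text>',
]

for attrs, words in iter_sentences(sample):
    print(attrs, " ".join(words))
```

To read an extracted VRT file instead of the sample list, pass an open file object to iter_sentences, for example open(path, encoding="utf-8"), where path is the location of the file; UTF-8 encoding is assumed here.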