Latvian Language Learner corpus

The LaVA corpus contains 1015 essays (190k tokens and 790k characters, excluding whitespace) written by foreign students at Latvian higher education institutions who are learning Latvian as a foreign language in their first or second semester and are reaching the A1 (possibly A2) proficiency level.

Corpus developers: Ilze Auziņa, Inga Kaija, Kristīne Levāne-Petrova, Kristīne Pokratniece and Roberts Darģis.


Funding: "Development of Learner Corpus of Latvian: methods, tools and applications" (Project No. lzp-2018/1-0527).