ELFA corpus -- the Corpus of English as a Lingua Franca in Academic Settings
University of Helsinki

Suggested citation:
ELFA 2008. The Corpus of English as a Lingua Franca in Academic Settings. Director: Anna Mauranen. http://www.helsinki.fi/elfa/elfacorpus. (date of last access).

Contents of the material available for download:

1) ELFA_txt.zip

This compressed file contains:
a) the ELFA corpus in plain text format (165 .txt files, ISO 8859-1 encoding);
b) a spreadsheet index of the corpus files (in .xls and .csv formats); and
c) the transcription conventions of the corpus, which also explain how the files are named (.pdf).

The text corpus files can be viewed in a word processor or in concordancers like AntConc or WordSmith. The ELFA text markup is pseudo-XML -- not "well-formed" XML in the technical sense, but all metadata is enclosed in angle brackets. This minimal markup is designed for readability. When you want to get a word list or ngrams/collocates in a concordancer, just choose the setting where tags are ignored; the file headers will also be omitted from these searches.

2) ELFA_xml.zip

This compressed file contains:
a) the ELFA corpus in TEI P5-compliant XML (166 .xml files including the corpus header, UTF-8 encoding); and
b) a spreadsheet index of the corpus files (in .xls and .csv formats).

The corpus header file (ELFA_corpus_header.xml) contains full documentation of the corpus -- most of which is also found on the ELFA project website (http://www.helsinki.fi/elfa/elfacorpus.html). If you're not already familiar with XML, the .txt version is recommended.

3) ELFA_wav_audio

This folder contains the audio files on which the transcriptions are based (165 .wav files). They are available for individual download, and all files have been anonymized. All participant names that are spoken aloud and represented with tags in the transcription have been replaced with a tone.
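
For users who prefer a script to a concordancer, the "ignore tags" setting described under 1) can be approximated in a few lines of Python. This is only a minimal sketch under stated assumptions: it treats everything between a "<" and the next ">" as markup, it uses a simple regular expression as its definition of a word, and the tag names in the sample string are made up for illustration (see the transcription conventions .pdf for the actual tag set). The file name in the usage note is likewise a placeholder.

```python
import re
from collections import Counter

# Assumption: any span from "<" to the next ">" is markup, not transcribed speech.
TAG_RE = re.compile(r"<[^<>]*>")

# Assumption: a word is a run of letters, optionally joined by apostrophes or hyphens.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def read_transcript(path):
    """Read one corpus .txt file; the plain-text version is ISO 8859-1 encoded."""
    with open(path, encoding="iso-8859-1") as handle:
        return handle.read()


def strip_markup(text):
    """Replace every angle-bracketed span with a space so adjacent words stay apart."""
    return TAG_RE.sub(" ", text)


def word_counts(text):
    """Return lowercased word frequencies with all markup ignored."""
    words = WORD_RE.findall(strip_markup(text))
    return Counter(word.lower() for word in words)


if __name__ == "__main__":
    # Hypothetical sample line; the tag names here are illustrative only.
    sample = "<S1> well <COUGH> i think it's er <NAME S2> fine </S1>"
    for word, count in word_counts(sample).most_common():
        print(word, count)
```

To run this on a real file, pass its contents to the counter, e.g. word_counts(read_transcript("some_corpus_file.txt")). Because the file headers are also enclosed in angle brackets, they drop out of the counts in the same way as the inline tags.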

License information:

* The txt and xml files are available under the CC BY License (https://creativecommons.org/licenses/by/4.0/)
* The audio data is made available via CLARIN RES (http://urn.fi/urn:nbn:fi:lb-201403262) and requires permission to download: https://lbr.csc.fi/web/guest/catalogue?domain=LBR&resource=urn:nbn:fi:lb-201403262&target=application

Ray Carey 15.5.2015
Updated 3.5.2016 Martin Matthiesen