About
This site gathers slides and case studies for the Master in Corpus Linguistics course at KU Leuven, Fall of 2022 (taught by Mariana Montes).
Schedule
Class number | Date | Topic | Material on website | Assignment |
---|---|---|---|---|
1 | 29/09 | Introduction | Slides | Install software and packages |
2 | 06/10 | Reading corpora and using Git | Slides | Corpus description |
3 | 13/10 | Contingency tables | Slides | Cross-references |
4 | 20/10 | Association measures | Slides | Tibble manipulation |
5 | 27/10 | Linear regression | Slides | The Grammar of Graphics |
6 | 03/11 | Logistic regression | Slides | Markdown |
7 | 10/11 | Correspondence analysis | Slides | MCLM Tutorials |
8 | 17/11 | Factor analysis | Slides | Paper proposal |
9 | 24/11 | Collocations and keywords: example | Study | Citations |
10 | 01/12 | Retrieval and analysis of variants in alternation | Study | Interlinear glosses |
11 | 08/12 | Case study with logistic regression | Slides and study | NA |
12 | 15/12 | Lectometry example | Study | Check out sample paper |
13 | 22/12 | Register analysis | Study | TBD |
Techniques and research questions
Technique | Corpus | RQ |
---|---|---|
Collocation analysis | Full corpus, with or without annotation | What words does a certain word co-occur with? |
Keyword analysis | Full corpus, with or without annotation | What words characterize a given text or subcorpus? |
Variationist analysis (with logistic regression) - variants | Concordance and annotation of factors | What factors influence the choice of a variant over its alternative? |
Analysis of varieties (with correspondence analysis) | Full corpus, with or without annotation | How are varieties distinguished based on the frequency of certain features (e.g. words)? |
Multidimensional register analysis (with factor analysis) | Full corpus with annotation | What dimensions underlie the features and how do they characterize registers? |
ConcAnnotator
For manual annotation, you may use the desktop tool concAnnotator. Unfortunately, although installation files exist for different platforms (to download from here), they are not certified, so your system will probably reject them. Alternatively, you can install the tool in development mode, which requires a bit more work and a Node.js installation. If you’re up to it, here are the instructions:
1. Install Node JS if you don’t have it already. You can check if you have it by typing `node -v` in a console (Windows PowerShell in Windows, Terminal in Mac).
2. Open a console and go to a directory where you want to download the code. For example, from the console type `cd`, to sit in the home directory of your computer, and then `ls` to see which files are present there.[1] You might have a “Downloads” directory. You can now do `cd Downloads` to go to that directory.
3. Clone the repository with `git clone <url>`; the url depends on whether you use SSH for authentication or not:
   - If you use SSH, type `git clone git@github.com:montesmariana/concAnnotator.git`.
   - If you don’t use SSH, type `git clone https://github.com/montesmariana/concAnnotator.git`.
4. The previous step should have created a “concAnnotator” directory. Go to it with `cd concAnnotator`.
5. Install the node dependencies with `npm install`.
6. Create your own installation with `npx electron-forge make`.
In sum:
cd
cd Downloads
git clone git@github.com:montesmariana/concAnnotator.git # or the other url
cd concAnnotator
npm install
npx electron-forge make
The last step should create an “out” folder that has two folders inside: “make” and “concannotator…” (the “…” depends on your OS). You can now go to the concAnnotator folder with your normal file explorer, go into “out” and then “concannotator…”, and you will find an application file. Clicking on it should start the program. From here on, unless I make updates to the software, you don’t need to do anything else: just click on the file to start the program.
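Before running the steps above, it may help to confirm that the tools they rely on (git for cloning, node and npm for installing and building) are available in your console. The following is a minimal sketch, not part of concAnnotator itself: the function name `missing_tools` is made up for this illustration, and the script assumes a POSIX-style shell (Terminal on Mac or Linux, or Git Bash on Windows), so it will not run as-is in Windows PowerShell.

```shell
#!/bin/sh
# Hypothetical helper (not part of concAnnotator): print the name of each
# given command that cannot be found on the PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# The installation steps use git (clone), node and npm (install and build).
missing=$(missing_tools git node npm)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

If the script reports a missing tool, install it first and then start again from the cloning step.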
Footnotes
1. `cd` is a command to change directories; `ls` shows you the contents of a directory.