
Lauren Fonteyn
Linguist, Illustrator
I am a cognitive linguist interested in using quantitative methods to get a firmer understanding of variation and change in language and culture. Some of my most recent work focusses on grammatical (morpho-syntactic) variation at the level of individual language users: Do native speakers of the same language abide by the same grammatical rules, or can the grammatical knowledge of an individual be unique? Do individuals follow different grammatical rules in different communicative settings? And is it possible to find evidence for ‘grammatical lifespan change’?
My main expertise lies in working with (historical) linguistic/text corpora, and recently I have started exploring how machine learning models can help quantitatively investigate language variation and change, including the perhaps-not-so-easily-quantifiable aspects of language. You will find my publications on these subjects under Publications (both academic and popularizing – scroll down for the latter!). For any questions, you may contact me at my current work e-mail or at my private e-mail.
In my current position at Leiden University, I spend a significant portion of my time teaching in the English Language and Culture and Linguistics programmes, and I am also part of the Young Academy Leiden. In what little time there is left, I occasionally make illustrations: some to liven up articles, and some just for fun. You’ll find some examples, including links to the accompanying articles, under Illustrations.
Short CV
- 2018 - Now : Leiden University, Assistant Professor in English Linguistics
- 2016 - 2018 : University of Manchester, Assistant Professor in English Linguistics
- 2012 - 2016 : University of Leuven (KU Leuven), PhD in Linguistics
Featured Publications
- On the probability and direction of morphosyntactic lifespan change: Language Variation and Change, 2022
- Individuality in syntactic variation: An investigation of the seventeenth-century gerund alternation: Cognitive Linguistics, 2020
- Adapting vs. Pre-training Language Models for Historical Languages: Journal of Data Mining and Digital Humanities, 2022
- Varying Abstractions: a conceptual vs. distributional view on prepositional polysemy: Glossa, 2021
See here for a complete list of publications.
Current Projects
- MacBERTh and GysBERT, Language Models for Historical English and Dutch (PDI-SSH 2020): In this project, we created two language models (more specifically, BERT models) pre-trained on historical textual material (date range: 1450-1950). Researchers who interpret and analyse historical textual material are well aware that languages are subject to change over time, and that concepts and discourses of class, gender, norms and prestige function differently in different time periods. It is therefore important that textual/linguistic material from the past is not interpreted from a present-day point of view, which is why NLP models pre-trained on present-day language data are less than ideal candidates for the job. This is where historical models like MacBERTh and GysBERT can help. As of 2021, MacBERTh (pre-trained on historical English, 1450-1950) has been published in the Hugging Face repository. As of 2022, GysBERT (pre-trained on historical Dutch, 1500-1950) has been published in the Hugging Face repository.
- Complexity in Complementation. Understanding Long-term Change in Verb Complementation in terms of Inter- and Intra-individual Variation: This project investigates the impact of interindividual differences in cognitive representations on long-term population-level language change. Focussing on the system of English verb complementation, the study seeks to offer a unified cognitive and sociolinguistic account of syntactic variation and change that has so far been missing. In particular, it investigates (i) idiolectal variation in the constraints on alternating complementation patterns, and (ii) the interaction between the linguistic behaviour of individuals and the (changing) distribution of grammatical variants at the population level across different time stages. The research is carried out by modelling the observed patterns of stability or diffusion in terms of the weakening and strengthening of grammatical and cognitive constraints within a socially homogeneous group of individuals.