Sample Studies
Selected publications
Atzenhofer-Baumgartner, Florian, & Tamás Kovács (2024). Is text normalization relevant for classifying medieval charters? In International Conference on Theory and Practice of Digital Libraries (pp. 125-132). Cham: Springer Nature Switzerland. https://arxiv.org/abs/2408.16446
Abstract
This study examines the impact of historical text normalization on the classification of medieval charters, specifically focusing on document dating and locating. Using a data set of Middle High German charters from a digital archive, we evaluate various classifiers, including traditional and transformer-based models, with and without normalization. Our results indicate that the given normalization minimally improves locating tasks but reduces accuracy for dating, implying that original texts contain crucial features that normalization may obscure. We find that support vector machines and gradient boosting outperform other models, questioning the efficiency of transformers for this use case. Results suggest a selective approach to historical text normalization, emphasizing the significance of preserving some textual characteristics that are critical for classification tasks in document analysis.
Baumgarten, Stefan (2019). Sprachliche und diskursive Tendenzen einer eliminatorischen Narration: Historische und englische Übersetzungen von Mein Kampf. In O. Plöckinger (Ed.), Sprache zwischen Politik, Ideologie und Geschichtsschreibung: Analysen historischer und aktueller Übersetzungen von ‘Mein Kampf’ (pp. 51–88). Franz Steiner. https://doi.org/10.25162/9783515123808
Gerhalter, Katharina (2020). Paradigmas y polifuncionalidad: Estudio diacrónico de «preciso»/«precisamente», «justo»/«justamente», «exacto»/«exactamente» y «cabal»/«cabalmente». Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110669831
Abstract
This study presents a corpus-based semasiological and onomasiological analysis of the paradigm of adjectives and adverbs of accuracy. Looking specifically at justo / justamente, cabal / cabalmente, preciso / precisamente, and exacto / exactamente, it is above all concerned with the development of three pragmatic-discursive functions: focalization, affirmation, and reformulation.
Heidinger, Steffen, & Richard Huyghe (2023). Semantic roles and the causative-anticausative alternation: Evidence from French change-of-state verbs. Linguistics, 62(1), 159–202. https://doi.org/10.1515/ling-2021-0207
Abstract
Change-of-state verbs are heterogeneous with respect to their occurrence in the causative-anticausative alternation. While some of them are never used as anticausatives (e.g., destroy), others seem to largely favor the anticausative form (e.g., wither). On the basis of corpus data and statistical analysis for French change-of-state verbs, we show that there is a relationship between the anticausative use of a verb and the semantic role of its transitive subject: The more frequently the transitive subject of a verb is a cause (as opposed to agent or instrument), the more frequently the verb is used as anticausative (as opposed to transitive causative). In addition to presenting this novel empirical finding, we propose an account for the observed correlation: Depending on their semantic role, causers have different likelihoods to end up in the subject position of a transitive causative sentence, and the likelihood is lower for causes than for agents. Different factors are considered responsible for the observed correlation, including the asymmetry between agents and causes concerning salience as event participants, topic-worthiness, and the possibility of being expressed as anticausative adjuncts.
Iordăchioaia, Gianina, Lonneke van der Plas, & Glorianna Jagfeld (2020). Compositionality in English deverbal compounds: The role of the head. In S. Schulte im Walde & E. Smolka (Eds.), The role of constituents in multiword expressions (pp. 61–106). Language Science Press. https://doi.org/10.5281/zenodo.3598558
Abstract
This paper is concerned with the compositionality of deverbal compounds such as budget assessment in English. We present an interdisciplinary study on how the morphosyntactic properties of the deverbal noun head (e.g., assessment) can predict the interpretation of the compound, as mediated by the syntactic-semantic relationship between the non-head (e.g., budget) and the head. We start with Grimshaw’s (1990) observation that deverbal nouns are ambiguous between compositionally interpreted argument structure nominals, which inherit verbal structure and realize arguments (e.g., the assessment of the budget by the government), and more lexicalized result nominals, which preserve no verbal properties or arguments (e.g., The assessment is on the table.). Our hypothesis is that deverbal compounds with argument structure nominal heads are fully compositional and, in our system, more easily predictable than those headed by result nominals, since their compositional make-up triggers an (unambiguous) object interpretation of the non-heads. Linguistic evidence gathered from corpora and human annotations, and evaluated with machine learning techniques supports this hypothesis. At the same time, it raises interesting discussion points on how different properties of the head contribute to the interpretation of the deverbal compound.
Kaltenböck, Gunther (2021). Funny you should say that: On the use of semi-insubordination in English. Constructions and Frames, 13(1), 126–159. https://doi.org/10.1075/cf.00049.kal
Abstract
This paper investigates the formal and functional properties of so-called semi-insubordination (SIS), i.e. complex sentences with an ‘incomplete’ matrix clause (e.g. Funny that you should say that), on the basis of corpus data. It is shown that SIS differs in its function from the structurally related constructions it-extraposition and exclamatives, exhibiting its own functional profile: viz. expressing a subjectivizing speaker evaluation which is non-exclamative, deictically anchored, and relates to a non-presupposed proposition. Given these functional idiosyncrasies it is argued that SIS is best analysed as a construction in its own right (in terms of Construction Grammar) rather than simply an incomplete elliptical structure.
Kelterer, Anneliese, Margaret Zellers, & Barbara Schuppler (2023). (Dis)agreement and preference structure are reflected in matching along distinct acoustic-prosodic features. In Proc. Interspeech 2023 (pp. 4768–4772). https://doi.org/10.21437/Interspeech.2023-1538
Abstract
This paper presents an investigation of acoustic-prosodic alignment in conversational speech and its relationship to functional inter-speaker alignment. While most previous research studied global alignment over whole conversations between strangers, the focus of this paper is on alignment between friends, partners and colleagues as a more local phenomenon related to affiliation and preference structure. Based on 359 turn-pairs from assessment sequences, we analyzed three prosodic matching features between adjacent turns in logistic and linear regression models. We found that disagreements tend to be produced with less F0 span matching than agreements and with less F0 median matching in some parts of the conversation. Preferred responses were more likely to be marked by higher F0 median matching than dispreferred responses. These results indicate that different aspects of functional inter-speaker alignment are reflected in matching along distinct acoustic-prosodic features.
Lackner, Andrea (2022). Das Österreichische Gebärdensprachkorpus im Entstehen [Creating the Austrian Sign Language Corpus]. In C. Posch, K. Irschara, & G. Rampl (Eds.), Wort - Satz - Korpus: Multimethodische digitale Forschung in der Linguistik (pp. 193–235). Innsbruck University Press. https://doi.org/10.15203/99106-061-1
Monakhov, Sergei, & Holger Diessel (2024). Complex words as shortest paths in the network of lexical knowledge. Cognitive Science, 48(11), e70005. https://doi.org/10.1111/cogs.70005
Abstract
Lexical models diverge on the question of how to represent complex words. Under the morpheme-based approach, each morpheme is treated as a separate unit, while under the word-based approach, morphological structure is derived from complex words. In this paper, we propose a new computational model of morphology that is based on graph theory and is intended to elaborate the word-based network approach. Specifically, we use a key concept of network science, the notion of shortest path, to investigate how complex words are learned, stored, and processed. The notion of shortest path refers to the task of finding the shortest or most optimal path connecting two non-adjacent nodes in a network. Building on this notion, the current study shows (i) that new complex words can be segmented into morphemes through the shortest path analysis; (ii) that attested English words tend to represent the shortest paths in the morphological network; and (iii) that novel (unattested) words receive higher acceptability ratings in experiments when they are formed along established optimal paths. The model's performance is tested in two experiments with human participants as well as against the behavioral data from the English Lexicon Project. We interpret our empirical results from the perspective of a usage-based model of grammar and argue that network science provides a powerful tool for analyzing language structure.
Scherr, Elisabeth, & Arne Ziegler (2023). Vertikale Variation im Artikelsystem im süd- und mittelbairischen Übergangsgebiet. Zeitschrift für Dialektologie und Linguistik, 90(2), 210–243. https://doi.org/10.25162/zdl-2023-0007
Abstract
Reduced forms of the definite articles are one central characteristic of Bavarian dialects, with the deletion of the word-initial dental (d-) being observed especially in the South- and Central Bavarian transition area. The present study shows, however, that in this dialect region the reduced forms also appear frequently in formal communication settings where they are functionally motivated and could be a symptom of language change in progress.