CORPUS LINGUISTICS
Academic year 2019/2020 - 1st year
Credits: 6
SSD (scientific-disciplinary sector): L-LIN/12 - Language and Translation - English Language
Course organization: 150 hours of total workload, 114 of individual study, 36 of other activities
Semester: 1st
Teaching method
Lectures and seminar activities in English
Prerequisites
Knowledge of English at B2 level; basic notions of general linguistics, philosophy of language, or semiotics
Attendance
Compulsory
Course content
After an introduction to statistics for a quantitative approach to linguistic analysis, the course will introduce, through repeated hands-on sessions, the use of software for: a) lexical analysis, through the concepts of collocation, keywords and lexical density; b) a lexico-grammatical approach to linguistic description; c) the analysis of variation in terms of linguistic register; d) sociolinguistic and stylistic studies; e) diachronic comparisons.
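As an informal illustration of the lexical measures named above, the following Python sketch computes a relative frequency and a lexical density for a toy sentence. It is not taken from the course materials: the function names, the sample sentence and the very small function-word list are assumptions made for this example, and a real analysis would normally rely on part-of-speech tagging or a fuller word list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch); real studies use POS tagging or a much fuller list.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "was", "to", "it"}


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequency(tokens, word, basis=1_000_000):
    """Frequency of `word` normalized to `basis` tokens (per million by default)."""
    return Counter(tokens)[word] / len(tokens) * basis


def lexical_density(tokens):
    """Share of tokens that are not in the function-word list."""
    content_tokens = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content_tokens) / len(tokens)


# Invented sample text, used only to exercise the functions.
sample = "The cat sat on the mat and the dog sat in the garden"
tokens = tokenize(sample)

print(len(tokens))                                # number of tokens
print(relative_frequency(tokens, "the", 1_000))   # "the" per 1,000 tokens
print(lexical_density(tokens))                    # proportion of content words
```

Both functions assume a non-empty token list. Normalizing to a common basis is what makes frequencies comparable across corpora of different sizes.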
Reference texts
Brezina, Vaclav 2018 Statistics in Corpus Linguistics. A Practical Guide, Cambridge University Press, pp. 296.
Course schedule
No. | Topics | Text references |
---|---|---|
1 | Lecture 1 introduces basic principles of statistical thinking that are necessary for the informed application of statistical procedures to corpus data. It explains the role of statistics in scientific research in general and corpus linguistics in particular. Topics such as the creation of corpora, types of research design, basic statistical terminology, as well as data exploration and visualization will be discussed. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 1 |
2 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 1 |
3 | Lecture 2 introduces simple statistical measures that help describe the occurrence of words in texts and corpora. It focuses on word frequencies and distributions, both of which are crucial for a meaningful description of patterns of language use. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 2 |
4 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 2 |
5 | Lecture 3 explores meanings of words in context, which is an area important to both linguistic and social analyses. Topics discussed are collocations, keywords and manual coding of concordance lines; these play a key role both in the study of semantics (‘dictionary’ meanings of words) and in discourse analysis. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 3 |
6 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 3 |
7 | Lecture 4 focuses on the statistical analysis of lexico-grammatical features in language such as articles, passive constructions or modal expressions. The lecture shows how lexico-grammatical variation can be summarised using cross-tabulation and what statistical measures can be computed based on cross-tabulation summary tables. These measures range from simple percentages to the chi-squared test and logistic regression. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 4 |
8 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 4 |
9 | Lecture 5 discusses a group of methods that can be used for the simultaneous analysis of a large number of linguistic variables that characterise different texts and registers. First, we look at the relationship between two linguistic variables by means of correlation. Both Pearson’s and the non-parametric Spearman’s correlations are explained. Next, we explore the classification of words, texts, registers etc. using the technique of hierarchical agglomerative clustering. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 5 |
10 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 5 |
11 | Lecture 6 discusses different statistical procedures available for the analysis of stylistic and sociolinguistic variation in corpora. It reviews different approaches to variation, pointing out the common connection to the notion of ‘style’ understood as a particular way of speaking and using language. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 6 |
12 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 6 |
13 | Lecture 7 discusses statistical procedures that can be used to explore historical or diachronic data. First, specific features of diachronic studies are outlined and techniques that provide effective visualizations of diachronic change are introduced. Second, the lecture focuses on the statistical comparison of two time periods using a procedure called bootstrapping. Finally, the diachronic application of cluster analysis is discussed. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 7 |
14 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 7 |
15 | Lecture 8 brings together the statistical knowledge discussed in this course. It then discusses the important topic of replication and introduces a statistical technique called meta-analysis, which provides a statistical (quantitative) summary of studies dealing with the same research question(s) or topic. Finally, common effect size measures are reviewed and a guide for their interpretation is provided. | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 8 |
16 | Computer lab session with exercises and Lancaster Stats Tools online | Brezina, Vaclav 2018 Statistics in Corpus Linguistics, chapter 8 |
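To give a rough idea of the kind of calculation practised in the lab sessions, the chi-squared test on a cross-tabulation (Lecture 4) can be sketched in plain Python as below. This is an illustrative sketch only, not the procedure used by Lancaster Stats Tools online: the function name is hypothetical and the counts are invented for the example (passive versus active clauses in two imaginary registers).

```python
def chi_squared(table):
    """Pearson's chi-squared statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Invented counts, for illustration only.
# Rows: register A, register B. Columns: passive clauses, active clauses.
observed = [
    [30, 70],
    [50, 50],
]

chi2 = chi_squared(observed)

# Critical value of the chi-squared distribution for 1 degree of freedom
# at the 0.05 significance level (a 2x2 table has 1 degree of freedom).
CRITICAL_VALUE_DF1 = 3.841

print(round(chi2, 3))
print("significant at 0.05" if chi2 > CRITICAL_VALUE_DF1 else "not significant at 0.05")
```

The sketch assumes every expected count is non-zero and large enough for the test to be appropriate. A significant result only says that the two variables are associated; an effect size measure is still needed to judge how strong the association is.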
Assessment
Assessment methods
- Oral exam: students will present and discuss the results of an analysis based on the corpus they have built. The level of detail of the analysis and its results will be assessed, also in light of the choices made when building the corpus.
- Practical test: students will build a corpus to query with the tools learned during the lectures. The text selection criteria, the aims of the analysis and, where applicable, the detail and accuracy of the text annotation will be assessed.
Examples of frequent questions and/or exercises
There are no frequently asked questions