Lancaster Stats Tools online

Materials


1
Introduction Statistics meets corpus linguistics

Introduction to corpus statistics

This lecture is an introduction to corpus statistics. It takes you through the basic concepts and principles of statistical thinking, including descriptive and inferential statistics, types of frequency, dispersion, statistical tests, effect sizes, etc.

The video can be downloaded here.

Chapter 1: exercises [pdf] and answers to exercises [pdf]

Data visualization: exercises [pdf]

Concordance for the lemma GO [csv] [xlsx]

'The' in BE06 [csv] [xlsx]

Passives in BE06 - genres [csv] [xlsx]

'The' and 'I' in BNC64 [csv] [xlsx]

'Go'/'travel' in BNC [csv] [xlsx]


'Lovely' in BNC64: Male and female speech [csv] [xlsx]

'Lovely' in BNC64: Age [csv] [xlsx]


Modals in the Brown family - frequencies [csv] [xlsx]

Modals in the Brown family - concordances [csv] [xlsx]

Modals in the Brown family - summary [csv] [xlsx]


Data visualization [xlsx]

Lecture 1: slides [pptx] and lesson plan [pdf]

2
Vocabulary Frequency and dispersion

Videos will be here

Chapter 2: exercises [pdf] and answers to exercises [pdf]

BNC frequency list (BNCweb) [txt]

DP calculations [xlsx]

Lecture 2: slides [pptx] and lesson plan [pdf]

3
Semantics and discourse Collocations, keywords and lockwords

Semantics and discourse

This lecture discusses different approaches to analysing semantics and discourse. These include cross-tabulation, the chi-squared test and logistic regression.

The video can be downloaded here.

Chapter 3: exercises [pdf] and answers to exercises [pdf]

Inter-rater agreement (exercise 9) [csv] [xlsx]

Inter-rater agreement (example) [csv] [xlsx]

Guardian comments [txt]

Daily Mail comments [txt]

Lecture 3: slides [pptx] and lesson plan [pdf]

4
Lexico-grammar From simple counts to complex models

Lexicogrammar

This lecture discusses different approaches to analysing lexicogrammar. These include cross-tabulation, the chi-squared test and logistic regression.
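
As a rough illustration of the chi-squared test applied to a cross-tabulation, the sketch below computes the Pearson chi-squared statistic by hand in Python. The counts are hypothetical and are not taken from the course datasets; the function name `chi_squared` is also an assumption made for this example, and no continuity correction is applied.

```python
def chi_squared(table):
    """Return (statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = two genres, columns = "which" vs "that"
counts = [[30, 70], [50, 50]]
stat, dof = chi_squared(counts)
print(f"chi-squared = {stat:.2f}, df = {dof}")
```

The statistic would then be compared against the chi-squared distribution with the given degrees of freedom to obtain a p-value, which a statistics package can do directly.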

The video can be downloaded here.

Chapter 4: exercises [pdf] and answers to exercises [pdf]

The vs. a(n) dataset [csv] [xlsx]

Modals dataset [csv] [xlsx]

Modals dataset with genre coding [csv] [xlsx]

Modals dataset with variety coding [csv] [xlsx]

Cross-tab of modals of obligation [csv] [xlsx]

Cross-tab of which and that [csv] [xlsx]

Which and that dataset for logistic regression [csv] [xlsx]

Lecture 4: slides [pptx] and lesson plan [pdf]

5
Register variation Correlation, clusters and factors

Register variation

This lecture focuses on techniques that help us uncover relationships in complex linguistic data. These techniques include correlation, cluster analysis and multidimensional analysis.
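
To give a flavour of the simplest of these techniques, the sketch below computes Pearson's correlation coefficient in plain Python. The two variables and their values are hypothetical (imagined per-text frequencies of two linguistic features) and do not come from the course datasets; `pearson_r` is a name chosen for this example.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical per-text frequencies of two features
feature_a = [1, 2, 3, 4, 5]
feature_b = [2, 4, 5, 4, 5]
r = pearson_r(feature_a, feature_b)
print(f"r = {r:.3f}")
```

Values of r close to 1 or -1 indicate a strong linear relationship between the two features; values near 0 indicate little or none.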

The video can be downloaded here.

Chapter 5: exercises [pdf] and answers to exercises [pdf]

Correlations [csv] [xlsx]

Clusters [csv] [xlsx]

MD BE06 (British English) [csv] [xlsx]

MD AmE06 (American English) [csv] [xlsx]

New Zealand English - ICE-NZ [xlsx]

Lecture 5: slides [pptx] and lesson plan [pdf]

6
Sociolinguistics Individual and social variation

Videos will be here

Chapter 6: exercises [pdf] and answers to exercises [pdf]

T-test or Mann-Whitney U test [csv] [xlsx]

ANOVA or Kruskal-Wallis [csv] [xlsx]

Correspondence analysis [csv] [xlsx]

Mixed effect model [csv] [xlsx]

White House Press Conferences [csv] [xlsx]

Exercise 6 [xlsx]

Lecture 6: slides [pptx] and lesson plan [pdf]

7
Change over time Working with diachronic data

Change over time

This lecture introduces statistical measures that are appropriate for dealing with historical data. These include the bootstrap test, peaks & troughs and Usage Fluctuation Analysis.

The video can be downloaded here.

Chapter 7: exercises [pdf] and answers to exercises [pdf]

Modals in BrE 1931 - 2006 [csv] [xlsx]

Bootstrap test data [csv] [xlsx]

VNC cluster data [csv] [xlsx]

Peaks & troughs data [csv] [xlsx]

UFA data: 'whore' in the 17th century [zip]

Colours - dataset [xlsx]

Lecture 7: slides [pptx] and lesson plan [pdf]

8
Bringing everything together Ten principles of statistical thinking, meta-analysis and effect sizes

Bringing everything together

This lecture is the last in the stats series. It brings together the information learnt in this course and focuses on key principles behind effect sizes and meta-analysis in corpus linguistics.

The video can be downloaded here.

Chapter 8: exercises [pdf] and answers to exercises [pdf]

Meta-analysis [csv] [xlsx]

Lecture 8: slides [pptx] and lesson plan [pdf]