#LancsBox: Lancaster University corpus toolbox

User guide [pdf]


Overview and data

#LancsBox is a new-generation corpus analysis tool. Version 3 has been designed primarily for 64-bit operating systems (Windows 64-bit, Mac and Linux), on which the tool performs best. #LancsBox also runs on older 32-bit systems, but with somewhat limited performance. Downloading and running it takes three simple steps: 1) download, 2) extract and 3) run.

Win 10: Windows 10 users need to allow #LancsBox to run for the first time after download.
■ Click on 'More info'.
■ Then click on 'Run anyway'.

Mac: Mac users can run #LancsBox as a package.
■ Download #LancsBox for Mac.
■ Double-click on the downloaded file to extract the LancsBox app.
■ Open in Finder and copy the 'LancsBox' app to the Applications folder. N.B. It is really important to run LancsBox from Applications!
■ Double-click on 'LancsBox' to run it.
■ If your computer complains about the source of the downloaded package, go to Apple icon > System Preferences > Security and Privacy and allow the app to run.

Starting with #LancsBox [pdf]

[download video]

Load data into #LancsBox [pdf]

Data can be loaded and imported into #LancsBox on the ‘Corpora’ tab. This tab opens automatically when you run #LancsBox. #LancsBox works with corpora in different formats (.txt, .xml, .doc, .docx, .pdf, .odt, .xls, .xlsx and many others) and with wordlists (.csv). There are two options for obtaining corpora and wordlists: i) load your own data and ii) download corpora and wordlists that are distributed with #LancsBox.

[download video]

Search in #LancsBox [pdf]

Throughout the tool, #LancsBox offers powerful searches at different levels of corpus annotation using i) simple searches, ii) wildcard searches, iii) smart searches and iv) regex searches.
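
The difference between simple, wildcard and regex searches can be illustrated outside the tool. The following is a minimal Python sketch, not #LancsBox's own implementation: the function names and the token list are hypothetical, and the wildcard syntax is that of Python's `fnmatch` module, which may differ from the syntax #LancsBox accepts. Smart searches are specific to #LancsBox and are not covered here.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical tokenised corpus used only for illustration.
tokens = ["run", "runs", "running", "ran", "runner", "rung"]

def simple_search(tokens, term):
    """Exact match: the token must equal the search term."""
    return [t for t in tokens if t == term]

def wildcard_search(tokens, pattern):
    """Wildcard match: '*' stands for any sequence of characters."""
    return [t for t in tokens if fnmatchcase(t, pattern)]

def regex_search(tokens, pattern):
    """Regex match: the whole token must match the regular expression."""
    rx = re.compile(pattern)
    return [t for t in tokens if rx.fullmatch(t)]

simple_hits = simple_search(tokens, "run")
wildcard_hits = wildcard_search(tokens, "run*")
regex_hits = regex_search(tokens, r"r[au]n(s|ning)?")
```

Here the simple search returns only `run`, the wildcard search returns every token that starts with `run`, and the regex search picks out the verb forms `run`, `runs`, `running` and `ran` while excluding `runner` and `rung`.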

[download video]

KWIC [pdf]

The KWIC tool generates a list of all instances of a search term in a corpus in the form of a concordance. It can be used, for example, to:

  • Find the frequency of a word or phrase in a corpus.
  • Find frequencies of different word classes such as nouns, verbs, adjectives.
  • Find complex linguistic structures such as passives, split infinitives, etc. using ‘smart searches’.
  • Sort, filter and randomise concordance lines.
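
To make the idea of a concordance concrete, here is a minimal Python sketch of a key-word-in-context listing. It is an illustration only and not how #LancsBox is implemented: the `kwic` function, the simple `\w+` tokenisation and the sample text are all assumptions made for this example.

```python
import re

def kwic(text, term, window=3):
    """Return (left context, node, right context) for each hit of `term`."""
    tokens = re.findall(r"\w+", text.lower())
    node = term.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

# Hypothetical sample text.
sample = "The cat sat on the mat. The dog chased the cat."
hits = kwic(sample, "cat", window=2)

# The number of concordance lines is the frequency of the search term.
frequency = len(hits)

# Sorting on the right context mimics sorting concordance lines.
sorted_hits = sorted(hits, key=lambda line: line[2])
```

Each concordance line keeps the search term in the middle with a fixed number of tokens on either side, which is what makes sorting and filtering by context possible.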

[download video]

Whelk [pdf]

The Whelk tool provides information about how the search term is distributed across corpus files. It can be used, for example, to:

  • Find absolute and relative frequencies of the search term in corpus files.
  • Filter the results according to different criteria.
  • Sort files according to absolute and relative frequencies of the search term.
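
The kind of table described above can be sketched in a few lines of Python. This is a hedged illustration rather than the tool's actual code: the `distribution` function, the file names and the normalisation basis of 10,000 tokens are assumptions chosen for the example.

```python
import re

def distribution(files, term, per=10_000):
    """Absolute and relative frequency of `term` in each file.

    `files` maps a file name to its text; relative frequency is
    normalised to `per` tokens (an assumed basis for this sketch).
    """
    node = term.lower()
    rows = []
    for name, text in files.items():
        tokens = re.findall(r"\w+", text.lower())
        absolute = tokens.count(node)
        relative = absolute / len(tokens) * per if tokens else 0.0
        rows.append((name, len(tokens), absolute, relative))
    # Sort files by relative frequency, highest first.
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Hypothetical two-file corpus.
corpus = {
    "a.txt": "the cat sat on the mat",
    "b.txt": "a dog and a cat and a bird and a fish",
}
rows = distribution(corpus, "cat")
```

Both files contain the search term once, but the shorter file ranks first because its relative frequency is higher. This is why relative frequencies are the safer basis for comparing files of different sizes.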

[download video]

Words [pdf]

The Words tool allows in-depth analysis of frequencies of types, lemmas and POS categories as well as comparison of corpora using the keywords technique. It can be used, for example, to:

  • Compute frequency and dispersion measures for types, lemmas and POS tags.
  • Visualize frequency and dispersion in corpora.
  • Compare corpora using the keyword technique.
  • Visualize keywords.
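
As an illustration of the keyword technique, the sketch below compares a focus corpus with a reference corpus using the 'simple maths' ratio of smoothed relative frequencies. This is one common keyness statistic and is not necessarily the one #LancsBox applies in a given configuration; the function names, the constant `k` and the toy corpora are assumptions for this example.

```python
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def keywords(focus, reference, k=100.0, per=1_000_000):
    """Rank words in `focus` by (rel_focus + k) / (rel_reference + k)."""
    f = Counter(tokenize(focus))
    r = Counter(tokenize(reference))
    nf, nr = sum(f.values()), sum(r.values())
    scores = {}
    for word, count in f.items():
        rel_f = count / nf * per
        rel_r = r[word] / nr * per  # Counter gives 0 for unseen words
        scores[word] = (rel_f + k) / (rel_r + k)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy corpora (both must be non-empty).
focus_text = "corpus corpus corpus the tool"
reference_text = "the the the the tool"
ranked = keywords(focus_text, reference_text)
```

Words with a score well above 1 are relatively more frequent in the focus corpus (positive keywords), while scores well below 1 indicate words that are relatively more frequent in the reference corpus.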

[download video]

[download video]

GraphColl [pdf]

The GraphColl tool identifies collocations and displays them in a table and as a collocation graph or network. It can be used, for example, to:

  • Find the collocates of a word or phrase.
  • Find colligations (co-occurrence of grammatical categories).
  • Visualise collocations and colligations.
  • Identify shared collocates of words or phrases.
  • Summarise discourse in terms of its ‘aboutness’.
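
The idea of identifying collocates can be sketched as follows. This minimal Python example counts the words that occur within a fixed span around a node word and scores them with a basic mutual information (MI) formula, log2(observed / expected), without any correction for window size. It is an assumption-laden illustration, not the tool's implementation: the `collocates` function, the span and the sample sentence are hypothetical, and the association measures available in #LancsBox may be defined differently.

```python
import math
import re
from collections import Counter

def collocates(text, node, span=2):
    """Return (collocate, observed frequency, MI score), best first."""
    tokens = re.findall(r"\w+", text.lower())
    n = len(tokens)
    freq = Counter(tokens)
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Collect `span` tokens to the left and right of the node.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            observed.update(window)
    rows = []
    for word, o11 in observed.items():
        # Expected co-occurrence if node and collocate were independent.
        expected = freq[node] * freq[word] / n
        rows.append((word, o11, math.log2(o11 / expected)))
    return sorted(rows, key=lambda row: row[2], reverse=True)

# Hypothetical sample sentence.
sample = "strong tea and strong coffee but weak tea"
table = collocates(sample, "strong", span=1)
```

The resulting rows correspond to a collocation table; a collocation graph would draw the node word in the centre and link it to each collocate, with the score determining how close the collocate sits to the node.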

[download video]

Text [pdf]

The Text tool provides in-depth insight into the context in which a word or phrase is used. It can be used, for example, to:

  • View a search term in full context.
  • Preview a text.
  • Preview a corpus as a run-on text.
  • Check different levels of annotation of a text/corpus.

[download video]

#LancsBox feedback

If you have a question about #LancsBox functionality, please read the manual or watch the video tutorials to see if you can find the answer there. If you are experiencing problems with #LancsBox, try to find the answer in the troubleshooting chart.

Feedback form