#LancsBox: Lancaster University corpus toolbox
What is #LancsBox?
#LancsBox is a new-generation software package for the analysis of language data and corpora developed at Lancaster University.
Main features of #LancsBox:
- Works with your own data or existing corpora. Currently distributed with:
■ Brown, LCMC, LOB, Newsbooks, Shakespeare and VULC
- Can be used by linguists, language teachers, historians, sociologists, educators and anyone interested in language.
- Visualizes language data.
- Analyses data in any language. Morphological annotation available for:
■ Arabic, Catalan, Czech, Dutch, English, Finnish, German, Italian, Latin, Mongolian, Portuguese, Romanian, Russian, Slovak, Spanish and Swahili.
- Automatically annotates data.
- Works with any major operating system (Windows, Mac, Linux).
Acknowledgements: The development of #LancsBox was supported by ESRC grants ES/K002155/1 and EP/P001559/1. #LancsBox uses multiple third-party components: Apache Tika, Gluegen, Groovy, JOGL, minlog, QuestDB, RSyntaxTextArea, smallseg and TreeTagger.
How to cite #LancsBox?
Brezina, V., McEnery, T., & Wattam, S. (2015). Collocations in context: A new perspective on collocation networks. International Journal of Corpus Linguistics, 20(2), 139-173.
How does #LancsBox work?
#LancsBox is very easy to use: load data and start the analysis straightaway. Below is a brief overview of the
Developers: Vaclav Brezina (lead developer), Tony McEnery and Matt Timperley.
Teaching materials coordinator: Dana Gablasova
Visual design assistance: Rachael Hill
Student helpers: Samuel Armstrong, David Ellis (both 2017 SPRINT internship)
Former collaborators: Steve Wattam (GraphColl, v. 1)
We are looking for collaborators to help us develop #LancsBox support in