British National Corpus 2014

A new resource for research and teaching on the contemporary English language


Welcome to the information and data release website for the BNC2014.

About the corpus

The British National Corpus 2014 (BNC2014) is a large collection of samples of contemporary British English, gathered from a range of real-life contexts. The BNC2014, which contains millions of words of spoken and written English, is being compiled by Lancaster University and Cambridge University Press as a new resource for research and teaching on contemporary British English. It is the successor to the original British National Corpus, which was compiled in the early 1990s. By comparing the two corpora, researchers will be able to shed light on how British English may have changed over the last two decades.

About the Spoken BNC2014

The 11.5-million-word spoken component of the BNC2014 contains transcripts of recorded conversations, gathered from members of the UK public between 2012 and 2016. The conversations were recorded in informal settings (typically at home) and took place among friends and family members. An innovative aspect of the corpus is that the speakers recorded their conversations using the built-in audio recording device in their smartphones. The corpus comprises 1,251 conversations, featuring a total of 672 speakers.

Click here for up-to-date news on the Spoken BNC2014 project.

About the Written BNC2014

The Written BNC2014 is an ongoing project. Details of progress on the written part of the BNC2014 project are available on the CASS website.

Click here for up-to-date news on the Written BNC2014 project.

This page was last modified on Thursday 18 March 2021 at 2:52 pm.