SUBTLEX-UK: A New and Improved Word Frequency Database for British English (Paper)

2013 | Quarterly Journal of Experimental Psychology | Cited by 1287
Natural Language Processing Techniques; Text Readability and Simplification; Authorship Attribution and Profiling

Details

Journal/Conference
Quarterly Journal of Experimental Psychology
Publication date
2013-10-04
Publication year
2013

Keywords

Natural Language Processing Techniques; Text Readability and Simplification; Authorship Attribution and Profiling

Abstract

We present word frequencies based on subtitles of British television programmes. We show that the SUBTLEX-UK word frequencies explain more of the variance in the lexical decision times of the British Lexicon Project than the word frequencies based on the British National Corpus and the SUBTLEX-US frequencies. In addition to the word form frequencies, we also present measures of contextual diversity, part-of-speech specific word frequencies, word frequencies in children's programmes, and word bigram frequencies, giving researchers of British English access to the full range of norms recently made available for other languages. Finally, we introduce a new measure of word frequency, the Zipf scale, which we hope will stop the current misunderstandings of the word frequency effect.
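
The abstract names the Zipf scale without giving its formula. As a minimal sketch, assuming the commonly cited definition (the base-10 logarithm of a word's frequency per billion words, equivalently log10 of frequency per million words plus 3), a Zipf value could be computed as below. The function name `zipf_value` and its parameters are hypothetical, and the sketch leaves out any smoothing for unseen words that the published norms may apply.

```python
import math


def zipf_value(count: int, corpus_tokens: int) -> float:
    """Return an unsmoothed Zipf value for a word.

    Assumption (not stated in the abstract): Zipf = log10(frequency per
    million words) + 3, i.e. log10 of frequency per billion words.
    `count` is the number of occurrences of the word and `corpus_tokens`
    is the total number of word tokens in the corpus.
    """
    if count <= 0 or corpus_tokens <= 0:
        raise ValueError("count and corpus_tokens must be positive")
    per_million = count * 1_000_000 / corpus_tokens
    return math.log10(per_million) + 3


# A word seen once per million tokens sits at Zipf 3;
# a word seen 1000 times per million tokens sits at Zipf 6.
low = zipf_value(1, 1_000_000)
high = zipf_value(1000, 1_000_000)
```

Under this assumed definition the scale is logarithmic, so each step of one Zipf unit corresponds to a tenfold change in how often a word occurs.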