The Great Noun List
A list of frequently-used common nouns in the English language, delivered in a plain .txt format.
What we have here is a list of the most frequently-used common nouns (i.e. not proper nouns) in English, the largest plain list of its kind freely available on this great internet (currently 6,775 nouns). It covers a wide range of categories, including clothing, raw materials, professions, transportation, abstract concepts, matter, food, education, and various and sundry objects.
There are many sources on the internet for collecting tens of thousands of nouns (word corpuses, dictionary APIs, text mining Wikipedia), but my noun list is different because it contains only very frequently-used ones, and it has been checked line-by-line with actual human eyes to make sure that the words in it are ones that I’ve seen before. This makes the list more practical for use in software that will process normal written English. The list is an alphabetised text file with each word on a new line.
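Since the file is just one word per line, reading it into a program takes only a few lines. Here is a minimal Python sketch; the function names and the sample words are made up for illustration, and the path is whatever you saved the file as.

```python
from pathlib import Path


def parse_noun_list(text: str) -> list[str]:
    """Turn the raw file contents into a list of nouns, one per line."""
    # Strip surrounding whitespace and skip blank lines (e.g. a trailing newline).
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_noun_list(path: str) -> list[str]:
    """Read a one-word-per-line text file from disk (the path is up to the caller)."""
    return parse_noun_list(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Illustrative sample only; these words are not taken from the real list.
    sample = "apple\nbicycle\ncarpenter\n"
    nouns = parse_noun_list(sample)

    # A set gives fast membership checks when scanning running text.
    noun_set = set(nouns)
    print("bicycle" in noun_set)
```

Keeping the parsed words in a set is a design choice for lookups; keep the list as well if you need the alphabetical order.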
There are no usage restrictions; I dedicate this list to the public domain. You don’t need to credit me or link to this page, although it would be nice if you did, so that others can find and use the list too.
Yes, you can distribute the list as part of your program or project. If you want people to download the plain list for themselves, it would be best to send them to this page since I add new words all the time.
Whatevs, really! Use it to compile flashcards to teach English, fashion some sort of board game with it, or use it in software you’re programming, as I did for some auto-linking wiki software and for my random noun generator.
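As one example of the software use case, a random noun picker can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code behind the generator mentioned above; `random_nouns` and the sample words are hypothetical.

```python
import random


def random_nouns(nouns, count=1, rng=None):
    """Pick `count` distinct nouns at random from `nouns`."""
    # Accept an optional random.Random instance so results can be made repeatable.
    rng = rng or random.Random()
    # sample() draws without replacement and raises ValueError if count > len(nouns).
    return rng.sample(nouns, count)


if __name__ == "__main__":
    # Illustrative sample only; in practice, load the words from the text file.
    words = ["apple", "bicycle", "carpenter", "daylight", "engine"]
    print(random_nouns(words, count=2))
```

Passing a seeded `random.Random` is handy for tests or for flashcard decks that should come out the same each time.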
The first 4,609 words of the list came from all over the web:
I documented the R analysis pipelines for harvesting MASC and accessing the Oxford API here:
I continue to come back and add new nouns as I think of them.
Oh goodness no.
The Second Edition of the 20-volume Oxford English Dictionary, published in 1989, contains full entries for 171,476 words in current use, and 47,156 obsolete words. To this may be added around 9,500 derivative words included as subentries. Over half of these words are nouns, about a quarter adjectives, and about a seventh verbs; the rest is made up of exclamations, conjunctions, prepositions, suffixes, etc. And these figures don’t take account of entries with senses for different word classes (such as noun and adjective).
Leland R. Beaumont made The Verbinator by retrieving verbs from the Open American National Corpus (a very big and detailed word list) and then reducing that massive dataset using my list of commonly-used nouns. Leland provides a PDF file full of verbs that you can use.
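The exact reduction step isn't described here. A minimal Python sketch, assuming it means discarding candidate words that also appear in the noun list, might look like the following; `drop_known_nouns` and the sample words are hypothetical.

```python
def drop_known_nouns(candidates, nouns):
    """Return the candidate words that do NOT appear in the noun list."""
    # Compare case-insensitively; the set makes each lookup cheap.
    noun_set = {n.lower() for n in nouns}
    return [w for w in candidates if w.lower() not in noun_set]


if __name__ == "__main__":
    # Illustrative data only, not taken from the corpus or the real list.
    candidates = ["Table", "ponder", "apple", "meander"]
    nouns = ["apple", "table"]
    print(drop_known_nouns(candidates, nouns))
```

Bear in mind that many English words are both nouns and verbs ("run", "walk"), so a plain filter like this would throw those out too; a real pipeline would likely need part-of-speech information as well.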