This site provides the data for the Word Mapper app hosted by Quartz.


About


The original Word Mapper app lets users map the relative frequencies of the 10,000 most common words in an 8.9-billion-word corpus of 890 million geocoded Tweets collected from across the contiguous United States between 11 October 2013 and 22 November 2014. The app was created by Jack Grieve, Andrea Nini, and Diansheng Guo for the Trees and Tweets project, funded by AHRC/ESRC/JISC/IMLS as part of Digging into Data 3. The Quartz version was designed by Nikhil Sonnad and maps the 97,246 words that occur at least 500 times in the corpus.


Data

The four word-by-county regional data matrices used for Word Mapper are available for download here. The first matrix contains the relative frequencies, per billion words, of the 97,246 words measured across 3,075 counties. The next three matrices contain the corresponding Getis-Ord Gi* z-scores, calculated using three different nearest-neighbor spatial weights matrices.
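
As a rough illustration of the two kinds of values described above, the following Python sketch computes a relative frequency per billion words and a Getis-Ord Gi* z-score for a handful of toy locations. It is not the project's own code: the function names, the binary k-nearest-neighbor weights, the Euclidean distance on planar coordinates, the choice of k, and the toy data are all assumptions made for the example.

```python
import numpy as np


def relative_frequency_per_billion(counts, totals):
    """Occurrences of a word per billion words, for each location."""
    counts = np.asarray(counts, dtype=float)
    totals = np.asarray(totals, dtype=float)
    return counts / totals * 1e9


def knn_weights(coords, k):
    """Binary spatial weights: 1 for each location's k nearest neighbors.

    Assumption: Euclidean distance on planar coordinates. The location
    itself also gets weight 1, as the Gi* (star) statistic requires.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)            # exclude self from the search
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    np.fill_diagonal(w, 1.0)                  # include self for Gi*
    return w


def getis_ord_gi_star(x, w):
    """Gi* z-score for each location, given values x and weights w."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)  # population standard deviation
    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator


# Toy data (hypothetical): six locations on a line, with a word that is
# frequent in the first three and absent from the last three.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
counts = [10, 10, 10, 0, 0, 0]
totals = [1_000_000] * 6

freq = relative_frequency_per_billion(counts, totals)
z = getis_ord_gi_star(freq, knn_weights(coords, k=2))
print(z)
```

Positive z-scores mark locations surrounded by high relative frequencies and negative z-scores mark locations surrounded by low ones, which is why the smoothed maps show regional clusters more clearly than the raw frequencies.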

See the papers and talks below for more information. If you use the data in your research, please cite one or more of the papers. You can contact Jack Grieve at j.grieve1@aston.ac.uk with any questions.


Papers

Jack Grieve, Andrea Nini and Diansheng Guo. 2016. Mapping lexical innovation on American social media. In Review.

Andrea Nini, Carlo Corradini, Diansheng Guo and Jack Grieve. 2016. The application of growth curve modeling for the analysis of diachronic corpora. Forthcoming in Language Dynamics and Change.

Jack Grieve, Andrea Nini and Diansheng Guo. 2016. Analyzing lexical emergence in American English online. Forthcoming in English Language and Linguistics.

Martijn Wieling, Jack Grieve, Gosse Bouma, Josef Fruehwald, John Coleman and Mark Liberman. 2016. Variation and change in the use of hesitation markers in Germanic languages. Language Dynamics and Change 6: 199-234.

Yuan Huang, Diansheng Guo, Alice Kasakoff and Jack Grieve. 2016. Understanding US regional linguistic variation with Twitter data analysis. Computers, Environment and Urban Systems 54: 244-255.


Talks

Jack Grieve. 2016. Identifying and mapping the spread of new words. Invited presentation at BAULT 2016, University of Helsinki, December 2, 2016.

Jack Grieve. 2016. Functional variation drives regional variation. Invited presentation at University of Edinburgh, November 10, 2016.


Jack Grieve. 2016. Using big data to map language structure and use. Invited Plenary at American Association of Corpus Linguistics 2016, Ames, Iowa, September 16, 2016.

Jacopo Rocchi, Andrea Nini, David Saad, Jack Grieve. 2016. Dynamics and equilibria in Twitter: Analyzing geographical lexical spread. Presented at IT Open Research Forum Workshop, London School of Economics, May 19, 2016.

Jack Grieve, Diansheng Guo, Alice Kasakoff, Andrea Nini. 2016. Trees and Tweets: Mining Billions to Understand Regional Linguistic Variation and Human Migration. Presented at Digging into Data Round 3 Conference, Glasgow, January 28, 2016.

Jack Grieve, Andrea Nini, Diansheng Guo, Alice Kasakoff. Using Social Media to Map Double Modals in Modern American English. Presented at New Ways of Analyzing Variation 44, University of Toronto, October 22-25, 2015.
          
Jack Grieve, Andrea Nini, Diansheng Guo, Alice Kasakoff. Big Data for the Analysis of Language Variation and Change. Presented at From Data to Evidence: Big Data, Rich Data, Uncharted Data, University of Helsinki, October 19, 2015.

Jack Grieve, Andrea Nini, Diansheng Guo, Alice Kasakoff. Recent Changes in Word Formation Strategies in American Social Media. Presented at Corpus Linguistics 2015, Lancaster University, July 22, 2015.

Jack Grieve. Tracking the Emergence of New Words Across Time and Space. Invited Presentation at the Digital Science Speaker Series, London, May 26, 2015.

Jack Grieve. Corpus Linguistics for Regional Dialectology. Invited Presentation at UCREL CRS, Lancaster University, UK, May 14, 2015.

Jack Grieve. Big Data for Lexical Research. Invited Presentation at JISC's Digifest 2015, as part of the Big Data and the Dark Arts panel, Birmingham, UK, March 10, 2015.

Jack Grieve. Tracking the Emergence of New Words Across Time and Space. Invited Presentation at the Digital History Seminar Series, Institute of Historical Research, School of Advanced Study, University of London, February 24, 2015.

Jack Grieve. Mapping Lexical Spread in American English. Presented at the American Dialect Society Annual Meeting, Portland, Oregon, January 8, 2015.

Jack Grieve, Diansheng Guo, Alice Kasakoff, and Andrea Nini. Big-data Dialectology: Analyzing Lexical Spread in a Multi-billion Word Corpus of American English. Presented at American Association of Corpus Linguistics 2014, Flagstaff, Arizona, September 28, 2014.

Jack Grieve. Spatial and geostatistical analysis for regional dialectology. Presented at Methods in Dialectology XV, Groningen, Netherlands, August 11, 2014.


Selected Media

How to Swear Across America. Oxford Dictionaries. December 6, 2016.

10,000 Words Ranked According to their Trumpiness. Quartz. November 17, 2016.

Redefining the Modern Dictionary. Time. May 12, 2016.

Down with vs. Up For: We Have Maps. Language Log. March 29, 2016.

Does South Carolina Really Care that Donald Trump Swears. The Washington Post. February 14, 2016.

Totally Wordmapper. Language Log. January 29, 2016.

Geolexicography. Language Log. January 27, 2016.

These Maps Reveal How Your State Curses. Maxim. August 17, 2015.

The Week in Data. FiveThirtyEight. August 2, 2015.

How fleek, faved and famo conquered America. Daily Mail. July 30, 2015.

How brand-new words are spreading across America. Quartz. July 29, 2015.

Mapping swearwords throughout the U.S. Washington Post. July 29, 2015.

Where the curses are. Language Log. July 27, 2015.

Tracking the emergence of new words across time and space. Macmillan Dictionary Blog. July 21, 2015.

Want to know how to curse like a proper American? Have a look at these maps. The Guardian. July 17, 2015.

Which Curse Words are Popular in Your State? Huffington Post. July 17, 2015.

Pluto, Feathered Dinos, And Other Amazing Images Of The Week. Popular Science. July 17, 2015.

Laten we vloeken in stijl. De Morgen. July 23, 2015.

Un linguiste a cartographié les insultes les plus utilisées aux États-Unis. Slate France. July 17, 2015.

Do You Live in a Fuck State or a Shit State? Mother Jones. July 17, 2015.

Do You Live in a "Bitch" or a "Fuck" State? American Curses, Mapped. Gawker. July 16, 2015.

Language Quiz: Are You On Fleek. New York Times. February 22, 2015.

16 Maps that Blew our Mind in 2014. Hotpads. December 30, 2014.

How your dude is another man's buddy. Daily Mail. December 28, 2014.

The heart of dudeness. Big Think: Strange Maps. December 24, 2014.

The dude map: How Americans refer to their bros. Quartz. December 23, 2014.

Things That Make You Go "Um". The Atlantic. November 17, 2014.

Um, here’s an, uh, map that shows where Americans use “um” vs. “uh”. Quartz. September 15, 2014.

UM / UH geography. Language Log. August 13, 2014.

Biggest ever linguistic survey on Twitter could find the next 'selfie' or 'twerk'. Phys.org. March 4, 2014.

Linguistic researchers begin hunt for the next 'Selfie'. The Daily Telegraph. March 3, 2014.