Computational Analysis of Spatial and Temporal Variation in Literary Sentiment
Chloe Gjoka ’23, Olga Redko ’21, and Professor Jonathan Gordon (Computer Science)
This project tracked temporal and spatial variation in sentiment across tens of thousands of books. The research aims to highlight relationships between historical events and locations, as reflected in the sentiment expressed in the prevailing literature. To estimate the emotions expressed in each book, we processed the texts with two lexicons, both of which produced emotion intensities ranging from 0 to 1. NRC-EIL was used to gather data on eight categories of emotion, while VADER produced a single sentiment score that ranged from negative to positive. The metadata available for each book, drawn from previous URSI research, provided the year the book was published and the locations associated with the author, according to Wikidata. With this data, we calculated average sentiment scores and mapped them at four geographic levels: cities, US counties, US states, and countries. Each map showed overall sentiment for the chosen level, with interactive buttons to show each of the eight emotions as well as positivity and negativity. An optional time slider let the map show data for a chosen year. These visualizations showed somewhat stable sentiment, with notable changes during turbulent decades. We extended the investigation by calculating outliers in positive-to-negative sentiment for countries and world cities. Computed per year, these outliers were plotted as histograms or scatterplots. We calculated outliers for all of our data as well as for a subset with a stricter sample size requirement. These visualizations likewise showed similar sentiment values across locations, as there were few outliers. In future work, we may analyze literature in other languages to explore differences in sentiment among language groups.
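As a rough illustration of the averaging and outlier steps described above, the following Python sketch groups per-book sentiment scores by location and year, averages them, and flags locations whose yearly average falls outside the interquartile range fences. The abstract does not state which outlier criterion was used, so the 1.5 × IQR rule here is an assumption; the record layout, the function names, and every value in `records` are hypothetical and made up for illustration.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Hypothetical records: (location, publication year, sentiment score).
# These values are invented for illustration and are not project data.
records = [
    ("Location A", 1900, 0.08), ("Location A", 1900, 0.12),
    ("Location B", 1900, 0.12),
    ("Location C", 1900, 0.14),
    ("Location D", 1900, 0.15),
    ("Location E", 1900, 0.16),
    ("Location F", 1900, 0.18),
    ("Location G", 1900, 0.20),
    ("Location H", 1900, -0.90),
]


def average_by_location_year(rows):
    """Return the mean sentiment score for each (location, year) pair."""
    groups = defaultdict(list)
    for location, year, score in rows:
        groups[(location, year)].append(score)
    return {key: mean(scores) for key, scores in groups.items()}


def iqr_outliers(averages, year, k=1.5):
    """Flag locations whose average for `year` lies outside the IQR fences.

    Assumption: the 1.5 * IQR rule stands in for whatever criterion
    the project actually used.
    """
    values = {loc: avg for (loc, yr), avg in averages.items() if yr == year}
    if len(values) < 4:
        # Too few locations to estimate quartiles meaningfully.
        return {}
    q1, _, q3 = quantiles(list(values.values()), n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {loc: avg for loc, avg in values.items() if avg < low or avg > high}


averages = average_by_location_year(records)
outliers = iqr_outliers(averages, 1900)
print(sorted(outliers))
```

Raising the minimum number of locations, or requiring a minimum number of books per location before averaging, would be one way to approximate a stricter sample.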