If I Can’t Analyze My Own Sentiments, How Can a Machine?

My final project relies heavily on the study of the “comments section” attached to the Instagram postings of each of Douglas Coupland’s “retooled” Slogans (2020-Present). Why? I’ve made a small claim in previous posts / pages on this site that a Slogan (2020-Present) on Instagram must be viewed in conjunction with the metadata attached to its posting, data which includes: comments, post caption, date of posting, location of posting, and number of likes. The Slogan (2020-Present) on Instagram is thus the whole post and, in my view, should be studied as a sort of secure communal square. As an illustration, consider a man standing on a busy street next to a large, signed and dated billboard he created with a slogan plastered onto it. The man has a jar of markers on a stool next to the sign, and folks begin to pick up markers and write notes onto the billboard. At some point, someone walks up and takes a photo of the billboard. For the purposes of this study, I would essentially be studying that resulting photograph.

I mentioned Sebastian Veg’s (2016) statement that: “the exchange of slogans can be viewed as a constitutive type of ‘communicative action’ (pg 691)” on my webpage. I am concerned with uncovering what the community of Douglas Coupland commenters is acting on. To do this, I have been exploring the methodologies of Topic Modeling and Sentiment Analysis.

Topic modeling is a tool that uses an algorithm to look at the most frequently used meaningful words in a corpus and then organize them into groupings, or “topics,” based on those words’ relationships to each other. Scholars Andrew Goldstone and Ted Underwood (2012) described topic models this way:

“A “topic model” assigns every word in every document to one of a given number of topics. Every document is modeled as a mixture of topics in different proportions. A topic, in turn, is a distribution of words — a model of how likely given words are to co-occur in a document. The algorithm (called LDA) knows nothing “meta” about the articles (when they were published, say), and it knows nothing about the order of words in a given document.”

Andrew Goldstone & Ted Underwood (2012)
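
To make the last point of that quotation concrete, here is a minimal Python sketch of the “bag of words” view of a document that a topic model starts from. It is not LDA itself, only the input representation; the `bag_of_words` helper and the two sample comments are invented for illustration.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, pull out word tokens, and count them.

    Word order is thrown away: only the counts survive.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Two invented comments that use the same words in a different order.
comment_a = "love this slogan so much"
comment_b = "so much love this slogan"

# The two comments are indistinguishable once reduced to word counts.
print(bag_of_words(comment_a) == bag_of_words(comment_b))
```

Because both comments reduce to the same counts, an algorithm working from this representation cannot tell them apart, which is what it means for the model to know nothing about word order.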

One of the more widely used algorithms for this purpose is Latent Dirichlet Allocation (LDA). According to Benjamin M. Schmidt (2012) in his article “Words Alone: Dismantling Topic Models in the Humanities,” this particular algorithm was first described by David Blei, a computer scientist who specializes in machine learning, in 2003. Blei and his group viewed LDA as an advance in information retrieval; Schmidt (2012) states that this is quite different from the algorithm’s use in the Digital Humanities where it is often treated as a tool for discovery. In this dichotomy, retrieval would tell a researcher what texts they might find interesting; discovery would tell a researcher something new about the text itself. 

I had a little bit of trouble understanding the difference between retrieval and discovery, but Benjamin Schmidt (2012) linked to a fun allegorical blog post titled “The LDA Buffet is Now Open; or, Latent Dirichlet Allocation for English Majors” by Matthew L. Jockers (2011) that clarified things for me… the post is about authors at a buffet picking out topics to put on their plates. Another helpful article Schmidt (2012) linked to was Andrew Goldstone and Ted Underwood’s (2012) “What Can Topic Models of PMLA Teach Us About the History of Literary Scholarship?”; the conclusion of their piece states that a topic model shows both what people are writing about and how they write about it (“academically,” “laymen-ly,” etc.).

This “how,” however, does create a number of problems, some of which Schmidt (2012) addresses in his article. A significant one, and one I am concerned about in my own study, is “long term drifts in language,” including changes in language that may not be “topically coherent.” While I do not have to worry about “long term drifts,” I do have to worry about the odd nuances of Internet-speak, where phrases like “not good,” or words like “ok, okay, O.K., okie, ‘k, k, kay” are common. Could the misspellings and odd phrasings so common on the Internet derail my study?
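
One common way to soften this problem is to normalize the text before any modeling happens. The Python sketch below is only an illustration under my own assumptions: the variant list, the `normalize_tokens` and `join_negations` helpers, and the `not_` prefix are all hypothetical choices, not part of any library or of the articles cited here.

```python
# Hypothetical list of spellings to fold into one token; extend as needed.
OK_VARIANTS = {"ok", "okay", "o.k.", "okie", "'k", "k", "kay"}

def normalize_tokens(text):
    """Split a comment on whitespace and collapse 'ok' variants to 'ok'."""
    tokens = []
    for raw in text.lower().split():
        # Strip surrounding punctuation, but keep the apostrophe and period
        # that some variants rely on ("'k", "o.k.").
        word = raw.strip("!?,;:\"()")
        if word in OK_VARIANTS:
            word = "ok"
        if word:
            tokens.append(word)
    return tokens

def join_negations(tokens):
    """Merge 'not' with the following word so 'not good' stays one unit."""
    merged = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "not" and i + 1 < len(tokens):
            merged.append("not_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

print(join_negations(normalize_tokens("Okie, this is not good!")))
# prints ['ok', 'this', 'is', 'not_good']
```

The sketch is deliberately crude. It would miss a variant followed by a sentence-ending period (“kay.”), and it only handles the single negator “not,” so it shows the shape of the preprocessing step rather than a complete solution.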

While Googling around, I stumbled on an article by Nuo Wang, a chemistry PhD, called “Topic Modeling and Sentiment Analysis to Pinpoint the Perfect Doctor.” The article details a project in which she used LDA and sentiment analysis to consolidate troves of doctor reviews; she called the resulting tool “DoctorSnapshot.” The tool analyzed the topics that emerged (which Wang later defined as 11 specific topics, such as “bedside manner”) and then detected either “positivity” or “negativity” in reviewers’ discussion of those topics. This project appears to be extremely well-developed, and I can see it being a useful guide for someone such as myself who is trying to define exactly who the commenters on Douglas Coupland’s Instagram account are: What are their beliefs? What are their concerns? What do they like? Etc.
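
The general idea of pairing topics with a positive or negative reading can be sketched in a few lines of Python. This is not Wang’s method: where she derived topics with LDA, the sketch swaps in hand-written keyword lists, and the topic names, word lists, and function names are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical, hand-made word lists; a real study would need far larger ones.
TOPIC_KEYWORDS = {
    "design": {"font", "colour", "color", "layout"},
    "nostalgia": {"remember", "childhood", "miss"},
}
POSITIVE = {"love", "great", "beautiful"}
NEGATIVE = {"hate", "ugly", "sad"}

def score_comment(comment):
    """Return (topics mentioned, polarity score) for one comment."""
    words = set(comment.lower().split())
    # A comment "mentions" a topic if it shares any word with that topic's list.
    topics = [t for t, keys in TOPIC_KEYWORDS.items() if words & keys]
    # Polarity: distinct positive words minus distinct negative words.
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return topics, score

def sentiment_by_topic(comments):
    """Add up each comment's polarity under every topic it mentions."""
    totals = defaultdict(int)
    for comment in comments:
        topics, score = score_comment(comment)
        for topic in topics:
            totals[topic] += score
    return dict(totals)

# Invented sample comments, without punctuation to keep the sketch simple.
sample = ["love the font", "i miss my childhood so sad", "ugly layout"]
print(sentiment_by_topic(sample))
# prints {'design': 0, 'nostalgia': -1}
```

A result like this only says that “design” drew mixed reactions and “nostalgia” leaned negative in the sample; how far such a reading can be trusted depends entirely on the quality of the word lists.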

Nuo Wang’s methodology at the moment exceeds the scope of my abilities; however, this class helped bring her discussion into the realm of my understanding. I hope that, with some time and extra effort, I will be able to perform a similar analysis on commenter text. It could be a boon for a future project that I am working on.

Works Cited:

Goldstone, Andrew, and Ted Underwood. 2012. “What Can Topic Models of PMLA Teach Us About the History of Literary Scholarship?” Journal of Digital Humanities 2 (1). http://journalofdigitalhumanities.org/2-1/what-can-topic-models-of-pmla-teach-us-by-ted-underwood-and-andrew-goldstone/.

Jockers, Matthew. 2011. “The LDA Buffet Is Now Open; or, Latent Dirichlet Allocation for English Majors.” Matthew L. Jockers. 2011. https://www.matthewjockers.net/2011/09/29/the-lda-buffet-is-now-open-or-latent-dirichlet-allocation-for-english-majors/.

Schmidt, Benjamin. 2012. “Words Alone: Dismantling Topic Models in the Humanities.” Journal of Digital Humanities 2 (1). http://journalofdigitalhumanities.org/2-1/words-alone-by-benjamin-m-schmidt/.

Veg, Sebastian. 2016. “Creating a Textual Public Space: Slogans and Texts from Hong Kong’s Umbrella Movement.” The Journal of Asian Studies 75 (3): 673–702.

Wang, Nuo. 2017. “Topic Modeling and Sentiment Analysis to Pinpoint the Perfect Doctor.” Insight. November 21, 2017. https://blog.insightdatascience.com/topic-modeling-and-sentiment-analysis-to-pinpoint-the-perfect-doctor-6a8fdd4a3904.
