Application of Topic Modeling to Tweets as the Foundation for Health Disparity Research for COVID-19. Academic Article

Overview

abstract

  • We randomly extracted publicly available Tweets mentioning COVID-19-related terms (n=2,558,474 Tweets) from Tweet corpora collected daily using an API from January 21 to May 3, 2020. We applied a clustering algorithm to publicly available Tweets authored by African Americans (n=1,763) to detect topics and sentiment using natural language processing (NLP). We visualized fifteen topics (four themes) using network diagrams (Newman modularity 0.74). Compared with COVID-19-related Tweets authored by others, positive sentiments, cohesively encouraging online discussions (e.g., Black strong 27.1%, growing up Blacks 22.8%, support Black business 17.0%, how to build resilience 7.8%), and COVID-19 prevention behaviors (e.g., masks 4.7%, encouraging social distancing 9.4%) were uniquely observed in African American Twitter communities. Applying topic modeling techniques to streaming Twitter data provides a foundation for research team insights regarding information, for future virtual-based interventions, and for social-media-based health disparity research for COVID-19.
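
The abstract reports a Newman modularity of 0.74 for its topic network. As a minimal sketch of what that statistic measures, and not the authors' pipeline, the Python below computes modularity for a small undirected, unweighted graph. The function name `newman_modularity`, the node labels, and the toy edge list are hypothetical illustrations, not data from the study.

```python
from collections import defaultdict


def newman_modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    Uses Q = sum over communities c of  L_c / m - (d_c / (2m))**2,
    where m is the number of edges, L_c the number of edges with both
    endpoints in c, and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # L_c: edges fully inside community c
    degree_sum = defaultdict(int)  # d_c: summed degree of nodes in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy network: two triangles joined by a single bridge edge.
toy_edges = [
    ("w1", "w2"), ("w2", "w3"), ("w1", "w3"),  # first triangle
    ("w4", "w5"), ("w5", "w6"), ("w4", "w6"),  # second triangle
    ("w3", "w4"),                              # bridge between them
]
# Hypothetical community assignment: one label per triangle.
toy_communities = {"w1": 0, "w2": 0, "w3": 0, "w4": 1, "w5": 1, "w6": 1}

q = newman_modularity(toy_edges, toy_communities)
print(round(q, 3))  # 0.357
```

Values near 0 indicate no more within-group linking than expected by chance, while values closer to 1 indicate well-separated groups, which is the sense in which a reported value of 0.74 suggests clearly distinct topic clusters.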

publication date

  • June 26, 2020

Research

keywords

  • Betacoronavirus
  • Coronavirus Infections
  • Pandemics
  • Pneumonia, Viral

Identity

PubMed Central ID

  • PMC7728402

Scopus Document Identifier

  • 85087394600

Digital Object Identifier (DOI)

  • 10.3233/SHTI200484

PubMed ID

  • 32604591

Additional Document Info

volume

  • 272