COVID-19 has affected people worldwide, with the United States reporting the largest number of cases, and the count is still rising. Recent statistics show that the mortality rate is high in African-American and socioeconomically disadvantaged communities. Officials at White House briefings have also stated that more studies are needed to understand which areas and counties are affected the most and why the virus has a high mortality rate in African-American communities. We therefore identified key datasets containing county-level attributes such as economic status, ethnicity, COVID-19 cases, hospitalizations, and mortality rate, and performed visual analytics on them to obtain more insights. This free AI-analysis tool was developed with our partner, Akai Kaeru, and is available through our Zeblok computational dashboard. We feel that further analysis with other relevant datasets could potentially identify hotspots and clusters, and also assist in emergency preparedness for future disease outbreaks.
We reviewed several datasets from publicly available sources and found that the Kaggle repository titled UNCOVER COVID-19 Challenge, provided by the Roche Data Science Coalition, had the relevant data. We used most of its attributes and applied various machine learning techniques to gain an interesting and valuable understanding of the data.
Image 1: Death Rate vs. Percent Uninsured
Image 2: Death Rate vs. Percent Black (High percent of uninsured)
Approach/Tools Used
The goal of this analysis is to identify criteria that put U.S. counties at risk for a higher death rate from COVID-19. We measure the death rate as the number of confirmed COVID-19 deaths per 100k people. We used Akai Kaeru’s Explainable AI Notebook to identify and explain why some counties have a higher death rate than others. This technology uses a combination of statistical analyses and visual analytics to allow users to identify subgroups of counties with statistically high/low COVID-19 death rates.
The analysis resulted in over 100 patterns that succinctly explain differences in county-level death rates due to COVID-19. Many of the patterns that describe unusually high death rates are based on socioeconomic factors that also correlate with minority status (i.e. counties with high poverty and more minorities tend to have higher death rates). For example, counties that have a high percentage of African-American residents and a high percentage of uninsured residents have a death rate that is more than twice the national average (i.e. 21 deaths per 100K people). These counties are primarily concentrated in the South.
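To make the metric and the subgroup comparison concrete, here is a minimal Python sketch. It is not Akai Kaeru's pattern-mining method, only a hand-written illustration of comparing one subgroup's death rate per 100K people against the rate for all counties. The County fields, the threshold values, and the four toy rows are all hypothetical and made up for illustration. Pooling deaths and population across counties (rather than averaging per-county rates) is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class County:
    # Hypothetical record layout; the real dataset's column names may differ.
    name: str
    population: int
    deaths: int
    pct_black: float      # percent of residents who are Black
    pct_uninsured: float  # percent of residents without health insurance


def death_rate_per_100k(deaths: int, population: int) -> float:
    """Confirmed deaths per 100,000 residents."""
    return deaths * 100_000 / population


def pooled_rate(counties: list[County]) -> float:
    """Death rate per 100K over a group of counties, pooling deaths and population."""
    total_deaths = sum(c.deaths for c in counties)
    total_population = sum(c.population for c in counties)
    return death_rate_per_100k(total_deaths, total_population)


def select_subgroup(counties: list[County],
                    min_pct_black: float,
                    min_pct_uninsured: float) -> list[County]:
    """Counties at or above both (hypothetical) thresholds."""
    return [c for c in counties
            if c.pct_black >= min_pct_black
            and c.pct_uninsured >= min_pct_uninsured]


# Made-up toy data, NOT real county statistics.
counties = [
    County("A", 100_000, 10, 5.0, 8.0),
    County("B", 200_000, 30, 12.0, 9.0),
    County("C", 50_000, 25, 45.0, 20.0),
    County("D", 150_000, 60, 38.0, 18.0),
]

overall_rate = pooled_rate(counties)
subgroup = select_subgroup(counties, min_pct_black=30.0, min_pct_uninsured=15.0)
subgroup_rate = pooled_rate(subgroup)
ratio = subgroup_rate / overall_rate

print(f"Overall rate:  {overall_rate:.1f} per 100K")
print(f"Subgroup rate: {subgroup_rate:.1f} per 100K ({ratio:.2f}x overall)")
```

A real analysis would also need a statistical test to decide whether a subgroup's rate is unusually high, which this sketch leaves out.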
Image 3: Multiple maps with insights into the data
Image 4: Counties with High Percent Black and Uninsured
Image 5: COVID-19 Death Rate per U.S. County
Akai Kaeru, LLC creates AI-powered software that helps data scientists solve stubborn problems in real-world applications. Leveraging extensive research expertise in interactive high-dimensional data visualization, its two co-founders transformed acclaimed research products into the practical Salient Pattern Miner, Visual Causal Analyst, and Data Context Map that are at the heart of the Explainable AI data analytics software suite.
Zeblok and Akai Kaeru have come together to offer the Explainable-AI algorithm as part of Zeblok’s ecosystem.
More information on Akai Kaeru is available at https://akaikaeru.com/