Fighting Covid: A Lack of Local Data Sources

Since the beginning of the COVID-19 crisis, many non-profits, government agencies, and companies have pulled together data to track case counts, fatality rates, and recovery rates. Most of the data, while interesting, is not granular enough for those quarantined, fighting the virus, or impacted financially to make reasonable decisions. Managing the COVID-19 pandemic requires reliable data down to the local level. However, few localities appear to have infrastructure or data-collection methods mature enough to capture such data. One example is the Austin metro area, which encompasses the City of Austin as well as some of the surrounding counties. I’ll start with some resources which show the situation at a macro level, then attempt to drill down to a more local level so that we can see the scope of the problem.

One of the major sources of data cited in national media outlets is represented in the following visualization.  This is probably one of the more popular visualizations out there.

Covid-19 Global Cases

With this dashboard, I can see worldwide trends and hotspots, and approximate how quickly the disease is spreading in certain locations. Fatality rates and recovery rates are difficult to trust, as there is very little consistency worldwide in how recoveries are monitored. (In fact, the site above stopped tracking recovery rates in the U.S. altogether due to inconsistencies in reporting.)

Here is another one which displays data at the Texas state level:

Covid-19 Cases in Texas

This gives me a little more information, and is updated once per day. It tells me how many people in the state have actually been tested (it’s not a lot). But it doesn’t tell me where they were tested or whether they were symptomatic or asymptomatic (important for gathering control data), nor does it project how many cases are actually out there based on the current sample size. What’s even more disturbing is that the case and fatality counts differ by as much as 30% depending on which dashboard you are viewing. Some of this can be explained by latency in the data, but there is also probably some double-counting going on. The TX dashboard even has the following comment to this effect:

Why are these case counts different from what other sources are reporting?

Other dashboards may be using media reports to gather information about new cases, which may result in some cases being counted more than once.

However, an even bigger gap in these data sets is information that would allow me to make decisions like: 

  • Should I go to the hospital? Which one?
  • Which hospitals have enough beds? 
  • How many beds are there in my city? 
  • How many are occupied and where? 
  • If my symptoms are serious, which hospitals have ICU units?  Ventilators? How many?

Hospital-level data is even more unreliable. In Texas, the number of reported acute-care beds ranges from 58,055 (which is obviously erroneous, given that some hospitals in this list report zero) to 80,677 (probably a more reliable number given that it is from a state agency, though the official, publicly available reports are more than 3 years old). County-level information is even more difficult to find. The County Information Project compiles information from the Texas Association of Counties.

The data is clearly false. As an example, St. David’s Medical Center is listed as having 0 acute-care beds. I know this is wrong because I occupied one of them at one time!

But let’s assume that the DSHS bed numbers are close enough to correct.  How do I know how often those beds are full? The DSHS report lists bed utilization for any given year at approximately 60%.

Inpatient Utilization 2016

The report does not define utilization or occupancy rate, but I was able to find what appeared to be an official definition here.

Briefly, the utilization rate for a given reporting period is calculated by dividing the number of occupied bed-days by the total number of available bed-days over that period. For example, if a hospital has 100 beds, the maximum bed-days per year are 365 * 100 = 36,500. If the hospital saw 1,000 patients over the course of a year, and they each stayed in the hospital 5 days, the utilization rate for the year would be 5,000 / 36,500 * 100 = 13.7%.

A 60% utilization figure averaged over a year is not meaningful enough to know whether there will be beds available during an outbreak. At the very least, one would need to know the median daily utilization rate or, preferably, the maximum daily utilization rate. This would give one an understanding of what to expect in terms of bed availability in a worst-case scenario.
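The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the DSHS methodology: the function name `utilization_rate` is my own, and the daily census figures are invented purely to show how an average can hide a peak.

```python
from statistics import median

def utilization_rate(occupied_bed_days: float, beds: int, days: int) -> float:
    """Percent of available bed-days that were occupied over the period."""
    available_bed_days = beds * days
    return 100 * occupied_bed_days / available_bed_days

# Worked example from the text: 100 beds, 1,000 patients staying 5 days each.
annual = utilization_rate(occupied_bed_days=1000 * 5, beds=100, days=365)
print(f"Annual utilization: {annual:.1f}%")  # prints "Annual utilization: 13.7%"

# HYPOTHETICAL daily census (occupied beds per day) for one week at a
# 100-bed hospital. These numbers are made up for illustration only.
daily_census = [10, 12, 9, 14, 95, 100, 11]
daily_rates = [utilization_rate(c, beds=100, days=1) for c in daily_census]

average_rate = sum(daily_rates) / len(daily_rates)
median_rate = median(daily_rates)
peak_rate = max(daily_rates)

print(f"Average: {average_rate:.1f}%  Median: {median_rate:.1f}%  Peak: {peak_rate:.1f}%")
```

In this made-up week the average and median both look comfortable, yet the hospital was completely full on one day. That is the information an annual figure cannot convey.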

We already know that some cases of COVID require hospitalization in an ICU. But how many ICU beds are in your city? What is the utilization rate of ICU beds on any given day? This does not appear to be widely reported, although some (but not all) individual hospitals appear to publish statistics on their number of ICU beds. Houston Methodist is a good example of a hospital network that breaks down its number of beds. It is considered quite a large network and, from its published breakdown, appears to have 102 ICU beds in one of its locations. The other locations either do not have any ICU beds or do not report them.

Suppose a public health administrator were to ask: Do I need to order ventilators? Do I need to convert some acute-care beds to ICU beds? How many? Such questions would be nearly impossible to answer at a local level, at least based on readily available public data. Fortunately, while I was writing this post, some researchers at The University of Texas saw the same gaps I did, and published a paper on March 26 of this year.

This is the first and only source I have come across which lists the total number of acute-care beds in the Austin metro area (4,000) and the number of ICUs/ventilators (750). It required an academic study in order for the numbers to be readily accessible to a layperson such as myself. Even with this data, I do not know what the utilization rate of the ICU beds is, or how many beds are currently occupied. Showing capacity by hospital would help route serious cases to the appropriate resources, or potentially allow less critical cases to be placed in facilities which do not provide ICU beds but have acute-care beds available.
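To make the routing idea concrete, here is a rough sketch of what per-hospital capacity data would allow. Everything in it is an assumption for illustration: the `Hospital` record, the `route` function, the hospital names, and the bed counts are all hypothetical, and no such public feed exists for Austin as far as I can tell.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Hospital:
    name: str
    acute_total: int
    acute_occupied: int
    icu_total: int
    icu_occupied: int

    @property
    def acute_free(self) -> int:
        return self.acute_total - self.acute_occupied

    @property
    def icu_free(self) -> int:
        return self.icu_total - self.icu_occupied

def route(hospitals: List[Hospital], needs_icu: bool) -> Optional[Hospital]:
    """Pick a hospital for a new case, or None if nothing is available."""
    if needs_icu:
        # Serious cases go to the site with the most free ICU beds.
        candidates = [h for h in hospitals if h.icu_free > 0]
        return max(candidates, key=lambda h: h.icu_free, default=None)
    # Less critical cases prefer facilities with no ICU at all,
    # then whichever has the most free acute-care beds.
    candidates = [h for h in hospitals if h.acute_free > 0]
    return max(candidates, key=lambda h: (h.icu_total == 0, h.acute_free), default=None)

# HYPOTHETICAL hospitals and counts, invented for this example.
hospitals = [
    Hospital("Hospital A", acute_total=300, acute_occupied=280, icu_total=40, icu_occupied=40),
    Hospital("Hospital B", acute_total=150, acute_occupied=90, icu_total=0, icu_occupied=0),
    Hospital("Hospital C", acute_total=200, acute_occupied=150, icu_total=20, icu_occupied=12),
]

serious = route(hospitals, needs_icu=True)
mild = route(hospitals, needs_icu=False)
print(serious.name if serious else "no ICU bed available")
print(mild.name if mild else "no acute-care bed available")
```

The selection rule here is deliberately simple. A real system would also weigh distance, staffing, and ventilator counts, none of which I could find in public data.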

Austin is a city which prides itself on being tech-savvy and forward-thinking. Yet the data for the Austin metro area is unreliable and inconsistent. How well does this bode for other localities which have neither the data collection and reporting infrastructure nor an academic institution such as the University of Texas performing research on number of beds and capacity?

Do you know of any localities which are posting reliable COVID data sets online, including occupancy rates, number of acute care and ICU beds?  Comment below. 
