Understanding Healthcare Expenditure in the United States

DataRes at UCLA
Mar 25, 2021

By: Naomi Golin, Avishek Ghosh, Jun Bae, Samantha Chung, Angelina Kim

The United States does not have a universal healthcare program that applies to all people, so citizens must rely on numerous public and private insurers. Working with a variety of resources, including the US census and the Urban Institute, simulated datasets from Kaggle (the Medical Cost Personal Dataset and the Cardiovascular Disease dataset), and data from the Health Care Cost Institute, we found trends that highlight systemic issues acting against marginalized communities, as well as deeply rooted biases that may exist in the US medical community.

In this article, we take a closer look at how BMI, smoking habits, ethnicity, and gender correlate with healthcare costs. The first factor is the Body Mass Index (BMI), defined as a person’s weight in kilograms divided by the square of their height in meters. A high BMI is commonly associated with a range of health conditions, including cardiovascular disease, diabetes, high blood pressure, and high cholesterol. From the perspective of health insurance companies, we would therefore expect individuals with a higher BMI to be charged more because of these associated health risks. In our analysis, BMI categories are defined as follows:

  1. “Underweight” = BMI below 18.5
  2. “Healthy” = BMI from 18.5 to 25 (both inclusive)
  3. “Overweight” = BMI above 25
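
The categories above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function names and the toy records are hypothetical, and the boundaries simply follow the list above (18.5 and 25 both count as “Healthy”).

```python
from statistics import mean


def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the three categories used in this article."""
    if value < 18.5:
        return "Underweight"
    if value <= 25:
        return "Healthy"
    return "Overweight"


def mean_cost_by_category(records):
    """Average yearly cost per BMI category.

    `records` is an iterable of (bmi_value, yearly_cost) pairs.
    """
    groups = {}
    for bmi_value, cost in records:
        groups.setdefault(bmi_category(bmi_value), []).append(cost)
    return {category: mean(costs) for category, costs in groups.items()}


# Toy records for illustration only; these are NOT values from the dataset.
sample = [(17.0, 3000.0), (22.5, 4000.0), (24.0, 6000.0), (31.2, 9000.0)]
print(mean_cost_by_category(sample))
```

Applied to the real data, the same grouping would produce the three means shown in the bar graph.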

Bar Graph showing mean Medical Costs associated with BMI subcategories:

The bar plot above shows a mean yearly medical cost of $8852 for underweight individuals, $10410 for individuals of a healthy weight, and $13946 for overweight individuals. The general trend is that individuals with a greater BMI are charged more by insurance companies than individuals with a smaller BMI. One likely reason is that a higher BMI indicates a higher risk of health conditions such as diabetes and high blood pressure, which require greater insurance coverage.

Scatterplot of BMI and Medical Cost

The scatterplot above shows a generally increasing relationship between BMI and medical cost, and the points with the highest medical costs correspond to individuals with a BMI of 25 or greater. The correlation between BMI and medical cost is 0.198. This is far from 1 and indicates a weak linear trend, which is unsurprising given that many factors contribute to medical cost. Still, the positive relationship suggests that BMI plays some role in the medical costs charged by an insurance company. BMI also has known limitations, such as the fact that it does not account for body fat percentage, and these should be kept in mind when interpreting the relationship between BMI and medical cost.
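
For readers who want to reproduce a correlation like the one quoted above, here is a minimal sketch of the Pearson correlation coefficient in plain Python. We assume Pearson's r is the statistic in question, since the article does not name it; the function name and the toy inputs are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy inputs for illustration only; these are NOT values from the dataset.
bmi_values = [18.0, 22.0, 27.0, 31.0, 35.0]
costs = [4000.0, 9000.0, 5000.0, 12000.0, 8000.0]
print(round(pearson_r(bmi_values, costs), 3))
```

One way to read a coefficient of 0.198 is through its square: 0.198² ≈ 0.039, so in a simple linear fit BMI alone would account for only about 4% of the variation in medical cost.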

Smoking is widely known to be harmful to health, and smoking status is recorded in the dataset, so we analyzed its effect on insurance costs. Smoking is known to cause cancer, stroke, heart and lung disease, and many other health problems. We would therefore expect people who smoke to have higher annual medical bills.

The scatter plot above suggests that smokers do indeed have higher annual medical bills. There appear to be three clusters of points. In all clusters, the medical bill increases with age, as expected: ageing brings a gradual reduction in physical and mental capacity, along with increased risk of and vulnerability to disease. We hypothesize that the bottom green cluster represents healthy individuals who do not smoke, and that the top red cluster represents actively smoking individuals. The middle cluster, with a mix of red and green points, could include people who smoke occasionally, have a family history of disease, or were injured in accidents.

High blood pressure is a key risk factor for heart disease, the leading cause of death in the United States. Using blood pressure data for 70000 patients across the United States, we found that the mean systolic blood pressure is 120 mmHg for non-smoking individuals and 128 mmHg for smokers.

The probability density plot above shows that individuals who smoke are more likely to have high blood pressure than those who do not. This is consistent with smokers being more likely to develop heart disease, which would in turn raise their medical bills.
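
The comparison between smokers and non-smokers can be sketched as a simple group summary. This is an illustration under stated assumptions, not the analysis code: the function name and toy readings are hypothetical, and the 130 mmHg cutoff for “high” systolic pressure is an assumption, since the article does not state which threshold, if any, was used.

```python
from statistics import mean


def summarize_by_smoking(records, threshold=130):
    """Mean systolic pressure and share of high readings per smoking status.

    `records` is an iterable of (smokes, systolic_mmHg) pairs.
    `threshold` is an ASSUMED cutoff for a "high" reading.
    """
    groups = {}
    for smokes, systolic in records:
        groups.setdefault(bool(smokes), []).append(systolic)
    return {
        smokes: {
            "mean_systolic": mean(values),
            # Fraction of readings at or above the assumed threshold.
            "share_high": sum(v >= threshold for v in values) / len(values),
        }
        for smokes, values in groups.items()
    }


# Toy readings for illustration only; these are NOT values from the dataset.
toy = [(False, 110), (False, 120), (False, 130),
       (True, 120), (True, 130), (True, 140)]
summary = summarize_by_smoking(toy)
print(summary)
```

A density plot conveys the same comparison visually: a distribution shifted to the right for smokers corresponds to both a higher group mean and a larger share of readings above any fixed cutoff.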

Healthcare is also a basic necessity that should be freely available to everyone. Using data provided by the Urban Institute, we found a few alarming trends. We explored how the percentage of people with medical debt differs between regions populated predominantly by white people and communities populated predominantly by people of color.

Below are two bar charts that depict the share of people with medical debt and the average household income across the United States. Not every household is able to obtain medical insurance, and many people, especially in certain minority groups, accumulate debt from healthcare expenses. Comparing the average income and medical debt of white people and people of color helps highlight a problem the US is facing: BIPOC populations are significantly disadvantaged by the system that governs the country. The plot below shows average income per household across the US. It is meant to put income into perspective, as average values vary significantly across states.

From the other two plots, we can see that the average income of white families is significantly higher than that of families of color, a pattern that is consistent across all the states.

At the same time, the percentage of people with medical debt in predominantly non-white areas is much higher than in areas with mostly white populations.

We also explored the impact of gender on medical costs, because gender inequality is prevalent in all aspects of our lives and healthcare costs are no exception. Using raw data from the Health Care Cost Institute, a non-profit research organization that publishes healthcare-related data, we analyzed how the cost of healthcare differs by gender. Looking at out-of-pocket expenses in particular, the first plot highlights a clear discrepancy between the costs paid by men and women. Average healthcare costs for both genders generally increase with age. The average amount paid by men is $20.30, while for women it is $24.10, roughly 19% more. The cost discrepancy between the genders also appears to widen as age increases. While this could be partially attributed to childbirth costs, the stacked bar plot above shows the cost difference across all types of healthcare services, which points to a broader systemic issue. In general, those who identify as female pay more out-of-pocket for healthcare services on average. More research by the federal government and policy-makers is needed to identify the source of these increased costs, which may in turn point to policies that reduce both the gender discrepancy and age-related disparities.

Throughout this article, we analyzed how Body Mass Index, gender, ethnicity, and smoking status relate to the healthcare costs that individuals are expected to pay. Our results show that smokers and those with a high body mass index are likely to be charged more for insurance. Additionally, women pay more out-of-pocket for healthcare, and people of color are more likely than white people to carry medical debt while also having a lower average income. This highlights fundamental issues within the healthcare system and violates the principle that healthcare should be a fundamental right and not a privilege. We recommend that governments and insurance companies take each family’s income level into account and ensure that the amount charged is affordable. It would also be beneficial to hold insurance companies accountable through a systematic review process, to ensure that non-white populations and women are not being charged more. Resolving these issues will require action at the federal level as well as awareness and advocacy from the public.

If you would like to explore the medical cost datasets further, please feel free to check out the sources below:

Datasets used:

1). Urban Institute dataset: Carman, Joella. “Urban Institute (Github Repository).” Github, <https://github.com/UrbanInstitute/debt-interactive-map/tree/master/data/201911-update>

2). Kaggle dataset: Choi, Miri. “Medical Cost Personal Datasets.” Kaggle, 21 Feb. 2018, www.kaggle.com/mirichoi0218/insurance.

3). Kaggle dataset: Ulianova, Svetlana. “Cardiovascular Disease Dataset.” Kaggle, 20 Jan. 2019, www.kaggle.com/sulianova/cardiovascular-disease-dataset.

4). Public health dataset: Health Care Cost Institute, https://healthcostinstitute.org

Our Github portfolio:
