Craving a Sweet Treat? Think Again…

DataRes at UCLA
Jan 15, 2025

Authors: Audrey Huang, Brian Kim, Kevin Espinas, Joshua Sujo, Henry Zhao

Source: Buravleva stock, Shutterstock

With the recent trend of ‘sweet treats,’ it has become widely popular across the internet to reward yourself with a sweet drink or dessert after a hard (or even not so hard) day. This trend may be concerning, however, because Americans already consume far more sugar than recommended. According to the US Centers for Disease Control and Prevention (CDC), adults and children alike consume a daily average of 17 teaspoons of added sugar, a startling 5 teaspoons above the recommended intake.

The excessive consumption of added sugars is linked to serious health risks, such as unhealthy weight gain, obesity, heart disease, and diabetes. According to the CDC, the rate of diabetes diagnosis in kids and teens is projected to increase by at least 70% by 2060. In 2021, 11.6% of the US population had diabetes, a figure that has risen almost 2% since 2004. Diabetes can lead to serious outcomes, such as hospitalization, kidney disease, vision disability, and even death.

With these concerns in mind, our team sought to look into which factors lead to a higher chance of diabetes, with the eventual goal of prevention. For this purpose, we examined a dataset from the 2013–2014 National Health and Nutrition Examination Survey (NHANES), which includes data on diabetes diagnosis, age, exercise, BMI, blood glucose, and more.

Initial Look

First, we want to take a look at the potential relationships between our given variables, examining variables that may have a strong correlation with Diabetic Diagnosis.

The correlation plot above shows that the most strongly correlated pair of variables, at 0.67, is oral glucose tolerance and blood glucose after fasting. The second-highest correlation, at 0.55, is between blood insulin level and BMI. Age also appears to be somewhat correlated with BMI, blood glucose after fasting, and oral glucose tolerance. Looking specifically at diabetic diagnosis, there is a weak negative correlation with both oral glucose tolerance and blood glucose after fasting, with correlations of -0.26 and -0.29 respectively. Because diabetic individuals show higher values on both measures in the plots that follow, the negative sign most likely reflects how the diagnosis variable is coded rather than a protective effect of high glucose. A Diabetes Care journal article likewise found that impaired glucose tolerance and fasting blood glucose levels are associated with an individual’s risk of developing type 2 diabetes. This is likely because impaired glucose tolerance and altered fasting blood glucose levels reflect the body’s inability to regulate glucose properly, which is a key characteristic of diabetes.
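The sign of a correlation with a yes/no diagnosis depends on how that diagnosis is coded. The sketch below uses made-up glucose readings and assumes, purely for illustration, a survey-style coding in which 1 means diabetic and 2 means non-diabetic. The names `pearson`, `is_diabetic`, and `survey_code` are hypothetical and are not taken from the original analysis.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up fasting glucose readings; the last two belong to diabetic individuals.
glucose = [88, 92, 95, 99, 101, 140, 165]

is_diabetic = [0, 0, 0, 0, 0, 1, 1]   # indicator coding: 1 = diabetic
survey_code = [2, 2, 2, 2, 2, 1, 1]   # assumed coding: 1 = diabetic, 2 = not

r_indicator = pearson(glucose, is_diabetic)
r_survey = pearson(glucose, survey_code)

print(f"0/1 indicator coding: r = {r_indicator:+.2f}")
print(f"1/2 survey coding:    r = {r_survey:+.2f}")
```

Both codings describe the same data, in which diabetic individuals have the higher readings, yet the correlation is positive under the 0/1 indicator and negative under the assumed 1/2 coding.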

Differences by Diabetic Status

We first examine variables that may help us distinguish differences between diabetic and non-diabetic diagnoses. To start, we take a look at the relationship between Age Group and Blood Glucose Level after Fasting between diabetics and non-diabetics.

The violin plots show that non-diabetics have wider variability in fasting blood glucose levels, spanning a large range. This indicates a less consistent regulation of blood sugar in this group, which could be influenced by lifestyle habits. Seniors have a much narrower range of glucose levels, suggesting that an increase in age could be correlated with a more stable blood sugar regulation, which could be due to health interventions or awareness.

The variability for adult diabetics could be due to challenges in managing blood sugar from dietary habits. Seniors also have a tight cluster of values, but with fewer outliers. This may be due to better glucose control from medical oversight over the years.

Across both groups, diabetics consistently have higher fasting glucose levels than non-diabetics, underscoring the significant metabolic impact of the condition. The narrower distributions observed in seniors suggest that age and health interventions contribute to more consistent glucose regulation. Adults appear to be more affected by variability, highlighting the need for greater awareness around maintaining stable blood sugar levels through dietary moderation.

Next, we will examine blood glucose and insulin levels, as diabetes primarily arises from the body’s inability to produce sufficient insulin, resulting in elevated blood glucose levels.

This box plot compares the distribution of the glucose-insulin delta for diabetic and non-diabetic groups, where the “delta” is calculated as the difference between blood glucose and insulin levels. Diabetic individuals generally show a higher delta than non-diabetic individuals. This suggests that diabetic people tend to have higher blood glucose levels relative to insulin levels, which could indicate a reduced insulin response or insulin resistance. To examine this relationship further, we take a closer look at Blood Glucose Levels and Blood Insulin Levels separately.
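As a minimal sketch of how such a delta could be computed and compared across groups, the example below uses a handful of illustrative records. The field names (`diabetic`, `glucose`, `insulin`) and the values are assumptions for demonstration and are not drawn from the NHANES subset.

```python
from statistics import median

# Illustrative records only; these are not values from the dataset.
records = [
    {"diabetic": False, "glucose": 92.0, "insulin": 8.0},
    {"diabetic": False, "glucose": 99.0, "insulin": 12.5},
    {"diabetic": False, "glucose": 104.0, "insulin": 20.0},
    {"diabetic": True, "glucose": 150.0, "insulin": 10.0},
    {"diabetic": True, "glucose": 181.0, "insulin": 15.0},
    {"diabetic": True, "glucose": 210.0, "insulin": 9.0},
]

def glucose_insulin_delta(record):
    """Delta as described in the text: blood glucose minus blood insulin."""
    return record["glucose"] - record["insulin"]

# Collect the deltas separately for each diagnosis group.
deltas = {True: [], False: []}
for record in records:
    deltas[record["diabetic"]].append(glucose_insulin_delta(record))

# The median is the center line a box plot would draw for each group.
median_delta = {group: median(values) for group, values in deltas.items()}
print(median_delta)
```

Glucose and insulin are typically reported in different units, so the delta is best read as a descriptive index for comparing groups rather than as a physical quantity.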

These violin plots show the distributions of blood glucose and insulin levels for active versus inactive individuals, with separate colors for diabetic and non-diabetic groups. Diabetic individuals, both active and inactive, generally have higher glucose levels and a wide spread in insulin levels, indicating that activity alone may not fully regulate glucose-insulin dynamics for diabetic individuals. Next, we look at the relationship between Exercise and Blood Glucose Level after Fasting, and its effect on Diabetes Diagnosis.

The bar chart reveals a strong association between blood glucose levels and a positive diabetes diagnosis: patients with higher blood glucose levels after fasting are more likely to have diabetes. In this dataset, “Regular Exercise” is defined as the “respondent takes part in weekly moderate or vigorous-intensity physical activity”. For those without diabetes, exercise has minimal effect on blood glucose levels after fasting. This evidence is relatively strong because of the large number of respondents without diabetes.

For those with diabetes, the mean glucose level after fasting for those with “Regular Exercise” is 120.5. On the other hand, those with “No Regular Exercise” have a mean glucose level after fasting of 160.2. Since there are only 21 total respondents with a positive diabetes diagnosis, this evidence is not conclusive. However, from the data, we can hypothesize that regular exercise is correlated with lower blood glucose levels after fasting.
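To see why a difference in means from such a small group is hard to call conclusive, it helps to look at the standard error alongside each mean. The sketch below uses made-up readings for two small subgroups. The helper `summarize` and all of the values are hypothetical, and the "rough range" is a normal approximation of two standard errors, which if anything understates the uncertainty for samples this small.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(values):
    """Return (n, mean, standard error of the mean) for a list of readings."""
    n = len(values)
    return n, mean(values), stdev(values) / sqrt(n)

# Made-up fasting glucose readings for two small diabetic subgroups.
regular_exercise = [112, 118, 121, 125, 127]
no_regular_exercise = [131, 150, 158, 171, 190]

for label, values in [("Regular Exercise", regular_exercise),
                      ("No Regular Exercise", no_regular_exercise)]:
    n, avg, se = summarize(values)
    # Rough range for the mean: two standard errors either side.
    low, high = avg - 2 * se, avg + 2 * se
    print(f"{label}: n={n}, mean={avg:.1f}, rough range=({low:.1f}, {high:.1f})")
```

With only a few readings per subgroup, the ranges are wide, which is why the comparison is best treated as a hypothesis rather than a finding.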

From a health perspective, if this pattern holds in a larger sample, it would encourage patients with diabetes to take part in regular exercise, as it may lower blood glucose levels significantly.

We now look into BMI, more specifically its relationship with both blood glucose (after fasting) and blood insulin, in both diabetics and non-diabetics.

First, we look at the correlation map that includes both diabetics and non-diabetics. There appear to be positive correlations between BMI and blood insulin and between BMI and blood glucose. But when we split the two groups and look at the separate maps, we notice a change. For diabetics, there now appears to be a negative correlation between BMI and blood glucose and a stronger positive correlation between BMI and blood insulin.
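A correlation computed on pooled data can differ in sign from the correlation within a subgroup. The sketch below illustrates this with made-up (BMI, fasting glucose) pairs chosen only to reproduce the effect. The data and the helper names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def bmi_glucose_correlation(pairs):
    """Correlation between BMI and glucose for a list of (bmi, glucose) pairs."""
    bmi, glucose = zip(*pairs)
    return pearson(bmi, glucose)

# Made-up (BMI, fasting glucose) pairs, chosen only to illustrate the effect.
non_diabetic = [(20, 85), (23, 90), (26, 94), (29, 99), (32, 104)]
diabetic = [(24, 180), (27, 160), (30, 150), (33, 130)]

r_pooled = bmi_glucose_correlation(non_diabetic + diabetic)
r_non_diabetic = bmi_glucose_correlation(non_diabetic)
r_diabetic = bmi_glucose_correlation(diabetic)

print(f"pooled: {r_pooled:+.2f}")
print(f"non-diabetic: {r_non_diabetic:+.2f}")
print(f"diabetic: {r_diabetic:+.2f}")
```

In this toy data the pooled correlation is positive while the diabetic subgroup's correlation is negative, which is why splitting by diagnosis before interpreting a correlation map can change the story.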

We can take a closer look at these relationships by plotting the variables. The scatter plots show the trends mentioned above. In the diabetics plot, there seem to be a few outliers for blood glucose, while blood insulin does not seem to have any. One outlier is worth noting: it lies at a BMI of about 27.6 and a blood glucose of 405, and this individual’s blood insulin level is 9.65. Additionally, the individual with the highest insulin level, 102.29, has a blood glucose of 103. Both individuals are diagnosed as diabetic.

Diagnosis Modeling

With these relationships, we have a foundational understanding of the relationship between our variables, given the differences in diabetic diagnosis. Next, we want to build a predictive model to determine diabetic diagnosis based on our variables. We can take a closer look at which variables are strongly correlated with our response, diabetic diagnosis, in order to start modeling.

Given the correlation barplot above, we identify that blood glucose after fasting and oral glucose tolerance are the most correlated with diabetic diagnosis with correlations of -0.29 and -0.26 respectively. Next, we take a closer look at this relationship.

The scatterplot of oral glucose tolerance against blood glucose level after fasting shows that diabetic individuals tend to have higher oral glucose tolerance readings and higher blood glucose levels after fasting. However, this trend is difficult to see because diabetic individuals (only 21 observations) are heavily outnumbered by non-diabetic individuals (2198 observations). Thus, we use decision tree modeling with boosting to draw conclusions about diabetic diagnosis.

With an accuracy of 98.4985%, we can now use our boosted tree model to predict diagnosis based on oral glucose tolerance and blood glucose levels after fasting. The leaf nodes of our tree give the predicted diagnosis, where negative numbers represent diabetic and positive numbers represent non-diabetic.
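Accuracy is easier to interpret next to a naive baseline. The sketch below uses the class counts quoted in this article (21 diabetic, 2198 non-diabetic) and assumes that the reported accuracy was measured on data with a similar class balance, which the article does not state. The helper functions are written here for illustration.

```python
# Class counts quoted in the article.
n_diabetic = 21
n_non_diabetic = 2198
reported_accuracy = 0.984985  # 98.4985%, as reported for the boosted model

def accuracy(tp, fn, tn, fp):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fn + tn + fp)

def balanced_accuracy(tp, fn, tn, fp):
    """Average of the recall on each class, so both classes count equally."""
    recall_diabetic = tp / (tp + fn)
    recall_non_diabetic = tn / (tn + fp)
    return (recall_diabetic + recall_non_diabetic) / 2

# Naive baseline: predict "non-diabetic" for everyone (diabetic = positive class).
baseline = dict(tp=0, fn=n_diabetic, tn=n_non_diabetic, fp=0)
baseline_accuracy = accuracy(**baseline)
baseline_balanced = balanced_accuracy(**baseline)

print(f"baseline accuracy:          {baseline_accuracy:.4f}")
print(f"baseline balanced accuracy: {baseline_balanced:.2f}")
print(f"reported model accuracy:    {reported_accuracy:.4f}")
```

Under that assumption, always predicting non-diabetic would already score about 99.05% accuracy while finding none of the diabetic cases, so metrics that weight the diabetic class, such as recall or balanced accuracy, give a fuller picture than accuracy alone.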

Applying our findings from the decision tree, we can visualize which regions are prone to a diabetic prediction, with purple representing diabetic and blue representing non-diabetic.

Conclusions

From our model, we find that an individual is more likely to be diagnosed with diabetes if their oral glucose tolerance is above 219 and their blood glucose level after fasting is above 155 or below 116. The oral glucose tolerance test measures how an individual’s body handles sugar after a meal, as the body uses the sugar for energy. A higher reading indicates that the body is not handling sugar as it is meant to: the level of sugar in the blood remains high, which raises the likelihood of a diabetes diagnosis, as reflected in our findings.
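The thresholds in this conclusion can be restated as a simple rule. The function below is a hand-written simplification of the stated conclusion, not the fitted boosted model, and the name `flagged_by_rule` and its parameter names are hypothetical.

```python
def flagged_by_rule(oral_glucose_tolerance, fasting_glucose):
    """Return True when a reading falls in the region the article associates
    with a higher chance of a diabetic diagnosis.

    Thresholds (219, 155, 116) are the ones quoted in the conclusion.
    """
    high_tolerance_reading = oral_glucose_tolerance > 219
    fasting_out_of_band = fasting_glucose > 155 or fasting_glucose < 116
    return high_tolerance_reading and fasting_out_of_band

# A few illustrative inputs (made-up readings).
examples = [(230, 160), (230, 130), (200, 160), (230, 110)]
for ogtt, fasting in examples:
    print(ogtt, fasting, flagged_by_rule(ogtt, fasting))
```

Writing the rule out makes its shape explicit: both conditions must hold, and the fasting condition is met at either end of the range rather than only at the high end.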

Blood glucose level after fasting, otherwise known as fasting blood sugar, is a common metric for diagnosing diabetes. High blood sugar, known as hyperglycemia, is a common sign of diabetes, which is reflected in the diabetic diagnoses at high blood glucose levels. Hypoglycemia, on the other hand, is low blood sugar and is often a result of diabetes treatment, which may explain the diabetic diagnoses at low blood glucose levels.

Our findings also indicate that blood glucose levels are closely related to exercise and BMI, which suggests that the best way to avoid diabetes is to maintain a healthy diet and exercise consistently! So, although you always deserve a sweet treat, just make sure to supplement it with health-conscious choices!

References

Better Health Channel. (n.d.). Diabetes and insulin. Better Health Victoria. Retrieved December 13, 2024, from https://www.betterhealth.vic.gov.au/health/conditionsandtreatments/diabetes-and-insulin

Buravleva stock. (2021, August 24). Diabetes. Shutterstock. https://www.shutterstock.com/image-vector/diabetes-doctors-testing-blood-glucose-using-2030471396

Centers for Disease Control and Prevention. (n.d.). Added sugars data and research. Centers for Disease Control and Prevention. Retrieved November 26, 2024, from https://www.cdc.gov/nutrition/php/data-research/added-sugars.html#cdc_data_surveillance_section_4-consumption-in-children-and-young-adults

Centers for Disease Control and Prevention. (n.d.). Diabetes data and research. Centers for Disease Control and Prevention. Retrieved November 26, 2024, from https://www.cdc.gov/diabetes/php/data-research/index.html

Mayo Clinic. (n.d.). Glucose tolerance test. Mayo Clinic. Retrieved November 26, 2024, from https://www.mayoclinic.org/tests-procedures/glucose-tolerance-test/about/pac-20394296

Cleveland Clinic. (n.d.). Blood glucose test. Cleveland Clinic. Retrieved November 26, 2024, from https://my.clevelandclinic.org/health/diagnostics/12363-blood-glucose-test

UCI Machine Learning Repository. (n.d.). National Health and Nutrition Examination Survey 2013–2014 (NHANES) age prediction subset. University of California, Irvine. Retrieved November 26, 2024, from https://archive.ics.uci.edu/dataset/887/national+health+and+nutrition+health+survey+2013-2014+(nhanes)+age+prediction+subset

Saad, M. F., & Fox, C. S. (2006). Risk of progression to type 2 diabetes based on relationship between postload plasma glucose and fasting plasma glucose. Diabetes Care, 29(7), 1613–1618. https://doi.org/10.2337/dc06-0361
