Development of a Deep Learning Model for Predicting Obesity Using Health Behavior Data of Elementary School Students
Abstract
Background: Childhood obesity poses serious long-term health risks and is a growing global concern. In South Korea, national health surveys collect behavioral and physical data from elementary school students, but the large number of questionnaire items can burden young respondents and reduce the accuracy of their responses. Simplified models with high predictive power are therefore needed.
Methods: We analyzed data from over 250,000 elementary school students collected by the Korean Ministry of Education (2015–2022). With the Rohrer Index as the outcome variable, we selected key predictors via Lasso and Elastic Net regression. We reduced the dimensionality of the categorical variables using Multiple Correspondence Analysis (MCA) and developed a deep learning model (NECTOR) that combines a multilayer perceptron (MLP) with self-attention.
Results: NECTOR achieved high predictive performance with R² scores of 0.994 (boys) and 0.996 (girls), and low mean squared errors of 3.072 and 1.841, respectively. It outperformed baseline models using the same inputs.
Conclusion: A small set of core health indicators can effectively predict the Rohrer Index. The proposed model enables efficient and reliable obesity screening in school settings, supporting early intervention efforts.
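For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how the outcome variable and the two reported metrics could be computed. It is not the authors' code. The abstract does not state the Rohrer Index formula, so the scaling used here (weight in kg divided by height in cm cubed, times 10^7) is an assumption based on the common convention. The function names and the sample values are hypothetical.

```python
from typing import Sequence


def rohrer_index(weight_kg: float, height_cm: float) -> float:
    """Rohrer Index under the assumed scaling: kg / cm^3 * 10^7."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_cm ** 3 * 1e7


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical (weight_kg, height_cm) pairs, not taken from the survey data.
    students = [(30.0, 130.0), (42.0, 140.0), (25.0, 125.0)]
    observed = [rohrer_index(w, h) for w, h in students]

    # Hypothetical model outputs, used only to exercise the metric functions.
    predicted = [value + 1.0 for value in observed]

    print("Rohrer Index:", [round(v, 1) for v in observed])
    print("MSE:", mean_squared_error(observed, predicted))
    print("R^2:", r2_score(observed, predicted))
```

Because the mean squared error is expressed in squared Rohrer Index units, the reported values of 3.072 and 1.841 are only interpretable relative to the scaling the authors actually used, which should be confirmed against the full text.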
Issue: Vol 54 No 12 (2025)
Section: Original Article(s)
Keywords: Childhood obesity; Rohrer Index; Deep learning; Health behavior; School health survey
Rights and permissions: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



