Change in Life Expectancy Across Countries

Authors

  • Sutej Reddy Mandadi, Redmond High School

Abstract

This paper uses machine learning (ML) techniques to examine which factors contribute most to life expectancy. First, background research established that life expectancy is an effective indicator of a country’s overall health. Next, exploratory data analysis identified which features of the dataset were relevant to the study. With the features selected, three ML models were fitted to the data: multiple linear regression, random forest regression, and decision tree regression. These models helped identify how the features relate to one another and to life expectancy. The random forest regression model returned the highest R-squared value, so it was the model used for this study. The R-squared value indicates how closely the model’s predictions match the actual test data. To determine which features affected life expectancy most, feature importance was used. Feature importance is a metric that shows how strongly each feature influences the output value of an ML model. Running feature importance on the random forest regression model showed that a country’s gross domestic product (GDP) had the greatest effect on life expectancy. GDP is the total value of the final goods and services produced by a nation’s economy in a year. This result highlights the importance of economic activity to a country’s overall health. When a resource-constrained country performs better economically and raises its GDP, its output of goods and services increases, creating jobs and bringing more money into the nation. The additional financial resources give resource-constrained nations an opportunity to spend more on institutions such as health care and education, which in turn improves life expectancy.
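The R-squared comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition, R² = 1 − SS_res / SS_tot, and uses made-up life expectancy values purely for illustration; the function name `r_squared` and the sample numbers are hypothetical and are not taken from the paper or its dataset.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    # Variation left unexplained by the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot


# Hypothetical test-set life expectancies (years) and model predictions.
actual = [70.0, 65.0, 80.0, 75.0]
predicted = [72.0, 66.0, 78.0, 74.0]

score = r_squared(actual, predicted)
print(round(score, 2))  # prints 0.92
```

A value of 1 would mean the predictions match the test data exactly, while a value of 0 would mean the model does no better than always predicting the mean. Under this reading, the model with the highest score on the same test data is the one whose predictions track the observed values most closely.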

Published

2025-01-21

Data Availability Statement

The dataset used for this paper is available on Kaggle at https://www.kaggle.com/datasets/kumarajarshi/life-expectancy-who. The dataset was originally published by the World Health Organization.

Section

Research Articles