Novel insights into amyotrophic lateral sclerosis progression through machine learning: analysis of biomarkers and clinical observations in a large-scale patient database

Authors

  • Berke Yilmaz, Tesla STEM High School

Abstract

Amyotrophic Lateral Sclerosis (ALS) is a relentless and devastating neurodegenerative disease characterized by the progressive degeneration of motor neurons in the brain and spinal cord. This study aims to improve the tracking of ALS progression by identifying key predictors of decline in the ALS Functional Rating Scale (ALSFRS) score. Using the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) Database, a range of machine learning algorithms was applied, including logistic and LASSO regression, support vector machines, random forests, gradient boosted trees, explainable boosting machines, extreme gradient boosted trees, and neural networks. After data preprocessing, the study analyzed a clean cohort of approximately 6,000 patients and over 400 features, representing the most extensive dataset used in ALS research within the PRO-ACT framework to date. This dataset includes detailed demographics, medication usage, and blood marker information. The Explainable Boosting Machine (EBM) performed best, achieving an AUC of 0.81, accuracy of 0.74, recall of 0.73, and precision of 0.64, with substantial (80%) overlap in the key features identified across models. A total of 24 biomarkers were identified as playing a role in ALS progression, with Bicarbonate, Creatine Kinase, Creatinine, Chloride, Calcium, and Phosphorus standing out as the most important. The EBM feature importance scores ranked these biomarkers highly, and the Mann-Whitney U test (p < 0.001) confirmed their statistical significance, supporting their roles in the analysis of ALS progression.
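
As a rough illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes accuracy, recall, and precision from binary predictions, and derives AUC from the Mann-Whitney U statistic (AUC equals U divided by the number of positive-negative pairs). It is not the study's pipeline: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all hypothetical and are not drawn from PRO-ACT.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, and precision for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for x: count of pairs (a, b) with a > b; ties count as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def auc_from_scores(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the identity AUC = U / (n_pos * n_neg)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


# Hypothetical toy data: 1 = faster progression, 0 = slower progression.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = auc_from_scores(y_true, scores)
print(metrics, auc)
```

The same U statistic underlies the Mann-Whitney U test used to compare a biomarker between two patient groups; obtaining a p-value additionally requires the null distribution of U, which this sketch does not compute.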

Author Biography

  • Berke Yilmaz, Tesla STEM High School

    Berke Yilmaz is a senior at Tesla STEM High School with a deep passion for technology and medical science. His early fascination with machines and computers has evolved into a dedicated pursuit of computer science, where he has developed strong skills in coding and software development. This foundation has led him to explore the field of machine learning, applying data-driven techniques to solve complex problems and make informed predictions.

    Berke's interest has recently expanded to the intersection of technology and healthcare, particularly in the field of computational biology. His work focuses on understanding genetics, biological processes, and medical research, with the goal of leveraging technology to improve healthcare outcomes. Through his research, he aims to make a meaningful impact on people's lives.

    Looking ahead, Berke is eager to continue learning and innovating within these fields. Whether through further education, research projects, or entrepreneurial endeavors, he is committed to advancing the possibilities of technology to address real-world challenges.

Published

2025-01-21

Data Availability Statement

The data are available on the PRO-ACT website: https://ncri1.partners.org/ProACT/Home/Index

Section

Research Articles