AI Predicts Math Success Across Six East Asian Education Systems

AI Insight

This study analyzed PISA 2022 mathematics performance data from 26,969 fifteen-year-old students across six high-performing East Asian education systems. It compared six machine learning models, and XGBoost proved the most accurate at predicting achievement. Mathematics self-efficacy was the strongest predictor of performance, followed by participation in before-school extracurricular activities and weekly mathematics instructional time. Affective and behavioral factors consistently outweighed socioeconomic variables in importance. The model explained approximately 57% of the variance in student mathematics achievement.
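
To make "variance explained" concrete, here is a minimal sketch of how R² and RMSE are typically computed from observed and predicted scores. The function name `r2_and_rmse` and the four scores are hypothetical illustrations, not PISA data and not the study's code.

```python
import math

def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for paired lists of observed and predicted scores."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observed scores around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot      # share of variance explained
    rmse = math.sqrt(ss_res / n)    # typical error, in score points
    return r2, rmse

# Hypothetical scores for illustration only.
observed = [400.0, 500.0, 600.0, 700.0]
predicted = [450.0, 500.0, 600.0, 650.0]

r2, rmse = r2_and_rmse(observed, predicted)
print(f"R^2 = {r2:.2f}, RMSE = {rmse:.2f}")  # R^2 = 0.90, RMSE = 35.36
```

An R² of 0.90 in this toy case means the predictions account for 90% of the spread in the observed scores; the RMSE is expressed in the same units as the scores themselves.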


The findings suggest that educational interventions focused on building student confidence in mathematics and optimizing instructional time may be more effective than structural reforms alone. For policymakers and educators in East Asian and similar educational contexts, the evidence points toward prioritizing self-efficacy development and equitable learning opportunities.


Introduction

In the contemporary era of artificial intelligence and rapidly evolving knowledge systems, mathematics performance has become a critical competency for younger generations. Although the mathematical achievement of students in East Asian countries has attracted increasing scholarly attention, studies employing machine learning techniques to examine the combined determinants of their success remain limited.

Methods

This study evaluates six machine learning models (Random Forest, LightGBM, XGBoost, AdaBoost, Elastic Net, and Linear Regression) to identify the most accurate algorithm for predicting mathematics performance among students from six high-performing East Asian countries/economies participating in the Programme for International Student Assessment (PISA) 2022. A sample of 26,969 fifteen-year-old students was analyzed. Following model selection, a post hoc feature selection procedure was applied, retaining the 24 most influential predictors from an initial set of 62 variables to ensure analytical parsimony while preserving model performance. SHapley Additive exPlanations (SHAP) values and SHAP interaction analyses were employed to quantify the magnitude, direction, and heterogeneity of each predictor's contribution at the individual level, including nonlinear relationships.

Results

XGBoost emerged as the optimal model, demonstrating superior predictive accuracy (R² = 0.5758, RMSE = 65.06) and explaining approximately 57.03% of the variance in mathematics achievement. Mathematics self-efficacy was identified as the dominant predictor, exerting a substantially larger effect than all other variables, followed by participation in extracurricular activities before school and weekly mathematics instructional time. Affective, behavioral, and instructional factors consistently outperformed structural and socioeconomic variables in predictive importance.

Discussion and conclusion

These findings underscore the central role of student-proximal determinants in mathematics achievement within Confucian Heritage Culture educational contexts. Interpreted through the lens of self-determination theory, the results carry important implications for educational policy and practice, particularly in prioritizing self-efficacy development, optimizing instructional time, and promoting equitable learning environments. The study also contributes theoretically and offers directions for future research.
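
To illustrate the idea behind the SHAP values mentioned in the abstract, the sketch below computes exact Shapley values for a toy two-feature model by averaging each feature's marginal contribution over every ordering in which features can be "switched on". Everything here is an assumption made for illustration: the `toy_model` coefficients, the feature names, and the single all-zeros baseline are hypothetical, and practical SHAP tooling uses far more efficient algorithms and a background dataset rather than brute-force enumeration.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)          # start from the reference point
        prev = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature's real value
            new = predict(current)
            totals[name] += new - prev      # marginal contribution in this ordering
            prev = new
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(x):
    """Hypothetical score model with an interaction term (not the study's model)."""
    return (450.0
            + 40.0 * x["self_efficacy"]
            + 10.0 * x["instruction_hours"]
            + 5.0 * x["self_efficacy"] * x["instruction_hours"])

baseline = {"self_efficacy": 0.0, "instruction_hours": 0.0}  # assumed reference student
student = {"self_efficacy": 1.0, "instruction_hours": 2.0}   # hypothetical student

phi = shapley_values(toy_model, student, baseline)
print(phi)  # {'self_efficacy': 45.0, 'instruction_hours': 25.0}
```

The two contributions sum to 70, which is exactly the gap between the toy prediction for this student (520) and for the baseline (450). That additivity is what lets SHAP attribute an individual prediction to its features, and the interaction term shows how a shared effect is split between the features involved.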

Source: Using machine learning to predict student mathematics performance in six East Asian countries: evidence from PISA 2022