Medicine

A Reproducible Pipeline for Processing Commercial Wearable Step-Count Data in Aging Cohorts: Application and Evaluation in the STRRIDE-PD Reunion Study

AI Insight

This study presents a standardized, reproducible data processing pipeline for transforming raw step-count data from commercial Garmin wearable devices into usable research variables, tested in a cohort of 67 older adults from the STRRIDE-PD Reunion study. Wearable-derived step counts were significantly associated with 9 of 16 cardiometabolic and fitness outcomes, including cardiorespiratory fitness, body composition, and lipid profiles, while self-reported exercise showed no significant associations with any outcome. A regression calibration analysis confirmed that self-report systematically attenuated estimated associations, indicating that the choice of measurement instrument directly shapes scientific conclusions in physical activity research.


These findings underscore the inadequacy of self-reported physical activity as a sole measurement tool in aging epidemiology and highlight the need for standardized, transparent wearable data pipelines as foundational infrastructure for producing reliable and reproducible research on exercise and health outcomes in older populations.


⚠️ Preprint: Not yet peer-reviewed

This article has not yet been reviewed by independent experts. The results are preliminary and should be interpreted with caution.

Wearable devices offer the ability to objectively characterize free-living physical activity; however, raw step-count data generated by commercial devices require systematic processing before they can support rigorous inference. We describe a transparent, reproducible standard operating procedure (SOP) for transforming epoch-level step-count data from commercial Garmin devices into participant-level analytic variables and demonstrate its application in the STRRIDE-PD Reunion study, a long-term follow-up of older adults originally enrolled in a supervised exercise intervention trial. The pipeline standardizes timestamps, reconstructs daily epoch grids, infers wear time from observed step patterns, and applies a prespecified valid-day threshold (≥10 hours of inferred wear time) to generate participant-level summaries. Among 67 participants (mean age 71.4 years, 65.7% women), the median valid-day count was 10 days, median average daily steps were 5,794, and participant-level estimates were identical across the ≥10-hour and ≥6-hour valid-day thresholds. Wearable-derived step counts were significantly associated with 9 of 16 cardiometabolic and fitness outcomes, including cardiorespiratory fitness, body composition, and lipid profiles. By contrast, self-reported exercise, assessed via a frequency-by-duration composite ranked into deciles, was not significantly associated with any outcome. A regression calibration framework applied to the full sample quantified the attenuation underlying this discrepancy: the naive self-report model systematically underestimated associations relative to both the observed Garmin model and the calibration-corrected estimates. These findings demonstrate that the measurement approach is a determinant of scientific conclusions in physical activity research, and that reproducible wearable data pipelines are essential infrastructure for aging epidemiology.
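The four processing steps named in the abstract (timestamp standardization, daily epoch grids, inferred wear time, valid-day filtering) can be illustrated with a minimal Python sketch. This is not the authors' SOP: the abstract does not specify the epoch length, the input format, or the wear-time rule, so the 15-minute epoch, the `(UTC timestamp, steps)` record format, the rule "a clock hour counts as worn if it contains at least one step", and all function names below are assumptions made purely for illustration. Only the ≥10-hour valid-day threshold comes from the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

EPOCH_MINUTES = 15                       # ASSUMPTION: epoch length is not given in the abstract
EPOCHS_PER_DAY = 24 * 60 // EPOCH_MINUTES
MIN_WEAR_HOURS = 10                      # valid-day threshold stated in the abstract


def standardize(ts, utc_offset_hours=0.0):
    """Parse a naive ISO-8601 timestamp as UTC and shift it to local time."""
    utc = datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)
    return utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))


def daily_grids(records, utc_offset_hours=0.0):
    """Place each (timestamp, steps) record on a full 24-hour epoch grid per local day.

    Epochs with no record stay None, so missing data remain distinguishable
    from recorded zeros.
    """
    grids = defaultdict(lambda: [None] * EPOCHS_PER_DAY)
    for ts, steps in records:
        local = standardize(ts, utc_offset_hours)
        slot = (local.hour * 60 + local.minute) // EPOCH_MINUTES
        grid = grids[local.date()]
        grid[slot] = (grid[slot] or 0) + steps
    return dict(grids)


def inferred_wear_hours(grid):
    """ASSUMED wear rule: a clock hour is 'worn' if any of its epochs has steps > 0."""
    per_hour = 60 // EPOCH_MINUTES
    return sum(
        1
        for hour in range(24)
        if any((s or 0) > 0 for s in grid[hour * per_hour:(hour + 1) * per_hour])
    )


def summarize(records, utc_offset_hours=0.0, min_wear_hours=MIN_WEAR_HOURS):
    """Return participant-level summaries computed over valid days only."""
    grids = daily_grids(records, utc_offset_hours)
    valid_totals = [
        sum(s or 0 for s in grid)
        for _, grid in sorted(grids.items())
        if inferred_wear_hours(grid) >= min_wear_hours
    ]
    return {
        "n_days": len(grids),
        "n_valid_days": len(valid_totals),
        "mean_daily_steps": mean(valid_totals) if valid_totals else None,
    }
```

Calling `summarize(records)` once with `min_wear_hours=10` and once with `min_wear_hours=6` gives a simple version of the threshold sensitivity check described in the abstract, under the assumed wear rule.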

Source: A Reproducible Pipeline for Processing Commercial Wearable Step-Count Data in Aging Cohorts: Application and Evaluation in the STRRIDE-PD Reunion Study