AI Model Translates Wearable Activity Data into Natural Language Summaries
A novel generative framework integrates minute‑level wearable activity recordings with large language models to produce free‑text daily behavioral summaries. The system, evaluated on a dataset of 54,383 actigraphy‑text pairs derived from NHANES recordings, achieved a BERTScore‑F1 of 0.924 and a ROUGE‑1 score of 0.722, surpassing prompt‑based baselines by 7 percent in ROUGE‑1.
Methodology
The approach combines a pretrained actigraphy encoder with a lightweight projection module that maps behavioral embeddings into the token space of a frozen decoder‑only large language model. By keeping the language model frozen, the framework relies on the projection to bridge sensor representations and textual tokens, enabling autoregressive generation without fine‑tuning the entire language model.
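The prefix-style bridging described above can be sketched with plain numpy. This is a minimal illustration, not the paper's implementation: the embedding dimensions, patch count, and prompt length are all assumed for the example, and a real system would learn the projection with a deep-learning framework rather than use random weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed, illustrative dimensions -- not taken from the paper.
ENC_DIM = 256    # actigraphy-encoder output dimension
LLM_DIM = 768    # frozen LLM token-embedding dimension

# Lightweight trainable projection (the only bridge that is learned).
W = rng.normal(0.0, 0.02, size=(ENC_DIM, LLM_DIM))
b = np.zeros(LLM_DIM)

def project(sensor_embeddings: np.ndarray) -> np.ndarray:
    """Map (num_patches, ENC_DIM) behavioral embeddings into
    (num_patches, LLM_DIM) 'soft tokens' in the LLM's token space."""
    return sensor_embeddings @ W + b

# One day of minute-level activity, encoded into 24 patch embeddings.
sensor_emb = rng.normal(size=(24, ENC_DIM))
soft_tokens = project(sensor_emb)

# The soft tokens are prepended to the text-prompt embeddings; the frozen
# decoder then generates the summary autoregressively from this prefix.
prompt_emb = rng.normal(size=(8, LLM_DIM))   # hypothetical prompt embeddings
llm_input = np.concatenate([soft_tokens, prompt_emb], axis=0)
print(llm_input.shape)  # → (32, 768)
```

Because only `W` and `b` (and the encoder, if unfrozen) receive gradients, the language model's weights never change, which is what makes the approach lightweight.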
Dataset Construction
Researchers compiled a novel dataset containing 54,383 paired samples of raw actigraphy signals and corresponding textual descriptions. The data were sourced from real‑world recordings in the National Health and Nutrition Examination Survey (NHANES), ensuring a diverse representation of daily activity patterns across a broad population.
Performance Evaluation
Quantitative assessment reported a BERTScore‑F1 of 0.924, indicating high semantic fidelity between generated summaries and reference texts. Lexical accuracy measured by ROUGE‑1 reached 0.722, outperforming existing prompt‑based baselines by 7 percent. These metrics suggest the model reliably captures both meaning and wording of human‑written summaries.
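For readers unfamiliar with the lexical metric, ROUGE-1 is simply unigram overlap between a generated summary and its reference, combined into an F1 score. A minimal sketch (the example sentences are invented, not from the paper's data):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall,
    using clipped word counts for the overlap."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical generated/reference pair for illustration only.
generated = "the participant was highly active in the morning"
reference = "the participant was active in the morning and rested at night"
print(round(rouge1_f1(generated, reference), 3))  # → 0.737
```

BERTScore, by contrast, matches contextual token embeddings rather than surface words, which is why the two metrics together cover both meaning and wording.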
Training Dynamics
The model was trained using cross‑entropy loss on language tokens alone. Average training loss converged to 0.38 by epoch 15, reflecting stable optimization throughout the training process.
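Computing the loss "on language tokens alone" means the projected sensor prefix is masked out of the objective: only positions holding text tokens contribute. A numpy sketch of such a masked cross-entropy (sequence length, vocabulary size, and prefix length are assumed for illustration):

```python
import numpy as np

def masked_cross_entropy(logits, targets, loss_mask):
    """Mean cross-entropy over positions where loss_mask == 1.
    Sensor-prefix positions get mask 0, so only language tokens
    contribute to the training signal."""
    # numerically stable log-softmax over the vocabulary dimension
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    token_ll = log_probs[np.arange(len(targets)), targets]
    return -(token_ll * loss_mask).sum() / loss_mask.sum()

rng = np.random.default_rng(0)
seq_len, vocab = 10, 50                       # illustrative sizes
logits = rng.normal(size=(seq_len, vocab))
targets = rng.integers(0, vocab, size=seq_len)

# First 4 positions are the projected sensor prefix: excluded from the loss.
mask = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
loss = masked_cross_entropy(logits, targets, mask)
print(loss > 0)  # → True
```

In practice the same effect is usually achieved by setting prefix targets to an ignore index in the framework's cross-entropy loss.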
Qualitative Findings
Analysis of generated outputs demonstrated the model’s ability to reflect circadian structure and behavioral transitions within a day. Principal component analysis of embedding spaces showed improved cluster alignment after training, supporting the interpretability of the learned representations.
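The cluster-alignment check described above can be approximated with a two-component PCA and a centroid-separation score. A sketch using synthetic embeddings (the two behavioral groups, their dimensions, and the separation measure are all assumptions for illustration, not the paper's analysis):

```python
import numpy as np

def pca_2d(X: np.ndarray) -> np.ndarray:
    """Project embeddings onto their top-2 principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

rng = np.random.default_rng(0)
# Hypothetical embeddings for two behavioral groups of days.
active = rng.normal(loc=+1.0, size=(50, 64))
sedentary = rng.normal(loc=-1.0, size=(50, 64))
proj = pca_2d(np.vstack([active, sedentary]))

# Simple separation score: distance between group centroids in PCA space.
gap = np.linalg.norm(proj[:50].mean(axis=0) - proj[50:].mean(axis=0))
print(gap > 5.0)  # → True (the groups are well separated by construction)
```

A larger post-training centroid gap (relative to within-group spread) is one concrete way "improved cluster alignment" can be quantified.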
Implications and Future Work
By converting raw wearable sensor data into fluent, human‑centered narratives, the framework opens new pathways for behavioral monitoring, clinical review, and personalized health interventions. Future research may explore scaling to additional sensor modalities and integrating real‑time feedback mechanisms.
This report is based on the abstract of a research paper distributed via arXiv as an open-access preprint; the full text is available on arXiv.