This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Cardio, is properly cited. The complete bibliographic information, a link to the original publication on http://cardio.jmir.org, as well as this copyright and license information must be included.
Activity monitoring is necessary to investigate sedentary behavior after a stroke. Consumer wearable devices are an attractive alternative to research-grade technology, but measurement properties have not been established.
The purpose of this study was to determine the accuracy of 2 wrist-worn fitness trackers: Fitbit Charge HR (FBT) and Garmin Vivosmart (GAR).
Adults attending in- or outpatient therapy for stroke (n=37) wore FBT and GAR each on 2 separate days, in addition to an X6 accelerometer and Actigraph chest strap monitor. Step counts and heart rate data were extracted, and the agreement between devices was determined using Pearson or Spearman correlation and paired
Step counts from FBT and GAR positively correlated with the X6 accelerometer (ρ=.78 and ρ=.65,
FBT and GAR had moderate to strong correlation with best available reference measures of walking activity in individuals with subacute stroke. Accuracy appears to be lower among rollator users and varies according to heart rhythm. Consumer wearables may be a viable option for large-scale studies of physical activity.
Physical activity and exercise are recommended for stroke survivors because of the wide range of benefits that support recovery [
As an outcome measure for research trials, for example, testing the effectiveness of exercise training, self-report measures are frequently used to collect information on free-living physical activity but are prone to inaccuracy (eg, overestimation) from recall bias [
Historically, accelerometer-based activity monitors developed for research settings have been expensive and relatively difficult to use. For example, the Accelerometry for Bilateral Lower Extremities system, which accurately measures walking activity after stroke, requires trained personnel and a custom algorithm that operates on proprietary software to process the data [
Accelerometers can be reliable and valid for activity monitoring in persons with stroke [
This study was approved by the institutional research ethics board. Sample size target was 40 to represent the range of physical function typical of the subacute stroke population. Between June 2016 and March 2017, 37 adults attending in- or outpatient therapy for stroke at the Toronto Rehabilitation Institute provided written informed consent following an invitation to participate. Individuals were excluded if they were unable to walk without physical assistance from another person or if they were unable to understand written or spoken English. Participant characteristics are presented in
Participants wore 4 devices for 5.5-10 hours consecutively: (1) Actigraph chest strap heart rate monitor worn under clothing; (2) wGT3X+ sensor (Actigraph, Pensacola, Florida, USA); (3) Model X6-2mini (“X6”) accelerometer (Gulf Coast Data Concepts, LLC, Waveland, Mississippi, USA); and (4) consumer wearable device on the wrist of the less-affected arm: FBT (Fitbit Inc., San Francisco, California, USA) and GAR (Garmin Ltd., Schaffhausen, Switzerland), which were worn on 2 separate days within 1 week. The wGT3X+ sensor is also capable of accelerometry but was only used in this study to store the Actigraph heart rate data. Although body location differs between devices, the placements are consistent with previous validation methods and regular functionality such that results are applicable to use in “real-life.”
Study personnel visited inpatients on the stroke unit in the morning (typically between 8 am and 9 am) to don the devices and then retrieved them at the end of the workday (~4 pm). Outpatient participants were met during the day and sent home wearing the devices, along with instructions to remove them before bed and to return them at their next visit or therapy session. A piece of Fabrifoam was used to affix the wGT3X+ sensor and X6 accelerometer to the ankle of the less-affected leg. Participants were instructed to go about normal daily activities and not remove the devices unless required (eg, discomfort, personal hygiene, or risk of damage), or unless they became a burden. Upon retrieval or return, participants completed a feasibility questionnaire asking about their experience and thoughts on the device (
Participant characteristics (n=37).
Descriptive variable | Mean (SD)a, median, or count | Range or percentageb | |
Age, years | 64.4 (15.0) | 41-90 | |
Women | 13 | 35 | |
Height, cm | 171.0 (9.3) | 152-190.5 | |
Weight, kg | 77.1 (15.4) | 45-113 | |
Time post stroke, days | 42.6 (33.2) | 12-135 | |
Left | 20 | 54 | |
Right | 15 | 41 | |
Bilateral | 1 | 3 | |
None | 1 | 3 | |
NIH-SSc score | 2 | 0-11 | |
COVSd score | 85 | 65-91 | |
BBSe score | 53 | 4-56 | |
CMSAf stage of leg | 6 | 4-7 | |
CMSAf stage of foot | 6 | 3-7 | |
Walking speed, m/s | 0.92 (0.29) | 0.28-1.5 | |
None | 12 | 32 | |
Rollator | 17 | 46 | |
Single point cane | 8 | 22 | |
Atrial fibrillation | 6 | 16 |
aSD: standard deviation.
bPercentages may not sum to 100% due to rounding.
cNIH-SS: National Institutes of Health-Stroke Scale.
dCOVS: Clinical Outcome Variables Scale.
eBBS: Berg Balance Scale.
fCMSA: Chedoke-McMaster Stroke Assessment.
Within 2 days of activity monitoring, the following tests were conducted for each participant during a short data collection session or as part of routine clinical care (data then extracted from patient chart): the National Institutes of Health-Stroke Scale (NIH-SS) [
Maximum heart rate was determined in 1 of the following 3 ways: cardiopulmonary exercise test as part of routine care (value recorded by electrocardiography when respiratory exchange ratio >1.1; n=3); estimation using published formulas (164−0.7×age for individuals taking beta-blockers [
Step counts were extracted from the X6 accelerometer data using a previously validated custom-written algorithm implemented in MATLAB (MathWorks, Natick, Massachusetts, USA) [
Heart rate data, transmitted from the Actigraph chest strap monitor to the wGT3X+ sensor via Bluetooth, were transferred to a computer, initially processed in 60-second epochs using the ActiLife software version 6 (Actigraph, Pensacola, Florida, USA), and exported to a text file. We noted that a number of Actigraph data points were physiologically improbable (<45 beats per minute); these were removed before further processing. To allow for comparison of heart rate measurement, we created time-aligned Actigraph and FBT/GAR data series. Actigraph heart rate data were averaged over 5-min epochs to compare with FBT values, which were manually transcribed into a spreadsheet from the Web application due to manufacturer’s restrictions in accessing raw data. GAR heart rate time series data were downloaded from the Web application as TCX files in 60-second epochs and converted into text files.
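The preprocessing described above can be sketched as follows. This is a minimal illustration rather than the processing pipeline used in the study: the function name `average_epochs`, the input layout (one value per 60-second epoch, with `None` marking a missing epoch), and the decision to average only the valid minutes within each 5-min block are all assumptions.

```python
from statistics import mean

MIN_PLAUSIBLE_BPM = 45  # threshold stated in the text; lower values are dropped


def average_epochs(minute_hr, block=5, floor=MIN_PLAUSIBLE_BPM):
    """Collapse 60-second heart rate epochs into `block`-minute means.

    minute_hr: beats-per-minute values, one per minute, with None marking
    a missing epoch (assumed layout, not the ActiLife export format).
    Returns one value per block; a block with no valid minutes yields None.
    """
    averaged = []
    for start in range(0, len(minute_hr), block):
        chunk = minute_hr[start:start + block]
        # Discard missing and physiologically improbable readings
        valid = [v for v in chunk if v is not None and v >= floor]
        averaged.append(mean(valid) if valid else None)
    return averaged


# Hypothetical 12-minute recording: one improbable value and a 5-minute gap
hr = [72, 74, 30, 76, 78, None, None, None, None, None, 80, 82]
five_min = average_epochs(hr)
```

In this hypothetical recording, the improbable 30 bpm reading is excluded from the first block and the second block remains missing, so it can later be dropped from both devices when the series are aligned.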
Because a large amount of heart rate data were missing, we first examined responsiveness of Actigraph, FBT, and GAR measures to changes in activity. Step counts from the X6 accelerometer were tallied over 5-min (for FBT) and 1-min (for GAR) epochs and aligned with the heart rate data. Heart rate, as recorded by each device, was averaged over all periods of rest (epochs with zero steps recorded) and at 3 intensities of walking activity: 50-79%, 80-99%, and ≥100% of comfortable cadence (based on self-selected walking on the GAITRite mat). We then determined agreement between the Actigraph and FBT/GAR heart rate data. If epochs were missing for one device, the corresponding data were deleted for the other device. From these modified time series, average heart rate and time within a target zone (55-80% of maximum heart rate) were calculated for each device.
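The pairwise deletion and target-zone calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function name `paired_summary`, the use of `None` for missing epochs, and the inclusive zone boundaries are hypothetical choices.

```python
def paired_summary(ref_hr, wrist_hr, max_hr, epoch_min, zone=(0.55, 0.80)):
    """Return ((mean HR, minutes in zone) for reference, same for wrist device).

    ref_hr, wrist_hr: time-aligned heart rate series of equal epoch length,
    with None marking a missing epoch (assumed layout).
    An epoch is kept only if both devices have a value (pairwise deletion).
    """
    pairs = [(r, w) for r, w in zip(ref_hr, wrist_hr)
             if r is not None and w is not None]
    if not pairs:
        raise ValueError("no epochs with data from both devices")

    lower, upper = zone[0] * max_hr, zone[1] * max_hr

    def summarize(values):
        epochs_in_zone = sum(1 for v in values if lower <= v <= upper)
        return sum(values) / len(values), epochs_in_zone * epoch_min

    return summarize([r for r, _ in pairs]), summarize([w for _, w in pairs])


# Hypothetical 5-min epochs for one participant with a maximum heart rate of 160
reference = [90, 100, None, 130, 86]
wrist = [92, None, 95, 120, 82]
ref_stats, wrist_stats = paired_summary(reference, wrist, max_hr=160, epoch_min=5)
```

Only 3 of the 5 epochs survive the pairwise deletion in this example, which shows how missing data on either device shortens the series available for comparison.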
Step counts from FBT and GAR were compared with the X6 accelerometer using Spearman correlation (ρ) and Wilcoxon signed rank tests. Both step count analyses were conducted for the whole group and separately by usual gait aid. To test the responsiveness of the devices to changes in physical activity, average heart rate at each intensity was compared with resting heart rate using paired
Bland-Altman plots were used to visualize interdevice agreement. For each participant, the difference between the FBT or GAR value and the reference measurement was plotted against the average of the 2 values. The mean or median difference and its 95% CI or interquartile range were represented by lines on each graph.
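The quantities behind a Bland-Altman plot can be sketched as follows. This is a minimal example under assumptions: the function name `bland_altman` is hypothetical, the difference is taken as the reference value minus the wrist-device value (the direction given in the table footnotes), and the quartiles use the default method of Python's `statistics.quantiles`, which may differ from the method used in the study. The 95% CI of the mean difference is omitted for brevity.

```python
from statistics import mean, median, quantiles


def bland_altman(reference, device):
    """Compute per-participant plot points and summary lines.

    Each point is (average of the 2 values, reference minus device).
    """
    diffs = [r - d for r, d in zip(reference, device)]
    avgs = [(r + d) / 2 for r, d in zip(reference, device)]
    q1, _, q3 = quantiles(diffs, n=4)  # assumed quartile method
    return {
        "points": list(zip(avgs, diffs)),
        "mean_diff": mean(diffs),
        "median_diff": median(diffs),
        "iqr": (q1, q3),
    }


# Hypothetical step counts for 4 participants
accelerometer = [1000, 2000, 3000, 4000]
tracker = [900, 2100, 2500, 3600]
result = bland_altman(accelerometer, tracker)
```

A positive difference in this sketch means the wrist-worn device undercounted relative to the reference, matching the sign convention used in the tables.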
Out of all the participants, 5 chose not to complete the second day of the study; therefore, analysis of step count data was limited to 36 participants for FBT and 33 for GAR. Furthermore, 2 participants declined to wear the Actigraph chest strap, and there were less than 60 min of valid heart rate data on both devices for 4 (when wearing FBT) and 9 participants (when wearing GAR); therefore, comparison of Actigraph and wrist-device heart rate data was limited to 30 (for FBT) and 22 (for GAR) participants. Potential reasons for missing data are discussed below.
Results of the step count analyses are presented in
Agreement in step counts between X6 accelerometer and wrist-worn devices.
Group | FBTa n | FBT ρ (P value) | FBT differencec | FBT error | FBT P values | GARb n | GAR ρ (P value) | GAR differencec | GAR error | GAR P values |
All participants | 36 | .78 (<.001) | 463 | 32.9% | .002; .99 | 33 | .65 (<.001) | 963 | 40.9% | .008; .99 |
No gait aid | 13 | .97 (<.001) | −203 | 10.3% | .85; .32 | 11 | .56 (.07) | −561 | 23.1% | .28; .77 |
Rollator | 15 | .42 (.12) | 926 | 52.3% | .01; .99 | 15 | .30 (.27) | 1390 | 67.2% | .001; >.99 |
Single-point cane | 8 | .98 (<.001) | 406 | 12.6% | .08; .90 | 7 | .93 (.003) | 963 | 21.3% | .02; .99 |
aFBT: Fitbit Charge HR.
bGAR: Garmin Vivosmart.
cThe difference is calculated as the X6 accelerometer step count minus the wrist device step count; therefore, a positive value means the wrist-worn device undercounted, whereas a negative value means the wrist-worn device overcounted.
dIQR: interquartile range.
Bland-Altman plots of step count agreement between X6 accelerometer (ACC) and wrist-worn devices: left, Fitbit Charge HR (FBT), and right, Garmin Vivosmart (GAR). Solid bold line is the median difference between step count measurements, averaged over all participants. Dashed lines are the interquartile range of the difference. Note that the scale on the y-axis is not the same between the graphs.
For step count measurement of all participants combined, there was a strong positive correlation between the X6 accelerometer and FBT (ρ=.78,
For the gait aid subanalyses, there were strong positive correlations between X6 accelerometer and FBT step counts (ρ>.97,
On average, valid Actigraph, FBT, and GAR heart rate data were available for 42.4% (95% CI 35.7-48.8), 95.3% (95% CI 93.3-97.2), and 75.1% (95% CI 63.8-86.5) of the time worn during monitoring, respectively. Data indicating responsiveness of heart rate to changes in activity are presented in
Responsiveness of heart rate devices to change in walking activity.
Cadence (%) | Five-min epochs | One-min epochs | ||||||
FBTa | GARb | |||||||
50-79 | 6.3 (3.2-9.3) | <.001 (27) | 4.7 (0.6-8.8) | .03 (20) | 2.8 (0.9-4.7) | .006 (30) | 3.5 (0.7-6.3) | .02 (28) |
80-99 | 15.4 (10.0-20.7) | <.001 (13) | 11.8 (−0.2 to 23.7) | .05 (7) | 3.5 (0.5-6.5) | .02 (26) | 6.6 (3.8-9.5) | <.001 (25) |
≥100 | 16.8 (7.9-25.8) | .005 (6) | 17.8 (−26.3 to 62.0) | .22 (3) | 1.8 (−1.6 to 5.3) | .27 (14) | 12.9 (5.7-20.2) | .003 (9) |
aFBT: Fitbit Charge HR.
bGAR: Garmin Vivosmart.
cValues presented are mean increase in heart rate from rest (“percentage change”), expressed as a percentage of estimated maximum heart rate, with 95% CI in parentheses.
Both Actigraph (for 1-min epochs) and FBT showed significant increases in heart rate when participants walked at greater than or equal to 50% of their self-paced cadence compared with rest (
Results of the comparison in heart rate data between devices are presented in
When participants without arrhythmia were analyzed separately, the correlation between Actigraph and FBT for average heart rate (
Agreement in heart rate data between Actigraph and wrist-worn devices.
Group | FBTa n | FBT correlation (P value) | FBT differencec | FBT error | FBT P values | GARb n | GAR correlation (P value) | GAR differencec | GAR error | GAR P values |
Average heart rate | | | | | | | | | | |
All participants | 30 | .53 (.003) | 2.4 | 10.1% | .28; .30 | 22 | .75 (<.001) | −0.5 | 7.4% | .78; .16 |
No arrhythmia | 24 | .64 (<.001) | 1.1 | 9.9% | .61; .20 | 19 | .74 (<.001) | −1.1 | 7.7% | .59; .23 |
Atrial fibrillation | 6 | .16 (.77) | 7.6 | 10.7% | .31; .67 | 3 | .87 (.33) | 3.1 | 5.8% | .46; .47 |
Time in target zone | | | | | | | | | | |
All participants | 30 | .49 (.006) | −15 | 42.9% | .43; .67 | 22 | .74 (<.001) | −5.5 | 28.4% | .15; .82 |
No arrhythmia | 24 | .57 (.004) | −15 | 42.9% | .45; .70 | 19 | .73 (<.001) | −8 | 29.4% | .16; .85 |
Atrial fibrillation | 6 | −.03 (.96) | −5 | 66.7% | .88; .58 | 3 | 1.0 (<.001) | −1 | 15.7% | .75; .38 |
aFBT: Fitbit Charge HR.
bGAR: Garmin Vivosmart.
cThe difference is calculated as the Actigraph value minus the value for the wrist device; therefore, a positive value means the wrist-worn device underestimated, whereas a negative value means the wrist-worn device overestimated.
dIQR: interquartile range.
fPearson correlation coefficient.
gSpearman correlation coefficient.
Bland-Altman plots of agreement between Actigraph and wrist-worn devices: left, Fitbit Charge HR (FBT), and right, Garmin Vivosmart (GAR) for mean heart rate (top) and median time in target zone (bottom). Solid bold line is the mean difference between measurements, averaged over all participants. Dashed lines are the 95% CI or interquartile range of the difference. Note that the scale on the y-axis is not the same between the graphs.
The positive correlations between Actigraph and GAR for average heart rate and time in zone remained high for both those without arrhythmia (
All participants completed the feasibility questionnaire for at least 1 device; 27 individuals evaluated their experience with all 4 devices (both fitness trackers, X6 accelerometer, and chest strap). In terms of comfort, 94% (31/33) and 97% (32/33) of participants found FBT and GAR, respectively, to be somewhat or very comfortable, whereas 89% (33/37) and 91% (31/34) said the same for the X6 accelerometer and heart rate monitor, respectively. Overall, 7 individuals reported problems wearing the devices. Issues included general discomfort, trouble with doffing, and wrist strap feeling too tight. When asked about the level of confidence in their ability to don and doff independently, the average response, based on a visual analog scale from 0 (not confident at all) to 10 (extremely confident), was 8.8 for all devices except for the chest strap (7.7). A large majority of participants said they would be likely or very likely to participate in a study that involved wearing the FBT (28/33, 85%), GAR (29/33, 88%), X6 accelerometer (30/37, 81%), or heart rate monitor (24/34, 71%) every day for 1 week. Some concerns included sleeping with the device and remembering to put it on.
The main finding of this study is that FBT and GAR had a moderate to strong correlation with the best available reference devices for measuring walking activity in terms of step count and heart rate among individuals with subacute stroke. Accuracy varied widely according to mobility status and based on whether or not heart rhythm was normal. The consumer devices were well accepted by participants.
In patients not using a gait aid, the steps counted by the fitness trackers were not different from those counted by the X6 accelerometer; however, GAR was less accurate (23.1% error) than FBT (10.3% error), and equivalence was not demonstrated. Neither device appears suitable for rollator users due to significant undercounting, and despite strong correlation of FBT with the X6 accelerometer for single-point cane users, accuracy was low (>10% error). Most studies on consumer wearables have been conducted with healthy adults, but, consistent with our results, overall validity of step counts with a tendency toward underestimation has been found [
Although the high accuracy of chest strap monitors is well established, comparison of heart rate data was complicated by the low reliability of the Actigraph acquisition system. This may have been due to Bluetooth transmission problems, drying of electrode areas over time, or chest strap placement issues such as slippage through the day. Therefore, the positive correlations of the Actigraph with FBT and GAR could be over- or underestimated, and power to detect a difference in average heart rate or time in target zone was reduced. Considering data availability and responsiveness, FBT appeared superior, but heart rate measured by GAR was also sensitive to changes in walking intensity (ie, cadence). In general, intensity of physical activity appeared to be relatively low according to total time in target zone and the sample size of higher cadence levels, although data may not have been available when participants walked quickly, which could account for the relatively high error associated with this parameter. For individuals with atrial fibrillation, FBT had lower agreement with the Actigraph (10.7% average heart rate error) than did GAR; however, it is not clear to which device the inaccuracy for this clinical subgroup can be attributed, as no criterion standard measurement was performed for comparison (see below). Some wearable heart rate monitors based on photoplethysmography (optical detection of blood volume changes) have been evaluated with evidence of variable accuracy [
It may have been possible to minimize the large amount of missing data by performing the study in a laboratory with constant supervision of the participants, but, aside from being more resource-efficient, our design benefits from ecological validity. Devices were compared from different but typical and realistic body positions. The portability of the technology tested allowed for monitoring to take place under free-living conditions over many hours such that a range of activity levels could theoretically be captured. This precluded the use of “gold standard” measures such as electrocardiography and visual step counts to establish criterion validity. Clinically relevant variables were evaluated, and subanalyses revealed differences between groups to more precisely guide the interpretation of results. Although manual entry of some data was necessary for the purpose of this study, the commercially intended functions of consumer wearables provide information in a user-friendly format that could easily be applied in clinical or research settings.
Overall, the strength of correlations and measures of accuracy suggest that FBT is valid for step-counting in individuals who do not use a gait aid, whereas both devices are suitable for group analyses that tolerate greater measurement variability. The tendency to underestimate steps and general lack of equivalence with reference standards should be considered. FBT was reliable, responsive, and accurate for recording nonarrhythmic heart rate. Assessing validity in participants with atrial fibrillation was limited by low sample sizes. These results, along with the generally positive feedback from the feasibility questionnaire, imply that fitness trackers may be a viable alternative to “research grade” activity monitors for large clinical trials. As more commercial models and algorithms are developed, consumer wearables should continue to be investigated for accuracy. The selection of a device for research or health care purposes will ultimately depend on the context, including patient population and primary outcome.
Feasibility questionnaire.
BBS: Berg Balance Scale
CMSA: Chedoke-McMaster Stroke Assessment
COVS: Clinical Outcome Variables Scale
FBT: Fitbit Charge HR
GAR: Garmin Vivosmart
NIH-SS: National Institutes of Health-Stroke Scale
SD: standard deviation
The authors acknowledge the support of the Toronto Rehabilitation Institute; equipment and space have been funded with grants from the Canada Foundation for Innovation, Ontario Innovation Trust, and the Ministry of Research and Innovation. AM is supported by a New Investigator Award from the Canadian Institutes of Health Research (MSH-141983).
None declared.