Is there bias in GPS-enabled smartphone cycling app data?

By Michael Brenneis

Smartphones with GPS tracking ability are capable of collecting large amounts of pedestrian and cyclist movement data. But do tracking apps developed largely for athletic or route-planning use capture the big picture of where pedestrians and cyclists travel and what infrastructure they use? The answer, according to a new study in the Journal of Transport & Health, is “no.” These apps miss data from segments of the cycling population, as well as information about the usage of particular kinds of infrastructure by riders with particular characteristics.

The researchers in this study interviewed cyclists at 29 sites in the City of Atlanta where cyclists were likely to be found. They approached either people who were unlocking bikes or people with bicycles who were passing an event table. This approach missed riders who didn’t stop, or who stopped only in an unsurveyed area. The 99 participants answered an extensive list of questions about their riding purposes (utility or leisure), riding frequency, and use of fitness or route-planning apps. They also drew their typical routes on a map and answered a list of demographic and socioeconomic questions.

The researchers found that app users were generally similar to non-app users in sociodemographic characteristics, but were slightly older and had slightly higher incomes. Riding characteristics did show some differentiation. App users were more likely to describe themselves as strong, fearless riders. App users rode more frequently than non-app users, and rode farther on a weekly basis. Non-app users spent a little more time in shared travel lanes than app users, perhaps out of necessity, since they also rode more for utility. App usage was reported by 39 percent of respondents.

The researchers used the survey results to simulate route data for each rider, based on how often the rider would follow a typical route and on the probability of capturing those rides given the respondent’s pattern of app use. Ride-level data was “estimated by extrapolating rider-described typical rides to a one-year period.”
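
To make the extrapolation idea concrete, here is a minimal sketch in Python. It is not the study's actual method, which the article does not spell out. The `Rider` fields, the 52-week year, and the three sample riders are all hypothetical, chosen only to show how per-rider frequency and recording probability combine into an overall capture share.

```python
from dataclasses import dataclass

WEEKS_PER_YEAR = 52  # assumption: typical rides are extrapolated over a full year


@dataclass
class Rider:
    rides_per_week: float  # hypothetical: how often the typical route is ridden
    record_prob: float     # hypothetical: chance a given ride is recorded (0 for non-app users)


def annual_rides(rider: Rider) -> float:
    """Extrapolate a rider's typical weekly riding to a one-year period."""
    return rider.rides_per_week * WEEKS_PER_YEAR


def expected_captured_share(riders: list[Rider]) -> float:
    """Expected fraction of all rides that end up in app data."""
    total = sum(annual_rides(r) for r in riders)
    captured = sum(annual_rides(r) * r.record_prob for r in riders)
    return captured / total if total else 0.0


# Made-up riders: one non-app user and two app users with different habits.
riders = [Rider(5, 0.0), Rider(3, 0.8), Rider(2, 0.5)]
share = expected_captured_share(riders)
print(f"Expected share of rides captured: {share:.2f}")
```

In this toy example a frequent rider who never records pulls the overall share well below the app users' own recording rates, which is the kind of gap the simulated data was meant to expose.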

The simulated data revealed potential gaps in the capture of riding data. An estimated 34.6 percent of rides were captured by an app. Fitness apps recorded an estimated 40.9 percent of leisure rides but only 25.1 percent of commutes. Fitness apps recorded an estimated 39.4 percent of the distance ridden on multi-use trails, and 20.8 percent of the estimated distance ridden in protected bike lanes. It’s possible that app users, while riding for exercise, prefer off-street paths to protected bike lanes, for example. App users also may not record their commuting or errand-running rides, during which they could make more use of protected, on-street infrastructure.

When evaluating the relative safety of different infrastructure (protected versus conventional bike lanes, for example) on a crashes-per-kilometer-ridden basis, the researchers found more plausible results when they accounted for the proportion of rides recorded by apps on each type of infrastructure individually, rather than lumping all types of infrastructure together. Weighting calculations by the overall expected proportion of ridership, rather than by the infrastructure-specific value, could underestimate the relative safety of protected bike lanes, for example.
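
The effect of the weighting choice can be illustrated with a short Python sketch. It assumes a simple scaling, in which total distance is estimated as app-recorded distance divided by the share of distance that apps capture; the article does not give the researchers' exact formula. The 20.8 percent figure for protected bike lanes comes from the study as reported above, while the overall share of 0.35, the crash count, and the recorded distance are hypothetical values chosen for illustration.

```python
def crash_rate_per_km(crashes: int, app_km: float, capture_share: float) -> float:
    """Crashes per kilometer, after scaling app-recorded distance up to
    an estimate of total distance ridden (assumed scaling, see lead-in)."""
    if not 0 < capture_share <= 1:
        raise ValueError("capture_share must be in (0, 1]")
    estimated_total_km = app_km / capture_share
    return crashes / estimated_total_km


PROTECTED_LANE_SHARE = 0.208  # reported: share of protected-lane distance recorded by apps
OVERALL_SHARE = 0.35          # hypothetical overall share, not taken from the study

crashes = 4       # hypothetical crash count on protected lanes
app_km = 1000.0   # hypothetical app-recorded distance on protected lanes

rate_specific = crash_rate_per_km(crashes, app_km, PROTECTED_LANE_SHARE)
rate_overall = crash_rate_per_km(crashes, app_km, OVERALL_SHARE)

print(f"Infrastructure-specific weighting: {rate_specific:.6f} crashes/km")
print(f"Overall weighting:                 {rate_overall:.6f} crashes/km")
```

Because apps capture a smaller share of protected-lane riding than the assumed overall share, the overall weighting understates how much riding happens there and so yields a higher crash rate, making protected lanes look less safe than the infrastructure-specific weighting suggests.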

Michael Brenneis is an Associate Researcher at SSTI.