Development of Time-location Weighted Spatial Measures Using Global Positioning System Data
Abstract
Objectives
Despite the increasing availability of global positioning system (GPS) technology, no research has analyzed GPS data for exposure opportunities associated with time spent in indoor and outdoor microenvironments. We developed location-based and time-weighted spatial measures that incorporate indoor and outdoor time-location data collected by GPS.
Methods
Time-location data were drawn from 38 female subjects in California who wore a GPS device for seven days. An ambient standard deviational ellipse (SDE) was determined from outdoor locations and time durations, while an indoor time-weighted SDE was developed to incorporate both indoor and outdoor times and locations into the ellipse measure.
Results
Our findings indicated that there was considerable difference in the sizes of exposure potential measures when indoor time was taken into consideration, and that they were associated with day type (weekday/weekend) and employment status.
Conclusions
This study provides evidence that time-location weighted measures may provide better accuracy in assessing exposure opportunities at different microenvironments. The use of GPS likely improves the geographical detail and accuracy of time-location data, and further development of such location-time weighted spatial measures is encouraged.
Introduction
Human time-location behavior is an important determinant of environmental health. There are several methods used for acquiring time-location information including time-activity diaries, questionnaires, and observation [1]. These methods can be heavily affected by recall abilities and the voluntary participation of subjects. With the recent development of geographic information systems (GIS) and related geospatial technology, including global positioning system (GPS), it becomes increasingly possible to examine the interrelationship between human time-location behaviors and exposure potentials. Recent validation studies indicated that GPS could improve quality and collection efficiency of time location data [2,3].
GPS technology is now widely available and has been used across various settings of public health research and practice: mapping the concentrations of pollutants and environmental risk factors, tracking and identifying high-risk population groups, characterizing environmental correlates of the built environment, and further developing exposure assessment methods [4-8]. Recent review articles also document best-practice guidelines on the use of GPS in health studies [9,10]. Although GPS technology has potential for environmental health and exposure science, a number of important issues need to be addressed. One major obstacle is the lack of appropriate methods to represent and analyze large amounts of time-location data in the context of human exposure assessment. Previous studies have focused on the feasibility and validation of GPS-collected time-activity data [2,3,11] and on the development of wearable GPS loggers or GPS-enabled cell phones [8,12,13] to investigate the effects of time-location behaviors on health. Few examples exist of methods for collecting and processing complex and large sets of space-time mobility data in health research [9], and no methods are yet available to summarize exposure potentials or opportunities that may vary with an individual's mobility patterns in space and time.
Another issue is that some GPS devices fail to receive a signal indoors [14], so automatically collected GPS data often represent only the time spent at outdoor locations. In the US, people spend more than 80% of their time indoors, mostly in their own homes [15]. With respect to a specific health outcome, for example, a recent study found that advancing weeks of pregnancy were positively correlated with time spent at home [16]. Human mobility adds further complexity to understanding population exposures, because people may encounter different exposures and hazards as they move through different microenvironments. Little research has been conducted to analyze GPS-collected time-location data for exposure opportunities associated with time at indoor and outdoor microenvironments. Indoor concentrations of certain pollutants may differ from outdoor levels because of indoor sources and ventilation patterns. For instance, the US Environmental Protection Agency Total Exposure Assessment Methodology studies reported higher indoor levels of common organic pollutants [17]. However, previous environmental epidemiological studies were mostly based on outdoor measurements of pollutants and did not consider indoor exposures.
The purpose of the study was to develop methods that incorporate a location-time weighted spatial distribution of subject mobility using GPS-collected time-location data. We developed two types of location-based and time-weighted spatial measures: the ambient standard deviational ellipse (A-SDE), which represents the average size of time-location variability using outdoor locations only, and the indoor time-location weighted SDE (IW-SDE) with equal indoor and outdoor weights, which represents both indoor and outdoor exposures based on time and exposure weights.
Materials and Methods
Study Population and Data Collection
Time-location data were collected from a subset of study participants in California as part of the Study of Personal Exposure Related Behavior (SUPERB). Details of SUPERB have been described elsewhere [18]. In short, SUPERB collected data in three main tiers with data collection platforms for food consumption, temporal-spatial activity, and household products. A small number of subjects from the Tier 1 study participants (n=555) were randomly selected to test innovative data collection strategies as part of the data collection platform for temporal-spatial activity patterns. Forty-four study participants (6 males and 38 females) provided written informed consent and agreed to participate in the GPS data collection. The observation period was one week, and study participants were asked to put on the GPS device upon waking. They were also asked to record the places they visited during the week while they were wearing the GPS. In this study, the time-location data of the 38 female subjects were analyzed. Male participants were excluded primarily because of the small sample size and major gender differences in time use and activity patterns. The average age of the subjects was 46 years (range, 23 to 81).
Participants were asked to wear a small portable GPS device attached to their waist in a small pouch. The GPS we selected (Mini Tracker-MT4100; SkyTRX, Crestview, FL, USA) was found to have spatial accuracy within 2.5 meters outdoors. Among the four devices (Garmin 60, Olathe, KS, USA; Garmin Forerunner 201, Olathe, KS, USA; Geostats GeoLogger, Atlanta, GA, USA; Skytrax Minitracker MT4100, Crestview, FL, USA) tested for accuracy, portability, battery life, cost, and mapping capability, it was selected as the most reliable one for time-location measurement [14]. It can store up to 100 hours of data (about 360,000 data points) at a time interval of one second, which is sufficient to capture the movement of subjects outdoors. The unit has a universal serial bus port; data were downloaded onto a computer by study personnel during their field visits and processed using Skytech software that interfaced with the internet, providing both the actual GPS locations at each time and a record of the times spent at each residential block.
The GPS provided geographic location and time continuously for the 7-day period in 2007-2008. Preprocessing was required to convert the GPS-collected data into a form usable in the analyses. The steps included identifying the geographic coordinates (latitude and longitude) of data points and converting them into the Universal Transverse Mercator coordinate system in miles, identifying routes of movement and the time spent at a specific location, and detecting and removing outliers. Possibly missing and outlying data points were checked during preprocessing using information from the logged GPS data (geographic coordinates, elevation, speed of movement, duration, and signal strength) and the participants' activity logs. For example, routes of movement were identified by connecting the points in temporal order and were used to identify outlying values. When participants recorded no outdoor activities on a specific date in their activity logs, we also excluded the corresponding GPS data. Further, outdoor locations and the time spent outdoors were identified from individual data points with a duration of at least one minute, to reduce possible loss of GPS data due to signal dropouts. Because most GPS data were recorded when the device was outdoors, our study treated all moving data points with a duration of over 1 minute as outdoor measured points.
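One of the preprocessing checks described above, using the speed of movement along the time-ordered route to flag outlying points, can be sketched as follows. This is a minimal illustration and not the study's actual procedure: the function name `drop_speed_outliers`, the tuple layout, and the 90 mph threshold are all hypothetical, and coordinates are assumed to be planar (for example, UTM expressed in miles).

```python
import math

def drop_speed_outliers(fixes, max_mph=90.0):
    """Drop fixes whose implied speed from the last kept fix is implausible.

    fixes: iterable of (t_seconds, x_miles, y_miles) in planar coordinates.
    max_mph: hypothetical plausibility threshold, not taken from the study.
    """
    kept = []
    for t, x, y in sorted(fixes):  # connect points in temporal order
        if not kept:
            kept.append((t, x, y))
            continue
        t0, x0, y0 = kept[-1]
        hours = (t - t0) / 3600.0
        if hours <= 0:
            continue  # duplicate timestamp: skip
        speed = math.hypot(x - x0, y - y0) / hours
        if speed <= max_mph:
            kept.append((t, x, y))
    return kept

# Illustrative data only: the third fix jumps about 50 miles in one minute.
fixes = [(0, 0.0, 0.0), (60, 0.5, 0.0), (120, 50.0, 0.0), (180, 1.5, 0.0)]
print(drop_speed_outliers(fixes))
```

Note that this simple rule trusts the first fix; a real pipeline would likely combine it with the other signals mentioned in the text (elevation, signal strength, and the activity log).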
Time-location data were collected only for study participants who made a trip during the seven-day period (229 person-days in total). The GPS data yielded a total of 23,629 minutes of measured data points over the 7-day observation periods, an average of 103 minutes of outdoor location data per person-day. The data implied that the subjects spent 7% of their time outdoors. Because the GPS did not record indoors, the gaps between outdoor times allowed us to quantify the time spent indoors. Of the total 329,760 minutes of time-location data (229 days × 1,440 minutes per day), about 93% were classified as indoor locations. A new variable indicating time in a specific geographic location was created, and geographic coordinates for indoor locations were imputed using the measured GPS data values before and after the locations for outdoor activities that lasted longer than 1 minute. Measured and imputed times spent outdoors and indoors were transferred into a GIS database system for spatial analysis and modeling. A prototype system consisting of a time-location database in GIS was built, and SDEs of study participants were then generated to represent time-location variability using the R aspace package [19].
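The gap-based identification of indoor time can be illustrated with a short sketch. The `Segment` class and `impute_indoor` function are hypothetical names, and the imputation rule (placing each indoor stay at the midpoint of the outdoor locations measured before and after the gap) is an assumption made for illustration; the text does not specify the exact rule. The last lines recompute the totals reported above.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60

@dataclass
class Segment:
    start_min: float  # minutes since the start of the observation day
    end_min: float
    x: float          # planar coordinates in miles
    y: float
    kind: str         # "outdoor" (measured) or "indoor" (imputed)

def impute_indoor(outdoor, total_minutes=MINUTES_PER_DAY):
    """Fill the gaps between non-overlapping outdoor segments with indoor ones."""
    outdoor = sorted(outdoor, key=lambda s: s.start_min)
    result, cursor = [], 0.0
    for i, seg in enumerate(outdoor):
        if seg.start_min > cursor:
            prev = outdoor[i - 1] if i > 0 else seg
            # Assumed rule: midpoint of the measured points before and after.
            result.append(Segment(cursor, seg.start_min,
                                  (prev.x + seg.x) / 2, (prev.y + seg.y) / 2,
                                  "indoor"))
        result.append(seg)
        cursor = max(cursor, seg.end_min)
    if outdoor and cursor < total_minutes:
        last = outdoor[-1]
        result.append(Segment(cursor, total_minutes, last.x, last.y, "indoor"))
    return result

# Totals reported in the text.
PERSON_DAYS = 229
OUTDOOR_MINUTES = 23_629
TOTAL_MINUTES = PERSON_DAYS * MINUTES_PER_DAY        # 329,760
OUTDOOR_PER_DAY = OUTDOOR_MINUTES / PERSON_DAYS      # about 103 minutes
OUTDOOR_SHARE = OUTDOOR_MINUTES / TOTAL_MINUTES      # about 7%
```

Each resulting segment carries a location and a duration, which is the form needed to weight locations by time in the ellipse measures.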
Development of A-SDE and IW-SDE
The SDE was employed primarily to depict the spatial distribution of study participants' indoor and outdoor time-locations. The SDE is one of the centrographic measures used to characterize the dispersion of points around a mean center, and the major steps for deriving SDEs are well described [20]: for a set of points, the mean center values of x and y, the rotation angle of the ellipse, and the lengths of the major and minor elliptical axes (standard deviations [SDs] of x and y) are determined to construct the SDE. Two spatial measures were developed to model the time-location data in the study. First, the SDE at one SD, which includes approximately two-thirds of a participant's measured outdoor locations and times, was created and named the A-SDE. This location-based spatial measure represents the average size of time-location variability using outdoor location data only. Second, because exposures may occur in both indoor and outdoor locations, we developed the IW-SDE with equal indoor and outdoor weights, which represents both indoor and outdoor exposures based on time and exposure weights.
Parameters of the ellipses, including the two-dimensional centrality (mean center) and dispersion (SD) measures and the shape measure (eccentricity), were identified using the equations below. The size of an ellipse (A) was determined by

$$A = \pi S_x S_y$$

and the elongation of the ellipse, often measured by its eccentricity (e), which takes values between 0 and 1, was determined by

$$e = \sqrt{1 - \frac{\min(S_x, S_y)^2}{\max(S_x, S_y)^2}}$$

where Sx is the SD along the x-axis and Sy is the SD along the y-axis of the ellipse [19,20].
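The ellipse parameters described above can be sketched in a few lines of Python. This is an illustrative formulation and not the R aspace implementation used in the study: it assumes the axis SDs are the square roots of the eigenvalues of the time-weighted covariance matrix, the usual ellipse area (π times the product of the two axis SDs), and the usual eccentricity definition. The function name `weighted_sde` and the sample points are hypothetical, and scaling conventions may differ between software packages.

```python
import math

def weighted_sde(points, weights=None):
    """Time-weighted standard deviational ellipse for planar (x, y) points.

    weights: time spent at each point (e.g., minutes); equal if omitted.
    """
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    if len(points) != len(weights) or total <= 0:
        raise ValueError("need one positive-sum weight per point")
    mx = sum(w * x for (x, _), w in zip(points, weights)) / total
    my = sum(w * y for (_, y), w in zip(points, weights)) / total
    sxx = sum(w * (x - mx) ** 2 for (x, _), w in zip(points, weights)) / total
    syy = sum(w * (y - my) ** 2 for (_, y), w in zip(points, weights)) / total
    sxy = sum(w * (x - mx) * (y - my) for (x, y), w in zip(points, weights)) / total
    # Eigenvalues of the 2x2 covariance matrix give the variances
    # along the major and minor axes of the ellipse.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    var_major = half_trace + root
    var_minor = max(half_trace - root, 0.0)
    return {
        "center": (mx, my),
        "sd_major": math.sqrt(var_major),
        "sd_minor": math.sqrt(var_minor),
        # Major-axis orientation, counterclockwise from the x-axis (radians).
        "angle": 0.5 * math.atan2(2 * sxy, sxx - syy),
        "area": math.pi * math.sqrt(var_major) * math.sqrt(var_minor),
        "eccentricity": math.sqrt(1 - var_minor / var_major) if var_major > 0 else 0.0,
    }

# Hypothetical outdoor points (miles) with equal time at each.
outdoor = [(2, 0), (-2, 0), (0, 1), (0, -1)]
a_sde = weighted_sde(outdoor)
# Adding a long indoor stay at home (0, 0) pulls weight toward the center.
iw_sde = weighted_sde(outdoor + [(0, 0)], [1, 1, 1, 1, 16])
```

In this toy example the indoor-weighted ellipse is smaller than the outdoor-only ellipse, because the long stay at a single location concentrates the weight near the mean center.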
Descriptive statistics of the sizes of the two ellipses, including arithmetic and geometric means, SDs, and percentile values, were reported, and the differences in size between the two SDEs were tested after log transformation. The sizes of A-SDE and IW-SDE were compared primarily by day type, weekdays (n=168), defined as Monday through Friday, versus weekends (n=61), Saturday and Sunday, in addition to daily variations. Employment status was also used as a primary determinant of variability in the size of the ellipses during the week; subjects were therefore grouped as employed (n=8) and non-employed (n=26), excluding 4 subjects with missing employment status. Because the data were positively skewed, the geometric mean and SD were primarily used to compare the mean sizes of the two SDEs, grouped by day type (weekday/weekend) and by employment status.
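For skewed, positive data such as ellipse areas, the geometric summaries and a comparison on the log scale can be sketched as follows. The specific test used in the study is not stated; the paired t statistic below is an assumption chosen for illustration, and the function names and sample areas are hypothetical.

```python
import math
import statistics

def geometric_mean_sd(values):
    """Geometric mean and geometric SD of positive values."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.fmean(logs)), math.exp(statistics.stdev(logs))

def paired_t_on_logs(a, b):
    """Paired t statistic and degrees of freedom for log(a) - log(b)."""
    diffs = [math.log(x) - math.log(y) for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical ellipse areas in square miles for three person-days.
a_sde_areas = [40.0, 30.0, 60.0]
iw_sde_areas = [20.0, 10.0, 20.0]
gm, gsd = geometric_mean_sd(a_sde_areas)
t_stat, df = paired_t_on_logs(a_sde_areas, iw_sde_areas)
```

A positive t statistic here indicates that the first set of areas is larger on the log scale; obtaining a p-value would require the t distribution, for example from a statistics library.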
Results
Key parameters of the ellipses, including the mean centers of x and y and the SDs of x and y from the mean center as two-dimensional centrality and dispersion measures, respectively, are presented in Table 1. The mean centers of the two SDEs were within 1 mile of each other for the x and y coordinates (91.9 to 92.4, and 2589.3 to 2589.8, respectively). With the SD dispersion measure, which primarily captures the variation and direction of maximum dispersion of the indoor and outdoor location points, we observed that the highest SD was along the x axis for A-SDE and along the y axis for IW-SDE, indicating different orientations of the two ellipses. The eccentricity shape measure showed a slightly narrower shape for IW-SDE than for A-SDE (0.30 and 0.19, respectively), but both were close to circular.
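As a rough check on how close to circular these shapes are, the eccentricities can be converted to ratios of the shorter to the longer axis SD. This assumes the common definition $e = \sqrt{1 - b^2/a^2}$, where $a$ and $b$ (symbols introduced here for illustration) are the longer and shorter axis SDs, so that $b/a = \sqrt{1 - e^2}$:

$$\text{IW-SDE: } \frac{b}{a} = \sqrt{1 - 0.30^2} = \sqrt{0.91} \approx 0.95, \qquad \text{A-SDE: } \frac{b}{a} = \sqrt{1 - 0.19^2} = \sqrt{0.9639} \approx 0.98$$

Under that assumption, the shorter axis is within about 5% of the longer axis for both ellipses.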
The A-SDE and IW-SDE differed considerably in size. Descriptive statistics of the sizes of the A-SDEs and IW-SDEs are presented in Table 2. The overall mean and median sizes of A-SDE were larger than those of IW-SDE: around the mean center, the geometric mean of A-SDE was about 36 square miles, while the corresponding IW-SDE was smaller, about 18 square miles, and the sizes of A-SDE and IW-SDE were significantly different (p<0.001). The median sizes showed a similar pattern between the two SDEs (medians of 49 and 25 square miles for A-SDE and IW-SDE, respectively). For both SDEs, weekday SDEs were larger than weekend SDEs; however, higher variability was observed on weekends than on weekdays. The sizes of A-SDE were larger than those of IW-SDE for both weekdays and weekends (Table 2).
Variability of area by day of the week is shown in Figure 1. Over a week, both SDEs showed wide daily variation. The A-SDE exhibited much higher variation in size during the week than the IW-SDE. We also found that A-SDEs were larger than IW-SDEs on all days. A peak on Monday was observed for A-SDE. Among the weekdays, the SDEs on Monday and Thursday were larger than those on the rest of the week. Both A-SDEs and IW-SDEs on Wednesday and on weekend days were smaller than those on the other days of the week.
The SDEs were associated with employment status. The differences in ellipse size by day type (weekday/weekend) and employment status (employed/unemployed) are shown in Figure 2. A-SDEs were larger than IW-SDEs for all employment-day combinations. The average size of SDEs for unemployed subjects was larger than that for employed subjects during weekdays, while it was smaller during weekends. The weekday and weekend patterns differed by employment status: unemployed subjects had larger SDEs on weekdays than on weekends, while employed subjects had larger SDEs on weekends.
Discussion
We developed methods to model location-based and time-weighted spatial measures using time-location data collected by GPS. Because exposure can be affected by the time spent at specific locations, and concentrations can vary across locations, we developed indoor time-weighted spatial measures to incorporate the time duration for indoor locations. Such time-location weights may provide better accuracy in assessing exposures associated with exposure opportunities at different microenvironments. When indoor time was taken into consideration, there were considerable differences between the sizes of A-SDEs and IW-SDEs, indicating that exposure assessment based only on outdoor locations might misclassify exposure potentials. We observed that the IW-SDEs were smaller than the A-SDEs. This size difference may be due to the fact that IW-SDEs take into account the time spent indoors.
There were consistent patterns observed when the ellipse sizes were broken down by day type and employment (weekday/weekend and work/non-work). The variability in the location-based measures between the SDEs was partially explained by these key variables. However, not all major determinants of IW-SDE were investigated, given the scope of our study. We limited our data analysis to women. The primary determinants of time use and activity include demographic factors (age, gender), employment, day type, and lifestyle or life-stage factors [21]. Age has also been used as one of the potential determinants of time-use and activity patterns. However, we have not included results by age group and/or day type, partly because of key characteristics of the study sample (small sample size, female only, wide range of age across study participants) and the study design (household-based rather than individual-based sampling) [18]. We ran additional analyses by age group and day type; however, we found that the time-activity patterns of age groups should be interpreted with additional information on household structure (e.g., the presence of children), which may be the subject of another study. There is evidence suggesting gender-role differences in time use and activity participation, primarily the effects of employment and household maintenance related constraints on time use and activity among women; regardless of employment status, women are more likely to be constrained by the gender-role related activities that they perform in their everyday lives. Women are more likely than their male counterparts to take shorter and more frequent trips and to participate in maintenance activities, and are thus more likely to be subject to the spatial and temporal requirements of household maintenance and/or childcare responsibility-related activities [22]. As a consequence, while patterns may also vary by age, females tend to spend less out-of-home time during weekends [23]. Our data showed smaller SDEs on weekends, indicating less movement.
Such differences in movement between weekdays and weekends have also been observed in other populations [23,24].
Given that trips more often follow linear and interrelated patterns than circular ones, we employed the ellipse, rather than a circle, to summarize and compare the concentration and dispersion patterns of indoor and outdoor locations. SDEs have previously been used to characterize patterns of human activities and associated travel behaviors, for example as an approximate measure of human space-time activity spaces [25,26]. As one of the added values of the newly developed spatial measure, our study expands the use of the ellipse and related descriptive summary statistics of locational data by capturing both indoor and outdoor exposure potentials; by using the ellipse, we identified directional variation of the multiple indoor and outdoor locations for A-SDE and IW-SDE. We were also able to compare the approximate shape of the ellipses with the eccentricity measure. However, we primarily compared the sizes of the measures in this study, because our aim as a pilot study was to develop a measure that summarizes overall patterns of human time-space behavior and the associated exposure potentials. The spatial summary measure developed in this study is thus well suited to describing the spatial scope of spatially linked and interrelated sets of indoor and outdoor location points. Furthermore, other features of the spatial measures, in addition to the size and shape used here, could be employed in a future study, especially when the purpose is to compare and represent spatial patterns and relationships among individual location-time weighted ellipses. For instance, visually comparing the orientation of each individual measure may be an effective approach in future studies that relate actual exposure to time-location patterns.
While the strengths of the study include the richness of the time-location data collected by GPS, the interpretation and generalizability of the study findings may be limited by constraints posed by data collection and analysis, including the short observation period (one week) and the small number of subjects. We cannot rule out the potential effects of other environmental factors (for example, weather conditions or seasonality) on the observed weekly patterns. Further testing of the measures with additional factors may be needed. Personal exposure can be affected by the microenvironmental concentration and the time subjects stay in the microenvironment. In such a microenvironmental model, the time spent in indoor and outdoor locations may be critical for further personal exposure assessment. However, personal exposure may be higher with longer indoor time if the indoor concentration of pollutants is higher than the outdoor concentration. We were not able to determine the direction of influence of indoor and outdoor exposure in this study. Furthermore, trip purpose or motive is often not identifiable from GPS-collected time-location data. Modes of transportation are not known in most cases; thus, there may be variations due to exposure in cars, in transit, and in the ambient environment. Recent studies in this context have collected and analyzed GPS-collected time-location data associated with different microenvironments [27,28]. Further work is needed to develop methods for the collection and classification of time-location data in a wider range of microenvironments (for example, more specific categories of indoor and outdoor microenvironments, including in-vehicle locations), which may be more meaningful categories in the investigation of exposure and health outcomes, and to refine the models that complement the existing methods of collection and classification of time-location data (for example, GIS capability and data layers).
Despite these limitations, our findings showed that exposure potential measures vary greatly by time-location, and thus that indoor and outdoor locations and times need to be distinguished in health exposure studies.
The use of GPS data will likely improve the geographical detail and accuracy of time-location data, making it feasible to use the collected time-use and activity data to address exposure at small temporal and spatial scales. Our study indicated that GPS-collected time-location data could be used as an alternative in exposure assessment that incorporates complex space-time behavior, and could thus be effective in addressing the cross-sectional nature of exposure and health assessment. A major contribution of this study is that a spatially explicit model was developed to summarize and handle complex and large sets of GPS-collected time-location data, and that this is the first study to combine exposure opportunities associated with time at indoor and outdoor microenvironments; we were able to build a prototype system for spatial-temporal analysis involving GPS data and demonstrated the importance of indoor exposure for spatially explicit summary measures. Further development and refinement of the GPS data analysis system may provide the capability of tracing individuals and population groups through multiple contexts (for example, indoor and outdoor, home, and work-based locations) and environments (health-promoting or health-damaging ones). The IW-SDE developed in this study may represent exposure opportunities more accurately, and thus may represent the interactions between complex space-time behavior and exposure potentials over meaningful units of time and space. Incorporating spatial-temporal patterns of human mobility may be of significance in understanding human exposure behavior in space-time and its complex relationship with the environment, and further development of such a measure incorporating the time of indoor and outdoor exposure is encouraged.
Acknowledgements
This research was funded by a Science to Achieve Results (STAR) grant #RD-83154001 from the United States Environmental Protection Agency. We thank all of our participants for their participation.
Notes
The authors have no conflict of interest to declare on this study.
This article is available from: http://e-eht.org/