Showing posts with label Trajectories. Show all posts

Wednesday, August 21, 2024

In Silico Human Mobility Data Science

In the past we have written about using simulation to build synthetic datasets for trajectory analysis due to the limited availability of comprehensive real-world datasets. In relation to this work we (Andreas Züfle, Dieter Pfoser, Carola Wenk, Hamdi Kavak, Taylor Anderson, Joon-Seok Kim, Nathan Holt, Andrew DiAntonio and myself) have a new vision paper entitled "In Silico Human Mobility Data Science: Leveraging Massive Simulated Mobility Data" published in Transactions on Spatial Algorithms and Systems.

In the paper we sketch out a framework for in silico mobility data science. The rationale is that mobility data alone does not tell us much about why people do what they do, and to quote from the paper: "but imagine a world where we can go back in time to ask people about the purpose of their mobility to understand why an individual visited a place of interest." By building models (specifically, agent-based models) we can do just that, which in turn allows us to build in silico human mobility data.

To build this argument, in the paper we review existing data sets of individual human mobility and their limitations in terms of size and representativeness. We then survey existing simulation frameworks that generate individual human mobility data and comment on their limitations before presenting our vision of a scalable in silico world that captures realistic human patterns of life and allows us to generate massive datasets as sandboxes for human mobility data science. Building off this we describe a small sample of applications and research directions that would be enabled by such massive individual human mobility datasets if our vision came true.

If this sounds of interest, below we provide the abstract to the paper, some of the figures we use to highlight our argument and our envisioned framework that could exhibit both realistic behavior and realistic movement. Finally at the bottom of the post we provide a reference and a link to the paper itself. As always, any thoughts or comments are most welcome. 

Abstract:

Human mobility data science using trajectories or check-ins of individuals has many applications. Recently, we have seen a plethora of research efforts that tackle these applications. However, research progress in this field is limited by a lack of large and representative datasets. The largest and most commonly used dataset of individual human trajectories captures fewer than 200 individuals while data sets of individual human check-ins capture fewer than 100 check-ins per city per day. Thus, it is not clear if findings from the human mobility data science community would generalize to large populations. Since obtaining massive, representative, and individual-level human mobility data is hard to come by due to privacy considerations, the vision of this paper is to embrace the use of data generated by large-scale socially realistic microsimulations. Informed by both real data and leveraging social and behavioral theories, massive spatially explicit microsimulations may allow us to simulate entire megacities at the person level. The simulated worlds, which do not capture any identifiable personal information, allow us to perform “in silico” experiments using the simulated world as a sandbox in which we have perfect information and perfect control without jeopardizing the privacy of any actual individual. In silico experiments have become commonplace in other scientific domains such as chemistry and biology, permitting experiments that foster the understanding of concepts without any harm to individuals. This work describes challenges and opportunities for leveraging massive and realistic simulated alternate worlds for in silico human mobility data science.

Key Words: Spatial Simulation, Mobility Data Science, Trajectory Data, Location Based Social Network Data, In Silico

The envisioned in silico mobility data science process. (left:) A massive microsimulation is created to simulate realistic human behavior specified by a user through an AI-supported builder tool. (middle:) The microsimulation generates massive datasets, including high-fidelity trajectories of all individuals over years of simulation time. This data, which is 100% accurate and complete (in the simulated world), is then sampled to generate realistic datasets. (right:) These datasets are then used to perform mobility data science tasks in the simulated in silico world as if it were the real world. The results of these tasks can then be compared to the ground truth data (of the simulated in silico world) for validation.
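To make the three stages in the caption above concrete, here is a minimal sketch in Python. It is not the paper's framework: the random walk stands in for the microsimulation, and the function names, the 5% sampling rate, and the "mean distance travelled" task are all assumptions chosen for illustration. The point is only the workflow: generate complete data, sample it, run an analysis on the sample, and score the result against the known ground truth.

```python
import random

def simulate_ground_truth(num_agents, num_ticks, seed=0):
    """Toy stand-in for a microsimulation: each agent does a 1-D random walk."""
    rng = random.Random(seed)
    trajectories = {}
    for agent in range(num_agents):
        pos, path = 0, []
        for _ in range(num_ticks):
            pos += rng.choice((-1, 0, 1))
            path.append(pos)
        trajectories[agent] = path
    return trajectories

def sample_observations(trajectories, agent_fraction, seed=1):
    """Keep a random subset of agents, mimicking an incomplete observed dataset."""
    rng = random.Random(seed)
    agents = sorted(trajectories)
    k = max(1, round(len(agents) * agent_fraction))
    return {a: trajectories[a] for a in rng.sample(agents, k)}

def mean_distance_travelled(trajectories):
    """Example analysis task (hypothetical): average path length per agent."""
    totals = []
    for path in trajectories.values():
        totals.append(sum(abs(b - a) for a, b in zip(path, path[1:])))
    return sum(totals) / len(totals)

# left: build the simulated world; middle: sample it; right: analyse and validate.
truth = simulate_ground_truth(num_agents=1000, num_ticks=288)
observed = sample_observations(truth, agent_fraction=0.05)
true_value = mean_distance_travelled(truth)
estimate = mean_distance_travelled(observed)
error = abs(estimate - true_value)
print(f"ground truth={true_value:.2f} estimate={estimate:.2f} error={error:.2f}")
```

Because the full population is available in the simulated world, the error of the sampled estimate can be measured directly, which is not possible with real-world mobility data.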

The Patterns of Life Simulation. A video of the simulation can be found at: https://www.youtube.com/watch?v=rP1PDyQAQ5M.
Envisioned framework for a simulation that exhibits both realistic behavior and realistic movement.

Full reference: 

Züfle, A., Pfoser, D., Wenk, C., Crooks, A.T., Kavak, H., Anderson, T., Kim, J-S., Holt, N. and DiAntonio, A. (2024), In Silico Human Mobility Data Science: Leveraging Massive Simulated Mobility Data (Vision Paper), Transactions on Spatial Algorithms and Systems (pdf).

Tuesday, November 14, 2023

Massive Trajectory Data Based on Patterns of Life

Following on from the last post, we (Hossein Amiri, Shiyang Ruan, Joon-Seok Kim, Hyunjee Jin, Hamdi Kavak, Dieter Pfoser, Carola Wenk, Andreas Züfle and myself) have a paper in the Data and Resources track at the 2023 ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems entitled "Massive Trajectory Data Based on Patterns of Life".

This Data and Resources paper introduces readers to large sets of simulated individual-level trajectory and location-based social network data that we have generated from our Urban Life Model (click here to find out more about the model). The data covers 4 suburban and urban regions: 1) the George Mason University Campus area, Fairfax, Virginia, 2) the French Quarter of New Orleans, Louisiana, 3) San Francisco, California, and 4) Atlanta, Georgia. For each of the 4 study regions, we run the simulation with 1K, 3K, 5K, and 10K agents for 15 months of simulation time. We also provide simulations for 10 years and 20 years, with 1K agents for each of the 4 regions of interest. For each dataset, three items are provided: 1) check-ins, 2) social network links, and 3) trajectory information per agent per five-minute tick. As such we argue in the paper that our datasets are orders of magnitude larger than existing real-world trajectory and location-based social network (LBSN) data sets.
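To give a rough sense of scale, the arithmetic behind "per agent per five-minute tick" can be sketched as follows. This is a back-of-the-envelope estimate, not a figure from the paper: it assumes one trajectory record per agent per tick with no gaps and 30-day months, and the function name `trajectory_records` is hypothetical.

```python
TICK_MINUTES = 5
TICKS_PER_DAY = 24 * 60 // TICK_MINUTES  # five-minute ticks in one day

DAYS_PER_MONTH = 30  # assumption: months are approximated as 30 days

def trajectory_records(num_agents, months):
    """Upper-bound estimate: one record per agent per tick, no gaps (assumption)."""
    return num_agents * months * DAYS_PER_MONTH * TICKS_PER_DAY

# The 15-month runs described above, for each population size.
for agents in (1_000, 3_000, 5_000, 10_000):
    print(f"{agents:>6} agents, 15 months: {trajectory_records(agents, 15):,} records")

# The long runs with 1K agents.
for years in (10, 20):
    print(f"  1000 agents, {years} years: {trajectory_records(1_000, years * 12):,} records")
```

Under these assumptions a single 10K-agent, 15-month run already yields on the order of a billion trajectory points for one region.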

If this sounds of interest we encourage readers to check out the paper (see the bottom of this post), while the datasets, as well as additional documentation, can be found at OSF (https://osf.io/gbhm8/) and the data generator (model) can be found at https://github.com/azufle/pol.

Abstract: Individual human location trajectory and check-in data have been the driving force for human mobility research in recent years. However, existing human mobility datasets are very limited in size and representativeness. For example, one of the largest and most commonly used datasets of individual human location trajectories, GeoLife, captures fewer than two hundred individuals. To help fill this gap, this Data and Resources paper leverages an existing data generator based on fine-grained simulation of individual human patterns of life to produce large-scale trajectory, check-in, and social network data. In this simulation, individual human agents commute between their home and work locations, visit restaurants to eat, and visit recreational sites to meet friends. We provide large datasets of months of simulated trajectories for two example regions in the United States: San Francisco and New Orleans. In addition to making the datasets available, we also provide instructions on how the simulation can be used to re-generate data, thus allowing researchers to generate the data locally without downloading prohibitively large files.

Full Reference: 

Amiri, H., Ruan, S., Kim, J., Jin, H., Kavak, H., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2023), Massive Trajectory Data Generation using a Patterns of Life Simulation, Proceedings of the 2023 ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Hamburg, Germany. (pdf)

Thursday, December 02, 2021

Urban life: A model of people and places

We have just wrapped up a project that created a simple agent-based simulation of urban life as part of DARPA's Ground Truth Program. To this end we have just published a new paper entitled "Urban life: a model of people and places" in Computational and Mathematical Organization Theory, with Andreas Züfle, Carola Wenk, Dieter Pfoser, Joon-Seok Kim, Hamdi Kavak, Umar Manzoor, Hyunjee Jin and myself. In the paper we provide an overview of the model and how it was used to test and validate human domain research. For interested readers, below you can find the abstract to the paper along with some images that will give you a sense of our simulation model (which was created with MASON and its GIS extension, GeoMason). At the bottom of the post you can find the full reference and a link to the paper.

 Abstract

We introduce the Urban Life agent-based simulation used by the Ground Truth program to capture the innate needs of a human-like population and explore how such needs shape social constructs such as friendship and wealth. Urban Life is a spatially explicit model to explore how urban form impacts agents’ daily patterns of life. By meeting up at places agents form social networks, which in turn affect the places the agents visit. In our model, location and co-location affect all levels of decision making as agents prefer to visit nearby places. Co-location is necessary (but not sufficient) to connect agents in the social network. The Urban Life model was used in the Ground Truth program as a virtual world testbed to produce data in a setting in which the underlying ground truth was explicitly known. Data was provided to research teams to test and validate Human Domain research methods to an extent previously impossible. This paper summarizes our Urban Life model’s design and simulation along with a description of how it was used to test the ability of Human Domain research teams to predict future states and to prescribe changes to the simulation to achieve desired outcomes in our simulated world.

Our generated maps colored based on different aggregation levels.

A screenshot of the graphical user interface from a representative model run. Top-left: The spatial network and agents. Bottom-left: Simulation parameters that can be specified prior to simulation start. Top-middle: The social network. Bottom-middle: Summary statistics of the simulation during run-time, such as friendship. Right: Profiles of recreational sites.

Screenshot of the epidemic simulator depicting the French Quarter, New Orleans, LA, USA.

Full Reference:

Züfle, A., Wenk, C., Pfoser, D., Crooks, A.T., Kavak, H., Kim, J-S. and Jin, H. (2021), Urban Life: A Model of People and Places, Computational and Mathematical Organization Theory. Available at https://doi.org/10.1007/s10588-021-09348-7 (pdf)

Wednesday, June 30, 2021

Towards Large-Scale Agent-Based Geospatial Simulation

Running large-scale spatial agent-based models is often a computational challenge. To address this challenge, at the upcoming International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS for short), Umar Manzoor, Hamdi Kavak, Joon-Seok Kim, Dieter Pfoser, Andreas Züfle, Carola Wenk and myself have a paper entitled "Towards Large-Scale Agent-Based Geospatial Simulation."

In the paper we propose a scalable and general agent-based modeling and simulation framework for geospatial simulations involving networks. Specifically, we propose a solution for the parallelization of the single-threaded GeoMASON toolkit by employing the Java Agent Development Environment (JADE) for the communication between threads (essentially, we divide the space of our agents into partitions, each handled by a separate thread of execution). We evaluate the proposed framework on a simple urban model (created in MASON), which simulates simple patterns of life within an urban setting (click here for the blog post). The model has a spatial network for agent movement and a social network for maintaining social links. We compared the performance of the proposed framework under different settings and found that GeoMason outperforms the proposed framework when the agent population is small, whereas with an increasing agent population, and with it a substantial increase in the complexity and time taken per simulation step, our proposed framework outperforms GeoMason. If this sounds of interest, below we provide the abstract to the paper, along with some images of the framework and simulation architecture. At the bottom of the page you can find the full citation and a link to the paper.
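The idea of dividing the agents' space into partitions can be illustrated with a small Python sketch. This is not the paper's implementation (which is Java-based and uses JADE for inter-thread communication); it uses a plain uniform grid, runs in a single thread, and all names (`zone_of`, `partition`, `transfers`) and the 2x2 grid are assumptions. It only shows how agents can be assigned to zones and how to detect agents that must be handed from one zone to another after they move.

```python
from collections import defaultdict

def zone_of(x, y, width, height, cols, rows):
    """Map a coordinate to the id of the grid zone that contains it."""
    col = min(int(x / width * cols), cols - 1)
    row = min(int(y / height * rows), rows - 1)
    return row * cols + col

def partition(agents, width, height, cols, rows):
    """Group agent ids by zone; each zone could be owned by one worker."""
    zones = defaultdict(set)
    for agent_id, (x, y) in agents.items():
        zones[zone_of(x, y, width, height, cols, rows)].add(agent_id)
    return zones

def transfers(before, after, width, height, cols, rows):
    """List agents whose zone changed as (agent_id, old_zone, new_zone)."""
    moved = []
    for agent_id, (x, y) in after.items():
        old = zone_of(*before[agent_id], width, height, cols, rows)
        new = zone_of(x, y, width, height, cols, rows)
        if old != new:
            moved.append((agent_id, old, new))
    return moved

# Hypothetical 100 x 100 world split into a 2 x 2 grid of zones.
WIDTH, HEIGHT, COLS, ROWS = 100.0, 100.0, 2, 2
before = {"a": (10.0, 10.0), "b": (60.0, 10.0), "c": (40.0, 90.0)}
after = {"a": (12.0, 11.0), "b": (45.0, 10.0), "c": (40.0, 90.0)}

zones = partition(before, WIDTH, HEIGHT, COLS, ROWS)
handoffs = transfers(before, after, WIDTH, HEIGHT, COLS, ROWS)
print(dict(zones))
print(handoffs)
```

Each handoff represents communication between the workers that own the two zones, which is one plausible reason a partitioned setup pays off only once the per-step work inside each zone is large.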

Abstract. Agent-based geospatial simulations have become very popular and widely used in examining the social and cultural characteristics of populations. Well-known toolkits such as NetLogo or MASON generally have scalability limitations, especially when the model and underlying spatial infrastructure become complex. This paper presents a framework for simulating large-scale agent-based geospatial systems by integrating the multi-agent systems toolkit JADE with the MASON agent-based modeling framework and its GIS extension, GeoMASON. The proposed Java-based framework can simulate large areas with hundreds of thousands of agents. It allows for the studying the evolution of a population and its environment over time. Such a framework provides the essential first steps for scalable model execution without sacrificing the model generality. 

Keywords: Large-scale geospatial simulation, Agent-based Modeling, MASON, Jade, GIS.

System Architecture of Proposed Framework.
Agent transfer between Zones.
Simulation using Proposed Architecture.

Full Reference:

Manzoor, U., Kavak, H., Kim, J-S., Crooks, A.T., Pfoser, D., Zufle, A. and Wenk, C. (2021), Towards Large-Scale Agent-Based Geospatial Simulation, 2021 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Washington DC. (pdf)

Monday, June 29, 2020

Location-Based Social Network Data Generation

Continuing and building upon our previous work on Location-Based Social Networks (LBSNs), at the 21st IEEE International Conference on Mobile Data Management we have a paper entitled "Location-Based Social Network Data Generation Based on Patterns of Life." In the paper we discuss how LBSN research has become an active research topic in a variety of areas, including describing mobility patterns, location recommendation and friend recommendation systems. However, we make the argument that real-world LBSN data sets (e.g., Gowalla, BrightKite) are a rather scarce resource due to the privacy implications of making such data publicly available. Furthermore, in many publicly available LBSN data sets, the vast majority of users have fewer than ten check-ins, or the locations recorded for a user are usually only a small portion of all the locations that user has visited (as shown in the table below).

Publicly Available Real-World LBSN Data Sets.

To overcome these weaknesses, in this paper we present an LBSN simulation (an agent-based model created in MASON) capable of creating multiple artificial but socially plausible, large-scale LBSN data sets. If this sounds of interest to you, below we provide a little more information about the paper, specifically its abstract, a depiction of LBSNs, our case studies, the resulting simulations we used to develop LBSN data based on patterns of life (PoL), and some sample results. In addition to this, as the conference was virtual, Joon-Seok Kim also made a great movie of the conference paper. At the bottom of this post we provide the full reference and link to the paper. 

We would also like to draw the readers' attention to our online resources which accompany this paper. For example, to allow others to use and extend our work, the source code and scripts used to generate these data sets are available at: https://github.com/gmuggs/pol, while all of the generated data sets can be found at OSF (https://osf.io/e24th/?view_only=191fdd0c640847b5b85597ab0e57186d). For more details about this model and data, readers are referred to the webpage created by Joon-Seok Kim: https://mdm2020.joonseok.org.

Abstract:
Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN data sets in such studies yields several weaknesses: sparse and small data sets, privacy concerns, and a lack of authoritative ground-truth. To overcome these weaknesses, we leverage a large scale LBSN simulation to create a framework to simulate human behavior and to create synthetic but realistic LBSN data based on human patterns of life. Such data not only captures the location of users over time but also their interactions via social networks. Patterns of life are simulated by giving agents (i.e., people) an array of “needs” that they aim to satisfy, e.g., agents go home when they are tired, to restaurants when they are hungry, to work to cover their financial needs, and to recreational sites to meet friends and satisfy their social needs. While existing real-world LBSN data sets are trivially small, the proposed framework provides a source for massive LBSN benchmark data that closely mimics the real-world. As such it allows us to capture 100% of the (simulated) population without any data uncertainty, privacy-related concerns, or incompleteness. It allows researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. Our framework is made available to the community. In addition, we provide a series of simulated benchmark LBSN data sets using different real-world urban environments obtained from OpenStreetMap. The simulation software and data sets which comprise gigabytes of spatio-temporal and temporal social network data are made available to the research community.
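The needs-driven behavior described in the abstract can be sketched in a few lines of Python. This is only an illustration of the general idea and not the model's actual decision logic: the need names, the urgency scale in [0, 1], the growth rates, and the "most urgent need wins" rule are all assumptions.

```python
# Hypothetical mapping from a need to the kind of place that satisfies it,
# following the examples given in the abstract.
DESTINATION_FOR_NEED = {
    "sleep": "home",
    "food": "restaurant",
    "money": "work",
    "social": "recreational site",
}

def next_destination(needs):
    """Pick the place that serves the most urgent need (highest level)."""
    most_urgent = max(needs, key=needs.get)
    return DESTINATION_FOR_NEED[most_urgent]

def step(needs, growth, satisfied=None):
    """Advance one tick: needs grow, except the satisfied one, which resets to 0."""
    updated = {}
    for name, level in needs.items():
        if name == satisfied:
            updated[name] = 0.0
        else:
            updated[name] = min(1.0, level + growth[name])
    return updated

# Assumed starting levels and per-tick growth rates for one agent.
needs = {"sleep": 0.2, "food": 0.7, "money": 0.4, "social": 0.1}
growth = {"sleep": 0.05, "food": 0.10, "money": 0.02, "social": 0.03}

first = next_destination(needs)
needs = step(needs, growth, satisfied="food")
second = next_destination(needs)
print(first, "->", second)
```

In this toy run the agent first heads to a restaurant because hunger is its highest need, and once that need is reset the next most pressing one (money) sends it to work. Visits generated this way are what would be logged as check-ins in a simulated LBSN.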
LBSN Overview

Case Studies: A: New Orleans, Louisiana (NOLA), Mississippi River, Lake Pontchartrain, and the ‘French Quarter’. B: George Mason University (GMU), Fairfax, VA. C: Synthetic Villages - Small (Left) and Large (Right).
Environments Populated with Agents. Clockwise from Top Left: GMU, NOLA, Large and Small Synthetic Villages.
Data Sets Resulting from Location-Based Social Network Simulation
Average Social Network Degree over Time (1K).
Social Network

Full Reference:
Kim, J-S., Jin, H., Kavak, H., Rouly, O.C., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2020), Location-Based Social Network Data Generation Based on Patterns of Life, The 21st IEEE International Conference on Mobile Data Management, Versailles, France. (pdf)

Friday, July 26, 2019

Location-Based Social Simulation

At the upcoming 16th International Symposium on Spatial and Temporal Databases (SSTD) we have a vision paper entitled "Location-Based Social Simulation" accepted. In the paper we discuss issues such as data sparsity and privacy concerns with using real-world location-based social networks (LBSNs) like Foursquare and Yelp. To overcome these issues, we describe how one can employ geospatial simulation (i.e., an agent-based model) to create artificial but socially plausible LBSN data sets that address some of the limitations of real-world LBSN data.

ABSTRACT:
Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN datasets in such studies has severe weaknesses: sparse and small datasets, privacy concerns, and a lack of authoritative ground-truth. Our vision is to create a large scale geosimulation framework to simulate human behavior and to create synthetic but realistic LBSN data that captures the location of users over time as well as social interactions of users in a social network. While existing LBSN datasets are trivially small, such a framework would provide the first source of very large LBSN benchmark data which would closely mimic the real world, containing high-fidelity information of location, and social connections of millions of simulated agents over several years of simulated time. Therefore, it would serve the research community by revitalizing and reshaping research on LBSNs by allowing researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. These evaluations will guide future research allowing us to develop solutions to improve LBSN applications such as user-location recommendation, friend recommendation, location prediction, and location privacy.

KEYWORDS: Agent-based simulation, location-based social network, data generator, spatial network, human behavior

Full Reference: 
Kavak, H., Kim, J-S., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2019), Location-Based Social Simulation, Proceedings of the 16th International Symposium on Spatial and Temporal Databases, Vienna, Austria, pp 218-221. (pdf)

Update: Our paper was selected as runner-up for best Vision Paper.