Tuesday, November 14, 2023

Massive Trajectory Data Based on Patterns of Life

Following on from the last post, we (Hossein Amiri, Shiyang Ruan, Joon-Seok Kim, Hyunjee Jin, Hamdi Kavak, Dieter Pfoser, Carola Wenk, Andreas Züfle and myself) have a paper in the Data and Resources track at the 2023 ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems entitled "Massive Trajectory Data Based on Patterns of Life".

This Data and Resources paper introduces readers to large sets of simulated individual-level trajectory and location-based social network (LBSN) data that we have generated from our Urban Life Model (click here to find out more about the model). The data cover 4 suburban and urban regions: 1) the George Mason University campus area, Fairfax, Virginia, 2) the French Quarter of New Orleans, Louisiana, 3) San Francisco, California, and 4) Atlanta, Georgia. For each of the 4 study regions, we ran the simulation with 1K, 3K, 5K, and 10K agents for 15 months of simulation time. We also provide simulations of 10 years and 20 years with 1K agents for each of the 4 regions. For each dataset, three items are provided: 1) check-ins, 2) social network links, and 3) trajectory information per agent per five-minute tick. As such, we argue in the paper that our datasets are orders of magnitude larger than existing real-world trajectory and LBSN datasets.
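To give a rough sense of scale, the short Python sketch below counts how many trajectory records "one record per agent per five-minute tick" implies. It is a back-of-the-envelope estimate only: the 30-day month is an assumption made here for illustration (the post only says "15 months"), and the function names are hypothetical rather than taken from the data generator.

```python
# Rough size estimate for the trajectory data described above.
# Assumption (not from the paper): every month has 30 days.
TICK_MINUTES = 5
DAYS_PER_MONTH = 30  # assumed for illustration


def ticks(months: int) -> int:
    """Number of five-minute ticks in the given number of months."""
    return months * DAYS_PER_MONTH * 24 * 60 // TICK_MINUTES


def trajectory_records(agents: int, months: int) -> int:
    """One trajectory record per agent per tick."""
    return agents * ticks(months)


if __name__ == "__main__":
    for agents in (1_000, 3_000, 5_000, 10_000):
        n = trajectory_records(agents, 15)
        print(f"{agents:>6} agents, 15 months: {n:,} records")
```

Under these assumptions, a single 15-month run has 129,600 ticks, so the 10K-agent run in one region would contain roughly 1.3 billion trajectory records.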

If this sounds of interest, we encourage readers to check out the paper (see the bottom of this post). The datasets, as well as additional documentation, can be found at OSF (https://osf.io/gbhm8/) and the data generator (model) can be found at https://github.com/azufle/pol.

Abstract: Individual human location trajectory and check-in data have been the driving force for human mobility research in recent years. However, existing human mobility datasets are very limited in size and representativeness. For example, one of the largest and most commonly used datasets of individual human location trajectories, GeoLife, captures fewer than two hundred individuals. To help fill this gap, this Data and Resources paper leverages an existing data generator based on fine-grained simulation of individual human patterns of life to produce large-scale trajectory, check-in, and social network data. In this simulation, individual human agents commute between their home and work locations, visit restaurants to eat, and visit recreational sites to meet friends. We provide large datasets of months of simulated trajectories for two example regions in the United States: San Francisco and New Orleans. In addition to making the datasets available, we also provide instructions on how the simulation can be used to re-generate data, thus allowing researchers to generate the data locally without downloading prohibitively large files.

Full Reference:

Amiri, H., Ruan, S., Kim, J., Jin, H., Kavak, H., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2023), Massive Trajectory Data Generation using a Patterns of Life Simulation, Proceedings of the 2023 ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Hamburg, Germany. (pdf)

Monday, November 13, 2023

Synthetic Geosocial Network Generation

In the past the blog has explored the creation of social networks for models. Keeping with this vein of research, I was fortunate to work with Ketevan Gallagher, Taylor Anderson and Andreas Züfle to consider the role of the location of individuals when generating social networks. This work has resulted in a new paper entitled "Synthetic Geosocial Network Data Generation" which was presented at the 7th ACM SIGSPATIAL Workshop on Location-based Recommendations, Geosocial Networks and Geoadvertising (LocalRec 2023). If this sounds of interest, below you can read the abstract to the paper, see some of the generated geosocial networks and find the full reference and link to the paper. In addition to this, the Python code and data used to generate the networks are available at https://github.com/KetevanGallagher/Synthetic-Geosocial-Networks.

Abstract: Generating synthetic social networks is an important task for many problems that study humans, their behavior, and their interactions. Geosocial networks enrich social networks with location information. Commonly used models to generate synthetic social networks include the classical Erdos-Renyi, Barabasi-Albert, and Watts-Strogatz models. However, these classic social network models do not consider the location of individuals. Real-world geosocial networks do exhibit a strong spatial autocorrelation, thus having a higher likelihood of a social connection between agents that are spatially close. As such, recent variants of the three classical models have been proposed to consider location information. Yet, these existing solutions assume that individuals are located on a uniform lattice and exhibit certain limitations when applied to real-world data that exhibits clusters. In this work, we discuss these limitations and propose new approaches to extend the three classic social network generation models to geosocial networks. Our experiments show that our generated synthetic geosocial networks address the shortcomings of the state-of-the-art models and generate realistic geosocial networks that exhibit high similarity to real-world geosocial networks. 
Keywords: Geosocial Networks, Network Generation, Synthetic Social Networks, Erdos-Renyi, Watts-Strogatz, Barabasi-Albert.
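To illustrate the general idea of making a classic model location-aware, here is a minimal Python sketch of an Erdos-Renyi-style generator in which the probability of a tie decays with the distance between two individuals. This is not the approach proposed in the paper (see the linked repository for that); the exponential decay, the function name `geo_erdos_renyi` and its parameters `base_p` and `scale` are all assumptions chosen for illustration.

```python
import math
import random
from itertools import combinations


def geo_erdos_renyi(points, base_p=0.9, scale=1.0, seed=0):
    """Connect each pair with probability base_p * exp(-distance / scale).

    points: list of (x, y) coordinates, one per individual.
    Returns a list of undirected edges (i, j) with i < j.
    The decay function is an illustrative assumption, not the paper's model.
    """
    rng = random.Random(seed)
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if rng.random() < base_p * math.exp(-d / scale):
            edges.append((i, j))
    return edges


if __name__ == "__main__":
    # Two spatial clusters (non-uniform locations), ten units apart.
    rng = random.Random(1)
    cluster_a = [(rng.random(), rng.random()) for _ in range(20)]
    cluster_b = [(10 + rng.random(), rng.random()) for _ in range(20)]
    points = cluster_a + cluster_b

    edges = geo_erdos_renyi(points, base_p=0.9, scale=1.0, seed=2)
    within = sum((i < 20) == (j < 20) for i, j in edges)
    print(f"{len(edges)} edges, {within} of them within a cluster")
```

With a plain Erdos-Renyi model every pair would be equally likely to connect regardless of location; here ties between the two distant clusters become very unlikely, which is the spatial autocorrelation the abstract describes.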


Real-World Geosocial Network using Facebook Social Connectedness Data between Zone Improvement Plan (ZIP) Region Centroids for the State of Virginia, USA.
Geosocial graphs using Virginia ZIP code data.
Graphs using Fairfax Census Tract data.


Full Reference:
Gallagher, K., Anderson, T., Crooks, A.T. and Züfle, A. (2023), Synthetic Geosocial Network Data Generation, Proceedings of the 7th ACM SIGSPATIAL Workshop on Location-based Recommendations, Geosocial Networks and Geoadvertising (LocalRec 2023), Hamburg, Germany. (pdf) (presentation)

Friday, November 03, 2023

Geographically Synthetic Populations for ABM: A Gallery of Applications

Often when we are building geographically explicit agent-based models, we spend a lot of time creating the synthetic population to instantiate our artificial world. We have tried to overcome this by creating methods to generate such populations (see this old blog post). Building on this work, Na (Richard) Jiang, Fuzhen Yin, Boyu Wang and myself have a new paper entitled "Geographically-Explicit Synthetic Populations for Agent-based Models: A Gallery of Applications" which was presented at the 2023 Computational Social Science Society of the Americas conference. In the paper we extend the synthetic population to the whole of New York state and, at the same time, introduce a pipeline for using the population datasets for model initialization. To show this pipeline, we present several case studies utilizing Python and Mesa. These models range from commuting to disease spread and vaccination uptake. If this sounds of interest, below we provide the abstract to the paper along with some of the key figures, including our pipeline and example applications. At the bottom of the page we provide the full reference and a link to the paper, which has links to the models and data.
Abstract: Over the last two decades, there has been a growth in the applications of geographically-explicit agent-based models. One thing such models have in common is the creation of synthetic populations to initialize the artificial worlds in which the agents inhabit. One challenge such models face is that it is often difficult to create reusable geographically-explicit synthetic populations with social networks. In this paper, we introduce a Python based method that generates a reusable geographically-explicit synthetic population dataset along with its social networks. In addition, we present a pipeline for using the population datasets for model initialization. With this pipeline, multiple spatial and temporal scales of geographically-explicit agent-based models are presented focusing on Western New York. Such models not only demonstrate the utility of our synthetic population on commuting patterns but also how social networks can impact the simulation of disease spread and vaccination uptake. By doing so, this pipeline could benefit any modeler wishing to reuse synthetic populations with realistic geographic locations and social networks. 
Keywords: Agent-Based Model, Geographically-Explicit Agent-Based Models, Synthetic Population, Python, Mesa.
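As a rough illustration of what using a population dataset for model initialization can look like, the sketch below loads a tiny population table and a social network edge list into agent objects and then runs a simple infection process over the network. It is a simplified stand-in, not the pipeline from the paper: it uses only the Python standard library rather than Mesa, and the column names (`id`, `age`, `home_tract`, `source`, `target`), the inline data and the transmission rule are all hypothetical.

```python
import csv
import io
import random
from dataclasses import dataclass, field

# Hypothetical stand-ins for the synthetic population and network files.
POPULATION_CSV = """id,age,home_tract
0,34,A
1,29,A
2,61,B
3,8,B
4,45,C
"""
NETWORK_CSV = """source,target
0,1
1,2
2,3
3,4
"""


@dataclass
class Person:
    id: int
    age: int
    home_tract: str
    friends: list = field(default_factory=list)
    infected: bool = False


def load_agents(pop_text, net_text):
    """Create one agent per population row, then attach social ties."""
    agents = {}
    for row in csv.DictReader(io.StringIO(pop_text)):
        pid = int(row["id"])
        agents[pid] = Person(pid, int(row["age"]), row["home_tract"])
    for row in csv.DictReader(io.StringIO(net_text)):
        a, b = int(row["source"]), int(row["target"])
        agents[a].friends.append(b)
        agents[b].friends.append(a)
    return agents


def step(agents, p_transmit, rng):
    """Each infected agent may infect each susceptible friend (synchronous)."""
    newly = [f for p in agents.values() if p.infected
             for f in p.friends
             if not agents[f].infected and rng.random() < p_transmit]
    for f in newly:
        agents[f].infected = True


if __name__ == "__main__":
    agents = load_agents(POPULATION_CSV, NETWORK_CSV)
    agents[0].infected = True  # seed a single case
    rng = random.Random(42)
    for day in range(5):
        step(agents, p_transmit=0.5, rng=rng)
        print(day, sum(p.infected for p in agents.values()))
```

Keeping the population and network as separate inputs is what makes them reusable: the same two tables could initialize a commuting model or a vaccination opinion model by swapping out only the `step` logic.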
Pipeline of Utilizing Synthetic Population Resulting Datasets in Agent-Based Models.

Large Scale Disease Spread Model Structure.

Disease Dynamics for Two Diseases.

Vaccination Opinion Dynamic Model.

Simulation Vaccination Rate vs. Real Vaccination Records: (A) All Population; (B) Different Age Groups of Population.

Full Reference:

Jiang, N., Crooks, A.T., Yin, F. and Wang, B. (2023), Geographically-Explicit Synthetic Populations for Agent-based Models: A Gallery of Applications, Proceedings of the 2023 Conference of The Computational Social Science Society of the Americas, Santa Fe, NM. (pdf)