Wednesday, August 21, 2024

In Silico Human Mobility Data Science

In the past we have written about using simulation to build synthetic datasets for trajectory analysis, because comprehensive real-world datasets are scarce. In relation to this work, we (Andreas Züfle, Dieter Pfoser, Carola Wenk, Hamdi Kavak, Taylor Anderson, Joon-Seok Kim, Nathan Holt, Andrew DiAntonio and myself) have a new vision paper entitled "In Silico Human Mobility Data Science: Leveraging Massive Simulated Mobility Data" published in Transactions on Spatial Algorithms and Systems.

In the paper we sketch out a framework for in silico mobility data science. The rationale is that mobility data alone does not tell us much about why people do what they do. To quote from the paper: "but imagine a world where we can go back in time to ask people about the purpose of their mobility to understand why an individual visited a place of interest." By building models (aka agent-based models) we can do just that, which in turn allows us to generate in silico human mobility data.

To build this argument, in the paper we review existing datasets of individual human mobility and their limitations in terms of size and representativeness. We then survey existing simulation frameworks that generate individual human mobility data and comment on their limitations, before presenting our vision of a scalable in silico world that captures realistic human patterns of life and allows us to generate massive datasets as sandboxes for human mobility data science. Building on this, we describe a small sample of applications and research directions that would be enabled by such massive individual human mobility datasets if our vision came true.

If this sounds of interest, below we provide the abstract to the paper, some of the figures we use to highlight our argument, and our envisioned framework that could exhibit both realistic behavior and realistic movement. Finally, at the bottom of the post we provide a reference and a link to the paper itself. As always, any thoughts or comments are most welcome.

Abstract:

Human mobility data science using trajectories or check-ins of individuals has many applications. Recently, we have seen a plethora of research efforts that tackle these applications. However, research progress in this field is limited by a lack of large and representative datasets. The largest and most commonly used dataset of individual human trajectories captures fewer than 200 individuals while data sets of individual human check-ins capture fewer than 100 check-ins per city per day. Thus, it is not clear if findings from the human mobility data science community would generalize to large populations. Since obtaining massive, representative, and individual-level human mobility data is hard to come by due to privacy considerations, the vision of this paper is to embrace the use of data generated by large-scale socially realistic microsimulations. Informed by both real data and leveraging social and behavioral theories, massive spatially explicit microsimulations may allow us to simulate entire megacities at the person level. The simulated worlds, which do not capture any identifiable personal information, allow us to perform “in silico” experiments using the simulated world as a sandbox in which we have perfect information and perfect control without jeopardizing the privacy of any actual individual. In silico experiments have become commonplace in other scientific domains such as chemistry and biology, permitting experiments that foster the understanding of concepts without any harm to individuals. This work describes challenges and opportunities for leveraging massive and realistic simulated alternate worlds for in silico human mobility data science.

Key Words: Spatial Simulation, Mobility Data Science, Trajectory Data, Location Based Social Network Data, In Silico

The envisioned in silico mobility data science process. (left:) A massive microsimulation is created to simulate realistic human behavior specified by a user through an AI-supported builder tool. (middle:) The microsimulation generates massive datasets, including high-fidelity trajectories of all individuals over years of simulation time. This data, which is 100% accurate and complete (in the simulated world), is then sampled to generate realistic datasets. (right:) These datasets are then used to perform mobility data science tasks in the simulated in silico world as if it were the real world. The results of these tasks can then be compared to the ground truth data (of the simulated in silico world) for validation.
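
To make the simulate, sample, analyze and validate loop in the caption above a little more concrete, here is a minimal toy sketch in Python. It is not the Patterns of Life simulation or anything from the paper: the function names, the fixed home/work/evening schedule and the "most frequent night-time place" heuristic are all assumptions made up for illustration.

```python
import random
from collections import Counter


def is_night(time_index):
    """Hypothetical rule: before 08:00 or from 20:00 counts as night."""
    hour = time_index % 24
    return hour < 8 or hour >= 20


def simulate(num_agents, num_days, num_places, seed=0):
    """Toy stand-in for a microsimulation.

    Returns the ground-truth home of every agent and a complete hourly
    trajectory as a list of (time_index, place) tuples per agent.
    """
    rng = random.Random(seed)
    homes, trajectories = {}, {}
    for agent in range(num_agents):
        home = rng.randrange(num_places)
        work = rng.randrange(num_places)
        homes[agent] = home
        visits = []
        for day in range(num_days):
            for hour in range(24):
                if hour < 8 or hour >= 20:
                    place = home                       # at home overnight
                elif hour < 17:
                    place = work                       # at work during the day
                else:
                    place = rng.randrange(num_places)  # somewhere else in the evening
                visits.append((day * 24 + hour, place))
        trajectories[agent] = visits
    return homes, trajectories


def sample_observations(trajectories, keep_prob, seed=1):
    """Keep each visit with probability keep_prob, mimicking sparse real-world data."""
    rng = random.Random(seed)
    return {
        agent: [visit for visit in visits if rng.random() < keep_prob]
        for agent, visits in trajectories.items()
    }


def infer_home(observations):
    """Example data science task: guess home as the most frequent night-time place."""
    inferred = {}
    for agent, visits in observations.items():
        night_places = Counter(place for t, place in visits if is_night(t))
        inferred[agent] = night_places.most_common(1)[0][0] if night_places else None
    return inferred


def accuracy(inferred, truth):
    """Validate the task result against the simulated ground truth."""
    return sum(inferred.get(agent) == home for agent, home in truth.items()) / len(truth)


if __name__ == "__main__":
    homes, trajectories = simulate(num_agents=100, num_days=7, num_places=50)
    for keep_prob in (1.0, 0.05, 0.005):
        observed = sample_observations(trajectories, keep_prob)
        print(keep_prob, accuracy(infer_home(observed), homes))
```

Because the simulated world knows every agent's true home, the same task can be scored at any sampling rate, which is exactly the kind of validation that is out of reach with real check-in data.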

The Patterns of Life Simulation. A video of the simulation can be found at: https://www.youtube.com/watch?v=rP1PDyQAQ5M.
Envisioned framework for a simulation that exhibits both realistic behavior and realistic movement.

Full reference: 

Züfle, A., Pfoser, D., Wenk, C., Crooks, A.T., Kavak, H., Anderson, T., Kim, J-S., Holt, N. and DiAntonio, A. (2024), In Silico Human Mobility Data Science: Leveraging Massive Simulated Mobility Data (Vision Paper), Transactions on Spatial Algorithms and Systems (pdf).