Friday, December 04, 2020

Future Developments in Geographical Agent-Based Models: Challenges and Opportunities

It has been a while (to say the least) since we wrote a position paper about agent-based modeling. But with agent-based modeling becoming more widely accepted and the growth of machine learning within the geographical sciences, we thought we would revisit some of the existing challenges (e.g., validation, representing behavior) and discuss how machine learning and data might help here. To this end, Alison Heppenstall, Nick Malleson, Ed Manley, Jiaqi Ge, Mike Batty and I have recently published a paper entitled "Future Developments in Geographical Agent-Based Models: Challenges and Opportunities" in Geographical Analysis. Below we provide the abstract to the paper, and if this is of interest please follow the links to the paper itself.

Abstract

Despite reaching a point of acceptance as a research tool across the geographical and social sciences, there remain significant methodological challenges for agent-based models. These include recognizing and simulating emergent phenomena, agent representation, construction of behavioral rules, calibration and validation. Whilst advances in individual-level data and computing power have opened up new research avenues, they have also brought with them a new set of challenges. This paper reviews some of the challenges that the field has faced, the opportunities available to advance the state-of-the-art, and the outlook for the field over the next decade. We argue that although agent-based models continue to have enormous promise as a means of developing dynamic spatial simulations, the field needs to fully embrace the potential offered by approaches from machine learning to allow us to fully broaden and deepen our understanding of geographical systems.

Full Reference:

Heppenstall, A., Crooks, A.T., Malleson, N., Manley, E., Ge, J. and Batty, M. (2021), Future Developments in Geographical Agent-Based Models: Challenges and Opportunities, Geographical Analysis. https://doi.org/10.1111/gean.12267 (pdf)

Tuesday, November 03, 2020

Integrating Social Networks into Large-scale Urban Simulations

Building on past posts about our work on generating large-scale synthetic populations for agent-based models, we have a new paper entitled "Integrating Social Networks into Large-scale Urban Simulations for Disaster Responses" that was accepted at the 3rd ACM SIGSPATIAL International Workshop on GeoSpatial Simulation. In the paper we discuss our method for creating synthetic populations that incorporate social networks, which we use to generate a population for the New York megacity region. To demonstrate the utility of our approach, we use the generated synthetic population to initialize an agent-based model which not only generates basic patterns of life (e.g., commuting to and from work), but also allows us to explore how people react to disasters and how their social networks are changed by such events.

If this sounds of interest to you, below we provide the abstract to the paper, along with our synthetic population workflow and some sample outcomes from the model. At the bottom of the post we provide the full reference and link to the paper (the paper itself also links to a GitHub repository where more information about the synthetic population can be found).

ABSTRACT: Social connections between people influence how they behave and where they go; however, such networks are rarely incorporated in agent-based models of disaster. To address this, we introduce a novel synthetic population method which specifically creates social relationships. This synthetic population is then used to instantiate a geographically explicit agent-based model for the New York megacity region which captures pre- and post- disaster behaviors. We demonstrate not only how social networks can be incorporated into models of disaster but also how such networks can impact decision making, opening up a variety of new application areas where network structures matter in urban settings. 

KEYWORDS: Urban Simulation, Agent-based models, Synthetic Populations, Social Networks, Geographical Information Systems, Disasters.

Synthetic population and social network generation workflow.
Synthetic population at household level within a census tract (A) and social network of one individual (B).
Example of a heat map of traffic density, with Manhattan at the center of the plot (A). The impact area of the disaster and the health status of the agents (B).

Full Reference:

Jiang, N., Burger, A., Crooks, A.T. and Kennedy, W.G. (2020), Integrating Social Networks into Large-scale Urban Simulations for Disaster Responses. Geosim ’20: 3rd ACM SIGSPATIAL International Workshop on GeoSpatial Simulation, Seattle, WA. (pdf)

Thursday, October 15, 2020

The Impact of Mandatory Remote Work during the COVID-19 Pandemic

In the past we have written about using agent-based modeling to study human resources management issues and how the workplace layout might impact subordinates' interactions with managers, but with growing amounts of data we can also explore how employees communicate with each other. To this end, Talha Oz and I have a new paper entitled "Exploring the Impact of Mandatory Remote Work during the COVID-19 Pandemic" which will be presented in a special session on COVID-19 at the 2020 International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS 2020 for short).

In this study we exploit metadata (and not content) emitted from commonplace workplace technologies, such as calendar and workplace messaging apps, collected from a tech company in order to see how mandatory remote work changed communication patterns and how such data can be used to measure organizational health. If this is of interest to you, below we provide the abstract to the paper along with some of the results with respect to how meetings and communication patterns changed from business as usual (BAU), before the pandemic, to when people were forced to work from home (WFH). Finally, at the bottom of the post we provide the full reference and the link to the paper.

Abstract. During the early months of the COVID-19 pandemic, millions of people had to work from home. We examine the ways in which COVID-19 affect organizational communication by analyzing five months of calendar and messaging metadata from a technology company. We found that: (i) cross-level communication increased more than that of same-level, (ii) while within-team messaging increased considerably, meetings stayed the same, (iii) off-hours messaging became much more frequent, and that this effect was stronger for women; (iv) employees respond to non-managers faster than managers; finally, (v) the number of short meetings increased while long meetings decreased. These findings contribute to theories on organizational communication, remote work, management, and flexibility stigma. Besides, this study exemplifies a strategy to measure organizational health using an objective (not self-report based) method. To the best of our knowledge, this is the first study using workplace communication metadata to examine the heterogeneous effects of mandatory remote work. 

Keywords: Work from Home, Communication, COVID-19, Organization.




Full Reference:

Oz, T. and Crooks, A.T. (2020), Exploring the Impact of Mandatory Remote Work during the COVID-19 Pandemic, 2020 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Washington DC. (pdf)

If you would like a pre-print of the paper, just let us know and we can email you one.

Utilizing Python for Agent-based Modeling: The Mesa Framework

In the past I have mentioned Mesa, an agent-based modeling framework in Python, in several posts but not really discussed it in detail. This is about to change with this post. The reason is that we have a paper at the forthcoming International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS for short) entitled "Utilizing Python for Agent-based Modeling: The Mesa Framework".

Mesa started off with two students from the CSS program at George Mason University, Jackie Kazil and David Masad, and has now grown to include over 70 contributors. In this new paper we discuss the rationale for developing Mesa (see https://github.com/projectmesa/mesa), which arose because there was no framework for easily building agent-based models in Python. Furthermore, we discuss Mesa's design goals, its architecture and usage, who is using Mesa, and extensions to it (e.g., Mesa-Geo, Multi-Level Mesa). We conclude the paper with future development directions. Below we provide the abstract to the paper and a selection of figures that highlight Mesa's model components (model, analysis and visualization), how various activation schedules are incorporated within Mesa (with an illustration of how these different schemes impact a model), and some examples of Mesa's visualization functionality. At the bottom of the post we have the full reference and a link to the paper.

Abstract.
Mesa is an agent-based modeling framework written in Python. Originally started in 2013, it was created to be the go-to tool for researchers wishing to build agent-based models with Python. Within this paper we present Mesa’s design goals, along with its underlying architecture. This includes its core components: 1) the model (Model, Agent, Schedule, and Space), 2) analysis (Data Collector and Batch Runner) and 3) visualization (Visualization Server and Visualization Browser Page). We then discuss how agent-based models can be created in Mesa. This is followed by a discussion of applications and extensions by other researchers to demonstrate how Mesa's design is decoupled and extensible, thus creating the opportunity for a larger decentralized ecosystem of packages that people can share and reuse for their own needs. Finally, the paper concludes with a summary and discussion of future development areas for Mesa.

Keywords: Agent-based Modeling, Python, Framework, Complex Systems. 
Mesa model components: model, analysis and visualization.
Activation schedules within Mesa and an illustration of how these different schemes impact a model. In this case the Prisoner’s Dilemma. Defecting agents are in red and cooperating agents are in blue. Each image is from the same step, but different activation schemes are used.
Model visualization of two Mesa applications within a web browser: (A) Wolf-sheep predation Model. (B) Virus on a network (Source: https://github.com/projectmesa).
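
To give a feel for why the choice of activation schedule matters, here is a minimal sketch in plain Python. It deliberately does not use Mesa's own classes; the names `RingAgent`, `sequential`, `random_order` and `simultaneous` are hypothetical and only illustrate the general idea of sequential, random and simultaneous activation. Each agent on a ring simply copies the value of its left-hand neighbour.

```python
import random


class RingAgent:
    """Toy agent that copies the value of its left-hand neighbour."""

    def __init__(self, uid, value):
        self.uid = uid
        self.value = value
        self._next = value

    def step(self, agents):
        # Decide the next value from the neighbour's *current* value.
        left = agents[(self.uid - 1) % len(agents)]
        self._next = left.value

    def advance(self):
        # Commit the value decided in step().
        self.value = self._next


def make_agents(values):
    return [RingAgent(i, v) for i, v in enumerate(values)]


def sequential(agents):
    # Agents act one after another in a fixed order; later agents
    # already see the changes made by earlier ones.
    for agent in agents:
        agent.step(agents)
        agent.advance()


def random_order(agents, rng):
    # Same as sequential, but the order is reshuffled every step.
    order = list(agents)
    rng.shuffle(order)
    for agent in order:
        agent.step(agents)
        agent.advance()


def simultaneous(agents):
    # All agents decide based on the old state, then all commit at once.
    for agent in agents:
        agent.step(agents)
    for agent in agents:
        agent.advance()


if __name__ == "__main__":
    for name, scheme in [("sequential", sequential), ("simultaneous", simultaneous)]:
        agents = make_agents([1, 0, 0])
        scheme(agents)
        print(name, [a.value for a in agents])
    agents = make_agents([1, 0, 0])
    random_order(agents, random.Random(42))
    print("random", [a.value for a in agents])
```

Starting from the same initial state, the sequential and simultaneous schemes already disagree after one step, which is the same kind of effect the Prisoner's Dilemma figure above illustrates.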

Full Reference: 
Kazil, J., Masad, D. and Crooks, A.T. (2020), Utilizing Python for Agent-based Modeling: The Mesa Framework, in Thomson, R., Bisgin, H., Dancy, C., Hyder, A. and Hussain, M. (eds), 2020 International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation, Washington DC., pp. 308-317. (pdf)

Wednesday, October 07, 2020

Creating Intelligent Agents

Continuing our work on machine learning and agent-based modeling, at the upcoming Computational Social Science (CSS 2020) annual conference, Dale Brearcliffe and I have a paper entitled "Creating Intelligent Agents: Combining Agent-Based Modeling with Machine Learning." In the paper we discuss how advances in computational availability and power have permitted a rapid increase in the development and use of machine learning (ML) solutions in a wide variety of applications (some examples we have already shown on this website), including within agent-based models.

One thing, however, is that while within the ML community at large it is common to compare different approaches and take the one that gives the best result (e.g., as we did in the Communities, Bots and Vaccinations paper), this is not the case within the social simulation community. There has been little written with respect to why one ML method was chosen over another, or how the simulation results might differ if different ML methods were used. To address this gap we demonstrate the integration of three machine learning methods (i.e., Evolutionary Computing, Q Learning, and State→Action→Reward→State→Action (SARSA)) into the well-known agent-based model Sugarscape (in this instance we modified NetLogo's "Sugarscape 2 Constant Growback"). We chose the Sugarscape model because it is well known within the social sciences, and because the purpose of this paper was not to solve or explore a specific social issue but to show how different ML methods can be used within the same agent-based model and how they impact its results.
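
For readers less familiar with the two reinforcement learning methods, the sketch below shows the textbook tabular update rules for Q Learning and SARSA in Python. This is not the NetLogo code from the paper; the learning rate, discount factor, exploration rate and the state and action names are all assumed values chosen purely for illustration.

```python
import random

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration rate (assumed value)


def choose_action(q, state, actions, rng, epsilon=EPSILON):
    """Epsilon-greedy choice over a Q-table stored as {(state, action): value}."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))


def q_learning_update(q, s, a, reward, s_next, actions):
    """Off-policy: bootstrap from the best action available in the next state."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + ALPHA * (reward + GAMMA * best_next - old)


def sarsa_update(q, s, a, reward, s_next, a_next):
    """On-policy: bootstrap from the action the agent will actually take next."""
    old = q.get((s, a), 0.0)
    next_value = q.get((s_next, a_next), 0.0)
    q[(s, a)] = old + ALPHA * (reward + GAMMA * next_value - old)


if __name__ == "__main__":
    actions = ["left", "right"]
    q = {("s1", "left"): 1.0, ("s1", "right"): 2.0}
    q_learning_update(q, "s0", "left", 1.0, "s1", actions)
    sarsa_update(q, "s0", "right", 1.0, "s1", "left")
    print(q[("s0", "left")], q[("s0", "right")])
    print(choose_action(q, "s1", actions, random.Random(0)))
```

The only difference between the two rules is which next-state value is used, yet that is enough for the methods to learn different behavior in the same environment.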

If this type of research is of interest, below we provide the abstract to the paper, a flow chart of the model execution along with some results. At the bottom of the post you can find the full reference and a link to the paper. Supplementary material can also be found at https://tinyurl.com/ML-Agents. At this link, the model presented in this paper along with a full description of it following the Overview, Design concepts, and Details (ODD) protocol can be found. We do this to allow others to replicate the results and adapt the ML methods for their own applications if they so desire.

Graphical User Interface of the “Creating Intelligent Agents” model. From left to right: input parameters, agents within their artificial world, and aggregate model outputs
 

Abstract. 

Over the last two decades with advances in computational availability and power, we have seen a rapid increase in the development and use of Machine Learning (ML) solutions applied to a wide range of applications, including their use within agent-based models. However, little attention has been given to how different ML methods alter the simulation results. Within this paper, we discuss how ML methods have been utilized within agent-based models and explore how different methods affect the results. We do this by extending the Sugarscape model to include three ML methods (evolutionary computing, and two reinforcement learning algorithms (i.e., Q Learning, and State→Action→Reward→State→Action (SARSA))). We pit these ML methods against each other and the normal functioning of the rule-based method (Rule M) in pairwise combat. Our results demonstrate ML methods can be integrated into agent-based models, that learning does not always mean better results, and that agent attributes considered important to the modeler might not be to the agent. Our paper's contribution to the field of agent-based modeling is not only to show how previous researchers have used ML but also to directly compare and contrast how different ML methods used in the same model impact the simulation outcome, which is rarely discussed, thus helping bring awareness to researchers who are considering using intelligent agents to improve their models.

Keywords: Agent-based Modeling, Evolutionary Computing, Machine Learning, Reinforcement Learning, Sugarscape.

 

Model execution flowchart.


Mean result for vision for all rule combinations (50 model runs).
 
Full reference:
Brearcliffe, D.K. and Crooks, A.T. (2020), Creating Intelligent Agents: Combining Agent-Based Modeling with Machine Learning, The 2020 Computational Social Science Society of Americas Conference, Online. (pdf)

Wednesday, September 16, 2020

Learning-based Actor-Interpreter State Representation

While in previous posts we have discussed machine learning (ML) with respect to social media analysis, we have also been exploring how one can use it in agent-based modeling. One of the first examples of this is a new paper with Paul Cummings which has been accepted at the upcoming 2020 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS 2020 for short).

In the paper entitled "Development of a Hybrid Machine Learning Agent Based Model for Optimization and Interpretability" we discuss the growth of ML within agent-based models and present the design of a hybrid agent-based/ML model called the Learning-based Actor-Interpreter State Representation (LAISR) model. LAISR attempts to: "a) generate an optimal decision-making strategy through its training process, including a more constrained parameter space, and b) describe its behaviors in a human-readable and interpretable approach." To demonstrate the LAISR model we use a simple wargaming example, that of tactical air and ground warfare, and discuss areas of further research and applications.

If this is of interest to you, below we provide the abstract to the paper, along with a high-level overview of the LAISR model, its tactical experiment diagram and some results from the wargaming experiment. This is followed by a short movie of a representative model run that Paul has created. Finally, at the bottom of the post you can see the full reference and a link to the paper itself.

Abstract.
The use of agent-based models (ABMs) has become more widespread over the last two decades allowing researchers to explore complex systems composed of heterogeneous and locally interacting entities. However, there are several challenges that the agent-based modeling community face. These relate to developing accurate measurements, minimizing a large complex parameter space and developing parsimonious yet accurate models. Machine Learning (ML), specifically deep reinforcement learning has the potential to generate new ways to explore complex models, which can enhance traditional computational paradigms such as agent-based modeling. Recently, ML algorithms have proved an important contribution to the determination of semi-optimal agent behavior strategies in complex environments. What is less clear is how these advances can be used to enhance existing ABMs. This paper presents Learning-based Actor-Interpreter State Representation (LAISR), a research effort that is designed to bridge ML agents with more traditional ABMs in order to generate semi-optimal multi-agent learning strategies. The resultant model, explored within a tactical game scenario, lies at the intersection of human and automated model design. The model can be decomposed into a format that automates aspects of the agent creation process, producing a resultant agent that creates its own optimal strategy and is interpretable to the designer. Our paper, therefore, acts as a bridge between traditional agent-based modeling and machine learning practices, designed purposefully to enhance the inclusion of ML-based agents in the agent-based modeling community.

Keywords: Agent-Based Modeling, Machine Learning, Explainable Artificial Intelligence.

LAISR model

LAISR tactical experiment diagram (A). Actor finite state machine (B).

Heat map representations of actor agents.



Full Reference:
Cummings, P. and Crooks, A.T. (2020), Development of a Hybrid Machine Learning Agent Based Model for Optimization and Interpretability, in Thomson, R., Bisgin, H., Dancy, C., Hyder, A. and Hussain, M. (eds), 2020 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Washington DC., pp 151-160. (pdf)

Friday, September 11, 2020

Utilizing ABMs for Human Resource Management

In a previous post from a few years ago we looked at how the workplace layout might impact subordinates' interactions with managers. Now turning to employee satisfaction within the workplace, at the forthcoming International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS for short) we have a paper entitled "The Human Resource Management Parameter Experimentation Tool".

In this paper we have created a model called the "Human Resources Management-Parameter Experimentation Tool", or HRM-PET for brevity, which is based on Herzberg et al.'s (1959) Two-Factor Theory. This theory has been used and tested for decades in human resource management as it can capture the interaction between a work force’s motivation and their environment’s hygiene. Hygiene in this context relates to policies and administration, supervision-technical, relationship-superior, working conditions, and salary, which together moderate job dissatisfaction. While the theory has been extensively used, it has not been explored via an agent-based model until now. Agent-based modeling allows us to test the empirically found variations on the Two-Factor Theory and its application to specific industries or organizations.

If you are interested in finding out more about this work, below we provide the abstract to the paper, an annotated graphical user interface of the model along with the basic decision making process for the agents. This is followed by some of the results from the model and a movie of a representative model run. At the bottom of the post we provide the full citation to the paper, along with that of Herzberg et al. (1959). The model itself, which was created in NetLogo 6.1, can be found along with a detailed Overview, Design concepts, and Details plus Decision making (ODD+D) document at http://bit.ly/HMR-PET. The rationale for utilizing the ODD+D and for sharing the model is that it allows broader dissemination of the model and its methodology.

Abstract:
Human resource management (HRM) draws on the field of organizational theory (OT) to identify, quantify, and manage people-based phenomena that impact organizational operations and outcomes. OT research has long used computational methods and agent-based modeling to understand complex adaptive systems. Agent-based modeling methodologies within HRM, however, are still rare. Within the HRM and management science literature, Herzberg’s et al. (1959) Two-Factor Theory (TFT) is a framework that has been tested and used for decades. Its ability to capture the interaction between a work force’s motivation and their environment’s hygiene lends itself well to agent-based modeling as a method of study. Here, we present the development of the Human Resources Management-Parameter Experimentation Tool (HRM-PET) as the first explicit ABM instantiation of TFT, filling the gap between the study of HRM and computational OT tools like agent-based modeling. 

Keywords: Human Resources Management, Management Science, Workforce Dynamics, Agent-based Modeling.
HRM-PET graphical user interface.
Decision making process for the agents in HRM-PET.
Worker congregation to work units under three variations of work unit hygiene factor distributions and two variations of weighing worker satisfaction and dissatisfaction.



References:
Herzberg, F.I., Mausner, B. and Snyderman, B. (1959), The Motivation to Work (2nd ed.). New York: John Wiley.
Iasiello, C., Crooks, A.T. and Wittman, S. (2020), The Human Resource Management Parameter Experimentation Tool, in Thomson, R., Bisgin, H., Dancy, C., Hyder, A. and Hussain, M. (eds), 2020 International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation, Washington DC., pp. 298-307. (pdf)


Tuesday, September 01, 2020

Beyond Words: Comparing Structure, Emoji Use, and Consistency Across Social Media Posts

Continuing our work on emojis, at the forthcoming International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS for short) we (Melanie Swartz, Arie Croitoru and I) have a paper entitled "Beyond Words: Comparing Structure, Emoji Use, and Consistency Across Social Media Posts." In the paper we introduce and demonstrate a language-agnostic methodology to characterize structures of content and emoji use within a document (in this case a tweet), measure consistency of structures across a set of documents, and cluster documents and users with similar patterns and behavior. Using a corpus of 44 million tweets related to the 2018 U.S. midterm elections, collected in October and November 2018 based on keywords, hashtags, and user accounts associated with candidates or political parties, we were able to gain insights into the unique or shared structures of communication styles and emoji use of over 3.3 million unique users and user roles such as journalists, bots and others. If this sounds of interest to you, below we provide the abstract to the paper, some tables and figures of our findings along with the full reference and a link to the paper. Furthermore, if you are interested in extending this work to other areas, Melanie has made the code available at https://github.com/msemoji/.
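
As a rough illustration of what a "document structure" might look like, the Python sketch below reduces a tweet to the order of its content types and scores how consistently a user reuses the same structure. It is not Melanie's code (see the repository above for that); the content type labels, the `structure` and `consistency` functions, and the use of the Unicode "Symbol, other" category as a stand-in for emoji detection are all assumptions made for this example.

```python
import re
import unicodedata
from collections import Counter

URL_PATTERN = re.compile(r"https?://\S+")


def token_type(token):
    """Map a whitespace-delimited token to a hypothetical content type."""
    if URL_PATTERN.fullmatch(token):
        return "URL"
    if token.startswith("#") and len(token) > 1:
        return "HASHTAG"
    if token.startswith("@") and len(token) > 1:
        return "MENTION"
    # Crude proxy for emojis: every character is in category "So"
    # (Symbol, other). Real emoji detection needs a proper emoji list.
    if all(unicodedata.category(ch) == "So" for ch in token):
        return "EMOJI"
    return "TEXT"


def structure(document):
    """Return the order of content types, collapsing consecutive repeats."""
    collapsed = []
    for token in document.split():
        kind = token_type(token)
        if not collapsed or collapsed[-1] != kind:
            collapsed.append(kind)
    return collapsed


def consistency(documents):
    """Share of a user's documents that follow their most common structure."""
    counts = Counter(tuple(structure(doc)) for doc in documents)
    return counts.most_common(1)[0][1] / len(documents)


if __name__ == "__main__":
    tweet = "Go vote today \U0001F5F3 #Midterms2018 https://example.com"
    print(structure(tweet))
    user_tweets = ["hello \U0001F525", "bye \U0001F525", "#tag only"]
    print(round(consistency(user_tweets), 2))
```

Because the representation never looks at the words themselves, the same code works regardless of the language of the tweet, which is the sense in which such an approach is language agnostic.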

Abstract
Social media content analysis often focuses on just the words used in documents or by users and often overlooks the structural components of document composition and linguistic style. We propose that document structure and emoji use are also important to consider as they are impacted by individual communication style preferences and social norms associated with user role and intent, topic domain, and dissemination platform. In this paper we introduce and demonstrate a novel methodology to conduct structural content analysis and measure user consistency of document structures and emoji use. Document structure is represented as the order of content types and number of features per document and emoji use is characterized by the attributes, position, order, and repetition of emojis within a document. With these structures we identified user signatures of behavior, clustered users based on consistency of structures utilized, and identified users with similar document structures and emoji use such as those associated with bots, news organizations, and other user types. This research complements existing text mining and behavior modeling approaches by offering a language agnostic methodology with lower dimensionality than topic modeling, and focuses on three features often overlooked: document structure, emoji use, and consistency of behavior.
Keywords: Data Mining, Social Media, Emojis, User Behavior Modeling.
Emoji attributes

Most common content structures with emojis for non-retweets.

Clusters of users with similar behavior across two factors in non-retweets (left) and retweets (right). Colors indicate cluster assignments.

Full Reference: 
Swartz, M., Crooks, A.T. and Croitoru, A. (2020), Beyond Words: Comparing Structure, Emoji Use, and Consistency Across Social Media Posts, in Thomson, R., Bisgin, H., Dancy, C., Hyder, A. and Hussain, M. (eds), 2020 International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation, Washington DC., pp 1-11. (pdf)

 

Monday, August 31, 2020

Beginning A New Chapter

Just a short post to say that the summer saw us move from Northern Virginia to Buffalo to start a new chapter in our lives and research. More specifically, I have become a Professor in the Department of Geography and a member of the RENEW (Research and Education in eNergy, Environment and Water) Institute at the University at Buffalo.

It does not seem that long ago that I wrote a similar post about moving from CASA to George Mason University (but that was over 10 years ago!). I guess it does not feel that long as I really enjoyed my time at George Mason University and the people (faculty and students) who I worked with there. I am also super excited to see where this new chapter leads.


Wednesday, July 22, 2020

Diversity from Emojis and Keywords in Social Media

Building on our initial work on emoji use and how one can carry out a systematic comparison of emojis across individual user profiles and communication patterns within social media, we have a new paper entitled "Diversity from Emojis and Keywords in Social Media" which was presented at the 11th International Conference on Social Media and Society.

In the paper we present a novel method using a diversity language model to associate diversity related attributes to social media user accounts and content by analyzing the emojis and keywords used (in this case from Twitter). We used this diversity language model to shed light on the groups of social media users and content with similar diversity attributes related to American politics (specifically the 2018 U.S. midterm elections). Our results revealed topics of interest and patterns of social media engagement across political lines among the diverse populations that otherwise would not have been apparent if we only analyzed the key political campaign phrases and slogans (i.e. “Blue Wave” and “Make America Great Again”) without taking diversity into account.
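
To make the idea of associating diversity attributes with content more concrete, here is a minimal Python sketch of a lexicon lookup. It is only an illustration: the tiny `LEXICON` below and the `label_text` function are hypothetical and are not the diversity language model from the paper, which covers many more attributes, keywords and emojis.

```python
# Hypothetical lexicon: attribute -> value -> markers (keywords or emojis).
LEXICON = {
    "gender": {
        "female": {"mom", "wife", "\U0001F469"},    # woman emoji
        "male": {"dad", "husband", "\U0001F468"},   # man emoji
    },
    "political ideology": {
        "liberal": {"#bluewave"},
        "conservative": {"#maga"},
    },
}


def label_text(text):
    """Return {attribute: set of values} for markers found in the text."""
    lowered = text.lower()
    tokens = set(lowered.split())
    labels = {}
    for attribute, values in LEXICON.items():
        for value, markers in values.items():
            for marker in markers:
                # Keywords must match a whole token; emojis may appear
                # anywhere, including attached to other characters.
                found = marker in tokens if marker.isascii() else marker in lowered
                if found:
                    labels.setdefault(attribute, set()).add(value)
                    break
    return labels


if __name__ == "__main__":
    print(label_text("Proud mom \U0001F469 #BlueWave"))
    print(label_text("Just here for the memes"))
```

Labels produced this way can then be aggregated over profiles or tweets to look at who is engaging with which campaign, in the spirit of the figures below.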

For interested readers, below we provide the abstract to the paper along with some figures from the paper. These include our workflow for diversity analysis of social media content and a high-level overview of our diversity language model. These are followed by some of our results, specifically the presence of diversity keywords and emojis in user profiles, and the composition of users in our collection based on gender for two political campaigns. If this piques your interest, as the conference was virtual we have also prepared a short movie of the paper. At the bottom of the post you will find the full reference to the paper along with a link to the paper itself.

Abstract:
Social media is a popular source for political communication and user engagement around social and political issues. While the diversity of the population participating in social and political events in person are often considered for social science research, measuring the diversity representation within online communities is not a common part of social media analysis. This paper attempts to fill that gap and presents a methodology for labeling and analyzing diversity in a social media sample based on emojis and keywords associated with gender, skin tone, sexual orientation, religion, and political ideology. We analyze the trends of diversity related themes and the diversity of users engaging in the online political community during the lead up to the 2018 U.S. midterm elections. Our results reveal patterns along diversity themes that otherwise would have been lost in the volume of content. Further, the diversity composition of our sample of online users rallying around political campaigns was similar to those measured in exit polls on election day. The diversity language model and methodology for diversity analysis presented in this paper can be adapted to other languages and applied to other research domains to provide social media researchers a valuable lens to identify the diversity of voices and topics of interest for the less-represented populations participating in an online social community.

Keywords: Social media, emoji, diversity, elections, political campaigns
Workflow for diversity analysis of social media content
Diversity Language Model
Presence of diversity keywords and emojis in user profiles
Composition of users in our collection based on gender for two political campaigns



Full Reference:

Swartz, M., Crooks, A.T. and Kennedy, W.G. (2020), Diversity from Emojis and Keywords in Social Media, in Gruzd, A., Mai, P., Recuero, R., Hernández-García, A., Lee, C.S., Cook, J., Hodson, J., McEwan, B and Hopke, J. (eds.), Proceedings of the 11th International Conference on Social Media & Society, Toronto, Canada, pp 92-100. (pdf)

Monday, June 29, 2020

Location-Based Social Network Data Generation

Continuing and building upon our previous work on Location-Based Social Networks (LBSNs), at the 21st IEEE International Conference on Mobile Data Management we have a paper entitled "Location-Based Social Network Data Generation Based on Patterns of Life." In the paper we discuss how LBSN research has become an active research topic in a variety of areas, describing mobility patterns, location recommendation and friend recommendation systems. However, we make the argument that real-world LBSN data sets (e.g., Gowalla, BrightKite) are a rather scarce resource due to the privacy implications of making such data publicly available. Furthermore, in many publicly available LBSN data sets, the vast majority of users have less than ten check-ins, or the number of locations visited by a user is usually only a small portion of all locations that user has visited (as shown in the table below).

Publicly Available Real-World LBSN Data Sets.

To overcome these weaknesses, in this paper we present an LBSN simulation (an agent-based model created in MASON) capable of creating multiple artificial but socially plausible, large-scale LBSN data sets. If this sounds of interest to you, below we provide a little more information about the paper: specifically, its abstract, a depiction of LBSNs, our case studies and the resulting simulations we used to develop LBSN data based on patterns of life (PoL), and some sample results. In addition to this, as the conference was virtual, Joon-Seok Kim also made a great movie of the conference paper. At the bottom of this post we provide the full reference and link to the paper.

We would also like to draw readers' attention to the online resources which accompany this paper. For example, to allow others to use and extend our work, the source code and scripts used to generate these data sets are available at: https://github.com/gmuggs/pol, while all of the generated data sets can be found at OSF (https://osf.io/e24th/?view_only=191fdd0c640847b5b85597ab0e57186d). For more details about this model and data, readers are referred to the webpage created by Joon-Seok Kim: https://mdm2020.joonseok.org.

Abstract:
Location-based social networks (LBSNs) have been studied extensively in recent years. However, utilizing real-world LBSN data sets in such studies yields several weaknesses: sparse and small data sets, privacy concerns, and a lack of authoritative ground-truth. To overcome these weaknesses, we leverage a large scale LBSN simulation to create a framework to simulate human behavior and to create synthetic but realistic LBSN data based on human patterns of life. Such data not only captures the location of users over time but also their interactions via social networks. Patterns of life are simulated by giving agents (i.e., people) an array of “needs” that they aim to satisfy, e.g., agents go home when they are tired, to restaurants when they are hungry, to work to cover their financial needs, and to recreational sites to meet friends and satisfy their social needs. While existing real-world LBSN data sets are trivially small, the proposed framework provides a source for massive LBSN benchmark data that closely mimics the real world. As such it allows us to capture 100% of the (simulated) population without any data uncertainty, privacy-related concerns, or incompleteness. It allows researchers to see the (simulated) world through the lens of an omniscient entity having perfect data. Our framework is made available to the community. In addition, we provide a series of simulated benchmark LBSN data sets using different real-world urban environments obtained from OpenStreetMap. The simulation software and data sets which comprise gigabytes of spatio-temporal and temporal social network data are made available to the research community.
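
For readers who want a feel for how needs can drive where an agent goes, below is a minimal Python sketch. It is not the MASON model from the paper: the need names, decay rates, the "most depleted need wins" rule and the instant refill on arrival are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from a need to the kind of place that satisfies it,
# following the examples in the abstract (tired -> home, hungry -> restaurant, ...).
NEED_TO_PLACE = {
    "sleep": "home",
    "food": "restaurant",
    "money": "work",
    "social": "recreation",
}


@dataclass
class Agent:
    # Every need starts fully satisfied (1.0) and is depleted towards 0.0.
    needs: dict = field(default_factory=lambda: {n: 1.0 for n in NEED_TO_PLACE})
    location: str = "home"

    def step(self, decay):
        """Deplete all needs, then move to the place that serves the most urgent one."""
        for need, rate in decay.items():
            self.needs[need] = max(0.0, self.needs[need] - rate)
        urgent = min(self.needs, key=self.needs.get)
        self.location = NEED_TO_PLACE[urgent]
        self.needs[urgent] = 1.0  # assumption: a visit fully satisfies the need
        return self.location


# Illustrative decay rates per time step (assumed values).
DECAY = {"sleep": 0.1, "food": 0.3, "money": 0.2, "social": 0.05}

agent = Agent()
# Each (time step, place) pair plays the role of a check-in record.
check_ins = [(t, agent.step(DECAY)) for t in range(1, 4)]
```

Running many such agents over a real street network, and recording who is co-located with whom, is the general idea behind producing check-in and social network data from patterns of life.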
LBSN Overview

Case Studies: A: New Orleans, Louisiana (NOLA), Mississippi River, Lake Pontchartrain, and the ‘French Quarter’. B: George Mason University (GMU), Fairfax, VA. C: Synthetic Villages - Small (Left) and Large (Right).
Environments Populated with Agents. Clockwise from Top Left: GMU, NOLA, Large and Small Synthetic Villages.
Data Sets Resulting from Location-Based Social Network Simulation
Average Social Network Degree over Time (1K).
Social Network





Full Reference:
Kim, J-S., Jin, H., Kavak, H., Rouly, O.C., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2020), Location-Based Social Network Data Generation Based on Patterns of Life, The 21st IEEE International Conference on Mobile Data Management, Versailles, France. (pdf)

Friday, June 19, 2020

Call for Papers: ACM SIGSPATIAL 2020 International Workshop on Geospatial Simulation (GeoSim 2020)


Building upon two successful GeoSim workshops, the 2020 GeoSim Workshop (held in conjunction with the ACM SIGSPATIAL 2020 conference) is seeking papers.

The 3rd GeoSim workshop will focus on all aspects of simulation as a general paradigm to model and predict spatial systems and generate spatial data. New simulation methodologies and frameworks, not necessarily coming from the SIGSPATIAL community, are encouraged to participate. Also, this workshop is of interest to everyone who works with spatial data. The simulation methods that will be presented and discussed in the workshop should find a wide application across the community by producing benchmark datasets that can be parameterized and scaled.

The workshop seeks high-quality full (8-10 pages) and short (up to 4 pages) papers that will be peer-reviewed. Once accepted, at least one author is required to register for the workshop and the ACM SIGSPATIAL conference, as well as attend the workshop to present the accepted work which will then appear in the ACM Digital Library.

We solicit novel and previously unpublished research on all topics related to geospatial simulation including, but not limited to:
  • Disease Spread Simulation
  • Urban Simulation
  • Agent Based Models for Spatial Simulation
  • Multi-Agent Based Spatial Simulation
  • Big Spatial Data Simulation
  • Spatial Data/Trajectory Generators
  • Road Traffic Simulation
  • Environmental Simulation
  • GIS using Spatial Simulation
  • Modeling and Simulation of COVID-19
  • Interactive Spatial Simulation
  • Spatial Simulation Parallelization and Distribution
  • Geo-Social Simulation and Data Generators
  • Social Unrest and Riot Prediction using Simulation
  • Spatial Analysis based on Simulation
  • Behavioral Simulation
  • Verifying and Validating Spatial Simulations
  • Applications for Spatial Simulation

Special Topic
The special topic for GeoSim 2020 brings focus to current trends in disease spread simulations, their practicality in predictive and prescriptive analytics, and the challenges they face in their use.

Workshop Information

Wednesday, June 10, 2020

New Paper: A Thematic Similarity Network Approach for Analysis of Places Using VGI

Building upon our work on volunteered geographical information (VGI) and ambient geographic information (AGI) and how such data (e.g. social media) can be used to understand place, Xiaoyi Yuan, Andreas Züfle and I have a new paper entitled: "A Thematic Similarity Network Approach for Analysis of Places Using Volunteered Geographic Information" in the ISPRS International Journal of Geo-Information. In this paper we use textual data from crowdsourced reviews originating with TripAdvisor and geo-located Twitter data and leverage this unstructured geographical information to comprehend the complexity of places at scale. Specifically, we explore the connectedness and relationships of places through thematic (i.e., topical) similarity networks using Manhattan, New York as a case study. If such work sounds of interest to you, below we provide the abstract to the paper so that you can gain a greater understanding of the work, along with some figures that show our workflow and how communities were connected, before presenting some of our results. Finally, at the bottom of the post, the full reference and a link to the paper are provided. For those interested in extending or utilizing this work, the Python code for our analysis is available at: https://bitbucket.org/xiaoyiyuan/network_vgi/

Abstract:
The research presented in this paper proposes a thematic network approach to explore rich relationships between places. We connect places in networks through their thematic similarities by applying topic modeling to the textual volunteered geographic information (VGI) pertaining to the places. The network approach enhances previous research involving place clustering using geo-textual information, which often simplifies relationships between places to be either in-cluster or out-of-cluster. To demonstrate our approach, we use a case study in Manhattan (New York) that compares networks constructed from three different geo-textual data sources: TripAdvisor attraction reviews, TripAdvisor restaurant reviews, and Twitter data. The results showcase how the thematic similarity network approach enables us to conduct clustering analysis as well as node-to-node and node-to-cluster analysis, which is fruitful for understanding how places are connected through individuals’ experiences. Furthermore, by enriching the networks with geodemographic information as node attributes, we discovered that some low-income communities in Manhattan have distinctive restaurant cultures. Even though geolocated tweets are not always related to the place they are posted from, our case study demonstrates that topic modeling is an efficient method to filter out the place-irrelevant tweets and thereby refine how places can be studied.

Keywords: Geo-Textual Data, Volunteered Geographic Information, Crowdsourcing, Similarity Network Analysis, Topic Modeling
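
To make the idea of a thematic similarity network a little more concrete, here is a minimal Python sketch. It assumes each place has already been summarized as a vector of topic proportions (e.g. from a topic model), and it uses cosine similarity with an optional threshold to decide which places to connect. Both choices, along with the place names and numbers, are assumptions for illustration and may differ from what the paper and its code repository actually do.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_network(topic_vectors, threshold=0.0):
    """Return {(place_a, place_b): similarity} for every pair at or above threshold.

    With threshold=0.0 and non-negative topic proportions this is the
    fully-connected weighted network; a higher threshold keeps only strong ties.
    """
    edges = {}
    for a, b in combinations(sorted(topic_vectors), 2):
        weight = cosine(topic_vectors[a], topic_vectors[b])
        if weight >= threshold:
            edges[(a, b)] = weight
    return edges


# Hypothetical topic proportions over three topics for three made-up places.
places = {
    "museum_a": [0.8, 0.1, 0.1],
    "museum_b": [0.7, 0.2, 0.1],
    "diner": [0.1, 0.1, 0.8],
}

full_network = similarity_network(places)
strong_ties = similarity_network(places, threshold=0.8)
```

The weighted edges could then be handed to a community detection routine to find groups of thematically similar places.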

Work flow from data input to the construction of the thematic similarity network and analysis (i.e., community detection and unique nodes discovery).

A stylized network demonstrating the process of community detection from a fully-connected similarity network.


Network visualization of all communities from the thematic similarity networks with major communities highlighted. Only the major communities are shown on the map for the sake of clarity. Major communities in Network visualization and mapping for each network are colored the same and thus the legend applies for both.


Two examples of communities with boundary nodes and their respective topics.

Full Reference:
Yuan X., Crooks, A.T. and Züfle, A. (2020), A Thematic Similarity Network Approach for Analysis of Places Using Volunteered Geographic Information, ISPRS International Journal of Geo-Information,  9(6), 385, https://doi.org/10.3390/ijgi9060385. (pdf)

Tuesday, June 02, 2020

Location-Based Social Simulation for Prescriptive Analytics of Disease Spread

Building upon our previous work on Location-Based Social Networks (LBSNs) and how agent-based modeling could provide an alternative to real world data sets, in the latest SIGSPATIAL Special Newsletter, we (Joon-Seok Kim, Hamdi Kavak, Chris Rouly, Hyunjee Jin, Dieter Pfoser, Carola Wenk, Andreas Züfle and I) have an article entitled "Location-Based Social Simulation for Prescriptive Analytics of Disease Spread."

In this article we discuss a geographically explicit agent-based model that we have been developing that is capable not only of simulating human behavior but also of creating synthetic but realistic LBSN data based on human patterns-of-life. Furthermore, in the article we discuss how such data and models can be used to explore the parameter space of possible prescriptions to find optimal strategies (or policies) to achieve a desired system state and outcome. We refer to such a search for optimal policies as prescriptive analytics (readers wishing to learn more about prescriptive analytics should see the 1st ACM KDD Workshop on Prescriptive Analytics for the Physical World).

To give an example of such prescriptions, in the article we make use of a simple hypothetical disease model and explore two prescribed policies to mitigate the spread of the disease. The first policy requires all agents to wear simulated Personal Protective Equipment (PPE) that reduces the chance of infection by 50%. The second policy enforces strict social distancing measures on a fixed proportion (50%) of the population. Those who follow the social distancing order avoid visiting recreational sites to meet people, although they still go to restaurants. In addition to these two policies, as a baseline, we also ran a “null-prescription” in which no intervention was prescribed. We find that the social distancing prescription was extremely effective. On the other hand, our simulation results for the PPE policy showed that merely wearing protective gear without any change in behavior has no significant effect (for the case of this disease).
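
One way to think about such prescriptions is as parameter settings handed to the same simulation. The Python sketch below illustrates only that idea: it is a toy contact model, not the model from the article, and every name and number in it (Prescription, total_infected, the base chance of infection, the number of contacts) is an assumption. It ignores restaurants, recovery and the street network, so the numbers it produces should not be read as the article's results.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    name: str
    infection_multiplier: float = 1.0  # 0.5 models PPE halving the chance of infection
    distancing_fraction: float = 0.0   # fixed share of agents who skip recreational sites


NULL = Prescription("null-prescription")
PPE = Prescription("PPE", infection_multiplier=0.5)
DISTANCING = Prescription("social distancing", distancing_fraction=0.5)


def total_infected(policy, n_agents=200, n_days=30, base_chance=0.05,
                   contacts=5, seed=0):
    """Count agents ever infected in a toy model of mixing at recreational sites."""
    rng = random.Random(seed)
    # A fixed proportion of agents follows the distancing order and never visits.
    n_distancing = int(policy.distancing_fraction * n_agents)
    distancing = set(rng.sample(range(n_agents), n_distancing))
    visitors = [a for a in range(n_agents) if a not in distancing]
    infected = {0}  # agent 0 is the initial case
    chance = base_chance * policy.infection_multiplier
    for _ in range(n_days):
        newly = set()
        for a in visitors:
            if a not in infected:
                continue
            # Each infected visitor meets a few randomly chosen visitors per day.
            for b in rng.sample(visitors, min(contacts, len(visitors))):
                if b not in infected and rng.random() < chance:
                    newly.add(b)
        infected |= newly
    return len(infected)


# Compare each prescription against the null-prescription on the same seed.
results = {p.name: total_infected(p) for p in (NULL, PPE, DISTANCING)}
```

Searching over many such parameter settings, rather than just three, is what turns a comparison like this into an exploration of the space of possible prescriptions.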

If this type of research is of interest to you, below we provide the abstract to the paper, a movie of a representative simulation run, some of our results for the prescriptions described above and a link to the paper itself. Further information about the model and data can be found at https://geosocial.joonseok.org/p/epidemic.html and the data is available at https://osf.io/e24th/. Also, as we are currently going through COVID-19, we thought a brief write-up and links to some disease models and discussions of modeling efforts related to it were also appropriate to include.

Abstract: 
Human mobility and social networks have received considerable attention from researchers in recent years. What has been sorely missing is a comprehensive data set that not only addresses geometric movement patterns derived from trajectories, but also provides social networks and causal links as to why movement happens in the first place. To some extent, this challenge is addressed by studying location-based social networks (LBSNs). However, the scope of real-world LBSN data sets is constrained by privacy concerns, a lack of authoritative ground-truth, their sparsity, and small size. To overcome these issues we have infused a novel geographically explicit agent-based simulation framework to simulate human behavior and to create synthetic but realistic LBSN data based on human patterns-of-life (i.e., a geo-social simulation). Such data not only captures the location of users over time, but also their motivation, and interactions via temporal social networks. We have open sourced our framework and released a set of large data sets for the SIGSPATIAL community. In order to showcase the versatility of our simulation framework, we added a disease model that simulates an outbreak and allows us to test different policy measures such as implementing mandatory mask use and various social distancing measures. The produced data sets are massive and allow us to capture 100% of the (simulated) population over time without any data uncertainty, privacy-related concerns, or incompleteness. It allows researchers to see the (simulated) world through the lens of an omniscient entity having perfect data.

Screenshot of the epidemic simulator depicting the French Quarter, New Orleans, LA, USA.



New cases and SEIR epidemic course.


Full Reference:
Kim, J-S., Kavak, H., Rouly, C.O., Jin, H., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2020), Location-Based Social Simulation for Prescriptive Analytics of Disease Spread, SIGSPATIAL Special, 12(1): 53-61. (pdf)

The Washington Post's Disease Model
While this post is not about COVID per se, if you are interested in disease models the Washington Post had a great article about COVID several months ago entitled "Why outbreaks like coronavirus spread exponentially, and how to “flatten the curve”." This article generated a lot of discussion, such as on the SIMSOC mailing list, and was cited in a paper in the Journal of Artificial Societies and Social Simulation (JASSS) entitled "Computational Models That Matter During a Global Pandemic Outbreak: A Call to Action." Other good discussions on COVID-related models (particularly agent-based models) can be found on the Review of Artificial Societies and Social Simulation (RofASSS) website (here) and the CoMSES Net Discourse Forum (along with links to past epidemic models), and the Sociology and Complexity Science Blog has some very good posts on modeling and public health.