Wednesday, November 06, 2019

Simulating Urban Patterns of Life: A Geo-Social Data Generation Framework

At the ACM SIGSPATIAL'19 conference, Joon-Seok Kim, Hamdi Kavak, Umar Manzoor, Dieter Pfoser, Carola Wenk, Andreas Züfle and I have a paper entitled "Simulating Urban Patterns of Life: A Geo-Social Data Generation Framework." The general idea behind the paper is that while trajectory data is being used to capture human mobility in many applications (e.g. traffic prediction, ride-sharing applications), the use of real-world trajectory data raises serious concerns with respect to the privacy of users who contribute such information.

To overcome these privacy concerns we have created a geo-social data generator by utilizing agent-based modeling. The notion behind this generator is to allow users to develop and customize the logic of agent behaviors for different application domains (e.g. commuting around a city). Once the basic model is created, the simulation can be run to generate geo-social data, which can then be used as a substitute for real-world trajectory data to study human mobility. If you wish to find out more about this paper, below is the abstract, along with some figures of the framework architecture and a link to the paper. Further supplementary materials including a demo video (which is also below) and sample data can be found at:

Data generators have been heavily used in creating massive trajectory datasets to address common challenges of real-world datasets, including privacy, cost of data collection, and data quality. However, such generators often overlook social and physiological characteristics of individuals and as such their results are often limited to simple movement patterns. To address these shortcomings, we propose an agent-based simulation framework that facilitates the development of behavioral models in which agents correspond to individuals that act based on personal preferences, goals, and needs within a realistic geographical environment. Researchers can use a drag-and-drop interface to design and control their own world including the geospatial and social (i.e. geo-social) properties. The framework is capable of generating and streaming very large data that captures the basic patterns of life in urban areas. Streaming data from the simulation can be accessed in real time through a dedicated API. 
Keywords: Agent-based simulation, trajectory data, data generator, spatial network, human behavior.
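
To give a flavor of what "agents that act based on personal preferences, goals, and needs" could look like in code, here is a minimal Python sketch. It is not the framework from the paper: the toy grid, the place names in PLACES, the hunger need and the simulate function are all assumptions made purely for illustration. Each agent picks a goal from the time of day and its current need, moves one cell toward it, and emits a (tick, agent_id, x, y) record, which is the kind of synthetic trajectory data the post describes.

```python
import random
from dataclasses import dataclass, field

# Hypothetical places on a toy grid (not the framework's actual environment).
PLACES = {"home": (0, 0), "work": (5, 3), "restaurant": (2, 4)}


@dataclass
class Agent:
    agent_id: int
    position: tuple = (0, 0)
    hunger: float = 0.0  # assumed physiological need that grows over time
    trajectory: list = field(default_factory=list)

    def choose_goal(self, hour: int) -> str:
        # A pressing need overrides the daily routine.
        if self.hunger > 1.0:
            return "restaurant"
        return "work" if 9 <= hour < 17 else "home"

    def step(self, tick: int, rng: random.Random) -> None:
        hour = tick % 24
        gx, gy = PLACES[self.choose_goal(hour)]
        x, y = self.position
        # Move at most one grid cell toward the goal along each axis.
        x += (gx > x) - (gx < x)
        y += (gy > y) - (gy < y)
        self.position = (x, y)
        if self.position == PLACES["restaurant"]:
            self.hunger = 0.0  # the need is satisfied on arrival
        else:
            self.hunger += rng.uniform(0.05, 0.15)
        self.trajectory.append((tick, self.agent_id, x, y))


def simulate(n_agents: int, n_ticks: int, seed: int = 42) -> list:
    """Run the toy model and return all trajectory records in time order."""
    rng = random.Random(seed)
    agents = [Agent(agent_id=i) for i in range(n_agents)]
    records = []
    for tick in range(n_ticks):
        for agent in agents:
            agent.step(tick, rng)
            records.append(agent.trajectory[-1])
    return records


if __name__ == "__main__":
    for record in simulate(n_agents=2, n_ticks=5):
        print(record)
```

Because the random generator is seeded, repeated runs produce the same synthetic trajectories, which is convenient when the generated data is used for benchmarking.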
Causality in human behavior

Architecture of framework

Layout of model builder and sample model

Full Reference:
Kim, J-S., Kavak, H., Manzoor, U., Crooks, A.T., Pfoser, D., Wenk, C. and Züfle, A. (2019), Simulating Urban Patterns of Life: A Geo-Social Data Generation Framework, in Banaei-Kashani, F., Trajcevski, G., Güting, R.H., Kulik, L. and Newsam, S. (eds.), Proceedings of the 27th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2019), Chicago, IL. (pdf)

Tuesday, November 05, 2019

New Paper: Assessing the Placeness of Locations through User-contributed Content

In the past we have written about how one can use crowdsourced data to gain a collective sense of place from Twitter contributions and also from corresponding Wikipedia entries (e.g. here). In a new paper with Xiaoyi Yuan, we extend this work to explore how user-contributed data can be used to examine whether urban places are becoming inauthentic due to urban commodification and standardization by chain stores such as restaurants. To this end, at the 3rd ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GeoAI) we have a paper entitled: "Assessing the Placeness of Locations through User-contributed Content."

In the paper we attempt to understand the relationship between restaurants and urban identities via user-contributed content. We extracted and analyzed information from over 3 million Yelp reviews from 37,000 restaurants using a Convolutional Neural Network (CNN) model in order to study places from the bottom up. Specifically, we were interested in the extent to which cities share similarities or differences in their Yelp restaurant reviews. Furthermore, we wanted to explore how opinion aspects (i.e. what reviewers care about the most) are mentioned differently in urban chain and independent restaurants. Through the analysis of the Yelp reviews we find that online geo-tagged text data is fruitful for understanding places and that aspect-based sentiment analysis helps us understand large volumes of text. Not only did we discover that cities show homogeneity in terms of restaurant reviews, but also that for chain restaurants, “location” often emphasizes the differences between different stores of the same chain, whereas for independent restaurant reviews, the aspect “location” reflects the characteristics of the places where the restaurants are situated. If this is of interest to you, below we provide the abstract to the paper, along with some of the key findings and a link to the paper.

Previous research has argued that urban places are becoming “placeless” and inauthentic. Many local policies have also proposed to encourage more independent stores in order to restore urban identity. Others argue, however, that chain stores provide affordable merchandise and different locations of the same chain may have different meanings to an individual. The research presented in this paper uses a Convolutional Neural Networks model to extract opinion aspects from more than 3 million user-contributed Yelp restaurant reviews. The results show high homogeneity among cities in terms of the average proportions of aspects in restaurant reviews. In addition, for fast food chains, “location” is the only aspect category reviewed proportionally higher than independent fast food restaurants. An analysis of the co-occurrences of “location” indicates that the identity of chain restaurants stems from the comparison between the same chain of different locations whereas the identity of the independent restaurants is more diverse, implying the intricacies of placeness of urban stores. This research demonstrates that fine-grained sentiment analysis (i.e., opinion aspect extraction and analysis) with geo-tagged text data is fruitful for studying nuanced place perceptions on a large scale.
KEYWORDS: Urban Places, Convolutional Neural Networks, Aspect-based Sentiment Analysis
Figure 1: Illustration of an example of a CNN layer.
Figure 3: Mapping restaurants in NV, AZ, PA, NC, WI, IL. Not all cities are shown in each state. Only cities have data that accounts for the majority of the restaurants in that state are mapped, for the sake of visual clarity.
Figure 6: Average proportions of aspect categories for chain and independent fast food restaurants for two kinds of cuisine (American, Mexican) in Las Vegas, Phoenix, and Charlotte, normalized by dividing the mean for comparison.
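
As a rough illustration of the comparison behind Figure 6, the Python sketch below computes aspect proportions for two groups of reviews and then divides each proportion by its mean across the groups. This is only one plausible reading of "normalized by dividing the mean," not the paper's actual code, and the aspect labels and counts are made-up toy data. A value above 1 means the aspect is mentioned proportionally more in that group than on average.

```python
from collections import Counter


def aspect_proportions(aspect_mentions):
    """Share of each aspect among all aspect mentions in a group of reviews."""
    counts = Counter(aspect_mentions)
    total = sum(counts.values())
    return {aspect: n / total for aspect, n in counts.items()}


def normalize_by_mean(groups):
    """Divide each group's proportion by the mean proportion across groups.

    Assumption: this is how the normalization in the figure is interpreted here.
    """
    aspects = set().union(*(props.keys() for props in groups.values()))
    normalized = {name: {} for name in groups}
    for aspect in aspects:
        mean = sum(props.get(aspect, 0.0) for props in groups.values()) / len(groups)
        for name, props in groups.items():
            normalized[name][aspect] = props.get(aspect, 0.0) / mean
    return normalized


# Toy data: one extracted aspect label per mention (hypothetical counts).
chain = ["food"] * 5 + ["service"] * 3 + ["location"] * 2
independent = ["food"] * 6 + ["service"] * 3 + ["location"] * 1

result = normalize_by_mean({
    "chain": aspect_proportions(chain),
    "independent": aspect_proportions(independent),
})

if __name__ == "__main__":
    for group, values in result.items():
        print(group, {a: round(v, 2) for a, v in sorted(values.items())})
```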
Yuan, X. and Crooks, A.T. (2019), Assessing the Placeness of Locations through User-contributed Content, in Gao, S., Newsam, S., Zhao, L., Lunga, D., Hu, Y., Martins, B., Zhou, X. and Chen, F. (eds.), Proceedings of the 3rd ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GeoAI), Chicago, IL. pp. 15-23. (pdf)

Thursday, October 31, 2019

Talk: Utilizing Agent-based Models and Open Data to Examine the Movement of People and Information

Earlier this month I was invited to give a talk as part of the Criminal Investigations and Network Analysis Center (CINA) Distinguished Speaker Series. As readers of the blog might expect, I chose to talk about how open data (e.g. OpenStreetMap, Twitter) can be utilized in agent-based models to study a variety of applications (many of which can be found over on my research page). The talk itself was entitled: "Utilizing Agent-based Models and Open Data to Examine the Movement of People and Information: A Gallery of Applications." Below you can read the brief abstract of the talk and, if this piques your interest, CINA recorded my talk and a highlights (short) version is given below, while the full talk can be found at:

Today we are awash with many new forms of open data (e.g. crowdsourced, social media), but we are still challenged with how individuals make decisions and how this leads to more aggregate patterns emerging. One way to explore how individuals make decisions, or are impacted by information and their resulting consequences, is via agent-based modeling. Agent-based modeling allows for simulating heterogeneous actors and their decision-making processes within complex systems. Through a series of example applications ranging from the small-scale movement of pedestrians over seconds, to that of the movement of people over borders over hours and days, I will demonstrate how open data can be leveraged within the agent-based building process. Specifically, the examples will show that by focusing on individuals, or groups of individuals and the networks that connect them, more aggregate patterns emerge from the bottom up.

Friday, October 25, 2019

Papers at CSSSA Conference

At the 2019 Computational Social Science Society of Americas (CSSSA) Conference, we have two papers being presented which relate to our interests in urban simulation. Full citations and links to them are provided at the bottom of this post, while what follows provides a brief overview of them. Turning first to the paper entitled "Capturing the Effects of Gentrification on Property Values: An Agent-Based Modeling Approach," co-authored with Niloofar Bagheri-Jebelli and Bill Kennedy, it explores how agents' choices for specific locations within a city lead to gentrification occurring. The model and data that accompany the paper can be found at:, while below we provide the abstract of the paper, the graphical user interface of the model along with a movie of one simulation run with default model settings.

Cities are complex systems which are constantly changing because of the interactions between the people and their environment. Such systems often go through several life cycles which are shaped by various processes. These may include urban growth, sprawl, shrinkage, and gentrification. These processes affect the urban land markets which in turn affect the formation of a city through feedback loops. Through models we can explore such dynamics, populations, and the environments in which people inhabit. The model proposed in this paper intends to simulate the aforementioned dynamics to capture the effect of agents’ choices and actions on the city structure. Specifically, this model explores the effect of gentrification on population density and housing values. The proposed model is significant in its integration of ideas from complex systems theory which is operationalized within an agent-based model stylized on urban theories to study gentrification as a cause of increases in land values. The model is stylized on urban theories and results from the model show that the agents move to and reside in properties within their income range, neighboring agents that have similar economic status. The model also shows the role of gentrification by capturing both the supply and demand aspects of this process in the displacement and immobilization of agents with lower incomes. This is one of the first models that combines several processes to explore the life cycle of a city through agent-based modeling.

Keywords: Urban Dynamics, Land Markets, Gentrification, Urban Growth, Urban Shrinkage, Urban Sprawl.
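
The abstract notes that agents reside in properties within their income range and that rising values can displace lower-income agents. The Python sketch below is a deliberately crude illustration of that idea and not the model from the paper: the Property class, the price-to-income threshold of 3.0 and the appreciation rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Property:
    neighborhood: int
    value: float
    occupant_income: Optional[float] = None  # None means the property is vacant


def affordable(income: float, value: float, price_to_income: float = 3.0) -> bool:
    """Assumed affordability rule: value is at most a fixed multiple of income."""
    return value <= price_to_income * income


def choose_property(income: float, properties: list) -> Optional[Property]:
    """Pick the most valuable vacant property within the agent's income range."""
    options = [p for p in properties
               if p.occupant_income is None and affordable(income, p.value)]
    return max(options, key=lambda p: p.value, default=None)


def appreciate(properties: list, neighborhood: int, rate: float) -> None:
    """Raise values in one neighborhood, standing in for gentrification pressure."""
    for p in properties:
        if p.neighborhood == neighborhood:
            p.value *= 1 + rate


def displaced_incomes(properties: list) -> list:
    """Incomes of occupants who can no longer afford where they live."""
    return [p.occupant_income for p in properties
            if p.occupant_income is not None
            and not affordable(p.occupant_income, p.value)]


if __name__ == "__main__":
    city = [Property(0, 90.0), Property(0, 150.0), Property(1, 300.0)]
    new_home = choose_property(50.0, city)
    new_home.occupant_income = 50.0
    appreciate(city, neighborhood=0, rate=0.10)
    print(new_home.value, displaced_incomes(city))
```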

Model graphical user interface at default settings.

Gentrification by demand in the 10th neighborhood of the inner-city.

Turning to our second paper, which was presented as a poster and is entitled "Modeling Social Networks in an Agent-Based Model of a Nuclear Weapon of Mass Destruction Event," we discussed our continuing work on disasters, specifically our project on how people might react in the event of a Nuclear Weapon of Mass Destruction (NWMD) in New York City when one integrates social networks into an agent-based model. In the paper we discuss preliminary results which demonstrate how we can integrate household social networks into a spatially explicit model. Furthermore, we demonstrate and benchmark agent commuting patterns for the New York City Commuter Region with a sample population (as we show in one of the movies below), along with demonstrating agents' initial reactions post NWMD detonation.

Connections between human beings often influence where people go and how they behave, yet their representation as social networks are rarely modeled as a factor of human behavior in agent-based models. Social networks are increasingly being used to study human behavior in disasters, and empirical work has shown that human beings prioritize the safety of themselves and loved ones (i.e., households) before helping neighbors and coworkers. In this poster, we briefly present our agent-based model being used to characterize the New York City area population’s reaction to a Nuclear Weapon of Mass Destruction (NWMD) event. The model methodology demonstrates how social networks can be integrated into an agent-based model and act as a basis for decision-making during a disaster. Preliminary simulations show how agents potentially respond to a NWMD event with measurable changes in location and network formations over space and time.
Keywords: Agent-Based Model, Human Behavior, Social Networks, Emergency, Disaster Response, Nuclear Weapon of Mass Destruction.

Bagheri-Jebelli, N., Crooks, A.T. and Kennedy, W.G. (2019), Capturing the Effects of Gentrification on Property Values: An Agent-Based Modeling Approach, The 2019 Computational Social Science Society of Americas Conference, Santa Fe, NM. (pdf)

Burger, A. G., Kennedy, W.G., Crooks, A.T., Jiang, N. and Guillen-Piazza, D. (2019), Modeling Social Networks in an Agent-Based Model of a Nuclear Weapon of Mass Destruction Event, The 2019 Computational Social Science Society of Americas Conference, Santa Fe, NM. (paper pdf) (poster pdf)

Wednesday, September 04, 2019

Communities, Bots and Vaccinations

Following on from our work on bots and health discussions in relation to online social networks (OSNs), Xiaoyi Yuan, Ross Schuchard and I have just published a paper entitled "Examining Emergent Communities and Detecting Social Bots within the Polarized Online Vaccination Debate in Twitter" in Social Media + Society. Within the paper we explore the communication patterns of vaccine discussions in Twitter. More specifically we ask three questions:
  1. Do vaccine discussions on Twitter show a highly clustered pattern in the sense that users tend to communicate more often with those who have same opinions towards vaccination than those who do not? 
  2. If the communication is highly clustered, to what extent do pro-vaccine users reach out to anti-vaccine users and vice versa? 
  3. How much do social bots, computer algorithms designed to mimic human behavior and interact with humans in an automated fashion, contribute to the conversation as previous research has shown that social bots can have certain impact on human communication in social media?
In order to answer these questions, we use a variety of machine learning techniques (e.g. logistic regression, support vector machines (with linear and non-linear kernels), k-nearest neighbors, nearest centroid, and Naïve Bayes) trained with labeled data (which is available at:) to categorize each user’s vaccination stance. By exploring a combination of opinion groups and retweet networks we discovered that pro- and anti-vaccine users retweet predominantly from their own opinion group, while users with neutral opinions are distributed across communities. In addition, our bot analysis (using the open-source DeBot detection platform) found that 1.45% of the corpus users were likely bots and that these produced 4.59% of all tweets within our data set. If you wish to find out more about our paper, below you can read the abstract and see some figures, including a sketch of our methodology and a selection of results, along with a link to the paper.

Many states in the United States allow a “belief exemption” for measles, mumps, and rubella (MMR) vaccines. People’s opinion on whether or not to take the vaccine can have direct consequences in public health. Social media has been one of the dominant communication channels for people to express their opinions of vaccination. Despite governmental organizations’ efforts of disseminating information of vaccination benefits, anti-vaccine sentiment is still gaining momentum. Studies have shown that bots on social media (i.e., social bots) can influence opinion trends by posting a substantial number of automated messages. The research presented here investigates the communication patterns of anti- and pro-vaccine users and the role of bots in Twitter by studying a retweet network related to MMR vaccine after the 2015 California Disneyland measles outbreak. We first classified the users into anti-vaccination, neutral to vaccination, and pro-vaccination groups using supervised machine learning. We discovered that pro- and anti-vaccine users retweet predominantly from their own opinion group. In addition, our bot analysis discovers that 1.45% of the corpus users were identified as likely bots which produced 4.59% of all tweets within our dataset. We further found that bots display hyper-social tendencies by initiating retweets at higher frequencies with users within the same opinion group. The article concludes that highly clustered anti-vaccine Twitter users make it difficult for health organizations to penetrate and counter opinionated information while social bots may be deepening this trend. We believe that these findings can be useful in developing strategies for health communication of vaccination.

Keywords: Anti-vaccine Movement, Twitter, Social Media, Opinion Classification, Bot Analysis
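
The two headline measurements in the abstract, how often users retweet within their own opinion group and what share of users and tweets come from likely bots, can be sketched in a few lines of Python. This is not the paper's pipeline (which relied on supervised classifiers and the DeBot platform); the function names and the toy users, stances and counts below are hypothetical and assume the stance labels and bot flags have already been produced.

```python
from collections import defaultdict


def in_group_retweet_share(retweets, stance):
    """Fraction of each group's retweets whose original author shares its stance.

    retweets: iterable of (retweeter, original_author) pairs
    stance:   mapping from user to "pro", "anti" or "neutral"
    """
    within = defaultdict(int)
    total = defaultdict(int)
    for retweeter, author in retweets:
        group = stance[retweeter]
        total[group] += 1
        if stance[author] == group:
            within[group] += 1
    return {group: within[group] / total[group] for group in total}


def bot_shares(tweet_counts, bots):
    """Return (share of users flagged as bots, share of tweets they produced)."""
    n_bot_users = sum(1 for user in tweet_counts if user in bots)
    n_bot_tweets = sum(n for user, n in tweet_counts.items() if user in bots)
    return n_bot_users / len(tweet_counts), n_bot_tweets / sum(tweet_counts.values())


# Toy data (hypothetical users; not from the paper's corpus).
stance = {"a": "pro", "b": "pro", "c": "anti", "d": "anti", "e": "neutral"}
retweets = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"),
            ("d", "c"), ("e", "a"), ("e", "c")]
tweet_counts = {"a": 10, "b": 5, "c": 20, "d": 5, "e": 10}

shares = in_group_retweet_share(retweets, stance)
user_share, tweet_share = bot_shares(tweet_counts, bots={"c"})

if __name__ == "__main__":
    print(shares)
    print(user_share, tweet_share)
```

A tweet share that is noticeably larger than the user share, as in the 1.45% versus 4.59% reported above, indicates that the flagged accounts post more than the average user.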

Full Reference:
Yuan, X., Schuchard, R. and Crooks, A.T. (2019), Examining Emergent Communities and Detecting Social Bots within the Polarized Online Vaccination Debate in Twitter, Social Media + Society. (pdf)