Monday, September 18, 2017

Agent-Based Modeling Chapter

In the recently published "Comprehensive Geographic Information Systems" edited by Bo Huang, Alison Heppenstall, Nick Malleson and I have a chapter entitled "Agent-based Modelling"1. Within the chapter, we provide an overview of agent-based modeling (ABM), especially for the geographical sciences. This includes a section on how ABM emerged, i.e. "The Rise of the (Automated) Machines", along with a discussion on what constitutes an agent. This is followed by the steps to building an agent-based model, including: 1) the preparation and design; 2) model implementation; and 3) how one goes about evaluating a model (i.e. verification, calibration and validation, and how these are particularly challenging with respect to spatial agent-based models). We then discuss how we can integrate space and GIS into agent-based models and review a number of open-source ABM toolkits (e.g. GAMA, MASON, NetLogo) before concluding with challenges and opportunities that we see ahead of us, such as adding more complex behaviors to agent-based models, and how "big data" offers new avenues for multiscale calibration and validation of agent-based models. If you are still reading this, below you can read the abstract and find the full reference to the chapter.

Agent-based modeling (ABM) is a technique that allows us to explore how the interactions of heterogeneous individuals impact on the wider behavior of social/spatial systems. In this article, we introduce ABM and its utility for studying geographical systems. We discuss how agent-based models have evolved over the last 20 years and situate the discipline within the broader arena of geographical modeling. The main properties of ABM are introduced and we discuss how models are capable of capturing and incorporating human behavior. We then discuss the steps taken in building an agent-based model and the issues of verification and validation of such models. As the focus of the article is on ABM of geographical systems, we then discuss the need for integrating geographical information into models and techniques and toolkits that allow for such integration. Once the core concepts and techniques of creating agent-based models have been introduced, we then discuss a wide range of applications of agent-based models for exploring various aspects of geographical systems. We conclude the article by outlining challenges and opportunities of ABM in understanding geographical systems and human behavior.

Keywords: Agent-based modeling; Calibration; Complexity; Geographical information science; Modeling and simulation; Validation; Verification.

Full Reference
Crooks, A.T., Heppenstall, A. and Malleson, N. (2018), Agent-based Modelling, in Huang, B. (ed), Comprehensive Geographic Information Systems, Elsevier, Oxford, England. Volume 1, pp. 218-243 DOI: (pdf)

1. [Readers of this blog might have expected the chapter would be about Agent-based Modeling, but it's still worth a read!]

Thursday, August 17, 2017

Big Data, Agents and the City

In the recently published book "Big Data for Regional Science" edited by Laurie Schintler and Zhenhua Chen, Nick Malleson, Sarah Wise, Alison Heppenstall and I have a chapter entitled "Big Data, Agents and the City". In the chapter we discuss how big data can be used to build more powerful agent-based models. Specifically, we discuss how data from, say, social media could be used to inform agents' behaviors and their dynamics, along with helping with the calibration and validation of such models, with an emphasis on urban systems.

Below you can read the abstract of the chapter, see some of the figures we used to support our discussion, along with the full reference and a pdf proof of the chapter. As always any thoughts or comments are welcome.

Big Data (BD) offers researchers the scope to simulate population behavior through vastly more powerful Agent Based Models (ABMs), presenting exciting opportunities in the design and appraisal of policies and plans. Agent-based simulations capture system richness by representing micro-level agent choices and their dynamic interactions. They aid analysis of the processes which drive emergent population level phenomena, their change in the future, and their response to interventions. The potential of ABMs has led to a major increase in applications, yet models are limited in that the individual-level data required for robust, reliable calibration are often only available in aggregate form. New (‘big’) sources of data offer a wealth of information about the behavior (e.g. movements, actions, decisions) of individuals. By building ABMs with BD, it is possible to simulate society across many application areas, providing insight into the behavior, interactions, and wider social processes that drive urban systems. This chapter will discuss, in context of urban simulation, how BD can unlock the potential of ABMs, and how ABMs can leverage real value from BD.  In particular, we will focus on how BD can improve an agent’s abstract behavioral representation and suggest how combining these approaches can both reveal new insights into urban simulation, and also address some of the most pressing issues in agent-based modeling; particularly those of calibration and validation.

Keywords: Agent-based models, Big Data, Emergence, Cities.

The growth in agent-based modeling, from search results of Web of Science and Google Scholar.

Hotspots of activity of Twitter users: Tweet locations and associated densities for a selection of prolific users.

Full Reference:
Crooks, A.T., Malleson, N., Wise, S. and Heppenstall, A. (2018), Big Data, Agents and the City, in Schintler, L.A. and Chen, Z. (eds.), Big Data for Urban and Regional Science, Routledge, New York, NY, pp. 204-213. (pdf)

Sunday, August 13, 2017

Predicting the Evolution of Narratives in Social Media

Building on our work on narratives and social media, at the 15th International Symposium on Spatial and Temporal Databases (SSTD'17) we have a paper entitled: "Predicting the Evolution of Narratives in Social Media." In the paper we briefly discuss the challenges that social media poses with respect to understanding narratives and propose a framework that could be used to develop simulation models to predict the spread and evolution of narratives by blending the social, spatial and contextual dimensions of online narratives that are contextually informed by past events. Below you can read the abstract to our paper along with a link to the paper itself.
Abstract. The emergence of global networking capabilities (e.g. social media) has provided newfound mechanisms and avenues for information to be generated, disseminated, shaped, and consumed. The spread and evolution of online information represents a unique narrative ecosystem that is facilitated by cyberspace but operates at the nexus of three dimensions: the social network, the contextual, and the spatial. Current approaches to predict patterns of information spread across social media primarily focus on the social network dimension of the problem. The novel challenge formulated in this work is to blend the social, spatial, and contextual dimensions of online narratives in order to support high fidelity simulations that are contextually informed by past events, and support the multi-granular, reconfigurable and dynamic prediction of the dissemination of a new narrative.

Full Reference:
Schmid, K.A., Zufle, A., Pfoser, D., Crooks, A.T., Croitoru, A. and Stefanidis, A. (2017), Predicting the Evolution of Narratives in Social Media, in Gertz, M., Renz, M., Zhou, X., Hoel, E., Ku, W.-S., Voisard, A., Zhang, C., Chen, H., Tang, L., Huang, Y., Lu, C.-T. and Ravada, S. (eds.) Advances in Spatial and Temporal Databases: Proceedings of the 2017 International Symposium on Spatial and Temporal Databases, Springer, New York, NY, pp. 388-392 (pdf)

Saturday, August 05, 2017

Spatial Agent-based Modeling to Explore Slum Formation Dynamics

In the newly published book edited by Jean-Claude Thill and Suzana Dragicevic entitled "Geocomputational Analysis and Modeling of Regional Systems", Amit Patel, Naoru Koizumi and I have a chapter which explores some of our work with respect to modeling slums in India. The chapter is titled "Spatial Agent-based Modeling to Explore Slum Formation Dynamics in Ahmedabad, India", in which we report some of the work we did as part of our NSF-sponsored project: "An Integrated Simulation Framework to Explore Spatio-temporal Dynamics of Slum Formation in Ahmedabad, India". Below you can see the abstract for the chapter along with some of the figures and a link to the project page.

 "More than 900 million people or one third of the world’s urban population lives in either slum or squatter settlements. Urbanization rates in developing countries are often so rapid that formal housing development cannot meet the demand. In the past decades, international, national and local development communities have taken several policy actions in an attempt to improve the living conditions of people within slums or to eradicate them completely. However, such policies have largely failed and slum-free cities have remained a distant goal for many developing countries. This chapter argues that for informed policymaking, it is important to investigate questions related to slum formation such as: (1) How do slums form and expand? (2) Where and when are they formed? (3) What types of structural changes and/or policy interventions could improve housing conditions for the urban poor? In order to address these questions, this chapter develops a geosimulation model that is capable of exploring the spatio-temporal dynamics of slum formation and simulating future formation and expansion of slums within cities of the developing world. Our geosimulation model integrates agent-based modeling (ABM) and Geographic Information System (GIS), methods that are often applied separately to explore slums. In our model, ABM simulates human behavior and GIS provides a spatial environment for the housing market. GIS is also used to analyze empirical data using spatial analyses techniques, which is in turn used to validate the model outputs. The core of this framework is a linked dynamic model operating at both micro and macro geographic and demographic scales. The model explores the collective effect of many interacting inhabitants of slums as well as non-slum actors (e.g. local government) and how their interactions within the spatial environment of the city generate the emergent structure of slums at the macro scale. 
We argue that when empirical data is absent, geosimulation provides useful insights to study implications of various policies. The goal of this framework is to develop a decision support tool that could allow urban planners and policymakers to experiment with new policy ideas ex-ante in a simulated environment. We calibrate and validate the model using data from Ahmedabad, the sixth largest city of India, where 41% of its population lives in slums. This is one of the first attempts to develop an integrated and multi-scalar analytical framework to tackle slum issues in the developing world at multiple spatial scales."
Keywords: Slums; Agent-based modeling; India; Geosimulation.

Integrated Simulation Framework
Slum Locations and Slum Sizes in Ahmedabad, 2001

Spatial Sprawl Experiment

Full Reference:
Patel, A., Crooks, A.T. and Koizumi, N. (2018). “Spatial Agent-based Modeling to Explore Slum Formation Dynamics in Ahmedabad, India” in Thill, J.C. and Dragicevic, S. (eds.), Geocomputational Analysis and Modeling of Regional Systems, Springer, New York, NY, pp. 121-141. (pdf)

Further details of the model and project can be found here. As normal any thoughts and comments are most welcome.

Monday, July 31, 2017

Travel Times, cost distances and more in NetLogo

Just a short post to highlight Rohan Fisher's excellent website demonstrating and sharing a number of NetLogo models. One such example is shown below, which integrates GIS cost distance analysis to explore access to services via travel times (click here to read the full paper).

In other examples, Rohan shows how NetLogo can be used to explore the spread of fire or how cane toads can colonize new areas, as shown in the movie below. More information about Rohan's work can be found on his website and his YouTube channel.

Friday, June 30, 2017

From Cyber Space to Physical Space Disease Outbreaks

At the upcoming 2017 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS 2017 for short), Xiaoyi Yuan and I will present a paper entitled: "From Cyber Space Opinion Leaders and the Diffusion of Anti-vaccine Extremism to Physical Space Disease Outbreaks". In the paper we explore how online discussions about vaccinations can potentially impact the spread of a disease. Below you can read the abstract to our paper, and see the basic model logic and a movie of a single simulation. If you are interested in finding out more about the model or running it yourself, you can do so here:

Measles is one of the leading causes of death among young children. In many developed countries with high measles, mumps, and rubella (MMR) vaccine coverage, measles outbreaks still happen each year. Previous research has demonstrated that what underlies the paradox of high vaccination coverage and measles outbreaks is the ineffectiveness of “herd immunity”, which has the false assumption that people are mixing randomly and there’s equal distribution of vaccinated population. In reality, the unvaccinated population is often clustered instead of not equally distributed. Meanwhile, the Internet has been one of the dominant information sources to gain vaccination knowledge and thus has also been the locus of the “anti-vaccine movement”. In this paper, we propose an agent-based model that explores sentiment diffusion and how this process creates anti-vaccination opinion clusters that leads to larger scale disease outbreaks. The model separates cyber space (where information diffuses) and physical space (where both information diffuses and diseases transmit). The results show that cyber space anti-vaccine opinion leaders have such an influence on anti-vaccine sentiments diffusion in the information network that even if the model starts with the majority of the population being pro-vaccine, the degree of disease outbreaks increases significantly. 

Keywords: Agent-based modeling; Information networks; Infectious disease transmission.

Full Reference:  
Yuan, X. and Crooks, A.T. (2017), From Cyber Space Opinion Leaders and the Spread of Anti-Vaccine Extremism to Physical Space Disease Outbreaks, in Lee, D., Lin, Y., Osgood, N. and Thomson, R. (eds.) Proceedings of the 2017 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Springer, New York, NY., pp. 114-119. (PDF).

Monday, June 05, 2017

Comparing four modeling approaches using a Susceptible-Infected-Recovered (SIR) epidemic model

Over the years several modeling styles have been developed, but it is often unclear what the differences between them are. In this joint post, we (Yang Zhou and I) would like to compare and contrast four modeling approaches widely used in Computational Social Science, namely: System Dynamics (SD) models, Agent-based Models (ABM), Cellular Automata (CA) models, and Discrete Event Simulation (DES). For a review of the underlying mechanisms and core components of each, readers are referred to Gilbert and Troitzsch's (2005) "Simulation for the Social Scientist".

To compare and contrast the differences in how these models work and how their underlying mechanisms generate outputs, we needed a common problem to test them against with the same set of model parameters. While one could choose a more complex example, here we decided to choose one of the simplest models we know: the spread of a disease using a Susceptible-Infected-Recovered (SIR) epidemic model. Our inspiration for this came from the SD model outlined in the great book “Introduction to Computational Science: Modeling and Simulation for the Sciences” by Shiflet and Shiflet (2014), whose NetLogo implementation is available from the book's accompanying website. For the remaining models (i.e. the ABM, CA, and DES) we created models from scratch in NetLogo. Below we introduce how we built each model, before showing the results from the four models with the same set of parameters, which allows us to compare them. The source code and further documentation for the four models can be found over at Yang Zhou's website and GitHub page.

The System Dynamics Model

In the system dynamics model from Shiflet and Shiflet (2014), one person is infected at the start. Infected people can infect susceptible people: in each iteration, the infected population increases by (number of infected * number of susceptible * InfectionRate * change in time dt). Infected people may also recover: the number of people who recover in an iteration is equal to (number of infected * RecoveryRate * change in time dt). Figure 1 illustrates the system dynamics process while Figure 2 shows the SIR process as a flowchart.

Figure 1. System Dynamics process (source: Shiflet and Shiflet, 2014)

Figure 2. System Dynamics flowchart
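
The two flows described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the NetLogo model from Shiflet and Shiflet (2014): the function name run_sd_sir and the stop rule (stop once fewer than one infected person remains) are our assumptions, while the default values are the parameters used later in this post.

```python
def run_sd_sir(susceptible=2500.0, infected=1.0, infection_rate=0.002,
               recovery_rate=0.5, dt=0.001, max_steps=100_000):
    """Step the SIR stocks forward; returns a list of (S, I, R), one per iteration."""
    recovered = 0.0
    history = [(susceptible, infected, recovered)]
    for _ in range(max_steps):
        # Flow from susceptible to infected during this time step.
        new_infections = infected * susceptible * infection_rate * dt
        # Flow from infected to recovered during this time step.
        new_recoveries = infected * recovery_rate * dt
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
        history.append((susceptible, infected, recovered))
        # Assumed stop rule: fewer than one infected person remains.
        if infected < 1.0:
            break
    return history


if __name__ == "__main__":
    history = run_sd_sir()
    peak_step, peak_infected = max(enumerate(row[1] for row in history),
                                   key=lambda pair: pair[1])
    print("iterations:", len(history) - 1)
    print("peak infected:", round(peak_infected), "at iteration", peak_step)
```

Because there is no randomness, calling run_sd_sir() twice with the same arguments returns identical histories.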

The Agent-based Model

As in the SD model, one agent is infected at the beginning of the simulation. Agents are randomly distributed on the landscape, and at the beginning of each iteration they turn to a random direction and move forward by one cell. During each iteration, an infected agent may infect other agents on the same cell. This differs from the SD model in how the probability of getting infected is defined. In the SD model, the infection rate applies to the entire population. In the ABM, the probability of becoming infected is equal to the infection rate divided by the probability of an agent being in the same cell, multiplied by the change in time. Each infected agent has a probability of recovering in each time period, which equals the recovery rate times the change in time. The equations in the ABM are the following:

P(infection) = InfectionRate / P(same cell) * dt
P(recovery) = RecoveryRate * dt

where P(same cell) = the probability of being on the same cell, which equals 1 divided by the total number of cells, and dt = change in time. Figure 3 illustrates the agent decision process while Figure 4 shows the display of the ABM.

Figure 3. Agent-based Modeling: agent decision process

Figure 4. Display of the ABM. Green = susceptible. Red = infected. Blue = recovered.
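
To make the agent rules concrete, here is a rough Python sketch of the process described above. It is not the NetLogo code: the 50 by 50 wrapping grid, the names infection_probability and run_abm_sir, and the simplification that an agent steps into one of its eight neighbouring cells (rather than turning to a continuous heading) are all our assumptions for illustration.

```python
import random

# Assumed simplification: an agent steps into one of its eight neighbouring
# cells instead of turning to a continuous heading and moving forward.
MOVES = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]


def infection_probability(infection_rate, n_cells, dt):
    """InfectionRate / P(same cell) * dt, with P(same cell) = 1 / n_cells."""
    p_same_cell = 1.0 / n_cells
    return infection_rate / p_same_cell * dt


def run_abm_sir(n_susceptible=2500, width=50, height=50, infection_rate=0.002,
                recovery_rate=0.5, dt=0.001, max_steps=1000, seed=None):
    """Return a list of (S, I, R) counts, one tuple per iteration."""
    rng = random.Random(seed)
    p_infect = infection_probability(infection_rate, width * height, dt)
    p_recover = recovery_rate * dt
    # Each agent is [x, y, state]; agent 0 is the single initial infection.
    agents = [[rng.randrange(width), rng.randrange(height), "S"]
              for _ in range(n_susceptible + 1)]
    agents[0][2] = "I"
    counts = []
    for _ in range(max_steps):
        # 1) Every agent moves one cell in a random direction (world wraps).
        for agent in agents:
            dx, dy = rng.choice(MOVES)
            agent[0] = (agent[0] + dx) % width
            agent[1] = (agent[1] + dy) % height
        # 2) Index susceptible agents by the cell they now occupy.
        susceptible_by_cell = {}
        for agent in agents:
            if agent[2] == "S":
                susceptible_by_cell.setdefault((agent[0], agent[1]), []).append(agent)
        # 3) Agents infected at the start of the step may infect cell-mates,
        #    then may recover.
        for agent in [a for a in agents if a[2] == "I"]:
            for other in susceptible_by_cell.get((agent[0], agent[1]), []):
                if other[2] == "S" and rng.random() < p_infect:
                    other[2] = "I"
            if rng.random() < p_recover:
                agent[2] = "R"
        states = [a[2] for a in agents]
        counts.append((states.count("S"), states.count("I"), states.count("R")))
        if counts[-1][1] == 0:  # stop once nobody is infected
            break
    return counts
```

One way to read the division by P(same cell): if agents are well mixed, an infected agent shares a cell with roughly S * P(same cell) susceptible agents, so the expected number of new infections per iteration is about I * S * InfectionRate * dt, which is the SD flow.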

The Cellular Automata Model

At the beginning of the simulation, one cell is infected. During each iteration (dt), an infected cell can infect other cells in its Moore neighborhood (i.e. the 8 surrounding cells). The landscape is an n by n square, where n is equal to the square root of the number of people to be created at the beginning of the simulation. Wrapping is enabled both horizontally and vertically. As with the ABM, we would like to map the probability of becoming infected to the one in the SD model. In the CA model, the probability of becoming infected is equal to the infection rate divided by the probability of being in the Moore neighborhood, multiplied by the change in time. Each infected cell has a probability of recovering in each time period, which is the recovery rate multiplied by the change in time. The equations here are:

P(infection) = InfectionRate / P(Moore neighborhood) * dt
P(recovery) = RecoveryRate * dt

Figure 5 shows the changing process of the cells while Figure 6 shows the display of the CA model.

Figure 5. Cellular Automata cell changing process

Figure 6. Display of the CA model. Green = susceptible. Red = infected. Blue = recovered.
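
The cell update described above might look like the following in Python. Again this is a sketch rather than the NetLogo model: the names ca_infection_probability and run_ca_sir are hypothetical, and we assume that P(Moore neighborhood) equals 8 divided by the total number of cells, which the text above does not state explicitly.

```python
import random

# Offsets of the 8 cells in a Moore neighborhood.
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def ca_infection_probability(infection_rate, n, dt):
    """InfectionRate / P(Moore neighborhood) * dt.

    Assumption: P(Moore neighborhood) = 8 / (n * n).
    """
    p_moore = 8.0 / (n * n)
    return infection_rate / p_moore * dt


def run_ca_sir(n=50, infection_rate=0.002, recovery_rate=0.5, dt=0.001,
               max_steps=1000, seed=None):
    """Return a list of (S, I, R) cell counts, one tuple per iteration."""
    rng = random.Random(seed)
    p_infect = ca_infection_probability(infection_rate, n, dt)
    p_recover = recovery_rate * dt
    grid = [["S"] * n for _ in range(n)]
    grid[rng.randrange(n)][rng.randrange(n)] = "I"  # one infected cell at start
    counts = []
    for _ in range(max_steps):
        new_grid = [row[:] for row in grid]  # update all cells synchronously
        for r in range(n):
            for c in range(n):
                if grid[r][c] != "I":
                    continue
                # Each infected cell tries to infect its susceptible neighbors;
                # the modulo implements horizontal and vertical wrapping.
                for dr, dc in OFFSETS:
                    rr, cc = (r + dr) % n, (c + dc) % n
                    if (grid[rr][cc] == "S" and new_grid[rr][cc] == "S"
                            and rng.random() < p_infect):
                        new_grid[rr][cc] = "I"
                if rng.random() < p_recover:
                    new_grid[r][c] = "R"
        grid = new_grid
        flat = [state for row in grid for state in row]
        counts.append((flat.count("S"), flat.count("I"), flat.count("R")))
        if counts[-1][1] == 0:  # stop once no cell is infected
            break
    return counts
```

Since an infection can only reach the eight cells around an already infected cell, the disease has to spread as a front across the grid, unlike in the well-mixed SD model.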

The Discrete Event Simulation Model

In a Discrete Event Simulation model (aka queuing model), there are three abstract types of objects: 1) servers, 2) customers, and 3) queues, which is quite different from the CA model and the ABM.

To implement an SIR model as a DES, servers are the processes of becoming infected and recovering. The durations people stay with the servers represent the time taken to become infected and to recover. Customers are susceptible people waiting to be infected and infected people waiting to recover. We assume there are two queues in this model. As susceptible objects (i.e. individuals) are created, they form a queue for infection while waiting to be infected. As people become infected, they form a second queue waiting to recover. During each iteration (dt), each object in the infection queue has a probability of becoming infected. Each infected object has a probability of recovering, which is equal to RecoveryRate. After individuals recover, they enter the sink of recovered people. The equations can be written as follows:

The whole process is illustrated in Figure 7.
Figure 7. Discrete Event Simulation process.
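
A minimal Python sketch of the two-queue process is given below. It rests on two assumptions of ours that the description above does not spell out: the per-iteration infection probability is taken to be InfectionRate * (number currently infected) * dt, and the recovery probability is scaled by dt as in the other models. The function name run_des_sir is hypothetical.

```python
import random
from collections import deque


def run_des_sir(n_susceptible=2500, infection_rate=0.002, recovery_rate=0.5,
                dt=0.001, max_steps=1000, seed=None):
    """Return a list of (S, I, R) counts, one tuple per iteration."""
    rng = random.Random(seed)
    infection_queue = deque(range(1, n_susceptible + 1))  # waiting to be infected
    recovery_queue = deque([0])                           # waiting to recover
    recovered_sink = []                                   # finished customers
    counts = []
    for _ in range(max_steps):
        # Assumption: everyone in the infection queue faces the same chance,
        # which grows with the number of people currently infected.
        p_infect = min(1.0, infection_rate * len(recovery_queue) * dt)
        # Assumption: recovery chance per iteration is RecoveryRate * dt.
        p_recover = recovery_rate * dt

        still_susceptible = deque()
        newly_infected = []
        for person in infection_queue:
            if rng.random() < p_infect:
                newly_infected.append(person)
            else:
                still_susceptible.append(person)

        still_infected = deque()
        for person in recovery_queue:
            if rng.random() < p_recover:
                recovered_sink.append(person)
            else:
                still_infected.append(person)

        # People infected this iteration join the back of the recovery queue.
        still_infected.extend(newly_infected)
        infection_queue, recovery_queue = still_susceptible, still_infected
        counts.append((len(infection_queue), len(recovery_queue), len(recovered_sink)))
        if not recovery_queue:  # stop once nobody is infected
            break
    return counts
```

Note that nothing in this sketch depends on where an individual is located, only on how many individuals sit in each queue.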

Results from the Implementations

Now that the models have been briefly described, we turn to how using the same set of parameters leads to different results. The default parameters used in each model are: number of susceptible people at setup = 2500, Infection Rate = 0.002, Recovery Rate = 0.5, and change of time (dt) = 0.001, and the number of people in each status is recorded. Since the SD model has no randomness and will always give the same result, it is run only once. Each of the other three models was run 10 times (feel free to run them more if you wish), and we then took the average of the ten results and show them in Figure 8. The stop condition is that no individual is left to be infected.

Figure 8. Results for the different models. Clockwise from top left: SD model, ABM, DES and CA

In the four models, we observe the same pattern: the number of susceptible people decreases, the number of infected people first increases and then decreases again, and the number of recovered people increases over time. However, each model realization also shows a lot of differences in how such patterns play out.

First of all, the SD model has the smallest number of iterations before no one is infected. The number of iterations shown on the graph is the average of the ten runs, since individual runs vary in length (except for the SD model, which only has one run). The SD model took only 17451 iterations to stop, while the ABM took 19145 iterations and the DES model took 18645 iterations (both on average). The CA model took the longest for no more individuals to be infected: 25680 iterations on average.

The results of the SD, ABM and DES models appear to be very similar to each other, in the sense that the number of infected people increases fast at first and reaches a peak of over 1500 after more than 2000 iterations (2272 for SD, 2403 for ABM, 2538 for DES). In the CA model, on the other hand, the number of infected people increases much more slowly due to the diffusion mechanism of the CA model and never reaches a number as high as in the other models.

An important characteristic of the SD model is that there is no randomness in the model, so no matter how many times you run it, you will get the same result. In the other three models, becoming infected or recovering always depends on a probability function, so every run is different.

Furthermore, people in the SD model and the DES model are homogeneous: everyone has the same probability of becoming infected or recovering from an infection. Although these rates change over time, they do not vary among the different people in the population. On the other hand, in the ABM and the CA model, people (represented by moving agents or static cells) are heterogeneous in the sense that they have different locations. Only susceptible people around an infected individual can be infected. It is interesting that when people can move around, as in the ABM, the result is similar to the SD model, though the ABM takes a little more time for everyone to recover (19145 iterations in the ABM vs. 17451 iterations in the SD model). When people are static and the number of people in the same space is limited (one person per cell in this case), as in the CA model, the infection process becomes slower and it takes longer for everyone to recover.

To test how sensitive the models are to a specific parameter, we now present what happens if we increase the infection rate in each model from 0.002 to 0.02, with the results shown in Figure 9. As is to be expected, as the infection rate increases, the number of susceptible people decreases at a much faster rate. However, the SD, ABM, and DES models are still similar to each other, while the infection in the CA model is slower. The average numbers of iterations for these models are: 15807 (SD), 15252 (ABM), 16937 (CA), 16677 (DES). By increasing the infection rate the total number of iterations of each model has decreased, with the CA model still taking the longest time to converge. The peaks of infected people in each model are on average: 2363 people at 255 iterations (SD), 2310 people at 363 iterations (ABM), 2035 people at 1019 iterations (CA), 2340 people at 286 iterations (DES). The CA model takes a longer time and reaches a lower peak.

Figure 9. Results for the different models with infection rate = 0.02. Clockwise from top left: SD model, ABM, DES and CA.

These models are only simple examples of how an SIR model can be implemented with different modeling techniques. In reality, if we were to model disease propagation in more detail, we would need to consider many other things, such as the fact that people can be both moving through space (e.g. traveling to work) and static (e.g. staying at home), and that the capacity of each cell is always limited to some amount.

Gilbert, N. and Troitzsch, K.G. (2005), Simulation for the Social Scientist (2nd Edition), Open University Press, Milton Keynes, UK.

Shiflet, A.B. and Shiflet, G.W. (2014), Introduction to Computational Science: Modeling and Simulation for the Sciences (2nd Edition), Princeton University Press, Princeton, NJ.
For more information about the models, and to download them, please visit Yang Zhou's website.

Wednesday, May 31, 2017


It is always a great pleasure to teach and work with students, and see them complete their academic program. Over this last academic year, I supervised three master's students and served on the committee of another one, all of whom successfully completed their Master of Arts in Interdisciplinary Studies (MAIS) with a concentration in Computational Social Science (CSS). To quote from the MAIS in CSS website:
"Computational Social Science (CSS) is an interdisciplinary science in which social science questions are investigated with modern computational tools. Our program was the first CSS MA in the world, and continues to advance the study of social science through computational methods (e.g. agent-based modeling, social network analysis, and big data).

Our faculty members are internationally recognized for their pioneering work in CSS, including authoring the first textbook in the field, and have written numerous books and articles on topics such as growing artificial societies, modeling geographical systems, and sustainability. Research in the program is and has been funded by the National Science Foundation, United States Agency for International Development, National Geospatial-Intelligence Agency, the Defense Threat Reduction Agency, the Defense Advanced Research Projects Agency, and NASA.

Besides taking introductory classes in theories and practices of social, geo-social, economic, and network modeling, you will have the opportunity to work one-on-one with faculty on your project or thesis of interest, as well as directed readings. Additionally, Mason’s proximity to the Washington, D.C., area provides an excellent opportunity to attend seminars offered by NGOs, visiting professors, and government employees.

Students range from recent college graduates to mid-career professionals who bring diverse knowledge that enhances the classroom experience. Graduates have gone on to pursue their doctorates at Mason and other Carnegie Classification Research 1 universities. Others have pursued or continued their career in government or the private sector, in organizations such as the U.S. Army, MapR Technologies, CACI, Logistics Management Institute, and Ninja Analytics, Inc.

To get the latest information on our program, visit us on Facebook or our program page."
Below is a selection of projects from this academic year. Eric Hansen's project was entitled "An Agent-Based Model of British And Boer Small Arms and Tactics During the Second Anglo-Boer War", in which he explored how different military technology had an impact on military victories.

In another project, Paul Cummings explored different strategies for combating radicalism (i.e. Security Risk model and Socio-Economic Hardship model) via an agent-based model under the title of "Modeling the Characteristics of Radical Ideological Growth using an Agent based Model Methodology".

Marta Hansen's final project was entitled "Positive Affect And Prospect Theory In Agent_Zero: A Model Extension" which extends Joshua Epstein’s Agent_Zero model to allow for cooperative events to take place.

Just to highlight that not all students opt for agent-based models, Devin Bright undertook a project entitled "Mapping the Human Terrain of a Modern Megacity with the use of Social Media", in which he explored how a year's worth of social media data can be mined and analyzed via GIS and social network analysis (SNA) to give insights into the dynamics of New York City in the United States and Lagos in Nigeria.

Thursday, April 20, 2017

Zika in Twitter: Health Narratives

In the paper we explored how health narratives and event storylines pertaining to the recent Zika outbreak emerged in social media and how they related to news stories and actual events.

Specifically, we combined actors (e.g. Twitter users), locations (e.g. where the tweets originated) and concepts (e.g. emerging narratives such as pregnancy) to gain insights on the mechanisms that drive participation, contributions, and interactions on social media during a disease outbreak. Below you can read a summary of our paper along with some of the figures which highlight our methodology and findings.

An overview of the Twitter narrative analysis approach, starting with data collection, and proceeding with preprocessing and data analysis to identify narrative events, which can be used to build an event storyline.

Background: The recent Zika outbreak witnessed the disease evolving from a regional health concern to a global epidemic. During this process, different communities across the globe became involved in Twitter, discussing the disease and key issues associated with it. This paper presents a study of this discussion in Twitter, at the nexus of location, actors, and concepts.
Objective: Our objective in this study was to demonstrate the significance of 3 types of events: location related, actor related, and concept-related for understanding how a public health emergency of international concern plays out in social media, and Twitter in particular. Accordingly, the study contributes to research efforts toward gaining insights on the mechanisms that drive participation, contributions, and interaction in this social media platform during a disease outbreak.
Methods: We collected 6,249,626 tweets referring to the Zika outbreak over a period of 12 weeks early in the outbreak (December 2015 through March 2016). We analyzed this data corpus in terms of its geographical footprint, the actors participating in the discourse, and emerging concepts associated with the issue. Data were visualized and evaluated with spatiotemporal and network analysis tools to capture the evolution of interest on the topic and to reveal connections between locations, actors, and concepts in the form of interaction networks. 
Results: The spatiotemporal analysis of Twitter contributions reflects the spread of interest in Zika from its original hotspot in South America to North America and then across the globe. The Centers for Disease Control and World Health Organization had a prominent presence in social media discussions. Tweets about pregnancy and abortion increased as more information about this emerging infectious disease was presented to the public and public figures became involved in this. 
Conclusions: The results of this study show the utility of analyzing temporal variations in the analytic triad of locations, actors, and concepts. This contributes to advancing our understanding of social media discourse during a public health emergency of international concern.

Keywords: Zika Virus; Social Media; Twitter Messaging; Geographic Information Systems.

Spatiotemporal participation patterns and identifiable clusters over 4 weeks of our 12-week study. The top left panel shows the data during the first week, and time progresses from left to right and from top to bottom.

Subsets of the full retweet network pertaining to the WHO (left) and CDC (right), and clusters identified within them. Magenta clusters are centered upon health entities, green upon news organizations, orange upon political entities.

Visualizing a narrative storyline across locations (blue), actors (red), and concepts (green).

Full Reference:
Stefanidis, A., Vraga, E., Lamprianidis, G., Radzikowski, J., Delamater, P.L., Jacobsen, K.H., Pfoser, D., Croitoru, A. and Crooks, A.T. (2017). “Zika in Twitter: Temporal Variations of Locations, Actors, and Concepts”, JMIR Public Health and Surveillance, 3 (2): e22. (pdf)

As normal, any feedback or comments are most welcome. 

Saturday, April 08, 2017

Talk from the AAG

For the last few days I have been attending the Association of American Geographers (AAG) Annual Meeting in Boston. A common theme at the AAG sessions I attended (to me at least) seemed to be the rise of new sources of data, which give us new ways to explore geographical problems, and the challenges of working with bigger data sets. Perhaps this was most explicitly expressed in the Geographic Data Science sessions, which were pitched to be at the nexus of data science and geography.

While at the meeting I participated in a panel under the theme of "Geographic Data Science" and, as part of the Symposium on Human Dynamics in Smart and Connected Communities, co-organized two sessions entitled Agents - the 'atomic unit' of social systems?, which also included Agent-Bingo. Finally, I gave a presentation of our current research at Mason, entitled "Megacities through the Lens of Computational Social Science"; more details can be seen below. For those wanting to know more on the synthetic population generation, click here.

Geographic Data Science Panel

Megacities through the Lens of Computational Social Science


Currently there are over 35 megacities, cities with over 10 million inhabitants, and the number of such cities is expected to grow in the coming years. These habitats represent many challenges from an agent-based modeling perspective. Their size and density, the diverse behaviors of their inhabitants, and their evolving social network of communities along with multiple interacting subsystems need to be understood, captured and modeled. To capture and link the dynamics that shape and form these systems, we must grapple with them in their entirety. While there have been many models applied to specific subsystems of megacities (e.g. traffic, disease spread, urban growth etc.), their interactions often go untouched.

The lens of computational social science (CSS), the interdisciplinary science of complex social systems and their investigation through computational modeling and related techniques, can be used to understand and model megacities. Given the advances in computational power and the availability of fine scale datasets, what are the opportunities offered to us with respect to exploring megacities? In an attempt to answer this question we will demonstrate how new sources of data (e.g. volunteered geographical information) can be fused with more traditional data (e.g. census data) to create the basis of a megacity model both in terms of its physical environment and its social environment. We will then show results from a simulated disaster that explores how people potentially react and behave in response to the evolving crisis within a megacity.

Keywords: Megacities, GIS, Agent-based modeling, Social Networks, Behavior

Full Reference:
Crooks, A.T., Kennedy, W.G., Burger, A., Oz, T. and Heppenstall, A. (2017), Megacities through the Lens of Computational Social Science, The Association of American Geographers (AAG) Annual Meeting, 5th-9th April, Boston, MA. (pdf)

Tuesday, April 04, 2017

Smart Cities in IEEE Pervasive Computing

We are excited to announce that the special issue that we organized for IEEE Pervasive Computing is now out. The special issue is entitled "Smart Cities" and demonstrates the state of the art of pervasive computing technologies that collect, monitor, and analyze various aspects of urban life. The articles and departments in the special issue highlight the coming revolution in urban data via some of the different approaches researchers are taking to build tools and applications to better inform decision making (to reduce energy consumption or improve visitor flows, for example). Such research will be critical to setting goals for sustainable urban development within different global contexts. We need to better understand cities and their underlying systems if we want to improve the quality of urban life. To this end, in the special issue we have an introduction (editorial) followed by a number of articles, an interview and a research spotlight.
We hope you enjoy them. Thank you to the authors who submitted papers, the reviewers, Rob Kitchen for giving an interview and Barbara Lenz and Dirk Heinrichs for discussing their research. Lastly, we would also like to thank the IEEE Pervasive Computing team for ensuring that the special issue came to fruition.

Full Reference to the Introduction: 
Crooks, A.T., Schechtner, K., Day, A.K. and Hudson-Smith, A. (2017), Creating Smart Buildings and Cities, IEEE Pervasive Computing, 16 (2): 23-25. (pdf)

Friday, March 10, 2017

Geovisualization of Social Media

Figure 1: Map mashup of Twitter data, where each dot represents a tweet; the text corresponds to the selected tweet marked with a star.
In the recently released "The International Encyclopedia of Geography: People, the Earth, Environment, and Technology" we were asked to write a brief entry entitled "geovisualization of social media". Below is a summary of  our chapter:

The proliferation of social media over the last decade is presenting substantial computational challenges associated with the management, processing, analysis and visualization of the corresponding massive volumes of data. Furthermore, this new form of information also imposes new-found challenges upon the geographical community due to the unique nature of its content, as analyzing such data calls for a hybrid mix of spatial and social analysis. The spatial content of social media comprises primarily coordinates from which the contributions originate, or references to specific locations. At the same time, these data have a strong social component, as they can reveal the underlying social structure of the user community through manifestations of their interactions. Analyzing both the spatial and social content of social media feeds is referred to as geosocial analysis. Within this entry we explore the geovisualization opportunities and challenges that are emerging as social media are becoming the subject of study of the geographical community.
In more detail, we start off discussing how the geographic content of social media feeds represents a new type of geographic information. It transcends the early definitions of crowdsourcing or volunteered geographic information as it is not the product of a process through which citizens explicitly and purposefully contribute geographic information to update or expand geographic databases. Instead, the type of geographic information that can be harvested from social media feeds can be referred to as Ambient Geographic Information; it is embedded in the content of these feeds, often across the content of numerous entries rather than within a single one, and has to be somehow extracted. Nevertheless, it is of great importance as it communicates instantaneously information about emerging issues. At the same time, it provides an unparalleled view of the complex social networking and cultural dynamics within a society, and captures the temporal evolution of the human landscape.

In many cases, the geovisualization of social media feeds predominantly takes the appearance of web map mashups, in essence portraying the location of social media usage on a map. Such an early attempt to visualize social media is shown in Figure 1. We argue that while this approach is informative, it often falls short of capturing the depth, richness, and complexity of the information that can be gleaned from social data. As a result, a need arises for more advanced geovisualization approaches that are capable of better capturing and communicating the complexity and multidimensionality of social media. And this is the focus of our chapter. We discuss briefly the geovisualization of network structures (such as shown in Figure 2), the geovisualization of network structure dynamics, the geovisualization of social media content (such as shown in Figure 3) along with the visualization of social media analysis (Figure 4) and conclude the chapter with a list of emerging research challenges.

Figure 2: Visualizing communities: a social network of an interest group (A), and the geovisualization of the largest community shown over the contiguous U.S. (B).

Figure 3: Visualizing social media content dynamics by coupling a Twitter stream viewer (A), a Twitter activity density map (B), ranked lists of top hash-tags (C) and top authors (E), a time slider (D), and author/hash-tag time series graphs.
Figure 4: Visualizing spatiotemporal clusters of tweets following the 2013 Boston bombing. Red circles indicate the approximate radius of each cluster, and color is used to indicate time.

We hope you enjoy. As always, any feedback or comments are most welcome. Please note this chapter was written a couple of years ago and we have done more recent work since; click here to see some.

Full Reference:
Croitoru, A., Crooks, A.T., Radzikowski, J. and Stefanidis, A. (2017), Geovisualization of Social Media, in Richardson, D., Castree, N., Goodchild, M. F., Kobayashi, A. L., Liu, W. and Marston, R. (eds.), The International Encyclopedia of Geography: People, the Earth, Environment, and Technology, Wiley Blackwell. DOI: 10.1002/9781118786352.wbieg0605 (PDF)

Thursday, March 09, 2017

Cellular Automata

In the recently released "The International Encyclopedia of Geography: People, the Earth, Environment, and Technology" I was asked to write a brief entry on "Cellular Automata". Below is the abstract to my chapter, along with some of the images I used in my discussion and the full reference to the chapter.

Cellular Automata (CA) are a class of models where one can explore how local actions generate global patterns through well specified rules. In such models, decisions are made locally by each cell, and the cells are often arranged on a regular lattice. The patterns that emerge, be it urban growth or deforestation, are not coordinated centrally but arise from the bottom up. Such patterns emerge through each cell changing its state based on specific transition rules and the states of its surrounding cells. This entry reviews the principles of CA models, provides a background on how CA models have developed, and explores a range of applications where they have been used within the geographical sciences, prior to concluding with future directions for CA modeling.

The figures below are a sample from the entry. For example, we outline different types of spaces within CA models such as those shown in Figures 1 and 2. We also show how simple rules can lead to the emergence of patterns such as the Game of Life as shown in Figure 3 or Rule 30 as shown in Figure 4.
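To make the idea of local transition rules concrete, here is a minimal Python sketch of one Game of Life update. It is not taken from the entry: the function name `life_step` and the choice to store live cells as a set of (row, col) tuples on an unbounded lattice are assumptions made for illustration. It uses the standard Game of Life rules (a dead cell with exactly three live neighbors becomes alive; a live cell with two or three live neighbors stays alive) and the eight-cell Moore neighborhood.

```python
from collections import Counter


def life_step(alive):
    """Advance the Game of Life by one generation.

    `alive` is a set of (row, col) tuples marking live cells on an
    unbounded lattice (an illustrative representation, not the entry's).
    """
    # For every cell adjacent to a live cell, count how many of its
    # eight Moore neighbors are currently alive.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in alive
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth with exactly 3 live neighbors; survival with 2 or 3.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in alive)
    }


# A "blinker": three live cells in a row flip between horizontal and vertical.
blinker = {(1, 0), (1, 1), (1, 2)}
next_generation = life_step(blinker)
```

Each cell only ever looks at its immediate neighbors, yet repeated application of `life_step` produces oscillators, gliders and other global patterns, which is the bottom-up behavior the abstract describes.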

Figure 1: Two-Dimensional Cellular Automata Neighborhoods

Figure 2: Voronoi Tessellations Of Space Where Each Polygon Has A Different Number Of Neighbors Based On A Shared Edge.

Figure 3: Example of Cells Changing State from Dead (White) To Alive (Black) Over Time Depending On The States of its Neighboring Cells.

Figure 4: A One-Dimensional CA Model Implementing “Rule 30” Where Successive Iterations Are Presented Below Each Other.
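A one-dimensional model like the one in Figure 4 can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce the figure: the function names, the default width and number of steps, and the assumption that cells beyond the edges are always 0 are all choices made here. Rule 30 itself is standard: a cell's next state is its left neighbor XOR (itself OR its right neighbor).

```python
def rule30_step(cells):
    """Apply Rule 30 once to a list of 0/1 cells.

    Cells beyond either edge are treated as 0 (an assumption of this sketch).
    """
    padded = [0] + cells + [0]
    # Rule 30: new state = left XOR (center OR right).
    return [
        padded[i - 1] ^ (padded[i] | padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def run_rule30(width=11, steps=5):
    """Start from a single live cell in the middle and return every row."""
    row = [0] * width
    row[width // 2] = 1
    history = [row]
    for _ in range(steps):
        row = rule30_step(row)
        history.append(row)
    return history


# Print successive iterations below each other, as in the figure.
for row in run_rule30():
    print("".join("#" if cell else "." for cell in row))
```

Increasing `width` and `steps` together lets the irregular triangular pattern grow without being cut off at the edges.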

Full Reference:
Crooks, A.T. (2017), Cellular Automata, in Richardson, D., Castree, N., Goodchild, M. F., Kobayashi, A. L., Liu, W. and Marston, R.  (eds.), The International Encyclopedia of Geography: People, the Earth, Environment, and Technology, Wiley Blackwell. DOI: 10.1002/9781118786352.wbieg0578. (pdf)

Monday, February 27, 2017

Agents - the 'atomic unit' of social systems? @AAG 2017

As part of the Symposium on Human Dynamics in Smart and Connected Communities at the forthcoming AAG Annual Meeting in Boston we have organized 2 sessions under the title of "Agents - the 'atomic unit' of social systems?" (session IDs 4169 & 4269). These will be held on Saturday, 4/8/2017, from 8:00 am to 11:40 am (we did not choose this time slot). Below you can see the session description and the list of speakers and titles. We hope some of the readers of this blog can make it to the sessions.

Session Description

By defining a social system as a collection of agents, individuals and their behaviors/decisions become the driving force of these systems. Complex global phenomena such as collective behaviors, extensive spatial patterns, and hierarchies are manifested through agent interaction in such a way that the actions of the parts do not simply sum to the activity of the whole. This allows unique perspectives into the inner workings of social systems, making agent-based modelling (ABM) a powerful and appealing tool for understanding the drivers of these systems and how they may change in the future.

What is noticeable from recent applications of ABM is the increase in complexity (richness and detail) of the agents, a factor made possible through new data sources and increased computational power. While there has always been 'resistance' to the notion that social scientists should search for some 'atomic element or unit' of representation that characterizes the geography of a place, the shift from aggregate to individual marks agents as a clear contender to fulfill the role of 'atom' in social simulation modelling. However, there are a number of methodological challenges that need to be addressed if ABM is to fully realize its potential and be recognized as a powerful tool for policy modelling in key societal issues. Most pressing are methods to accurately identify, represent, and evaluate key behaviors and their drivers in ABM.

This session will present papers that contribute towards this wide discussion, ranging from epistemological perspectives on the place of ABM and extracting behavior from novel and established data sets, to new, intriguing applications, to establishing robustness in calibrating and validating ABMs.


  • Andrew Crooks, Department of Computational and Data Sciences, George Mason University.
  • Alison Heppenstall, School of Geography, University of Leeds.
  • Nick Malleson, School of Geography, University of Leeds.
  • Paul Torrens, Department of Computer Science and Engineering, Tandon School of Engineering, New York University.
  • Sarah Wise, Centre for Advanced Spatial Analysis (CASA), University College London.

4169 Symposium on Human Dynamics in Smart and Connected Communities: Agents - the 'atomic unit' of social systems? 1 

Saturday, 4/8/2017, from 8:00 AM - 9:40 AM in Regis, Marriott, Third Floor

Chair: Nick Malleson


4269 Symposium on Human Dynamics in Smart and Connected Communities: Agents - the 'atomic unit' of social systems? 2 

Saturday, 4/8/2017, from 10:00 AM - 11:40 AM in Regis, Marriott, Third Floor

Chair: Alison Heppenstall 


We hope you will stay around and attend these sessions. See you in Boston.