Wednesday, December 05, 2018

Detecting and Mapping Slums using Open Data

Urban and slum areas in Nairobi (false composite image created by stacking image bands 7, 6 and 4 from the Landsat 8 satellite).
Turning back to slums, we just published a paper entitled "Detecting and Mapping Slums using Open Data: A Case Study in Kenya" in the International Journal of Digital Earth. This work builds on and extends our previous research on using new sources of data to explore slum settlements in three cities in Kenya (i.e. Nairobi, Mombasa and Kisumu). Specifically, we examine how the fusion of Volunteered Geographical Information, Social Media, and other open data sources can complement remote sensing imagery in supporting slum detection, mapping and monitoring.

We do this by using data mining tools (e.g. logistic regression, discriminant analysis and the See5 decision tree) to develop context-sensitive definitions for slums based on location, and to test the generalizability of indicators and derived slum models. The end result is an indicator database for slums using open sources of physical and socio-economic data that can be used to characterize slum settlements. If you wish to know more, below we provide the abstract to the paper along with some of the figures and the full citation with a link to the paper itself.
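To give a flavor of how one of these tools turns indicators into a slum / non-slum label, here is a minimal sketch of logistic regression fitted by gradient descent. It is not the model from the paper: the three indicators, the toy grid cells, and the function names (`fit_logistic`, `predict`) are all hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(weights, x):
    """Probability that a cell with indicator values x is a slum."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Hypothetical indicators per grid cell, scaled to 0..1 (made-up values):
# (vegetation index, road density, building density)
cells = [
    (0.10, 0.20, 0.90), (0.15, 0.10, 0.85), (0.20, 0.25, 0.80),
    (0.70, 0.80, 0.30), (0.65, 0.70, 0.35), (0.80, 0.90, 0.20),
]
is_slum = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(cells, is_slum)
predictions = [1 if predict(weights, c) >= 0.5 else 0 for c in cells]
print(predictions)
```

In practice the indicator set would come from the open data sources described above, and the fitted model would be evaluated on cells it was not trained on.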

Abstract:
The worldwide slum population currently stands at over one billion, with substantial growth expected in the coming decades. Traditionally, slums have been mapped using information derived mainly from either physical indicators using remote sensing data, or socio-economic indicators using census data. Each data source on its own provides only a partial view of slums, an issue further compounded by data poverty in less developed countries. To overcome such issues, this paper explores the fusion of traditional with emerging open data sources and data mining tools to identify additional indicators that can be used to detect and map the presence of slums, map their footprint, and map their evolution. Towards this goal, we develop an indicator database for slums using open sources of physical and socio-economic data that can be used to characterize slum settlements. Using this database, we then leverage data mining techniques to identify the most suitable combination of these indicators for mapping slums. Using three cities in Kenya as test cases, results show that the fusion of these data can improve the mapping accuracy of slums. These results suggest that the proposed approach can provide a viable solution to the emerging challenge of monitoring the growth of slums.
Keywords: Slums; Remote Sensing; Socio-economic; Urban sustainability; Data mining; Kenya

Study areas in Kenya

Methodology workflow

Distribution of positive classified cases for slums for (a) logistic regression, (b) discriminant analysis and (c) the See5 decision tree.
Full Reference:
Mahabir, R., Agouris, P., Stefanidis, A., Croitoru, A. and Crooks, A.T. (2018), Detecting and Mapping Slums using Open Data: A Case Study in Kenya, International Journal of Digital Earth. DOI: https://doi.org/10.1080/17538947.2018.1554010. (pdf)

Bots in Social Networks

Recently our research has started to dig deeper into social media, especially how bots diffuse information in online social networks (OSNs). To this end, at the 7th International Conference on Complex Networks and Their Applications we had a paper entitled "Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks".

In this paper we present a framework to characterize the pervasiveness and relative importance of bots in various OSN conversations around three significant global events in 2016. In total, we harvested more than 30 million tweets from the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship and compared the conversational patterns of bots and humans within each event. The results from this analysis showed that although Twitter participants identified as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. If you want to know more about this new work, below we provide the abstract to the paper, a selection of figures and tables (including our methodology, some summary information about our data corpus and some of the results). Finally, at the bottom of this post we have the full reference and a link to the paper.
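The contrast between how rare bots are and how often they appear in the top centrality ranks can be illustrated with a small sketch. It uses simple degree centrality on a made-up retweet edge list; the account names, the edges and the helper functions are hypothetical, and the paper itself compares several centrality measures on a far larger network.

```python
from collections import Counter

def degree_centrality(edges):
    """Count how many retweet edges touch each account (undirected degree)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree

def bot_share_in_top_k(degree, bots, k):
    """Fraction of the k highest-degree accounts that are flagged as bots."""
    top = [account for account, _ in degree.most_common(k)]
    return sum(1 for account in top if account in bots) / len(top)

# Hypothetical toy retweet edges: (retweeter, original author).
edges = [
    ("u1", "bot_a"), ("u2", "bot_a"), ("u3", "bot_a"), ("u4", "bot_a"),
    ("u1", "u5"), ("u2", "u5"), ("u3", "u5"),
    ("u4", "u6"), ("u6", "u7"),
]
bots = {"bot_a"}  # accounts a bot detector has flagged (assumed given)

degree = degree_centrality(edges)
population_share = len(bots) / len(degree)       # bots among all accounts
top_share = bot_share_in_top_k(degree, bots, k=2)  # bots among the top 2
print(population_share, top_share)
```

Comparing `population_share` with `top_share` is the kind of check that shows whether bots are over-represented among the most central accounts.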

Abstract:
The emergence of social bots within online social networks (OSNs) to diffuse information at scale has given rise to many efforts to detect them. While methodologies employed to detect the evolving sophistication of bots continue to improve, much work can be done to characterize the impact of bots on communication networks. In this study, we present a framework to describe the pervasiveness and relative importance of participants recognized as bots in various OSN conversations. Specifically, we harvested over 30 million tweets from three major global events in 2016 (the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship) and compared the conversational patterns of bots and humans within each event. We further examined the social network structure of each conversation to determine if bots exhibited any particular network influence, while also determining bot participation in key emergent network communities. The results showed that although participants recognized as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. This includes the identification of individual bots as top-10 influencer nodes out of a total corpus consisting of more than 2.8 million nodes. 

Keywords: bots, online social networks, social network analysis.
Fig. 1. Overall methodology to analyze bot evidence across multiple Twitter OSN conversations.
Table 1. Harvested Twitter Corpus Overview
Fig. 5. Correlation of centrality measures for select centrality comparisons: (a) U.S. Election eigenvector versus betweenness analysis, (b) Ukraine Conflict eigenvector versus betweenness analysis and (c) Ukraine Conflict eigenvector versus degree analysis.
Table 2. Bot density of largest emergent communities.

Full Reference:
Schuchard, R., Crooks, A.T., Stefanidis, A. and Croitoru, A. (2018), Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks, in Aiello, L.M., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P. and Rocha, L.M. (eds.), Volume 2, Proceedings of the 7th International Conference on Complex Networks and Their Applications, Cambridge, United Kingdom, Springer, pp 424-436. (pdf)

Saturday, November 17, 2018

Procedural City Generation Beyond Game Development

In the current SIGSPATIAL Special Newsletter, whose theme is Urban Analytics and Mobility, Joon-Seok Kim, Hamdi Kavak and I have a paper entitled "Procedural City Generation Beyond Game Development". In the paper we discuss how synthetic urban areas created via procedural city generation, and occupied by agents, could be used to automatically generate data, which could then be used as urban testbeds for applications such as social simulation, self-driving cars and transportation. Specifically, we review procedural city generation from several perspectives: goals, inputs, outputs and methods. This in turn allows us to identify specific issues (e.g., plausibility, level of detail, ease of use) that need to be addressed to sufficiently capture real-world cities and the people who inhabit them. If you want to find out more, below is the abstract to the paper along with the full reference and a link to the paper.

Abstract:
The common trend in the scientific inquiry of urban areas and their populations is to use real-world geographic and population data to understand, explain, and predict urban phenomena. We argue that this trend limits our understanding of urban areas as dealing with arbitrarily collected geographic data requires technical expertise to process; moreover, population data is often aggregated, sparsified, or anonymized for privacy reasons. We believe synthetic urban areas generated via procedural city generation, which is a technique mostly used in the gaming area, could help improve the state-of-the-art in many disciplines which study urban areas. In this paper, we describe a selection of research areas that could benefit from such synthetic urban data and show that the current research in procedurally generated cities needs to address specific issues (e.g., plausibility) to sufficiently capture real-world cities and thus take such data beyond gaming.


Full Reference:
Kim, J-S., Kavak, H. and Crooks, A.T. (2018), Procedural City Generation Beyond Game Development, SIGSPATIAL Special, 10(2), 34-41. DOI: 10.1145/3292390.3292397 (pdf)

Thursday, November 08, 2018

Refugee Camps and Volunteered Geographical Information

Fig. 7. Stimulus-Awareness-Activism (SA2) framework
Previously we have posted on how one can use new sources of data (e.g. Volunteered Geographical Information) to explore and understand the world around us, such as mass migration and urban form and function, or as the basis of a model. Continuing with this research theme, we recently had a paper published in PLoS ONE entitled "News Coverage, Digital Activism, and Geographical Saliency: A Case Study of Refugee Camps and Volunteered Geographical Information."

In this paper we explore the relationship between news coverage (via Google News), search trends (via Google Trends) and user edit contribution patterns in OpenStreetMap and Wikipedia for refugee camps from around the world. Specifically, we are interested in how news media coverage (and in particular digital media) impacts digital activism (i.e. volunteers contributing content to online communities). Based on our analysis, we find that digital activism bursts tend to take place during periods of sustained build-up of public awareness deficit or surplus.

These findings are in line with two prominent mass communication theories: agenda setting and corrective action, and suggest the emergence of a novel Stimulus-Awareness-Activism (SA2) framework in today’s participatory digital age. We argue that this paper brings us one step closer to understanding the underlying mechanisms that drive digital activism in particular in the geospatial domain. Below you can read the abstract of the paper, see the refugee camps we studied and some of the results. At the bottom of the post we also provide the full reference and a link to the paper.

Abstract:
The last several decades have witnessed a shift in the way in which news is delivered and consumed by users. With the growth and advancements in mobile technologies, the Internet, and Web 2.0 technologies users are not only consumers of news, but also producers of online content. This has resulted in a novel and highly participatory cyber-physical news awareness ecosystem that fosters digital activism, in which volunteers contribute content to online communities. While studies have examined the various components of this news awareness ecosystem, little is still known about how news media coverage (and in particular digital media) impacts digital activism. In order to address this challenge and develop a greater understanding of it, this paper focuses on a specific form of digital activism, that of the production of digital geographical content through crowdsourcing efforts. Using refugee camps from around the world as a case study, we examine the relationship between news coverage (via Google news), search trends (via Google trends) and user edit contribution patterns in OpenStreetMap, a prominent geospatial data crowdsourcing platform. In addition, we compare and contrast these patterns with user edit patterns in Wikipedia, a well-known non-geospatial crowdsourcing platform. Using Google news and Google trends to derive a measure of thematic public awareness, our findings indicate that digital activism bursts tend to take place during periods of sustained build-up of public awareness deficit or surplus. These findings are in line with two prominent mass communication theories: agenda setting and corrective action, and suggest the emergence of a novel stimulus-awareness-activism framework in today’s participatory digital age. Moreover, these findings further complement existing research examining the motivational factors that drive users to contribute to online collaborative communities. 
This paper brings us one step closer to understanding the underlying mechanisms that drive digital activism in particular in the geospatial domain.

Figure 1. Study areas  (centroid location of camp).

Figure 5. OSM, Wikipedia, Google News, and Google Trends time series during a -/+4 months period around the strongest extremum point of each camp. The figures show that whereas OSM and Wikipedia entries tend to come in bursts, Google News and Trends display a more sustained type of activity.

Figure 6. The public awareness curve versus the cumulative OSM and Wikipedia edit activity during a -/+4 months period around the strongest extremum point of each camp. For camps such as Nyarugusu, OSM and Wikipedia bursts overlap with public awareness surplus. In other camps, such as Bidibidi, OSM edit activity bursts coincide with public awareness deficit.  

Full Reference: 
Mahabir, R., Croitoru, A., Crooks, A.T., Agouris, P. and Stefanidis, A. (2018), News Coverage, Digital Activism, and Geographical Saliency: A Case Study of Refugee Camps and Volunteered Geographical Information, PLoS ONE, 13(11): e0206825. https://doi.org/10.1371/journal.pone.0206825 (pdf)

Wednesday, November 07, 2018

ABM platform developers from CoMSES 2018

There are many reviews of agent-based modeling platforms (e.g. Abar et al., 2017; Kravari and Bassiliades, 2015; Castle and Crooks, 2006) but rarely do you see movies describing how such platforms have developed or where they are heading. Recently CoMSES Net (home of many great resources for agent-based modeling) held their second virtual conference: CoMSES 2018. During this virtual conference, there were presentations from Repast, Cormas and Mesa, to name but a few, and I thought these were worth sharing. If you click on the links below you can go directly to their discussion threads from the conference.

Repast:
 

Cormas:


Mesa:


On a slightly different note, I just came across a series of podcasts by Jacob Ingalls and Benjamin Schumann (http://brokenjars.xyz/simtalk/) who have interviewed a number of practitioners carrying out simulation modeling including the CEO of AnyLogic. Similar to the movies above, these podcasts provide a different way of learning more about simulations.

Friday, October 19, 2018

New Paper: Scalability in the MASON Multi-agent Simulation System

Previously we posted about our work on advancing MASON, in which we briefly discussed making it distributed in order to run large-scale models (including geographically explicit ones) as well as for optimization and validation purposes. To this end, we recently had a paper accepted and presented at the 22nd International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2018), entitled "Scalability in the MASON Multi-agent Simulation System".

In this paper we describe a distributed version of MASON, and use three existing MASON models (HeatBugs, Flockers, and CampusWorld) to demonstrate how Distributed MASON achieves highly scalable performance on Amazon Web Services, in terms of linear performance increases as the size of the simulations grows. Below you can read the abstract of the paper, and see some figures relating to how we go about data management along with some of the results. Finally, at the bottom of the post you can see the full reference and access the paper itself.

Abstract:
This paper describes Distributed MASON, a distributed version of the MASON agent-based simulation tool. Distributed MASON is architected to take advantage of well known principles from Parallel and Discrete Event Simulation, such as the use of Logical Processes (LP) as a method for obtaining scalable and high performing simulation systems. We first explain data management and sharing between LPs and describe our approach to load balancing. We then present both a local greedy approach and a global hierarchical approach. Finally, we present the results of our implementation of Distributed MASON on an instance in the Amazon Cloud, using several standard multi-agent models. The results indicate that our design is highly scalable and achieves our expected levels of speed-up.




Full Reference:
Wang, H., Wei, E., Simon, R., Luke, S., Crooks, A.T., Freelan, D. and Spagnuolo, C. (2018), Scalability in the MASON Multi-agent Simulation System, The 22nd International Symposium on Distributed Simulation and Real Time Applications, Madrid, Spain. (pdf)

This research is supported by the National Science Foundation (Grant 1727303).

Friday, September 21, 2018

Exodus 2.0: Crowdsourcing Geographical and Social Trails of Mass Migration

Readers of the blog might know we have an interest in volunteered geographic information, social media and Web 2.0 technologies and how they can be used to explore urban systems. Recently, however, we turned our focus to how such information and technologies can be used to explore and understand mass migrations.

To this end we recently had a paper published in the Journal of Geographical Systems entitled "Exodus 2.0: Crowdsourcing Geographical and Social Trails of Mass Migration". We adopt the term Exodus 2.0 to refer to this new migration paradigm in the digital age, whereby information is a commodity in the migration process.

Given the nature of migration processes, it is possible to explore them across two key dimensions: geographical and situational. The geographical dimension is associated with the physical migration pathways migrants take from a country of origin to a destination site (often through a number of intermediate “stop” sites). The situational dimension is associated with the social connectivity of moving migrant populations, the conditions on the ground, and the activities that take place as part of migration efforts (including the root conditions, proximate conditions and triggering events).
Factors that potentially cause refugee production and mass movement based on identified factors detailed by Clark (1989) and Zottarelli (1998).
In the paper, we use the ongoing Syrian humanitarian crisis as a case study to explore how the factors that potentially cause refugee production and mass movement can be gleaned from new sources of data. Specifically, we examine the potential of crowd-generated data, especially open data, volunteered geographic information and social media content (e.g. OpenStreetMap, Flickr, Twitter and Instagram), to provide information about migration processes. Through a series of case studies we show how such data (when combined with more traditional data sources) offer a new lens to study the geographical and situational dimensions of mass migration. Finally, we discuss how such data could be used to inform migration modeling. If we have not bored you yet and you are interested in finding out more about this line of inquiry, below we provide the abstract to the paper and some of the figures that accompany our analysis of refugee production and movement. Finally, we also provide the full reference and a link to the paper.

Abstract:
The exodus of displaced populations is a recurring historical phenomenon, and the ongoing Syrian humanitarian crisis is its latest incarnation. During such mass migration events, information is an essential commodity. Of particular importance is geographical (e.g., pathways and refugee camps) and social (e.g., refugee activities and networking) information. Traditionally, such information had been produced and disseminated by authorities, but a new paradigm is emerging: Web 2.0 and mobile computing technologies enable the involved stakeholder communities to produce, access, and consume migration-related information. The purpose of this article is to put forward a new typology for understanding the factors around migration and to examine the potential of crowd-generated data—especially open data and volunteered geographic information—to study such events. Using the recent wave of migration to Europe from the Middle East and northern Africa as a case study, we examine how migration-related information can be dynamically mined and analyzed to study the migrants’ pathways from their home countries to their destination sites, as well as the conditions and activities that evolve during the migration process. These new data sources can provide a deeper and more fine-grained understanding of the migration process, often in real-time, and often through the eyes of the communities affected by it. Nevertheless, this also raises significant methodological and technical challenges for their future use associated with potential biases, data quality issues, and data processing.

Keywords: Refugees, Forced migration, Humanitarian crisis, Volunteered geographic information, Crowdsourcing, Social media, GIS, Web 2.0.
Cumulative flow (2011–2015) illustrating Syrian forced migration to neighboring countries and other destination countries. Line thickness indicates increasing number of persons migrating.

Retweet network of geolocated Twitter microblogs that are discussing opinions, news and retweeting information related to “refugee” in multiple languages from May to August 2017.

A concept graph illustrating the associations between a keyword related to root factors of mass migration such as poverty (“welfare”) to other keywords, as they appear in our Twitter data corpus. The color of the node refers to specific themes: locations (green), actors (dark red), topics (red), entities and individuals (blue), concepts (white), and events (yellow). Red edges represent active associations between terms; gray edges represent inactive associations between terms.

An agent-based model of migration: top: the spatial environment, where the lines represent migration pathways, and the nodes represent number of migrants. Purple nodes represent final destination sites, red nodes show migrant deaths, and green nodes show migrants en route (source: Hu 2016).

Full Reference: 
Curry, T., Croitoru, A., Crooks, A.T. and Stefanidis, A. (in press), Exodus 2.0: Crowdsourcing Geographical and Social Trails of Mass Migration, Journal of Geographical Systems. DOI: https://doi.org/10.1007/s10109-018-0278-1 (pdf)

Wednesday, September 19, 2018

An Agent-Based Model of Rural Household Adaptation to Climate Change

Geographical location of the South Omo Zone of Ethiopia
While many of the recent posts on the site have focused on social media, social networks and volunteered geographical information, we have not forgotten or moved away from agent-based modeling (as you can probably gather from the title of this post). To this end, Ates Hailegiorgis, Claudio Cioffi-Revilla and I recently had a paper published in the Journal of Artificial Societies and Social Simulation entitled "An Agent-Based Model of Rural Households' Adaptation to Climate Change".

The purpose of the model is to explore how climate change could impact rural societies in less developed countries whose livelihoods rely on subsistence agriculture. It has been suggested that climate change will place unprecedented stress on rural communities, as it will alter their resource base without giving them sufficient time for adaptation. While rural systems have developed various adaptive strategies over many generations in order to survive, the alteration of any resources can significantly affect even highly regarded and accepted customs, and may lead to the displacement of populations along with other severe humanitarian consequences.

In this paper we focus on the South Omo Zone of Ethiopia, which covers an area of 2.3 million hectares and is located in the southern part of the country. Climate change is expected to play a significant role in shaping the future socio-ecological setting of the region. To explore this, we developed a model called OMOLAND-CA (OMOLAND Climate Change Adaptation) in the MASON simulation system, making use of its geographical information system (GIS) extension, GeoMASON. Results from the model show that successive episodes of extreme events (e.g., droughts) can affect the adaptive capacity of households in the region, causing them to migrate from the region. At the same time, rural communities manage to endure in spite of such harsh climatic conditions.

Below you can read the abstract of the paper and see some of the figures, including the model's high-level architecture and the household decision-making process, along with some results from various scenarios, a link to the model and the full reference of the paper.

Abstract: 
Future climate change is expected to have greater impacts on societies whose livelihoods rely on subsistence agricultural systems. Adaptation is essential for mitigating adverse effects of climate change, to sustain rural livelihoods, and ensure future food security. We present an agent-based model, called OMOLAND-CA, which explores the impact of climate change on the adaptive capacity of rural communities in the South Omo Zone of Ethiopia. The purpose of the model is to answer research questions on the resilience and adaptive capacity of rural households with respect to variations in climate, socioeconomic factors, and land-use at the local level. Our model explicitly represents the socio-cognitive behavior of rural households toward climate change and resource flows that prompt agents to diversify their production strategy under different climatic conditions. Results from the model show that successive episodes of extreme events (e.g., droughts) affect the adaptive capacity of households, causing them to migrate from the region. Nonetheless, rural communities in the South Omo Zone, and in the model, manage to endure in spite of such harsh climatic change conditions.

Keywords: Climate Change Adaptation, Agent-Based Modeling, Socio-Cognitive Behavior

High-level architecture of the OMOLAND-CA model.

Household decision-making sequence for each time period in the model.

Population migration over time with different climatic conditions: a) 50% reduction, b) 70% reduction, c) 90% reduction of rainfall with different drought frequencies.

Livestock growth over time with different climatic conditions: a) 50% reduction, b) 70% reduction, c) 90% reduction of rainfall with different drought frequencies.
Simulation results of the frequency of crop planted per hectare.

In keeping with many of the agent-based models we have created, a full description of the model (using the Overview, Design concepts, and Details plus Decision (ODD+D) protocol), along with its source code and the data needed to run the model, can be found at: https://www.openabm.org/model/5734/

Full Reference:
Hailegiorgis, A.B., Crooks, A.T. and Cioffi-Revilla, C. (2018), An Agent-Based Model of Rural Households’ Adaptation to Climate Change, Journal of Artificial Societies and Social Simulation, 21 (4): 4. Available at http://jasss.soc.surrey.ac.uk/21/4/4.html.
 

Friday, August 31, 2018

A Million Page Views: Thank you

While I started blogging during my PhD (actually the first real post was from February 21st 2006), for some reason I only started recording statistics about the blog in May 2010. This month marks the milestone of over 1,000,000 page views. So I thought I would write a post that reflects this milestone. 

Initially, I started blogging as a way to keep track of agent-based models (ABM), example applications and toolkits I found interesting, and this trend has continued over the years (with a few variations along the way). Many of my initial posts were focused on ways of utilizing agent-based models and integrating geographical information into such models. However, over time I have also branched out into writing about other areas, such as the utility of volunteered geographical information and social media to monitor, analyze and model urban systems, and how one can use such data to study the connections between people via social networks.

From looking at the statistics, since 2010 the most popular post (as you can see from the image below) is that of Agent-based modeling in ArcGIS (unfortunately this work is currently not being updated :( ), but it does show an interest in agent-based modeling within more mainstream GIS (or at least from some people). The other posts in the top 10 relate to modeling and analyzing urban systems and the people within them in some shape or form, including a book review I did for JASSS. Perhaps my favorite post in this top 10 is that of Modeling Human Behavior, inspired by a book chapter written by Bill Kennedy entitled 'Modelling Human Behavior in Agent-Based Models'.
With respect to the audience of the blog, nearly 48% of page views come from the United States while the remainder come from all around the world (as you can see from the figure to the left, including France, Russia and Ukraine). The most popular search terms for people coming to the blog include "agent based modeling", "NetLogo GIS" and "NetLogo Examples", along with terms such as urban analytics and big data.
Looking at which web browsers and operating systems people are using to access the site (which takes me back to my Master's thesis, when I was working on developing web-mapping features for the Gazetteer for Scotland), Chrome makes up 43% of all page views, followed by Firefox (29%) and IE (16%). For operating systems, 54% of visitors are using Windows, followed by Macintosh (27%) and Linux (8%).

While I mentioned above my favorite post in the top 10, reflecting on which post I refer most people to, it has to be the one entitled Applications of Agent-based Models because it shows how agent-based models are being used in a variety of settings. Looking back on the evolution of GIS and agent-based modeling since I started blogging, it's impressive to see how different toolkits have started to utilize GIS. For example, my first post was a hack on how to integrate GIS into NetLogo, from backspaces.net. Since then NetLogo, MASON and other platforms such as GAMA have evolved to make the integration and exploration of geographical information in agent-based models (relatively) easier.

Moreover, when I started writing about this, there were very few example GIS and agent-based models (except for Repast ones) or resources to get up and running with agent-based models, but over time this has changed with more and more people sharing their models (thanks to things like GitHub (e.g. mason models, OpenABM)). There have also been a number of good textbooks written on GIS and ABM (and there is a great one coming soon from us) along with more blogs (e.g. Simulating Complexity) and courses being taught (e.g. Agent-Based Modeling Short Course at SESYNC). Let's hope this growth continues, and thank you for reading and visiting this blog. If you would like to share your work on ABM and GIS please feel free to contact me or leave a comment below.

Wednesday, July 18, 2018

Online Vaccination Discussion and Communities in Twitter

Continuing our work exploring health-related issues in social media, Xiaoyi Yuan and I had a paper accepted at the 9th International Conference on Social Media and Society. In our paper, entitled "Examining Online Vaccination Discussion and Communities in Twitter", we examined the communication patterns of anti-vaccine and pro-vaccine users on Twitter by studying the retweet network from 660,892 tweets related to the measles, mumps, and rubella (MMR) vaccine published by 269,623 users. We used supervised learning to identify clusters of users based on their opinions (i.e. a pro-vaccine, anti-vaccine, or neutral user).

The overall methodology can be seen in Figure 1 and more details can be found in the paper. Our data was collected using the GeoSocial Gauge System. However, since tweets are short and their content diverse, the data corpus needed to be cleaned so that the tweets could then be converted to features (e.g., unigrams or bigrams). We were then able to use these features to train a variety of classifiers (i.e., logistic regression, support vector machine (linear and non-linear kernel), k-nearest neighbors, nearest centroid, and Naïve Bayes) to identify opinion groups. After this, we moved on from identifying each user’s opinion to constructing a retweet network in order to understand how users communicate within and across opinion groups in the communities detected in the retweet network. By carrying out this analysis we discovered that pro- and anti-vaccine users retweet predominantly from their own opinion group, while users with neutral opinions are distributed across communities. Below you can read our abstract, see some results from our study and the full reference (and link) to the paper.
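To make the feature and classification steps a little more concrete, here is a minimal sketch in Python. It is not the code used in the paper: the cleaning rules, the function names (`tokenize`, `ngram_features`, `train_centroids`, `classify`) and the example tweets are all assumptions made for illustration, and of the classifiers listed above only a bare-bones nearest centroid is shown.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and drop URLs, @mentions and punctuation (assumed cleaning rules)."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def ngram_features(text):
    """Count the unigrams and bigrams that occur in a single tweet."""
    tokens = tokenize(text)
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def train_centroids(labelled_tweets):
    """Average the feature counts of the tweets in each opinion group."""
    sums, sizes = {}, Counter()
    for text, label in labelled_tweets:
        sums.setdefault(label, Counter()).update(ngram_features(text))
        sizes[label] += 1
    return {label: {feat: count / sizes[label] for feat, count in feats.items()}
            for label, feats in sums.items()}


def classify(text, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    feats = ngram_features(text)

    def distance(centroid):
        keys = set(feats) | set(centroid)
        return sum((feats.get(k, 0) - centroid.get(k, 0.0)) ** 2 for k in keys)

    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical hand-labelled tweets, invented purely for this example.
training = [
    ("Vaccines save lives, get your MMR shot", "pro"),
    ("MMR vaccine causes harm, do not vaccinate", "anti"),
]
centroids = train_centroids(training)
print(classify("get your shot", centroids))
```

In practice a library implementation with proper train/test splits would be the sensible choice; the point here is only to show how short texts become n-gram counts that a classifier can compare.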


Figure 1: Steps used in our study to unveil the communication patterns of pro-vaccine and anti-vaccine users on Twitter
Abstract:
Many states in the US allow a “belief exemption” for measles, mumps, and rubella (MMR) vaccines. People’s opinion on whether or not to take the vaccine could have direct consequences in public health: once the vaccine refusal of a group within a population is higher than what herd immunity can tolerate, a disease can transmit fast, causing large-scale disease outbreaks. Social media has been one of the dominant communication channels for people to express their opinions of vaccination. Despite governmental organizations’ efforts of disseminating information of vaccination benefits, anti-vaccine sentiment is still gaining its momentum, especially on social media. This research investigates the communicative patterns of anti-vaccine and pro-vaccine users on Twitter by studying the retweet network from 660,892 tweets related to MMR vaccine published by 269,623 users after the 2015 California Disneyland measles outbreak. Using supervised learning, we classified the users into anti-vaccination, neutral to vaccination, and pro-vaccination groups. Using a combination of opinion groups and retweet network structural community detection, we discovered that pro- and anti-vaccine users retweet predominantly from their own opinion group, while users with neutral opinions are distributed across communities. For most cross-group communication, it was found that pro-vaccination users were retweeting anti-vaccination users more than vice-versa. The paper concludes that anti-vaccine Twitter users are highly clustered and enclosed communities, and this makes it difficult for health organizations to penetrate and counter opinionated information. We believe that this finding may be useful in developing strategies for health communication of vaccination and overcome some of the limits of current strategies.

Key Words: Anti-vaccine movement, Twitter, social media, opinion classification
Figure 2: Network visualizations of the four largest communities. A: colored by structural community; B: colored by opinion group

Figure 3: Distributions of opinion groups in the four largest structural communities
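The kind of tabulation behind Figures 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the community assignments are taken as given (in the paper they come from structural community detection on the retweet network), and the function names, user ids and labels below are hypothetical.

```python
from collections import Counter, defaultdict


def opinion_shares_by_community(community_of, opinion_of):
    """For each structural community, the share of its users in each opinion group."""
    counts = defaultdict(Counter)
    for user, community in community_of.items():
        counts[community][opinion_of[user]] += 1
    return {community: {opinion: n / sum(group_counts.values())
                        for opinion, n in group_counts.items()}
            for community, group_counts in counts.items()}


def in_group_retweet_share(retweets, opinion_of):
    """For each opinion group, the share of its retweets whose original author is in the same group.

    `retweets` is a list of (retweeter, original_author) pairs.
    """
    totals, same = Counter(), Counter()
    for retweeter, author in retweets:
        group = opinion_of[retweeter]
        totals[group] += 1
        if opinion_of[author] == group:
            same[group] += 1
    return {group: same[group] / totals[group] for group in totals}


# Hypothetical toy data: four users, two communities, five retweets.
community_of = {"a": 1, "b": 1, "c": 2, "d": 2}
opinion_of = {"a": "pro", "b": "pro", "c": "anti", "d": "neutral"}
retweets = [("a", "b"), ("b", "a"), ("a", "c"), ("b", "d"), ("c", "d")]

print(opinion_shares_by_community(community_of, opinion_of))
print(in_group_retweet_share(retweets, opinion_of))
```

A high in-group share for the pro- and anti-vaccine groups, together with neutral users spread across communities, is the pattern described in the post.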

Full Reference:
Yuan, X. and Crooks, A.T. (2018), Examining Online Vaccination Discussion and Communities in Twitter, Proceedings of the 9th International Conference on Social Media and Society, Copenhagen, Denmark, pp 197-206. (pdf)

Wednesday, July 04, 2018

MASON Update

At the upcoming Multi-Agent-Based Simulation (MABS) workshop, we have a paper entitled "The MASON Simulation Toolkit: Past, Present, and Future" in which we discuss MASON's development history, its design and (probably more interesting) where MASON is going. This includes:
  1. Making it more robust (i.e. easier to run parameter tests).
  2. Making it distributed in order to run large-scale models, including geographically explicit ones, as well as for optimization and validation purposes.
  3. Making it more coder-friendly by adding code templates that allow users to generate code skeletons for common MASON patterns and a way to easily record outputs and statistics.
  4. Making it more community-friendly by hopefully developing a special online repository to enable researchers to distribute models as jar files along with education aids and examples. Relating to this last point we have added a number of example models (code and data) from our own research to GitHub, see: https://github.com/eclab/mason/tree/master/contrib/geomason/sim/app/geo and the data to run the models is either there or here https://cs.gmu.edu/~eclab/projects/mason/extensions/geomason/geodemodata.zip (note this is 1.5 GB).
Below you can read the abstract from the paper along with a link to the paper itself.

Example Applications of MASON

Abstract
MASON is a widely-used open-source agent-based simulation toolkit that has been in constant development since 2002. MASON’s architecture was cutting-edge for its time, but advances in computer technology now offer new opportunities for the ABM community to scale models and apply new modeling techniques. We are extending MASON to provide these opportunities in response to community feedback. In this paper we discuss MASON, its history and design, and how we plan to improve and extend it over the next several years. Based on user feedback, we will add distributed simulation, distributed GIS, optimization and sensitivity analysis tools, external language and development environment support, statistics facilities, collaborative archives, and educational tools.

Keywords: Agent-Based Simulation, Open Source, Library

Full Reference:
Luke, S., Simon, R., Crooks, A.T., Wang, H., Wei, E., Freelan, D., Spagnuolo, C., Scarano, V., Cordasco, G. and Cioffi-Revilla, C. (2018), The MASON Simulation Toolkit: Past, Present, and Future, 19th International Workshop on Multi-Agent-Based Simulation (MABS2018), Stockholm, Sweden. (pdf)

Available on Github


This research is supported by the National Science Foundation (Grant 1727303).

Wednesday, June 13, 2018

Call for Papers: GeoSim’18



The GeoSim’18 workshop focuses on all aspects of simulation as a general paradigm to model and predict spatial systems and generate spatial data. New simulation methodologies and frameworks, not necessarily coming from the SIGSPATIAL community, are encouraged to participate. Also, this workshop is of interest to everyone who works with spatial data. The simulation methods that will be presented and discussed in the workshop should find a wide application across the community by producing benchmark datasets that can be parameterized and scaled. Simulated data sets will be made available to the community via the website.

The workshop seeks high-quality full (8 pages) and short (4 pages) papers that will be peer-reviewed. Once accepted, at least one author is required to register for the workshop and the ACM SIGSPATIAL conference, as well as attend the workshop to present the accepted work which will then appear in the ACM Digital Library.

Example topics include, but are not limited to:
  • Applications for Spatial Simulation
  • Agent Based Models for Spatial Simulation
  • Multi-Agent based Spatial Simulation
  • Big Spatial Data Simulation
  • Spatial Data/Trajectory Generators
  • Road Traffic Simulation
  • Environmental Simulation
  • Geoinformation Systems using Spatial Simulation
  • Interactive Spatial Simulation
  • Spatial Simulation Parallelization and Distribution
  • Geo-Social Simulation and Data Generators
  • Social Unrest and Riot Prediction using Simulation
  • Spatial Analysis based on Simulation
  • Behavioral Simulation
  • Verifying and Validating Spatial Simulations
  • Urban Simulation
Important Dates:
  • Submission deadline: August 20, 2018
  • Notification: September 20, 2018
  • Workshop date: November 06, 2018
For more information please visit www.geosim.org

https://www.dropbox.com/s/lgt6ip1u9lxvgwa/GeoSim18_cfp_final.pdf?dl=0

Tuesday, May 29, 2018

Spatial Agent-based Models of Human-Environment Interactions: Spring 2018

During the past spring semester I taught a class entitled "Spatial Agent-based Models of Human-Environment Interactions". As with many of my courses, students are expected to complete an end-of-semester project, in this case to develop an agent-based model that explores some aspect of the course theme of human-environment interactions. A selection of these projects can be seen in the movie below. The projects ranged from urban growth, housing markets, the adoption of solar energy, employment opportunities, populations at risk from terrorism, and commuting to the spread of diseases. Many of the models were built in NetLogo or MASON, and some in Python, including using MESA.




I would like to thank the students of CSS 645: Spatial Agent-based Models of Human-Environment Interactions for their participation in the class.

Monday, April 09, 2018

Predicting Rice Cropping Patterns around Poyang Lake, China using a Cellular Automata Model

http://mason.gmu.edu/~qtian2/QingTianSummary.html
Normally, on this blog, the focus is on agent-based modeling and GIS. However, I am not agnostic to other modeling approaches, especially cellular automata (CA) modeling (which I have written about in the past). To this end, Rui Zhang, Qing Tian, Luguang Jiang, Shuhua Qi, Ruixin Yang and I recently had a paper published in Land Use Policy entitled "Projecting Cropping Patterns around Poyang Lake and Prioritizing Areas for Policy Intervention to Promote Rice: A Cellular Automata Model". In the paper we explore current land use patterns in the Poyang Lake Region (PLR) of China. Specifically, we focus on current rice production in the region and what this might look like in the future (especially the impact of farmland consolidation) by using a CA model (built on the DINAMICA EGO platform). Below you can read the abstract to our paper, along with some figures outlining our study area, the model design and development, and the observed current-day and predicted rice cropping patterns around Poyang Lake. Finally, at the bottom of the post I provide the full reference and a link to the paper.

Abstract:
Rural households’ cropping choices are increasingly influenced by nonfarm activities across the developing world, raising serious concerns about food security locally and globally. In China, rapid urbanization has led to agricultural decline in some regions. To stimulate agriculture, the Chinese government has recently increased its effort in farmland consolidation by providing special support to large farms in an attempt to address land-use inefficiency associated with small farming operations in rural China. Focusing on the Poyang Lake Region (PLR), we develop a Cellular Automata (CA) model to explore future agricultural land use and examine the impact of farmland consolidation. PLR is an important rice production base in Jiangxi Province and China. In PLR rice can be grown once a year on a plot, called one-season rice, or twice a year on the same plot, called two-season rice. Our CA model simulates the transition between one-season and two-season rice. Emphasizing distributional differences in the region, we use the modeling results to identify five areas where rice cultivation is (i) relatively stable for one-season rice, (ii) more likely to be one-season rice, (iii) of equal probability for either type, (iv) more likely to be two-season rice, and (v) relatively stable for two-season rice. We then explore the characteristics of these areas in terms of biophysical and geographical environments to provide further insights into how the government may prioritize areas for interventions to effectively promote food production and environmental sustainability. The analysis also indicates a positive effect of farmland consolidation on promoting rice production.

Keywords: Agricultural Land Use; Cellular Automata; Food Security; Environmental Sustainability; Farmland Consolidation; China.
Poyang Lake Region. The left map shows its location in China. Rice cropping patterns shown on the right map were interpreted from Landsat images in 2013.
Model design and development

Rice cropping patterns around Poyang Lake. The map on the left is observed land use in 2013 and on the right prediction for 2033.
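For readers new to cellular automata, the basic idea of simulating transitions between one-season and two-season rice can be sketched as follows. This is a toy illustration only and not the model from the paper (which was built on the DINAMICA EGO platform): the neighbourhood-based transition rule, the `base_rate` parameter and the grid are all assumptions made for this example.

```python
import random

ONE_SEASON, TWO_SEASON = 1, 2


def neighbour_share(grid, row, col, state):
    """Share of the (up to eight) Moore neighbours of a cell that are in `state`."""
    rows, cols = len(grid), len(grid[0])
    neighbours = [grid[r][c]
                  for r in range(max(0, row - 1), min(rows, row + 2))
                  for c in range(max(0, col - 1), min(cols, col + 2))
                  if (r, c) != (row, col)]
    if not neighbours:
        return 0.0
    return sum(1 for s in neighbours if s == state) / len(neighbours)


def step(grid, base_rate, rng):
    """One synchronous update of the whole grid.

    Assumed rule: a cell switches to the other cropping type with probability
    base_rate * (share of its neighbours already growing that type).
    """
    new_grid = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, state in enumerate(row):
            other = TWO_SEASON if state == ONE_SEASON else ONE_SEASON
            if rng.random() < base_rate * neighbour_share(grid, r, c, other):
                new_grid[r][c] = other
    return new_grid


# Hypothetical 3x3 landscape: a single one-season plot surrounded by two-season rice.
grid = [[TWO_SEASON, TWO_SEASON, TWO_SEASON],
        [TWO_SEASON, ONE_SEASON, TWO_SEASON],
        [TWO_SEASON, TWO_SEASON, TWO_SEASON]]
rng = random.Random(42)  # seeded so that runs are repeatable
for _ in range(5):
    grid = step(grid, base_rate=0.3, rng=rng)
for row in grid:
    print(row)
```

A model like the one in the paper would replace this single neighbourhood rule with transition probabilities estimated from data; the sketch only shows the grid-and-update structure that CA models share.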

Full Reference:
Zhang, R., Tian, Q., Jiang, L., Crooks, A.T., Qi, S. and Yang, R. (2018), Projecting Cropping Patterns around Poyang Lake and Prioritizing Areas for Policy Intervention to Promote Rice: A Cellular Automata Model, Land Use Policy, 74: 248-260. (pdf)
As always, any thoughts or comments are most welcome.

Saturday, April 07, 2018

Innovations in Urban Analytics @ the AAG

Symposium on New Horizons in Human Dynamics Research: Innovations in Urban Analytics Sessions

As part of the Symposium on New Horizons in Human Dynamics Research we have organized 5 sessions around Innovations in Urban Analytics. These sessions will take place on Thursday 12th of April from 8am to 7pm in Bayside A, Sheraton, 4th Floor.

Description
New forms of data about people and cities, often termed ‘Big’, are fostering research that is disrupting many traditional fields. This is true in geography, and especially in those more technical branches of the discipline such as computational geography / geocomputation, spatial analytics and statistics, geographical data science, etc. These new forms of micro-level data have led to new methodological approaches in order to better understand how urban systems behave. Increasingly, these approaches and data are being used to ask questions about how cities can be made more sustainable and efficient in the future.

These sessions will bring together the latest research in urban analytics. In particular the papers will engage in the following domains:
  • Agent-based modelling (ABM) and individual-based modelling;
  • Machine learning for urban analytics;
  • Innovations in consumer data analytics for understanding urban systems;
  • Real-time model calibration and data assimilation;
  • Spatio-temporal data analysis;
  • New data, case studies, demonstrators, and tools for the study of urban systems;
  • Complex systems analysis;
  • Geographic data mining and visualisation;
  • Frequentist and Bayesian approaches to modelling cities.


Symposium on New Horizons in Human Dynamics Research: Innovations in Urban Analytics I - Agent-Based Modelling and Machine Learning

Time: 8:00 AM
Location: Bayside A, Sheraton, 4th Floor

Chair: Nick Malleson.

Andrew Crooks, Annetta Burger, Xiaoyi Yuan and William Kennedy:
Title: The Generation and Application of Large Scale Synthetic Populations for Disease Outbreaks and Disasters.
Achilleas Psyllidis and Hendra Hadhil Choiri:
Title: A Convolutional Neural Network-based Model for Predicting the Perceived Attractiveness of Urban Places
Jonathan Reades, Jordan de Souza and Elizabeth Sklar:
Title: Predicting Neighbourhood Change in London with Random Forests  
Nick Malleson, Tomas Crols, Jonathan Ward and Andrew Evans:
Title: Forecasting Short-Term Urban Dynamics: Data Assimilation for Agent-Based Modelling
Tomas Crols and Nick Malleson:
Title: Calibrating an Agent-Based Model of the Ambient Population using Big Data  

Symposium on New Horizons in Human Dynamics Research: Innovations in Urban Analytics II - Transport and Accessibility 


Time: 10:00 AM
Location: Bayside A, Sheraton, 4th Floor

Chair: Andrew Crooks

Ed Manley:
Title: Analysing Cities through Cognitive Models of Geographic Space.
Alison Heppenstall, Yuanxuan Yang and Alexis Comber:
Title: Who, why and when? Using smart card and social media data to reveal flows through urban spaces. 
Kerry Nice, Jason Thompson, Jasper Wijnands, Gideon Aschwanden and Mark Stevenson:
Title: The Paris end of town? Urban typology through machine learning.
Henrikki Tenkanen, Olle Järv, Maria Salonen, Rein Ahas and Tuuli Toivonen:
Title: Dynamic cities: Spatial accessibility as a function of time.
Thomas Redfern, Nicolas Malleson, Gillian Harrison, Frances Hodgson, Alexis Comber and Susan Grant-Muller:
Title: Monitoring, modelling and understanding the complex spatiotemporal dynamics of air pollution exposure, transport policies, and health burdens. 

Symposium on New Horizons in Human Dynamics Research: Innovations in Urban Analytics III - Data Synergies and Emerging Insights


Time: 1:20 PM
Location: Bayside A, Sheraton, 4th Floor

Chair: Alison Heppenstall

Tuuli Toivonen, Henrikki Tenkanen, Vuokko Heikinheimo, Olle Järv and Tuomo Hiippala:
Title: Social media content for understanding the spatial patterns of urban leisure time 
Emmanouil Tranos:
Title: Doing internet archaeology to reveal the evolution of the digital economy in the UK.
Daniel Arribas-Bel:
Title: "Nowcasting" house prices at high spatiotemporal resolution.
Nik Lomax and Andrew Smith:
Title: High resolution demographic projections for infrastructure planning.

Discussant: Alison Heppenstall


Symposium on New Horizons in Human Dynamics Research: Innovations in Urban Analytics IV


Time: 3:20 PM
Location: Bayside A, Sheraton, 4th Floor

Chair: Ed Manley.

Boyana Buyuklieva and Adam Dennett:
Title: Making Metrics Meaningful: A Discussion of Implementation and Reproducibility Using Measures of Migration
Marina Toger, Ian Shuttleworth and John Östh:
Title: How average is average? Temporal patterns and variability in mobile phone data
Alec Davies, Mark Green and Alex Singleton:
Title: Using new forms of data to investigate self-medication.
Ellen Talbot:
Title: Estimating Energy Consumption Through Smart Meter and Socio-demographic Datasets.

Discussant: Ed Manley


Symposium on New Horizons in Human Dynamics Research: Innovations in Urban Analytics V: Panel Session

Time: 5:20 PM
Location: Bayside A, Sheraton, 4th Floor

New forms of data about people and cities, often termed ‘Big’, are fostering research that is disrupting many traditional fields. This is true in geography, and especially in those more technical branches of the discipline such as computational geography / geocomputation, spatial analytics and statistics, geographical data science, etc. These new forms of micro-level data have led to new methodological approaches in order to better understand how urban systems behave. Increasingly, these approaches and data are being used to ask questions about how cities can be made more sustainable and efficient in the future.

This panel session concludes the 'Innovations in Urban Analytics' paper theme.

Panelists:
Alex Singleton, Andrew Crooks, Boyana Buyuklieva, Tuuli Toivonen and Moira Zellner


Session Sponsors:
Organizers: