Wednesday, December 05, 2018

Detecting and Mapping Slums using Open Data

Urban and slum areas in Nairobi (false color composite image created
by stacking image bands 7, 6 and 4 from the Landsat 8 satellite).
Turning back to slums, we just published a paper entitled "Detecting and Mapping Slums using Open Data: A Case Study in Kenya" in the International Journal of Digital Earth. This work builds on and extends our previous research on using new sources of data to explore slum settlements in three cities in Kenya (i.e. Nairobi, Mombasa and Kisumu). Specifically, we examine how the fusion of Volunteered Geographical Information, social media, and other open data sources can complement remote sensing imagery in supporting slum detection, mapping and monitoring.

We do this by using data mining tools (e.g. logistic regression, discriminant analysis and the See5 decision tree) to develop context-sensitive definitions for slums based on location, as well as to test the generalizability of the indicators and the derived slum models. The end result is an indicator database for slums, built from open sources of physical and socio-economic data, that can be used to characterize slum settlements. If you wish to know more, below we provide the abstract to the paper along with some of the figures and the full citation with a link to the paper itself.

Abstract:
The worldwide slum population currently stands at over one billion, with substantial growth expected in the coming decades. Traditionally, slums have been mapped using information derived mainly from either physical indicators using remote sensing data, or socio-economic indicators using census data. Each data source on its own provides only a partial view of slums, an issue further compounded by data poverty in less developed countries. To overcome such issues, this paper explores the fusion of traditional with emerging open data sources and data mining tools to identify additional indicators that can be used to detect and map the presence of slums, map their footprint, and map their evolution. Towards this goal, we develop an indicator database for slums using open sources of physical and socio-economic data that can be used to characterize slum settlements. Using this database, we then leverage data mining techniques to identify the most suitable combination of these indicators for mapping slums. Using three cities in Kenya as test cases, results show that the fusion of these data can improve the mapping accuracy of slums. These results suggest that the proposed approach can provide a viable solution to the emerging challenge of monitoring the growth of slums.
Keywords: Slums; Remote Sensing; Socio-economic; Urban sustainability; Data mining; Kenya
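
To give a flavor of the indicator-based classification described above, here is a minimal sketch of logistic regression, one of the three data mining tools named in the paper. It is not the model from the paper: the two indicators (building_density and road_density), the toy grid cells and the training settings are all made-up assumptions for illustration, and the fitting is a plain gradient descent written with the Python standard library only.

```python
import math
import random


def sigmoid(z):
    """Logistic function, guarded against overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Return 1 (slum) when the modelled probability is at least 0.5."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


# Made-up toy cells: (building_density, road_density), both scaled to 0..1.
# These indicator names and values are hypothetical, not taken from the paper.
random.seed(0)
slum_cells = [(random.uniform(0.6, 1.0), random.uniform(0.0, 0.4)) for _ in range(20)]
other_cells = [(random.uniform(0.0, 0.4), random.uniform(0.6, 1.0)) for _ in range(20)]
rows = slum_cells + other_cells
labels = [1] * 20 + [0] * 20

weights = fit_logistic(rows, labels)
accuracy = sum(predict(weights, x) == y for x, y in zip(rows, labels)) / len(rows)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

In practice one would fit such a model with an established statistics package and evaluate it on held-out cells; the point here is only how a combination of indicators turns into a slum / non-slum decision.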

Study areas in Kenya

Methodology workflow

Distribution of positive classified cases for slums for (a) logistic regression, (b) discriminant analysis and (c) the See5 decision tree.
Full Reference:
Mahabir, R., Agouris, P., Stefanidis, A., Croitoru, A. and Crooks, A.T. (2018), Detecting and Mapping Slums using Open Data: A Case Study in Kenya, International Journal of Digital Earth. DOI: https://doi.org/10.1080/17538947.2018.1554010. (pdf)

Bots in Social Networks

Recently our research has started to dig deeper into social media, especially how bots diffuse information in online social networks (OSNs). To this end, at the 7th International Conference on Complex Networks and Their Applications we had a paper entitled "Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks".

In this paper we present a framework to characterize the pervasiveness and relative importance of bots in the OSN conversations around three significant global events in 2016. In total, we harvested more than 30 million tweets from the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship and compared the conversational patterns of bots and humans within each event. The results from this analysis showed that although Twitter participants identified as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. If you want to know more about this new work, below we provide the abstract to the paper, a selection of figures and tables (including our methodology, some summary information about our data corpus and some of the results). Finally, at the bottom of this post we have the full reference and a link to the paper.

Abstract:
The emergence of social bots within online social networks (OSNs) to diffuse information at scale has given rise to many efforts to detect them. While methodologies employed to detect the evolving sophistication of bots continue to improve, much work can be done to characterize the impact of bots on communication networks. In this study, we present a framework to describe the pervasiveness and relative importance of participants recognized as bots in various OSN conversations. Specifically, we harvested over 30 million tweets from three major global events in 2016 (the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship) and compared the conversational patterns of bots and humans within each event. We further examined the social network structure of each conversation to determine if bots exhibited any particular network influence, while also determining bot participation in key emergent network communities. The results showed that although participants recognized as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. This includes the identification of individual bots as top-10 influencer nodes out of a total corpus consisting of more than 2.8 million nodes. 

Keywords: bots, online social networks, social network analysis.
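
The comparison between how rare bots are and how prominent they are in centrality rankings can be sketched in a few lines. This is an illustration only, not the paper's pipeline: the edge list and account names are made up, and degree centrality is used as the simplest of the centrality measures mentioned in the paper (which also looks at eigenvector and betweenness centrality).

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def bot_share_in_top_k(centrality, bots, k):
    """Fraction of the k highest-ranked nodes that are flagged as bots."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(centrality, key=lambda node: (-centrality[node], node))
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(node in bots for node in top) / len(top)


# Made-up toy interaction network: "bot_1" is connected to many accounts.
edges = [
    ("bot_1", "u1"), ("bot_1", "u2"), ("bot_1", "u3"), ("bot_1", "u4"),
    ("u1", "u2"), ("u3", "u5"), ("u5", "u6"),
]
bots = {"bot_1"}  # hypothetical output of a bot-detection step

centrality = degree_centrality(edges)
population_share = len(bots) / len(centrality)
top_share = bot_share_in_top_k(centrality, bots, k=2)
print(f"bots are {population_share:.0%} of accounts but {top_share:.0%} of the top 2")
```

For a corpus with millions of nodes one would use a graph library rather than plain dictionaries, but the logic of comparing the bot share in the population with the bot share among top-ranked nodes stays the same.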
Fig. 1. Overall methodology to analyze bot evidence across multiple Twitter OSN conversations.
Table 1. Harvested Twitter Corpus Overview
Fig. 5. Correlation of centrality measures for select centrality comparisons: (a) U.S. Election eigenvector versus betweenness analysis, (b) Ukraine Conflict eigenvector versus betweenness analysis and (c) Ukraine Conflict eigenvector versus degree analysis.
Table 2. Bot density of largest emergent communities.

Full Reference:
Schuchard, R., Crooks, A.T., Stefanidis, A. and Croitoru, A. (2018), Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks, in Aiello, L.M., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P. and Rocha, L.M. (eds.), Volume 2, Proceedings of the 7th International Conference on Complex Networks and Their Applications, Cambridge, United Kingdom, Springer, pp 424-436. (pdf)

Saturday, November 17, 2018

Procedural City Generation Beyond Game Development

In the current SIGSPATIAL Special Newsletter, whose theme is Urban Analytics and Mobility, Joon-Seok Kim, Hamdi Kavak and I have a paper entitled "Procedural City Generation Beyond Game Development". In the paper we discuss how synthetic urban areas, created via procedural city generation and occupied by agents, could be used to automatically generate data, which could in turn serve as urban testbeds for applications such as social simulation, self-driving cars and transportation. Specifically, we review procedural city generation from several perspectives: goals, inputs, outputs and methods. This in turn allows us to identify specific issues (e.g., plausibility, level of detail, ease of use) that need to be addressed to sufficiently capture real-world cities and the people who inhabit them. If you want to find out more, below is the abstract to the paper along with the full reference and a link to the paper.

Abstract:
The common trend in the scientific inquiry of urban areas and their populations is to use real-world geographic and population data to understand, explain, and predict urban phenomena. We argue that this trend limits our understanding of urban areas as dealing with arbitrarily collected geographic data requires technical expertise to process; moreover, population data is often aggregated, sparsified, or anonymized for privacy reasons. We believe synthetic urban areas generated via procedural city generation, which is a technique mostly used in the gaming area, could help improve the state-of-the-art in many disciplines which study urban areas. In this paper, we describe a selection of research areas that could benefit from such synthetic urban data and show that the current research in procedurally generated cities needs to address specific issues (e.g., plausibility) to sufficiently capture real-world cities and thus take such data beyond gaming.


Full Reference:
Kim, J-S., Kavak, H. and Crooks, A.T. (2018), Procedural City Generation Beyond Game Development, SIGSPATIAL Special, 10(2), 34-41. DOI: 10.1145/3292390.3292397 (pdf)

Thursday, November 08, 2018

Refugee Camps and Volunteered Geographical Information

Fig. 7. Stimulus-Awareness-Activism (SA2) framework
Previously we have posted on how one can use new sources of data (e.g. Volunteered Geographical Information) to explore and understand the world around us, such as mass migration or urban form and function, or as the basis of a model. Continuing with this research theme, we recently had a paper published in PLoS ONE entitled: "News Coverage, Digital Activism, and Geographical Saliency: A Case Study of Refugee Camps and Volunteered Geographical Information."

In this paper we explore the relationship between news coverage (via Google News), search trends (via Google Trends) and user edit contribution patterns in OpenStreetMap and Wikipedia for refugee camps from around the world. Specifically, we are interested in how news media coverage (and in particular digital media) impacts digital activism (i.e. volunteers contributing content to online communities). Based on our analysis, we find that digital activism bursts tend to take place during periods of sustained build-up of public awareness deficit or surplus.

These findings are in line with two prominent mass communication theories: agenda setting and corrective action, and suggest the emergence of a novel Stimulus-Awareness-Activism (SA2) framework in today’s participatory digital age. We argue that this paper brings us one step closer to understanding the underlying mechanisms that drive digital activism in particular in the geospatial domain. Below you can read the abstract of the paper, see the refugee camps we studied and some of the results. At the bottom of the post we also provide the full reference and a link to the paper.

Abstract:
The last several decades have witnessed a shift in the way in which news is delivered and consumed by users. With the growth and advancements in mobile technologies, the Internet, and Web 2.0 technologies users are not only consumers of news, but also producers of online content. This has resulted in a novel and highly participatory cyber-physical news awareness ecosystem that fosters digital activism, in which volunteers contribute content to online communities. While studies have examined the various components of this news awareness ecosystem, little is still known about how news media coverage (and in particular digital media) impacts digital activism. In order to address this challenge and develop a greater understanding of it, this paper focuses on a specific form of digital activism, that of the production of digital geographical content through crowdsourcing efforts. Using refugee camps from around the world as a case study, we examine the relationship between news coverage (via Google news), search trends (via Google trends) and user edit contribution patterns in OpenStreetMap, a prominent geospatial data crowdsourcing platform. In addition, we compare and contrast these patterns with user edit patterns in Wikipedia, a well-known non-geospatial crowdsourcing platform. Using Google news and Google trends to derive a measure of thematic public awareness, our findings indicate that digital activism bursts tend to take place during periods of sustained build-up of public awareness deficit or surplus. These findings are in line with two prominent mass communication theories: agenda setting and corrective action, and suggest the emergence of a novel stimulus-awareness-activism framework in today’s participatory digital age. Moreover, these findings further complement existing research examining the motivational factors that drive users to contribute to online collaborative communities. 
This paper brings us one step closer to understanding the underlying mechanisms that drive digital activism in particular in the geospatial domain.

Figure 1. Study areas (centroid location of camp).

Figure 5. OSM, Wikipedia, Google News, and Google Trends time series during a -/+4 months period around the strongest extremum point of each camp. The figures show that whereas OSM and Wikipedia entries tend to come in bursts, Google News and Trends display a more sustained type of activity.

Figure 6. The public awareness curve versus the cumulative OSM and Wikipedia edit activity during a -/+4 months period around the strongest extremum point of each camp. For camps such as Nyarugusu, OSM and Wikipedia bursts overlap with public awareness surplus. In other camps, such as Bidibidi, OSM edit activity bursts coincide with public awareness deficit.  

Full Reference: 
Mahabir, R., Croitoru, A., Crooks, A.T., Agouris, P. and Stefanidis, A. (2018), News Coverage, Digital Activism, and Geographical Saliency: A Case Study of Refugee Camps and Volunteered Geographical Information, PLoS ONE, 13(11): e0206825. https://doi.org/10.1371/journal.pone.0206825 (pdf)

Wednesday, November 07, 2018

ABM platform developers from CoMSES 2018

There are many reviews of agent-based modeling platforms (e.g. Abar et al., 2017; Kravari and Bassiliades, 2015; Castle and Crooks, 2006) but rarely do you see movies describing how such platforms have developed or where they are heading. Recently CoMSES Net (home of many great resources for agent-based modeling) held their second virtual conference: CoMSES 2018. During this virtual conference, there were presentations from Repast, Cormas and Mesa, to name but a few, and I thought these were worth sharing. If you click on the links below you can go directly to their discussion threads from the conference.

Repast:
 

Cormas:


Mesa:


On a slightly different note, I just came across a series of podcasts by Jacob Ingalls and Benjamin Schumann (http://brokenjars.xyz/simtalk/) who have interviewed a number of practitioners carrying out simulation modeling including the CEO of AnyLogic. Similar to the movies above, these podcasts provide a different way of learning more about simulations.