Wednesday, December 05, 2018

Detecting and Mapping Slums using Open Data

Urban and slum areas in Nairobi (false color composite image created
by stacking image bands 7, 6 and 4 from the Landsat 8 satellite).
Turning back to slums, we just published a paper entitled "Detecting and Mapping Slums using Open Data: A Case Study in Kenya" in the International Journal of Digital Earth. This work builds on and extends our previous research on using new sources of data to explore the slum settlements in 3 cities in Kenya (i.e. Nairobi, Mombasa and Kisumu). Specifically, we examine how the fusion of Volunteered Geographical Information, Social Media, and other open data sources can complement remote sensing imagery in supporting slum detection, mapping and monitoring.

We do this by using data mining tools (e.g. logistic regression, discriminant analysis and the See5 decision tree) to develop context-sensitive definitions for slums based on location, as well as to test the generalizability of the indicators and the derived slum models. The end result is an indicator database for slums, built from open sources of physical and socio-economic data, that can be used to characterize slum settlements. If you wish to know more, below we provide the abstract to the paper along with some of the figures and the full citation with a link to the paper itself.
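
To give a rough feel for the kind of classification step mentioned above, here is a minimal sketch of logistic regression applied to a table of per-cell indicators. It is not the implementation used in the paper: the three indicator names, the values and the labels are all made up for illustration, and the model is fitted with plain batch gradient descent rather than a statistics package.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=1.0, epochs=5000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
            err = sigmoid(z) - y          # prediction error for this cell
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def predict(weights, x):
    """Return 1 (slum) when the modelled probability is at least 0.5."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0

# Hypothetical indicators per grid cell, scaled to 0..1 (assumed names):
# (vegetation index, road density, geotagged social media density)
cells = [
    (0.10, 0.20, 0.15), (0.15, 0.10, 0.20), (0.20, 0.25, 0.10), (0.05, 0.15, 0.25),
    (0.60, 0.70, 0.80), (0.70, 0.65, 0.75), (0.55, 0.80, 0.60), (0.80, 0.60, 0.70),
]
is_slum = [1, 1, 1, 1, 0, 0, 0, 0]  # made-up reference labels

weights = fit_logistic(cells, is_slum)
predictions = [predict(weights, c) for c in cells]
accuracy = sum(p == y for p, y in zip(predictions, is_slum)) / len(cells)
print(f"training accuracy: {accuracy:.2f}")
```

In practice one would evaluate such a model on held-out cells (and on a different city) rather than on the training data, which is the generalizability question the paper looks at.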

Abstract:
The worldwide slum population currently stands at over one billion, with substantial growth expected in the coming decades. Traditionally, slums have been mapped using information derived mainly from either physical indicators using remote sensing data, or socio-economic indicators using census data. Each data source on its own provides only a partial view of slums, an issue further compounded by data poverty in less developed countries. To overcome such issues, this paper explores the fusion of traditional with emerging open data sources and data mining tools to identify additional indicators that can be used to detect and map the presence of slums, map their footprint, and map their evolution. Towards this goal, we develop an indicator database for slums using open sources of physical and socio-economic data that can be used to characterize slum settlements. Using this database, we then leverage data mining techniques to identify the most suitable combination of these indicators for mapping slums. Using three cities in Kenya as test cases, results show that the fusion of these data can improve the mapping accuracy of slums. These results suggest that the proposed approach can provide a viable solution to the emerging challenge of monitoring the growth of slums.
Keywords: Slums; Remote Sensing; Socio-economic; Urban sustainability; Data mining; Kenya

Study areas in Kenya

Methodology workflow

Distribution of positive classified cases for slums for (a) logistic regression, (b) discriminant analysis and (c) the See5 decision tree.
Full Reference:
Mahabir, R., Agouris, P., Stefanidis, A., Croitoru, A. and Crooks, A.T. (2018), Detecting and Mapping Slums using Open Data: A Case Study in Kenya, International Journal of Digital Earth. DOI: https://doi.org/10.1080/17538947.2018.1554010. (pdf)

Bots in Social Networks

Recently our research has started to dig deeper into social media, especially how bots diffuse information in online social networks (OSNs). To this end, at the 7th International Conference on Complex Networks and Their Applications we had a paper entitled "Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks".

In this paper we present a framework to characterize the pervasiveness and relative importance of bots in various OSN conversations of three significant global events in 2016. In total, we harvested more than 30 million tweets from the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship and compared the conversational patterns of bots and humans within each event. The results from this analysis showed that although Twitter participants identified as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. If you want to know more about this new work, below we provide the abstract to the paper, a selection of figures and tables (including our methodology, some summary information about our data corpus and some of the results). Finally, at the bottom of this post we have the full reference and a link to the paper.
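
The comparison described above (bots as a share of all accounts versus bots as a share of the top-ranked accounts) can be sketched in a few lines. This is only an illustrative toy, not the pipeline from the paper: the edge list and the bot label are invented, and it uses undirected degree as a stand-in for centrality.

```python
from collections import Counter

def degree_centrality(edges):
    """Count how many edges touch each node (undirected degree)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree

def bot_share_in_top(degree, bots, k):
    """Fraction of the k highest-degree nodes that are flagged as bots."""
    top_nodes = [node for node, _ in degree.most_common(k)]
    return sum(node in bots for node in top_nodes) / len(top_nodes)

# Made-up retweet/mention edges; "bot_a" is the only account flagged as a bot.
edges = [
    ("bot_a", "u1"), ("bot_a", "u2"), ("bot_a", "u3"), ("bot_a", "u4"),
    ("u1", "u2"), ("u2", "u3"), ("u5", "u1"), ("u6", "u2"),
]
bots = {"bot_a"}

degree = degree_centrality(edges)
population_share = len(bots) / len(degree)   # bots among all accounts
top_share = bot_share_in_top(degree, bots, k=2)
print(f"bots among all accounts: {population_share:.2f}")
print(f"bots among top-2 by degree: {top_share:.2f}")
```

A top-ranked share well above the population share is the pattern of interest here. The same comparison could be repeated with other centrality measures (e.g. eigenvector or betweenness) computed by a graph library.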

Abstract:
The emergence of social bots within online social networks (OSNs) to diffuse information at scale has given rise to many efforts to detect them. While methodologies employed to detect the evolving sophistication of bots continue to improve, much work can be done to characterize the impact of bots on communication networks. In this study, we present a framework to describe the pervasiveness and relative importance of participants recognized as bots in various OSN conversations. Specifically, we harvested over 30 million tweets from three major global events in 2016 (the U.S. Presidential Election, the Ukrainian Conflict and Turkish Political Censorship) and compared the conversational patterns of bots and humans within each event. We further examined the social network structure of each conversation to determine if bots exhibited any particular network influence, while also determining bot participation in key emergent network communities. The results showed that although participants recognized as social bots comprised only 0.28% of all OSN users in this study, they accounted for a significantly large portion of prominent centrality rankings across the three conversations. This includes the identification of individual bots as top-10 influencer nodes out of a total corpus consisting of more than 2.8 million nodes. 

Keywords: bots, online social networks, social network analysis.
Fig. 1. Overall methodology to analyze bot evidence across multiple Twitter OSN conversations.
Table 1. Harvested Twitter Corpus Overview
Fig. 5. Correlation of centrality measures for select centrality comparisons: (a) U.S. Election eigenvector versus betweenness analysis, (b) Ukraine Conflict eigenvector versus betweenness analysis and (c) Ukraine Conflict eigenvector versus degree analysis.
Table 2. Bot density of largest emergent communities.

Full Reference:
Schuchard, R., Crooks, A.T., Stefanidis, A. and Croitoru, A. (2018), Bots in Nets: Empirical Comparative Analysis of Bot Evidence in Social Networks, in Aiello, L.M., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P. and Rocha, L.M. (eds.), Volume 2, Proceedings of the 7th International Conference on Complex Networks and Their Applications, Cambridge, United Kingdom, Springer, pp 424-436. (pdf)