Wednesday, January 06, 2021

Elections and Bots

Continuing our work on bots, Ross Schuchard and I have a new paper in PLOS ONE entitled "Insights into elections: An ensemble bot detection coverage framework applied to the 2018 U.S. midterm elections." Our motivation for the work came from the fact that during elections, internet-based technological platforms (e.g., online social networks (OSNs), online political blogs, etc.) are gaining more power compared to mainstream media sources (e.g., print, television and radio). While such technologies reduce the barrier for individuals to actively participate in political dialogue, the relatively unsupervised nature of OSNs increases susceptibility to misinformation campaigns, especially with respect to political and election dialogue. This is especially the case for social bots: automated software agents designed to mimic or impersonate humans, which are prevalent actors in OSN platforms and have proven to amplify misinformation.

The issue, however, is that no single detection algorithm is able to account for the myriad of social bots operating in OSNs. To overcome this, our research incorporates multiple social bot detection services to determine the prevalence and relative importance of social bots within an OSN conversation of tweets. Through the lens of the 2018 U.S. midterm elections, 43.5 million tweets capturing the election conversation were harvested and then analyzed for evidence of bots using three bot detection platform services: Botometer, DeBot and Bot-hunter.

We found that bot and human accounts contributed temporally to our tweet election corpus at relatively similar cumulative rates. The multi-detection-platform comparative analysis of intra-group and cross-group interactions showed that bots detected by DeBot and Bot-hunter persistently engaged humans at much higher rates than bots detected by Botometer. Furthermore, while bots accounted for less than 8% of all unique accounts in the election conversation retweet network, they accounted for more than 20% of the top-100 and top-25 out-degree centrality rankings, suggesting persistent activity to engage with human accounts. Finally, the bot coverage overlap analysis showed that minimal overlap existed among the bots detected by the three bot detection platforms, with only eight bot accounts detected by all three (out of 254,492 unique bots in the overall tweet corpus).
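To give a feel for the coverage-overlap analysis, here is a minimal sketch of how agreement among several detectors can be compared with set operations (the account IDs below are made up for illustration; they are not our actual data):

```python
# Illustrative coverage-overlap analysis: each detection platform yields a
# set of account IDs it flagged as bots; we compare their union and
# three-way intersection.
debot = {"a1", "a2", "a3", "a4"}
botometer = {"a3", "a5", "a6"}
bothunter = {"a3", "a4", "a7"}

all_bots = debot | botometer | bothunter          # unique bots overall
flagged_by_all = debot & botometer & bothunter    # agreed upon by all three

print(len(all_bots))     # 7
print(flagged_by_all)    # {'a3'}
```

In the paper, the same idea is applied at the scale of hundreds of thousands of accounts, which is what makes the tiny three-way intersection (eight accounts) so striking.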

If this research sounds interesting to you, below we provide the abstract to the paper along with some figures outlining our methodology and some of the results. At the bottom of the post you can find the full reference and a link to the paper where you can read more.

Abstract:

The participation of automated software agents known as social bots within online social network (OSN) engagements continues to grow at an immense pace. Choruses of concern speculate as to the impact social bots have within online communications as evidence shows that an increasing number of individuals are turning to OSNs as a primary source for information. This automated interaction proliferation within OSNs has led to the emergence of social bot detection efforts to better understand the extent and behavior of social bots. While rapidly evolving and continually improving, current social bot detection efforts are quite varied in their design and performance characteristics. Therefore, social bot research efforts that rely upon only a single bot detection source will produce very limited results. Our study expands beyond the limitation of current social bot detection research by introducing an ensemble bot detection coverage framework that harnesses the power of multiple detection sources to detect a wider variety of bots within a given OSN corpus of Twitter data. To test this framework, we focused on identifying social bot activity within OSN interactions taking place on Twitter related to the 2018 U.S. Midterm Election by using three available bot detection sources. This approach clearly showed that minimal overlap existed between the bot accounts detected within the same tweet corpus. Our findings suggest that social bot research efforts must incorporate multiple detection sources to account for the variety of social bots operating in OSNs, while incorporating improved or new detection methods to keep pace with the constant evolution of bot complexity.

 

Fig 1. Social bot analysis framework employing multiple bot detection platforms. The framework enables the application of ensemble analysis methods to determine the prevalence and relative importance of social bots within Twitter conversations discussing the 2018 U.S. midterm elections.
 

Fig 3. Cumulative tweet contribution rates for the 2018 U.S. midterm OSN conversation (October 10 – November 6, 2018) from the (a) human (blue) / bot (red) and (b) DeBot (green) / Botometer (pink) / Bot-hunter (orange) account classification perspectives.

Fig 4. Intra-group and cross-group retweet communication patterns of human (blue) and social bot (red) users within the 2018 U.S. midterm election Twitter conversation according to each bot detection classification platform: (a) Combined Bot Sources (b) DeBot (c) Botometer (d) Bot-hunter. The combined bot sources results (shown in gray) classified an account as a bot in aggregate fashion if any of the three detection platforms classified the account as a bot.

Fig 5. Social bot account evidence within the top-N (where N = 1000 / 500 / 100 / 25) centrality rankings [(a) eigenvector (b) in-degree (c) out-degree (d) PageRank] according to bot classification results from Bot-hunter (orange), Botometer (pink) and DeBot (green).
 
Fig 7. Bot detection coverage analysis for bots detected within the 2018 U.S. midterm election Twitter conversation using the Botometer, Bot-hunter and DeBot bot detection platforms.

 

Full reference:

Schuchard, R.J. and Crooks, A.T. (2021), Insights into Elections: An Ensemble Bot Detection Coverage Framework Applied to the 2018 U.S. Midterm Elections, PLoS ONE, 16(1): e0244309. Available at https://doi.org/10.1371/journal.pone.0244309 (pdf).

Friday, December 04, 2020

Future Developments in Geographical Agent-Based Models: Challenges and Opportunities

It has been a while (to say the least) since we wrote a position paper about agent-based modeling. But with agent-based modeling becoming more widely accepted and the growth of machine learning within the geographical sciences, we thought we would revisit some of the existing challenges (e.g., validation, representing behavior) and discuss how machine learning and data might help. To this end, Alison Heppenstall, Nick Malleson, Ed Manley, Jiaqi Ge, Mike Batty and I have recently published a paper entitled "Future Developments in Geographical Agent-Based Models: Challenges and Opportunities" in Geographical Analysis. Below we provide the abstract to the paper, and if this is of interest please follow the links to the paper itself.

Abstract

Despite reaching a point of acceptance as a research tool across the geographical and social sciences, there remain significant methodological challenges for agent-based models. These include recognizing and simulating emergent phenomena, agent representation, construction of behavioral rules, calibration and validation. Whilst advances in individual-level data and computing power have opened up new research avenues, they have also brought with them a new set of challenges. This paper reviews some of the challenges that the field has faced, the opportunities available to advance the state-of-the-art, and the outlook for the field over the next decade. We argue that although agent-based models continue to have enormous promise as a means of developing dynamic spatial simulations, the field needs to fully embrace the potential offered by approaches from machine learning to allow us to fully broaden and deepen our understanding of geographical systems.

Full Reference:

Heppenstall, A., Crooks, A.T., Malleson, N., Manley, E., Ge, J. and Batty, M. (2020), Future Developments in Geographical Agent-Based Models: Challenges and Opportunities, Geographical Analysis. https://doi.org/10.1111/gean.12267 (pdf)

Tuesday, November 03, 2020

Integrating Social Networks into Large-scale Urban Simulations

Building on past posts about our work on generating large-scale synthetic populations for agent-based models, we have a new paper entitled "Integrating Social Networks into Large-scale Urban Simulations for Disaster Responses" that was accepted at the 3rd ACM SIGSPATIAL International Workshop on GeoSpatial Simulation. In the paper we discuss our method for creating synthetic populations that incorporate social networks, which we use to generate a population for the New York megacity region. To demonstrate the utility of our approach, we use the generated synthetic population to initialize an agent-based model which not only generates basic patterns of life (e.g., commuting to and from work), but also allows us to explore how people react to disasters and how their social networks are changed by such events.

If this sounds of interest to you, below we provide the abstract to the paper, along with our synthetic population workflow and some sample outcomes from the model. At the bottom of the post we provide the full reference and a link to the paper (the paper itself also links to a GitHub repository where more information about the synthetic population can be found).

ABSTRACT: Social connections between people influence how they behave and where they go; however, such networks are rarely incorporated in agent-based models of disaster. To address this, we introduce a novel synthetic population method which specifically creates social relationships. This synthetic population is then used to instantiate a geographically explicit agent-based model for the New York megacity region which captures pre- and post- disaster behaviors. We demonstrate not only how social networks can be incorporated into models of disaster but also how such networks can impact decision making, opening up a variety of new application areas where network structures matter in urban settings. 

KEYWORDS: Urban Simulation, Agent-based models, Synthetic Populations, Social Networks, Geographical Information Systems, Disasters.

Synthetic population and social network generation workflow.
Synthetic population at household level within a census tract (A) and social network of one individual (B).
Example of a heat-map of traffic density, with Manhattan at the center of the plot (A), and the impact area of the disaster and the health status of the agents (B).
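As a schematic of the abstract's central idea that social-network neighbors influence behavior such as evacuation, here is a toy decision rule of our own for illustration (it is not the model's actual decision logic, and the names are invented):

```python
# Toy rule: an agent evacuates if the share of its social-network neighbors
# who have already evacuated meets or exceeds a threshold.
def decides_to_evacuate(agent, evacuated, network, threshold=0.5):
    neighbors = network.get(agent, [])
    if not neighbors:
        return False
    share = sum(1 for n in neighbors if n in evacuated) / len(neighbors)
    return share >= threshold

network = {"ann": ["ben", "cat"], "ben": ["ann"], "cat": ["ann", "ben"]}
evacuated = {"ben"}

print(decides_to_evacuate("ann", evacuated, network))  # True (1 of 2 neighbors)
print(decides_to_evacuate("cat", evacuated, network))  # True (1 of 2 neighbors)
```

The point of rules like this is that behavior propagates along network ties, which is why the structure of the synthetic population's social network matters to the simulation outcomes.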

Full Reference:

Jiang, N., Burger, A., Crooks, A.T. and Kennedy, W.G. (2020), Integrating Social Networks into Large-scale Urban Simulations for Disaster Responses. Geosim ’20: 3rd ACM SIGSPATIAL International Workshop on GeoSpatial Simulation, Seattle, WA. (pdf)

Thursday, October 15, 2020

The Impact of Mandatory Remote Work during the COVID-19 Pandemic

In the past we have written about using agent-based modeling to study human resources management issues and how workplace layout might impact subordinates' interactions with managers, but with growing amounts of data we can explore how employees communicate with each other. To this end, Talha Oz and I have a new paper entitled "Exploring the Impact of Mandatory Remote Work during the COVID-19 Pandemic" which will be presented in a special session on COVID-19 at the 2020 International Conference on Social Computing, Behavioral-Cultural Modeling, & Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS 2020 for short).

In this study we exploit metadata (not content) emitted from commonplace workplace technologies, such as calendar and workplace messaging apps, collected from a tech company in order to see how mandatory remote work changed communication patterns and how such data can be used to measure organizational health. If this is of interest to you, below we provide the abstract to the paper along with some of the results with respect to how meetings and communication patterns changed from business as usual (BAU) pre-pandemic to when people were forced to work from home (WFH). Finally, at the bottom of the post we provide the full reference and the link to the paper.

Abstract. During the early months of the COVID-19 pandemic, millions of people had to work from home. We examine the ways in which COVID-19 affected organizational communication by analyzing five months of calendar and messaging metadata from a technology company. We found that: (i) cross-level communication increased more than that of same-level, (ii) while within-team messaging increased considerably, meetings stayed the same, (iii) off-hours messaging became much more frequent, and that this effect was stronger for women; (iv) employees respond to non-managers faster than managers; finally, (v) the number of short meetings increased while long meetings decreased. These findings contribute to theories on organizational communication, remote work, management, and flexibility stigma. Besides, this study exemplifies a strategy to measure organizational health using an objective (not self-report based) method. To the best of our knowledge, this is the first study using workplace communication metadata to examine the heterogeneous effects of mandatory remote work.

Keywords: Work from Home, Communication, COVID-19, Organization.
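As a toy illustration of the kind of metadata analysis involved, one can compute the share of messages sent outside business hours from timestamps alone, without touching content (the records and field layout below are invented for the sketch; they are not the study's actual schema):

```python
from datetime import datetime

# Hypothetical message metadata: (sender, timestamp); no content is needed.
messages = [
    ("alice", datetime(2020, 3, 2, 10, 15)),  # during business hours
    ("alice", datetime(2020, 3, 2, 21, 40)),  # off-hours
    ("bob",   datetime(2020, 3, 3, 8, 5)),    # off-hours (before 9am)
    ("bob",   datetime(2020, 3, 3, 14, 30)),  # during business hours
]

def off_hours_share(msgs, start_hour=9, end_hour=18):
    """Fraction of messages sent outside the start_hour-end_hour window."""
    off = sum(1 for _, ts in msgs if not (start_hour <= ts.hour < end_hour))
    return off / len(msgs)

print(off_hours_share(messages))  # 0.5
```

Comparing a measure like this between the BAU and WFH periods is the general shape of the before/after analysis, scaled up to five months of real calendar and messaging metadata.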




Full Reference:

Oz, T. and Crooks, A.T. (2020), Exploring the Impact of Mandatory Remote Work during the COVID-19 Pandemic, 2020 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Washington DC. (pdf)

If you would like a pre-print of the paper, just let us know and we can email you one.

Utilizing Python for Agent-based Modeling: The Mesa Framework

In the past I have mentioned Mesa, an agent-based modeling framework in Python, in several posts, but have not really discussed it in detail. This is about to change with this post. The reason is that we have a paper at the forthcoming International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (or SBP-BRiMS for short) entitled "Utilizing Python for Agent-based Modeling: The Mesa Framework".

While Mesa started off with two students from the CSS program at George Mason University, Jackie Kazil and David Masad, it has now grown to include over 70 contributors. In this new paper we discuss the rationale for developing Mesa (see https://github.com/projectmesa/mesa), which arose because there was no framework for easily building agent-based models in Python. Furthermore, we discuss Mesa's design goals, architecture and usage, along with who is using Mesa and extensions to it (e.g., Mesa-Geo, Multi-Level Mesa). Finally, we conclude the paper with future development directions. Below we provide the abstract to the paper and a selection of figures which highlight Mesa's model components (model, analysis and visualization), how various activation schedules are incorporated within Mesa, an illustration of how these different schemes impact a model, and some examples of Mesa's visualization functionality. At the bottom of the post we have the full reference and a link to the paper.

Abstract.
Mesa is an agent-based modeling framework written in Python. Originally started in 2013, it was created to be the go-to tool for researchers wishing to build agent-based models with Python. Within this paper we present Mesa's design goals, along with its underlying architecture. This includes its core components: 1) the model (Model, Agent, Schedule, and Space), 2) analysis (Data Collector and Batch Runner) and 3) the visualization (Visualization Server and Visualization Browser Page). We then discuss how agent-based models can be created in Mesa. This is followed by a discussion of applications and extensions by other researchers to demonstrate how Mesa's design is decoupled and extensible, thus creating the opportunity for a larger decentralized ecosystem of packages that people can share and reuse for their own needs. Finally, the paper concludes with a summary and discussion of future development areas for Mesa.

Keywords: Agent-based Modeling, Python, Framework, Complex Systems. 
Mesa model components: model, analysis and visualization.
Activation schedules within Mesa and an illustration of how these different schemes impact a model. In this case the Prisoner’s Dilemma. Defecting agents are in red and cooperating agents are in blue. Each image is from the same step, but different activation schemes are used.
Model visualization of two Mesa applications within a web browser: (A) Wolf-sheep predation Model. (B) Virus on a network (Source: https://github.com/projectmesa).
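As a toy illustration of why activation schedules matter, a random-activation scheduler shuffles the agents before stepping them each tick so that no agent systematically moves first. The sketch below is plain Python written for this post, not Mesa's actual API; the wealth-transfer rule is a standard ABM teaching example:

```python
import random

class ToyAgent:
    def __init__(self, unique_id):
        self.unique_id = unique_id
        self.wealth = 1

    def step(self, model):
        # Give one unit of wealth to a randomly chosen agent.
        if self.wealth > 0:
            other = random.choice(model.agents)
            other.wealth += 1
            self.wealth -= 1

class ToyModel:
    def __init__(self, n, seed=42):
        random.seed(seed)
        self.agents = [ToyAgent(i) for i in range(n)]

    def step(self):
        # Random activation: reshuffle the stepping order every tick.
        order = list(self.agents)
        random.shuffle(order)
        for agent in order:
            agent.step(self)

model = ToyModel(10)
for _ in range(50):
    model.step()

print(sum(a.wealth for a in model.agents))  # 10: total wealth is conserved
```

Swapping the shuffle for a fixed ordering (or stepping all agents "simultaneously" against a snapshot of the previous state) changes the dynamics, which is exactly the effect the Prisoner's Dilemma figure above demonstrates.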

Full Reference: 
Kazil, J., Masad, D. and Crooks, A.T. (2020), Utilizing Python for Agent-based Modeling: The Mesa Framework, in Thomson, R., Bisgin, H., Dancy, C., Hyder, A. and Hussain, M. (eds), 2020 International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation, Washington DC., pp. 308-317. (pdf)