Wednesday, January 06, 2021

Elections and Bots

Continuing our work on bots, Ross Schuchard and I have a new paper in PLOS ONE entitled "Insights into elections: An ensemble bot detection coverage framework applied to the 2018 U.S. midterm elections." Our motivation for the work came from the fact that during elections internet-based technological platforms (e.g., online social networks (OSNs), online political blogs, etc.) are gaining more power compared to mainstream media sources (e.g., print, television and radio). While such technologies are lowering the barrier for individuals to actively participate in political dialogue, the relatively unsupervised nature of OSNs increases susceptibility to misinformation campaigns, especially with respect to political and election dialogue. This is especially the case for social bots, automated software agents designed to mimic or impersonate humans, which are prevalent actors on OSN platforms and have been shown to amplify misinformation.

The issue, however, is that no single detection algorithm can account for the myriad social bots operating in OSNs. To overcome this, our research incorporates multiple social bot detection services to determine the prevalence and relative importance of social bots within an OSN conversation of tweets. Through the lens of the 2018 U.S. midterm elections, we harvested 43.5 million tweets capturing the election conversation and then analyzed them for evidence of bots using three bot detection platform services: Botometer, DeBot and Bot-hunter.
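To illustrate the any-platform rule described in the Fig 4 caption further down, the combination step can be sketched in a few lines of Python. This is a minimal sketch and not the code used in the paper: the function names, platform keys and account IDs are hypothetical, and it assumes each platform's output has already been reduced to a set of flagged account IDs.

```python
from typing import Dict, Set


def combine_bot_labels(flags: Dict[str, Set[str]]) -> Set[str]:
    """Accounts flagged as bots by at least one platform (set union)."""
    combined: Set[str] = set()
    for accounts in flags.values():
        combined |= accounts
    return combined


def flagged_by_all(flags: Dict[str, Set[str]]) -> Set[str]:
    """Accounts flagged as bots by every platform (set intersection)."""
    sets = list(flags.values())
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical toy input: platform name -> flagged account IDs.
flags = {
    "botometer": {"acct_1", "acct_2"},
    "debot": {"acct_2", "acct_3"},
    "bot_hunter": {"acct_2", "acct_4"},
}
all_bots = combine_bot_labels(flags)   # combined bot set
shared = flagged_by_all(flags)         # coverage overlap across all platforms
```

Comparing the size of the intersection with the size of the union gives a quick sense of how much the platforms agree on the same corpus.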

We found that bot and human accounts contributed temporally to our tweet election corpus at relatively similar cumulative rates. The multi-detection platform comparative analysis of intra-group and cross-group interactions showed that bots detected by DeBot and Bot-hunter persistently engaged humans at rates much higher than bots detected by Botometer. Furthermore, while bots accounted for less than 8% of all unique accounts in the election conversation retweet network, they accounted for more than 20% of the accounts in the top-100 and top-25 out-degree centrality rankings, suggesting persistent activity to engage with human accounts. Finally, the bot coverage overlap analysis shows that minimal overlap existed among the bots detected by the three bot detection platforms, with only eight bot accounts detected by all three (out of a total of 254,492 unique bots in the overall tweet corpus).
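For readers who want to try the top-N ranking comparison on their own data, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function name and toy accounts are hypothetical, each retweet is assumed to be a directed (source, target) pair whose direction convention you choose yourself, and out-degree is taken as the number of distinct accounts a source points to.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def bot_share_in_top_n(
    edges: Iterable[Tuple[str, str]], bots: Set[str], n: int
) -> float:
    """Fraction of the top-n accounts by out-degree that are labeled as bots."""
    targets: Dict[str, Set[str]] = defaultdict(set)
    for source, target in edges:
        if source != target:  # ignore self-loops
            targets[source].add(target)
    # Rank by out-degree (descending); break ties by account ID for determinism.
    ranked = sorted(targets, key=lambda acct: (-len(targets[acct]), acct))
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for acct in top if acct in bots) / len(top)


# Hypothetical toy retweet network with one bot and three human accounts.
edges = [
    ("bot_1", "human_1"), ("bot_1", "human_2"), ("bot_1", "human_3"),
    ("human_1", "human_2"), ("human_1", "human_3"),
    ("human_2", "human_1"),
    ("human_3", "human_1"),
]
bots = {"bot_1"}
share_top_2 = bot_share_in_top_n(edges, bots, 2)
share_top_4 = bot_share_in_top_n(edges, bots, 4)
```

In this toy network the single bot is one of four accounts yet holds the top out-degree rank, which mirrors the kind of over-representation reported above.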

If this research sounds interesting to you, below we provide the abstract to the paper along with some figures outlining our methodology and some of the results. At the bottom of the post you can see the full reference and a link to the paper, where you can read more.

Abstract:

The participation of automated software agents known as social bots within online social network (OSN) engagements continues to grow at an immense pace. Choruses of concern speculate as to the impact social bots have within online communications as evidence shows that an increasing number of individuals are turning to OSNs as a primary source for information. This automated interaction proliferation within OSNs has led to the emergence of social bot detection efforts to better understand the extent and behavior of social bots. While rapidly evolving and continually improving, current social bot detection efforts are quite varied in their design and performance characteristics. Therefore, social bot research efforts that rely upon only a single bot detection source will produce very limited results. Our study expands beyond the limitation of current social bot detection research by introducing an ensemble bot detection coverage framework that harnesses the power of multiple detection sources to detect a wider variety of bots within a given OSN corpus of Twitter data. To test this framework, we focused on identifying social bot activity within OSN interactions taking place on Twitter related to the 2018 U.S. Midterm Election by using three available bot detection sources. This approach clearly showed that minimal overlap existed between the bot accounts detected within the same tweet corpus. Our findings suggest that social bot research efforts must incorporate multiple detection sources to account for the variety of social bots operating in OSNs, while incorporating improved or new detection methods to keep pace with the constant evolution of bot complexity.

 

Fig 1. Social bot analysis framework employing multiple bot detection platforms. The framework enables the application of ensemble analysis methods to determine the prevalence and relative importance of social bots within Twitter conversations discussing the 2018 U.S. midterm elections.
 

Fig 3. Cumulative tweet contribution rates for the 2018 U.S. midterm OSN conversation (October 10 – November 6, 2018) from the (a) human (blue) / bot (red) and (b) DeBot (green) / Botometer (pink) / Bot-hunter (orange) account classification perspectives.

Fig 4. Intra-group and cross-group retweet communication patterns of human (blue) and social bot (red) users within the 2018 U.S. midterm election Twitter conversation according to each bot detection classification platform: (a) Combined Bot Sources (b) DeBot (c) Botometer (d) Bot-hunter. The combined bot sources results (shown in gray) classified an account as a bot in aggregate fashion if any of the three detection platforms classified the account as a bot.

Fig 5. Social bot account evidence within the top-N (where, N = 1000 / 500 / 100 / 25) centrality rankings [(a) eigenvector (b) in-degree (c) out-degree (d) PageRank] according to bot classification results from Bot-hunter (orange), Botometer (pink) and DeBot (green).
 
Fig 7. Bot detection coverage analysis for bots detected within the 2018 U.S. midterm election Twitter conversation using the Botometer, Bot-hunter and DeBot bot detection platforms.

 

Full reference:

Schuchard, R.J. and Crooks, A.T. (2021), Insights into Elections: An Ensemble Bot Detection Coverage Framework Applied to the 2018 U.S. Midterm Elections, PLoS ONE, 16(1): e0244309. Available at https://doi.org/10.1371/journal.pone.0244309.
