Tuesday, September 01, 2020

Beyond Words: Comparing Structure, Emoji Use, and Consistency Across Social Media Posts

Continuing our work on emojis, at the forthcoming International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS for short), we (Melanie Swartz, Arie Croitoru and I) have a paper entitled "Beyond Words: Comparing Structure, Emoji Use, and Consistency Across Social Media Posts." In the paper we introduce and demonstrate a language-agnostic methodology to characterize structures of content and emoji use within a document (in this case a tweet), measure the consistency of structures across a set of documents, and cluster documents and users with similar patterns and behavior. We used a corpus of 44 million tweets related to the 2018 U.S. midterm elections, collected in October and November 2018 based on keywords, hashtags, and user accounts associated with candidates or political parties. From this corpus we were able to gain insights into the unique or shared structures of communication styles and emoji use of over 3.3 million unique users and of user roles such as journalists, bots and others. If this sounds of interest to you, below we provide the abstract of the paper and some tables and figures of our findings, along with the full reference and a link to the paper. Furthermore, if you are interested in extending this work to other areas, Melanie has made the code available at https://github.com/msemoji/.
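To make the idea of a "structure" a little more concrete, here is a minimal Python sketch. It is not the code from the repository above and not the exact encoding used in the paper; the content-type codes (T for text, E for emoji, H for hashtag, M for mention, U for URL), the function names `structure` and `emoji_use`, and the rough emoji character ranges are all assumptions made for illustration.

```python
import re
from itertools import groupby

# One named alternative per content type; the group name doubles as the code.
# The emoji class is a rough approximation (assumption): it covers the main
# pictograph blocks, not every Unicode emoji, and treats each code point
# as one emoji.
TOKEN = re.compile(
    r"(?P<U>https?://\S+)"
    r"|(?P<M>@\w+)"
    r"|(?P<H>#\w+)"
    r"|(?P<E>[\U0001F300-\U0001FAFF\u2600-\u27BF])"
    r"|(?P<T>\w+)"
)


def structure(doc):
    """Order of content types plus the count in each run, e.g. 'T3 E3 H1'."""
    codes = [m.lastgroup for m in TOKEN.finditer(doc)]
    return " ".join(f"{code}{len(list(run))}" for code, run in groupby(codes))


def emoji_use(doc):
    """Position, order, and repetition of emojis; None if there are none."""
    matches = list(TOKEN.finditer(doc))
    idx = [i for i, m in enumerate(matches) if m.lastgroup == "E"]
    if not idx:
        return None
    emojis = [matches[i].group() for i in idx]
    if idx[0] == 0:
        position = "start"
    elif idx[-1] == len(matches) - 1:
        position = "end"
    else:
        position = "middle"
    return {
        "position": position,
        # Distinct emojis in the order they first appear.
        "order": list(dict.fromkeys(emojis)),
        # How many emoji occurrences repeat an emoji already seen.
        "repeats": len(emojis) - len(set(emojis)),
    }
```

For a made-up tweet such as "Go vote today! 🗳🗳🗳 #Midterms2018 @user https://example.com/x", `structure` returns `T3 E3 H1 M1 U1`. Because the encoding only looks at content types and counts, it does not depend on the language the words are written in.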

Abstract
Social media content analysis often focuses on just the words used in documents or by users and often overlooks the structural components of document composition and linguistic style. We propose that document structure and emoji use are also important to consider as they are impacted by individual communication style preferences and social norms associated with user role and intent, topic domain, and dissemination platform. In this paper we introduce and demonstrate a novel methodology to conduct structural content analysis and measure user consistency of document structures and emoji use. Document structure is represented as the order of content types and number of features per document, and emoji use is characterized by the attributes, position, order, and repetition of emojis within a document. With these structures we identified user signatures of behavior, clustered users based on consistency of structures utilized, and identified users with similar document structures and emoji use such as those associated with bots, news organizations, and other user types. This research complements existing text mining and behavior modeling approaches by offering a language-agnostic methodology with lower dimensionality than topic modeling, and focuses on three features often overlooked: document structure, emoji use, and consistency of behavior.
Keywords: Data Mining, Social Media, Emojis, User Behavior Modeling.
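As a rough illustration of what "consistency of structures" could look like in practice, the sketch below scores how often a single user reuses the same structure. The abstract does not spell out the measure used in the paper, so both scores here (the share of the most common structure and one minus normalized Shannon entropy) are assumptions, as are the function name `consistency` and the structure labels such as "T3 E1".

```python
from collections import Counter
from math import log2


def consistency(structures):
    """Return (dominant_share, evenness_score) for one user's structures.

    Hypothetical measures, not the paper's:
    - dominant_share: fraction of posts using the most common structure.
    - evenness_score: 1 - H / log2(n), where H is the Shannon entropy of the
      structure distribution and n is the number of posts. It is 1.0 when
      every post shares one structure and 0.0 when every post differs.
    """
    counts = Counter(structures)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one structure")
    dominant_share = counts.most_common(1)[0][1] / total
    if total == 1:
        # A single post is trivially consistent with itself.
        return dominant_share, 1.0
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return dominant_share, 1 - entropy / log2(total)


# Example: a user who mostly posts text followed by a single emoji.
posts = ["T3 E1", "T3 E1", "T3 E1", "T5 U1"]
print(consistency(posts))
```

Scores like these give one small vector per user, which could then be fed into a standard clustering step to group users with similar behavior.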
Emoji attributes

Most common content structures with emojis for non-retweets.

Clusters of users with similar behavior across two factors in non-retweets (left) and retweets (right). Colors indicate cluster assignments.

Full Reference: 
Swartz, M., Crooks, A.T. and Croitoru, A. (accepted), Beyond Words: Comparing Structure, Emoji Use, and Consistency Across Social Media Posts, 2020 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, Washington DC.

If you would like a pre-print of the paper, just let us know and we can email you one.

 
