Integrity in Social Networks and Media
Integrity 2022, the third edition of the Integrity Workshop, is an event co-located with the WSDM conference and will take place online on February 25th, 2022 (MST timezone, Phoenix, AZ). Integrity 2022 aims to build on the success of the previous editions, Integrity 2020 and Integrity 2021.
Register through the WSDM one-day registration process: https://www.wsdm-conference.org/2022/
In the past decade, social networks and social media sites, such as Facebook and Twitter, have become the default channels of communication and information. The popularity of these online portals has exposed a collection of integrity issues: cases where the content produced and exchanged compromises the quality, operation, and eventually the integrity of the platform. Examples include misinformation, low-quality and abusive content and behavior, and polarization and opinion extremism. There is an urgent need to detect and mitigate the effects of these integrity issues in a timely, efficient, and unbiased manner.
This workshop aims to bring together top researchers and practitioners from academia and industry to engage in a discussion about algorithmic and system aspects of integrity challenges. The WSDM Conference, which combines data mining and machine learning with research on the Web and information retrieval, offers the ideal forum for such a discussion, and we expect the workshop to be of interest to everyone in the community. The topic of the workshop is also interdisciplinary, as it overlaps with psychology, sociology, and economics, while also raising legal and ethical questions, so we expect it to attract a broader audience.
All times below are MST timezone (Phoenix, AZ, USA), on February 25th, 2022.
8:45 am -- Opening remarks
9:00 am -- Information Access Equality on Network Generative Models (Tina Eliassi-Rad)
9:40 am -- Understanding and building a map of the news landscape (Sibel Adali)
10:20 am -- Break
10:30 am -- Characterizing Information diffusion: Social influence, propagation speed, polarization (Giuseppe Manco)
11:10 am -- How Pinterest powers a safe and trustworthy online experience at scale (Vishwakarma Singh)
11:50 am -- Break
12:00 pm -- Beyond traditional Integrity: Personalization, Sentiment, and Sensitive Content (David Vickrey)
12:40 pm -- Optimizing safety and resilience in mental health content moderation (Anthony McCosker and Yong-Bin Kang)
1:20 pm -- Closing
Information Access Equality on Network Generative Models
by Tina Eliassi-Rad (Northeastern/ Network Science Institute)
Abstract: It is well known that networks generated by common mechanisms such as preferential attachment and homophily can disadvantage the minority group by limiting their ability to establish links with the majority group. This has the effect of limiting minority nodes’ access to information. We present the results of an empirical study on the equality of information access in network models with different growth mechanisms and spreading processes. For growth mechanisms, we focus on the majority/minority dichotomy, homophily, preferential attachment, and diversity. For spreading processes, we investigate simple vs. complex contagions, different transmission rates within and between groups, and various seeding conditions. We observe two phenomena. First, information access equality is a complex interplay between network structures and the spreading processes. Second, there is a trade-off between equality and efficiency of information access under certain circumstances (e.g., when inter-group edges are low and information transmits asymmetrically). Our findings can be used to make recommendations for mechanistic design of social networks with information access equality. This is joint work with Xindi Wang and Onur Varol, and is based on the paper at https://arxiv.org/pdf/2107.02263.pdf .
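The setup the abstract describes can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: it grows a homophilous preferential-attachment network with a minority group, runs a simple contagion from one seed, and compares how much of each group the information reaches. All parameter names and values (MINORITY_FRACTION, HOMOPHILY, SPREAD_P, etc.) are illustrative assumptions, not taken from the paper.

```python
# Sketch of the talk's setting (assumed parameters, not the authors' code):
# homophilous preferential attachment + a simple contagion, measuring
# what fraction of minority vs. majority nodes the spread reaches.
import random

random.seed(0)

N = 500                  # number of nodes
M = 2                    # edges added per new node
MINORITY_FRACTION = 0.2  # size of the minority group
HOMOPHILY = 0.8          # prob. of accepting a same-group link target
SPREAD_P = 0.3           # simple contagion: per-edge transmission prob.

def grow_network():
    """Homophilous preferential attachment (Barabasi-Albert style)."""
    group = [int(random.random() < MINORITY_FRACTION) for _ in range(N)]
    adj = {i: set() for i in range(N)}
    adj[0].add(1); adj[1].add(0)
    targets = [0, 1]  # multiset where each node appears once per degree
    for v in range(2, N):
        chosen = set()
        while len(chosen) < min(M, v):
            u = random.choice(targets)  # degree-proportional candidate
            same = group[u] == group[v]
            accept = HOMOPHILY if same else 1 - HOMOPHILY
            if u != v and u not in chosen and random.random() < accept:
                chosen.add(u)
        for u in chosen:
            adj[v].add(u); adj[u].add(v)
            targets += [u, v]
    return group, adj

def simple_contagion(adj, seed):
    """Simple contagion: each edge transmits once with prob. SPREAD_P."""
    active, frontier = {seed}, [seed]
    while frontier:
        nxt = []
        for u in frontier:
            for w in adj[u]:
                if w not in active and random.random() < SPREAD_P:
                    active.add(w)
                    nxt.append(w)
        frontier = nxt
    return active

group, adj = grow_network()
reached = simple_contagion(adj, seed=0)
minority = {i for i in range(N) if group[i] == 1}
majority = {i for i in range(N) if group[i] == 0}
cov_min = len(reached & minority) / max(1, len(minority))
cov_maj = len(reached & majority) / max(1, len(majority))
print(f"minority coverage: {cov_min:.2f}, majority coverage: {cov_maj:.2f}")
```

Sweeping HOMOPHILY and SPREAD_P in a sketch like this is one way to observe the interplay the abstract mentions: with few inter-group edges and asymmetric transmission, the two coverage numbers diverge.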
Bio: Tina Eliassi-Rad is a Professor of Computer Science at Northeastern University. She is also a core faculty member at Northeastern's Network Science Institute and the Institute for Experiential AI. In addition, she is an external faculty member at the Santa Fe Institute and the Vermont Complex Systems Center. Her research is at the intersection of data mining, machine learning, and network science. Tina's work has been applied to personalized search on the World-Wide Web, statistical indices of large-scale scientific simulation data, fraud detection, mobile ad targeting, cyber situational awareness, and ethics in machine learning. Her algorithms have been incorporated into systems used by the government and industry (e.g., IBM System G Graph Analytics) as well as open-source software (e.g., Stanford Network Analysis Project). Tina received an Outstanding Mentor Award from the Office of Science at the US Department of Energy in 2010; became a Fellow of the ISI Foundation (in Turin Italy) in 2019; and was named one of the 100 Brilliant Women in AI Ethics for 2021.
Understanding and building a map of the news landscape
by Sibel Adali (Rensselaer Polytechnic Institute)
Abstract: In this talk, I will present both qualitative and quantitative work that aims to understand relationships (similarities and differences) between news and media sources. I will first present an analysis of source behavior in the form of content sharing. I will then show an analysis of word usage between sources using the self-supervised semantic detection method developed by my group. I will illustrate how these methods can enhance methods developed for misinformation detection, as well as open up the potential for novel intervention methods.
Bio: Sibel Adali is a Professor of Computer Science at Rensselaer Polytechnic Institute and the Associate Dean of Science for Research and Graduate Studies. Adali obtained her PhD from the University of Maryland College Park. Her current work concentrates on interdisciplinary problems related to social and information trust, networks, information retrieval and misinformation. She is the author of the book "Modeling Trust Context in Networks".
Characterizing Information diffusion: Social influence, propagation speed, polarization
by Giuseppe Manco (National Research Council, Italy)
Bio: Giuseppe Manco is director of research at the Institute of High Performance Computing and Networks of the Italian National Research Council (ICAR-CNR) and contract professor at the University of Calabria. His current research interests include machine learning, knowledge discovery and data mining, probabilistic modeling and recommender systems, social network analysis and information diffusion, and AI and cybersecurity. He is the author of more than 100 papers in these areas, as well as the co-inventor of three international patents. He was a recipient of the 2014 Yahoo FREP (Faculty Research and Engagement Program) prize for research activity on machine learning applied to social network analysis and information diffusion. He is the scientific coordinator of ICAR's research group "Behavioral Modeling and Scalable Analytics – BMSA". He was scientific coordinator of several funded scientific initiatives, and was active in several research projects supported by regional, national, and international organizations. He regularly serves as a PC member/Senior PC Member/Area Chair of several top conferences on Data Mining, Machine Learning and Artificial Intelligence (including WWW, ICDM, AAAI, IJCAI, ECML PKDD) and serves as an associate editor for the journals "Machine Learning", "Knowledge and Information Systems", and "Journal of Intelligent Information Systems".
How Pinterest powers a safe and trustworthy online experience at scale
by Vishwakarma Singh (Pinterest)
Bio: Vishwakarma Singh is the Machine Learning Technical Lead for Trust and Safety at Pinterest, where he leads strategy, innovation, and solutions for proactively fighting various forms of platform abuse at scale using machine learning. He previously worked at Apple as a Principal Machine Learning Scientist. He earned a PhD in Computer Science from the University of California, Santa Barbara. He has published many research papers in peer-reviewed conferences and journals.
Sentiment, Controls, and Sensitive Content
by David Vickrey (Meta)
Abstract: One of the most common approaches to integrity at scale is defining an array of types of problematic content and then building machine learning solutions of various types in order to automatically identify these categories. In practice, though, this approach runs into at least two major hurdles: 1) the concepts in question are difficult to define and difficult to predict, and 2) there often is no agreed-upon definition of what should be acceptable and/or favored speech. Over the last several years, one of the ways the FB App integrity team has responded to this challenge is by developing effective integrity strategies that don’t require making judgments about the quality or integrity of individual pieces of content or actors. In this talk, I’ll discuss some of these strategies, in two broad categories: better supporting user needs through sentiment surveys and controls; and structural changes to how we rank and evaluate product changes, particularly for sensitive areas such as politics.
Bio: David Vickrey is a senior Research Scientist at Meta. He has worked on many different aspects of News Feed ranking, including the development of the original machine learning system for Facebook News Feed. For the last six years, he has focused on challenging interdisciplinary aspects of ranking, in areas including integrity, polarization, and human value alignment. Prior to starting at Facebook, he received a Ph.D. at Stanford, specializing in machine learning and natural language processing.
Optimizing safety and resilience in mental health content moderation
by Anthony McCosker and Yong-Bin Kang (Swinburne University)
Abstract: Mental ill-health is treated by most social media moderation systems as borderline or problematic, with access often restricted to reduce assumed secondary harms. Our work with three mental health organisations in Australia, which provide successful mental health discussion and support platforms, identifies content moderation practices that can help to rethink how mental health is best managed. The work has two aims: 1) to optimize safety and resilience in dedicated digital mental health platforms through improved data analytics and automated moderation; 2) to draw insights from successful content moderation practices to inform the treatment of mental health content more widely across large commercial social media. Along with qualitative research with the organisations, we use semi-automatic natural language processing to establish a strengths-based resilience dictionary to inform text analysis and aid automated content moderation for borderline mental health content, challenging simplistic assessments of mental health content as inherently risky.
Bios: Anthony McCosker is Professor of media and communication at Swinburne University of Technology, Australia. He is a Chief Investigator in the ARC Centre of Excellence for Automated Decision Making and Society, and Deputy Director of Swinburne's Social Innovation Research Institute. His research addresses digital inclusion and participation, particularly in relation to health, wellbeing, and social inclusion. His current research also addresses the need for community-led approaches to data and analytics capability. He is co-author of the forthcoming book Everyday Data Cultures (Polity Press).
Yong-Bin Kang is a Senior Research Fellow in Data Science in the ARC Centre of Excellence for Automated Decision Making and Society. His research expertise and interests are mainly in the fields of natural language processing (NLP), machine learning (ML), and data mining (DM). He has experience working on, managing, and delivering large industrial, multidisciplinary research projects in data science, such as patent analytics, clinical data analytics, scientific-article analytics, social-media data analytics, expert finding and matching, and machine learning algorithms and applications.