DETECTING MISINFORMATIVE CONTENT THROUGH USER COMMENTS AND REVIEWS
Advanced Machine Learning Course
Yusuf Elnady (yelnady@vt.edu) Eslam Hussein (ehussein@vt.edu)
You can now access our paper from this link.
Mission of the project
Misinformation is a prevalent and challenging phenomenon on online platforms
Users engage with misinformative content (videos, posts, tweets, etc.) by liking, sharing, commenting, and reviewing it.
User engagement carries signals that can indicate the credibility of the content,
e.g., the comments on a video can be used to annotate its credibility (misinformative, normal, or anti-misinformative).
Objective
Classify YouTube videos and Amazon items, using users' comments/reviews, into three classes:
Pro Misinfo: content that promotes false information about a specific topic.
Anti Misinfo: content that corrects/criticizes false information about a topic.
Neutral: content that gives information about a topic without having a specific stance on any misinformation surrounding that topic.
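As a rough illustration of this three-class setup, the sketch below trains a TF-IDF + logistic regression baseline on the text of user comments. This is not the paper's actual pipeline; the training examples and labels here are toy data made up for the example.

```python
# Minimal sketch (not the authors' pipeline): classify content into the three
# stance classes from its user comments with a TF-IDF + logistic regression
# baseline. All training data below is toy/hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

CLASSES = ["Pro Misinfo", "Neutral", "Anti Misinfo"]

# Each training example: all comments on one video/item joined into one string.
train_texts = [
    "vaccines cause autism wake up people share this",
    "a clear overview of how mRNA vaccines actually work",
    "this video debunks the microchip myth point by point",
]
train_labels = ["Pro Misinfo", "Neutral", "Anti Misinfo"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_texts, train_labels)

# Predict the stance class for an unseen set of comments.
pred = model.predict(["finally someone exposes the vaccine lies"])[0]
print(pred)  # one of the three stance classes
```

In practice the classifier would be trained on the labeled videos/items described below, with all of an item's comments aggregated into one document.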
Dataset Summary for YouTube
Number of Videos per Topic
We have 1,074 videos, each manually labeled with its stance: Pro, Neutral, or Anti.
Each video has the following attributes: 'topic', 'aria-label', 'description', 'vidTitle', 'vid_url', 'annotation', 'notes', 'normalizedAnnotation', 'duration', 'viewCount', 'likeCount', 'dislikeCount', 'favoriteCount', 'commentCount', 'popularity', 'Language'
Every comment on a video has the following attributes: 'comment', 'likeCount', 'topic', 'stance', 'ID'
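The sketch below shows one way this data could be organized in pandas, using a subset of the attribute names listed above; the file-free toy records and their values are made up for illustration.

```python
# Hypothetical sketch of the YouTube data in pandas, using a subset of the
# attribute names listed above; the records and values are invented.
import pandas as pd

videos = pd.DataFrame([{
    "topic": "vaccines", "vidTitle": "Example video",
    "vid_url": "https://youtu.be/xyz",
    "normalizedAnnotation": "Pro", "viewCount": 1000, "commentCount": 2,
}])

comments = pd.DataFrame([
    {"ID": "xyz", "comment": "totally agree", "likeCount": 3,
     "topic": "vaccines", "stance": "Pro"},
    {"ID": "xyz", "comment": "this is debunked", "likeCount": 5,
     "topic": "vaccines", "stance": "Pro"},
])

# Aggregate all comments on a video into one document for classification.
docs = comments.groupby("ID")["comment"].agg(" ".join)
print(docs.loc["xyz"])  # "totally agree this is debunked"
```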
Dataset Summary for Amazon
Number of Products per Topic
For each of the 1,419 items, we scraped customers' reviews (~40k in total).
For each review, we collected its:
Title
Review text
Rating (1 to 5 stars)
Number of people who found the review helpful
Review class (Positive vs. Critical)
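The review fields above can be sketched as a simple record type; as one hedged example of how the helpfulness votes might be used as a signal, the snippet below computes a helpfulness-weighted average rating per item. The field names, weighting scheme, and toy reviews are assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch: an Amazon review record with the fields listed above,
# plus a helpfulness-weighted average rating (field names and weights assumed).
from dataclasses import dataclass

@dataclass
class Review:
    title: str
    text: str
    rating: int         # 1 to 5 stars
    helpful_votes: int  # number of people who found the review helpful
    review_class: str   # "Positive" or "Critical"

reviews = [
    Review("Works great", "Tastes fine, seems to help", 5, 2, "Positive"),
    Review("Scam", "This supplement has no proven effect", 1, 8, "Critical"),
]

# Weight each rating by (1 + helpful votes) so well-supported reviews count more.
weights = [1 + r.helpful_votes for r in reviews]
weighted_avg = sum(r.rating * w for r, w in zip(reviews, weights)) / sum(weights)
print(round(weighted_avg, 2))  # 2.0
```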