DETECTING MISINFORMATIVE CONTENT THROUGH USERS' COMMENTS AND REVIEWS

Advanced Machine Learning Course

Yusuf Elnady (yelnady@vt.edu) Eslam Hussein (ehussein@vt.edu)

You can now access our paper from this link.

Mission of the Project

  • Misinformation is a prevalent and challenging phenomenon on online platforms.

  • Users engage with misinformative content (videos, posts, tweets, etc.) by liking, sharing, commenting, and reviewing.

  • This engagement carries signals that could indicate the credibility of the content.

  • For example, users' comments on a video could be used to annotate its credibility (misinformative, normal, or anti-misinformative).

Objective

  • Objective: Classify YouTube videos and Amazon items, based on users' comments/reviews, into three classes (see the classification sketch after this list):

    • Pro Misinfo: content that promotes false information about a specific topic.

    • Anti Misinfo: content that corrects/criticizes false information about a topic.

    • Neutral: content that gives information about a topic without having a specific stance on any misinformation surrounding that topic.
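
A minimal sketch of this three-class setup is shown below. It assumes the comments/reviews for each video/item have been concatenated into a single text field; the file name, column names, and the TF-IDF + logistic regression model are illustrative assumptions, not the project's actual pipeline.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical input: one row per video/item, with all of its comments/reviews
# concatenated into a 'text' column and a 'stance' label in {Pro, Neutral, Anti}.
df = pd.read_csv("engagement_text.csv")

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["stance"], test_size=0.2, stratify=df["stance"], random_state=42
)

model = make_pipeline(
    TfidfVectorizer(max_features=50_000, ngram_range=(1, 2)),  # word + bigram features
    LogisticRegression(max_iter=1000),                         # handles all three classes
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```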

Dataset Summary for YouTube

[Figure: Number of Videos per Topic]

  • We have 1,074 videos, each manually labeled with its stance: {Pro, Neutral, or Anti}.

  • Each video has the following attributes: 'topic', 'aria-label', 'description', 'vidTitle', 'vid_url', 'annotation', 'notes', 'normalizedAnnotation', 'duration', 'viewCount', 'likeCount', 'dislikeCount', 'favoriteCount', 'commentCount', 'popularity', 'Language'

  • Every comment on a video has the following attributes: 'comment', 'likeCount', 'topic', 'stance', 'ID' (see the loading sketch after this list)
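
The sketch below shows one way such video and comment records could be loaded and grouped, assuming they are stored as CSV files whose columns mirror the attribute lists above; the file names, and the reading of 'ID' as identifying the parent video, are assumptions.

```python
import pandas as pd

# Hypothetical CSV dumps whose columns mirror the attribute lists above.
videos = pd.read_csv("youtube_videos.csv")
comments = pd.read_csv("youtube_comments.csv")

# Distribution of the manual stance labels across the 1,074 videos.
print(videos["normalizedAnnotation"].value_counts())

# 'ID' is assumed here to identify the video a comment was posted on, so the
# comments can be concatenated per video before any feature extraction.
comments_per_video = comments.groupby("ID")["comment"].apply(" ".join)
print(comments_per_video.head())
```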


Dataset Summary for Amazon

[Figure: Number of Products per Topic]

  • For each of the 1,419 items, we scraped customers' reviews (~40k in total)

  • For each review, we collected the following (see the record sketch after this list):

    • Title

    • Review text

    • Rating (1–5 stars)

    • Number of people who found the review helpful

    • Review class (Positive vs. Critical)
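
A sketch of how a single review record with these fields might be represented is shown below; the class name, field names, and example values are illustrative assumptions, not the project's actual storage format.

```python
from dataclasses import dataclass

# Hypothetical record type; the field names paraphrase the list above.
@dataclass
class AmazonReview:
    title: str          # review title
    text: str           # full review text
    rating: int         # star rating, 1 to 5
    helpful_count: int  # number of people who found the review helpful
    review_class: str   # "Positive" or "Critical"

# Toy example, purely for illustration.
example = AmazonReview(
    title="Works as described",
    text="Arrived quickly and does what the listing says.",
    rating=5,
    helpful_count=3,
    review_class="Positive",
)
print(example)
```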