DETECTING MISINFORMATIVE CONTENT THROUGH USERS' COMMENTS AND REVIEWS

Advanced Machine Learning Course

Yusuf Elnady (yelnady@vt.edu) Eslam Hussein (ehussein@vt.edu)

You can now access our paper from this link.

Mission of the Project

  • Misinformation is a prevalent and challenging phenomenon on online platforms.

  • Users engage with misinformative content (videos, posts, tweets, etc.) by liking, sharing, commenting, and reviewing.

  • This engagement carries signals that could indicate the credibility of the content.

  • For example, users' comments on a video could be used to annotate its credibility (misinformative, normal, or anti-misinformative).

Objective

  • Objective: Classify YouTube videos and Amazon items, based on users' comments/reviews, into three classes (see the classification sketch after this list):

    • Pro Misinfo: content that promotes false information about a specific topic.

    • Anti Misinfo: content that corrects/criticizes false information about a topic.

    • Neutral: content that gives information about a topic without having a specific stance on any misinformation surrounding that topic.
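
A minimal sketch of this three-class setup is shown below. It assumes the comments/reviews for each video/item have been concatenated into a single text field; the file name, column names, and the TF-IDF + logistic regression model are illustrative assumptions, not the project's actual pipeline.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical input: one row per video/item, with all of its comments/reviews
# concatenated into a 'text' column and a 'stance' label in {Pro, Neutral, Anti}.
df = pd.read_csv("engagement_text.csv")

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["stance"], test_size=0.2, stratify=df["stance"], random_state=42
)

model = make_pipeline(
    TfidfVectorizer(max_features=50_000, ngram_range=(1, 2)),  # word + bigram features
    LogisticRegression(max_iter=1000),                         # handles all three classes
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```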

Dataset Summary for YouTube

[Figure: Number of Videos per Topic]

  • We have 1,074 videos, each manually labeled with its stance: {Pro, Neutral, or Anti}.

  • Each video has the following attributes: 'topic', 'aria-label', 'description', 'vidTitle', 'vid_url', 'annotation', 'notes', 'normalizedAnnotation', 'duration', 'viewCount', 'likeCount', 'dislikeCount', 'favoriteCount', 'commentCount', 'popularity', 'Language'

  • Every comment on a video has the following attributes: 'comment', 'likeCount', 'topic', 'stance', 'ID' (see the loading sketch after this list)
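
The sketch below shows one way such video and comment records could be loaded and grouped, assuming they are stored as CSV files whose columns mirror the attribute lists above; the file names, and the reading of 'ID' as identifying the parent video, are assumptions.

```python
import pandas as pd

# Hypothetical CSV dumps whose columns mirror the attribute lists above.
videos = pd.read_csv("youtube_videos.csv")
comments = pd.read_csv("youtube_comments.csv")

# Distribution of the manual stance labels across the 1,074 videos.
print(videos["normalizedAnnotation"].value_counts())

# 'ID' is assumed here to identify the video a comment was posted on, so the
# comments can be concatenated per video before any feature extraction.
comments_per_video = comments.groupby("ID")["comment"].apply(" ".join)
print(comments_per_video.head())
```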


Dataset Summary for Amazon

[Figure: Number of Products per Topic]

  • For each of the 1,419 items, we scraped customers' reviews (~40k in total)

  • For each review, we collected the following (see the record sketch after this list):

    • Title

    • Review text

    • Rating (1–5 stars)

    • Number of people who found the review helpful

    • Review class (Positive vs. Critical)
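
A sketch of how a single review record with these fields might be represented is shown below; the class name, field names, and example values are illustrative assumptions, not the project's actual storage format.

```python
from dataclasses import dataclass

# Hypothetical record type; the field names paraphrase the list above.
@dataclass
class AmazonReview:
    title: str          # review title
    text: str           # full review text
    rating: int         # star rating, 1 to 5
    helpful_count: int  # number of people who found the review helpful
    review_class: str   # "Positive" or "Critical"

# Toy example, purely for illustration.
example = AmazonReview(
    title="Works as described",
    text="Arrived quickly and does what the listing says.",
    rating=5,
    helpful_count=3,
    review_class="Positive",
)
print(example)
```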