Whether it is elections, news of a new COVID strain, or vaccine paranoia, over the past few years social media has steadily overtaken conventional news sources around the world. According to a survey by the Pew Research Center, eight out of ten adults in the US report getting their news from social media platforms. It is no wonder that an overwhelming amount of misinformation circulates on these platforms.
With this influx of misinformation, social media companies have come forward with ways to combat its spread. One such approach uses fact-checkers to verify reported images and flag them as fake or harmful if they are found to contain misinformation. Any other image that an image-matching algorithm finds similar to a flagged image is then also marked unsafe.
Our project aims to demonstrate that this approach to preventing the spread of misinformation is imperfect and can be bypassed using adversarial perturbations. With an adversarial example, we can keep a reportedly fake image from being hidden by tricking the fact-checking algorithm into treating it as an innocuous, previously unseen image. We study and demonstrate this on a popular social media platform, Instagram.
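To make the idea concrete, here is a minimal sketch, assuming the matcher compares learned image embeddings. Instagram's real pipeline is not public, so a pretrained ResNet-18 stands in for it, and the perturbation budget and step sizes are illustrative assumptions rather than the platform's actual parameters.

```python
# A minimal PGD-style sketch: push a flagged image's embedding away from the
# embedding a matcher would have stored for it, under a small L-infinity budget.
# ResNet-18 is only a surrogate for the (unknown) production matcher.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()          # penultimate features as the image "fingerprint"
backbone.eval().to(device)

to_tensor = T.Compose([T.Resize((224, 224)), T.ToTensor()])

def perturb(path, eps=4 / 255, alpha=1 / 255, steps=40):
    """Return an adversarial copy of the image at `path` whose surrogate
    embedding no longer resembles the original's."""
    x = to_tensor(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        stored = backbone(x)               # what the matcher would have indexed
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        emb = backbone((x + delta).clamp(0, 1))
        similarity = torch.nn.functional.cosine_similarity(emb, stored).mean()
        similarity.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()   # step to *reduce* similarity
            delta.clamp_(-eps, eps)              # keep perturbation imperceptible
            delta.grad.zero_()
    return (x + delta).clamp(0, 1).detach()
```

Minimizing cosine similarity to the stored embedding is the simplest objective; against a deployed system the attack would have to transfer from the surrogate or rely on black-box gradient estimation.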
Why we chose Instagram:
High user count
Active user base
Image-centric platform
More susceptible to the spread of misinformation
Our approach has two components (sketched below):
Using adversarial examples to fool state-of-the-art optical character recognition (OCR) models
Masking misinformation-carrying images with adversarial masks to bypass Instagram's fact-checking algorithm
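Both components can be sanity-checked against open stand-ins for the targets: Tesseract as the OCR model and a perceptual hash as the near-duplicate image matcher. The sketch below reuses the hypothetical perturb helper from the earlier snippet; the filename is a placeholder.

```python
# Sanity check of both goals against stand-in components: Tesseract (via
# pytesseract) for OCR and a perceptual hash (imagehash.phash) for
# near-duplicate matching. `perturb` is the hypothetical helper sketched
# earlier; "flagged_post.png" is a placeholder filename.
import imagehash
import pytesseract
import torchvision.transforms as T
from PIL import Image

original = Image.open("flagged_post.png").convert("RGB")
adversarial = T.ToPILImage()(perturb("flagged_post.png").squeeze(0).cpu())

# 1) Can the OCR model still read the overlaid claim text?
print("OCR (original)   :", pytesseract.image_to_string(original).strip())
print("OCR (adversarial):", pytesseract.image_to_string(adversarial).strip())

# 2) Would a hash-based matcher still pair the two images?
#    A small Hamming distance means the copy would still be flagged.
distance = imagehash.phash(original) - imagehash.phash(adversarial)
print("pHash Hamming distance:", distance)
```

Note that a perturbation tuned against the embedding surrogate will not necessarily move the perceptual hash; breaking hash-based matching needs a loss aimed at the hash itself (or a black-box search), so this snippet is an evaluation harness rather than a guaranteed bypass.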
Why this matters:
The recent surge in misinformation about vaccines and the pandemic
A large and still-growing share of teenagers and adults use Instagram and other social media platforms
Our work brings to light gaps in the current state of the art
Existing workarounds focus on evading detection through text obfuscation, for instance "V@ccination" or "C0V1D-19"
No known work on adversarial example detection for Instagram
Target systems: OCR models (Tesseract) and Instagram's fact-checking algorithm