A.I. DETECTION

DEEP FAKES & SYNTHETIC MEDIA: WHY WE NEED DETECTION TOOLS

The skills, experience, knowledge and existing tools available to teams such as BBC Monitoring remain a crucial part of the defence against disinformation. However, media generated or manipulated by AI & ML tools raises the stakes, producing disinformation of a far higher quality.

AI & ML can now, for example:

  • Manipulate faces and body language
  • Recreate voices
  • Write text

To combat AI & ML in this space, we will need "good" AI & ML tools developed specifically to detect the "bad" ones, augmenting our existing approaches (Research, Authentication & Literacy).

CAN YOU TRUST WHAT YOU SEE?

Thanks to advances in machine learning, the ability to create synthetic media is becoming available both as a professional service and as a desktop tool for everyday computer users. The quality of the resulting 'deep fakes' depends on the quality and amount of data fed into the model.

We used a company called Synthesia to record BBC newsreader Matthew Amroliwala. This footage was then manipulated to generate new visuals, creating 'language dubs' in which Matthew appears to be fluent in Spanish, Mandarin and Hindi.

This type of technology can be used for:

GOOD: Creative dubs, factual corrections, editing fixes

ILL: Disinformation

CAN YOU TRUST WHAT YOU HEAR?

Similarly, numerous products are emerging that also use machine learning to generate audio from 'training data' provided to them. This training data can be as little as 20 minutes' worth of audio clips of a person's voice, which has clear implications for presenters and other public figures whose 'voice data' exists in the public sphere in large quantities.

The more sophisticated tools make voices easy to manipulate by allowing direct editing of text rather than audio waveforms. To manipulate speech, one only needs to know how to type - you do not have to be an audio engineer...
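
As a minimal sketch of this text-based editing idea, assume a recording has already been aligned word-by-word to its transcript. The synthesize() helper below is a hypothetical stand-in for the voice-cloning stage a real product would provide; every waveform here is just a random noise array.

    # Toy sketch of text-based speech editing. All data is synthetic;
    # real tools do the alignment and synthesis with ML models.
    import numpy as np

    # Pretend each word of a recording has been aligned to its audio
    # span. Each "span" is noise standing in for real samples.
    aligned = [
        ("the",     np.random.randn(4000)),
        ("results", np.random.randn(8000)),
        ("are",     np.random.randn(4000)),
        ("genuine", np.random.randn(9000)),
    ]

    def synthesize(word):
        """Hypothetical voice-clone stage: return a waveform for `word`
        in the target speaker's voice. Stubbed with noise here."""
        return np.random.randn(2000 * len(word))

    def edit_by_text(aligned, old_word, new_word):
        """Swap one word in the transcript and splice in synthesized
        audio - the editor types text, never touches waveforms."""
        out = []
        for word, samples in aligned:
            if word == old_word:
                out.append((new_word, synthesize(new_word)))
            else:
                out.append((word, samples))
        return out

    edited = edit_by_text(aligned, "genuine", "fake")
    print(" ".join(w for w, _ in edited))        # the results are fake
    audio = np.concatenate([s for _, s in edited])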

DEVELOPING DETECTION TOOLS

We are aware that we will need AI & ML tools to help us spot disinformation that other AI & ML tools have created in the first place.

  • These tools do not really exist yet
  • To help develop these tools, datasets of 'known fakes' may be required*
  • Once detection tools exist, the two 'sides' will be locked in an 'arms race', with manipulation and detection tools each trying to beat the other (a dynamic sketched in code after this list)
  • The AI & ML tools used to generate fake news will become commonplace
  • These tools will only ever keep us on a level playing field - much like current cybersecurity. Nobody wins in the end...
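
The 'arms race' above is, in miniature, the dynamic formalised by generative adversarial networks (GANs): a generator learns to fool a detector while the detector learns to catch it. The PyTorch sketch below is a toy on 1-D vectors rather than images; the architectures and data are illustrative assumptions, not any production system.

    # Minimal GAN-style loop: detector vs generator.
    import torch
    import torch.nn as nn

    dim = 16
    gen = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, dim))
    det = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))
    loss_fn = nn.BCEWithLogitsLoss()
    g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
    d_opt = torch.optim.Adam(det.parameters(), lr=1e-3)

    for step in range(1000):
        real = torch.randn(64, dim) + 2.0    # stand-in for genuine media
        fake = gen(torch.randn(64, 8))       # stand-in for manipulated media

        # Detector's move: label real as 1, fake as 0.
        d_loss = (loss_fn(det(real), torch.ones(64, 1)) +
                  loss_fn(det(fake.detach()), torch.zeros(64, 1)))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        # Generator's move: push the detector to call its fakes real.
        g_loss = loss_fn(det(gen(torch.randn(64, 8))), torch.ones(64, 1))
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()

Neither side 'wins': each improvement by one model raises the bar for the other, which is exactly the level-playing-field point above.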

BLINK OF AN EYE

One early idea is to use ML to distinguish 'natural' blinking rates in real human faces from those in 'deepfaked' faces.

However, it is easy to envisage more sophisticated deepfake tools learning to make blinking appear natural.

Research Paper: https://arxiv.org/pdf/1806.02877.pdf
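
As a simplified illustration of the blink idea (a hand-rolled eye-aspect-ratio baseline, not the LRCN method of the paper above), the sketch below assumes 2-D eye landmarks are already available from a face-landmark model. The eye aspect ratio drops sharply when the eye closes, so a suspiciously low blink count over a clip can flag footage for review.

    import numpy as np

    def eye_aspect_ratio(eye):
        """eye: six (x, y) landmarks - two corners, two top-lid points,
        two bottom-lid points. Low values mean the eye is closed."""
        v1 = np.linalg.norm(eye[1] - eye[5])
        v2 = np.linalg.norm(eye[2] - eye[4])
        h = np.linalg.norm(eye[0] - eye[3])
        return (v1 + v2) / (2.0 * h)

    def count_blinks(ear_per_frame, closed_thresh=0.21, min_frames=2):
        """Count dips of the ratio below a threshold lasting at least
        min_frames consecutive frames."""
        blinks, run = 0, 0
        for ear in ear_per_frame:
            if ear < closed_thresh:
                run += 1
            else:
                if run >= min_frames:
                    blinks += 1
                run = 0
        return blinks

    # Synthetic series: open eyes, a 3-frame blink, open eyes again.
    ears = [0.30] * 20 + [0.15] * 3 + [0.30] * 20
    print(count_blinks(ears))    # -> 1

A genuine face blinks roughly every 2-10 seconds, so a clip showing near-zero blinks per minute would be a candidate for human review.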

TRAINING THE TOOLS WITH DATASETS

Projects such as FaceForensics at NiessnerLab (built on the lab's Face2Face manipulation technique) are trying to make datasets of manipulated faces, together with their 'mapped' genuine counterparts, available at scale so that further detection tools can be developed.

http://niessnerlab.org/projects/roessler2018faceforensics.html
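
As a sketch of how such a dataset could be put to work, the snippet below trains a binary real-vs-manipulated image classifier in PyTorch. The data/real and data/fake folder layout is an assumption for illustration, not the dataset's official structure.

    import torch
    import torch.nn as nn
    from torchvision import datasets, transforms, models

    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    # Assumed layout: data/real/*.png and data/fake/*.png
    ds = datasets.ImageFolder("data", transform=tfm)
    loader = torch.utils.data.DataLoader(ds, batch_size=32, shuffle=True)

    model = models.resnet18(weights=None)    # or start from pretrained weights
    model.fc = nn.Linear(model.fc.in_features, 2)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for images, labels in loader:    # one epoch, for brevity
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()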

Another project, Deepnews.ai, is attempting to train algorithms to detect low-quality (i.e. "fake") news automatically and at scale. Though the intention is for the ML to do the bulk of the work, a "Human Scoring Interface" verification method will be needed to train the model in its early stages.

https://www.deepnews.ai/
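
As a toy sketch of that human-in-the-loop idea: humans score a small seed set of articles, a text model learns to predict those scores, and the model can then rank new articles at scale. The TF-IDF-plus-regression approach and the sample data below are illustrative assumptions, not Deepnews.ai's actual pipeline.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge

    # Stand-ins for articles rated 1 (low quality) to 5 (high quality)
    # through a human scoring interface.
    articles = [
        "SHOCKING! You won't believe what this politician did!!!",
        "The committee published its findings after a two-year inquiry.",
        "Scientists report a measured decline, citing peer-reviewed data.",
        "Click here - miracle cure THEY don't want you to know about",
    ]
    human_scores = [1, 4, 5, 1]

    vec = TfidfVectorizer(ngram_range=(1, 2))
    model = Ridge().fit(vec.fit_transform(articles), human_scores)

    new = ["Officials confirmed the figures in a statement on Tuesday."]
    print(model.predict(vec.transform(new)))    # predicted quality score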