Pragmatic Image Compression for Human-in-the-Loop Decision-Making

Paper (to appear at NeurIPS 2021) | OpenReview | Blog post | Video | Code

Siddharth Reddy, Anca D. Dragan, Sergey Levine

University of California, Berkeley

Standard lossy image compression algorithms aim to preserve an image's appearance while minimizing the number of bits needed to transmit it. However, the amount of information a user actually needs for downstream tasks (e.g., deciding which product to click on a shopping website) is likely much lower. To achieve this lower bitrate, we would ideally transmit only the visual features that drive user behavior, while discarding details irrelevant to the user's decisions. We approach this problem by training a compression model through human-in-the-loop learning as the user performs tasks with the compressed images.

The key insight is to train the model to produce a compressed image that induces the user to take the same action that they would have taken had they seen the original image. To approximate the loss function for this model, we train a discriminator that tries to distinguish whether a user's action was taken in response to the compressed image or the original. We call our method PragmatIc COmpression (PICO).
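The discriminator idea can be illustrated with a toy sketch. This is not the PICO implementation: the single "agreement" feature, the logistic-regression discriminator, the function names, and the non-saturating loss are all assumptions chosen to keep the example small. A discriminator is fit to tell actions taken after compressed images (label 1) from actions taken after originals (label 0), and the compression model's loss is low when the discriminator cannot tell the difference.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def prob_compressed(w, b, x):
    """Discriminator output: probability that the action followed a compressed image."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_discriminator(features, labels, lr=1.0, steps=500):
    """Fit a logistic-regression discriminator by full-batch gradient descent.

    labels: 1 = action followed a compressed image, 0 = an original image.
    """
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = prob_compressed(w, b, x) - y
            grad_w = [g + err * xi / n for g, xi in zip(grad_w, x)]
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def compression_loss(w, b, features, eps=1e-12):
    """Assumed non-saturating loss: low when actions look like responses to originals."""
    total = sum(-math.log(1.0 - prob_compressed(w, b, x) + eps) for x in features)
    return total / len(features)

# Hypothetical feature: 1.0 if the user's action matches a reference action, else 0.0.
original_actions = [[1.0]] * 20                # users act consistently on originals
lossy_actions = [[1.0]] * 10 + [[0.0]] * 10    # a poor compressor changes half the actions
faithful_actions = [[1.0]] * 20                # a good compressor changes none

features = original_actions + lossy_actions
labels = [0] * len(original_actions) + [1] * len(lossy_actions)
w, b = train_discriminator(features, labels)

loss_lossy = compression_loss(w, b, lossy_actions)
loss_faithful = compression_loss(w, b, faithful_actions)
print(f"loss (poor compressor): {loss_lossy:.3f}")
print(f"loss (faithful compressor): {loss_faithful:.3f}")
```

In this toy setup the compressor that changes the user's actions receives the higher loss, which is the signal a compression model would be trained to reduce.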

We evaluate our method through experiments with human participants on four tasks:

  • reading handwritten digits,

  • browsing an online shopping catalogue of cars,

  • verifying photos of faces,

  • and playing a car racing video game.

The results show that our method learns to match the user's actions with and without compression at lower bitrates than baseline methods, and adapts the compression model to the user's behavior.

In the digit reading task, PICO learns to preserve the identity of the digit, while a baseline compression method that optimizes perceptual similarity learns to preserve task-irrelevant details like line thickness and pose angle.

PICO learns to compress the same images in different ways, depending on the tasks that users perform with those images.

In a shopping task, PICO learns to preserve the sportiness and perceived price of the car, while randomizing color and background.

For users who perform a different task that involves surveying paint jobs, PICO learns to preserve the color of the car, while randomizing the model and pose of the car.

In a photo verification task that involves checking for eyeglasses, PICO learns to preserve eyeglasses while randomizing faces, hats, and other task-irrelevant features. When we change the task to checking for hats, PICO adapts to preserving hats while randomizing eyeglasses.

The Car Racing environment

In a car racing video game with an extremely high compression rate (50%), PICO learns to preserve bends in the road better than baseline methods, enabling users to drive more safely and stay off the grass.

[Video: what is actually happening (uncompressed)]

[Videos: what the user sees (compressed). Panels: PICO (Ours), Perceptual Similarity (Baseline), Non-Adaptive (Baseline)]