Low-Compute Test-Time Adaptation Research Track
Overview
This page is for students who are interested in improving models at test time while keeping GPU usage as small as possible.
You do not need to start with large-scale training or expensive hardware. This track is designed for students who want a modern research topic with realistic experiments and clear questions.
The goal of this track is to explore questions such as:
How can we improve a pretrained model at test time without full retraining?
How can we make test-time adaptation accurate and cheap?
How can we study modern CVPR-level topics with limited compute?
Can these ideas work for vision-only, language-only, vision-language, or vision-language-action models?
If you are interested in these questions, this track may be a good place to start.
What to avoid at the beginning
Do not begin with:
huge multimodal LLM fine-tuning
methods requiring heavy backpropagation at test time
projects with unclear evaluation settings
very large benchmarks before understanding the core problem
When to contact me
If you read some of the papers on this page and feel interested, feel free to contact me.
You do not need to understand everything before reaching out.
Interest, curiosity, and steady effort are enough.
A careful research attitude matters more than starting with a large model.
Part I. A simple starting path
A good starting path is the following:
Start with CLIP to understand zero-shot transfer.
Read TDA to see how test-time improvement can be done efficiently.
Read one recent paper on realistic test-time evaluation.
Reproduce one baseline on a small benchmark with distribution shift.
Measure not only accuracy, but also latency and memory cost.
You do not need to do everything at once.
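Step 5 above asks you to measure latency and memory, not only accuracy. Below is a minimal, stdlib-only sketch of how any adaptation method could be wrapped for such measurement. The function names (`measure`, `forward`) are illustrative placeholders, not part of any paper's code.

```python
import time
import tracemalloc

def measure(fn, *args, repeats=10):
    """Report wall-clock latency and peak Python-heap memory for one call.
    (For GPU work you would instead synchronize with torch.cuda and read
    torch.cuda.max_memory_allocated; this stdlib sketch covers CPU code.)"""
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(repeats):
        out = fn(*args)
    latency_ms = (time.perf_counter() - start) * 1000 / repeats
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return out, latency_ms, peak_bytes

# Hypothetical stand-in for a model's forward pass.
def forward(n):
    return sum(i * i for i in range(n))

result, ms, peak = measure(forward, 10_000)
print(f"latency: {ms:.3f} ms, peak heap: {peak} bytes")
```

Reporting these two overheads alongside accuracy is what separates a "low-compute" claim from an unverified one.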
Part II. Core background for efficient test-time learning
These are the most important shared starting points for this page.
1. CLIP (ICML 2021)
Paper: Learning Transferable Visual Models From Natural Language Supervision
Code: https://github.com/openai/CLIP
Why read it: a strong starting point for modern vision-language transfer
Focus on: image-text alignment, zero-shot classification, prompt-based inference
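The zero-shot mechanism to focus on can be summarized in a few lines: both image and text embeddings are L2-normalized, and each image is assigned to the class whose text embedding has the highest cosine similarity. The toy sketch below illustrates only this mechanism with made-up embeddings; it is not the CLIP library itself.

```python
import numpy as np

def zero_shot_classify(image_feats, text_feats):
    """Assign each image to the class whose text embedding is
    closest in cosine similarity (the CLIP zero-shot rule)."""
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    logits = img @ txt.T          # (n_images, n_classes) cosine similarities
    return logits.argmax(axis=1)  # predicted class index per image

# Toy 4-dim embeddings: 3 classes, 2 images near classes 2 and 0.
text_feats = np.eye(3, 4)
image_feats = np.array([[0.0, 0.0, 1.0, 0.0],
                        [1.0, 0.1, 0.0, 0.0]])
print(zero_shot_classify(image_feats, text_feats))  # -> [2 0]
```

In the real model the text embeddings come from prompts such as "a photo of a {class}", which is why prompt-based inference matters for this track.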
2. Tent (ICLR 2021)
Paper: Fully Test-Time Adaptation by Entropy Minimization
Code: https://github.com/DequanWang/tent
Why read it: one of the simplest and most influential starting points for test-time adaptation
Focus on: entropy minimization, online adaptation, normalization-based updates
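Two ingredients of Tent are easy to sketch in isolation: the adaptation objective (mean entropy of the model's softmax predictions) and the normalization-based update (re-estimating normalization statistics from the current test batch instead of stale source statistics). The sketch below shows both pieces separately, under toy inputs; the full method additionally backpropagates the entropy loss into the normalization affine parameters, which is omitted here.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def prediction_entropy(logits):
    """Tent's adaptation objective: mean Shannon entropy of the predictions."""
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

def renormalize(x, eps=1e-5):
    """Normalization-statistics update: re-center and re-scale features
    using statistics estimated from the current test batch."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

# A shifted test batch: features drifted away from zero mean / unit variance.
x = np.array([[5.0, -2.0], [7.0, -4.0], [6.0, -3.0]])
print(np.round(renormalize(x).mean(axis=0), 6))     # re-centered to [0. 0.]

# The entropy objective rewards confident predictions.
print(prediction_entropy(np.array([[10.0, 0.0]])))  # near 0: confident
print(prediction_entropy(np.array([[0.0, 0.0]])))   # log 2 ~ 0.693: uncertain
```

Minimizing this entropy over only the normalization parameters is what keeps Tent cheap enough for test time.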
3. TDA (CVPR 2024)
Paper: Efficient Test-Time Adaptation of Vision-Language Models
Code: https://github.com/kdiAAA/TDA
Why read it: a clear starting point for efficient test-time adaptation in multimodal settings
Focus on: training-free adaptation, cache-based updates, low-cost improvement at inference time
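The cache-based idea to focus on can be sketched without any training: keep a small cache of previously seen test features with their pseudo-labels, and at inference add a similarity-weighted vote from the cache to the frozen zero-shot logits. The sketch below is a simplified illustration of this mechanism with toy features and a hypothetical weighting; TDA's actual cache management (confidence filtering, a negative cache, eviction) is richer than this.

```python
import numpy as np

def cache_logits(feat, cache_feats, cache_labels, n_classes, beta=5.0):
    """Training-free cache lookup: similarity-weighted votes from
    previously seen high-confidence test features."""
    sims = cache_feats @ feat               # cosine sims (features pre-normalized)
    weights = np.exp(-beta * (1.0 - sims))  # closer neighbors vote harder
    votes = np.zeros(n_classes)
    for w, y in zip(weights, cache_labels):
        votes[y] += w
    return votes

# Toy cache built from earlier confident test samples (unit-norm features).
cache_feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.96, 0.28]])
cache_feats /= np.linalg.norm(cache_feats, axis=1, keepdims=True)
cache_labels = [0, 1, 0]

query = np.array([0.98, 0.199])  # close to the class-0 cache entries
query /= np.linalg.norm(query)

# Combine frozen zero-shot logits with the cache term; no gradients involved.
zero_shot = np.array([0.1, 0.1])
combined = zero_shot + cache_logits(query, cache_feats, cache_labels, n_classes=2)
print(combined.argmax())  # -> 0
```

Because the update is a cache write rather than a gradient step, the per-sample cost stays near that of a frozen model, which is exactly the low-compute property this track cares about.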
Part III. Main track — Efficient Test-Time Adaptation
Typical question:
How can we improve a pretrained model on shifted data without expensive retraining?
Why this track is good
This track is suitable for students who want:
a modern topic with manageable experiments
strong connections to robustness and deployment
a practical path toward publication
Possible directions
A. Vision-only test-time adaptation
Why study it: often the simplest place to begin
Good for students because: experiments are lighter and the core adaptation issue is easier to isolate
Recommended papers:
DELTA (ICLR 2023)
Code: https://github.com/bwbwzhao/DELTA
B. Language or multimodal test-time adaptation
Why study it: modern models increasingly rely on cross-modal or language-conditioned inference
Good for students because: this direction is timely and closely connected to current foundation model research
Recommended papers:
Realistic Test-Time Adaptation of Vision-Language Models (CVPR 2025)
Code: https://github.com/MaxZanella/StatA
Bayesian Test-Time Adaptation for Vision-Language Models (CVPR 2025)
Code: https://github.com/buerzlh/BCA
C. Vision-Language-Action adaptation (emerging direction)
Why study it: action-conditioned systems face distribution shift and low-latency constraints at deployment, so cheap test-time improvement is especially valuable for them.
Good for students because: this can become a distinctive topic if framed carefully around test-time robustness, online correction, and compute limits
Recommended papers:
On-the-Fly VLA Adaptation via Test-Time Reinforcement Learning (2026)
LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models (2025)
Code: https://github.com/sylvestf/LIBERO-plus
LIBERO-X: Robustness Litmus for Vision-Language-Action Models (2026)
Code: https://github.com/zackhxn/LIBERO-X
Why these papers matter:
the first paper is directly about on-the-fly adaptation at deployment time
the latter two provide realistic robustness benchmarks for testing whether VLA adaptation is actually useful under distribution shift
Possible directions:
test-time adaptation for policy robustness
low-compute calibration for action-conditioned models
online correction under visual distribution shift
evaluation under realistic robotic perturbations
What students should reproduce first in this track
Choose one:
TDA
Bayesian Test-Time Adaptation
Then compare it against:
one frozen pretrained baseline
one realistic evaluation setting
Good starter experiments for this track
low-resolution corruption
blur / noise corruption
small online test streams
few-shot target-domain adaptation
modality-specific settings depending on the chosen direction
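The corruption settings above (low resolution, blur, noise) are easy to generate yourself before reaching for a full corruption benchmark. A minimal numpy sketch, assuming grayscale images with values in [0, 1]; these are simplified stand-ins, not the exact corruption pipelines used by published benchmarks.

```python
import numpy as np

def gaussian_noise(img, sigma=0.1, seed=0):
    """Additive Gaussian noise, clipped back to the valid [0, 1] range."""
    rng = np.random.default_rng(seed)
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def box_blur(img, k=3):
    """Simple k x k mean filter as a stand-in for blur corruption."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def low_resolution(img, factor=2):
    """Downsample by striding, then repeat pixels back to the original size."""
    small = img[::factor, ::factor]
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

img = np.linspace(0.0, 1.0, 16).reshape(4, 4)  # toy grayscale image
for corrupt in (gaussian_noise, box_blur, low_resolution):
    print(corrupt.__name__, corrupt(img).shape)  # each keeps the (4, 4) shape
```

Sweeping the severity parameters (sigma, k, factor) gives you a controlled shift axis for comparing an adapted model against its frozen baseline.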
Part IV. A promising publication direction
A strong project in this track could be:
Realistic and efficient test-time adaptation under strict compute constraints
This direction is attractive because it combines:
modern vision-language learning
realistic deployment constraints
low-GPU experimentation
a clear and focused research question
The page should stay focused on one question:
How can we improve a deployed model at test time with minimal extra compute?
Part V. Good starter benchmarks
CIFAR-100
Oxford Flowers
DTD
EuroSAT
ImageNet subset
corruption benchmarks with blur, noise, or low resolution
Avoid very large datasets at the beginning.
Part VI. Suggested first mini-project
A strong first project is:
reproduce one lightweight test-time adaptation baseline
compare it with one frozen pretrained baseline
evaluate under one realistic shift setting
report accuracy, latency overhead, and memory overhead
This is a good starting point because it answers a concrete and modern question:
Can test-time adaptation remain useful when compute is limited and deployment is realistic?
Final note
It is better to have one clear track than to force separate tracks with the same papers.
So this page should stay centered on:
test-time adaptation
realistic evaluation
low-compute improvement
robustness under shift
modality flexibility across vision, language, multimodal, or VLA settings