# [AAAI 2026] Transferable Model-agnostic Vision-Language Model Adaptation for Efficient Weak-to-Strong Generalization

Jihwan Park<sup>1</sup>, Taehoon Song<sup>2</sup>, Sanghyeok Lee<sup>2</sup>, Miso Choi<sup>1</sup>, Hyunwoo J. Kim<sup>2</sup>

<sup>1</sup>Korea University, <sup>2</sup>Korea Advanced Institute of Science and Technology (KAIST)
Note: This repository currently contains the raw code. Code cleaning is in progress.
## Installation

```bash
conda create -n transmiter python=3.9 -y && conda activate transmiter
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
```

## Dataset

Download the data from Hugging Face and place it in the `data` folder with the following structure:
```
data/
├── classnames_dictionary.csv
└── base2new_promptsrc/
    ├── caltech101/
    ├── dtd/
    ├── eurosat/
    ├── fgvc_aircraft/
    ├── food101/
    ├── imagenet/
    ├── oxford_flowers/
    ├── oxford_pets/
    ├── stanford_cars/
    ├── sun397/
    └── ucf101/
```
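Before training, it may help to confirm the layout above is in place. A minimal sketch (the `missing_dataset_dirs` helper is hypothetical, not part of the repo; the expected paths mirror the tree shown):

```python
from pathlib import Path

# Dataset folders expected under data/base2new_promptsrc/, per the tree above.
EXPECTED_DATASETS = [
    "caltech101", "dtd", "eurosat", "fgvc_aircraft", "food101",
    "imagenet", "oxford_flowers", "oxford_pets", "stanford_cars",
    "sun397", "ucf101",
]

def missing_dataset_dirs(root="data"):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    expected = [root / "classnames_dictionary.csv"]
    expected += [root / "base2new_promptsrc" / d for d in EXPECTED_DATASETS]
    return [str(p) for p in expected if not p.exists()]

if __name__ == "__main__":
    missing = missing_dataset_dirs()
    print("all present" if not missing else "missing: " + ", ".join(missing))
```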
We use PromptSRC for fine-tuning the source model.
Supported datasets: EuroSAT, DTD, Food101, Pets, Aircraft, Flowers, UCF, Caltech, Cars, SUN397, ImageNet
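The run scripts below take their arguments positionally, so sweeping over several datasets and seeds reduces to building command lines. A sketch (the `build_commands` helper is hypothetical; the script path and argument order follow the knowledge-extraction example below):

```python
from itertools import product

def build_commands(script, model, strategy, datasets, seeds):
    """Build one positional-argument command per (dataset, seed) pair."""
    return [
        ["bash", script, model, strategy, ds, str(seed)]
        for ds, seed in product(datasets, seeds)
    ]

cmds = build_commands(
    "run_scripts/base2novel/knowledge_extraction/ket.sh",
    "ViT-B-16", "['promptsrc']",
    datasets=["dtd", "eurosat"], seeds=[1, 2, 3],
)
for c in cmds:
    print(" ".join(c))
```

Each command could then be launched with `subprocess.run(c, check=True)`.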
## Usage

**Knowledge extraction** (arguments: source model, source fine-tuning strategy, dataset, seed):

```bash
bash run_scripts/base2novel/knowledge_extraction/ket.sh \
    ViT-B-16 \
    "['promptsrc']" \
    dtd \
    1
```

**Knowledge transfer** (arguments: source model, target model, source fine-tuning strategy, dataset, seed):

```bash
bash run_scripts/base2novel/knowledge_transfer/transfer.sh \
    ViT-B-16 \
    ViT-L-14 \
    "['promptsrc']" \
    dtd \
    1
```

## Citation

```bibtex
@inproceedings{park2026transmiter,
  title={Transferable Model-agnostic Vision-Language Model Adaptation for Efficient Weak-to-Strong Generalization},
  author={Park, Jihwan and Song, Taehoon and Lee, Sanghyeok and Choi, Miso and Kim, Hyunwoo J.},
  booktitle={AAAI},
  year={2026}
}
```