Training Proactive and Personalized LLM Agents

PPP-Agent is an open-source framework for training LLM agents that are not only productive (task success) but also proactive (asking essential clarifying questions) and personalized (adapting to diverse user preferences). It includes UserVille, an interactive environment that turns existing agent benchmarks into multi-turn, preference-aware simulations.

Authors: Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap, Yiming Yang

Paper: https://arxiv.org/pdf/2511.02208
- UserVille: Converts precise prompts into vague ones and simulates 20 user preferences.
- PPP RL: Multi-objective RL optimizing Productivity, Proactivity, and Personalization.
- Tools: Plug-and-play scaffolds for SWE (SWE-Bench, SWE-Gym) and Deep Research (BrowseComp+).
- Metrics: Effort-based proactivity and preference-following personalization.
- Generalization: Transfers to unseen preferences, simulators, and tasks.
PPP-36B is a Seed-36B-Instruct model trained with our PPP RL framework on SWE-Func-Loc tasks: 🤗 sunweiwei/PPP-36B
Download SWE-Bench repo data to envs/gym_data:

```bash
cd envs
python download_swe_repo.py --dataset princeton-nlp/SWE-bench_Verified --split test

# Download SWE-Gym data:
# python download_swe_repo.py --dataset SWE-Gym/SWE-Gym --split train
```

Start the repo server:

```bash
cd envs && python repo_server.py
```

The training and test data used in our experiments are available here: https://drive.google.com/drive/folders/1yJHQckiRTkshF8SScZHUK3sjp9SpqW2x?usp=sharing
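Before launching an evaluation, it can help to confirm the repo server is actually up. The sketch below only assumes that the server (started on port 8011 above) accepts TCP connections; it makes no assumption about the HTTP routes it exposes.

```python
# Minimal readiness check for the repo server (port 8011 per this README).
import socket
import time


def wait_for_server(host: str = "localhost", port: int = 8011,
                    timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True  # server is accepting connections
        except OSError:
            time.sleep(0.5)  # not up yet; retry shortly
    return False
```

Call `wait_for_server()` once after `python repo_server.py` starts; a `False` return usually means the server crashed on startup or is bound to a different port.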
Place the downloaded parquet files in the data/ directory.
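A quick sanity check that the download landed where the eval script expects it. The two filenames below come from the evaluation commands in this README; if the release ships additional splits, they are simply ignored here.

```python
# Check that the expected parquet files are present under data/.
from pathlib import Path
from typing import Iterable, Set


def missing_parquet(data_dir: str, expected: Iterable[str]) -> Set[str]:
    """Return the subset of expected filenames not found in data_dir."""
    present = {p.name for p in Path(data_dir).glob("*.parquet")}
    return set(expected) - present


# Example (filenames taken from the eval commands below):
# missing_parquet("data", ["test_ood.parquet", "test_id.parquet"])
```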
Basic usage with the OpenAI API:

```bash
python scripts/eval_func_loc.py \
  --data_path data/test_ood.parquet \
  --model_name gpt-4o-mini \
  --local_repo_path /path/to/envs/gym_data \
  --local_repo_url http://localhost:8011 \
  --num_workers 1
```

Evaluate on multiple datasets:
```bash
python scripts/eval_func_loc.py \
  --data_path data/test_ood.parquet data/test_id.parquet \
  --model_name gpt-4o-mini \
  --local_repo_path /path/to/envs/gym_data \
  --local_repo_url http://localhost:8011 \
  --num_workers 1
```

Key arguments:

- --data_path: One or more parquet files to evaluate
- --model_name: Model to use for evaluation
- --local_repo_path: Path to the gym_data directory
- --local_repo_url: URL of the repo server
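If you want to sweep several models or dataset combinations, the invocation above is easy to assemble programmatically. The helper below is a sketch, not part of the released scripts; the flag names mirror the arguments documented above, and the default paths are placeholders.

```python
# Build an argv list for scripts/eval_func_loc.py, e.g. for use with subprocess.run.
from typing import List


def build_eval_cmd(data_paths: List[str], model_name: str,
                   repo_path: str = "/path/to/envs/gym_data",
                   repo_url: str = "http://localhost:8011",
                   num_workers: int = 1) -> List[str]:
    """Return the command as an argv list (one flag/value per element)."""
    return ["python", "scripts/eval_func_loc.py",
            "--data_path", *data_paths,
            "--model_name", model_name,
            "--local_repo_path", repo_path,
            "--local_repo_url", repo_url,
            "--num_workers", str(num_workers)]


cmd = build_eval_cmd(["data/test_ood.parquet", "data/test_id.parquet"], "gpt-4o-mini")
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues when paths contain spaces.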
For all available arguments, run:

```bash
python scripts/eval_func_loc.py --help
```

To evaluate with vLLM using PPP-36B or other local models:
Note: PPP-36B includes bias terms in its attention output projections and requires a patch (scripts/patch_seed_oss.py) to serve correctly with vLLM.

```bash
# Start the vLLM server with the patch
PYTHONPATH=scripts python -c "import patch_seed_oss" && vllm serve sunweiwei/PPP-36B --port 8000

# Point the OpenAI client at the local server and run evaluation
export OPENAI_BASE_URL=http://localhost:8000/v1
python scripts/eval_func_loc.py \
  --data_path data/test_ood.parquet \
  --model_name sunweiwei/PPP-36B \
  --local_repo_path /path/to/envs/gym_data \
  --local_repo_url http://localhost:8011 \
  --num_workers 1
```

For training, see scripts/train_ppp.py and scripts/train_ppp.sh.
If you find this work useful, please consider citing our paper:
```bibtex
@article{sun2025pppagent,
  title={Training Proactive and Personalized LLM Agents},
  author={Sun, Weiwei and Zhou, Xuhui and Du, Weihua and Wang, Xingyao and Welleck, Sean and Neubig, Graham and Sap, Maarten and Yang, Yiming},
  journal={arXiv preprint arXiv:2511.02208},
  year={2025},
  url={https://arxiv.org/abs/2511.02208}
}
```

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
