GitHub - AgneseRe/Benchmarking-Open-MLLMs: Evaluation framework for MLLMs on the Odd-One-Out task. Benchmarking spatial reasoning, relational logic, zero-shot anomaly detection in complex multi-object scenes.



Files (12 commits)

common
ResultsOpenMLLMs.ipynb
internvl.py
llava.py
minicpm.py
mllm_odd_one_out_results.json
qwenvl.py

About

Evaluation framework for multimodal large language models (MLLMs) on the Odd-One-Out task. It benchmarks spatial reasoning, relational logic, and zero-shot anomaly detection in complex multi-object scenes.

Topics: anomaly-detection, relational-reasoning, mllm, zero-shot-anomaly-detection, mllm-evaluation