KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
subjective-evaluation, open-endedness, grading-criteria, Korean-language-benchmarks, Large-Vision-Language-Models, visual-question-answering, evaluation-methods, generative-language-models
MAUM AI Inc. / Republic of Korea
The recent emergence of Large Vision-Language Models (VLMs) has resulted in a variety of benchmarks for evaluating such models. Despite this, we observe that most existing evaluation methods suffer from the fact that they either require the model to choose from pre-determined responses,...