
Interesting observations:

* Llama 3.2 multimodal actually still ranks below Molmo from AI2, released this morning.

* AI2D: 92.3 (Llama 3.2 90B) vs. 96.3 (Molmo 72B)

* Llama 3.2 1B and 3B are pruned from 3.1 8B, so no leapfrogging, unlike 3 -> 3.1.

* Notably, no code benchmarks. A deliberate exclusion of code data during distillation to maximize mobile on-device use cases?

Was hoping there would be some interesting models I could add to https://double.bot, but there don't seem to be any improvements to frontier coding performance.



On the second point, you're comparing MMMU-Pro (multimodal) with MMLU-Pro (text-only). I don't think they published MMLU-Pro scores for 3.2.

(Edit: parent comment was corrected, thanks!)


Yep, you're right; thanks for catching that (sorry for the ninja edit!)


Where do you see the MMLU-Pro evaluation for Llama 3.2 90B? On the linked page I only see Llama 3.2 90B evaluated against multimodal benchmarks.


Ah, you're right; I totally misread that!



