Mistral Large 2 just dropped! And it's not too large.
3X smaller than Llama 3.1 405B, with performance on par or superior.
That makes TWO GPT-4-class open-source models released in the past two days.
AI access to all!
Model Introduction:
Mistral Large 2: 123B-parameter model with a 128k context window.
Performance: Matches or surpasses Llama 3.1 405B on many benchmarks.
Training Data: Trained on large amounts of source code and multilingual data.
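A quick back-of-the-envelope check of the size claim. The parameter counts (123B and 405B) come from the post; the bytes-per-parameter figures are the standard sizes for each precision, not vendor-published memory footprints:

```python
# Rough sizing sketch: Mistral Large 2 (123B) vs. Llama 3.1 405B.
# Parameter counts are from the announcement; bytes-per-param values
# (2 for fp16/bf16, 0.5 for int4) are generic, assumed figures.
PARAMS_MISTRAL_LARGE_2 = 123e9
PARAMS_LLAMA_31_405B = 405e9

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

ratio = PARAMS_LLAMA_31_405B / PARAMS_MISTRAL_LARGE_2
print(f"size ratio: {ratio:.1f}x")   # roughly 3.3x, i.e. the "3X smaller" claim
print(f"fp16 weights: {weight_memory_gb(PARAMS_MISTRAL_LARGE_2, 2):.0f} GB")
print(f"int4 weights: {weight_memory_gb(PARAMS_MISTRAL_LARGE_2, 0.5):.0f} GB")
```

The takeaway: the weights alone at fp16 still need several high-memory GPUs, but the 3X gap versus 405B is what makes self-hosting far more practical.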
Benchmark Performance:
HumanEval and MultiPL-E: Outperforms Llama 3.1 405B instruct, scores just below GPT-4o.
MATH (0-shot, without CoT): Second only to GPT-4o.
Multilingual MMLU: Outperforms Llama 3.1 70B base by +6.3% on average across 9 languages; on par with Llama 3.1 405B (0.4% below).
Alignment and Instruction Capabilities:
Greater focus on alignment and instruction following than the previous Mistral Large.
Performance on WildBench, ArenaHard, and MT-Bench: On par with the best models while being significantly less verbose.