Wonderful post that I fully agree with.
For the 'AI nerds' among us:
🎯 OpenAI's o3 model scores:
 87.5% on ARC-AGI (the human threshold is 85%)
25.2% of EpochAI's Frontier Math problems (when no other model breaks 2%)
96.7% on AIME 2024 (missed one question)
71.7% on software engineering (o1 scored 48.9%)
87.7% on PhD-level science (above human expert scores)
https://sites.google.com/view/eurekajohn/onderwerpen/ai-benchmarks
🚀 The AI Arms Race: Breaking Down the Battle for Top Large Language Models (LLMs) in 2024 (21.09.2024)
Useful Links:
Learn more about Elo ratings and Chatbot Arena: https://lmsys.org
OpenAI vs Google – Who's Winning?: https://huggingface.co/.../lmsys/chatbot-arena-leaderboard
Google’s AI architecture innovations: https://artificialanalysis.ai
Global AI market growth statistics: https://huggingface.co/collections/open-llm-leaderboard
Details on xAI’s rise and Gemini’s impact: https://lmsys.org
https://lifearchitect.ai/iq-testing-ai/ - Dr Alan D. Thompson *** the best, most up-to-date overview for me ***
2024-09-17
Rank* (UB): 1
Model:
Arena Score: 1355
95% CI: +12/-11
Votes: 2991
Organization: OpenAI
License: Proprietary
Knowledge Cutoff: 2023/10
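For anyone wondering how to read the Arena Score and 95% CI figures above, here is a minimal sketch. It assumes the score sits on an Elo-like scale (base 10, 400 points), which is how the Chatbot Arena describes its ratings; the function names and the opponent score of 1300 are hypothetical and only there for illustration.

```python
def ci_bounds(score, plus, minus):
    """Return (low, high) for a score reported as 'score +plus/-minus'."""
    return score - minus, score + plus


def expected_win_rate(score_a, score_b):
    """Elo-style expected win rate of A over B (assumed base 10, scale 400)."""
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / 400.0))


# Figures from the leaderboard row above: 1355 with a 95% CI of +12/-11.
low, high = ci_bounds(1355, plus=12, minus=11)
print(low, high)  # 1344 1367

# Hypothetical opponent rated 1300 (not taken from the leaderboard).
print(round(expected_win_rate(1355, 1300), 3))
```

In short: the interval says the true score plausibly lies between 1344 and 1367, and a 55-point lead translates to winning somewhat more than half of head-to-head votes, not a landslide.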
Update 22.12.2024