Dotika

New AGI test reveals AI’s limitations

ALSO: DeepSeek’s new model disrupts AI

Geoffrey NICHIL & Alexandre HOTTON
March 25, 2025

Hi Synapticians!

When we talk about "intelligent" AI, what exactly do we mean? Is it the ability to write poetry, play chess, or generate convincing images? As AI systems become increasingly sophisticated, measuring their true intelligence becomes both crucial and challenging.

One of the best ways to evaluate AI intelligence is through carefully designed benchmarks: standardized tests that reveal what these systems can and cannot do.

This brings us to this week's news: ARC-AGI-2 has officially launched, raising the bar for what counts as "intelligent" in the AI world. ARC-AGI-1, which ran from 2019 to 2024, made history as the benchmark that caught the moment AI moved beyond memorization (remember OpenAI's o3 system?). This next-generation test is significantly tougher.

What makes ARC-AGI special? Unlike other benchmarks obsessed with superhuman capabilities, it focuses on tasks that humans find easy but AI finds hard or impossible. The smaller this gap, the closer we are to true AGI.

Every ARC-AGI-2 task was solved by at least two humans in two attempts or fewer, yet the tasks remain largely unsolved by frontier AI systems. To win, AI needs both high adaptability AND efficiency, so brute-forcing solutions no longer works.
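To make the "two attempts" rule concrete, here is a minimal sketch of how such a score could be computed. It assumes each answer is a small grid of integers that must match the expected grid exactly, and that the same two-attempt limit applies to any solver. The function names and the toy data are ours, not the official ARC Prize scoring code.

```python
def solved_within(attempts, expected, max_attempts=2):
    """True if one of the first max_attempts guesses equals the expected grid exactly."""
    return any(guess == expected for guess in attempts[:max_attempts])


def score(results, max_attempts=2):
    """Fraction of tasks solved; results is a list of (attempts, expected) pairs."""
    if not results:
        return 0.0
    solved = sum(solved_within(a, e, max_attempts) for a, e in results)
    return solved / len(results)


# Toy data (hypothetical): two tasks that share the same expected 2x2 grid.
expected = [[0, 1], [1, 0]]
results = [
    # Correct on the second attempt: counts as solved.
    ([[[0, 0], [0, 0]], [[0, 1], [1, 0]]], expected),
    # Correct only on the third attempt: does not count.
    ([[[1, 1], [1, 1]], [[0, 0], [1, 0]], [[0, 1], [1, 0]]], expected),
]
task_score = score(results)
print(f"Solved {task_score:.0%} of tasks")
```

A solver that simply keeps guessing gains nothing here, because only the first two attempts per task are counted.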

As one researcher put it: "We've gone from testing if AI can pass algebra to seeing if it can elegantly solve calculus while explaining its work to a fifth-grader." The race to true general intelligence just got more interesting!

The starting scores are low: as the first item below details, leading models scored under 1.3%, while humans averaged 60%.

Top AI news

1. AI models struggle with new AGI test
The ARC Prize Foundation introduced ARC-AGI-2, a new test assessing AI’s ability to adapt to novel problems. Leading models like OpenAI’s o1-pro and DeepSeek’s R1 scored under 1.3%, while humans averaged 60%. Unlike previous benchmarks, this test prioritizes efficiency over brute-force computation. The results highlight AI’s current limitations in reasoning and adaptability, raising questions about the true progress toward artificial general intelligence. A new challenge offers a prize for achieving 85% accuracy at minimal cost, pushing AI research toward more efficient learning methods.

2. DeepSeek’s new model challenges OpenAI with open-source efficiency
DeepSeek-V3-0324 is a groundbreaking open-source AI model that runs at 20 tokens per second on a Mac Studio. Its mixture-of-experts architecture improves efficiency by activating only a subset of its parameters for each token, while Multi-Head Latent Attention and Multi-Token Prediction enhance speed. Unlike OpenAI’s closed models, DeepSeek’s MIT-licensed approach democratizes AI access. This release signals a shift in AI development, with China embracing open-source strategies to compete with Silicon Valley.
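For readers curious how a mixture-of-experts layer avoids running every parameter, here is a minimal sketch of generic top-k routing. It is not DeepSeek's actual router: the toy experts, the fixed gate scores, and the function names are all assumptions made for illustration. In a real model the gate scores are computed from the input by a learned layer.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    top = ranked[:k]
    # Weights are renormalized over the selected experts only.
    weights = softmax([gate_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_logits, k=2):
    """Run only the k selected experts and blend their outputs by gate weight."""
    out = 0.0
    for idx, weight in route_top_k(gate_logits, k):
        out += weight * experts[idx](x)
    return out


# Toy experts (hypothetical): each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 2.0]  # fixed here; learned in a real model

selected = route_top_k(gate_logits, k=2)
y = moe_forward(3.0, experts, gate_logits, k=2)
print(selected, y)
```

With four experts and k=2, only half of the expert functions are called for this input, which is the source of the efficiency gain described above.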

3. The rise of sovereign AI
As US AI models struggle with language inclusivity and content moderation, European countries and India are investing in sovereign AI. These efforts aim to create localized, culturally aware AI systems that better serve their populations. The shift is driven by geopolitical concerns, the decline of US digital rights funding, and the need for more effective moderation. While promising, these initiatives also raise governance and bias challenges. The future of AI is no longer just about technology; it is also about sovereignty and control.

Bonus. LetzAI’s AI-powered creativity
LetzAI, initially a local AI image generator in Luxembourg, has expanded globally, allowing users to create unique visuals by blending their own images with community models. Facing a global GPU shortage, LetzAI partnered with Gcore, utilizing NVIDIA H100 GPUs to enhance model training and real-time image generation. This collaboration optimized costs and improved performance, enabling brands like PUMA and Sloggi to leverage AI-generated visuals for marketing. With continuous innovation, including image upscaling and video generation, LetzAI’s partnership with Gcore remains crucial for its future growth.

Tweet of the Day

Introducing learned natural walking
Figure can now walk naturally like a human
Years of data was trained in simulation in just a few hours
— Figure (@Figure_robot)
2:02 PM • Mar 25, 2025

Theme of the Week

AI for mental health - The startup

Discover how Woebot Health leverages conversational AI to revolutionize mental health care, offering immediate, personalized support through empathetic digital interactions. Explore their innovative approach, blending psychology and advanced technology to tackle mental health challenges. Dive into this article to see how AI-driven therapy is reshaping patient well-being and transforming healthcare access.

Stay Connected

Feel free to contact us with any feedback or suggestions. We’d love to hear from you!
