Grok
Musk's AI tops ChatGPT in 2025 benchmarks (AFP News)

On 17 November 2025, Elon Musk's xAI launched Grok 4.1, touting it as an AI that 'sees the world in real time' through seamless live data access and tool integration. Positioned as a direct challenger to OpenAI's ChatGPT, Grok 4.1 promises not just speed, but emotional intelligence and dynamic responsiveness.

With a top score of 1483 Elo in thinking mode and second place in fast mode on the LMArena leaderboard, Grok 4.1 has sparked fresh debate around Musk's AI ambitions, benchmark supremacy, and the evolving role of empathy in machine intelligence.

Interest in Grok 4.1 has surged, with online searches reflecting growing curiosity about its real-time capabilities, emotional intelligence, and how it stacks up against ChatGPT.

Unveiling Grok 4.1: Benchmark Breakthroughs

Between 1 and 14 November 2025, xAI conducted a silent rollout of Grok 4.1, gradually expanding access to test builds and gathering feedback from a growing number of users. Blind preference tests during this phase showed users selecting Grok 4.1 over previous versions 64.78 per cent of the time, indicating clear improvements in usability and appeal.

On the LMArena Text Leaderboard, Grok 4.1 secured the top overall spot in thinking mode with 1483 Elo and second place in fast mode, decisively surpassing Grok 4. Emotional intelligence benchmarks also saw a leap, with high EQ-Bench scores for empathy, as noted in a post on X. The Creative Writing v3 tests showcased Grok's flair for lively content.
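For readers unfamiliar with Elo figures, the sketch below shows how a rating gap maps to an expected head-to-head win rate, and the reverse. It assumes the standard logistic Elo formula with a 400-point scale; LMArena's exact scoring method may differ, and the function names are illustrative. Applying it to the 64.78 per cent blind preference figure is only a rough guide, since that test is a separate measurement from the leaderboard.

```python
import math


def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Expected win rate of A over B under the standard logistic Elo model
    (assumption: 400-point scale; not necessarily LMArena's exact method)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_gap_from_win_rate(win_rate: float) -> float:
    """Rating gap implied by a head-to-head win rate (inverse of the above)."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# The 64.78 per cent preference rate quoted in the article, read as a win rate.
gap = elo_gap_from_win_rate(0.6478)
print(f"Implied rating gap: about {gap:.0f} points")
```

Under these assumptions, a 64.78 per cent preference rate corresponds to a gap of a little over 100 rating points, and two equally rated models would each be expected to win half the time.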

Factual errors, often referred to as hallucinations, dropped in quick-answer scenarios, verified by real user questions and biography tests. Available free on grok.com, X, and mobile apps, Grok 4.1 retains its smart, consistent personality, thanks to AI-judged training.

Decoding 'Real-Time World Perception'

Grok 4.1 emphasises practical immediacy. Consider the prompt: 'Want me to check X for real-time visitor feedback or generate a map of these spots?' This stems from Grok 4's real-time search and tool use, now refined in 4.1 for sharper, more current responses that bypass ChatGPT's static data limitations.

As one X user noted, 'ChatGPT is an encyclopedia; it's static. Grok is a live feed', making it suitable for quick-moving fields like finance or news. It enables instant trend analysis and fact-checking, building trust in time-sensitive contexts.

A 42 per cent latency reduction and 18 per cent intent accuracy gain make Grok 4.1 more conversational and perceptive, preserving its 'razor-sharp intelligence' while enhancing emotional nuance. Musk's vision translates to an AI that actively engages the present, fostering more human-like partnerships.

Grok 4.1 Versus ChatGPT: Strengths and Rivalries

Grok 4.1 outshines ChatGPT in speed and real-time insights, excelling in fast problem-solving and analytical tasks. ChatGPT, however, provides superior polished outputs for structured work and has a stronger ecosystem. Grok 4.1's code-testing approach checks hallucinations aggressively, prioritising execution over training narratives, a unique edge in technical accuracy.

In 2025 comparisons, Grok leads in reasoning and latency, while ChatGPT dominates in features like voice and developer tools. Feedback on X praises Grok's fun, empathetic tone, especially in creative exchanges.

For live, dynamic needs, Grok is the go-to. For comprehensive support and polish, ChatGPT still holds sway. Their rivalry accelerates innovation, ultimately benefiting users.

Grok 4.1's launch embodies Musk's push for adaptive AI, blending real-time awareness with benchmark excellence to outpace competitors like ChatGPT. As these systems evolve, they promise to make powerful tools available to all, ushering in a future of intuitive, empathetic technology that mirrors our world's pace.