Looking at the Chatbot Arena leaderboard this afternoon, I came across something unexpected.
For anyone who may not be familiar, Chatbot Arena is a place where the AI community evaluates and rates different LLMs by comparing their outputs in head-to-head competition. The user provides a prompt, and two LLMs each generate a response at the same time. The user then votes on which LLM provided the better output. That vote is factored into each model's overall score, and the leaderboard updates in real time.
I hadn’t visited Chatbot Arena in a while. The last time I did, I was still learning what the LLM landscape looked like, and I didn’t know all of the models on the leaderboard.
But today I have a much better understanding of what I am looking at, having used many of these models.
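For a rough sense of how a single vote could feed into a leaderboard score, here is a minimal sketch of an Elo-style rating update. This is an assumption for illustration, not a confirmed description of how Chatbot Arena computes its scores; the function names, the starting ratings of 1000, and the K-factor of 32 are all hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A wins, under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one head-to-head vote.

    k is a hypothetical step size: larger values make each vote count more.
    """
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    # The winner gains exactly what the loser gives up, so the total is conserved.
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta


# Two models start level, and the user votes for model A.
print(update_ratings(1000.0, 1000.0, a_won=True))  # (1016.0, 984.0)
```

The point of the sketch is that an upset win against a higher-rated model moves the ratings more than an expected win, which is why a new model can climb a leaderboard quickly.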

I see xAI’s Grok, OpenAI’s GPT-4.5, o1, and o3 models, Google’s Gemini, DeepSeek’s R1 model, and of course Anthropic’s Claude 3.7 Sonnet.
But I also see two names that I am unfamiliar with in AI.
The first is Zhipu.
I took a few minutes just now to look up Zhipu, and I feel like they deserve their own feature. Truthfully, I had never heard of Zhipu before today, but now I am intrigued by what I have read. A couple of things stood out:
- multimodal capabilities, specializing in image and video understanding
- 20 million downloads
And then there was Alibaba. Up to this point, I was completely unaware of Alibaba building in the AI space. Yet here were two different AI models.
Seeing that, I was surprised that I hadn’t heard anyone talking about Alibaba’s Qwen2.5-Max or Qwen Plus models.
Coincidentally, today, as I am writing this, Alibaba released its latest reasoning model, QwQ-32B.
Alibaba claims that QwQ-32B rivals DeepSeek’s R1 model and outperforms OpenAI’s o1-mini model.
The model also requires significantly less computing power: it contains only 32 billion parameters, compared with R1’s 671 billion, which makes it roughly one-twentieth the size.
It sounds interesting, and I will likely post an update as I learn more about Alibaba’s AI work.
But for now, Alibaba’s AI work is flying under the radar.
Will the new QwQ-32B model change this?
Time will tell.
But in the meantime, I need to try it.