OpenAI o1, New Gemini models, Qwen 2.5

OpenAI's o1: an AI model with enhanced reasoning to handle complex tasks

o1-mini | ChatHub
o1-mini is OpenAI’s cost-efficient reasoning model. It excels at STEM, especially math and coding, and nearly matches the performance of OpenAI o1 on evaluation benchmarks such as AIME and Codeforces.

OpenAI has introduced o1, a new series of models designed to think before responding. The models are particularly adept at complex tasks in science, coding, and math, which makes them a valuable tool for users who need deep reasoning capabilities.

For more details, read the release post from OpenAI.

Updated Gemini 1.5 models: A Leap in General Performance

Gemini 1.5 Pro | ChatHub
Gemini 1.5 Pro is a multimodal AI model developed by Google DeepMind. It excels at processing and understanding text, images, audio, and video, and features a long context window of up to 1 million tokens. The model powers generative AI services across Google’s platforms and is available to third-party developers.

Google has also released updated versions of its Gemini 1.5 models, offering significant improvements in quality, particularly in math, long-context processing, and vision tasks. The models are versatile, designed to handle a broad spectrum of text, code, and multimodal tasks. Google describes the update as follows:

With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-efficient to build with in production. We see a ~7% increase in MMLU-Pro, a more challenging version of the popular MMLU benchmark. On MATH and HiddenMath (an internal holdout set of competition math problems) benchmarks, both models have made a considerable ~20% improvement. For vision and code use cases, both models also perform better (ranging from ~2-7%) across evals measuring visual understanding and Python code generation.

For more information, check out the release post from Google.

Qwen2.5: A Landmark in Open-Source AI

Qwen2.5 72B | ChatHub
Qwen2.5 is a model pretrained on a large-scale dataset of up to 18 trillion tokens, offering significant improvements in knowledge, coding, mathematics, and instruction following over its predecessor, Qwen2. It is also better at generating long texts, understanding structured data, and producing structured outputs, and it supports more than 29 languages.

In the months following Qwen2’s release, developers have provided invaluable feedback, leading to the development of smarter and more knowledgeable language models. The introduction of Qwen2.5 marks a significant milestone in open-source AI.

Highlights:

  • Variety of Sizes: Qwen2.5 is available in sizes from 0.5B to 72B parameters, catering to diverse needs.
  • Specialized Models: Includes Qwen2.5-Coder and Qwen2.5-Math, tailored for coding and mathematical tasks.
  • Open-Source Accessibility: Most models are licensed under Apache 2.0, encouraging collaboration and innovation.
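To give a feel for what the 0.5B to 72B range means in practice, here is a minimal back-of-the-envelope sketch in Python. It estimates only the storage needed for the model weights as parameters times bytes per parameter. The function name, the precision table, and the use of the nominal sizes (0.5 and 72 billion) are assumptions made for illustration; real checkpoints differ slightly in parameter count, and running a model also needs memory for activations and the KV cache, which this ignores.

```python
# Rough weight-memory estimate: parameter count x bytes per parameter.
# Assumption: nominal sizes from the list above; actual parameter counts
# and runtime overhead (activations, KV cache) are not included.

# Common storage precisions and their size in bytes per parameter.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Only the two endpoints of the size range named in the text.
NOMINAL_SIZES_BILLIONS = {"smallest (0.5B)": 0.5, "largest (72B)": 72.0}


def weight_memory_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes).

    One billion parameters at one byte each is one decimal gigabyte,
    so the estimate is simply billions of parameters x bytes per parameter.
    """
    return params_billions * bytes_per_param


if __name__ == "__main__":
    for label, size in NOMINAL_SIZES_BILLIONS.items():
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(size, nbytes)
            print(f"{label:>16} @ {precision}: ~{gb:g} GB of weights")
```

Under these assumptions the smallest model's weights fit comfortably on a laptop, while the largest needs multiple high-memory accelerators or quantization, which is one reason a range of sizes caters to diverse needs.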