H2O.ai redefines the frontier of general-purpose AI agents

H2O.ai, a portfolio company of Veligera Capital, has redefined the state of the art in general-purpose AI agents, outperforming Google and Microsoft to set a new benchmark record on GAIA (General AI Assistants) — the most rigorous test of real-world AI utility to date.

The GAIA benchmark assesses how effectively AI systems solve practical, open-ended problems requiring deep research, data analysis, document understanding, and reasoning. On average, human experts with relevant degrees score 92% across the 300-task GAIA test set, which takes them several person-days to complete.

H2O.ai’s h2oGPTe Agent ranked #1 on the GAIA leaderboard with a score of 65%, well ahead of Google’s Langfun Agent (49%), Microsoft Research (38%), and Hugging Face (33%). The h2oGPTe Agent beat the previous GAIA record by 15 percentage points, setting a new global standard.

Built with the world’s leading models for reasoning, multimodal image and video understanding, natural language processing, and code generation/execution, h2oGPTe reflects H2O.ai’s deep-stack innovation and engineering excellence. These results underscore H2O.ai’s leadership in the race to deliver adaptive, enterprise-grade AI agents that can meaningfully accelerate decision-making and digital transformation across industries.
