Media Summary: Introducing the Agent Arena by Gorilla X LMSYS Chatbot Arena. How do different agents stack up in tasks like search, ... Most agents get tested by running a few queries and checking if it looks right. Laurie calls this the vibes problem: it doesn't catch ... As agents evolve from text conversations to autonomous agents capable of multi-step reasoning, tool use, and real-world task ...

Agentic Evals By Shishir Patil - Detailed Analysis & Overview

Getting context into an LLM is not just a retrieval problem. It is a search problem. This workshop digs into the part of context ... Join Google's AI leadership to discuss how the rapid acceleration of model capabilities is transforming productivity and access to ... Episode 98 of the Stanford MLSys Seminar Series! Teaching LLMs to Use Tools at Scale Speaker:

The Lakehouse made big data accessible. But it did not come with the management layer needed for what comes next. Gorilla is an open-source LLM from the Sky Lab at UC Berkeley that generates API calls for massive APIs. Gorilla is built by ... With nearly two-thirds of enterprise developers planning production deployments of large language models this year, LLM ... Today, I want to share a new episode with Aman Khan. The best way to learn about AI ... Hamel Husain and Shreya Shankar teach the world's most popular course on AI ... In this episode we are joined by a dynamic young entrepreneur, Tirth ...

Join Mahesh Yadav, top Maven instructor and former AI PM leader at Google, Meta, and Microsoft. In this session, Mahesh breaks ...

Photo Gallery

Agentic Evals by Shishir Patil
LLM Agent Arena (agent-arena.com)
Ship Real Agents: Hands-On Evals for Agentic Applications — Laurie Voss, Arize
Agentic Evaluations Workshop - Deep Dive on the Future on Evals for Agents.
Agentic Search for Context Engineering — Leonie Monigatti, Elastic
Defining the agentic AI era
Teaching LLMs to Use Tools at Scale - Shishir Patil | Stanford MLSys #98
Ep 7: DJ Patil – How Agentic AI Breaks Data Platforms
Generating Conversation: Gorilla, An LLM for Massive APIs - Shishir Patil, Tianjun Zhang (Episode 7)
Lessons from the Trenches: Building LLM Evals That Work IRL: Aparna Dhinakaran
Agentic AI in the Enterprise 2026
Complete Beginner's Course on AI Evaluations in 50 Minutes (2025) | Aman Khan