19:46
Five hard earned lessons about Evals — Ankur Goyal, Braintrust
AI Engineer
19:12
On Engineering AI Systems that Endure The Bitter Lesson - Omar Khattab, DSPy & Databricks
15:22
Evals Are Not Unit Tests — Ido Pesok, Vercel v0
5:14
The Future of Evals - Ankur Goyal, Braintrust
1:43:46
How to build world-class AI products — Sarah Sachs (AI lead @ Notion) & Carlos Esteban (Braintrust)
16:28
Perceptual Evaluations: Evals for Aesthetics — Diego Rodriguez, Krea.ai
19:14
2025 is the Year of Evals! Just like 2024, and 2023, and … — John Dickerson, CEO Mozilla AI
Fuzzing in the GenAI Era — Leonard Tang, Haize Labs
5:41
Why should anyone care about Evals? — Manu Goyal, Braintrust
19:23
How to look at your data — Jeff Huber (Chroma) + Jason Liu (567)
16:15
Turning Fails into Features: Zapier’s Hard-Won Eval Lessons — Rafal Willinski, Vitor Balocco, Zapier
40:28
[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)
20:33
Evaluating AI Search: A Practical Framework for Augmented AI Systems — Quotient AI + Tavily
32:28
Strategies for LLM Evals (GuideLLM, lm-eval-harness, OpenAI Evals Workshop) — Taylor Jordan Smith
1:25:08
[Evals Workshop] Mastering AI Evaluation: From Playground to Production
19:32
From Self-driving to Autonomous Voice Agents — Brooke Hopkins, Coval