Evals: AI Engineer World's Fair 2025

AI Engineer | 16 videos | Updated 1 month ago

19:46

Five hard earned lessons about Evals — Ankur Goyal, Braintrust

AI Engineer

19:12

On Engineering AI Systems that Endure The Bitter Lesson - Omar Khattab, DSPy & Databricks

AI Engineer

15:22

Evals Are Not Unit Tests — Ido Pesok, Vercel v0

AI Engineer

5:14

The Future of Evals - Ankur Goyal, Braintrust

AI Engineer

1:43:46

How to build world-class AI products — Sarah Sachs (AI lead @ Notion) & Carlos Esteban (Braintrust)

AI Engineer

16:28

Perceptual Evaluations: Evals for Aesthetics — Diego Rodriguez, Krea.ai

AI Engineer

19:14

2025 is the Year of Evals! Just like 2024, and 2023, and … — John Dickerson, CEO Mozilla AI

AI Engineer

19:12

Fuzzing in the GenAI Era — Leonard Tang, Haize Labs

AI Engineer

5:41

Why should anyone care about Evals? — Manu Goyal, Braintrust

AI Engineer

19:23

How to look at your data — Jeff Huber (Chroma) + Jason Liu (567)

AI Engineer

16:15

Turning Fails into Features: Zapier’s Hard-Won Eval Lessons — Rafal Willinski, Vitor Balocco, Zapier

AI Engineer

40:28

[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)

AI Engineer

20:33

Evaluating AI Search: A Practical Framework for Augmented AI Systems — Quotient AI + Tavily

AI Engineer

32:28

Strategies for LLM Evals (GuideLLM, lm-eval-harness, OpenAI Evals Workshop) — Taylor Jordan Smith

AI Engineer

1:25:08

[Evals Workshop] Mastering AI Evaluation: From Playground to Production

AI Engineer

19:32

From Self-driving to Autonomous Voice Agents — Brooke Hopkins, Coval

AI Engineer