Changelog Master Feed / Creating tested, reliable AI applications (Practical AI #295)

Description

It can be frustrating to get an AI application working amazingly well 80% of the time and failing miserably the other 20%. How can you close the gap and create something you can rely on? In this episode, Chris and Daniel talk through this process, behavior testing, and the flow from prototype to production. They also discuss the apparent slowdown in the release of frontier models.

Duration
50:09
Publishing date
2024-11-13 19:30
Link
https://changelog.com/practicalai/295
Enclosures
https://op3.dev/e/https://cdn.changelog.com/uploads/practicalai/295/practical-ai-295.mp3
audio/mpeg

Shownotes

Sponsors:

  • Fly.io: The home of Changelog.com. Deploy your apps close to your users with global Anycast load-balancing, zero-configuration private networking, hardware isolation, and instant WireGuard VPN connections. Push-button deployments that scale to thousands of instances. Check out the speedrun to get started in minutes.
  • Timescale: Purpose-built performance for AI. Build RAG, search, and AI agents on the cloud and with PostgreSQL and purpose-built extensions for AI: pgvector, pgvectorscale, and pgai.
  • Eight Sleep: Up to $600 off Pod 4 Ultra. Go to eightsleep.com/changelog and use the code CHANGELOG. You can try it for free for 30 days, but we’re confident you will not want to return it (we love ours). Once you experience AI-optimized sleep, you’ll wonder how you ever slept without it. Currently shipping to: United States, Canada, United Kingdom, Europe, and Australia.

Deeplinks to Chapters (start offsets in seconds)

0 Welcome to Practical AI
57 Sponsor: Fly
213 Thanksgiving preparations
297 Agents in production
387 AI ceiling & current hype
519 Level of transformation
649 Current models are mostly good enough
1015 Sponsor: Timescale
1174 Robust AI workflows
1479 Finding the right workflow
1847 Transition from notebook to code
2046 Sponsor: Eight Sleep
2204 Testing and integrating
2407 Sketching out a good framework
2843 Roles have shifted
2960 Outro