Blog

How to improve your evaluations
Learn how to improve your evals by identifying new evaluators, iterating on existing scorers, and adding new test cases.
How Zapier builds production-ready AI products
Zapier was one of the earliest adopters of GenAI. In this post, we share insights from Mike Knoop, Co-founder & Head of AI at Zapier.
AI Development Loops
Key activities that enable fast feedback and clear signal when developing AI features.
Getting started with automated evaluations
Three actionable approaches for engineering teams to get started with automated evaluations.
Eval feedback loops
Learn how to build robust eval feedback loops for AI products by connecting real-world log data to your evals. Discover best practices for structuring evals, flowing production logs into eval datasets, and using Braintrust to streamline the process.
Braintrust selected to be in the Enterprise Tech 30
The Enterprise Tech 30 by Wing Venture Capital names the highest potential private companies in enterprise technology.
How Hostinger Evaluates AI Applications with Braintrust
Liucija, Senior Data Scientist on the AI team at Hostinger, provides an overview of how she leverages Braintrust to accelerate Hostinger's AI development process and automate over 40% of customer support chat conversations.
2023, a year in review
Check out your Braintrust 2023 year in review to see how you did this year!
Braintrust's seed round: $5m to build infrastructure for AI products
Announcing Braintrust's seed round led by Greylock. The round builds on our early traction with customers like Zapier, Coda, Airtable, and Instacart and allows us to accelerate our vision of building world-class infrastructure for AI products. We are hiring for a number of roles, so please check out our careers page if you are interested in joining us.
Open sourcing the AI proxy
The Braintrust AI Proxy is now open source! We also added support for Azure OpenAI, provider load balancing, and the Replicate lifeboat model.
AI proxy: fostering a more open ecosystem
Introducing Braintrust's latest feature: an AI proxy that lets you use open source models like LLaMa 2 and Mistral, as well as all of OpenAI's and Anthropic's models, behind a single interface with caching, security, and API key management built in.
State of AI Development 2023
Retool recently surveyed over 1,500 workers about how their companies are adopting AI for its State of AI 2023 report. Here's what they are struggling with and how Braintrust can help.
The AI product development journey
Building reliable AI apps is hard. It’s easy to build a cool demo but hard to build an AI app that works in production for real users. In traditional software development, there’s a set of best practices like setting up CI/CD and writing tests to make your software robust and easy to build on. But with LLM apps, it’s not obvious how to create these tests or processes.
Weekly update 11/13/23
Function calling and tool support, new blog posts, and project UI improvements.
Weekly update 11/06/23
Perplexity models support, new OpenAI models, reworked diff selector in experiment view.
Weekly update 10/30/23
Resizable sidebar, new help tooltips, performance optimizations, Replit.
Weekly update 10/23/23
Auto input variables in the playground, duration metrics, performance optimizations, partner releases.
Weekly update 10/16/23
Tracing, experiment dashboard customization, text-block prompts, bigger tables, new eval docs.
Weekly update 10/09/23
Performance improvements, fine tuning tutorial, Alpaca Evals, autocomplete in the playground.
It's time to build reliable AI
Introducing Braintrust: the enterprise-grade stack for building AI products. From evaluations to the prompt playground to data management, we take the uncertainty and tedium out of incorporating AI into your business.
