Braintrust Weekly Update

Ankur Goyal · Founder

09 October 2023

It’s been a busy week for us at Braintrust. Here are some of the new features we shipped:

Dataset UI

You can easily fine-tune GPT-3.5 to generate SQL queries using OpenAI, then use Braintrust to evaluate how the fine-tuned model compares to the base model. Check out the Jupyter Notebook example here to get started.
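To make the comparison concrete, here is a minimal sketch of the idea in plain Python. It does not call the OpenAI or Braintrust APIs: `base_model` and `fine_tuned_model` are hypothetical stand-ins for real model calls, the two test cases are made up for illustration, and the exact-match scorer is just one simple assumption about how you might score generated SQL. The notebook linked above shows the real workflow.

```python
from typing import Callable, Dict, List


def normalize_sql(query: str) -> str:
    """Lowercase, drop a trailing semicolon, and collapse whitespace."""
    return " ".join(query.strip().rstrip(";").lower().split())


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 if the two queries match after normalization, else 0.0."""
    return 1.0 if normalize_sql(output) == normalize_sql(expected) else 0.0


def evaluate(model: Callable[[str], str], cases: List[Dict[str, str]]) -> float:
    """Average exact-match score of a model over all cases."""
    scores = [exact_match(model(case["input"]), case["expected"]) for case in cases]
    return sum(scores) / len(scores)


# Illustrative cases only; a real evaluation would use a proper dataset.
CASES = [
    {"input": "How many users are there?", "expected": "SELECT COUNT(*) FROM users"},
    {"input": "List all user names.", "expected": "SELECT name FROM users"},
]


def base_model(question: str) -> str:
    # Hypothetical stand-in for a call to the base model.
    return "select name from users;"


def fine_tuned_model(question: str) -> str:
    # Hypothetical stand-in for a call to the fine-tuned model.
    answers = {
        "How many users are there?": "SELECT COUNT(*) FROM users;",
        "List all user names.": "select name  from users",
    }
    return answers.get(question, "")


if __name__ == "__main__":
    print("base:", evaluate(base_model, CASES))
    print("fine-tuned:", evaluate(fine_tuned_model, CASES))
```

Running both models over the same cases with the same scorer is what makes the two numbers comparable. In practice you would likely want a more forgiving scorer than exact match, for example one that executes both queries and compares the results.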

We evaluated the Alpaca evals leaderboard in Braintrust
The Alpaca evals use Claude and GPT-4 to rank how different LLMs perform on a variety of tasks. You can see the aggregated rankings, and also dig into individual models to better understand their strengths and weaknesses. Check out the Alpaca Evals project on Braintrust to dig in further. No login is required.

We improved Datasets. See when they were last edited and the version number from the UI.
Hover over a dataset's ID in the UI to see when it was last changed. We also provide example code so you can quickly use the current dataset version in your project. Learn more in our datasets guide.

Release notes

