Posts

2026

Getting More Out of Databricks Genie Code

13 May 2026

A walkthrough of how to improve Databricks Genie Code’s output by building a custom instructions file and skills. Includes a real-world test project — a full medallion pipeline pulling from the TMDB API — to see whether the investment actually pays off.

Boston Code Camp Recap: Building AI Agents in Databricks

5 April 2026

A recap of my session at Boston Code Camp, covering how to build, deploy, and test AI agents in Databricks using the ResponsesAgent framework. Includes links to the slides and code on GitHub.

The Hidden Cost of Databricks AI Agent Redeploys

12 February 2026

Redeploying AI agents in Databricks can quietly increase serving costs in ways that aren’t immediately obvious. Each call to agents.deploy() creates a new agent version, and even versions receiving 0% of traffic may still consume compute resources. In this post, I walk through how we uncovered this behavior, the hypotheses we tested, and the experiment that confirmed it. Cleaning up unused agent versions ultimately reduced our serving costs by roughly 50%.

2025

Creating AI Processing Pipelines: A Data-First Approach

17 December 2025

Databricks Mosaic AI Gateway captures rich AI agent request and response data, but not in a format suitable for analysis. Turning that data into insights requires processing pipelines, and before building them, you need to understand the different shapes inference data can take. This post argues for a data-first approach that intentionally generates and examines real inference cases before designing pipelines that have to survive production.

Tracing with Databricks Mosaic AI Gateway: A Practical Guide

5 October 2025

Step-by-step guide to enabling MLflow tracing with Databricks Mosaic AI Gateway. Details the recommended ResponsesAgent approach, examines alternative methods (foundation/external endpoints and custom Python models), and highlights the pitfalls that make the agent path preferable.

Databricks Data Quality Expectations Guide

21 September 2025

Practical walkthrough for implementing data quality expectations in Databricks Delta Live Tables (Lakeflow Declarative Pipelines). Covers a four-step process to profile data, formalize and translate rules, quarantine noncompliant records, and balance strict versus permissive checks, with lessons learned and links to the talk, slides, and demo code.

The Databricks Tool You Didn't Know You Needed

16 September 2025

Explains how to clearly distinguish Databricks environments by adding environment-specific labels to browser tabs with a Tampermonkey userscript. Outlines the risk of identical tab titles, provides the script with domain placeholders, and gives step-by-step setup and troubleshooting guidance for dev/QA/prod.
