Posts on Matthew Norberg's Data Engineering Blog

Boston Code Camp Recap: Building AI Agents in Databricks

mattnorberg4@gmail.com (Matthew Norberg) — Sun, 05 Apr 2026 00:00:00 +0000

A recap of my session at Boston Code Camp, covering how to build, deploy, and test AI agents in Databricks using the ResponsesAgent framework. Includes links to the slides and code on GitHub.

The Hidden Cost of Databricks AI Agent Redeploys

mattnorberg4@gmail.com (Matthew Norberg) — Thu, 12 Feb 2026 00:00:00 +0000

Redeploying AI agents in Databricks can quietly increase serving costs in ways that aren’t immediately obvious. Each call to agents.deploy() creates a new agent version, and even versions receiving 0% of traffic may still consume compute resources. In this post, I walk through how we uncovered this behavior, the hypotheses we tested, and the experiment that confirmed it. Cleaning up unused agent versions ultimately reduced our serving costs by roughly 50%.
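As a rough illustration of the cleanup idea, the sketch below flags versions that receive no traffic but still hold compute. It is plain Python over a made-up record type: `ServedVersion`, its fields, and `find_idle_versions` are hypothetical names for illustration, not part of the Databricks API, and the sample data is invented.

```python
from dataclasses import dataclass


@dataclass
class ServedVersion:
    """Hypothetical record of one deployed agent version (not a Databricks type)."""
    version: str
    traffic_percent: int
    provisioned: bool  # True if compute is still allocated to this version


def find_idle_versions(versions):
    """Return the version names that get 0% of traffic but still hold compute."""
    return [
        v.version
        for v in versions
        if v.traffic_percent == 0 and v.provisioned
    ]


# Invented example: three deploys, only the latest one serves traffic.
versions = [
    ServedVersion("1", 0, True),
    ServedVersion("2", 0, True),
    ServedVersion("3", 100, True),
]
idle = find_idle_versions(versions)
print(idle)
```

In a real workspace the list of versions and their traffic split would come from the serving endpoint's configuration, and the flagged versions would be candidates for deletion.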

Creating AI Processing Pipelines: A Data-First Approach

mattnorberg4@gmail.com (Matthew Norberg) — Wed, 17 Dec 2025 00:00:00 +0000

Databricks Mosaic AI Gateway captures rich AI agent request and response data, but not in a format suitable for analysis. Turning that data into insights requires processing pipelines, and before building them, you need to understand the different shapes inference data can take. This post argues for a data-first approach that intentionally generates and examines real inference cases before designing pipelines that have to survive production.

Tracing with Databricks Mosaic AI Gateway: A Practical Guide

mattnorberg4@gmail.com (Matthew Norberg) — Sun, 05 Oct 2025 00:00:00 +0000

Step-by-step guide to enabling MLflow tracing with Databricks Mosaic AI Gateway. Details the recommended ResponsesAgent approach, examines alternative methods (foundation/external endpoints and custom Python models), and highlights the pitfalls that make the agent path preferable.

Databricks Data Quality Expectations Guide

mattnorberg4@gmail.com (Matthew Norberg) — Sun, 21 Sep 2025 00:00:00 +0000

Practical walkthrough for implementing data quality expectations in Databricks Delta Live Tables (now Lakeflow Declarative Pipelines). Covers a four-step process to profile data, formalize and translate rules, quarantine noncompliant records, and balance strict versus permissive checks, with lessons learned and links to the talk, slides, and demo code.
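The quarantine step can be sketched without any pipeline framework. The example below is plain Python, not the expectations API from the post: `split_by_rules`, the rule names, and the sample records are all assumptions made for illustration. It routes each record to a valid or quarantined list and records which rules failed.

```python
def split_by_rules(records, rules):
    """Split records into (valid, quarantined) using named predicate rules.

    Quarantined records keep their original fields plus a 'failed_rules' list,
    so they can be inspected later instead of being silently dropped.
    """
    valid, quarantined = [], []
    for record in records:
        failed = [name for name, check in rules.items() if not check(record)]
        if failed:
            quarantined.append({**record, "failed_rules": failed})
        else:
            valid.append(record)
    return valid, quarantined


# Hypothetical rules, formalized from what profiling the data might suggest.
rules = {
    "id_not_null": lambda r: r.get("id") is not None,
    "amount_non_negative": lambda r: r.get("amount", 0) >= 0,
}

# Invented sample records.
records = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.0},
]

valid, quarantined = split_by_rules(records, rules)
print(len(valid), len(quarantined))
```

Keeping the failed rule names alongside each quarantined record makes it easier to judge whether a check is too strict or too permissive.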

The Databricks Tool You Didn't Know You Needed

mattnorberg4@gmail.com (Matthew Norberg) — Tue, 16 Sep 2025 00:00:00 +0000

Explains how to clearly distinguish Databricks environments by adding environment-specific labels to browser tabs with a Tampermonkey userscript. Outlines the risk of identical tab titles, provides the script with domain placeholders, and gives step-by-step setup and troubleshooting guidance for dev/QA/prod.