
About Me

Author: Matthew Norberg
Data engineer building AI applications, reliable pipelines, and practical tools.

Hi! I’m Matthew Norberg, a Data Engineer with a passion for turning complex data challenges into clean, maintainable, and high-performing solutions. Over the past several years, I’ve had the opportunity to work with Databricks, Azure, and a variety of modern data tools, building platforms, pipelines, and systems that help organizations make better use of their data.

My Journey in Data Engineering

I’ve always enjoyed programming and problem-solving, which first led me to software engineering during my undergraduate studies. In graduate school, I started working more with data and earned a degree in Computer Science with a concentration in Data Science. That combination of software engineering and data science led me naturally to data engineering, a middle ground between building robust systems and working with meaningful data.

My approach to data engineering can be summed up in one simple philosophy: first make it work, then make it work better. It’s a mindset I learned from my mentor, Rich Dudley. I thrive in environments where I can take a messy or incomplete problem and turn it into something reliable, scalable, and elegant.

My Projects & Work Highlights

  • Platform Engineering & Architecture: Built and maintained Databricks environments in Azure using Terraform and Terragrunt. Configured Unity Catalog catalogs, external locations, and volumes, and designed Dev, QA, and Prod environments. Contributed to cost optimization efforts guided by the Databricks Well-Architected Framework.

  • Data Pipelines & Medallion Architecture: Designed fault-tolerant pipelines that move data through the medallion architecture layers—bronze, silver, and gold—improving data quality at each step and preparing it for use by downstream teams. Pipelines were engineered to minimize manual steps, making workflows easy to deploy, manage, and maintain.

  • Safeguarded Sensitive Procedures: Wrote code to handle sensitive processes mindfully, reducing the risk of mistakes. For example, I developed a WordPress ingestion client in Python with defensive safeguards to ensure proper usage.

  • Data Quality & Governance: Developed data quality checks and DLT pipelines using Data Expectations. Built dashboards to monitor data health, integrated governance tools like Atlan, and was selected as a Databricks Data & AI Summit speaker candidate, submitting a presentation on Data Expectations in Databricks.

  • Large Data Processing & Analytics: Tuned Spark pipelines and computed complex KPIs. Reverse-engineered Tally Street metrics—a tool used by accountants to extract KPIs from general ledgers—using Python and SQL to improve speed, accuracy, and accessibility.

  • Democratizing Data & AI: Built AI/BI Genies in Databricks, AI-driven tools that allow colleagues to ask questions about company data and generate dashboards, making AI and analytics accessible to non-technical users.

  • Salesforce Knowledge Base Ingestion: Ingested Salesforce Knowledge Base articles (help desk pages) into Databricks using the Salesforce connector. Added this content to a vector index to power a search application, making it easier for customers to find support and for internal teams to access knowledge.

  • WordPress Website Ingestion: Created a Python connector from scratch to ingest the entire company WordPress site into Databricks. Implemented retry and backoff logic to handle API failures or network issues. Designed workflows to capture the full website on day 1, then only incremental changes on subsequent days—ensuring that temporary failures don’t require manual reruns and the system automatically catches up the next day.

  • CI/CD & Automation: Configured service principals and created Azure DevOps CI/CD pipelines using Infrastructure as Code practices for reliable, repeatable deployments.

  • Documentation & Knowledge Sharing: Created Confluence documentation for all key processes, designed for future ingestion into a company-wide vector database to make knowledge accessible to other engineers.
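
To illustrate the retry, backoff, and incremental catch-up ideas mentioned in the WordPress ingestion item above, here is a minimal Python sketch. It is not the actual connector: `with_retry`, `changed_since`, the `modified` field, and every default value are hypothetical names and assumptions chosen for illustration, and the sketch uses only the standard library.

```python
import random
import time
from datetime import datetime
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 5,       # assumed default, not from the original project
    base_delay: float = 1.0,     # seconds before the first retry (assumption)
    max_delay: float = 30.0,     # cap on the exponential delay (assumption)
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on any exception with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter avoids many clients retrying at the same instant.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


def changed_since(pages: Iterable[dict], watermark: Optional[datetime]) -> list:
    """Return pages to ingest: everything on the first run, then only newer changes.

    Each page is assumed to be a dict with a datetime under the key "modified".
    """
    if watermark is None:
        return list(pages)  # first run: full load
    return [p for p in pages if p["modified"] > watermark]
```

One way to use the two helpers together is to advance the stored watermark only after a run succeeds. If a run fails even after retries, the watermark stays where it was, so the next scheduled run picks up the missed changes without a manual rerun.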

Outside the Data World

When I’m not working with data, I like to stay active and explore the outdoors. Golfing and hiking are two of my favorite ways to recharge and stay focused, and I also love rock climbing, which keeps me on my toes—literally and figuratively!

I’m also a huge pizza and coffee nerd:

  • I’ve learned to make pizza dough from scratch, and true to my data-driven nature, I’ve kept a tally in 2025 of every pizza I’ve made.
  • Nearly every morning, I make a pour-over coffee using my V60, my favorite coffee brewer. My go-to beans are typically light to medium roasts from George Howell Coffee.

I also genuinely enjoy tinkering with personal coding projects, experimenting with new tools, and learning ways to make complex systems simpler and more efficient. And just like in my professional life, I love solving puzzles—whether it’s in code, a tricky climbing route, or perfecting a pizza crust!


I’m always excited to connect with fellow data enthusiasts, share what I’ve learned, and continue growing as a data engineer. Thanks for stopping by my blog!