Press "Enter" to skip to content

Curated SQL Posts

Building Materialized Lake Views in Microsoft Fabric

Nikola Ilic presses the Easy button:

For the longest time, building a medallion architecture in Microsoft Fabric meant stitching together a small orchestra of moving parts: notebooks for the transformations, pipelines for orchestration, schedules for refresh, custom code for data quality checks, and the Monitor Hub for keeping an eye on whether anything actually worked. Every layer worked – until something didn’t, and then you had to figure out which layer broke, why, and which downstream layers got affected along the way.

If you’ve ever tried to debug a silver layer that didn’t update because the bronze notebook failed three hours ago, you know exactly what I’m talking about.

Then, at FabCon Atlanta in March 2026, materialized lake views (MLVs) went generally available. And the story they’re telling is simple: what if your entire medallion pipeline could be a few SELECT statements?

Let me walk you through the whole thing – what they are, how they work, what changed between preview and GA, and where they fit (and where they don’t) in your architecture.

Read on for that walkthrough.

Leave a Comment

Generating Sample Data in Fabric Dataflows

Chris Webb builds some data:

Back in December the FabricAI.Prompt() M function was released in Fabric Dataflows Gen2. Most of the people writing about it at that time, as in this great post by my colleague Sandeep Pawar, focused on calling this function for each row in a table – something that the UI in the editor makes easy. However the FabricAI.Prompt() function itself is a lot more flexible. You can use it to summarise whole tables of data as I showed here; you can also use it to generate sample data. This is similar to what I blogged about here where I got Copilot to generate M code that returned sample data but using FabricAI.Prompt() is maybe a bit simpler.

Click through to see how.

Leave a Comment

Unplanned Failover and SQL Server on Kubernetes

Anthony Nocentino performs additional testing:

In my planned failover walkthrough, I showed what happens when you deliberately move the primary role to another replica. That’s the easy case. Now I want to show what happens when the primary pod just disappears unexpectedly, like during a node failure or a container crash. No graceful shutdown, no demotion, just gone.

I ran two test scenarios, each cycling the primary role across all three pods by force-deleting the current primary three times in a row. First, a 5GB TPC-C database idle. Then, that same 5GB database under sustained HammerDB TPC-C load. Six force-deletes total, six successful automatic failovers. I’ll walk through the error log from the promoted replica, the operator’s detection and recovery behavior, and the full timing data.

Read on to see how Anthony’s SQL Server Kubernetes operator handles when things go bump in the night.

Leave a Comment

Training, Serving, and Deploying Scikit-Learn Models via FastAPI

Abid Ali Awan serves a model:

In this article, you will learn how to train a Scikit-learn classification model, serve it with FastAPI, and deploy it to FastAPI Cloud.

Topics we will cover include:

  • How to structure a simple project and train a Scikit-learn model for inference.
  • How to build and test a FastAPI inference API locally.
  • How to deploy the API to FastAPI Cloud and prepare it for more production-ready usage.

Click through for the process.

Leave a Comment

Cross-Workspace MLflow Logging Available in Microsoft Fabric

Ruixin Xu announces a feature now generally available:

Cross-workspace logging works through the synapseml-mlflow package, which provides a Fabric-compatible MLflow tracking plugin. The core idea is simple: set the MLFLOW_TRACKING_URI* to point at your target workspace and use standard MLflow commands. Your experiments, metrics, parameters, and registered models land in the workspace you choose — not just the one you’re running in.

Read on for the full announcement.

Leave a Comment

Filtered Indexes and Computed Columns

Greg Low has a public service announcement:

On a client site some years back, I came across a situation (unfortunately too common) where a column in a table was being used for two purposes. It could either hold an integer value or a string. Only about 100 rows out of many millions had the integer value. Some of the client code needed to calculate the maximum value when it was an integer. First step I tried was to add a persisted calculated column like so:

After that, Greg tried to create a filtered index. Read on to see how that worked.

Leave a Comment

Testing Your SQL Server Backups

Vlad Drumea stresses the need to test:

Folks think that if they just take backups regularly, and their backup intervals match their recovery point objectives, then they can rest assured knowing (more like falsely thinking) they have backups from which to restore in case something bad happens to a database or to the entire instance.

Yeah, there are several things that can go wrong, even beyond the solid answers Vlad provides. Think about developers taking one-off backups that break the log chain, overly-aggressive purge processes that delete backups before they’re fully moved to long-term storage, etc.

Definitely check out Vlad’s post. If you’re interested, I also have a video series on YouTube that covers SQL Server backup and restore.

Leave a Comment

Postgres Performance on Linux 7.0 and Benchmark Artifacts

Christophe Pettus digs into a finding:

A benchmark came out of AWS earlier this month showing PostgreSQL throughput on Linux 7.0 dropping to 0.51x what the same workload produced on Linux 6.x. The Phoronix headline wrote itself. Hacker News did what Hacker News does. By the end of the week, I had been asked by three separate clients whether they needed to hold their kernel upgrades.

They don’t. Almost nobody does.

Read on for the answer, including a dive into the nature of the problem and a quick-and-easy fix for most cases.

Leave a Comment

Maestro Now 1.1.0

Will Hipson lays out an update:

maestro has officially graduated to stable release with version 1.0.0 back in January 2026 and now its latest version 1.1.0. This marks a commitment to maintaining a stable API and increased reliance on using maestro in production. In our environment alone, maestro has orchestrated millions of pipeline executions over the course of a year, effectively making it the heartbeat of our entire data stack.

If you haven’t heard of maestro, it’s a pipeline orchestration package. You can learn more about it here.

Click through to see what’s changed between the 1.0.0 release and now. H/T R-Bloggers.

Leave a Comment