
Curated SQL Posts

Highlighting Rows in DAX via Visual Calculations

Marco Russo and Alberto Ferrari point out a row:

When it comes to visuals, users may want specific cells highlighted in order to spot important information quickly. While browsing the forums, we came across an interesting requirement that can easily be solved with a DAX measure: highlight an entire row based on the value in the last column of the visual only. In our example, we highlight Wide World Importers because it has the maximum value (71,904.98) in the last year (2026).

I’ve had a need for this several times in the past, so it’s nice to see you can do it via visual calculations.

So, let’s use this in Power BI Report Server. Oh, wait, you can’t. But if you’re not shackled to that train wreck, click through for a nice solution.


Goodhart’s Law and KPIs

Alexander Arvidsson talks metrics:

In 1974, the Soviet Union set production quotas for nails measured in tons.

Factories responded rationally. They produced a small number of enormous, completely useless nails. Problem solved. Quota met.

Moscow reconsidered and switched to quotas measured in number of nails.

Factories responded rationally. They produced millions of tiny, flimsy nails—too small to hold anything together.

The Soviet nail industry was, by any measurable standard, thriving. It was also producing almost nothing of value.

Click through for a detailed description of the problem and Alexander’s push for an answer. I typically summarize Goodhart’s Law in my own words as: any system can be gamed. I think Alexander’s tips on good measurement architecture are sound (and he even mentions gaming systems here), but even with all of those in place, you can still end up with a monkey’s paw of a KPI suite. Even the example of “ensure that users who log in are getting measurable value from the product, as evidenced by X” falls apart in the end, because we now maximize for X while hollowing out Y and Z; nobody’s watching those while we’re focused on X-maximization. Like, say, “ensure that users who log in are getting measurable value from the product, as evidenced by average order value.” Now we maximize for AOV, streamlining the experience for people making large orders and relentlessly pushing more cart items at the user. The other points aim to mitigate the worst of these results, but it’s an off-the-cuff example of how no system is perfect. Some can still be superior, though.


Page-Level Caching in PostgreSQL with PAX

Laetitia Avrot has a two-parter on PAX. The first part is from a couple of months ago and sets the stage:

Picture this: you walk into a library. Each book is a database record. PostgreSQL’s traditional storage (NSM – N-ary Storage Model) stores complete books on each shelf: chapter 1, chapter 2, chapter 3, all bound together.

Here’s the problem that keeps me up at night: when you need only chapter 3 from 1,000 books, you must pull each complete book off the shelf, flip through chapters 1 and 2 that you’ll never read, grab chapter 3, and move to the next book.

You’re wasting time. You’re wasting energy. You’re wasting cache bandwidth.
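The library analogy maps roughly onto the following sketch, which contrasts row-at-a-time (NSM-style) storage with a PAX-style page that regroups values into per-column minipages. The table and column names are illustrative, not from the papers:

```python
# NSM-style: whole records stored together. Scanning one field still
# drags every complete record through the cache.
rows = [{"id": i, "name": f"book{i}", "chapter3": i * 2} for i in range(1000)]
nsm_scan = [r["chapter3"] for r in rows]

# PAX-style: within a page, values are regrouped into per-column
# minipages, so a single-column scan reads one contiguous array.
page = {
    "id": list(range(1000)),
    "name": [f"book{i}" for i in range(1000)],
    "chapter3": [i * 2 for i in range(1000)],
}
pax_scan = page["chapter3"]  # contiguous, cache-friendly access
```

Both scans return the same values; the difference is purely in memory layout, which is what PAX exploits within each 8KB page.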

But it’s not all roses:

PAX looks elegant on paper: minipages, cache locality, column-oriented access inside an 8KB page. But the moment you ask how this actually would work with Postgres, the complexity arrives fast. NULL values, variable-length types, MVCC, boundary shifts. Let’s go through it all.

To be clear, this is not a product (today). It’s thinking about how to apply the results of an academic paper to Postgres and all of the tricky challenges that can come from what seems to be an easy idea.


ANY_VALUE() in Fabric Data Warehouse

Jovan Popovic notes a feature going GA:

Fabric Data Warehouse now supports the ANY_VALUE() aggregate, making it easier to write readable, efficient T-SQL when you want to group by a key but still return descriptive columns that are functionally the same for every row in the group.
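A minimal sketch of the pattern, using a hypothetical Sales table (before ANY_VALUE(), the common workaround was wrapping the descriptive column in MAX() or adding it to the GROUP BY):

```sql
-- CustomerName is functionally dependent on CustomerID, so any value
-- within the group works. Table and column names are illustrative.
SELECT
    CustomerID,
    ANY_VALUE(CustomerName) AS CustomerName,
    SUM(LineTotal) AS TotalSales
FROM dbo.Sales
GROUP BY CustomerID;
```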

Right now, this is only available in the Fabric Data Warehouse, so no Azure SQL DB, Managed Instance, or box product support at this time.


Word Order and Constraint Naming

Andy Levy is looking for a name:

Ten years (and a couple jobs) ago, I wrote about naming default constraints to avoid having SQL Server name them for you. I closed with the following statement:

SQL Server needs a name for the constraint regardless; it’s worth specifying it yourself.

We’re back with a new wrinkle in the story.

Read on for an interesting scenario where Andy very clearly named a constraint, yet the name didn’t take.
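For reference, explicitly naming a default constraint looks like this (table, column, and constraint names are illustrative):

```sql
-- Without ADD CONSTRAINT, SQL Server generates a name like
-- DF__Orders__Created__3B75D760 for you.
ALTER TABLE dbo.Orders
    ADD CONSTRAINT DF_Orders_CreatedDate
    DEFAULT (SYSUTCDATETIME()) FOR CreatedDate;
```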


Privilege Escalation via the DatabaseManager Role

Andreas Wolter looks into a security issue:

First of all: full credit to Emad Al-Mousa for identifying and publishing two privilege escalation paths.

His findings show how members of the ##MS_DatabaseManager## server-role in SQL Server can escalate to sysadmin by chaining existing functionality in unexpected ways.

He has published his findings responsibly after more than 8 months’ notice to MSRC: Microsoft SQL Server Privilege Elevation Through ##MS_DatabaseManager## Role [CVE-2025-24999]

Read on to learn more about the issue.

Microsoft fixed a separate privilege escalation issue but has updated the documentation for ##MS_DatabaseManager## to indicate that “Members of this role can potentially elevate their privileges under certain conditions. Apply the principle of least privilege when assigning this role and monitor all activity performed by its members.”


Updating a Mean without Original Data Points

John Cook has an interesting solution:

This post will look at the problem of updating an average grade as a very simple special case of Bayesian statistics and of Kalman filtering.

Suppose you’re keeping up with your average grade in a class, and you know your average after n tests, all weighted equally.
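The key identity is that a mean of equally weighted values can absorb a new observation without the original data points. A minimal sketch (the function name is mine, not John's):

```python
def update_mean(mean: float, n: int, x: float) -> float:
    """Fold a new observation x into the mean of n equally weighted values.

    Derivation: the new total is mean * n + x, over n + 1 values, which
    rearranges to mean + (x - mean) / (n + 1).
    """
    return mean + (x - mean) / (n + 1)
```

For example, if your average after two tests is 85 and you score 100 on the third, the updated average is 85 + 15/3 = 90, with no need to remember the first two scores.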

Click through for the walkthrough. This is similar to something I tried to puzzle out before ultimately admitting defeat: is there a way to calculate updates to the median without needing to know the entire set? In practical terms: how many pieces of information do I need in order to guarantee that I can maintain a median over time?

The best I could come up with was built on the premise of balancing new data points less than the median against those greater than it, where each greater-lesser pair cancels out. If you have roughly equal numbers of new data points on each side, your “elements of the median” array can be pretty small. But the problem is, for any sufficiently small k, where k represents the number of elements you keep in memory, it is possible for a localized run of (without loss of generality) lower-than-median data points to come in and completely wash out your memory. For example, if you keep 3 points in memory and four values below the median arrive, you no longer know what the median is.

Trying to solve this without knowing the shape of the distribution or making any sequencing assumptions is something I failed to do.
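For contrast, the standard exact approach, a two-heap running median, sidesteps the problem by simply keeping every observation, which is precisely the cost an incremental update would avoid. A minimal sketch:

```python
import heapq

class RunningMedian:
    """Exact running median. Note: it must retain all n observations,
    unlike the O(1)-state incremental mean."""

    def __init__(self):
        self.lo = []  # max-heap (negated values) holding the lower half
        self.hi = []  # min-heap holding the upper half

    def add(self, x: float) -> None:
        # Push through the lower heap, then rebalance so that
        # len(lo) is always len(hi) or len(hi) + 1.
        heapq.heappush(self.lo, -x)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self) -> float:
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2
```

Storage grows linearly with the stream, which is the crux of the problem above: any fixed-size summary can be washed out by a sufficiently lopsided run of values.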


Tracking Resource Utilization in Performance Monitor

Erik Darling has another video tutorial:

In this video, I delve into the resource metrics tabs of the full dashboard from my free SQL Server performance monitoring tool. I explain how these tabs provide more detailed information compared to the lighter version, including CPU utilization, TempDB usage, memory usage, and a selection of perfmon counters. The goal is to help you quickly identify when your server might be experiencing issues by showing real-time data and historical trends that can pinpoint specific performance bottlenecks. I also highlight how the wait stats section helps in understanding wait statistics, making it easier to diagnose problems related to resource contention and query compilation.

Click through for the video.
