Curated SQL Posts

Mirroring to OneLake without Public Internet Access

Paul Hernandez builds a (virtual) network:

Mirroring has been a transformative technology for data integration tasks since the early Microsoft Fabric days. Moreover, this feature has been called “pain killer as a service” in community posts. In many projects, the data sources to be mirrored sit behind private networking and, for security reasons, are not accessible over the public internet. If you want to mirror, for example, an Azure SQL database, you’ll need a data gateway. According to the official docs: “If your Azure SQL Database is not publicly accessible and doesn’t allow Azure services to connect to it, you can set up virtual network data gateway or on-premises data gateway to mirror the data”.

In this post I’ll show you step-by-step how to set up connectivity to be able to use mirroring when Azure SQL allows only private access.

There are several steps involved, but the end result is worth it compared to not having the data at all or needing to make it accessible over the Internet.

Migrating from a Contained Availability Group

Warren Departee undoes a problem:

A client was running a Contained Availability Group in SQL Server 2022, but wasn’t using the AG Listener for their application connections. This negated most of the benefits the Contained AG was designed to provide. They also had some security misunderstandings and missteps, as this was built for them without any real knowledge transfer – one of the reasons they reached out to us for help. After review, it became clear that there was no need for a contained AG here, so we helped them migrate to a Basic Availability Group (SQL Server Availability Group in SQL Server Standard Edition) while preserving their database configurations and minimizing downtime.

Read on for a step-by-step process and a few hints on configuration.

Estimating Overall Fabric Capacity Utilization

Gilbert Quevauvilliers backs into a number:

I was recently working with a customer, and one of their questions was: “We are going to be running an ingestion process, and we want to know how much Fabric capacity it will consume.”

The challenge with this question is that, in Fabric, background capacity usage gets smoothed over 24 hours.

For example, when looking at the Capacity Metrics App I can see my overall usage, but HOW MUCH CAPACITY IS IT CONSUMING?

Read on for the answer.
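
To make the smoothing idea concrete, here is a minimal arithmetic sketch. It is not Gilbert's method, just a simplified model that assumes a background operation's total CU-seconds are spread evenly across 24 hours. The function name and the sample figures are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # background usage is smoothed over 24 hours


def smoothed_utilization(total_cu_seconds: float, capacity_units: float) -> float:
    """Fraction of a capacity used once total_cu_seconds is spread evenly over a day.

    Simplifying assumption: perfectly even smoothing and no other workloads.
    """
    smoothed_cu_per_second = total_cu_seconds / SECONDS_PER_DAY
    return smoothed_cu_per_second / capacity_units


# Hypothetical figures: an ingestion run totalling 1,382,400 CU-seconds
# on an F64 capacity (64 capacity units).
share = smoothed_utilization(1_382_400, 64)
print(f"{share:.1%}")  # 25.0%
```

Under these assumptions the run takes up a quarter of the capacity for the day, regardless of how quickly the job itself finished.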

Concatenation via Double Pipe Operator in SQL Server 2025

Rajendra Gupta shows off a new operator:

SQL Server 2025 adds the double pipe (||) operator for string concatenation. What is the double pipe (||) operator, and how does it differ from the existing plus (+) operator and the CONCAT function for concatenation? Let’s check it out in this article.

I still prefer CONCAT() and CONCAT_WS() for display, and would be indifferent between += and ||= for appending strings. But for companies that need to write ANSI-compliant code, it’s a positive.
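
As a small illustration of the portability angle, the sketch below runs the same `||` syntax against SQLite through Python's built-in `sqlite3` module. This is SQLite, not SQL Server 2025, so it only shows the operator as it already exists in another engine. The function name is hypothetical, and Rajendra's article is the place to check how SQL Server itself handles NULL operands.

```python
import sqlite3


def run_concat_demo() -> tuple:
    """Concatenate strings with || in SQLite, including one NULL operand."""
    conn = sqlite3.connect(":memory:")
    try:
        # In SQLite, || returns NULL when either operand is NULL.
        return conn.execute(
            "SELECT 'Curated' || ' ' || 'SQL', 'Curated' || NULL"
        ).fetchone()
    finally:
        conn.close()


print(run_concat_demo())  # ('Curated SQL', None)
```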

Reducing Row Count Estimation Errors in PostgreSQL

Shinya Kato lays out four approaches:

PostgreSQL’s query planner relies on table statistics to estimate the number of rows (estimated rows) each operation will process, and then selects an optimal execution plan based on these estimates. When the estimated rows diverge significantly from the actual rows, the planner can choose a suboptimal plan, leading to severe query performance degradation.

This article walks through four approaches I used to reduce row count estimation errors, ordered from least to most invasive. Due to confidentiality constraints, I cannot share actual SQL or execution plans, so the focus is on the diagnostic thought process and the techniques applied.

Click through for those thought processes.
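
Since the article shares no SQL, here is a hedged sketch of one way to quantify "diverge significantly": the ratio between the estimated and actual row counts that `EXPLAIN ANALYZE` reports for each plan node. The metric, the function name, and the sample node figures are my own assumptions for illustration, not Shinya's method.

```python
def estimation_error(estimated_rows: float, actual_rows: float) -> float:
    """Factor by which the estimate is off, in either direction (1.0 is perfect).

    Both inputs are clamped to at least 1 so that zero-row nodes do not
    cause a division by zero.
    """
    est = max(estimated_rows, 1.0)
    act = max(actual_rows, 1.0)
    return max(est / act, act / est)


# Hypothetical (node, estimated rows, actual rows) figures from a plan.
nodes = [
    ("Seq Scan on orders", 1000, 1200),
    ("Nested Loop", 10, 50000),
]

# Rank nodes by how badly the planner misjudged them.
worst_first = sorted(nodes, key=lambda n: estimation_error(n[1], n[2]), reverse=True)
for name, est, act in worst_first:
    print(f"{name}: off by a factor of {estimation_error(est, act):g}")
```

Ranking nodes this way points at where the misestimate starts, which is usually the first place to look before reaching for anything more invasive.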

Thoughts on Creating Databases in Contained Availability Groups

Andreas Wolter digs into some updated functionality:

First of all: It’s always encouraging to see the Product team act on user feedback. SQL Server 2025 CU1 introduces an improvement that allows the creation and restoration of databases within contained availability groups (CAG). This is a step in the right direction, but as you’ll see, there are still some bumps to smooth out. Keep the feedback coming (here: Allow creation and restore of databases in contained availability group) — progress is happening, but we’re not quite there yet.

Read on for a litany of issues, as well as Andreas’s recommended solutions.

Creating a Spark Job Definition

Miles Cole builds a job:

A Spark Job Definition is effectively a way to run a packaged Spark application, Fabric’s version of executing a spark-submit job. You define:

  • what code should run (the entry point),
  • which files or resources should be shipped with it,
  • and which command-line arguments should control its behavior.

Unlike a notebook, there is no interactive editor or cell output, but this is arguably not a missing feature; it’s the whole point. An SJD is not meant for exploration; it is meant to deterministically run a Spark application.

With that concept in mind, click through for the process.
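
To picture the "entry point plus command-line arguments" idea, here is a minimal sketch of what such a script could look like. It is not taken from Miles's post: the argument names, the app name, and the Parquet copy logic are all hypothetical, and it assumes PySpark is available wherever the job actually runs.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command-line arguments that control the job (hypothetical names)."""
    parser = argparse.ArgumentParser(description="Hypothetical batch copy job")
    parser.add_argument("--source-path", required=True, help="Where to read from")
    parser.add_argument("--target-path", required=True, help="Where to write to")
    parser.add_argument(
        "--write-mode", default="overwrite", choices=["overwrite", "append"]
    )
    return parser.parse_args(argv)


def main(argv=None) -> None:
    """Entry point: read the source, write the target, stop the session."""
    args = parse_args(argv)

    # Imported inside main so argument parsing can be exercised without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hypothetical-copy-job").getOrCreate()
    df = spark.read.parquet(args.source_path)
    df.write.mode(args.write_mode).parquet(args.target_path)
    spark.stop()


# In the deployed file, finish with the usual guard:
#     if __name__ == "__main__":
#         main()
```

The same arguments in always produce the same run, which is the deterministic behavior the excerpt describes.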

Investigating Full-Text Index Issues in SQL Server

Rich Benner doesn’t consider “remove the index” a valid solution:

The client noticed the D (data) drive was running out of free space and they asked us to investigate. We found that the SQL Logs folder was much larger than we’d expect. A considerable amount of this data was not database files (.mdf, .ldf, .ndf), but rather log files (.log and anything with a .Number file extension).

Read on for a bit of a shaggy dog story, as most IT stories tend to be. You start with one problem (almost out of disk space) and it turns into a cascading series of problems, so that by the end of things, you’re trying to figure out how to diagnose an error message when installing Node 16 on a Windows 7 laptop.

A Set of T-SQL One-Liners

Rebecca Lewis has some quick scripts:

Every DBA has a mental toolbox of go-to queries. Some took years to learn. Some were stumbled upon by chance while working a 2am outage. Today I am sharing 10 of my favorite T-SQL one-liners: the kind of stuff you copy, paste, and immediately feel like a genius. Some are classics, some are new additions, and all of them are useful.

Click through for the list. When I was constantly in SQL Server, I’d have a bunch of these types of queries as keyboard shortcuts.

Generating a Set of Sequential Numbers Redux

Louis Davidson needs even more sequential numbers:

I thought I was done, nice quick little throwaway piece, but I went a little more in depth than I planned. Then Aaron Bertrand messaged me about a post that I had forgotten (even if I did edit it :)), where he was introducing GENERATE_SERIES (GENERATE_SERIES: My new go-to to build sets). In it, he had included a way of doing this that replicates digits and then uses STRING_SPLIT and ROW_NUMBER to generate more digits. He also noted that it was blistering fast.

Aaron (if you know him) is rarely wrong about SQL (at the very least).

I also realized there was one other thing I wanted to add to my tests, that being just selecting from a Numbers/Tally table that has a billion rows. This should be the fastest way to pull a set of numbers.

Read on for one hundred million results. And check out Brent Ozar’s comment on getting things in descending order.
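
For anyone who wants the gist of the replicate-and-split trick without the T-SQL, here is a loose Python analogue. It assumes the simplest form of the idea (replicate a delimiter, split the string, number the pieces). The exact query is in Aaron's post, and the function name here is hypothetical.

```python
def numbers_via_split(n: int) -> list[int]:
    """Return 1..n by splitting a replicated delimiter and numbering the pieces."""
    if n < 1:
        return []
    delimiters = "," * (n - 1)       # plays the role of REPLICATE(',', n - 1)
    pieces = delimiters.split(",")   # plays the role of STRING_SPLIT: n empty strings
    # enumerate plays the role of ROW_NUMBER() over the split rows.
    return [i for i, _ in enumerate(pieces, start=1)]


print(numbers_via_split(5))  # [1, 2, 3, 4, 5]
```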
