Curated SQL Posts

Creating a Spark Job Definition

Miles Cole builds a job:

A Spark Job Definition is effectively a way to run a packaged Spark application, Fabric’s version of executing a spark-submit job. You define:

  • what code should run (the entry point),
  • which files or resources should be shipped with it,
  • and which command-line arguments should control its behavior.

Unlike a notebook, there is no interactive editor or cell output, but this is arguably not a missing feature; it’s the whole point. An SJD is not meant for exploration; it is meant to deterministically run a Spark application.
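To make the "entry point plus command-line arguments" idea concrete, here is a minimal sketch of what a main file for such a job might look like. It is not taken from Miles's post: the argument names (`--input-path`, `--output-path`, `--run-date`), the `run_date` column, and the app name are all hypothetical, and the Spark import sits inside `main` so that the argument handling can be exercised without a cluster.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line arguments the job definition would pass in.

    All argument names here are hypothetical, chosen for illustration.
    """
    parser = argparse.ArgumentParser(description="Hypothetical batch Spark job")
    parser.add_argument("--input-path", required=True, help="Where to read Parquet data from")
    parser.add_argument("--output-path", required=True, help="Where to write the result")
    parser.add_argument("--run-date", default=None, help="Optional filter value")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)

    # Imported here so parse_args can be tested on a machine without PySpark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-job").getOrCreate()

    df = spark.read.parquet(args.input_path)
    if args.run_date is not None:
        # Assumes the data has a 'run_date' column (hypothetical).
        df = df.filter(df["run_date"] == args.run_date)

    df.write.mode("overwrite").parquet(args.output_path)
    spark.stop()

# A real main file would end with the usual guard:
#     if __name__ == "__main__":
#         main()
```

Because the behavior is driven entirely by arguments, the same file can be run with different inputs without editing the code, which is the deterministic, non-interactive style the excerpt describes.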

With that concept in mind, click through for the process.

Investigating Full-Text Index Issues in SQL Server

Rich Benner doesn’t consider “remove the index” a valid solution:

The client noticed the D (data) drive was running out of free space and they asked us to investigate. We found that the SQL Logs folder was much larger than we’d expect. A considerable amount of this data was not database files (.mdf, .ldf, .ndf), but rather log files (.log and anything with a .Number file extension):

Read on for a bit of a shaggy dog story, as most IT stories tend to be. You start with one problem (almost out of disk space) and it turns into a cascading series of problems, so that by the end of things, you’re trying to figure out how to diagnose an error message when installing Node 16 on a Windows 7 laptop.

A Set of T-SQL One-Liners

Rebecca Lewis has some quick scripts:

Every DBA has a mental toolbox of go-to queries. Some took years to learn. Some were stumbled upon by chance while working a 2am outage. Today I am sharing 10 of my favorite T-SQL one-liners: the kind of stuff you copy, paste, and immediately feel like a genius. Some are classics, some are new additions, and all of them are useful.

Click through for the list. When I was constantly in SQL Server, I’d have a bunch of these types of queries as keyboard shortcuts.

Generating a Set of Sequential Numbers Redux

Louis Davidson needs even more sequential numbers:

I thought I was done, nice quick little throwaway piece, but I went a little more in depth than I planned. Then Aaron Bertrand messaged me about a post that I had forgotten (even if I did edit it :)), where he was introducing GENERATE_SERIES (GENERATE_SERIES: My new go-to to build sets). In it, he had included a way of doing this that replicates digits and then uses STRING_SPLIT and ROW_NUMBER to generate more digits. He also noted that it was blistering fast.

Aaron (if you know him) is rarely wrong about SQL (at the very least).

I also realized there was one other thing I wanted to add to my tests, that being just selecting from a Numbers/Tally table that has a billion rows. This should be the fastest way to pull a set of numbers.

Read on for one hundred million results. And check out Brent Ozar’s comment on getting things in descending order.

Install and Configure SSIS 2025

Koen Verbeeck performs an installation:

We’re an on-premises shop running all our services on our own machines. We’re planning to migrate to SQL Server 2025 and there are some older SSIS projects we might need. Is SQL Server Integration Services still supported in SQL Server 2025? If yes, how can we install it on our server? Do we still need Visual Studio to develop projects and packages, and how can we convert the older projects?

Granted, SSIS hasn’t exactly changed a lot with SQL Server 2025, but it’s there for you.

The Troubles of Documentation: Microsoft Fabric API Edition

Rob Sewell walks through a recent experience:

Firstly, an apology to my friends (especially Randolph) in the documentation team at Microsoft. I know how hard you work to produce accurate and useful documentation, and I appreciate your efforts. This is not a criticism of your work, but rather an observation about the challenges I faced.

This is a story about a recent experience and the lessons learned along the way.

Read on for the issue and what Rob had to do. This is a case study in how hard it is to write good documentation, especially around the edges of what is possible.

Combining Fabric Real-Time Intelligence, Notebooks, and Spark Structured Streaming

Arindam Chatterjee and QiXiao Wang show off some preview functionality:

Building event-driven, real-time applications using Fabric Eventstreams and Spark Notebooks just got a whole lot easier. With the Preview of Spark Notebooks and Real-Time Intelligence integration — a new capability that brings together the open-source community supported richness of Spark Structured Streaming with the real-time stream processing power of Fabric Eventstreams — developers can now build low-latency, end-to-end real-time analytics and AI pipelines all within Microsoft Fabric.

You can now seamlessly access streaming data from Eventstreams directly inside Spark notebooks, enabling real-time insights and decision-making without the complexity & tediousness of manual coding and configuration.

Click through to learn more.

Building a Better .gitconfig

Colin Gillespie digs in:

Getting started with Git is easy (ha!), but once you’ve mastered the basics, it’s natural for developers to start thinking about customising their git process. Most Git settings live in the .gitconfig file. In this blog post, I’ll discuss what you should consider setting in your config file to make a more efficient development environment.

There are some interesting settings that I hadn’t heard of, but I could see making sense.

More Spark Jobs, Fewer Notebooks

Miles Cole lays out an argument:

I’m guilty. I’ve peddled the #NotebookEverything tagline more than a few times.

To be fair, notebooks are an amazing entry point to coding, documentation, and exploration. But this post is dedicated to convincing you that notebooks are not, in fact, everything, and that many production Spark workloads would be better executed as a non-interactive Spark Job.

Miles has a “controversial claim” at the end that I don’t think is particularly controversial at all. I agree with pretty much the entire article, especially around the difficulties of testing notebooks properly.

SSMS Updates and Code Completions

Brent Ozar wants an update:

A long time ago in a galaxy far, far away, SQL Server Management Studio was included as part of the SQL Server installer.

Back then, upgrading SSMS was not only a technical problem, but a political one too. Organizations would say things like, “Sorry, we haven’t certified that cool new SQL Server 1982 here yet, so you can’t have access to the installer.” Developers and DBAs were forced to run SSMS from whatever ancient legacy version of SQL Server that their company had certified.

Working in a controlled industry, I still get to hear that answer.
