It's just Emanuele showing you what you can do in the Microsoft Data Platform


You can now access the SQL Server Diagnostic Book remotely!


Good news, everyone: thanks to the Azure Data Studio August 2020 update, you can now access the SQL Server Diagnostic Book without having to download it from my GitHub. Remote Jupyter Books have been added to ADS, so now you just need to click “Add Remote Jupyter Book” in the Notebooks blade and insert my GitHub repository URL, like so: Be sure to click the “Search”...

How to lose hundreds of thousands of dollars by using functions in SQL Server


Ahh, functions, the greatest tool at a programmer’s disposal: they make code reusable and easy to read, and they’ve been essential since the first function call was made in the last century… Unfortunately for the developers out there, when working with SQL Server, especially when tuning for performance, you need to take everything you knew about programming and throw it out of the...

A Self-deployable TICK Stack for ingesting data, monitoring and alerting for any service (including SQL Server)


Oh boy, this is a spinoff of my previous post, “How To Use Grafana (On Docker) To Monitor Your Sql Server (Eventually On Docker Too) – Feat. Influxdb And Telegraf”, which is a nice solution, but I wanted to create something that’s even easier to deploy, more configurable, and that doesn’t require actually knowing InfluxQL or learning the InfluxDB Telegraf schema; for these...

PROTIP: Performance Tuning in the cloud will save you money by the hour


Note: I'll be focusing on Azure Cloud and SQL Server, but these considerations are valid for any Cloud/DB Vendor. Whenever people talk about “The Cloud” I often hear “cloud is expensive” and, sooner or later, “if performance isn’t good we can scale up in minutes later”. The mindset of taking care of performance issues by scaling up/down cloud resources...

Get faster performance and lower network usage in SQL Server Loops by avoiding the “DONE Token” overload


FYI: You can get the Notebook for this article on my GitHub and experiment yourself (opens with Azure Data Studio). Preamble: Everybody knows that using loops in SQL Server is not efficient; if you’re able to write that same logic in a set-based statement, it’s guaranteed to be faster. Still, devs can’t be helped, you just can’t seem to nail down the set-equivalent statement...

Row Level Security Performance Implications


A conversation had me wondering about the performance implications of the SQL Server RLS feature, specifically how to mitigate the performance impact. TL;DR: Since security functions are actually Inline Table-Valued Functions (ITVF), if you write them in a decent way the queries will run in the most efficient way possible, avoiding RBAR processing. Let’s set the scene; I’m starting...

A script to automatically align columnstore indexes to enhance segment elimination (and hence performance)


Columnstore indexes are a “new” neat data structure that I like; even if technically they’ve been around for years, only recently have they become usable by most customers. Let me recap a little bit what we’re talking about, so the point of this will be clearer: the table is divided into rowgroups of about one million rows max, then each column is stored by itself in a...

A clarification on the waitstat: SOS_SCHEDULER_YIELD


Are you one of those people who say “we have a CPU related issue” whenever they see SOS_SCHEDULER_YIELD popping up? Let me explain why you’re wrong. In the Books Online, SOS_SCHEDULER_YIELD is defined simply as: “Occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.” Well, since the SQL...

What’s the best way to massively update a big table?


A thing that can happen once in a while in a DW is the need to massively update a column in a table; let’s find out the best way to do it. I’ll be using the same table as in the last post, the lineitem table of the TPC-H test by HammerDB. The three contenders are: Heap, Clustered Index, and Clustered Columnstore Index. No other nonclustered index will be included in the base table. You should already know that each...

Test: comparing various methods of bulk loading data from one table to another, what’s the fastest?


Most of the support requests I get involved with can be summarized with the following keywords: “slow” + “datawarehouse” + “ETL” + “Save us”. What about thinking about ETL performance before it goes bad? Before the system has been in production for some time, the data reaches a decent size which wasn’t tested in DEV, and you find out that the...


Emanuele Meazzo
