An adventure on SQL Server performance and features


A script to automatically align columnstore indexes to enhance segment elimination (and hence performances)


Columstore indexes are a “new” neat data structure that I like, even if technically they’ve been around for years, only recently they’ve become usable by most customers. Let me recap a little bit what we’re talking about, so the point of this will be clearer: The table is divided in rowgroups of about one million rows max, then each column is stored by itself in a...

A clarification on the waitstat: SOS_SCHEDULER_YIELD


Are you one of the people that says “we have a CPU related issue” whenever it sees SOS_SCHEDULER_YIELD popping up? Let me explain why you’re wrong. In the books online, SOS_SCHEDULER_YIELD is defined simply as: Occurs when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed. Well, since the SQL...

What’s the best way to massively update a big table?


A thing that can happen once in a while in a DW is the need to massively update a column in a table, let’s find out. I’ll be using the same table as last post , the lineitem table of the tpch test by HammerDB, the three contenders are: Heap Clustered Index Clustered Columnstore Index No other NonClustered index will be included in the base table. You should already know that each...

Test: confronting various methods of bulk loading data from a table to another, what’s the fastest?


 Most of the support request I get involved with can be summarized with the following keywords: “slow” + “datawarehouse” + “ETL” + “Save us” What about thinking about ETL performance before it goes bad? Before the system is in production for some time , the data size reaches a decent size which wasn’t tested in DEV and you find out that the...

Scan Performance Showdown: INROW Predicate pushdown VS Batch Mode Adaptive Join vs Columnstore indexes


I was reading Paul White’s post about bitmaps in SQL Server and, other than realizing that I basically know nothing about anything (like a Socrate’s epiphany) I wanted to at least see how this good stuff added up in the latest 7 years. Down with experimenting, we’re doing this LIVE! (Well, it’s live for me, for you it’s still a blog post) I’ll start by creating...

Create an asynchronous multithreaded workflow on your SQL Server using the Service Broker


As any other *Server, SQL Server is born with concurrency in mind, a lot of users/applications can connect concurrently and do their job, as designed, perfect, but, what if you’re the only user? What if you’re in some kind of data warehouse configuration where you have to ETL, or you are in a maintenance window and have to do a bunch of stuff in that time, or simply you need to launch...

How to avoid the traps when dealing with parallelism


I work with a vendor application which needs to do all sorts of data crunching in some nightly batches, and, depending on the size of data, this process may take a lot of time. Some time a go a colleague said: “we need more parallelism, we want to parallelize all the things! What can you think may go wrong?” This opened a stream of consciousness that could be really handled only by...

An adventure on SQL Server performance and features

Emanuele Meazzo

My Social Profiles

If you're interested in following me around in the social world, here are some URLs for you