It's just Emanuele showing you what you can do in the Microsoft Data Platform

A SQL script to automatically infer the data types of data that someone just dumped in your DB with no effort

A

Do you know the feeling when someone gives you a flat file to import, without any schema information? You have to dump it in your database as-is, then spend precious time figuring out the data type for each column ๐Ÿ˜ฅ
Let me help you with that, as I’m helping myself too.

Use case:

  • Someone sends you a flat file (csv), of course they don’t give you ANY information about the data types contained in it
  • You dump it as-is in your database staging area:
    • When using SSMS Flat file importer it can mostly do this job for you, but not 100%
    • But if you’re like me and get files from a Data Lake and import it straight away with Azure Data Factory with the “just auto create the table and I’ll deal with it later because I can’t bother right now please” flag on you end up with a table with the right column names but all nvarchar(MAX) columns because the import engine is as lazy as me
  • At the end of the day you have this table with sub-optimal data types, and here is where this dumb script can help you

Using the power of trial and error and dynamic SQL, I’ve come up with this script which basic idea is simply doing a TRY_CONVERT in various data types (most common defined, you can add more and/or different sizes easily) and checking if the engine was able to convert all of them: it comes up with suggestions about compatible file types for each column, the min/max length for any varchar value and a useful raw data sample too ๐Ÿ˜‰

Here is a sample output:

Of course this is a first draft that can be expanded upon, but that’s what open source is all about, you can do it yourself ๐Ÿ‘€

Find the full script on my GitHub: tsql.tech-Code-snippets/Schema-Related/data column type checker+suggester.sql at master ยท EmanueleMeazzo/tsql.tech-Code-snippets (github.com)

You’re welcome.

About the author

Emanuele Meazzo

Add comment

It's just Emanuele showing you what you can do in the Microsoft Data Platform

Emanuele Meazzo

My Social Profiles

If you're interested in following me around in the social world, here are some URLs for you

Find Us

Address
123 Main Street
London EC1 4UK

Hours
Monday—Friday: 9:00AM–5:00PM
Saturday & Sunday: 11:00AM–3:00PM

About This Site

This may be a good place to introduce yourself and your site or include some credits.