If you read my previous article on how to configure a Linked server to Oracle , you know that I feel like someone is plotting to keep the topic of how to get our precious data outside of the Oracle ecosystem as obscure as possible out of the oracle circle
Fear not! I’m here to get you all the info in order to get data from Oracle Database via Powershell, in a native high-performance way, allowing you to create a multithreaded, reliable and connected ETL flow to feed a data warehouse from Oracle data, like the owner of this blog, or just move quickly some data from one place to another, programmatically, whatever it’s your need.
Disclaimer: I'll be using and mentioning PowerShell 7 (Core) in this guide, as it's mature enough for my purposes (and foreach -parallel is too convenient for my laziness), if you're using the legacy Powershell just get the non-core components
The magic component for this task is exploiting the Oracle Data Provider for .NET; as our .NET developer friends can use this components in their apps, with Powershell we can tap into the same power.
There are two ways to get the goods: If you’re a developer, familiar with nuget, go get Oracle.ManagedDataAccess.Core and then you’ll reference the DLL from the location you’ve installed it.
If you don’t have nuget and/or you just want to get read the damn data from Oracle using Powershell already:
- Get to this page to download the Oracle Managed Data Access package
- Click on “Download Package” on the right, a file with the .nupkg extension will be downloaded
- The “package” is really just a zip file, open it with any archive reader (like 7Zip, Winrar or rename the extension to *.zip, whatever)
- Go to the lib\netstandard2.1\ folder and get the precious Oracle.ManagedDataAccess.dll file, that’s all we need
Now that you have the magic DLL, put it in the same folder as the script you’re going to write, and follow this example:
#Parameters $OracleDLLPath = ".\Oracle.ManagedDataAccess.dll" #The oracle DataSource as you would compile it in TNSNAMES.ORA $datasource = " (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP) (HOST = YOUR_ORACLE_SERVER)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = YOUR_SERVICE) (FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY = 5))))" $username = "youroracleuser" $password = "youroraclepass" $queryStatment = "SELECT * FROM MYSCHEMA.MYTABLE" #Be careful not to terminate it with a semicolon, it doesn't like it #Actual Code #Load Required Types and modules Add-Type -Path $OracleDLLPath Import-Module SqlServer #Create the connection string $connectionstring = 'User Id=' + $username + ';Password=' + $password + ';Data Source=' + $datasource #Create the connection object $con = New-Object Oracle.ManagedDataAccess.Client.OracleConnection($connectionstring) #Create a command and configure it $cmd = $con.CreateCommand() $cmd.CommandText = $queryStatment $cmd.CommandTimeout = 3600 #Seconds $cmd.FetchSize = 10000000 #10MB #Creates a data adapter for the command $da = New-Object Oracle.ManagedDataAccess.Client.OracleDataAdapter($cmd); #The Data adapter will fill this DataTable $resultSet = New-Object System.Data.DataTable #Only here the query is sent and executed in Oracle [void]$da.fill($resultSet) #Close the connection $con.Close() #Now you can do anything with the data inside of $resultset, we're uploading it to SQL Server just because , $resultSet | Write-DbaDbTableData -SqlInstance $destinationInstance -Database $destinationDB -Schema $destinationSchema -Table $destinationTable -Truncate -AutoCreateTable -BatchSize 10000 -EnableException
Let me explain what’s going on, in case that comments aren’t enough:
- Line 1 to 20
- It’s just a declaration of parameters, first one the Path for our precious Oracle dll, which in this case it’s in the same folder as the powershell script
- The datasource has the same format as the classic TNSNAMES.ORA files, so you can literally copy it from there (or ask your Oracle DBA, if you’re not impersonating both, to pass this info to you)
- For some reason, when select statements are terminated by a semicolon, the query won’t execute; practically the opposite of the standard Oracle behavior, of course
- Line 21 to 29
- I’m loading up the Oracle DLL and importing the SQLServer module, just because in this example I’ll be taking the fetched data and dump it into a SQL Server table, just because
- At this point I’m building the actual oracle connection string , with username and password too
- Line 30 to 42
- I’m finally doing something concrete, by actually creating an Oracle connection object to use to run our query
- A command object is needed to ask the connection to do stuff, you cannot rush these things; in the example it’s shown how to assign the query statement and how to configure a couple of properties, for more follow the docs
- For a clean and straightforward experience, I’m using an OracleDataAdapter object; it’s very neat as it incapsulates all the bothersome tasks of fetching rows in a simple action that executes the command just configured
- Line 43 to 49
- Calling the Fill Method on the Data Adapter object gets all the rows that we’re fetching from Oracle DB into the Result Set object declared just above
- Once that the data is in the DataTable object, it’s easily manageable in Powershell as it is a native data type and plenty of examples all over the internet exists on how to manage this object
- Line 50 & 51
- Just for the sake of having an ending, I’m using the Write-DbaDbTableData cmdled from the DBATOOLS Powershell module to get the data from the resultset into a SQL Server instance
- I’m piping the data directly from the resultset to the cmdlet, not that I put a comma in front of the DataTable object; that’s because I want to bulk load the rows into the table, and the comma basically says “pass the whole object through the pipe”, otherwise the pipe would pass each row of the DataTable one by one, and we know that’s not really efficient.
- Other than setting the -Truncate and -AutoCreateTable flags, to create a table for the data if it doesn’t exist and truncate the table if it already exists, the -EnableException flag is pretty relevant, because by default any failure in writing the data to SQL Server will raise an error, but not an exception, so without this flag your script will just go on (and won’t even fall in a Try-Catch statement) and just display the error
This is the easiest and cleanest way to query Oracle Database from Powershell, as you don’t need to install Oracle Client, configure any ODBC, TNS Files or all the other messy stuff usually needed, and it’s fast too, as we’re dealing with native code here.
You have to be careful when dealing with huge datasets, as using the Data Adapter to fill the ResultSet as straightforward as in the example of course will lead to having all of that data in memory, which could cause issues if you didn’t realize this; so you may need to batch the data download/upload and/or use some other class other than OracleDataAdapter, but I trust that if you have this much data to move around, you’ll be able to find a way.