(scottish) sql bob blog

Some thoughts of your typical data shepherd / data groomer / data dance teacher sort of person.


Import sessionize.com JSON data into SQL Server - Part 1
TL;DR: in the next few blog posts I will be showing how to:

  • Export JSON data from a sessionize.com API URL
  • Insert the JSON data into a SQL Server 2016 database
  • Generate a dataset from the imported JSON data
  • Display the data on an SSRS report
I love Azure Data Studio, and the addition of the notebook feature makes it just soooo much more wonderful.  My first introduction to notebooks was Jupyter notebooks, when I did some courses on Python that required the coursework to be submitted as notebooks. So when the Azure Data Studio team announced that they would support notebooks I was very excited. Even better, the notebooks can support SQL, Python, and several other languages, so this was a great opportunity for me to look at using Python and SQL together.

During one of the discussions about organising Data Scotland it was suggested that we create some cards with the session details. Each card would include the speaker's photograph, session title, room name and Twitter handle.  The required information is already stored on the sessionize.com website.

So what am I going to show in the following blog posts?  

  • Call the sessionize API and grab the data returned as a JSON string
  • Place the data into SQL Server, using some TSQL code with a bit of Python magic
  • Query the JSON data in SQL Server using TSQL
  • Return a recordset from the JSON data that can be read by SSRS Report Builder

So that’s what I am going to show, next let’s look at what is required to do it.

Ingredients

  • SQL Server 2016 database instance
  • SQL Server 2016 database (compatibility level 130) running on SQL Server 2016
  • Azure Data Studio with Python installed

First download and install Azure Data Studio; you can download the program from here.

Once you have installed Azure Data Studio, open the application.  Find ‘File’ in the menu bar and click it, then select ‘New Notebook’ from the menu (see Figure 1 below).

Figure 1 – File menu showing where to find the ‘New Notebook’ menu item. 


This will open a new notebook (yippee!!). This might not sound very exciting yet, however it is!  When a new notebook opens, the kernel must be set.  The way that I think about this is that it sets the language which will be run in the notebook; it defaults to SQL.  What we want to run is Python v3.  From the list of kernels available select ‘Python 3’; this sets the language that will be run in the notebook.

Figure 2 – selecting the Kernel (programming language) that will be run in the notebook.

Once ‘Python 3’ has been selected, if Python is not already installed and set up, Azure Data Studio will prompt you to set up and configure Python for Notebooks.  A screen will open, as we can see in Figure 3.  For this blog post I accepted the default location and clicked on the ‘Install’ button.

Figure 3 – Install and configure python for use in Azure Data Studio

If everything has gone to plan, then you should see something that looks like Figure 4.

Figure 4 – installation of Python going as planned

Installing Python can take some time, so it might be a good idea to get a hot beverage or do something else until it has finished installing.

Figure 5 – Installation of python is now completed successfully

In sessionize.com it is possible to create different API endpoints to output data; in this example the data is output as JSON.  It is possible to select different parts of the data to be output; in this example ‘All Data’ is selected.  Selecting the data in sessionize.com is beyond the scope of this blog post, though it is very easy to do.

The last step is to get the URL to be called in the code; this can be seen in Figure 6 below.

Figure 6 - API / Embed screen in sessionize.com for Data Scotland 2019.

Figure 6a (yes, I forgot to include this till a later edit) shows the columns that are output from sessionize.com for the API endpoint used.


Figure 6a - Settings for Available API endpoint used in this blog post.

OK, enough setting up, let's write some code.  To get access to other libraries in Python, the command that is used is import <library name>.  In this example there are four libraries which are imported.  If you run the code shown in Figure 7 you might get the error message shown.

Figure 7 – Error message if the package for the library being imported is not installed.

If you do see this error message then all you need to do is install the required package.  At the top left-hand side of Figure 7 there is a button titled ‘Install Packages’.  Click on that button and the terminal window will open (see Figure 8).  The command that installs the ‘pyodbc’ library is ‘.\python.exe -m pip install pyodbc’; type the command into the terminal window and press enter.

Figure 8 – Entering the command to install the ‘pyodbc’ package in the terminal window.
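
Before reaching for the terminal, it can help to check from a notebook cell which packages the kernel can actually see. The snippet below is a small sketch using only the standard library; the helper name missing_packages is made up for this illustration and is not part of Azure Data Studio.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that this Python kernel cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'json' ships with Python; 'pyodbc' has to be installed with pip.
print(missing_packages(["json", "pyodbc"]))
```

An empty list means every package was found; anything listed still needs to be installed with the pip command described above.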

Hopefully the ‘pyodbc’ package will install without any challenges.  If, like me, you are not so lucky and you get the error message shown in Figure 9, it is quite easy to fix.

Figure 9 – Error message stating that PIP (Pip Installs Packages) needs to be upgraded to install the ‘pyodbc’ package

If you get the error message shown in Figure 9 then enter the following command at the prompt: ‘.\python.exe -m pip install --upgrade pip’.  If everything goes well you will see a message like the one shown in Figure 10.

Figure 10 – Successfully upgraded PIP to v 18.

Once the new version of PIP has been installed, restart Azure Data Studio.  Then open a notebook, select Python 3 as the kernel language, click on ‘Install Packages’ and install the ‘pyodbc’ library (see Figure 8).  Once ‘pyodbc’ has been installed, it is time to run the Python script.

The Python script will do the following:

1 - call the API and get the JSON that is returned; this is loaded into a dict object, which is then cast to a string object

2 - open a connection to a SQL Server database and run a SQL script to create the table if it does not exist

3 - insert the JSON string into a field in the table

Below is the Python script that is used.  Much of the credit must go to the various websites which I have added references to in the script.  The script can be seen in Figure 11.  All that needs to be changed is the URL for the sessionize.com API and the user credentials in the connection string.  Otherwise this is the script that I used.

Figure 11 - Python script in Azure Data Studio Notebook to import Json in SQL server 2016

The Azure Data Studio Notebook that is shown in Figure 11 can be downloaded from here.
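
For readers who cannot make out the screenshot, the three steps can be sketched roughly as below. This is not the script from Figure 11, just a minimal outline under some assumptions: the URL, connection string, table name (dbo.SessionizeImport) and column names are placeholders, urllib from the standard library is used for the HTTP call, and json.dumps is used to produce the string.

```python
import json
import urllib.request

# Hypothetical placeholders: swap in your own endpoint and connection details.
API_URL = "https://sessionize.com/api/v2/<your-endpoint-id>/view/All"
CONNECTION_STRING = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=SessionizeDemo;UID=<user>;PWD=<password>"
)

# Table and column names are made up for this sketch.
CREATE_TABLE_SQL = """
IF OBJECT_ID(N'dbo.SessionizeImport', N'U') IS NULL
    CREATE TABLE dbo.SessionizeImport (
        Id         INT IDENTITY(1,1) PRIMARY KEY,
        ImportedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(),
        JsonData   NVARCHAR(MAX) NOT NULL
    );
"""
INSERT_SQL = "INSERT INTO dbo.SessionizeImport (JsonData) VALUES (?);"


def fetch_json(url):
    """Step 1a: call the API and parse the response into Python objects."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def to_json_string(data):
    """Step 1b: turn the parsed data back into a single JSON string."""
    return json.dumps(data, ensure_ascii=False)


def store_json(connection, json_text):
    """Steps 2 and 3: create the table if needed, then insert the string."""
    cursor = connection.cursor()
    cursor.execute(CREATE_TABLE_SQL)
    cursor.execute(INSERT_SQL, json_text)
    connection.commit()


def run_import():
    """Tie the steps together against a real SQL Server instance."""
    import pyodbc  # imported here so the helpers above work without it

    connection = pyodbc.connect(CONNECTION_STRING)
    try:
        store_json(connection, to_json_string(fetch_json(API_URL)))
    finally:
        connection.close()
```

Calling run_import() in a notebook cell performs the whole import. Keeping the JSON as one NVARCHAR(MAX) value leaves the shredding to TSQL on the SQL Server side.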

In the next blog post we will look at how to work with the JSON data in SQL Server.


How to find SQL Server Instance IP address
Quite often I need to get some information about a SQL Server instance; this morning it was the IP address of an instance.  Having googled for this more times than I care to remember, this time I thought I would put the information in a blog post.


Method 1 – TSQL 
SELECT   ConnectionProperty('net_transport')          AS 'net_transport'
      , ConnectionProperty('protocol_type')          AS 'protocol_type'
      , ConnectionProperty('auth_scheme')            AS 'auth_scheme'
      , ConnectionProperty('local_net_address')      AS 'local_net_address'
      , ConnectionProperty('local_tcp_port')          AS 'local_tcp_port'
      , ConnectionProperty('client_net_address')      AS 'client_net_address'
      , ConnectionProperty('physical_net_transport') AS 'physical_net_transport'

Local_net_address was the field that I required.

This post was inspired by - https://www.sqlservercentral.com/forums/reply/1519373
Further information on ConnectionProperty can be found here.
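
If you are already in a Python notebook, the same query can be run over an open connection. The sketch below assumes a DB-API style connection such as one returned by pyodbc.connect; the function name connection_info is made up for this example. The values are cast to VARCHAR because ConnectionProperty returns sql_variant, which, as far as I know, ODBC-based drivers such as pyodbc cannot read directly.

```python
# Values are cast to VARCHAR because ConnectionProperty returns sql_variant.
CONNECTION_INFO_SQL = """
SELECT CAST(ConnectionProperty('net_transport')      AS VARCHAR(40)) AS net_transport
     , CAST(ConnectionProperty('local_net_address')  AS VARCHAR(48)) AS local_net_address
     , CAST(ConnectionProperty('local_tcp_port')     AS VARCHAR(10)) AS local_tcp_port
     , CAST(ConnectionProperty('client_net_address') AS VARCHAR(48)) AS client_net_address;
"""


def connection_info(connection):
    """Run the query on an open connection and return one dict of properties."""
    cursor = connection.cursor()
    cursor.execute(CONNECTION_INFO_SQL)
    row = cursor.fetchone()
    # cursor.description holds one tuple per column; item 0 is the column name.
    columns = [column[0] for column in cursor.description]
    return dict(zip(columns, row))
```

With a live connection, connection_info(connection)["local_net_address"] gives the instance IP address.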

Method 2 – dbatools

Test-DbaConnection <InstanceName>  see -> https://docs.dbatools.io/#Test-DbaConnection

The output includes a property, IPAddress, which is what I wanted.  If you have not looked at dbatools then you should; it is an amazing open source project and a genuine time saver.

If there are any other methods which I have not thought of, please let me know.


Monitoring replication status using Nagios (using PowerShell script and TSQL)

For various reasons which I have now forgotten, I set up transactional replication for some clients.  The result of this is that I am the caretaker of transactional replication for two of our clients, what a lucky person I am!

T-SQL code used to check transactional replication
Once I got the process up and running (a very, very long and stressful story), I realised that I would have to monitor these processes.  Following some googling, an article with some TSQL written by SQLSoldier was found here.  This worked for me, and I used it to monitor replication by running the script and checking the results manually.
 
Nagios output 
Our SysAdmin uses a tool called Nagios to monitor our IT estate.  So they suggested that a script could be written to monitor the replication process and send an alert if anything needed to be looked at.  This sounded like an excellent idea, but how to do it?  The approach that was arrived at involved using a PowerShell script which would run a SQL query, examine the results, then respond to Nagios with one of the following values:

0 - all OK, nothing to see here
1 - something happening, nothing to worry about (just now)
2 - yes, there is something that really needs some attention

See this page for more guidance.
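
To make the convention concrete, here is a small Python sketch of how a check could hand one of these values back to Nagios. The function names are made up for this example. Treating anything unexpected as 3 (UNKNOWN) follows the usual Nagios plugin convention and is not something the script in this post does, and the 80 character limit is the one described later in this post.

```python
import sys

# 0, 1 and 2 are the values listed above; 3 (UNKNOWN) is the usual Nagios
# fallback and is an addition made for this sketch.
VALID_CODES = (0, 1, 2, 3)
MAX_MESSAGE_LENGTH = 80


def nagios_result(code, message):
    """Return (exit_code, text) ready to be handed to Nagios."""
    try:
        exit_code = int(code)
    except (TypeError, ValueError):
        exit_code = 3
    if exit_code not in VALID_CODES:
        exit_code = 3
    return exit_code, str(message)[:MAX_MESSAGE_LENGTH]


def report(code, message):
    """Print the message and end the process with the matching exit code."""
    exit_code, text = nagios_result(code, message)
    print(text)
    sys.exit(exit_code)
```

For example, report("0", "Replication OK") prints the message and exits with code 0, which Nagios shows as OK.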

Stored procedure 

Next we decided to use PowerShell to return the results.  Following some googling we found this page, http://www.madeiradata.com/cross-server-replication-health-check-using-powershell/, which runs a SQL script and returns a data set.  The first challenge was that the TSQL script from SQLSoldier was rather long for my first PowerShell script; I wanted something smaller.  So I created a stored procedure based on the script and placed it in the distribution database on the replicated database server.  There were two reasons for doing this: first, less TSQL in the PowerShell script; second, changing one of the parameters controls how much data is returned.  The stored procedure takes the following parameters:

@Publisher  - The name of the publisher database server instance 
@PublisherDB - The name of the publisher database 
@NagiosOutput - Y = only output the return code and a short error message (max of 80 characters), N = output all results

The following script was used to check the results that would be returned to Nagios. 
USE [distribution]; 
DECLARE @Publisher AS sysname 
DECLARE @PublisherDB AS sysname 
DECLARE @NagiosOutput AS Char(1) 
SET @Publisher = 'ReplPublisherServerName' 
SET @PublisherDB = 'ReplPublisherDBName' 
SET @NagiosOutput = 'Y' 
EXEC [dbo].[Check_Replication] @Publisher, @PublisherDB, @NagiosOutput; 

Note that the account connecting to the database from Nagios will require execute permission on the stored procedure, otherwise it cannot run it.  The code for the stored procedure is here.

PowerShell script
The PowerShell script found here was adapted to run the stored procedure.  When the PowerShell script was run by Nagios there was no 'Return Code' returned (this is what Nagios expects).  We found the solution on this page, inserted the function ExitWithCode, and made a few other changes.  The resulting PowerShell script is below -

## Beginning of Monitor 
##Connection String With Server Variable, Distribution Database name is 'Distribution' 
$con = "server=127.0.0.1;database=Distribution;Integrated Security=sspi" 
 
##Begin SQL Query 
$cmd = "SET NOCOUNT ON; " 
$cmd = $cmd + " USE [distribution];" 
$cmd = $cmd + " DECLARE @Publisher  AS sysname" 
$cmd = $cmd + " DECLARE @PublisherDB AS sysname" 
$cmd = $cmd + " DECLARE @NagiosOutput AS Char(1)" 
$cmd = $cmd + " SET @Publisher = 'ReplPublisherServerName' " 
$cmd = $cmd + " SET @PublisherDB = 'ReplPublisherDBName' " 
$cmd = $cmd + " SET @NagiosOutput = 'Y'" 
$cmd = $cmd + " EXEC [dbo].[Check_Replication] @Publisher, @PublisherDB, @NagiosOutput;" 
 
##Creating DataSet Object 
$set = new-object system.data.dataset 
##Running Query 
$da = new-object System.Data.SqlClient.SqlDataAdapter ($cmd, $con) 
##Filling DataSet With Results 
$da.fill($set) | out-null 
##Creating Table Object and Inserting DataSet 
$dt = new-object System.Data.DataTable 
$dt = $set.Tables[0] 
 
## Loop over each row in the first table of the DataSet 
foreach ($row in $set.Tables[0].Rows) 
{ 
    ## Write out the 2nd column, which contains the message text 
    write-host $row[1].ToString() 

    ## The 1st column contains the return code that Nagios expects 
    $exitcode = $row[0].ToString() 
} 
 
## The 'exit code' fragment below was adapted from: 
## http://weblogs.asp.net/soever/returning-an-exit-code-from-a-powershell-script 
## SysAdmin, 2015-Nov 
function ExitWithCode  
{  
    param  
    (  
        $exitcode  
    ) 
    $host.SetShouldExit($exitcode)  
    exit  
} 

## Hand the return code back to Nagios 
ExitWithCode $exitcode 
 
The output from the stored procedure when @NagiosOutput is set to 'Y' is very short.  Note that Nagios only permits a maximum of 80 characters to be returned.  Sample output from running the TSQL looks like this:

No issues
Error Code   ErrorMessage
0            Replication OK

Issue(s) requiring attention
Error Code   ErrorMessage
2            TNotRepl=45 CNotRepl=4 Latency=5 Status=In progress

Nagios expects an error message of a maximum of 80 characters, which is the reason for the brevity of the error message. The fields in the message are -

TNotRepl=45 - 'Transactions not replicated', total number of commands queued to be applied to the subscriber database
CNotRepl=4 - 'Commands not replicated', total number of commands queued to be applied to the subscriber database
Latency=5 - the time (in seconds) from when a transaction is applied to the publisher database until the same transaction is applied to the subscriber database
Status=In progress - current status of the replication process
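
If the short message ever needs to be picked apart again, for example in a follow-up script, a small parser is enough. This is only a sketch and is not part of the monitoring set-up described here; the function name parse_status is made up. It assumes the message keeps the key=value layout shown above, where only the last value may contain spaces.

```python
import re

# A key is a run of word characters followed by '='. Its value runs up to the
# next " key=" or the end of the message, so "Status=In progress" stays whole.
FIELD_PATTERN = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_status(message):
    """Split a message such as 'TNotRepl=45 ... Status=In progress' into a dict."""
    return {key: value.strip() for key, value in FIELD_PATTERN.findall(message)}


fields = parse_status("TNotRepl=45 CNotRepl=4 Latency=5 Status=In progress")
print(fields["Latency"], fields["Status"])  # prints: 5 In progress
```

The values come back as strings, so convert the counts with int() before comparing them to a threshold.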

This error message is primarily intended to give some guidance as to what might be happening with the replication process.  It is not intended to give any guidance on the underlying reason that is causing the issue.  All that is required is that Nagios shows that there is something wrong.  Whatever the reason, it requires some form of human intervention.  Once an error condition has been detected, the issue will be handed to me to resolve.  At least now I do not have to check the process periodically; I just have to wait for a message from our sysadmin.