Some thoughts of your typical data shepherd / data plumber / data dance teacher sort of person.
Figure 1 – The function ISJSON() returns 1 showing the data in the field [Data] is valid JSON
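For anyone following along without the screenshot, a check like the one in Figure 1 can be sketched roughly as below. This is only a minimal sketch: the variable @Data and its contents are made-up stand-ins for the [Data] field, not the real API output.

```sql
-- Hypothetical sample value standing in for the [Data] field
DECLARE @Data NVARCHAR(MAX) =
    N'{ "sessions": [ { "id": "117469", "title": "DAX Gotchas" } ] }';

SELECT ISJSON(@Data)              AS IsValidJson,    -- 1 when the text is valid JSON
       ISJSON(N'not json at all') AS IsInvalidJson;  -- 0 when it is not
```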
Now that the JSON data is in the SQL database, let's see about doing something useful with it. SQL Server 2016 added a number of new functions that allow the querying and manipulation of JSON data. Having done some research, I found this blog post - https://visakhm.blogspot.com/2016/07/whats-new-in-sql-2016-native-json_13.html. Using code from this blog post I was able to extract the data from the JSON string supplied by the API from the sessionize.com website.
Before querying the data I need to explain one concept which is crucial for extracting data from structured JSON: the path. In the example in Figure 2 below, the path of the ‘title’ key-value pair is as follows:
sessions.0.title, which holds the key-value pair 'title: “DAX Gotchas”' (see Figure 2).
Figure 2 – JSON data showing the sessions node and the first speaker node.
In the JSON object that was returned from the sessionize.com API there is a node for each session, numbered 0 through 29. Within each node there are a number of Key : Value pairs, e.g. 'id : “117469”'. The paths, nodes and arrays (e.g. speakers and categoryItems) are what the TSQL is going to extract values from.
Enough with all that waffling about JSON objects, let's write some proper TSQL. In the next example we are going to use a function called OPENJSON(). This is only available in SQL Server 2016 or later. Using OPENJSON() in this example we are going to provide two arguments. The first is @AllJson, which contains the JSON object and must be of datatype NVARCHAR(). The second is the path; the way I think about the path is that it specifies the node or array that I want to return from @AllJson. The other function that we will use is JSON_VALUE(). This function also accepts two parameters. The first is an expression, which is a variable or field name containing JSON data. The other one is the path; the way I think about the path is that it specifies the node or array that I want to return from the JSON data (yes, I said that already, just wanted to see if you are paying attention ;->). One difference worth knowing: JSON_VALUE() returns a single scalar value, so its path has to point at one key rather than at a whole node or array.
That’s a lot of words, so let's look at some TSQL in Figure 3 below.
Figure 3 – The JSON data from the sessions node returned as a result set in SSMS
When we look at Figure 3 we will notice that the first row of the data is the same as the data shown in Figure 2. In essence, FROM OPENJSON(@AllJson, '$.sessions') returns a dataset which consists of three fields, namely Key, Value, and Type, with one row for each of the 30 session nodes. The field Value contains the JSON object for that session node. Next, the JSON_VALUE() function takes that JSON and extracts the value for one Key:Value pair. This is done by specifying the Key of the 'Key:Value' pair. So in the case of title, '$.title' is supplied for the path parameter. Since there is only one 'Key:Value' pair where the Key = title, its value is returned from the JSON_VALUE() function, in the field ‘SessionTitle’.
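Since the screenshot is not much use for copy and paste, here is a minimal sketch of the kind of query described above. It is not the exact query from Figure 3: the JSON in @AllJson is a tiny hand-made sample shaped like the sessions node, and the second session title and all the speaker IDs are invented placeholders.

```sql
-- Hypothetical cut-down sample; the real @AllJson holds the full API response
DECLARE @AllJson NVARCHAR(MAX) = N'{
  "sessions": [
    { "id": "117469", "title": "DAX Gotchas",        "speakers": ["speaker-guid-1"] },
    { "id": "117615", "title": "Placeholder Session", "speakers": ["speaker-guid-2", "speaker-guid-3"] }
  ]
}';

-- OPENJSON returns one row per element of the sessions array
SELECT s.[key]                          AS NodeIndex,    -- position in the array: 0, 1, ...
       s.[type]                         AS JsonType,     -- 5 means the value is a JSON object
       JSON_VALUE(s.[value], '$.id')    AS SessionID,
       JSON_VALUE(s.[value], '$.title') AS SessionTitle
FROM OPENJSON(@AllJson, '$.sessions') AS s;

-- The sessions.0.title path from earlier, written as a SQL Server JSON path
SELECT JSON_VALUE(@AllJson, '$.sessions[0].title') AS FirstSessionTitle;
```

Note the square brackets around key, value and type, which keep them from being read as keywords, and that array positions in a SQL Server JSON path are written as [0] rather than .0.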
Looking at Figure 2, the session ID also appears as a Key:Value pair in the speakers array. So sessions.id.value is “117469”, and the corresponding lookup value, speakers.sessions.value, is also “117469”. The two values and their locations in the JSON object are shown in Figure 4 below.
Figure 4 – Showing the lookup values for both sessions to speakers and vice versa.
So we know that we want to get access to the data in the speakers array, as this contains the list of speakerIDs for each session. How is this done? Well, I found an answer in this blog post - https://visakhm.blogspot.com/2016/07/whats-new-in-sql-2016-native-json_13.html. Below in Figure 5 are the TSQL and result set.
Figure 5 – Updated query to return the speakerID from the speakers array.
All we have done in the query shown in Figure 5 is add a CROSS APPLY with a simple SELECT statement. Now the speaker ID is returned. Note that if there is more than one speakerID, such as in the case of sessionID 117615 (which has two awesome speakers), the query returns two rows, with a different speakerID in each, which is just what we wanted.
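The CROSS APPLY step can be sketched along these lines. Again this is an approximation of Figure 5 rather than the original query, run against a made-up sample in which the speaker IDs and the second title are placeholders.

```sql
-- Hypothetical sample: session 117615 has two speakers, as in the real data
DECLARE @AllJson NVARCHAR(MAX) = N'{
  "sessions": [
    { "id": "117469", "title": "DAX Gotchas",        "speakers": ["speaker-guid-1"] },
    { "id": "117615", "title": "Placeholder Session", "speakers": ["speaker-guid-2", "speaker-guid-3"] }
  ]
}';

SELECT JSON_VALUE(s.[value], '$.id')    AS SessionID,
       JSON_VALUE(s.[value], '$.title') AS SessionTitle,
       sp.[value]                       AS SpeakerID
FROM OPENJSON(@AllJson, '$.sessions') AS s
-- Open the speakers array inside each session; one row per speaker ID
CROSS APPLY (SELECT [value]
             FROM OPENJSON(s.[value], '$.speakers')) AS sp;
```

With this sample, session 117469 comes back once and session 117615 comes back twice, once per speaker.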
Next let's have a look at returning data from the speakers node. Below in Figure 6 is the TSQL to return some data from the speakers array.
Figure 6 – TSQL query to return data from the speakers array
Looking at the query inside the CROSS APPLY:
SELECT Value FROM OPENJSON(s.Value, '$.links')
WHERE Value LIKE '%Twitter%'
There are a couple of things that are worth looking at. First, it is possible to use a WHERE clause on the columns returned by the OPENJSON() function. Second, the reason for using the WHERE clause here is that the links node can contain more than one type of link. During development some of the speakers had a LinkedIn profile, which they then removed 🙁.
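To put that inner query in context, here is a rough sketch of a full query around it. The shape of the speakers node used here (the fullName key, and links as objects with title and url keys) is my assumption for illustration, and the names and URLs are invented.

```sql
-- Hypothetical sample; key names inside speakers and links are assumptions
DECLARE @AllJson NVARCHAR(MAX) = N'{
  "speakers": [
    { "id": "speaker-guid-1", "fullName": "Speaker One",
      "links": [ { "title": "Twitter",  "url": "https://example.com/twitter/speaker-one" },
                 { "title": "LinkedIn", "url": "https://example.com/linkedin/speaker-one" } ] },
    { "id": "speaker-guid-2", "fullName": "Speaker Two",
      "links": [ { "title": "LinkedIn", "url": "https://example.com/linkedin/speaker-two" } ] }
  ]
}';

SELECT JSON_VALUE(s.[value], '$.fullName') AS SpeakerName,
       JSON_VALUE(l.[value], '$.url')      AS TwitterUrl
FROM OPENJSON(@AllJson, '$.speakers') AS s
-- Keep only the link objects whose JSON text mentions Twitter
CROSS APPLY (SELECT [value]
             FROM OPENJSON(s.[value], '$.links')
             WHERE [value] LIKE '%Twitter%') AS l;
```

One thing to be aware of with this approach: CROSS APPLY drops any speaker who has no matching link, so Speaker Two would not appear here. Swapping in OUTER APPLY keeps those speakers and returns NULL for the link instead.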
So by now I am sure you are saying “show me the money”. After some work I created a query which extracts the session, speaker and room information, then returns it as a single result set as shown in Figure 7 below.
Figure 7 – Result set with Session, Speaker and room details
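The full query lives in the notebook, but the general idea can be sketched as below. Treat this as one possible approach and not the actual query behind Figure 7: the roomId key on the session, the rooms node, and the fullName and name keys are all assumptions about the JSON shape, and the sample values are invented.

```sql
-- Hypothetical sample; roomId, rooms, fullName and name are assumed key names
DECLARE @AllJson NVARCHAR(MAX) = N'{
  "sessions": [ { "id": "117469", "title": "DAX Gotchas", "speakers": ["speaker-guid-1"], "roomId": 1 } ],
  "speakers": [ { "id": "speaker-guid-1", "fullName": "Speaker One" } ],
  "rooms":    [ { "id": 1, "name": "Room A" } ]
}';

SELECT JSON_VALUE(s.[value], '$.title')     AS SessionTitle,
       JSON_VALUE(sp.[value], '$.fullName') AS SpeakerName,
       JSON_VALUE(r.[value], '$.name')      AS RoomName
FROM OPENJSON(@AllJson, '$.sessions') AS s
-- One row per speaker ID listed on the session
CROSS APPLY OPENJSON(s.[value], '$.speakers') AS sid
-- Look up the speaker details by ID
INNER JOIN OPENJSON(@AllJson, '$.speakers') AS sp
        ON JSON_VALUE(sp.[value], '$.id') = sid.[value]
-- Look up the room by the session's roomId
INNER JOIN OPENJSON(@AllJson, '$.rooms') AS r
        ON JSON_VALUE(r.[value], '$.id') = JSON_VALUE(s.[value], '$.roomId');
```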
If you want to have a try yourself and play with the code, then you will find:
The TSQL source code in an Azure Data Studio Notebook here
The Python code in an Azure Data Studio Notebook here
If you have not run the Python code to import the data, then I have created an Azure Data Studio notebook containing the code to create the database and other tasks. The notebook can be found here.
Last, but very much not least, why did I spend so much effort to get all the data out of the Sessionize API? The end goal was to supply the data to SQL Server Report Builder (download from here https://www.microsoft.com/en-us/download/details.aspx?id=53613). This standalone tool will allow you to build an SSRS report. Using this tool I created a report which, when run, outputs pages that look like the one shown in Figure 8 below.
Figure 8 – Data finally published on the SSRS report