The core application my current company has built allows people to apply for grants. We track the application through various stages, much like a loan application there are many stages that the application might go through. The status of the application is not linear. So the applications can go back and forth between the same status more than once. In figure 1 below an application .can given a status of ‘under review’ then get a status of ‘submitted’ or ‘Pre-Submission’. So, an application might have the same status more than once with different dates.
Figure 1 –showing the workflow an application might take.
One question which our customers want to know is how long an application takes to move from one status to another. This question could be answered using the existing application database. Using that method is not ideal as the queries are quite complex and slow. This is one of the reasons I decided to build a Datawarehouse as a proof of concept. To see if there was a quicker way to answer that question.
Calculating time difference
Given that an application can be allocated a status more than once during its lifecycle. This has an impact on the method we used for calculating the time difference. The methodology used to get the time difference is defined below
The fact table in the POC Datawarehouse, stores the ApplicationID, Status, and the datetime the application was allocated that status. If an application has been assigned the same status then there would be a record for each time the application has received that status. This has some advantages; the end user can easily understand the data as it is shown in the table. On the other hand, this presented some difficulties with calculating the time period between each status.
After various attempts the following calculated column was created. In this case calculating the days between a grant receiving Pre-Submitted status and Rejected status.
Figure 2 – First version of the calculated column DaysFromSubmittedToRejected
For both values the logic used is very similar, for the variable FirstSubmittedDate the MIN() function is used to get the first date. The FILTER statement only returns records with same ApplicationID and Status = ‘Submitted’. Whereas the variable LastRejectedDate the MAX() function is used to return the last date. Using variables, means the logic can be re-used for other statuses, and the DATEDIFF() function is easier to read. When the results are displayed they were not as expected.
Figure 3 – results from the first version of the DaysFromSubmittedToRejected function see Figure 2
The requirement was to return the number of Days elapsed From Submitted To Rejected with the value returned against the record which is the last time the application received that status. So, another function was created to see if the current record is the LastRejectedDate, which is shown in Figure 4.
With this calculated column (LastRejectedDate) in place the return statement of ‘DaysFromSubmittedToRejected’ function just requires a small change which is as follows
VAR FirstSubmittedDate =
MIN ( 'FactAppStatuses'[DateAddedAt] ),
ALL ( 'FactAppStatuses' ),
'FactAppStatuses'[ApplicationID] = EARLIER ( 'FactAppStatuses'[ApplicationID] )
&& 'FactAppStatuses'[Status] = ‘Submitted’
VAR LastRejectedDate =
MAX ( 'FactAppStatuses'[DateAddedAt] ),
ALL ( 'FactAppStatuses' ),
'FactAppStatuses'[ApplicationID] = EARLIER ( 'FactAppStatuses'[ApplicationID] )
&& 'FactAppStatuses'[Status] = 'Rejected’
'FactGrantStatuses'[LastRejectedDate] = "Y",
DATEDIFF ( FirstSubmittedDate, LastRejectedDate, DAY ),
Figure 5 – Second version of the calculated column DaysFromSubmittedToRejected
When the function is added to the PowerBI report shown in Figure 5 then run then the results can be see in Figure 6 below.
Figure 6 – results from the second version of the DaysFromSubmittedToRejected function (see Figure 5)
The addition of the IF() statement uses the value to check if current row contains the LastRejectedDate. Otherwise it returns a zero. This returns the record set which is expected.
The positive uses of this method are that the same calculation can be re-used for a variety of statuses only requiring a minor change to the formula. On the negative side the more calculated columns there are the longer it would take to generate a report.
No doubt there are other approaches that can be taken to improve this. The time taken to evaluate the calculated column will be slower as the number of rows increases. On the other hand, the current method is significantly faster than doing the same calculations on the application database. Even better with the use of a PowerBI report or a tabular data model. These figures could easily be used by an end user to reports which provide insight into their data. All the better that our clients could do this for themselves, with relative ease.
On the the 10th of November 2017 the Glasgow SQL server user group ran #SQLGLA. This was a half day event, primarily for SQL server professionals in Scotland. Although we did have some visitors for further afield. Namely our four MVP speakers, William Durkin, Chrissy LeMaire, Andre Kamman and Rob Sewell.
Now the event is completed, we have gathered and compiled the feedback. Now I feel that there is time to think about the event. It's been a bit a rollercoaster ride, with the full complement of good bits and scary bits. Much of the thanks for making this entire event a success goes to my co-organiser Craig Porteous. Without Craig this event would not have come off at all.
It would be fair to say that both Craig and myself had visions of very few people coming. Other than the speakers, volunteers and ourselves. People did come and as often happens at these, people you did not expect arrived and those we expected did not arrive. All part of the learning curve. My memories of the actual day are, we arrived at the venue, we all ran up and down stairs, realized what we had forgotten to bring, set up the venue ready for our guests. People arrived, attendees, speakers, sponsors. Speakers, spoke, people mingled, chatted, had tea, coffee, sandwiches, then drank beer, & wine. We cleared up, said good bye. Some people head off to the pub to continue the fun. It was a fantastic day, all the objectives we had aimed for were achieved.
As we know these events do not run themselves. There is a dedicated team of selfless people who make these events happen. Craig Porteous my co-organiser put a massive amount of work into making this event happen. Even more his selfless decision to have one more drink at the SQL Bits party meant he met William Durkin, and the conversation resulted in the event :-). We were also very kindly blessed with two very enthusiastic and industrious volunteers Edith and Paul. Who always seemed to be in the right place at the right time, and just did what needed to be done.
This event was a huge learning experience for Craig and myself. This event combined with running the user group has taught me more than a few lessons. Some of which we have learned, some I am sure we are still learning. Now the event is over, and I have sometime to look back, I will blog about some of the lessons. About running a user group and a larger event such as #SQLGLA.
So first l created a variable in the query window see figure 1.
Now you have a card with dynamic title, if you want to change the client name all you have to do is change the text in the variable. A sample PowerBI report file can be downloaded here which contains the report shown above.
Whilst nothing can beat experience having a certificate can also improve the chances when looking for a new job. If an interviewer is looking at two evenly matched candidates, one without a certificate one with a certificate. Personally speaking, if l was the person making the decision then the person with the certificate would the one whom l would recommend getting the job. By studying and passing the exam the candidate has demonstrated, initiative, self-motivation, and a genuine desire for technology. At least that's my opinion. In one case l know of a colleague whom l work with, who said one the reasons they were offered a job was due to the fact that had studied and passed two exams, which were relevant to the role they applied for.
Craig Porteous spoke to some recruiters he knows and asked them some questions about their view of certification. So these thoughts are directly from those who make hiring decisions.
What weight do you put on certifications when hiring?
1) A lot. With two otherwise comparable candidates, the one with certifications wins in my mind.
2) A lot, think it shows that candidates are focused in developing their career.
3) For a technical role I see it as essential
Do you encourage the pursuit of certifications by your team?
Do you see any downsides/negative aspects to certifications?
1) No, none at all.
2) Some of the accreditation's could be more hands on focused.
3) Cost (retaining skilled up workers)
With staff who have completed certifications, do you see any differences in working practice etc to those who haven’t?
1) Yes. People working towards certifications are more engaged with technology and tend to apply their learning in the work environment, sharing their knowledge and improving the overall team dynamic leading to improved productivity.
2) It really depends on the individual so don’t think it’s a fair comparison. Better way to look at it is how doing accreditations adds value to that person in regards technical ability and confidence.
3) Last example was a infrastructure type. Stuck him on a sccm course. A,month later our sccm world upgraded. A year later he left and now heads up sccm at dell secure works in the US . For me a benefit – I get a sccm upgrade from a capable engineer. They get a badge they can use to get their career upgraded. Win win.
One reason l have heard for not doing any certification is the cost. My personal point of view on this is that l am investing in myself. If l learn a new skill gain or learn some new techniques. Yes, the company l am working for will benefit which is excellent news for them. If l choose to move to another role with a different company, then those skills transfer with me. Those skills l have invested both my time and (more importantly very often) my money in they are mine. So my choice is to invest in myself as l believe that the return on investment (ROI) is excellent.
So you have decided to study for certification, what resources are there available? The following list is just suggestions, based on largely on my experience and some others.
If you are planning to take the data platform exams then l would strongly suggest investing in the books for the relevant exam. For example, this one is for the 70-461 exam. The book covers all the topics that could be questioned in the exam. There is an accompanying CD has an electronic copy of the book and practice exam questions. One series of books which I used whilst studying for 70-461 exam Querying Microsoft SQL Server 2012/14 was series titled 2 Joes. The purpose of the books is to take a complete beginner through all the skills required to pass the exam. Personally l found this is the best explanation of how to query xml data using TSQL. They where for me worth the investment.
CBT Nuggets - an excellent resource, the videos l viewed for my 70-462 exam were really helpful. As part of the training package l signed up for included exam questions which l also found where excellent. More expensive that other sites, you have access to all the courses. Really worth considering if want to maximize your study time.
Udemy - with this provider you purchase one course at a time which allows you "lifetime access" to the course. When the courses are on sale, the prices are low. The quality of the courses can vary, so have a look a the reviews on the course before purchasing. Personally, l found the 70-463 covered the basics well, on the other hand, it did not go into sufficient detail for the exam questions.
Pluralsight - this is a well know training site (ok l have heard the name in quite a few place) there are lists of videos for specific certifications. The quaility of the courses l have watched were of a very high quality.
Microsoft Virtual Academy - its Free, which is not always a recommendation. That said the quality of the courses that l have viewed have been excellent. For the exams search for jump start videos, which are a really good jumping off point to start your studying. So excellent hints and tips in the videos from people who passed and training people for the exams.
SQL Bits - apart from being one of the best SQL conferences in Europe. The organizers have given back to the SQL community by recording some of the sessions and making the available for FREE on their website. To find what you are looking for might take a bit of searching, most topics will have at least one video on them.
YouTube - there are a lot of videos uploaded on a wide variety of topics. There will be some searching to find the topic you are looking. On the downside the quality of content is variable. Ranging from the excellent to the not so good.
Blog posts - again this will require some searching. It has in my experience, been worth the time and energy required. One author whilst studying for the 70-463 exam blogged about what she was learning as they went along.
Passing the exams
What is required to pass the exams? Practice and lots of it! One of the keys that are borne out by other people who passed the exams is the practice exams. These exam questions will not be exactly like the exam questions. What they will do is get in the way of thinking when doing the exams. Reading exam questions to see the question, examining multiple choice questions for the correct answer. Best way to get practice is to get hold of practice exam questions and take the exam. There are a number of providers which will allow you to purchase them, MeasureUp, CBTNuggets and others. They will not be exactly the same as the exam questions you take when you go into the exam room. On the other hand they will give you practice at answering the questions. If you pay attention to the score at the end of the practice exam you will also be able to see where you need to improve.
The questions are designed quite deliberately to test your knowledge, well you would not want them to be too easy ? This article from Pluralsight has some excellent examples of the format of the type of question you will be answering in the exam. If you are really interested in how the question are constructed and methodology behind them, this video from Pluralsight has an interview with someone who designs the exams.
All that remains to say if you have decided to study for a certification, good luck and happy studying.
Figure 2 – SELECT statement from Figure 1 showing the output when FOR JSON PATH is used
Let's make one change to the SELECT statement in Figure 2, instead of using FOR JSON PATH use FOR JSON AUTO.
Figure 3 – Output from Select statement with FOR JSON AUTO
Looking at Figure 3 the JSON string outputted is different from one shown in Figure 2. With the select statement in Figure 3 all the records are from a single SalesOrderNumber and Order date. Hence all the records from Sales.SalesOrderDetail alias “D” are in a child node underneath the parent record from Sales.SalesOrderHeader.
Let's try a different SELECT query this time select 3 records with different SalesOrderNumbers see Figure 4.
Figure 4 – results set for the three SalesOrderNumbers
Now lets try the same query with ‘FOR JSON AUTO’ the query and output is shown in Figure 5.
Figure 5 – results set for the three SalesOrderNumbers outputted as a JSON string
Looking at Figure 5 each individual SalesOrderNumber are placed in their own node. The corresponding SalesOrderDetail values are placed in a child node underneath the parent SalesOrderNumber.
Using FOR JSON AUTO the format of the JSON string is determined by the SELECT statement. Whereas FOR JSON PATH which we demonstrated in the previous blog post the JSON string output is controlled by the fields and table presented in the SELECT statement.
Figure 1 – Three records returned from the temporary table.
There are three records returned, note that the second record the ‘Colour’ field has a NULL value returned. In the next step I am going to add ‘FOR JSON PATH’ after the ‘FROM’ statement.
Figure 2 – Adding ‘JSON PATH’ to the select statement, the results set it outputted as a JSON formatted string
When the SELECT statement with the ‘JSON PATH’ command is run the results are returned as a JSON formatted string. The query was ran in Azure Data Studio, so to see the JSON string formatted. All that was required is to click once with left hand mouse button to open the results in a new window, as show in Figure 2.
Note that for the second record, the ‘Colour’ field has no data returned, since the value for that record is ‘NULL’. Can we write a SELECT statement which includes ‘JSON PATH’ and return records with NULL values?
All we need to is add ‘INCLUDE_NULL_VALUES’ to the query see Figure 3.
Figure 4 - Adding ‘WITHOUT_ARRAY_WRAPPER’ to the query and the square brackets are suppressed.
All the queries shown in the screenshots were ran using Azure Data Studio, which if you click on the Results set returned opens it in another new window and formats the JSON.
That’s enough for just now there is more to come.