
Louis Davidson

  • Curious - How does STRING_SPLIT (2016) handle empty/NULL Input?

    If you haven't seen STRING_SPLIT, it is pretty awesome; it solves a problem we have all been hand-coding solutions to for many years, which you can read about in Erland Sommarskog's great article here, along with many other concepts involved in sending a set of data to a stored procedure.

    As a quick starting example, consider the string of characters: A,B,C;D. Using a separator of ',', this will give you three separate values. The STRING_SPLIT function parses the string and returns the values as rows in a tabular output.

    DECLARE @StringToSplit varchar(30) = 'A,B,C;D'; --could be varchar(max)

    SELECT *
    FROM   STRING_SPLIT(@StringToSplit,',');


    This returns: 

    value
    ------------------------------
    A
    B
    C;D


    Note: BOL https://msdn.microsoft.com/en-us/library/mt684588.aspx doesn't specify an output length for the value column, but when I added an "INTO hold" clause, the column's length varied with the input, and was varchar when the variable was varchar, and nvarchar when it was nvarchar. The output for my call above is varchar(30).
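
    As an aside (my own addition, not in the original post), you can check the type SQL Server derives for the value column without materializing a table by using sys.dm_exec_describe_first_result_set. For the varchar(30) input above, I would expect it to report varchar(30):

    --inspect the derived type of STRING_SPLIT's output column
    SELECT name, system_type_name
    FROM   sys.dm_exec_describe_first_result_set(
               N'SELECT value FROM STRING_SPLIT(CAST(''A,B,C;D'' AS varchar(30)), '','');',
               NULL, 0);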

    But what about the two versions of an empty value: '' (zero-length/empty string) and NULL? My NULL sense told me that NULL would return a single row with NULL, and the empty string would return a single empty-string row. Of course, I was wrong about the NULL case, and it makes sense why: a row of NULL would be really annoying, especially if you want to use the output as an exclusion list, because A NOT IN (SET(B,NULL)) always returns NULL, not TRUE.

    For example, say the output could include NULL. You could end up with something like the following, where even though the input value of A is not in the NOT IN list, no rows are returned:

    SELECT *
    FROM   (VALUES ('A')) AS List(Col)
    WHERE  Col NOT IN ( SELECT *
                        FROM   (VALUES ('B'), (NULL)) AS Excludes(Col) );

    Change NULL to 'C' and you get the result you expect.
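
    As another aside (my own, not from the original post), if you do need an exclusion list that may contain NULLs, rewriting the NOT IN as a NOT EXISTS gives the intuitive answer, because the correlated comparison to NULL simply never matches:

    SELECT *
    FROM   (VALUES ('A')) AS List(Col)
    WHERE  NOT EXISTS ( SELECT *
                        FROM   (VALUES ('B'), (NULL)) AS Excludes(Col)
                        WHERE  Excludes.Col = List.Col );
    --returns the 'A' row even though the exclusion list contains a NULL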

    Ok, so let's look at the output. What does '' (empty-string) return?

    DECLARE @StringToSplit varchar(30) = '';

    SELECT *
    FROM   STRING_SPLIT(@StringToSplit,',');

    Just as expected, a single row with an empty value (which is easier to see if you keep the rows affected message on for this example, because the next output looks the same but is a different result):

    value
    ------------------------------


    (1 row(s) affected)

    Any length of value between the commas will be returned as a row. BOL notes that if you have spaces between the separators, you will get outputs of various lengths, so you can do the following: 

    DECLARE @StringToSplit varchar(30) = ', ,  ,   ,  bob  '; --could be varchar(max)

    SELECT *
    FROM   STRING_SPLIT(@StringToSplit,',');

    And you will get 5 rows: one empty string, one with a single space, one with two spaces, one with three spaces, and one of '  bob  '. Using RTRIM and LTRIM you can deal with the spaces at the front and back of your strings. But what about NULL (and not the string value 'NULL', which will confuse you as you are writing early in the morning)?

    DECLARE @StringToSplit varchar(30) = NULL;

    SELECT *
    FROM   STRING_SPLIT(@StringToSplit,',');

    This returns:

    value
    ------------------------------

    (0 row(s) affected)

    A NULL input returns 0 rows, which makes some sense since there is no string value to split; but as with everything involving NULL, it may not be exactly what you expect. In this case, the output is more than likely what you wanted in the first place, which is definitely not always the case with NULL.
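
    If you would rather get the single empty-string row back even for a NULL input, a simple workaround (my own sketch, not behavior of the function itself) is to coalesce the input before splitting:

    DECLARE @StringToSplit varchar(30) = NULL;

    SELECT value
    FROM   STRING_SPLIT(ISNULL(@StringToSplit, ''), ','); --returns one empty-string row even when the input is NULL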

  • Post PASS Summit Resolutions 2016

    So, I am finally back and lightly rested from the two big weeks of SQL learning this year (PASS Summit and MVP Summit) and like pretty much every year I am physically exhausted, feeling my age. So many sessions, so many miles put on my feet. But my brain recovers from the ordeal faster than my feet, and is like: "wow, I am going to need more hours in the week!" But I do want to set the expectation dial to around a 6, not 11, because from what I hear, there is more to life than SQL (and yes, I did chuckle to myself when I said it).

    So here we go, my resolutions for the 2017 year:

    1.  Keep involved locally. My wife and I are leaving the Nashville area this year, moving to Cleveland, TN (south of Nashville, unlike that other more famous Cleveland!) So I will hopefully get involved in the Chattanooga group, but who knows what I will work out. I will probably make it to a meeting or two in Nashville this year too.

    2.  Write more. Pretty much anything, but I want to keep writing books if I can. I have one coming out later this year, and one more early next year. One project I really hope to start on this year is upgrading the DMV book, starting with blogs on the DMVs that need to be added, and hopefully getting ideas from the community, and probably pick up a cowriter or two before it is all said and done.

    On the blogging front, I will try to do my best to put out something every week (well, every other week or so, to be completely honest), even when I am working on a larger bit of material like a book. This will probably be the hardest resolution to keep, because I get pretty focused on one thing at a time and can’t find the mental energy to work on something else while another is waiting. I have tons of unwritten blogs waiting for me and I will try to complete at least some of them. Something I will do at least a few times is to try to participate in T-SQL Tuesday and other blogging challenges now and again.

    3. Continue to speak, but do so "economically." I almost always foot my own bill to go speak at a conference, except for the PASS Summit which is part of my training budget each year. Part of our move from Nashville will be a reduction in family income, so I might slow down a little. Not that I plan to quit or anything, and have already put in for Nashville and Birmingham, and will certainly submit to Atlanta and Chattanooga, and probably an event or two more if I can split costs/travel with others.

    I also plan to submit to the PASS conference this year, something I haven’t done in a few years. This year I plan to do it with a bit of a twist. More on that in a bit.

    4. Start recording my old presentations and putting them on (possibly) YouTube.  I planned on doing this a while back, calling them “director's cut” presentations (the point being that I am not restrained to 1 exact hour, so I won’t cut any demos or commentary for time.) Then I had my surgeries and just never did it. I was reminded of this when I was at the MVP Summit and talking to another person who used YouTube for a similar sort of use.

    5. Upgrade my website to advertise my talents better. My current website kinda stinks, but is functional enough. I would like to make it a showcase of my work, including more snippets, the aforementioned videos, and blogged articles; and to expand some of the examples from my book and presentations for Azure SQL DB, and in some cases In-Memory OLTP (naturally starting as blogs!)

    6. Learn something new. There is a specific technology that I am thinking I want to learn very well, but I can't mention it yet. See #2 for what I will do once it has been released.

    7. Continue to stay involved with the PASS Program Committee reviewing and commenting on abstracts for people before they submit them. Note that I had said I plan to submit to the PASS Summit. Last year I felt it was a bit of a conflict of interest as I reviewed 80+ abstracts, and easily could have stolen ideas (and a few sessions I saw seemed like they were stealing MY ideas, even if I had never told anyone.) To make sure that there isn’t a conflict of interest, I will post my abstracts to my blog before the call goes out, so no one could say I stole their idea (and I won't mind if people steal mine as attending PASS without speaking is kind of relaxing.)

    In addition, I hope to use my abstracts as a screening test for people joining the Abstract Coaching team (with some real good righting to see what people seeen). That way I can get the benefit of the team's skills, just like several past team members have (something I highly recommend! It is usually a skilled person who knows that other skilled people can make them better!)

    8. Get more fit. At the PASS Summit this year I walked more than I ever have, for a total of 57215 steps (an average of 8173 steps a day, or approximately 3.5 miles a day based on what my Microsoft Band 2 tells me), though I did use my cane quite a bit. I even attended more of the parties than normal, and only used an Uber once to get to somewhere other than the airport. It still knocked the heck out of me, but I am getting there. I doubt I ever can write resolutions without saying get more fit, but hopefully I can always say I did improve in the follow up.

    9. Keep stretching my comfort zone. This year I wore a kilt, and it was not exactly easy for me. But I got through it, and it didn't kill me. With the PASS Summit in 2017 starting on Halloween, if #sqlcostume becomes a thing, I have a costume idea that will be fun, and will again stretch my comfort zone in a way that will not be scary, or cause anyone nightmares either.

    10. Start working on following through on my secret resolution from last year. Because I ended up working on an extra book and having knee surgery, I never did the leg work (ba boom ching). My idea is to start a conference at a higher level, particularly so speakers can go and learn stuff, along with the people who don't feel like they are getting anything out of conferences. If you steal my idea, you have to let me attend for at least half price… Or free.

    11. If I am not having fun, disregard any of the previous items except #8 (because that one is truly important). I enjoy this stuff most of the time, but sometimes it wears me the heck out. I occasionally just want to turn on the television and watch a football game. While I am writing this there is a game on my hotel TV as I head out of town. I had to check the TV to see who was winning. Seattle. Seattle is winning. (As I edit myself, I will note that Seattle won, and now I am watching Carolina play. Football is great TV to have on while you write.)

    ---------------------

    So there you have it. I am going to try to follow through with these resolutions, but who knows. Life is full of surprises, exciting (like getting a new book to write), not exciting (like having your hip break), or really sad (like a younger acquaintance of ours passing away suddenly).  You don't have a clue what is going to happen, and you are never in control of it. We can only influence the future, never control it.

    Finally, I ask myself why I write resolutions. I remember something Scott Adams wrote a long time ago about writing down goals (I can't find the reference I remember anymore). Writing them down and publishing them to the web helped me make them real last year. Will it work this year as well as last? See you in early October next year for the followup, and you will find out.

  • After PASS Summit 2016 Recap (As seen by me!)

    In my last blog entry, I promised to blog about the PASS Summit each night when I got back to the room. This was a failure for two reasons. 1. I was always out at night and then exhausted. 2. I forgot the keyboard to my Surface Pro. I tweeted about it, and was picked on by the @surface twitter account:

    [image: the @surface Twitter exchange]

    But I did tweet about the event rather frequently, as it is much easier to capture your ideas and comments in 140 character spurts (and favoriting other posts saves typing too.) If you want to read all of the tweets about the summit, look for the #sqlsummit hashtag on Twitter.

    The first day was the Microsoft Keynote. It was led by Joseph Sirosh and while there wasn't a tremendous amount of stuff that directly excites this relational engine programmer (though love was shown for how awesome the SQL Server Engine is, both on prem and in the cloud), some of the stuff that was shown was really cool:

    1. Showing the various sources of data you can use with Polybase

    2. SQL Server on Linux - Not that I will ever use this, but it could be a boon to SQL Server usage as time passes (and for a relational programmer, you would not really notice much of a change anyhow)

    3. Azure Analysis Services is coming soon

    4. Azure SQL DW has had some tools created to make it easier to get started with (RedGate has a free tool at http://www.red-gate.com/products/azure-development/data-platform-studio/), and as Tim Ford tweets here: (https://twitter.com/sqlagentman/status/791316930703478784), you can get a free month of SQL DW to give it a try.

    The biggest takeaway was just how much data is going to affect our lives as time passes. Last year, my reaction was that the keynote was a bit creepy, mapping DNA and predicting health. This year, it was a couple of examples that were really cool, including some apps, websites, a few game examples, and sentiment analysis of the book War and Peace (https://twitter.com/drsql/status/791320039303491584) by Julie Koesmarno.

    An interesting turn of technology was the push towards "intelligent database" platforms. Something that many of my colleagues have discussed for years has been leveraging the data tier to get work done faster and more reliably. What had always been missing in those scenarios was scaling out, so we were constantly limited by how much we could do on a single computer. Two things have changed since those days. 1. A single computer can do as much work as most organizations need. 2. Scaling out is far easier when dealing with read-intensive scenarios. There was a demo of SQL Server 2016 handling millions of transactions, a load orders of magnitude heavier than reality requires (and we are talking fraud detection for major credit card companies).

    However, the most moving demo finished out the keynote, and it was the closest to creeped out that I got. There was a computer guessing ages, (I think) gender, etc. Then the computer was describing the surroundings. Then the computer was reading a menu at a restaurant. And then you realize this was a computer helping a blind man. Wow. That was just an amazing use of technology.

    If you want to know what Joseph Sirosh (the Corp VP for the Data Group at Microsoft) felt were the top five announcements, he shared it here: https://twitter.com/josephsirosh/status/790950683138596865. Stuff I didn't mention was really outside of what I know (ok, I admit it, care) about (I do only have so much time!)

    -------------

    After this I attended several pretty great sessions:

    • Why Datatype Choice Matters from Andy Yun, where he covered some of the internals of datatypes. The best part for me was the statement that "NULL isn't a value, it is a state of being unknown, undefined.  Hence the null bitmap in the physical record of a row." While I have written about NULL probably hundreds of times, it is good to be reminded of this point, that NULL isn't really a value, even though it does feel like it.
    • Building an SSRS monitoring system with Stacia Varga (a cowriter on this book). She covered a lot of stuff about logging that I may never use, but one thing I learned about that I might directly use is logman.exe, which lets you capture perfmon counters. There is an article here about capturing SSRS statistics: https://msdn.microsoft.com/en-us/library/ms159809.aspx.
    • Then Tom LaRock and Karen Lopez duked it out again, talking about the time bombs you may have lurking in your database code. You know, like NULLs no one understands, or identity columns that no one pays attention to until the values run out.

    ----------------

    Something I am keen to learn more about came up in two sessions: Buck Woody on the first day, and Dr Cecilia Aragon. Data Science. I don't know if I could, or would want to, become a data scientist. But either way, it leads me down the path of wanting to make sure that the databases I create are ready to be a source of some of that data. I have always been a proponent of tailoring my OLTP database designs to capture every detail that is possible. For example, cause and effect, whether direct (such as a shipment to an order) or indirect (such as a follow-on order that the customer ties to a previous order, either by telling you or by following a link). Data Science is about learning more about everything, and the more detail you can provide an algorithm, the better it can help you spot others behaving the same way. Capturing additional data that isn't needed immediately is not always something that is greeted by developers with a hearty smile, but it is almost always going to be useful.

    Buck Woody pointed out a website (http://tylervigen.com/spurious-correlations) that has some excellent, messed up, correlations that you can make using data. Such as "Per capita consumption of chicken" and "Total US crude oil imports":

    [image: chart correlating per capita chicken consumption with US crude oil imports]

    I eat a lot of hot chicken to try to help, but I am only one person!  These correlations were highlighted even more by Dr Aragon, who had a couple of very interesting quotes that piqued my interest:

    "Data science is driven more by intellectual ecosystems and software ecosystems than by hardware"

    (Paraphrasing)"Humans not gaining exponentially greater cognitive capacity. "

    "Big data is data that is 2 orders of magnitude greater than you are accustomed to"

    For me, these three quotes really put Data Science in perspective. People are now, and have always been, very intelligent, regardless of how things seem at times. But what we really lack is the ability to process concepts quickly. People make the algorithms, and could slog through the data manually, but we would rather let computers whip through the data and give us decisions. Will there ever be a time when machines make correlations that are completely wrong, but act on them anyhow? It reminds me of Robot Santa Claus on Futurama, who judged everyone naughty: both the person who was naughty and the person who told on them.

    Will we ever make a machine that can come up with algorithms, and understand what is a meaningful correlation without some human logic? Heaven knows that every person who creates a machine won't be good at heart, but could machines ever be machines without people?

    It does all remind me of the Pete Townshend song "Man and Machines" from the Iron Man album :

    "Man makes machines
    To man the machines
    That make the machines
    That make the machines
    Make a machine
    To make a machine
    And man and machine
    Will make a machine
    To break the machines
    That make the machines..."

    On Singularity Hub I was reading an article about AI which, while it isn't exactly the same thing, has many of the same problems. There was a statement:

    "Based on deep neural nets, the AI impressively mastered nostalgic favorites such as Space Invaders and Pong without needing any explicit programming — it simply learned through millions of examples."

    If you stop at "without needing any explicit programming", this sounds pretty creepy. But give the computer an example of a successful solution, perhaps even millions of them, and combine that with the fact that computers don't make the tiny mistakes that make games fun for humans, and it isn't that the computer can learn by itself. It is just that it can try, fail, adjust, and repeat a LOT faster than people. But it still takes a human to guide the process.

    -----------------

    The second keynote had two major parts. First was the PASS business stuff. We have more chapters, more members, and want orders of magnitude more people. One way of pushing in this direction is, much like the MVP program did, to include the entire data platform. PASS no longer means Professional Association of SQL Server, but just PASS. New logo too:

    [image: the new PASS logo]

    The little symbols represent what PASS encompasses as an organization, and who we are as PASS members. Interesting enough, but I always worry that things are going to go sideways and we will end up in a different community made up of mostly the same people. Time will tell.

    The second part was an excellent keynote address by Dr David DeWitt. It had some interesting details and comparisons of online data warehouse products, but was a lot broader than that. A good overview of internals that can only help your career. I won't cover anything about it; go here (http://www.sqlpass.org/summit/2016/PASStv.aspx) and watch it.  Best quote for me: "SQL Server, Best Query Engine". But other companies are doing great stuff too.

    ----------------

    Then I went to a discussion about the way sessions are chosen. PASS chooses sessions in a very interesting way, and really I think they do a good job. No matter how you do it, someone's feelings will get hurt, unless you use my favorite method for choosing SQL Saturday sessions: everyone gets a session if they haven't angered the organizers in some meaningful manner. The best way to anger the organizers: not showing up without a good excuse. Yes, it happens too often. And that, along with harassing others at an event (or multiple events), is something that takes a while to get over. The best way to do that: be apologetic, attend events, and don't be a jerk again.

    -----------------

    The other big thing that happens on the second day is that a group of folks wears kilts to PASS to show support for Women in Technology. This year, I was one of those people. It was not a normal thing for me, and not something I expect to do for a SQL Saturday unless it is for something special. Want to see the picture? Click this link to see Jamie Wick's tweet of a picture that was taken: https://twitter.com/Jamie_Wick/status/791739875439456256

    -----------------

    Friday, we woke up to a rather interesting sight for a PASS conference, even more interesting than myself in a kilt. The sun came out.

    I attended one more regular session of note: Tim Mitchell's deep dive into the SSISDB catalog. I knew most everything, but using data taps to capture intermediate results, much as you might use a temp table in a SQL query batch, was very nice. I hope to run a series of blogs about some work I have done with the SSISDB catalog over the next year or so. Another interesting idea: using SSISDB versions for regression testing. Run once, deploy the new version, run again, compare results, then revert.

    The other thing I went to was Speaker Idol, supporting my man Robert @SQLCowbell Verell. We co-lead the Nashville SQL User Group, and it helps us if Robert gets a speaking slot :) Robert was the wild-card/runner-up of his day (there are three rounds of four, with a final round of four to complete the event), and he did a great job. I really felt nervous for all of the people who participated, because what pressure! I have long sweated the process of speaking, because of all of those eyes staring at you, seemingly expecting perfection (actually just expecting to learn a few new bits and pieces). And here, you have 10 eyes staring at you, this time actually expecting perfection. In the end he didn't win, but he certainly didn't embarrass himself, since he made the finals despite having a yellow background for text in SSMS that is still burned into my eyes.

    ------------------

    Then it just sort of ended… No fanfare, just a walk down to the Moore Theatre to catch Ian Anderson do his Jethro Tull rock opera. I hadn't even noticed there being concerts I cared about in the area, and prior to this year I would never have wandered that far from the hotel most nights, but I discovered the ease of Uber while there, which made walking less scary (since I occasionally aggravate my knee when walking as much as I did this week!) Walking back from the show I ran into Rodney Kidd (we were both at the Homewood Suites), who had a lot of great stories about music. Add that to the stories that Pat Phelan shared at breakfast that morning about his cool experiences, and I had a great time even outside of the event.

    Well, can't wait until next year!

  • Post PASS Summit 2016 Resolutions - Followup

    Last year I started a new tradition for myself that will last until I specifically retire from the SQL Server community, something that I practically promise isn't coming in next year's resolutions. As this is the end of the PASS year, with the Summit coming up next week, it is time to see how I did with regards to what I said I would do. I didn't do as bad as I generally do with my New Year's resolutions, but like with those, still not perfect.

    ---------------------------

    1.  Don’t submit to speak again next year – As much as I appreciate the surprise by a few people that I wasn’t speaking, and the kind words by a few people about liking my sessions, this was my favorite year. I am not a natural speaker, so I spend a lot of time preparing, and I miss a lot. This was the first year I actually ventured out with people for a meal or two, caught a movie, and attended sessions in all but 2 slots (I saw my SQL friend Joe Webb, whom I hadn’t seen in forever, and started talking to him and 5 other people who kept stopping by (worth it!))

    The easiest of them all. I did not submit, and there were two reasons. First is that it is awesome to go to Seattle for the week and just attend meetings, talk to people, etc. The second is in regards to number 8. Since I have been working with the program committee helping to comment on abstracts (and unlike the other members of the team, I see them all), it feels weird to get in competition with those folks.

    2. Don’t stop submitting to speak at SQL Saturdays\User Groups – As much as I didn’t miss speaking at PASS, I do love speaking at smaller venues where I can do most prep at home, and the room size doesn’t give me heart palpitations. I plan to submit to 4-6 more events, and I have 2 or 3 user group presentations planned.  I have hopes of doing one pre-con this year as well possibly, and I will submit to a few (It won’t kill me if I don’t get chosen, it is a lot of work!) I also hope to submit for SQL In The City next year if RedGate does it (and I see the call for speakers before it closes).

    I spoke at quite a few SQL Saturdays, even doing a precon in Tampa. I also spoke at Music City Code, which is a very nice replacement for Devlink.

    3. Possibly attend a SQL Saturday and don’t submit to speak – It might be nice to just come to one and just hang out. I would be more than happy to work a booth for PASS or RedGate or whoever would want me to!

    Did this in Chattanooga, and it kind of reminded me how I felt at the Summit last year. It definitely helped that I was able to go to the speaker dinner with Robert Verell, as it is always great to spend the evening talking to my peers.

    4. Get a kilt and a costume to wear for PASS next year – Okay, so this one is probably a long shot (consider this my “lose weight” resolution that may not make it, because I will have to lose weight for the first one to occur), but I have a fantastic costume idea that was inspired by Bob Pusateri’s (@SQLBob) this year (nothing to do with Dilbert, but that is all I will say.)

    I decided not to do the costume, because it just didn't seem like it was that necessary, but I did end up getting a kilt, which I am not 100% sure I will actually wear on #sqlkilt day on Thursday. It all depends on how well I can figure out the look. The following is a very tiny picture of how the kilt looks, just to prove that I did purchase one :)

    [image: the kilt]

    5. Get well enough to walk without making me and others feel bad – I know that I made some people uncomfortable this year watching me try to stand up after sitting down for a few minutes, but I felt better this year than I have in 6 years or more. It was just that muscles tighten when you sit for a while and it hurt really bad.

    This one is the most complicated. I have definitely made a lot of progress. I have dropped another 20-30 pounds since last year, and I have improved my stamina greatly. However, I had surgery to remove a torn meniscus in July and there turned out to be a lot of arthritis. In some respects it is now a bit harder to walk than even before the surgery (not atypical after they get in there and remove and shave down stuff.) Honestly it will probably look like I am worse off this year as I will mostly be walking with a cane, but while things are worse in some areas, I should be able to do much more walking than I have ever been able to do for a PASS Summit.

    In the next year or two I will likely be having a knee replacement, so it will be a while before the idea of walking without looking like I am in pain is a reality (and based on what I know from others, knee replacement is yet another year's recovery, which I am not relishing!)

    6. Keep being involved locally – I may not be the primary chapter leader for Nashville this year because there are so many great people to let have a swing. I don’t know what I will be doing for SQL Saturday, but I will certainly be there if I am in town.

    I did okay here. Our leadership team all kind of hit a wall of time near the end of the year, as did I after picking up an additional writing project that will be coming out later this year or early next. It has really taken away my time to work on the user group, and at the same time, we have had a bunch of issues with venues. But we have some great leaders on the team, and I know we can get things up and going again.

    7. Finish my book early so I can bring some to give away at PASS Summit next year – I will have four chapters done by next week, and hopefully will be done with all of the logical design stuff before the start of the year. Then I have to work in columnstore, in-memory, and all of the other database design changes that are in 2014 and 2016. Sounds easy and like a ton of work at the same time (thankfully, I use very few screen shots.)

    This was a bit of a failure for two reasons. First, I got another writing project that took some of my time. Second, there were some (no fault) issues with technical editing that slowed things down a bit. In any case, the book is in the final stages now, and will be out later this year. The website claims it will be published on November 22, but I still have work to do (writing is all done!) http://www.apress.com/9781484219720

    8. Continue volunteering for the SQL PASS Program Committee – I have greatly enjoyed working with the program committee for the past several years, and last year we started a service to allow submitters to get a review of their abstracts done before they submitted it for the year. This year I hope we can expand the service to help more people who are great speakers to get the writing stuff out of the way.

    The abstract review team did some more great work this year, and we will hopefully do it again this year, doing our best to make it easier for submitters and reviewers, ideally getting people who want abstract help more opinions than ever. We will see…PASS budgets and IT resources and whatnot that I don't have control over.

    9. Blog somewhat more – And not just these post-PASS blogs either. I started blogging about new stuff in SQL Server 2016 until I had to get to work on the book (and some other writing projects), and I will minimally continue that process as SQL Server 2016 matures. After the book is finished, I do hope to get some mastery of the stuff that Azure offers as well (more about that in the next blog I will publish).

    I did fall off once I started writing my books, but the blogs I did write on several of the new features in SQL Server 2016 paid major dividends. I was able to use the blogs in both books, refining them quite a bit (I don't edit my blogs that well, and I probably never will!) but keeping the base structure of the examples. Writing is really fun to me, though sometimes a bit more time consuming than I hope, keeping me stuck at a desk rather than working on #5.

    10. Mystery Project -  I have started shopping around an idea to some folks to see if I can do something kinda different and interesting in the coming year (or so). I have a lot of details to work out, but I want to wait until some other things have passed before I do too much else. (I may send out a survey on Twitter in the coming months to see what more people think too.)

    The mystery project is still a mystery. About the time I was planning on working to make this happen, I got the second book which took up a few months of my time.

    ---------------------------

    Overall, not a terrible first pass at fulfilling PASS resolutions, and who knows what the next ones will say. After the PASS Summit and the MVP Summit a week later, I will make my resolutions for the upcoming year. I know what I am currently thinking I want to do, but the reason I started the PASS Resolutions is that these two conferences have always inspired me to do more and more through the years. 

    I will make one resolution right here though, and this is in regards to the Summit itself. I will be blogging about the Summit this year again. In past years I have tried to blog each day of the Summit, but I had slacked off in the past few years, favoring tweeting. This year, I will definitely be tweeting as much as possible, but when I get back to the room each night, I will try to recap what I have seen that may be interesting to you.

  • Where have I been? Where will I be?

    SQL Saturday Orlando has been postponed. I won't be able to make it for the make up day, so that part of this blog has changed. Not the part about me loving the folks down there. They are still awesome! 

    I have again gone so long without blogging that I don't show up in the list on SQLBlog.com, but it is about time to start blogging again.

    Where have I been?

    Good question that probably no one has actually asked. But let me show you. I have been stuck here, fingers attached to a keyboard:

    [image: my desk]

    Over the past months, I have been sitting behind this desk, writing and editing. My "Pro SQL Server Relational Database Design and Implementation" book (http://www.apress.com/9781484219720) is nearing completion. Jessica Moss and I turned in our last edit after tech review today. I am turning in the source code and bonus chapter on Triggers (where I include a really messy natively compiled trigger that duplicates a MUCH simpler trigger using interpreted code!).

    Second (and this is the first time I have mentioned it publicly), I am writing half of a book with another awesome person: https://www.amazon.com/Exam-Ref-70-762-Developing-Databases/dp/1509304916/, but that has really been a lot of work in a very short time period. I am getting close to finished with my part on that one too.

    Where will I be?

    This Saturday, I will be at SQL Saturday Orlando, doing my "Let Me Finish… Isolating Write Operations" session.  The abstract for this session is:

    OLTP databases can be constantly written to and reporting databases are written to at least periodically. In order to ensure consistent results, connections must be isolated from one another while executing, ideally with the lowest possible cost to concurrency. How this isolation is handled is based on the isolation level, whether the classic lock based or the newer optimistic scheme of the in-memory OLTP engine is used, or even if both engines are enlisted in the same transaction. In this session we will look at examples of how SQL Server isolates reading and writing operations from other writing operations to explore how this may affect your application through error messages and performance hits.

    I love the Orlando folks, and every year I hope my vacation plans (made 11 months in advance) sync up with their event (If you know me at all, you probably know I don't give up a day at Disney World for anything else. I mean, I am speaking while my wife is at Disney World AND at the same time as my UT Vols are playing Texas A&M…)

    Is this the last time I do this session? Maybe. I do really like it, and it is the most natural session I have ever done (particularly a session with demos!) But I have done it a bunch of times this year, and I generally like to write something new every year (except perhaps for my Database Design Fundamentals chestnut.) I am thinking something around Row Level Security/Dynamic Data Masking. RLS for sure is a pretty cool topic that I can certainly see applying some day. The last time I jumped on a session idea about a new feature it was sequences… did not get as many attendees as some other sessions! Wouldn't hate doing a fun session some day too.

    Then I will be at the PASS Summit and MVP Summits late October and early November, before hibernating until around the

    My next blog will be a follow-up on last year's resolutions I made after PASS. If it arrives in time, I might even model the outcome of one of the resolutions as well.

  • Temporal Tables - Part 5 - Start and End Type Precision

    In this fifth (and maybe final, at least until I get to the concurrency chapter prep) blog about temporal tables, I wanted to briefly cover the precision of the ROW START and END times. You can use any of the datetime2 types for these columns, and the precision you choose determines how many of the changes will be visible to the user. Even with the 7 digits of fractional precision from datetime2(7), there is no guarantee that every change will be visible to the user via the FOR SYSTEM_TIME syntax in a FROM clause, but it is generally much more likely than if you are using datetime2(0), as we will see.

    In this blog, I will use datetime2(0) to give the least precision possible, to show what can occur. Note that (not surprisingly) you have to use the same precision for both the ROW START and END times, or you get the following error:

    Msg 13513, Level 16, State 1, Line 6
    SYSTEM_TIME period columns cannot have different datatype precision.

    I will use the same basic structures I have used in previous examples, so if you have created the tables, you will need to drop the table and the history table:

    ALTER TABLE Sales.SalesOrder
        SET (SYSTEM_VERSIONING = OFF);
    go
    DROP TABLE Sales.SalesOrder;
    DROP TABLE Sales.SalesOrderHistory;
    GO

    To generate some data, I will use a SEQUENCE object, formatted as a hexadecimal value in a default constraint so the repeating code will be easier to read.

    CREATE SEQUENCE Sales.SalesOrderData_SEQUENCE MINVALUE 1 START WITH 1;
    GO
    CREATE TABLE Sales.SalesOrder
    (
        SalesOrderId int NOT NULL CONSTRAINT PKSalesOrder PRIMARY KEY,
                                  --default to a text hex value, so we can see changes...
        Data varchar(30) NOT NULL DEFAULT (SUBSTRING(CONVERT(varchar(20), cast(NEXT VALUE FOR Sales.SalesOrderData_SEQUENCE as varbinary(5)), 1),3,12)),
        ValidStartTime datetime2 (0) GENERATED ALWAYS AS ROW START NOT NULL,
        ValidEndTime datetime2 (0) GENERATED ALWAYS AS ROW END NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidStartTime, ValidEndTime)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderHistory)); --Another future thought, put in a different schema? I think not, but, it intrigues me nonetheless.

    Next let's create a sequence of history rows, starting with a simple insert, then waiting a second between each of three sets of 5 updates.

    --create a first row
    INSERT INTO Sales.SalesOrder(SalesOrderId, Data)
    VALUES (1, DEFAULT);

    WAITFOR DELAY '00:00:01';
    GO
    --update the table 5 times
    UPDATE  Sales.SalesOrder
    SET     Data = DEFAULT
    WHERE   SalesOrderId = 1;
    GO 5
    WAITFOR DELAY '00:00:01';
    GO
    --update the table 5 more times
    UPDATE  Sales.SalesOrder
    SET     Data = DEFAULT
    WHERE   SalesOrderId = 1;
    GO 5
    WAITFOR DELAY '00:00:01';
    GO
    --update the table 5 last times
    UPDATE  Sales.SalesOrder
    SET     Data = DEFAULT
    WHERE   SalesOrderId = 1;
    GO 5

    Now, checking out the data:

    SELECT  *
    FROM    Sales.SalesOrder
    WHERE   SalesOrderId = 1;
    SELECT  *
    FROM    Sales.SalesOrderHistory
    WHERE   SalesOrderId = 1;

    The final state of the row has the Data column = '0000000010', and you can see the complete progression from '0000000001' through '000000000F'.

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    1            0000000010                     2016-05-02 02:53:12         9999-12-31 23:59:59

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    1            0000000001                     2016-05-02 02:53:09         2016-05-02 02:53:10
    1            0000000002                     2016-05-02 02:53:10         2016-05-02 02:53:10
    1            0000000003                     2016-05-02 02:53:10         2016-05-02 02:53:10
    1            0000000004                     2016-05-02 02:53:10         2016-05-02 02:53:10
    1            0000000005                     2016-05-02 02:53:10         2016-05-02 02:53:10
    1            0000000006                     2016-05-02 02:53:10         2016-05-02 02:53:11
    1            0000000007                     2016-05-02 02:53:11         2016-05-02 02:53:11
    1            0000000008                     2016-05-02 02:53:11         2016-05-02 02:53:11
    1            0000000009                     2016-05-02 02:53:11         2016-05-02 02:53:11
    1            000000000A                     2016-05-02 02:53:11         2016-05-02 02:53:11
    1            000000000B                     2016-05-02 02:53:11         2016-05-02 02:53:12
    1            000000000C                     2016-05-02 02:53:12         2016-05-02 02:53:12
    1            000000000D                     2016-05-02 02:53:12         2016-05-02 02:53:12
    1            000000000E                     2016-05-02 02:53:12         2016-05-02 02:53:12
    1            000000000F                     2016-05-02 02:53:12         2016-05-02 02:53:12

    Starting at the first second that was recorded, the first thing you can see is the row where Data = '0000000001':

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:09';

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    1            0000000001                     2016-05-02 02:53:09         2016-05-02 02:53:10


    But the next row you will see will not be '0000000002'; it will actually be '0000000006'. Fractional times are truncated to the column's precision, such as if we try '2016-05-02 02:53:09.9':

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:09.9';

    This returns the same thing that 2016-05-02 02:53:09 does:

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    1            0000000001                     2016-05-02 02:53:09         2016-05-02 02:53:10

    If you use the next second, you will get '0000000006':

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:10'

    This returns:

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    1            0000000006                     2016-05-02 02:53:10         2016-05-02 02:53:11

    The only way you will see rows '0000000001' - '0000000005' is to query the history table. We can only see rows where ValidStartTime <> ValidEndTime. In the following query I will get all of the rows that you can see in the table using each second:

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:09'
    UNION ALL
    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:10'
    UNION ALL
    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:11'
    UNION ALL
    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-05-02 02:53:12';

    Which returns:

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    1            0000000001                     2016-05-02 02:53:09         2016-05-02 02:53:10
    1            0000000006                     2016-05-02 02:53:10         2016-05-02 02:53:11
    1            000000000B                     2016-05-02 02:53:11         2016-05-02 02:53:12
    1            0000000010                     2016-05-02 02:53:12         9999-12-31 23:59:59

    Note that this is the same output you will see if you execute the following query that returns all data:

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME CONTAINED IN ('1900-01-01','9999-12-31 23:59:59.9999999');

    Or

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME ALL;

    So you will definitely want to set your precision to the level that gives you the best likelihood of seeing all changes in your data. Once multiple connections are making simultaneous changes, you wouldn't want to lose data that would be interesting to the user.
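
    As a follow-on query of my own (not from the original post), you can list exactly which versions were hidden by the one-second precision by looking for history rows whose truncated start and end times collide:

    SELECT SalesOrderId, Data, ValidStartTime, ValidEndTime
    FROM   Sales.SalesOrderHistory
    WHERE  ValidStartTime = ValidEndTime; --versions that no FOR SYSTEM_TIME AS OF query can return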

  • Just over a week until SQL Saturday Atlanta

    I am looking forward to being both a speaker and an attendee at this year's SQL Saturday in Atlanta (it is number 521… wow). Don't know what SQL Saturday is? (I doubt that if you are reading this blog, but if so, click here quickly my friend).

    My topic this year is concurrency:

    Let Me Finish... Isolating Write Operations

    OLTP databases can be constantly written to and reporting databases are written to at least periodically. In order to ensure consistent results, connections must be isolated from one another while executing, ideally with the lowest possible cost to concurrency.  How this isolation is handled is based on the isolation level, whether the classic lock based or the newer optimistic scheme of the in-memory OLTP engine is used, or even if both engines are enlisted in the same transaction. In this session we will look at examples of how SQL Server isolates reading and writing operations from other writing operations to explore how this may affect your application through error messages and performance hits.

    But beyond getting to teach about SQL for an hour, I look forward to a few things even more:

    1. Seeing so many friends - Many of whom I don't even know by name, just by their avatars on Twitter. Many I have known for years from SQL events. And even my cousin's husband will be in attendance, so I get to see them as well. So much time to hang out with friends at dinners, lunches, between sessions, etc.

    2. Sessions! - I personally go to these events to fill in gaps in my knowledge (and sometimes reinforcing my knowledge helps too!) Sessions like: Efficient Table Design by Alex Grinberg, Building Blocks of Cortana Intelligence Suite in Azure by Melissa Coates, Indexes from the Ground Level by Lindsay Clark, and definitely Vote or Die 2016: Which is Faster by Robert Verell (I saw it the first time, so I know the answers!).  And there is a Women In Tech lunch with Rie Irish that will definitely be interesting as well.  Of course, even if I say I am going to these sessions, if I make it to one or two it will be amazing. Too often I am distracted by some other session and end up learning something completely different.

    Will I see you there? (If not, maybe I will see you in Pensacola a few weeks later?)

  • Temporal Tables - Part 4 - Synchronizing changes across tables

    So way back in June of last year, when I started this series on Temporal Tables with Part 1 - Simple Single Table Example, Part 2 - Changing History, and even Part 3 - Synchronizing Multiple Modifications, I only referenced one table. In this entry, I want to get down to what will actually be a common concern: I want my objects to be consistent, not just at the current point in time, but throughout all points in time. I won't even try to mix this concern with changing history, but I imagine that could be a major undertaking, depending on the data you already have in any change log tables you created back when you were not concerned with viewing data at previous points in time.

    In Part 3, I looked at how the same row, or two rows in the same table, behave when modified together. Two tables behave much the same way, and there is a compelling reason to be cognizant of the temporal timestamps when you are dealing with multiple tables and want to see all of them as they existed at a given point in time.

    If you were writing your own versioning, you might use a trigger to write the change, and each change you wrote would be stamped with the time it was written. I won't cover the code you need for keeping a history of changes here (in this blog, that is; I will in the book), but it is a widely used pattern: when a row changes, copy the deleted rows from the trigger into a table that looks like the primary table, and record the time the change was made.
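
    Just to sketch the shape of that pattern (this is a minimal illustration of my own, not the book code, and it assumes a hypothetical non-temporal Sales.SalesOrder table plus a Sales.SalesOrderChangeHistory table with the same columns and an added ChangeTime column):

    CREATE TRIGGER Sales.SalesOrder$CaptureHistory
    ON Sales.SalesOrder
    AFTER UPDATE, DELETE
    AS
    BEGIN
        SET NOCOUNT ON;
        --copy the prior version of every modified row, stamped with the time of the change
        INSERT INTO Sales.SalesOrderChangeHistory (SalesOrderId, ControlTotal, ChangeTime)
        SELECT SalesOrderId, ControlTotal, SYSUTCDATETIME()
        FROM   deleted;
    END;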

    A problem with this is that if you want to see how the database looked at any point in time, you see the raw progression of changes, which can lead to inconsistent views of the data at a given instant. Say you have a SalesOrder and a SalesOrderLineItem table, and the SalesOrder has a ControlTotal that needs to match the sum of the line item values for the order. There really is no way to enforce a relationship like this between columns in different tables, because SQL Server does not have deferred (commit-time) constraints or triggers. I will use this scenario (which could be deemed a denormalization, depending on how it is viewed in the requirements), but not all issues will be quite so obvious.

    While we must trust the client to get the math correct or reject the change, if we are going to keep temporal versions of data it is highly important that we make sure the views across time are consistent (even down to the microsecond if we are using a very high granularity in our start and end time columns), in order to make coding easier. By making sure that our modifications are packaged in transactions, we can do this.

    First, let's create our tables (I will drop the SalesOrder table we had previously and add the ControlTotal column; note that you can't just drop a table with versioning on, you need to turn off versioning and then drop both the table and the history table):

    ALTER TABLE Sales.SalesOrder
        SET (SYSTEM_VERSIONING = OFF);
    go
    DROP TABLE Sales.SalesOrder;
    DROP TABLE Sales.SalesOrderHistory;
    GO

    Then I will recreate the table with a change from Data to ControlTotal, and then a LineItem table that has the line total, as well as foreign key constraint to the SalesOrder table:

    SET NOCOUNT ON;
    CREATE TABLE Sales.SalesOrder
    (
        SalesOrderId int NOT NULL CONSTRAINT PKSalesOrder PRIMARY KEY,
        ControlTotal DECIMAL(10,2) NOT NULL,
        ValidStartTime datetime2 (7) GENERATED ALWAYS AS ROW START NOT NULL,
        ValidEndTime datetime2 (7) GENERATED ALWAYS AS ROW END NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidStartTime, ValidEndTime)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderHistory)); --Another future thought, put in a different schema? I think not, but, it intrigues me nonetheless.
    GO

    CREATE TABLE Sales.SalesOrderLineItem
    (
        SalesOrderLineItemId INT NOT NULL CONSTRAINT PKSalesOrderLineItem PRIMARY KEY,
        SalesOrderId int NOT NULL CONSTRAINT FKSalesOrderLineItem$ContainsDetailsFor$SalesSalesOrder REFERENCES Sales.SalesOrder(SalesOrderId),
        LineItemNumber INT NOT NULL,
        LineItemTotal DECIMAL(10,2) NOT NULL,
        CONSTRAINT AKSalesOrderLineItem UNIQUE (SalesOrderId, LineItemNumber),
        ValidStartTime datetime2 (7) GENERATED ALWAYS AS ROW START NOT NULL,
        ValidEndTime datetime2 (7) GENERATED ALWAYS AS ROW END NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidStartTime, ValidEndTime)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderLineItemHistory));
    GO

    Now let's do a progression of data, leaving a two-second gap between operations to slow time down a little, like can happen in reality.

    INSERT  INTO Sales.SalesOrder (SalesOrderId, ControlTotal)
    VALUES  (1, 100);

    WAITFOR DELAY '00:00:02';

    INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
    VALUES  (1, 1, 1, 50);

    WAITFOR DELAY '00:00:02';

    INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
    VALUES  (2, 1, 2, 50);

    Now let's take a look at the data, just after the insert:

    SELECT  *
    FROM    Sales.SalesOrder
    WHERE   SalesOrderId = 1;
    SELECT  *
    FROM    Sales.SalesOrderHistory
    WHERE   SalesOrderId = 1;

    SELECT  *
    FROM    Sales.SalesOrderLineItem
    WHERE   SalesOrderId = 1;
    SELECT  *
    FROM    Sales.SalesOrderLineItemHistory
    WHERE   SalesOrderId = 1;

    We have 3 rows in the base tables, starting at three different times, and no history yet:


    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------
    1            100.00                                  2016-05-01 20:36:51.1670200 9999-12-31 23:59:59.9999999

    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------

    SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
    -------------------- ------------ -------------- --------------------------------------- --------------------------- ---------------------------
    1                    1            1              50.00                                   2016-05-01 20:36:53.2000286 9999-12-31 23:59:59.9999999
    2                    1            2              50.00                                   2016-05-01 20:36:55.2452606 9999-12-31 23:59:59.9999999

    SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
    -------------------- ------------ -------------- --------------------------------------- --------------------------- ---------------------------

    Now, even before we have any version history, the timestamps start to matter. Looking at the current data, there is no problem:

    SELECT ControlTotal, LineItemTotal
    FROM   Sales.SalesOrder
                JOIN Sales.SalesOrderLineItem
                    ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId

     
    The results look just like you expect. But what if we are doing temporal queries on the table?

    ControlTotal                            LineItemTotal
    --------------------------------------- ---------------------------------------
    100.00                                  50.00
    100.00                                  50.00

    As of the time of the first operation:

    DECLARE @asOfTime datetime2(7) = '2016-05-01 20:36:52';

    SELECT ControlTotal, LineItemTotal
    FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
                LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                    ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId;

    The data looks wrong:

    ControlTotal                            LineItemTotal
    --------------------------------------- ---------------------------------------
    100.00                                  NULL

    And at 20:36:54:

    DECLARE @asOfTime datetime2(7) = '2016-05-01 20:36:54';

    SELECT ControlTotal, LineItemTotal
    FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
                LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                    ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId;

    Things are better, but not quite right:

    ControlTotal                            LineItemTotal
    --------------------------------------- ---------------------------------------
    100.00                                  50.00

    Naturally, the likelihood of coming up against such a condition with one row over a two-second window is pretty slim, but the more concurrent operations you run against temporal tables, the more likely you are to get back some weird totals that you don't want. An even bigger concern is that users often need to fix data that is broken when you have these inter-table dependencies. If you are fixing data, you may need to fix history as well now (fixing history is an interesting topic that I may discuss some day; the central question will be based on requirements. It really depends on whether you want to see the data through time, or the information through time. Information should be correct. Data is what it was.)
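
    (As a rough sketch of what a history fix involves: the history table is read-only while versioning is on, so you have to suspend SYSTEM_VERSIONING, edit the history rows, and turn it back on. The corrected value below is hypothetical.)

    --A hedged sketch of fixing history (hypothetical corrected value).
    --The history table cannot be modified while SYSTEM_VERSIONING is ON.
    ALTER TABLE Sales.SalesOrderLineItem SET (SYSTEM_VERSIONING = OFF);

    UPDATE Sales.SalesOrderLineItemHistory
    SET    LineItemTotal = 25 --the value the row "should" have had
    WHERE  SalesOrderLineItemId = 1;

    ALTER TABLE Sales.SalesOrderLineItem
        SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderLineItemHistory));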

    This time, for the second SalesOrder,  I will make sure that all of the data is inserted and updated in the same transaction to ensure that the view of the data remains consistent:

    BEGIN TRANSACTION;

    INSERT  INTO Sales.SalesOrder (SalesOrderId, ControlTotal)
    VALUES  (2, 100);

    WAITFOR DELAY '00:00:02';

    INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
    VALUES  (3, 2, 1, 50);

    WAITFOR DELAY '00:00:02';

    INSERT  INTO Sales.SalesOrderLineItem (SalesOrderLineItemId, SalesOrderId, LineItemNumber, LineItemTotal)
    VALUES  (4, 2, 2, 50);

    COMMIT TRANSACTION

    Then checking the data:

    SELECT  *
    FROM    Sales.SalesOrder
    WHERE   SalesOrderId = 2;
    SELECT  *
    FROM    Sales.SalesOrderHistory
    WHERE   SalesOrderId = 2;

    SELECT  *
    FROM    Sales.SalesOrderLineItem
    WHERE   SalesOrderId = 2;
    SELECT  *
    FROM    Sales.SalesOrderLineItemHistory
    WHERE   SalesOrderId = 2;

     

    It is clear that no matter what AS OF time is used, you will not see illogical results, since all of the start times are the same, down to 7 decimal places of fractional seconds:

    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------
    2            100.00                                  2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999

    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------

    SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
    -------------------- ------------ -------------- --------------------------------------- --------------------------- ---------------------------
    3                    2            1              50.00                                   2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999
    4                    2            2              50.00                                   2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999

    SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
    -------------------- ------------ -------------- --------------------------------------- --------------------------- ---------------------------

    Finally, let's update the LineItemTotal of line item 1 to 25, and the order's ControlTotal to 75:

    BEGIN TRANSACTION;

    UPDATE Sales.SalesOrder
    SET  ControlTotal = 75
    WHERE SalesOrderId = 2;

    UPDATE Sales.SalesOrderLineItem
    SET LineItemTotal = 25
    WHERE SalesOrderId = 2
    AND LineItemNumber = 1;

    COMMIT TRANSACTION;

    Looking at the data and history:

    SELECT  *
    FROM    Sales.SalesOrder
    WHERE   SalesOrderId = 2;
    SELECT  *
    FROM    Sales.SalesOrderHistory
    WHERE   SalesOrderId = 2;

    SELECT  *
    FROM    Sales.SalesOrderLineItem
    WHERE   SalesOrderId = 2;
    SELECT  *
    FROM    Sales.SalesOrderLineItemHistory
    WHERE   SalesOrderId = 2;


    We see that the start times of the SalesOrder row and the updated line item match, but not the row with SalesOrderLineItemId = 4, as you would expect, since we did not apply any modification statement to that row:

    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------
    2            75.00                                   2016-05-01 20:52:41.6210321 9999-12-31 23:59:59.9999999

    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------
    2            100.00                                  2016-05-01 20:48:30.0154948 2016-05-01 20:52:41.6210321

    SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
    -------------------- ------------ -------------- --------------------------------------- --------------------------- ---------------------------
    3                    2            1              25.00                                   2016-05-01 20:52:41.6210321 9999-12-31 23:59:59.9999999
    4                    2            2              50.00                                   2016-05-01 20:48:30.0154948 9999-12-31 23:59:59.9999999

    SalesOrderLineItemId SalesOrderId LineItemNumber LineItemTotal                           ValidStartTime              ValidEndTime
    -------------------- ------------ -------------- --------------------------------------- --------------------------- ---------------------------
    3                    2            1              50.00                                   2016-05-01 20:48:30.0154948 2016-05-01 20:52:41.6210321

    Now we can check the data as of a few times, and see that things are consistent:

    At the original time of insert:

    DECLARE @asOfTime datetime2(7) = '2016-05-01 20:48:30.0154948';

    SELECT ControlTotal, LineItemTotal
    FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
                LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                    ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId
    WHERE SalesOrder.SalesOrderId = 2;

    Two rows returned, total matches line item totals:

    ControlTotal                            LineItemTotal
    --------------------------------------- ---------------------------------------
    100.00                                  50.00
    100.00                                  50.00

    At the time of the update:

    DECLARE @asOfTime datetime2(7) = '2016-05-01 20:52:41.6210321';

    SELECT ControlTotal, LineItemTotal
    FROM   Sales.SalesOrder FOR SYSTEM_TIME AS OF @asOfTime
                LEFT OUTER JOIN Sales.SalesOrderLineItem FOR SYSTEM_TIME  AS OF @asOfTime
                    ON SalesOrderLineItem.SalesOrderId = SalesOrder.SalesOrderId
    WHERE SalesOrder.SalesOrderId = 2;

    This returns:

    ControlTotal                            LineItemTotal
    --------------------------------------- ---------------------------------------
    75.00                                   25.00
    75.00                                   50.00

    I will leave it to you to try other times for yourself.

    One quick note: if I had updated SalesOrderLineItemId = 4, even to the same value, I would get version rows. Be really careful not to update rows repeatedly when nothing has changed. You will want to do what you can to avoid it, or you could end up with something like this:

    UPDATE Sales.SalesOrder
    SET    SalesOrderId = SalesOrderId
    WHERE  SalesOrderId = 2
    GO 10

    Beginning execution loop
    Batch execution completed 10 times.

    So now the row has been updated 10 times with no change to the data. The version history is now considerably larger:

    SELECT  *
    FROM    Sales.SalesOrderHistory
    WHERE   SalesOrderId = 2;

    SalesOrderId ControlTotal                            ValidStartTime              ValidEndTime
    ------------ --------------------------------------- --------------------------- ---------------------------
    2            100.00                                  2016-05-01 20:48:30.0154948 2016-05-01 20:52:41.6210321
    2            75.00                                   2016-05-01 20:52:41.6210321 2016-05-01 20:59:53.9903127
    2            75.00                                   2016-05-01 20:59:53.9903127 2016-05-01 20:59:54.0059626
    2            75.00                                   2016-05-01 20:59:54.0059626 2016-05-01 20:59:54.0215896
    2            75.00                                   2016-05-01 20:59:54.0215896 2016-05-01 20:59:54.0372406
    2            75.00                                   2016-05-01 20:59:54.0372406 2016-05-01 20:59:54.0372406
    2            75.00                                   2016-05-01 20:59:54.0372406 2016-05-01 20:59:54.0528342
    2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0528342
    2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0528342
    2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0528342
    2            75.00                                   2016-05-01 20:59:54.0528342 2016-05-01 20:59:54.0684602

    Very little change in the start times, but some nevertheless (you can see the start and end times do creep forward a little bit over the run). This could be a horrible feature to turn on if you have an interface (as most of us probably do) where, even when nothing has changed, the user can press save over and over, causing update after update. So definitely watch your history tables after you turn this feature on for a table, to make sure of what is occurring.
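
    One defensive pattern (just a sketch, not something the feature requires) is to make the UPDATE a no-op at the statement level when the values have not actually changed, so no version row is written:

    --A hedged sketch: skip the write (and therefore the version row) when nothing changed.
    --If the column were nullable, the comparison would also need IS NULL handling.
    UPDATE Sales.SalesOrder
    SET    ControlTotal = 75
    WHERE  SalesOrderId = 2
      AND  ControlTotal <> 75;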

  • Temporal Tables - Part 3 - Syncing Multiple Modifications

    Back last June, I started this series on temporal tables, and in my head, I had enough information to make an educated set of examples on how to use them. In the back of my mind though, I knew that I hadn't quite thought enough about the whole process, particularly when you have several rows (or as I will note in the next blog entry, tables) that you are going to work with as a unit.

    So in this blog, I want to look at the mechanics of how multiple operations in the same transaction behave when we are in a single table. Then in the next entry in the series, I will take it to the logical conclusion of how we manage things when we have an "object" (like a SalesOrder and SalesOrderLineItem) that needs to be bundled together.

    This is executing in RC3:

    select @@version

    --------------------------------------------------------------------------------------------------------------------------
    Microsoft SQL Server 2016 (RC3) - 13.0.1400.361 (X64)
        Apr  9 2016 01:59:22
        Copyright (c) Microsoft Corporation
        Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)

    First, create a table. Note that you don't have to use the names SysStartTime and SysEndTime for the period columns; those are the typical names because they are used in BOL, but a different name was used in the introductory example, so I used names more like my personal naming standards. I also noticed that the datetime2 column in that example was not a datetime2(7), but rather a datetime2(2). Later in this series, I will try out the different granularities to show what meaning they have for our queries as well.

    CREATE SCHEMA Sales
    GO

    CREATE TABLE Sales.SalesOrder
    (
        SalesOrderId int NOT NULL CONSTRAINT PKSalesOrder PRIMARY KEY,
        Data varchar(30) NOT NULL,    --just the piece of data I will be changing
        ValidStartTime datetime2(7) GENERATED ALWAYS AS ROW START NOT NULL,
        ValidEndTime datetime2(7) GENERATED ALWAYS AS ROW END NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidStartTime, ValidEndTime)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = Sales.SalesOrderHistory)); --Another future thought, put in a different schema? I think not, but, it intrigues me nonetheless.
    GO

    Scenario 1: Single row, in a transaction create a row, update it a few times

    Note: I won't demo rolling back, as it behaves exactly as expected.

    First, start a transaction, and insert some data.

    BEGIN TRANSACTION
    INSERT INTO  Sales.SalesOrder (SalesOrderID, Data)
    VALUES (1,'Original');

    Looking at the data, we see:
     
    SELECT *
    FROM    Sales.SalesOrder;
    SELECT *
    FROM   Sales.SalesOrderHistory;

    This returns:

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------
    1            Original                  2016-04-30 22:08:48.1200225 9999-12-31 23:59:59.9999999

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------

    No version history yet, naturally, just the data as we created it. Now, still in the transaction, update the row.

    --SELECT @@TRANCOUNT; --Check if you need to make sure!
    --WAITFOR DELAY '00:00:01'; -- Use a WAITFOR if you want to test this stuff in a single batch, or the times might always be the same.


    UPDATE Sales.SalesOrder
    SET    Data = 'First Change'
    WHERE  SalesOrderID = 1

    Then we check the data:

    SELECT *
    FROM   Sales.SalesOrder;
    SELECT *
    FROM   Sales.SalesOrderHistory;

    You can see a version has been created, but notice the start and end times are exactly the same:
     

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------
    1            First Change              2016-04-30 22:08:48.1200225 9999-12-31 23:59:59.9999999

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------
    1            Original                  2016-04-30 22:08:48.1200225 2016-04-30 22:08:48.1200225

    Also note that the start and end times on the base SalesOrder row have not changed either. Let's update the row again:
     
    UPDATE Sales.SalesOrder
    SET    Data = 'Second Change'
    WHERE  SalesOrderID = 1;

    SELECT *
    FROM   Sales.SalesOrder;
    SELECT *
    FROM   Sales.SalesOrderHistory;

    Still no changes to the timestamp. But we keep accumulating changes:

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------
    1            Second Change             2016-04-30 22:08:48.1200225 9999-12-31 23:59:59.9999999

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------
    1            Original                  2016-04-30 22:08:48.1200225 2016-04-30 22:08:48.1200225
    1            First Change              2016-04-30 22:08:48.1200225 2016-04-30 22:08:48.1200225

    Now we have completed what we wanted, so we commit:

    COMMIT

    Checking the results: effectively, since the intermediate versions never really existed (their start and end times are identical), you cannot see them using the FOR SYSTEM_TIME syntax in the FROM clause. You can verify this by using an AS OF time equal to the start timestamp, which is the only time that even looks like it might return the history:

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME  AS OF '2016-04-30 22:08:48.1200225';

    This returns:

    SalesOrderId Data                      ValidStartTime              ValidEndTime
    ------------ ------------------------- --------------------------- ---------------------------
    1            Second Change             2016-04-30 22:08:48.1200225 9999-12-31 23:59:59.9999999
             
    You just see the data. This is as we expected, knowing something of how this stuff works from Part 1. (Yeah, I had to look it up too). AS OF uses:

    SysStartTime <= PassedValue < SysEndTime (that is, rows where SysStartTime <= PassedValue AND SysEndTime > PassedValue, with SysStartTime and SysEndTime corresponding to whatever names you chose)

    So the changed rows are never seen unless you query the history table.
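
    To make that rule concrete, here is roughly (just a sketch) what the AS OF query is doing, written by hand against the current and history tables:

    --A hand-written rough equivalent of FOR SYSTEM_TIME AS OF (sketch only;
    --the built-in operator also has its own handling of zero-duration history rows).
    DECLARE @asOfTime datetime2(7) = '2016-04-30 22:08:48.1200225';

    SELECT *
    FROM   Sales.SalesOrder
    WHERE  ValidStartTime <= @asOfTime
      AND  ValidEndTime > @asOfTime
    UNION ALL
    SELECT *
    FROM   Sales.SalesOrderHistory
    WHERE  ValidStartTime <= @asOfTime
      AND  ValidEndTime > @asOfTime;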

    Scenario 2: Create two rows slightly apart, making sure the start times are different for the rows. Then start a transaction and update them both at different times.

    The point here is to see how you can use the fact that the start time values are synchronized with the transaction. If you have more than one row that needs to be treated as a group, ideally you modify them in a transaction already, if for no other reason than to be able to roll back the earlier operations when the last modification fails. Here though, we are starting to think temporally: if you don't synchronize your timestamps, you are apt to get illogical results at times. I will show this more in Part 4, where I have two tables with control values that need to match (denormalizations are another typical concern: if you expect column1 to match column2, and they are updated in two statements, the time gap in the temporal progression could give you weird results if you hit it just right.)

    First, let’s do this without a transaction

    INSERT INTO  Sales.SalesOrder (SalesOrderID, Data)
    VALUES (2,'Original');

    WAITFOR DELAY '00:00:01' --slowing down time makes showing concurrency problems easier so you can hit the gaps easier

    INSERT INTO  Sales.SalesOrder (SalesOrderID, Data)
    VALUES (3,'Original');

    and then verify the data is in the consistent state you expect:

    SELECT *
    FROM   Sales.SalesOrder
    WHERE  SalesOrderId IN (2, 3);
    SELECT *
    FROM   Sales.SalesOrderHistory
    WHERE  SalesOrderId IN (2, 3);

    This returns:

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    2            Original                       2016-04-30 22:13:19.1927715 9999-12-31 23:59:59.9999999
    3            Original                       2016-04-30 22:13:20.2358806 9999-12-31 23:59:59.9999999

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------

    Note that the start times are a bit over a second apart. Now start a transaction and update both rows in a way that shows the time they were changed. Wait a moment between updates, either manually or using WAITFOR as I have, but this time use a transaction:

    BEGIN TRANSACTION;

    UPDATE Sales.SalesOrder
    SET    Data = SYSDATETIME()
    WHERE  SalesOrderId = 2;

    WAITFOR DELAY '00:00:02'

    UPDATE Sales.SalesOrder
    SET    Data = SYSDATETIME()
    WHERE  SalesOrderId = 3;

    Now, look at the state of the table as you can see it:

    SELECT *
    FROM   Sales.SalesOrder
    WHERE  SalesOrderId IN (2, 3);
    SELECT *
    FROM   Sales.SalesOrderHistory
    WHERE  SalesOrderId IN (2, 3);

    The start times are now synchronized for the two rows, so the view of rows 2 and 3 will be consistent for this change, if not for the insert:

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    2            2016-04-30 17:17:16.0944986    2016-04-30 22:17:16.0944986 9999-12-31 23:59:59.9999999
    3            2016-04-30 17:17:18.0999866    2016-04-30 22:17:16.0944986 9999-12-31 23:59:59.9999999

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    2            Original                       2016-04-30 22:13:19.1927715 2016-04-30 22:17:16.0944986
    3            Original                       2016-04-30 22:13:20.2358806 2016-04-30 22:17:16.0944986

    The ValidStartTime for both rows is exactly the time that was put into the Data column of row 2 (the first row updated), allowing for the fact that the period columns are stored in UTC while SYSDATETIME() returned my local time. That is effectively the time the transaction started modifying data.
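
    Since the period columns are recorded in UTC, they can be translated for display with AT TIME ZONE (also new in SQL Server 2016); the time zone name below is just an assumption for my server, so substitute your own:

    --A sketch: show the UTC period column alongside a local rendering of it.
    --'Central Standard Time' is an assumed zone name; use the one for your server.
    SELECT SalesOrderId,
           Data,
           ValidStartTime,
           ValidStartTime AT TIME ZONE 'UTC' AT TIME ZONE 'Central Standard Time' AS ValidStartTimeLocal
    FROM   Sales.SalesOrder
    WHERE  SalesOrderId IN (2, 3);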

    Add one more row, with the SYSDATETIME() value for the Data column, which will let you see when the row was created:

    INSERT INTO  Sales.SalesOrder (SalesOrderID, Data)
    VALUES (4, SYSDATETIME());

    SELECT *
    FROM   Sales.SalesOrder
    WHERE  SalesOrderId IN (2, 3, 4);
    SELECT *
    FROM   Sales.SalesOrderHistory
    WHERE  SalesOrderId IN (2, 3, 4);

    The new row has the same ValidStartTime value as the other modified rows.

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    2            2016-04-30 17:17:16.0944986    2016-04-30 22:17:16.0944986 9999-12-31 23:59:59.9999999
    3            2016-04-30 17:17:18.0999866    2016-04-30 22:17:16.0944986 9999-12-31 23:59:59.9999999
    4            2016-04-30 17:18:17.5493927    2016-04-30 22:17:16.0944986 9999-12-31 23:59:59.9999999

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    2            Original                       2016-04-30 22:13:19.1927715 2016-04-30 22:17:16.0944986
    3            Original                       2016-04-30 22:13:20.2358806 2016-04-30 22:17:16.0944986

    Commit the changes:

    COMMIT

    This is actually really great, because you can effectively insert, update, and delete all of the rows at what the temporal history will reflect as the same time. I will use this in the next blog entry to deal with logical object level changes (SalesOrder and SalesOrderLineItems).

    Scenario 3: Single row, created, updated a few times, deleted.

    To show the final thing that will occur, I will, in a transaction, make a new row, change it, and toss it aside. You would think the row never really existed, but you would be wrong, really. It would have log entries, and it might even have been used to create other data. As such, it will still have history, which could be useful in some scenario I haven't yet figured out! (If you roll back the transaction, it really would never have existed.)

    BEGIN TRANSACTION

    INSERT INTO  Sales.SalesOrder (SalesOrderID, Data)
    VALUES (5, 'Original');

    UPDATE Sales.SalesOrder
    SET    Data = 'First Change'
    WHERE  SalesOrderID = 5

    UPDATE Sales.SalesOrder
    SET    Data = 'Second Change'
    WHERE  SalesOrderID = 5;

    DELETE Sales.SalesOrder
    WHERE SalesOrderId = 5;

    COMMIT

    Look at the data

    SELECT *
    FROM   Sales.SalesOrder
    WHERE  SalesOrderId = 5;

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------

    But in the history:

    SELECT *
    FROM   Sales.SalesOrderHistory
    WHERE  SalesOrderId = 5;

    SalesOrderId Data                           ValidStartTime              ValidEndTime
    ------------ ------------------------------ --------------------------- ---------------------------
    5            Original                       2016-04-30 22:24:45.5750619 2016-04-30 22:24:45.5750619
    5            First Change                   2016-04-30 22:24:45.5750619 2016-04-30 22:24:45.5750619
    5            Second Change                  2016-04-30 22:24:45.5750619 2016-04-30 22:24:45.5750619

    Everything we did is in the history. But no data for SalesOrderId = 5 will show up in any query on the table itself, temporal or otherwise:

    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME CONTAINED IN ('1900-01-01','9999-12-31 23:59:59.9999999')
    WHERE  SalesOrderId = 5;

    Returns nothing.
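
    As a side note (assuming the documented rule that FOR SYSTEM_TIME filters out rows with a zero-duration period holds for all of its sub-clauses), even FOR SYSTEM_TIME ALL should come up empty here; only the direct query of the history table, as above, shows these rows:

    --Hedged expectation: zero-duration rows are skipped by FOR SYSTEM_TIME, so this
    --should also return nothing for SalesOrderId = 5.
    SELECT *
    FROM   Sales.SalesOrder FOR SYSTEM_TIME ALL
    WHERE  SalesOrderId = 5;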

    This is all quite interesting, and I hope that it helps you as a reader someday when you want to try out some of these scenarios. In this blog I wanted to look at some esoteric situations for how temporal tables work, leading up to how you might need to think of things when you are modifying logical objects of data, rather than just one independent row at a time.

  • Row Level Security-Part 3-A few more advanced scenarios

    In this final entry (for now!) in my series on Row Level Security (Part 1, Part 2) I am going to show a couple of slightly more complex scenarios. I will only scratch the surface of what you could do, but keep in mind that performance is really important to how you make use of Row Level Security. In a future blog, I will build a couple of really large tables and try out Dynamic Data Masking and Row Level Security, as I am sure many other bloggers will as well.

    I will cover two main scenarios: first, the case where you have only one database principal context to work with, and second, setting up a table in which you specify that a user has rights to an individual row.

    Code for this third entry executed using:

    SELECT @@version

    ---------------------------------------
    Microsoft SQL Server 2016 (RC2) - 13.0.1300.275 (X64)   Mar 26 2016 03:43:12   Copyright (c) Microsoft Corporation  Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)

    Single Login

    In this first scenario, you have an app that only uses one login, but users have different roles that you want to separate out. SQL Server 2016 has added a connection-level function that allows you to set a context on the session that a system function can read. This is a practice I have used for many years with CONTEXT_INFO, but that was a single value that other code might trample on. SESSION_CONTEXT gives you a set of name-value pairs that you can read, allowing you to set variables for a connection.

    Basically, you can execute something like this on your connection:

    EXEC sys.sp_set_session_context @key = N'securityGroup', @value = 'BigHat';

    And then when you need the value, execute:

    SELECT SESSION_CONTEXT(N'securityGroup');

    You get the following back (truncated from the max 256 characters):

    ---------------
    BigHat

    Note: if you want to read more about the details of session_context, check out Aaron Bertrand's great blog here: https://www.mssqltips.com/sqlservertip/4094/phase-out-contextinfo-in-sql-server-2016-with-sessioncontext/

    Ok, so now we will drop the security policy and recreate the function using SESSION_CONTEXT instead of USER_NAME(), as we did in the previous blogs:

    DROP SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy;
    GO

    ALTER FUNCTION rowLevelSecurity.ManagedByUser$SecurityPredicate (@ManagedByUser AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS ManagedByUser$SecurityPredicate
                WHERE @ManagedByUser = SESSION_CONTEXT(N'securityGroup') --If the ManagedByUser = the securityGroup
                   OR (SESSION_CONTEXT(N'securityGroup') = N'MedHat' and @ManagedByUser <> 'BigHat')  --if the securityGroup is MedHat, and the row isn't managed by BigHat
                   OR (SESSION_CONTEXT(N'securityGroup') = 'BigHat') --Or the user is the BigHat person, they can see everything ;
                   OR (USER_NAME() = 'dbo')); --dbo gets it all


    Compare to our earlier predicate function from part 1 (link)

        RETURN (SELECT 1 AS ManagedByUser$SecurityPredicate
                WHERE @ManagedByUser = USER_NAME() --If the ManagedByUser = the database username
                   OR (USER_NAME() = 'MedHat' and @ManagedByUser <> 'BigHat') --if the user is MedHat, and the row isn't managed by BigHat
                    OR (USER_NAME() IN ('BigHat','dbo'))); --give 'dbo' full rights;

    The only difference is that we now only use USER_NAME() for the dbo user, and if you aren't a member of one of the security groups you will see nada.

    Now we recreate our security policy that lets people see data if they manage the row, and insert only to the security group they are set to.

    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
    ADD FILTER PREDICATE [rowLevelSecurity].[ManagedByUser$SecurityPredicate]([ManagedByUser]) ON Demo.saleItem,
    ADD BLOCK PREDICATE [rowLevelSecurity].[ManagedByUser$SecurityPredicate]([ManagedByUser]) ON Demo.SaleItem AFTER INSERT
    WITH (STATE = ON, SCHEMABINDING = ON);

    Bear in mind that you should usually not let a user do the following in ad-hoc SQL, because there is no control over the value parameter (other than perhaps limiting the group names); any user could put anything in if they can execute the procedure, as I am about to do.
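
    One way to keep control of the value (a sketch with a hypothetical procedure name and group list) is to wrap the call in a small gatekeeper procedure that validates the group before setting it:

    --A hedged sketch of a gatekeeper procedure; the name and the group list are hypothetical.
    CREATE PROCEDURE rowLevelSecurity.SecurityGroup$set
        @securityGroup sysname
    AS
    BEGIN
        SET NOCOUNT ON;

        IF @securityGroup NOT IN (N'BigHat', N'MedHat', N'SmallHat')
            THROW 50000, 'Invalid security group', 1;

        --@read_only = 1 keeps the key from being changed again on this connection
        EXEC sys.sp_set_session_context @key = N'securityGroup',
                                        @value = @securityGroup,
                                        @read_only = 1;
    END;

    You would then GRANT EXECUTE on the procedure to the application users. Note that @read_only = 1 would get in the way of the demos below, which reset the value repeatedly, but it is a useful safety for a real, pooled application connection.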

    First, using an undefined security group:

    EXEC sys.sp_set_session_context @key = N'securityGroup', @value = 'Undefined';
    SELECT * FROM Demo.SaleItem;


    Wait, what?

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat
    7           BigHat
    8           BigHat

    Ah yes, I am the dbo. Security testing is tricky! I will set context to SmallHat, one of the accounts we have been using.

    EXECUTE AS USER = 'SmallHat';

    Now try again!

    EXEC sys.sp_set_session_context @key = N'securityGroup', @value = 'Undefined';
    SELECT * FROM Demo.SaleItem;

    No rows are returned

    saleItemId  ManagedByUser
    ----------- ---------------

    Now let us try one of the roles we defined.

    EXEC sys.sp_set_session_context @key = N'securityGroup', @value = 'SmallHat';
    SELECT * FROM Demo.SaleItem;

    saleItemId  ManagedByUser
    ----------- ---------------
    5           SmallHat
    6           SmallHat

    The two rows expected. Next, the last two groups:

    EXEC sys.sp_set_session_context @key = N'securityGroup', @value = 'MedHat';
    SELECT * FROM Demo.SaleItem;
    EXEC sys.sp_set_session_context @key = N'securityGroup', @value = 'BigHat';
    SELECT * FROM Demo.SaleItem;

    SaleItemId  ManagedByUser
    ----------- ---------------
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat
    7           BigHat
    8           BigHat

    What is important to note here is that if you are doing this with a single connection, you don't necessarily have to disconnect, but it is imperative that the session context setting procedure is executed before every execution.

    Don't forget to revert back to dbo so you can do the next configuration if desired:

    REVERT;

    (Slightly More) Complex (But Not As Complex as It Could Be) Security Configurations

    Realistically, you can do almost any mapping you want. In this example, I just want to demonstrate using a table to hold an access control list where, in addition to the security we already have configured, I specifically say which extra rows my 2 users can see beyond the ones they manage, while leaving BigHat open to seeing all data.

    To start with, I will create a table of primary keys for the SaleItem, along with the security context names (I don’t need to grant rights to the users to read from the table, just like I don’t have to grant rights to execute the function.)

    CREATE TABLE rowLevelSecurity.SaleItemOverride
    (
        SaleItemId int NOT NULL,
        ManagedByUser nvarchar(15) NOT NULL,
        CONSTRAINT PKSaleItem PRIMARY KEY (SaleItemId, ManagedByUser)
    );


    I am going back to using USER_NAME() instead of the session context, but the concept is very much the same. Now we create some security data. In addition to the idea that the user gets to see all items that they are the managing user for, we are now giving them access to additional rows.

    INSERT INTO rowLevelSecurity.SaleItemOverride
    VALUES ( 1, 'SmallHat'), (2,'SmallHat'),(8,'MedHat');

    Now we add SaleItemId to the parameters, so we can check the row against the data in the table we have just created.

    DROP SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy;
    GO

    ALTER  FUNCTION rowLevelSecurity.ManagedByUser$SecurityPredicate (@saleItemId int, @ManagedByUser AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS ManagedByUser$SecurityPredicate
                WHERE @ManagedByUser = USER_NAME()
                   OR (USER_NAME() = N'MedHat' and @ManagedByUser <> 'BigHat')
                   OR USER_NAME() = 'BigHat'   
                  --adding on this predicate that says to check the user name to the managedbyUser column in the override table        
                   OR EXISTS ( SELECT 1
                                FROM   rowLevelSecurity.SaleItemOverride
                                WHERE (ManagedByUser = USER_NAME() --the user who is being given access to a row
                                       or USER_NAME() = N'MedHat') --or MedHat who has accesss to all that the small user has
                                  AND SaleItemId = @saleItemId)
                               );
    GO


    And now recreate the security policy, this time with two parameters for each function:

    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD FILTER PREDICATE [rowLevelSecurity].[ManagedByUser$SecurityPredicate](SaleItemId, ManagedByUser) ON Demo.saleItem,
        ADD BLOCK PREDICATE [rowLevelSecurity].[ManagedByUser$SecurityPredicate](SaleItemId, ManagedByUser) ON Demo.SaleItem AFTER INSERT
        WITH (STATE = ON, SCHEMABINDING = ON);


    With this recreated, let's try it out. We will first review what is in the override table, and then change to the SmallHat security principal and see what we get.

    SELECT *
    FROM   rowLevelSecurity.SaleItemOverride;

     
    This returns:

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           SmallHat
    2           SmallHat
    8           MedHat

    So for SmallHat, we should be able to see SaleItemId 1 and 2, no matter who the ManagedByUser is, along with any rows with ManagedByUser = SmallHat in the SaleItem table:

    EXECUTE AS USER = 'SmallHat';
    GO
    SELECT * FROM Demo.SaleItem;
    GO
    REVERT;


    Which we can.

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    5           SmallHat
    6           SmallHat

    Now let's see what happens with MedHat, who can also see SmallHat's rows, even the ones given to SmallHat through our security table:

    EXECUTE AS USER = 'MedHat';
    GO
    SELECT * FROM Demo.SaleItem;
    GO
    REVERT;

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat
    8           BigHat

    And if you execute the BigHat example, you will see that they do still have complete access.

    Naturally, this is not the end of the possible scenarios you could implement with row level security. You will be able to create complex examples, commonly not using individual users, but the groups that users are members of, and/or possibly your complex org chart, which might be a hierarchy. Using a hierarchy is one example that Books Online suggests is a bad idea, but check out the concept of the Kimball helper table as a way to keep your hierarchy along with a flattened version of it, which will make queries a lot faster if you can use a key lookup rather than some form of scan (in my test examples from my hierarchies presentation, with over 1/2 million nodes, I have seen amazing performance even on a laptop level machine).
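
    As a very rough sketch of that helper table idea (all of the names here are hypothetical, not from the examples above), you would pre-flatten the hierarchy into (viewer, viewable) pairs so the predicate becomes a simple seek:

    --A hedged sketch of a flattened hierarchy helper table; names are hypothetical.
    CREATE TABLE rowLevelSecurity.UserHierarchyFlattened
    (
        ViewerUser   sysname NOT NULL, --the user doing the querying
        ViewableUser sysname NOT NULL, --a user whose rows they may see
        CONSTRAINT PKUserHierarchyFlattened PRIMARY KEY (ViewerUser, ViewableUser)
    );
    GO
    CREATE FUNCTION rowLevelSecurity.ManagedByUser$HierarchyPredicate (@ManagedByUser AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS ManagedByUser$HierarchyPredicate
                FROM   rowLevelSecurity.UserHierarchyFlattened
                WHERE  ViewerUser = USER_NAME()
                  AND  ViewableUser = @ManagedByUser); --a single seek on the primary key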

    After I finish the book, I will be adding some deeper examples of hierarchies to the blog. For an example now of how Jamey Johnston (@statcowboy) has built a hierarchy to work with some data in the real world, check his blog, where he details the examples from his 2016 presentation: http://blog.jameyjohnston.com/oil-gas-sql-server-security-demo/.

  • Presenting on Concurrency Three Times This Month…Though Not Concurrently

    Tomorrow night, March 15, I will be at the Hampton Roads SQL Server User Group's March meeting; then at SQL Saturday Richmond on Saturday the 19th; then finally back home on the 25th for my home user group in Nashville (which I haven't seen much of this year) to give a presentation that I am pretty fond of, after having done it once before in Nashville, and 20 times for myself.

    Here is the abstract:

    Let Me Finish... Isolating Write Operations

    OLTP databases can be constantly written to and reporting databases are written to at least periodically. In order to ensure consistent results, connections must be isolated from one another while executing, ideally with the lowest possible cost to concurrency.  How this isolation is handled is based on the isolation level, whether the classic lock based or the newer optimistic scheme of the in-memory OLTP engine is used, or even if both engines are enlisted in the same transaction. In this session we will look at examples of how SQL Server isolates reading and writing operations from other writing operations to explore how this may affect your application through error messages and performance hits.

    What I like about it is that it takes the “fun” parts (okay, technical and nerdy parts) of my In-Memory DB Design session I had done several times last year (until SQL Server 2016 changed it beyond recognition) and makes it more applicable to more people. I cover the locking and optimistic concurrency controls that are in SQL Server 2016, and since RC0 has just arrived, I get to do it on what might be the release version of SQL Server.

    Hope to see you at one of these (and if not, I am scheduled to do this presentation again for Pensacola’s SQL Saturday in June).

  • Row Level Security–Part 2-Write Operations

    In the first entry in this series (to view, follow this link), I took a look at how row level security works with read operations. Now in this entry, I want to expand that to how you use it to limit what a user can write to, or delete from, a table.

    To recap our current situation, we have three users:

    CREATE USER BigHat WITHOUT LOGIN;--DROP IF EXISTS works with users as well as objects:

    CREATE USER MedHat WITHOUT LOGIN; --MediumHat, which we want to get all of SmallHat's rights, but not BigHats

    CREATE USER SmallHat WITHOUT LOGIN; -- gets a minimal amount of security

    A VERY simple table:

    CREATE TABLE Demo.SaleItem
    (
        SaleItemId    int CONSTRAINT PKSaleItem PRIMARY KEY,
        ManagedByUser sysname
    );

    With data that looks like this (the dbo can see all of the data).

    SELECT *
    FROM   Demo.SaleItem

    Which returns:

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat

    The following security policy has been implemented:

    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem
        WITH (STATE = ON, SCHEMABINDING = ON);

    Based on the following TVF:

    CREATE FUNCTION rowLevelSecurity.ManagedByUser$SecurityPredicate (@ManagedByUser AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS ManagedByUser$SecurityPredicate
                WHERE @ManagedByUser = USER_NAME()
                   OR (USER_NAME() = 'MedHat' and @ManagedByUser <> 'BigHat')
                   OR (USER_NAME() IN ('BigHat','dbo'))); --give 'dbo' full rights

    Now the goal is going to be to demonstrate how we can execute INSERTs, UPDATEs, and DELETEs on the table.

    First, I will grant INSERT, UPDATE, and DELETE rights on the table to all of the users we have set up:

    GRANT INSERT,UPDATE,DELETE ON Demo.SaleItem TO SmallHat, MedHat,BigHat;

    Before we start to look at the BLOCK predicates, let's take a look at what the user can do at this point, starting with an INSERT from the SmallHat user, using a ManagedByUser value that, as we configured back in Part 1, they cannot see:

    EXECUTE AS USER = 'SmallHat';
    GO
    INSERT INTO Demo.SaleItem (saleItemId, ManagedByUser)
    VALUES (7,'BigHat');
    GO
    SELECT * FROM Demo.SaleItem;
    SELECT COUNT(*) FROM Demo.SaleItem WHERE saleItemId = 7
    GO
    REVERT

    This returns:

    saleItemId  ManagedByUser
    ----------- ---------------
    5           SmallHat
    6           SmallHat

    -----------
    0

    Which would, no doubt in my mind, drive the end user nuts thinking "Where did my row go?" So in just a bit, we will fix things so this can't occur, if you don't want it to. In my contrived business rules, I will establish a case where we do want just this, so that the user can update a row into a state where it is no longer in their view. (A real version might be to hide soft deleted rows from most users, for example with a DeletedFlag bit NOT NULL column where a value of 1 hides the row. After RTM I will give that case a test when I test performance.)
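
    As a minimal sketch of that soft delete idea (assuming a hypothetical DeletedFlag bit column on the table, and keeping an admin exception), the predicate function might look something like this, attached to the table as a FILTER predicate on that column:

    --A hedged sketch of a soft delete filter; DeletedFlag is a hypothetical column.
    CREATE FUNCTION rowLevelSecurity.DeletedFlag$SecurityPredicate (@DeletedFlag bit)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS DeletedFlag$SecurityPredicate
                WHERE @DeletedFlag = 0                   --non-deleted rows are visible to everyone
                   OR USER_NAME() IN ('BigHat', 'dbo')); --admin users still see soft deleted rows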

    As the dbo:

    SELECT *
    FROM   Demo.SaleItem
    WHERE  saleItemId = 7;

    I can see it:

    saleItemId  ManagedByUser
    ----------- ---------------
    7           BigHat

    Next up, can we UPDATE or DELETE the row as the SmallHat user? It seems fairly obvious we cannot since we can’t see it in a WHERE clause of a SELECT, but it never hurts to check:

    EXECUTE AS USER = 'SmallHat';
    GO
    UPDATE Demo.SaleItem
    SET    ManagedByUser = 'SmallHat'
    WHERE  SaleItemId = 7; --Give it back to me!

    DELETE Demo.SaleItem
    WHERE  SaleItemId = 7; --or just delete it
    GO
    REVERT;
    GO
    SELECT *
    FROM   Demo.SaleItem
    WHERE  SaleItemId = 7;

    If you haven't turned the NOCOUNT setting on for your connection, you will see:


    (0 row(s) affected)

    (0 row(s) affected)

    SaleItemId  ManagedByUser
    ----------- ---------------
    7           BigHat

    So the FILTER predicate we previously established works on UPDATEs and DELETEs as well. Great. Now let's work on making sure that the user can't do something silly to the data they have in their view unless it serves an acceptable purpose.

    I will drop the existing security policy for the time being to demonstrate how the block predicate works. We will put the filter back in the very last part of the blog to meet the requirement of letting the user modify data to a state they can't see:

    DROP SECURITY POLICY IF EXISTS rowLevelSecurity.Demo_SaleItem_SecurityPolicy;

    Next we are going to add BLOCK predicates, which will block users from doing certain operations. There are two block types: AFTER and BEFORE.

    • AFTER - If the row would not match your ability to see the data after the operation, it will fail. Here we have INSERT and UPDATE. So in my example scenario, for INSERT SmallHat would not be able to insert a row that didn't have 'SmallHat' for the ManagedByUser. For UPDATE (with no before setting), SmallHat could update any row they can see to 'SmallHat', but not something else.
    • BEFORE - This seems like it is the same thing as the filter predicate, saying that if you can't see the row, you can't UPDATE or DELETE it, but there is a subtle difference. It says that no matter whether you can see the row, before you can modify it, the row must match the predicate. So in our case, if we added BEFORE UPDATE and dropped the FILTER predicate, SmallHat could see all rows, but only change the rows they manage.

    I am going to set one of the obvious (to me) set of row level security predicates that one might set in a realistic scenario for a managed by user type column.

    1. BLOCK AFTER INSERT, to say that if you wouldn't be able to see the row you are creating, you can't create it.
    2. BLOCK UPDATE and DELETE on rows you don't manage.
    3. Allow you to update a row to a manager that you cannot see, to enable you to pass the row to a colleague. Naturally some level of "are you sure" protection needs to be placed on the operation, because once you update it, the row will be gone from your view.

    So, using the security predicate function we already created, we apply the following:

    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.saleItem AFTER INSERT,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.saleItem BEFORE UPDATE,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.saleItem BEFORE DELETE
        WITH (STATE = ON, SCHEMABINDING = ON);


    Now, let's try again to add a row that SmallHat couldn't see:

    EXECUTE AS USER = 'SmallHat';
    GO
    INSERT INTO Demo.SaleItem (SaleItemId, ManagedByUser)
    VALUES (8,'BigHat');

    Nice try, it says, but:

    Msg 33504, Level 16, State 1, Line 171
    The attempted operation failed because the target object 'SimpleDemos.Demo.SaleItem' has a block predicate that conflicts with this operation. If the operation is performed on a view, the block predicate might be enforced on the underlying table. Modify the operation to target only the rows that are allowed by the block predicate.
    The statement has been terminated.

    (Notice that the error message includes that database and schema of the object too. Nice!)

    Now try again, with SmallHat as the ManagedByUser column value:

    INSERT INTO Demo.SaleItem (SaleItemId, ManagedByUser)
    VALUES (8,'SmallHat');

    This works, and now to show that it worked (still in the security context of the SmallHat user):

    SELECT * FROM Demo.SaleItem

    This returns:

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat
    7           BigHat
    8           SmallHat

    And we see all rows in the table because we dropped the FILTER predicate.

    Next, continuing in the security context of the SmallHat user, let's try the UPDATE and DELETE we tried earlier to SaleItemId 7:

    UPDATE Demo.SaleItem
    SET    ManagedByUser = 'SmallHat'
    WHERE  SaleItemId = 7;
    GO
    DELETE FROM Demo.SaleItem WHERE SaleItemId = 7;

    Two errors that look just like this:

    Msg 33504, Level 16, State 1, Line 211
    The attempted operation failed because the target object 'SimpleDemos.Demo.SaleItem' has a block predicate that conflicts with this operation. If the operation is performed on a view, the block predicate might be enforced on the underlying table. Modify the operation to target only the rows that are allowed by the block predicate.
    The statement has been terminated.

    Msg 33504, Level 16, State 1, Line 211
    The attempted operation failed because the target object 'SimpleDemos.Demo.SaleItem' has a block predicate that conflicts with this operation. If the operation is performed on a view, the block predicate might be enforced on the underlying table. Modify the operation to target only the rows that are allowed by the block predicate.
    The statement has been terminated.

    Showing we have stopped the user from modifying the rows, even though they can see them.

    Now, let's use the security hole we left: letting the user update the row without checking that the value matches the predicate AFTER the UPDATE operation, in this case changing the ManagedByUser column to another user.

    UPDATE Demo.SaleItem
    SET    ManagedByUser = 'BigHat'
    WHERE  SaleItemId = 8;

    SELECT * FROM Demo.SaleItem WHERE SaleItemId = 8;

    You can see that this was allowed:

    SaleItemId  ManagedByUser
    ----------- ---------------
    8           BigHat
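
    If you did not want to allow that kind of hand-off, the hole could be closed by also adding an AFTER UPDATE block predicate (a sketch; this would need to run after reverting to dbo):

    --A hedged sketch: an AFTER UPDATE block predicate stops a user from updating
    --a row into a state that the predicate would not allow them to see.
    ALTER SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser)
            ON Demo.SaleItem AFTER UPDATE;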

    Now, let's go back to the context of the dbo and add the FILTER predicate back to the security policy, to show how you add a predicate to an existing policy:

    REVERT;

    We can add a predicate using the same syntax, without knowing the other items that are in the policy.

    ALTER SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem;
    GO


    Now we have the following policy defined:

    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem AFTER INSERT,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem BEFORE UPDATE,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem BEFORE DELETE
        WITH (STATE = ON, SCHEMABINDING = ON);


    But what if we want to remove a predicate, in this example the BEFORE predicates that are now redundant to the FILTER one we just added back? Note that the individual predicates have no names, so, for example, to drop the two BEFORE BLOCK predicates on Demo.SaleItem:

    ALTER SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        DROP BLOCK PREDICATE ON Demo.SaleItem BEFORE UPDATE,
        DROP BLOCK PREDICATE ON Demo.SaleItem BEFORE DELETE;


    I rather expect that in many cases it will be easier to drop and recreate the policy you want, but the ability to remove individual predicates is there. You can get away with only specifying the predicate type because, as we mentioned in the previous entry in this series, you can only have one predicate of a given type on each table.

    In this blog, I wanted to cover the range of things one might do with one table and a fairly simple predicate function. In the next entry, I will get a bit more complex with the functions you can build to apply, including accessing other tables. I won't cover any more about the DDL of the CREATE SECURITY POLICY statement, just note that it does not have to center on one table. The following is certainly possible:

    CREATE SECURITY POLICY rowLevelSecurity.MultipleTables
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.Sale,
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.Sale AFTER INSERT,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem BEFORE UPDATE,
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.UserName BEFORE DELETE
        WITH (STATE = ON, SCHEMABINDING = ON);


    I don't believe there is any real advantage to doing it this way versus separate policies, other than the convenience of being able to turn the policies off in one place (and I could see error troubleshooting being more difficult). I will update the blog if this turns out not to be correct.

  • How Sure Are You of Your PASS Summit Abstract?

    You know who you are. You have a presentation idea that has been percolating since last October. You have asked friends and coworkers if it is a good idea. You may have presented it at 10 SQL Saturdays to thunderous appreciation. You may have even tried to submit last year but were turned down. You have this abstract written and may have even entered it on the Call for Speakers page, and you think the idea is great, but you just aren't sure that the abstract is good enough. Who can you turn to?

    PASS has a service that can help you take the concern about the quality of your abstract out of the equation. It is called Abstract Coaching and it is completely free for any PASS member thinking about submitting for the Summit.

    You fill out the Microsoft Word form, which you can find here on the PASS website, containing almost everything that your Summit submission will have:

    • Title - A name to tell the potential attendee what the session is about
    • Length – The length of your session. Whether you have a 75 minute General Session or are trying for a Full Day Pre-Conference session, we will review your abstract.
    • Level - How technical the audience will need to be to attend this session
    • Track - The major area of concentration for the session, such as DBA, developer, BI, etc.
    • Prerequisites * - Here you will list what you expect attendees to your session to already understand before attending. Every session needs these unless a person who wanders off the street could attend and understand what is going on.
    • Goals * - 3 big concepts you are trying to get across to the attendee to help the selection committee really understand what your plan is for the presentation
    • Abstract - A short paragraph or so that explains the session in enough detail that the selection committee and prospective attendees can understand what they will be getting. This along with the title will be the primary "advertisement" for the session.

    Note: Items with an * are for the selection committee only, and will not be shared with the attendees.

    The document will also list some basic instructions and restrictions. It might seem like a lot to fill out, but going through the exercise will help you flesh out your idea and give the commenters more understanding about what you are wanting to say. After filling out the form, simply email it in to the address included in the form and a few days later you will have an analysis of your submission.

    One thing you may notice is that your name and bio are not included in the coaching session document. The service is more or less anonymous, with the coaching team leader being the only person who will see your name connected to your submission. Your abstract will be sent anonymously to a commenter and returned the same way. The entire team is under NDA about your name and the details of your abstract as well.

    There are a few restrictions to be aware of. The service offers no guarantees or promises of being selected, just one honest opinion of your abstract in terms of:

    • Grammar - Misspellings and other poor writing skills have tanked more great submission concepts than speaking skills ever will.
    • Possible Technical Concerns - Your abstract will be reviewed by someone who is a good writer, but also is a speaker and leader in technology as well.
    • Cohesion of the Submission Parts - Here we are looking at how well all of the parts of your abstract fit together. Quite often, a submission will have a title like "How to Do X" but will not even mention what X is in the goals or abstract leading to great confusion about what the presentation would be about.
    • Pretty much anything else the reviewer wants to share - Commenters will do their best to give you any comments about anything that is not out of bounds (mostly things like how likely you are to get chosen). 

    In the end, the Coaching team’s goal is to help you as much as possible, so you can worry about what most database professionals worry about most: the concept, the demos, and not oversleeping on the day of your presentation.

    Helpful links:

  • Row Level Security – Part 1–Simple Read operations

    This is part 1 of my Row Level Security blogs for my upcoming book project.

    Row Level Security is a very interesting feature. Unlike Dynamic Data Masking (link), which looks interesting on its face but, once you dig in, turns out to be limited by how granular you can make the security, Row Level Security is very configurable and looks to be limited mainly by possible performance concerns (and there may be a method or two to attack it, but I haven’t been able to make them work in CTP 3.2 yet).

    For the simple case, we will create a simple table-valued user-defined function (TVF) that returns either a single row containing 1, or no rows: a row with 1 means show the row, no row means hide it. The function's parameters will be mapped to one or more columns of the table being secured. This means that for every row that could be returned to the user from any query, this function will be executed. If that sounds like something that is generally considered a bad idea, it is. But to be fair: 1) this is a simple inline TVF, so it behaves more or less like a subquery, and 2) other methods of implementing row level security suffer from the same sort of issue.

    In the next (at least) three blogs, I will cover row level security in some detail, with examples of several ways it can be used. Later in my experimentation with features for the book, I will attempt to use the feature with a very large dataset for performance testing, but I want to wait until after RTM for that exercise. (Or let's be real, Aaron Bertrand or another of my MVP friends will probably have beaten me to the subject, which will be easier for me anyhow!)

    The following is the version of SQL Server I used for this series of blogs until I state otherwise:

    select @@version

    Microsoft SQL Server 2016 (CTP3.2) - 13.0.900.73 (X64)
        Dec 10 2015 18:49:31
        Copyright (c) Microsoft Corporation
        Enterprise Evaluation Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor)

    To get started you will need to go to a SQL Server 2016 database (I am using the same one I created for the Dynamic Data Masking example), and we are going to drop and create three login-less users (in real scenarios, use roles) so we know they are starting fresh, security-wise:

    DROP USER IF EXISTS BigHat; --DROP IF EXISTS works with users as well as objects:
    CREATE USER BigHat WITHOUT LOGIN;

    DROP USER IF EXISTS MedHat; --MediumHat, which we want to get all of SmallHat's rights, but not BigHat's
    CREATE USER MedHat WITHOUT LOGIN;

    DROP USER IF EXISTS SmallHat; --gets a minimal amount of security
    CREATE USER SmallHat WITHOUT LOGIN;


    Next, the row level security objects are best placed in their own schema, for security purposes: if you give a user access to a regular user schema, they won’t automatically have access to this one. (This paradigm fails if you use db_datareader and db_datawriter, particularly since later we will be adding some tables to this schema when we get to the more “creative” solutions.) I will make a very obviously named schema:

    CREATE SCHEMA rowLevelSecurity;

    Now we create the function that will be used to decide, for a given row, whether the user should be able to see it. This function will later be applied to the table through one or more security policies.

    For the tables we will create for the demos, I will include a "ManagedByUser" column, which will hold the database user name of the user who manages the row. This will be the parameter passed in to my function.

    The predicate function needs to implement the following rules:

    1. BigHat sees all rows
    2. SmallHat sees only rows where ManagedByUser = 'SmallHat' (we will expand this in later blogs, but we are starting here)
    3. MedHat sees rows where ManagedByUser = 'MedHat' or any other value except 'BigHat', allowing for other low-rights users to be created in the future

    So I will create the following function:

    CREATE FUNCTION rowLevelSecurity.ManagedByUser$SecurityPredicate (@ManagedByUser AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS ManagedByUser$SecurityPredicate
                WHERE @ManagedByUser = USER_NAME() --If the ManagedByUser = the database username
                   OR (USER_NAME() = 'MedHat' and @ManagedByUser <> 'BigHat') --if the user is MedHat, and the row isn't managed by BigHat
                   OR (USER_NAME() = 'BigHat')); --Or the user is the BigHat person, they can see everything


    You don't need to give users rights to the function for the security policy to work, but in order to test the function, I will grant access temporarily:

    GRANT SELECT ON rowLevelSecurity.ManagedByUser$SecurityPredicate TO PUBLIC; --testing only

    Now, we can test the function in the security context of each user

    EXECUTE AS USER = 'smallHat';
    GO
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('BigHat');
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('MedHat');
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('SmallHat');
    GO
    REVERT;

    Returns nothing, nothing, 1; indicating that the user would be able to see SmallHat rows only.

    EXECUTE AS USER = 'medHat';
    GO
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('BigHat');
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('MedHat');
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('SmallHat');
    GO
    REVERT;

    Returns nothing, 1, 1; indicating that the user would be able to see SmallHat and MedHat rows, but not BigHat's.

    EXECUTE AS USER = 'bigHat';
    GO
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('BigHat');
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('MedHat');
    SELECT * FROM rowLevelSecurity.ManagedByUser$SecurityPredicate('SmallHat');
    GO
    REVERT;

    Returns 1, 1, 1; indicating that the user would be able to see all rows. So let's remove the rights on the function, and create our sample table

    REVOKE SELECT ON rowLevelSecurity.ManagedByUser$SecurityPredicate FROM PUBLIC; --was for testing only

    Just a very simple table, and grant select rights to our three users:

    /* create the schema if you don't have it yet:
    CREATE SCHEMA Demo;
    */
    CREATE TABLE Demo.SaleItem
    (
        SaleItemId    int CONSTRAINT PKSaleItem PRIMARY KEY,
        ManagedByUser nvarchar(15) --more typically would be sysname, but nvarchar(15) is easier to format for testing
    );
    INSERT INTO Demo.SaleItem
    VALUES (1,'BigHat'),(2,'BigHat'),(3,'MedHat'),(4,'MedHat'),(5,'SmallHat'),(6,'SmallHat');
    GO
    GRANT SELECT ON Demo.SaleItem TO SmallHat, MedHat, BigHat;

    At this point, each of these users can see every row in the table. We are going to change that quickly.
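
    If you want to see that for yourself, here is a quick optional check (just a sketch using the users and table we created above):

    EXECUTE AS USER = 'SmallHat';
    GO
    SELECT * FROM Demo.SaleItem; --before any security policy exists, all six rows are returned
    GO
    REVERT;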

    We are going to create a SECURITY POLICY object (note that it is schema owned), with one FILTER PREDICATE, which is used to filter read access to rows. In the next blog in the series, we will see how to protect against writes and deletes to certain rows as well.

    --simple, data viewing filter
    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem
        WITH (STATE = ON); --go ahead and make it apply
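
    Conceptually (and this is only a sketch of the semantics, not the literal plan SQL Server produces), the filter predicate behaves as though every query against the table were rewritten to include an EXISTS test against our inline TVF:

    --conceptual equivalent only; you never write this yourself
    SELECT si.*
    FROM   Demo.SaleItem AS si
    WHERE  EXISTS (SELECT 1
                   FROM   rowLevelSecurity.ManagedByUser$SecurityPredicate(si.ManagedByUser) AS p);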

    Note that you can have other types of predicate on the same table (which we will discuss in the next blog), as well as predicates on different tables in the same security policy, as sketched below. Whether you want to combine them will likely depend on how much control you want over turning a policy off for some purpose, and how you manage your DDL. If I discover there to be particular value in either direction, I will update this blog.
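
    For example, here is a sketch of what a multi-table policy could look like. Demo.SaleItemHistory is a hypothetical table with its own ManagedByUser column, and this policy would be created in place of (not in addition to) the single-table policy we just made, since a table can only be covered by one enabled filter predicate:

    --sketch only; Demo.SaleItemHistory is hypothetical, and the existing policy would need to be dropped first
    CREATE SECURITY POLICY rowLevelSecurity.Demo_Combined_SecurityPolicy
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser)
            ON Demo.SaleItem,
        ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser)
            ON Demo.SaleItemHistory
        WITH (STATE = ON);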

    You can only have one enabled filter predicate per table. If you try to add another, you will get one of the following messages, depending on whether you try to create two in the same policy or in different policies:

    Msg 33262, Level 16, State 1, Line 146
    A FILTER predicate for the same operation has already been defined on table 'Demo.SaleItem' in the security policy 'rowLevelSecurity.Demo_SaleItem_SecurityPolicy'.

    Msg 33264, Level 16, State 1, Line 146
    The security policy 'rowLevelSecurity.Demo_SaleItem_SecurityPolicy2' cannot be enabled with a predicate on table 'Demo.SaleItem'. Table 'Demo.SaleItem' is already referenced by the enabled security policy 'rowLevelSecurity.Demo_SaleItem_SecurityPolicy'.
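
    If you want to check which policies and predicates already cover a table, one way to look (a sketch using the new-in-2016 catalog views sys.security_policies and sys.security_predicates) is:

    SELECT sp.name AS policy_name,
           sp.is_enabled,
           OBJECT_SCHEMA_NAME(pr.target_object_id) + '.' + OBJECT_NAME(pr.target_object_id) AS target_table,
           pr.predicate_definition
    FROM   sys.security_policies AS sp
             JOIN sys.security_predicates AS pr
                ON sp.object_id = pr.object_id;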

    Next, let's test our predicates by executing select statements on the table we have created in the context of our three users:

    EXECUTE AS USER = 'SmallHat';
    GO
    SELECT * FROM Demo.SaleItem;
    GO
    REVERT;

    Which returns:

    SaleItemId  ManagedByUser
    ----------- ---------------
    5           SmallHat
    6           SmallHat

    Which matches our desired outcome based on the business rules. Next, for MedHat:

    EXECUTE AS USER = 'MedHat';
    GO
    SELECT * FROM Demo.SaleItem;
    GO
    REVERT;

    Which returns all of MedHat's rows, plus SmallHat's:

    SaleItemId  ManagedByUser
    ----------- ---------------
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat

    And finally, the BigHat:

    EXECUTE AS USER = 'BigHat';
    GO
    SELECT * FROM Demo.SaleItem;
    GO
    REVERT;


    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat

    Awesome, now, let's compare this to the contents of the table as the dbo sees it:

    SELECT * FROM Demo.SaleItem;

    Returns:

    SaleItemId  ManagedByUser
    ----------- ---------------

    Well, that was less successful than I expected. As the sa/dbo, I can't see any of the data. This runs counter to the common practice/thinking that these admin users/roles are not subject to security. In MOST cases, the DBA will want to include the dbo user in the TVF that the security policy is based on. I will change this now for our demos:

    DROP SECURITY POLICY IF EXISTS rowLevelSecurity.Demo_SaleItem_SecurityPolicy; --if exists helps when debugging!
    go
    ALTER FUNCTION rowLevelSecurity.ManagedByUser$SecurityPredicate (@ManagedByUser AS sysname)
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN (SELECT 1 AS ManagedByUser$SecurityPredicate
                WHERE @ManagedByUser = USER_NAME()
                   OR (USER_NAME() = 'MedHat' and @ManagedByUser <> 'BigHat')
                   OR (USER_NAME() IN ('BigHat','dbo'))); --give 'dbo' full rights
    GO
    CREATE SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
    ADD FILTER PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser) ON Demo.SaleItem
    WITH (STATE = ON);


    NOW we compare our output with the output of the BigHat query:

    SELECT * FROM Demo.SaleItem;

    And we get the data happiness we expected:

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat

    For the final read operation demonstration, let's see how it works from a stored procedure.

    CREATE PROCEDURE Demo.SaleItem$select
    AS
        SET NOCOUNT ON; 
        SELECT USER_NAME(); --Show the userName so we can see the context
        SELECT * FROM  Demo.SaleItem;
    GO
    GRANT EXECUTE ON   Demo.SaleItem$select to SmallHat, MedHat, BigHat;


    Now execute the procedure as the different users (I am only going to include SmallHat to avoid being overly repetitive; try the others out for yourself):

    EXECUTE AS USER = 'SmallHat';
    GO
    EXEC Demo.SaleItem$select;
    GO
    REVERT;

    Which returns:

    -----------------------------
    SmallHat

    SaleItemId  ManagedByUser
    ----------- ---------------
    5           SmallHat
    6           SmallHat

    This shows us that row level security works as expected: ownership chaining does not come into play for the selection of rows, but it does come into play for running the TVF that determines which rows can be seen. In a later entry in this series, I will show how you can use a table in the function if you can't simply code the query to use only system functions.

    So how could you override it? Just like in the Dynamic Data Masking examples, use WITH EXECUTE AS in the procedure code to elevate to a different user's rights (and we will see some other possible solutions in the third entry, where I will show some methods of using values other than the simple system functions).

    ALTER PROCEDURE Demo.SaleItem$select
    WITH EXECUTE AS 'BigHat' --use a similar user/role, and avoid dbo/owner if possible to avoid security holes.
    AS
        SET NOCOUNT ON;
        SELECT USER_NAME();
        SELECT * FROM Demo.SaleItem;

    Now execute this and note the output:

    EXECUTE AS USER = 'smallHat'
    go
    EXEC Demo.SaleItem$select
    GO
    REVERT

    Not only do you get the elevated rights of that user for objects they own, you now look like that user to the USER_NAME() function, which is good for this example:

    ----------------------------------
    BigHat

    SaleItemId  ManagedByUser
    ----------- ---------------
    1           BigHat
    2           BigHat
    3           MedHat
    4           MedHat
    5           SmallHat
    6           SmallHat

    The downside here is that if you are using USER_NAME() for any sort of logging purpose, this might be a bad thing. If you are using the user's context for logging columns like RowLastModifiedByUser, consider using ORIGINAL_LOGIN(), which will always return the server principal that the user connected to the server with.
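
    As a quick sketch of the difference (run while impersonating a user, just as we did above):

    EXECUTE AS USER = 'SmallHat';
    GO
    SELECT USER_NAME()      AS DatabaseUserName, --the impersonated database user (would be BigHat inside our WITH EXECUTE AS procedure)
           ORIGINAL_LOGIN() AS OriginalLogin;    --the login you actually connected to the server with
    GO
    REVERT;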

    In the next entry, I will be covering the predicates which apply to INSERT, UPDATE, and DELETE statements, called BLOCK PREDICATES, which block write actions from occurring on the objects in the security policy.
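
    As a preview of the syntax shape (a sketch only; we will work through it properly in part 2), a block predicate is added to the same kind of security policy, tied to a specific write operation:

    --sketch only, not part of this entry's demo
    ALTER SECURITY POLICY rowLevelSecurity.Demo_SaleItem_SecurityPolicy
        ADD BLOCK PREDICATE rowLevelSecurity.ManagedByUser$SecurityPredicate(ManagedByUser)
            ON Demo.SaleItem AFTER INSERT;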

    If you want to see where some of my material came from (along with blind experimentation!), here are several links I found useful in writing about this subject, in case you can’t wait until part 2 (and why should you?):

    Links:
    http://www.infoq.com/news/2015/06/SQL-Server-Row-Level-Security
    https://msdn.microsoft.com/en-us/library/dn765135.aspx
    https://www.mssqltips.com/sqlservertip/4094/phase-out-contextinfo-in-sql-server-2016-with-sessioncontext/
    https://www.mssqltips.com/sqlservertip/4004/sql-server-2016-row-level-security-introduction/

  • Target Audience - SQL Saturday 489 Tampa - “How to Design a Relational Database” Precon

    A few days back, I blogged an intro to my pre-con on How to Design a Relational Database. You can see that blog here and you can (must?) go see the abstract and register: here.

    But who exactly is the target audience? I think there are really four reasonable answers, plus one answer that I feel says it all:

    Beginner Data Architects

    If you want to be a data architect and have at least some SQL skills, you will probably not get so lost as to feel left out the entire day. I start at the beginning, introducing all of the concepts of a database, and we are going to do our work in groups, so you will get some experience, if brief, in the database design process. I take this cue from my first Kimball University class on data warehousing, which I used as inspiration for the first pre-con I did (which was pretty much the same as this one, plus or minus 20% from experience). I was lost during parts of that class since I was new to data warehousing at the time, but there was a nice mix of people in the class from beginner to near-expert, and the experience has paid many dividends throughout the years since, as I have done more and more data warehousing design and implementation along with my more natural relational database designs.

    Intermediate Data Architects

    While beginners may not have seen some of the material we will cover, intermediate data architects have probably done some of it already. When I was at this point in my career, my goal was to make sure I was doing things the right way (sometimes I was, sometimes I wasn’t!). You are often mentored by people who have all sorts of ideas about what is right, coupled with websites and databases that ship with products (I am looking at you, AdventureWorks!), not to mention you may have taken a class or two in college on the subject. In this class, my goal is to provide a practical discussion about database design that can help clarify some of the things you have seen. That, along with the ability to ask me questions and discuss possible solutions to the teamwork assignment, will hopefully clarify how to design a database using some patterns you may not have heard of, or, to be honest, that you have heard are a bad idea but may not be.

    Since this is only a 7-hour class, we won’t have a ton of time to discuss your specific needs, but if you can fashion your questions in a generic manner, we can discuss some ideas as a class. If you have a design you want me to look at after the class, just send me some requirements and a design and I will be happy to discuss it a bit more once the class is over. I want to make sure you get your money’s worth, and I usually learn something new from the insane problems one or more of the attendees have to deal with.

    Skilled Data Architects

    This is the most difficult group to mention, but here is what I endeavor to do for members of this group: I would enjoy attending this session myself even if I were not the one doing the talking. Worst case: you know everything, get to do some fun database design, and don’t have to work on a Friday.

    DBA/Developers Who Don’t Focus on SQL

    One group that I think needs a class like this as much as anyone is the people who use relational databases but don’t really focus on them for most of their work. This group tends to either write or support SQL, and needs to know why a design should look a given way. “Why are there so many tables? Wouldn’t it be easier to just have one table that held all of the data?” Even if you won’t apply these skills as your primary focus, it is good to understand the fundamentals and participate in a design session or two. It is the reason I often go to reporting tool and programming sessions at conferences: to make sure I understand the other person’s job that I am affecting.

    Everyone (aka the one answer to rule them all)

    Finally, let’s be clear about one thing. If you can (or can get your boss to) fork out just shy of $160, I will promise you at least a tiny bit of entertainment and the chance to meet a new friend or two. Best case, you learn how easy it is to design a great database and start a new career that lets you enjoy every day at the office. Worst case, you didn’t have to work on a Friday and you help offset the cost of my vacation to Disney World when I am finished with SQL Saturday. Either way, it is pretty much a no-lose proposition for us both.

    Hopefully I will see several of my blog readers down in Tampa for this event on the 26th of February. I know I am pretty excited about it!
