SQLblog.com – The SQL Server blog spot on the web

Rob Farley

- Owner/Principal with LobsterPot Solutions (a Microsoft Gold Partner consulting firm), Microsoft Certified Master, Microsoft MVP (SQL Server), APS/PDW trainer and leader of the SQL User Group in Adelaide, Australia. Rob is a former director of PASS, and runs training courses around the world on SQL Server and BI topics.

  • Will 2015 be a big year for the SQL community?

    In Australia, almost certainly yes.

    Australia recently saw two Azure data centres open, meaning that customers can now consider hosting data in Azure without worrying about it going overseas. Whether you’re considering SQL Database or having an Azure VM with SQL on it, the story has vastly improved here in Australia, and conversations will go further.

    The impact of this will definitely reach the community…

    …a community which is moving from strength to strength in itself.

    I say that because in 2014 we have seen new PASS Chapters pop up in Melbourne and Sydney (user groups that have existed for some time but have now been aligned with PASS); many of the prominent Australian partner organisations have MVPs on staff now, which was mentioned a few times at the Australian Partner Conference in September; and SQL Saturdays have come a long way since the first ones were run around the country in 2012. February will see SQL Saturday 365 in Melbourne host around 30 sessions, building on its 2013 effort of becoming one of the ten largest SQL Saturday events in the world. Microsoft Australia seems more receptive than ever to the SQL Server community, and I’m seeing individuals pushing into the community as well.

    From a personal perspective, I think 2015 will be an interesting year. As well as being a chapter leader and regional mentor, I know that I need to develop some new talks, after having my abstracts for the PASS Summit rejected, but I also want to take the time to develop other speakers, as I have done in recent years.

    I also want to write more – both blogs and white papers. I’ve blogged every month for at least five years, but many months that’s just the T-SQL Tuesday post. (Oh yeah – this post is one of those, too, hosted by Wayne Sheffield (@DBAWayne) on the topic of ‘Giving Back’.) So I want to be able to write a lot more than 12 posts in the year, and take the opportunity to get deeper into the content. I know I have a lot to talk about, whether it be in the BI space, or about query plans, or PDW, or security – there really are a lot of topics I could cover – I just need to reserve the time to get my content out there.

    So challenge me. If you want help with an abstract, or a talk outline (which I know is very different to an abstract), or you want me to blog on a particular topic, then let me know and I’ll see what I can do. I want to give even more to the community, and if you’re in the community, that should include you!

    @rob_farley

  • Minimising Data Movement in PDW Using Query Optimisation Techniques

    This is a white paper that I put together recently about APS / PDW Query Optimisation. You may have seen it at http://blogs.technet.com/b/dataplatforminsider/archive/2014/11/14/aps-best-practice-how-to-optimize-query-performance-by-minimizing-data-movement.aspx as well, but in case you haven’t, read on!

    I think the significance of this paper is big, because most people who deal with data warehouses (and PDW even more so) haven’t spent much time thinking about Query Optimisation techniques, and certainly not about how they can leverage features of SQL Server’s Query Optimizer to minimise data movement (which is probably the largest culprit for poor performance in a PDW environment).

    Oh, and I have another one that I’m writing too...

     


    The Analytics Platform System, with its MPP SQL Server engine (SQL Server Parallel Data Warehouse), can deliver performance and scalability for analytics workloads that you may not have expected from SQL Server. But there are key differences between working with SQL Server PDW and SQL Server Enterprise Edition that you should be aware of in order to take full advantage of the SQL Server PDW capabilities. One of the most important considerations when tuning queries in Microsoft SQL Server Parallel Data Warehouse is the minimisation of data movement. This post shows a useful technique regarding the identification of redundant joins through additional predicates that simulate check constraints.

    Microsoft’s PDW, part of the Analytics Platform System (APS), offers scale-out technology for data warehouses. This involves spreading data across a number of SQL Server nodes and distributions, such that systems can host up to many petabytes of data. To achieve this, queries which use data from multiple distributions to satisfy joins must leverage the Data Movement Service (DMS) to relocate data during the execution of the query. This data movement is both a blessing and a curse; a blessing because it is the fundamental technology which allows the scale-out features to work, and a curse because it can be one of the most expensive parts of query execution. Furthermore, tuning to avoid data movement is something with which many SQL Server query tuning experts have little experience, as it is unique to the Parallel Data Warehouse edition of SQL Server.

    Regardless of whether data in PDW is stored in a column-store or row-store manner, or whether it is partitioned or not, there is a decision to be made as to whether a table is to be replicated or distributed. Replicated tables store a full copy of their data on each compute node of the system, while distributed tables distribute their data across distributions, of which there are eight on each compute node. In a system with six compute nodes, there would be forty-eight distributions, with an average of just over 2% (100% / 48) of the data in each distribution.

    When deciding whether to distribute or replicate data, there are a number of considerations to bear in mind. Replicated data uses more storage and also has a larger management overhead, but can be more easily joined to other data, as every SQL node has local access to replicated data. By distributing larger tables according to the hash of one of the table columns (known as the distribution key), the overhead of both reading and writing data is reduced – each distribution effectively deals with a slice of data that is an order of magnitude (or more) smaller than the table as a whole.
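    To make that distinction concrete, here is a minimal DDL sketch of the two options – the table and column names are invented purely for illustration, and in practice you would also choose index and partition options:

    CREATE TABLE dbo.DimProduct        -- small dimension: a full copy on every compute node
    (
        ProductKey int NOT NULL,
        ProductName nvarchar(100) NOT NULL
    )
    WITH (DISTRIBUTION = REPLICATE);

    CREATE TABLE dbo.FactSales         -- large fact table: rows spread across the distributions
    (
        SalesKey bigint NOT NULL,
        CustomerKey int NOT NULL,
        SalesAmount money NOT NULL
    )
    WITH (DISTRIBUTION = HASH(CustomerKey));   -- CustomerKey as the distribution key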

    Having decided to distribute data, choosing which column to use as the distribution key is driven by factors including the minimisation of data movement and the reduction of skew. Skew is important because if a distribution has much more than the average amount of data, this can affect query time. However, the minimisation of data movement is probably the most significant factor in distribution-key choice.

    Joining two tables together involves identifying whether rows from each table match according to a number of predicates, but to do this, the two rows must be available on the same compute node. If one of the tables is replicated, this requirement is already satisfied (although it might need to be ‘trimmed’ to enable a left join), but if both tables are distributed, then the data is only known to be on the same node if one of the join predicates is an equality predicate between the distribution keys of the tables, and the data types of those keys are exactly identical (including nullability and length). More can be read about this in the excellent whitepaper about Query Execution in Parallel Data Warehouse at http://gsl.azurewebsites.net/Portals/0/Users/Projects/pdwau3/sigmod2012.pdf

    To avoid data movement in commonly-performed joins, creativity is often needed from the data warehouse designers. This could involve adding extra columns to tables, such as adding CustomerKey to many fact tables (and using it as the distribution key) – since orders, items, payments and the other information required for a given report are all ultimately about a customer – and then adding additional predicates to each join to alert the PDW Engine that only rows within the same distribution could possibly match. This is thinking that is alien to most data warehouse designers, who would typically feel that adding CustomerKey to a table not directly related to a Customer dimension is against best-practice advice.

    Another technique commonly used by PDW data warehouse designers, but rarely seen in other SQL Server data warehouses, is splitting tables in two, either vertically or horizontally; both forms of split are relatively common in PDW as a way of avoiding problems that can otherwise occur.

    Splitting a table vertically is frequently done to reduce the impact of skew when the ideal distribution key for joins is not evenly distributed. Imagine the scenario of identifiable and unidentifiable customers, which is increasingly the situation as stores run loyalty programs allowing them to identify a large portion (but not all) of their customers. For the analysis of shopping trends, it could be very useful to have data distributed by customer, but if half the customers are unknown, there will be a large amount of skew.

    To solve this, sales could be split into two tables, such as Sales_KnownCustomer (distributed by CustomerKey) and Sales_UnknownCustomer (distributed by some other column). When analysing by customer, the table Sales_KnownCustomer could be used, including the CustomerKey as an additional (even if redundant) join predicate. A view performing a UNION ALL over the two tables could be used to allow reports that need to consider all Sales.
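    A rough DDL sketch of that split might look like the following – the column lists are abbreviated and the names are assumptions for illustration; the point is simply the two different distribution keys:

    CREATE TABLE dbo.Sales_KnownCustomer
    (
        SalesKey bigint NOT NULL,
        CustomerKey int NOT NULL,      -- always greater than zero in this table
        SalesAmount money NOT NULL
    )
    WITH (DISTRIBUTION = HASH(CustomerKey));   -- distributed by customer

    CREATE TABLE dbo.Sales_UnknownCustomer
    (
        SalesKey bigint NOT NULL,
        CustomerKey int NOT NULL,      -- always zero in this table
        SalesAmount money NOT NULL
    )
    WITH (DISTRIBUTION = HASH(SalesKey));      -- distributed by some other column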

    The query overhead of having the two tables is potentially high, especially if we consider tables for Sales, SaleItems, Deliveries, and more, which might all need to be split into two to avoid skew while minimising data movement, using CustomerKey as the distribution key when known to allow customer-based analysis, and SalesKey when the customer is unknown.

    By distributing on a common key the impact is to effectively create mini-databases which are split out according to groups of customers, with all of the data about a particular customer residing in a single database. This is similar to the way that people scale out when doing so manually, rather than using a system such as PDW. Of course, there is a lot of additional overhead when trying to scale out manually, such as working out how to execute queries that do involve some amount of data movement.

    By splitting up the tables into ones for known and unknown customers, queries that were looking something like the following:

    SELECT …
    FROM Sales AS s
    JOIN SaleItems AS si
    ON si.SalesKey = s.SalesKey
    JOIN Delivery_SaleItems AS dsi
    ON dsi.LineItemKey = si.LineItemKey
    JOIN Deliveries AS d
    ON d.DeliveryKey = dsi.DeliveryKey

    …would become something like:

    SELECT …
    FROM Sales_KnownCustomer AS s
    JOIN SaleItems_KnownCustomer AS si
    ON si.SalesKey = s.SalesKey
    AND si.CustomerKey = s.CustomerKey
    JOIN Delivery_SaleItems_KnownCustomer AS dsi
    ON dsi.LineItemKey = si.LineItemKey
    AND dsi.CustomerKey = s.CustomerKey
    JOIN Deliveries_KnownCustomer AS d
    ON d.DeliveryKey = dsi.DeliveryKey
    AND d.CustomerKey = s.CustomerKey
    UNION ALL
    SELECT …
    FROM Sales_UnknownCustomer AS s
    JOIN SaleItems_UnknownCustomer AS si
    ON si.SalesKey = s.SalesKey
    JOIN Delivery_SaleItems_UnknownCustomer AS dsi
    ON dsi.LineItemKey = si.LineItemKey
    AND dsi.SalesKey = s.SalesKey
    JOIN Deliveries_UnknownCustomer AS d
    ON d.DeliveryKey = dsi.DeliveryKey
    AND d.SalesKey = s.SalesKey

    I’m sure you can appreciate that this becomes a much larger effort for query writers, and the existence of views to simplify querying back to the earlier shape could be useful. If both CustomerKey and SalesKey were being used as distribution keys, then joins between the views would require both, but this can be incorporated into logical layers such as Data Source Views much more easily than using UNION ALL across the results of many joins. A DSV or Data Model could easily define relationships between tables using multiple columns so that self-service reporting environments leverage the additional predicates.

    The use of views should be considered very carefully, as it is easily possible to end up with views that nest views that nest views that nest views, and an environment that is very hard to troubleshoot and performs poorly. With sufficient care and expertise, however, there are some advantages to be had.

    The resultant query would look something like:

    SELECT …
    FROM Sales AS s
    JOIN SaleItems AS si
    ON si.SalesKey = s.SalesKey
    AND si.CustomerKey = s.CustomerKey
    JOIN Delivery_SaleItems AS dsi
    ON dsi.LineItemKey = si.LineItemKey
    AND dsi.CustomerKey = s.CustomerKey
    AND dsi.SalesKey = s.SalesKey
    JOIN Deliveries AS d
    ON d.DeliveryKey = dsi.DeliveryKey
    AND d.CustomerKey = s.CustomerKey
    AND d.SalesKey = s.SalesKey

    Joining multiple sets of tables which have been combined using UNION ALL is not the same as performing a UNION ALL of sets of tables which have been joined. Much as any high school mathematics teacher will happily explain that (a*b)+(c*d) is not the same as (a+c)*(b+d), additional combinations need to be considered when the logical order of joins and UNION ALLs is changed.

    [Diagram: joining two pairs of tables that have each been combined with UNION ALL]

    Notice that when we have (TableA1 UNION ALL TableA2) JOIN (TableB1 UNION ALL TableB2), we must perform joins not only between TableA1 and TableB1, and TableA2 and TableB2, but also TableA1 and TableB2, and TableB1 and TableA2. These last two combinations do not involve tables with common distribution keys, and therefore we would see data movement. This is despite the fact that we know that there can be no matching rows in those combinations, because some are for KnownCustomers and the others are for UnknownCustomers. Effectively, the relationships between the tables would be more like the following diagram:

    [Diagram: the four join combinations implied between the UNION ALLed tables]

    There is an important stage of Query Optimization which must be considered here, and which can be leveraged to remove the need for data movement when this pattern is applied – that of Contradiction.

    The contradiction algorithm is an incredibly useful but underappreciated stage of Query Optimization. Typically it is explained using an obvious contradiction such as WHERE 1=2. Notice the effect on the query plans of using this predicate.
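    As a contrived example (using the table sketched earlier – any table would do):

    SELECT SalesKey, CustomerKey
    FROM dbo.Sales_KnownCustomer
    WHERE 1 = 2;   -- no row can ever satisfy this predicate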

    [Screenshot: the two query plans – with and without the WHERE 1=2 predicate]

    Because the Query Optimizer recognises that no rows can possibly satisfy the predicate WHERE 1=2, it does not access the data structures seen in the first query plan.

    This is useful, but many readers may well object that queries with such an obvious contradiction are never going to appear in their code.

    But suppose the views that perform a UNION ALL are expressed in this form:

    CREATE VIEW dbo.Sales AS
    SELECT *
    FROM dbo.Sales_KnownCustomer
    WHERE CustomerKey > 0
    UNION ALL
    SELECT *
    FROM dbo.Sales_UnknownCustomer
    WHERE CustomerKey = 0;
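    The other views follow the same pattern – for example, a sketch of the SaleItems view (again with assumed names) would be:

    CREATE VIEW dbo.SaleItems AS
    SELECT *
    FROM dbo.SaleItems_KnownCustomer
    WHERE CustomerKey > 0
    UNION ALL
    SELECT *
    FROM dbo.SaleItems_UnknownCustomer
    WHERE CustomerKey = 0;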

    Now, we see a different kind of behaviour.

    Before the predicates are used, the query on the views is rewritten as follows (with SELECT clauses replaced by ellipses).

    SELECT …
    FROM (SELECT …
    FROM (SELECT ...
    FROM [sample_vsplit].[dbo].[Sales_KnownCustomer] AS T4_1
    UNION ALL
    SELECT …
    FROM [tempdb].[dbo].[TEMP_ID_4208] AS T4_1) AS T2_1
    INNER JOIN
    (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[SaleItems_KnownCustomer] AS T5_1
    UNION ALL
    SELECT …
    FROM [tempdb].[dbo].[TEMP_ID_4209] AS T5_1) AS T3_1
    INNER JOIN
    (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[Delivery_SaleItems_KnownCustomer] AS T6_1
    UNION ALL
    SELECT …
    FROM [tempdb].[dbo].[TEMP_ID_4210] AS T6_1) AS T4_1
    INNER JOIN
    (SELECT …
    FROM [sample_vsplit].[dbo].[Deliveries_KnownCustomer] AS T6_1
    UNION ALL
    SELECT …
    FROM [tempdb].[dbo].[TEMP_ID_4211] AS T6_1) AS T4_2
    ON (([T4_2].[CustomerKey] = [T4_1].[CustomerKey])
    AND ([T4_2].[SalesKey] = [T4_1].[SalesKey])
    AND ([T4_2].[DeliveryKey] = [T4_1].[DeliveryKey]))) AS T3_2
    ON (([T3_1].[CustomerKey] = [T3_2].[CustomerKey])
    AND ([T3_1].[SalesKey] = [T3_2].[SalesKey])
    AND ([T3_2].[SaleItemKey] = [T3_1].[SaleItemKey]))) AS T2_2
    ON (([T2_2].[CustomerKey] = [T2_1].[CustomerKey])
    AND ([T2_2].[SalesKey] = [T2_1].[SalesKey]))) AS T1_1

    Whereas with the inclusion of the additional predicates, the query simplifies to:

    SELECT …
    FROM (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[Sales_KnownCustomer] AS T4_1
    WHERE ([T4_1].[CustomerKey] > 0)) AS T3_1
    INNER JOIN
    (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[SaleItems_KnownCustomer] AS T5_1
    WHERE ([T5_1].[CustomerKey] > 0)) AS T4_1
    INNER JOIN
    (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[Delivery_SaleItems_KnownCustomer] AS T6_1
    WHERE ([T6_1].[CustomerKey] > 0)) AS T5_1
    INNER JOIN
    (SELECT …
    FROM [sample_vsplit].[dbo].[Deliveries_KnownCustomer] AS T6_1
    WHERE ([T6_1].[CustomerKey] > 0)) AS T5_2
    ON (([T5_2].[CustomerKey] = [T5_1].[CustomerKey])
    AND ([T5_2].[SalesKey] = [T5_1].[SalesKey])
    AND ([T5_2].[DeliveryKey] = [T5_1].[DeliveryKey]))) AS T4_2
    ON (([T4_1].[CustomerKey] = [T4_2].[CustomerKey])
    AND ([T4_1].[SalesKey] = [T4_2].[SalesKey])
    AND ([T4_2].[SaleItemKey] = [T4_1].[SaleItemKey]))) AS T3_2
    ON (([T3_2].[CustomerKey] = [T3_1].[CustomerKey])
    AND ([T3_2].[SalesKey] = [T3_1].[SalesKey]))
    UNION ALL
    SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[Sales_UnknownCustomer] AS T4_1
    WHERE ([T4_1].[CustomerKey] = 0)) AS T3_1
    INNER JOIN
    (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[SaleItems_UnknownCustomer] AS T5_1
    WHERE ([T5_1].[CustomerKey] = 0)) AS T4_1
    INNER JOIN
    (SELECT …
    FROM (SELECT …
    FROM [sample_vsplit].[dbo].[Delivery_SaleItems_UnknownCustomer] AS T6_1
    WHERE ([T6_1].[CustomerKey] = 0)) AS T5_1
    INNER JOIN
    (SELECT …
    FROM [sample_vsplit].[dbo].[Deliveries_UnknownCustomer] AS T6_1
    WHERE ([T6_1].[CustomerKey] = 0)) AS T5_2
    ON (([T5_2].[CustomerKey] = [T5_1].[CustomerKey])
    AND ([T5_2].[SalesKey] = [T5_1].[SalesKey])
    AND ([T5_2].[DeliveryKey] = [T5_1].[DeliveryKey]))) AS T4_2
    ON (([T4_1].[CustomerKey] = [T4_2].[CustomerKey])
    AND ([T4_1].[SalesKey] = [T4_2].[SalesKey])
    AND ([T4_2].[SaleItemKey] = [T4_1].[SaleItemKey]))) AS T3_2
    ON (([T3_2].[CustomerKey] = [T3_1].[CustomerKey])
    AND ([T3_2].[SalesKey] = [T3_1].[SalesKey]))) AS T1_1

    This may seem more complex – it’s certainly longer – but within each half of the UNION ALL, the joins are back in their original, preferred form, between tables that share a distribution key. This is a powerful rewrite of the query.

    [Diagram: with the contradictory combinations eliminated, only the matching pairs of tables are joined]

    Furthermore, the astute PDW-familiar reader will quickly realise that the UNION ALL of two local queries (queries that don’t require data movement) is also local, and that therefore, this query is completely local. The TEMP_ID_NNNNN tables in the first rewrite are more evidence that data movement has been required.

    When the two plans are shown using PDW’s EXPLAIN keyword, the significance is shown even more clearly.

    The first plan appears as follows, and it is obvious that there is a large amount of data movement involved.

    [EXPLAIN output: the first plan, showing data movement steps]

    [EXPLAIN output: the second plan, with no data movement steps]

    The queries passed in are identical, but the altered definitions of the views have removed the need for any data movement at all. This should allow your query to run a little faster. Ok, a lot faster.

    Summary

    When splitting distributed tables vertically to avoid skew, views over those tables should include predicates which reiterate the conditions that cause the data to be populated into each table. This provides additional information to the PDW Engine that can remove unnecessary data movement, resulting in much-improved performance, both for standard reports using designed queries, and ad hoc reports that use a data model.

     

    Check us out at www.lobsterpot.com.au or talk to me via Twitter at @rob_farley

  • Learning through others

    This PASS Summit was a different experience for me – I wasn’t speaking. I’ve presented at three of the five PASS Summits I’ve been to; the only other one I hadn’t spoken at was 2012, while I was a PASS Director (and had been told I shouldn’t submit talks – advice that I ignored in 2013). I have to admit that I really missed presenting, both in 2012 and this year, and I will need to improve my session abstracts to make sure I get selected in future years.

    I’m not a very good ‘session attendee’ on the whole – it’s not my preferred style of learning – but I still wanted to go, because of the learning involved. Sometimes I will learn a lot from the various things that are mentioned in the few sessions I go to, but more significantly, I learn a lot from discussions with other people. I hear what they are doing with technology, and that encourages me to explore those technologies further. It’s not quite at the point of learning by osmosis simply by being in the presence of people who know stuff, but by developing relationships with people, and hearing them speak about the things they’re doing, I definitely learn a lot.

    Of course, I don’t get to know people for the sake of learning. I get to know people because I like getting to know people. But of course, one of the things I have in common with these people is SQL, and conversations often come around to that. And I know that I learn a lot from those conversations. I don’t have the luxury of living near many (any?) of my friends in the data community, and spending time with them in person definitely helps me.

    And it’s not just SQL stuff that I learn. This month’s T-SQL Tuesday (for which this is a post) is hosted by Chris Yates (@YatesSQL), who I got to run alongside on one of the mornings. Even that was a learning experience for me, as we chatted about all kinds of things, and I listened to my feet hitting the ground – another technique I learned from a community – and made sure I stuck to my running form to minimise the pain I’d be feeling later in the day. Talking to Chris while I ran helped immensely, and I was far less sore than I thought I might be.

    On the SQL side, I got to learn about how excited people are about scale-out, with technologies like Stretched Tables coming very soon. As someone involved in the Parallel Data Warehouse space (and seriously – how thrilled was I to be able to chat with Dr Rimma Nehme, who was involved in the PDW Query Optimizer), scale-out is very much in my thoughts, and seeing what Microsoft is doing in this space is great – but learning what other people in the community are thinking about it is even more significant for me.

    @rob_farley 

    PS: This is the 60th T-SQL Tuesday. Huge thanks to Adam Machanic (@adammachanic) for starting this, and giving me something to write about each month these last five years.

  • PASS Summit WIT Lunch

    With the pleasant sound of cutlery on crockery, those lucky enough to secure tickets to the WIT Lunch at the PASS Summit get to listen to an interview with Kimberly Bryant, who is the founder of a non-profit organisation called Black Girls Code – helping teenaged girls from low-privilege communities to get into technology.

    She calls herself an Accidental Entrepreneur, driven by her passion to see the less-privileged have opportunities to explore an industry that was dominated by a very different part of the community. Her daughter was interested in tech, and went on a tech-focused summer camp, where she was the only non-white kid, and one of only three girls. With a crowd of about 40, that was less than ten percent of the camp.

    What Kimberly saw at the camp, and in other environments that are dominated by a particular demographic, was that the people who were providing for the group would cater for the masses, and not the minorities. From an economic perspective, I’m sure this makes sense. If you’re going to find something that caters for a particular cluster of people, a particular type of person, then targeting the larger clusters is likely to give the ‘best results’. But (my opinion) this is ignoring the fact that the larger clusters of people tend to be catered for by just about anything. In my experience, if someone is part of a larger cluster, they have a large amount of support from their peers already, and need less from the organisers. But if the organisers can ensure that the edges of the group are looked after, then the ones in the middle will still be just fine, and the whole group will be encouraged.

    Diversity is something that the IT industry suffers from, and I do mean ‘suffer’. Without good diversity, our industry is held back. Stupidly, our industry keeps shooting itself in the foot, and it’s the larger clusters of people – I guess that means people like me – who need to take a stand when we see things that would alienate minority groups.

    Kimberly Bryant points out that teams need diversity, and that hiring decisions need to ensure that they don’t turn away people because of diversity. For myself, as a business owner, I hope that I never turn someone away because of diversity, because I do agree that teams need diversity. What I love the most though, is that what Kimberly has done is to develop programs to make sure that people from a particular minority group present as stronger candidates to hiring managers.

    Let’s encourage people from minority groups to get into IT. We’ll all benefit from it.

    @rob_farley

  • Dr Rimma Nehme at the PASS Summit

    This Summit’s presentation from Microsoft Research Labs is from Dr Rimma Nehme, bucking the trend of having presentations from Dr David DeWitt. I’m really pleased to be able to hear from her, because she’s an absolute legend.

    Among her qualifications is work on the PDW Query Optimizer – a topic closer to me than probably any other area of SQL Server. I just wish I had known this a few minutes ago when I met her, but I’m sure she’ll chat more freely after her big presentation.

    Today she’s talking about Cloud Computing, which is great because the cloud space has changed significantly in recent years, and it’s good to hear from Microsoft Research Labs again. For example, she looked at the power-effectiveness of a data centre by comparing the total power used by the data centre against the power that actually goes into computing. This leads to exploring more effective systems, such as evaporative cooling (which is used by many Australian homes and businesses, of course), making energy-responsibility a key component of cloud computing. With such an effort being put into cloud computing, the globally-responsible option is to use the cloud.

    The five key drivers for cloud that Dr Nehme listed are:

    • Elasticity
    • No CapEx
    • Pay Per Use
    • Focus on Business
    • Fast Time To Market

    These are all huge, of course, and the business aspects are massive. It’s increasingly easy to persuade businesses to move to the cloud, but the exciting thing about the technologies that have been discussed this week is the elasticity point.

    Microsoft is doing huge amounts of work to let people scale out easily. New technologies such as Stretched Tables will allow people to have hybrid solutions between on-prem and cloud like never before. With a background in the PDW Query Optimizer, Dr Nehme is the perfect person to be exploring what’s going on with spreading the load across multiple cloud-based machines for these scale-out solutions.

    The cloud means that many database professionals worry about their jobs. I’m sure people felt the same way when the industrial revolution came through. People who work on production-lines have been replaced by robots, and database administrators who only do high availability don’t need to handle that in the cloud space. But they will not be redundant. Dr Nehme just said “Cloud was not designed to be a threat to DBAs”, and this is significant. The key here is that we have more data than ever, and we need to be able to use computing power effectively.

    We can’t keep going with the amount of data that is appearing, and we need to be more responsible than ever.

    Great keynote, Dr Nehme. I hope this is the first of many keynotes from you.

    @rob_farley

  • PASS Summit – Thursday Keynote

    It’s good to point out it’s still only Thursday, as my laptop tells me that it’s already Friday.

    Today is the second of only two keynotes this Summit, which means that it’s the opportunity to hear from Microsoft Research Labs about what’s going on with data from their perspective.

    It’s also when we get to hear from the PASS VPs – community members that I used to serve with on the PASS Board of Directors – about how PASS is doing from a Financial and Marketing perspective.

    One of the interesting things about PASS is that there are reserves of over a million dollars. I mention this because it’s an area that some people find quite “interesting” for a community and non-profit organisation, but I want to point out that these reserves help PASS be freer in what they (we?) do. Having a million dollars in the bank means that PASS can reach out and do things that will serve the community, even if it seems like it could be risky. There is a lot of risk in running the Summit every year, and this is the most obvious area where PASS could need money to cover costs that might not come back if, say, there’s another volcano eruption in Iceland. I saw first-hand the freedom that PASS had because of the reserves (although some risks were still very high, and freedom does not mean irresponsibility), and I know this is a good thing.

    From the marketing perspective, the celebration of individuals who have gone beyond the norm is a great part of the Summit event, and the PASSion Award winner has been announced as Andrey Korshikov. This guy has done so much for the Russian Data Community, making him the most influential SQL person in the largest country of the world. You can’t go past that…

    @rob_farley

  • Keynote technologies – new or not?

    So I’m sitting in the PASS Summit keynote, and there are some neat things being shown.

    Something that just appeared on the screen showed the locations of shoppers on a plan of a store. There were some ‘Oohs’ coming from around the room when they mentioned that Kinect was being used to track locations. Hotspots were appearing on a time-driven picture.

    But the thing that I think is most exciting is that this is almost all achievable right now. Collecting information from Kinect is something that my friends John & Bronwen have been presenting about for years, and displaying things on custom maps in Power BI (complete with hotspots) is also very achievable. If you don’t know how to do this, get along to Hope Foley’s session this afternoon (Wed 5th), as she explores more of what’s possible with Power Map. She wrote a post recently about Custom Maps in Power Map, which is a great blog post, walking through how to show spatial data on the plan of a building, playing it against a time dimension.

    The stuff in the keynote is excellent – much of it is future, but if you’re at the PASS Summit, you can be having conversations with many of the world’s best experts about how to revolutionise your data story, not just in the future, but right now.

    @rob_farley

  • PASS Summit keynote

    The PASS Summit has kicked off again with a tremendous keynote from Ranga. He's been in the role at Microsoft for a little over a year now, and has really come into his own, as can be seen from the presentation this morning. The data picture hasn't changed hugely over the past year, although the "Internet of Things" space is growing quickly.


    With that, the speed of growth in data volume has kicked in harder than ever. Being able to collect, process, and analyse the kinds of volumes that we're now facing means that scaling is a major feature being discussed. In recent years, this has meant looking at Big Data and the ways that it can hook into our existing solutions, while technologies like Hekaton have allowed us to handle huge numbers of transactions in a scale-up scenario.

    This year, though, we see scale-out having a refreshed focus. We're hearing talk of 'sharding' more, and the idea of being able to use multiple databases (including cloud-based ones) to achieve scale on demand – an elasticity that suits business more than ever.

    Most of our customers at LobsterPot see changes in the amount of business that’s going on across the year, with some having certain key days requiring orders of magnitude more traffic than on ‘normal’ days. They already scale out their websites, but data is another matter. Databases typically scale UP, not OUT.

    My work in the Analytics Platform System / Parallel Data Warehouse space makes me acutely aware of the challenges around scaling out data. When you need to perform joins between tables which have their data in different databases, on different servers, there are problems that need addressing. A lot of it happens behind the scenes through complex data movement techniques, so that it looks like a normal query – something that is very hard to do yourself through clever data design alone.

    What we’re seeing this morning are some of the ways that Microsoft is providing scale out technology in SQL Server and SQL Database. Considering they now have over a million SQL Database databases in Azure, thinking about how to leverage this technology to enhance on-prem SQL Server databases to provide a new level of hybrid is very interesting.

    One of these technologies is Stretched Tables, which we saw this morning. This is about being able to take a table in SQL Server and stretch it into SQL Databases in Azure. This means that the table will be sharded across on-prem and cloud – hot data being stored locally, and more rarely-used data being stored in the cloud. For queries that need to access data that’s in the cloud, data can be extracted from the cloud tables, pushing predicates down to pull back part of the data, transparently (as far as the user is concerned).

    This is not like using linked servers and views, handling inserts with triggers. This is achieving hybrid behind the scenes, giving users a logical layer they can query to access their information whether it’s local or in the cloud.

    Until now, I’ve always felt that ‘hybrid’ has been about using some components locally and other components in the cloud. But what we’re seeing now are ways that ‘hybrid’ can mean the core of our database – the tables themselves – is handled in a hybrid way.

    Exciting times ahead…

    @rob_farley

  • LobsterPot staff are influential, outstanding and valuable

    The title of the post says it all, but let me explain why…

    It’s not news that LobsterPot has three SQL Server MVPs on staff. Ted received his fifth award earlier in the year, and this month saw Julie get her second award and a ninth for me.

    But not only that, Julie was recognised as one of Nine Influential Women by Solutions Review magazine, and Martin received an Outstanding Volunteer Award from the PASS organisation. Ted and Julie have both received this award in the past, and former employee Roger Noble also received this award while he was working for us. It’s all further evidence that LobsterPot staff really are very special.

    @rob_farley

  • Heroes of SQL

    Every story has heroes. Some heroes distinguish themselves by their superpowers; others by extraordinary bravery or compassion; some are simply heroes because of what they do in their jobs.

    We picture the men and women who work in the emergency departments of hospitals, soldiers who go back into the line of fire to rescue their colleagues, and of course, those who have been bitten by radioactive spiders.

    We don’t tend to picture people who work with databases.

    But let me explain something – at the PASS Summit next month, you will come across a large number of heroes. The people who are presenting show extraordinary bravery to stand up in front of a room full of people who want to learn and who will write some of the nastiest things about them in evaluation forms. The members of the SQL Server Product Group (who you can see at the SQL Clinic) from Microsoft have incredible information about how SQL Server works on the inside. And then you have people like Paul White, Jon Kehayias and Ted Krueger, who have obviously spent too much time around arachnids with short half-lives.

    The amazing thing about the SQL Server community is their willingness to be heroes – not only by stepping up at conferences, but in helping people with their every day problems. It’s one thing to be a hero to help those in your workplace, by making sure that backups are performed, and that your databases are checked for corruption regularly, but people in the SQL Server community help people they don’t know on forums, they write blog posts, and they attend (and organise) SQL Saturdays and other events so that they can sit and talk to strangers.

    The PASS Summit is the biggest gathering of SQL professionals in the world each year. So come along and see why people in the SQL community are different.

    They’re heroes.

    @rob_farley 

    PS: Thanks to another SQL Hero, Tracy McKibben (@realsqlguy), for his effort in hosting this month’s T-SQL Tuesday.

  • Less than a month away...

    The PASS Summit for 2014 is nearly upon us, and the MVP Summit is immediately prior, in the same week and the same city. This is my first MVP Summit since early 2008. I’ve been invited every year, but I simply haven’t prioritised it. I’ve been awarded MVP status every year since 2006 (just received my ninth award), but in 2009 and 2010 I attended SQLBits in the UK, and have been to every PASS Summit since then. This year, it’s great that I get to do both Summits in the same trip, but if I get to choose just one, then it’s an easy decision.

    So let me tell you why the PASS Summit is the bigger priority for me.

    Number of people

    Actually, the PASS Summit isn’t that much larger than the MVP Summit, but the MVP Summit has thousands of non-SQL MVPs and only a few hundred in the SQL space. Because of this, the ‘average conversation with a stranger’ is very different. While it can be fascinating to meet someone who is an MVP for File System Storage, the PASS Summit has me surrounded by people who do what I do, and it makes for better conversations as I learn about who people are and what they do.

    Access to Microsoft

    The NDA content that MVPs learn at the MVP Summit is good, but the PASS Summit has content about every-SQL-thing you could ever want. The same Microsoft people who present at the MVP Summit are also at the PASS Summit, and dedicate time to the SQL Clinic, which means that you can spend even more time working through ideas and problems with them. You don’t get this at the MVP Summit.

    Non-exclusivity

    Obviously not everyone can go to the MVP Summit, as it’s a privilege that comes as part of the MVP award each year (although it’s hardly ‘free’ when you have to fly there from Australia). While it may seem like an exclusive event is going to be, well, exclusive, most MVPs are all about the wider community, and thrive on being around non-MVPs. There are fewer than 400 SQL MVPs around the world, and ten times that number of SQL experts at the Summit. While some of the top experts might be MVPs, a lot of them are not, and the PASS Summit is a chance to meet those people each year.

    Content from the best

    The MVP Summit has presentations from people who work on the product. At my first MVP Summit, this was a huge deal. And it’s still good to hear what these guys are thinking, under NDA, when they can actually go into detail that they know won’t leave the room. But you don’t get to hear from Paul White at the MVP Summit, or Erin Stellato, or Julie Koesmarno, or any of the other non-Microsoft presenters. The PASS Summit gives the best of both worlds.

    I’m really looking forward to the MVP Summit. I’ve missed the last six, and it’s been too long. MVP Summits were when I met some of my oldest SQL friends, such as Kalen Delaney, Adam Machanic, Simon Sabin, Paul & Kimberly, and Jamie Thomson. The opportunities are excellent. But the PASS Summit is what the community is about.

    MVPs are MVPs because of the community – and that’s what the PASS Summit is about. That’s the one I’m looking forward to the most.

    @rob_farley

  • Passwords

    Another month, and another T-SQL Tuesday. I have some blog posts I’ve been meaning to write, but the scheduling of T-SQL Tuesday and my determination to keep my record of never having missed one keeps me going. This month is hosted by Sebastian Meine (@sqlity), and is on the topic of Passwords.


    Passwords are so often in the news. We read about how passwords are stolen through security breaches on a regular basis, and have plenty of suggestions on how using complex passwords can help (although the fact that tools such as 1Password put passwords on the clipboard must be an issue…), or that we should use passwords that are complex through length but simple in form such as a sentence – and we naturally see xkcd.com jump in on things with poignant commentary on life in a tech world.

    This post is actually not to tell you to avoid using passwords more than once, or to use passwords sufficiently complex that you won’t put them onto your clipboard, or anything like that.

    Instead, I want you to think about what a password means.

    A password means that you have secret information that only you have. It’s what ‘secret’ means. As soon as you tell that secret information to multiple places, it’s not secret any more. Anyone who has seen my passport knows where I was born, and there are plenty of ways to work out my mother’s maiden name, yet these are considered ‘secret’ information that can be used to check that I’m me.

    These days, I carry multiple RSA tokens around with me, so that I can log into client sites, or connect to banks’ internet banking. The codes on these devices are considered secret, but actually, they contain a secret piece of information that can be used to identify me, through the codes they generate. Combining a password and these codes is considered enough to identify me, but not in a way that can let someone else in a few seconds later when the numbers change.

    When I develop SSIS packages for clients, or just about anything that needs to connect to sensitive data, I don’t try to figure out what passwords need to be included. Where possible (frustratingly it’s not always), I don’t include passwords in database connections at all – it’s secret information that I shouldn’t have to know. Instead, I let the package run with credentials that are stored within the SQL instance. When the package is deployed, it can run with the appropriate permissions, according to the rights given to the user identified in the credential. The trust that is established by the credential is enough to let it do what it needs to, and all I need to tell the package is “Assume you have sufficient rights for this.” I don’t need to store the password anywhere in the package that way, and I’m separated from production data, as every developer should be.
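    As a rough illustration of the kind of setup I mean – the names here are invented, and your environment will differ – a credential stored within the instance can be tied to a SQL Agent proxy that runs SSIS package steps:

    -- Store the identity (and its secret) once, within the instance:
    CREATE CREDENTIAL ETL_ServiceAccount
    WITH IDENTITY = N'DOMAIN\ETL_Service',     -- hypothetical service account
         SECRET = N'S0me$ecretValue';          -- entered once by an administrator, not by the developer

    -- Create a SQL Agent proxy that uses the credential...
    EXEC msdb.dbo.sp_add_proxy
        @proxy_name = N'SSIS_ETL_Proxy',
        @credential_name = N'ETL_ServiceAccount',
        @enabled = 1;

    -- ...and allow it to run SSIS package job steps (subsystem 11 is the SSIS subsystem).
    EXEC msdb.dbo.sp_grant_proxy_to_subsystem
        @proxy_name = N'SSIS_ETL_Proxy',
        @subsystem_id = 11;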

    I studied cryptography at university, although that was nearly twenty years ago and I hope things have moved on since then. I know various algorithms have been ‘cracked’, but the principles of providing secret information for identification carry on. I believe public/private key pairs are still excellent methods of proving that someone is who they say they are, so that I can generate something that you know comes from me, and you can generate something that only I can decrypt (and by using both my key pair and yours will allow us to have a secure conversation – until one of our private keys is compromised).

    Today we need to be able to identify ourselves through multiple devices and our ‘secret’ information is stored on servers, protected by passwords. Our passwords are secret, and anyone who knows any password we have used before could try to see if this is our secret information for other servers.

    I don’t know what the answer is, but I’m careful with my information. That said, I was the victim of credit-card skimming just recently; the bank detected it and cancelled my cards.

    Just be careful with your passwords. They are secret, and you should treat them that way. If you can make use of RSA tokens, or multi-factor authentication, or some other method that can trust you, then do so. Hopefully those places that you entrust your secret information will do the right thing by you…

    Be safe out there!

    @rob_farley

  • SQL Spatial: Getting “nearest” calculations working properly

    If you’ve ever done spatial work with SQL Server, I hope you’ve come across the ‘nearest’ problem.

    You have five thousand stores around the world, and you want to identify the one that’s closest to a particular place. Maybe you want the store closest to the LobsterPot office in Adelaide, at -34.925806, 138.605073. Or our new US office, at 42.524929, -87.858244. Or maybe both!

    You know how to do this. You don’t want to use an aggregate MIN or MAX, because you want the whole row, telling you which store it is. You want to use TOP, and if you want to find the closest store for multiple locations, you use APPLY. Let’s do this (but I’m going to use addresses in AdventureWorks2012, as I don’t have a list of stores). Oh, and before I do, let’s make sure we have a spatial index in place. I’m going to use the default options.

    CREATE SPATIAL INDEX spin_Address ON Person.Address(SpatialLocation);

    And my actual query:

    WITH MyLocations AS
    (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),
                           ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))
                   ) t (Name, Geo))
    SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country
    FROM MyLocations AS l
    CROSS APPLY (
        SELECT TOP (1) *
        FROM Person.Address AS ad
        ORDER BY l.Geo.STDistance(ad.SpatialLocation)
        ) AS a
    JOIN Person.StateProvince AS s
        ON s.StateProvinceID = a.StateProvinceID
    JOIN Person.CountryRegion AS c
        ON c.CountryRegionCode = s.CountryRegionCode
    ;

    [Screenshot: the query results, showing the nearest address for each location]

    Great! This is definitely working. I know both those City locations, even if the AddressLine1s don’t quite ring a bell. I’m sure I’ll be able to find them next time I’m in the area.

    But of course what I’m concerned about from a querying perspective is what’s happened behind the scenes – the execution plan.

    [Screenshot: the execution plan, scanning Person.Address rather than using the spatial index]

    This isn’t pretty. It’s not using my index. It’s sucking every row out of the Address table TWICE (which sucks), and then it’s sorting them by the distance to find the smallest one. It’s not pretty, and it takes a while. Mind you, I do like the fact that it saw an indexed view it could use for the State and Country details – that’s pretty neat. But yeah – users of my nifty website aren’t going to like how long that query takes.

    The frustrating thing is that I know that I can use the index to find locations that are within a particular distance of my locations quite easily, and Microsoft recommends this for solving the ‘nearest’ problem, as described at http://msdn.microsoft.com/en-au/library/ff929109.aspx.

    Now, in the first example on this page, it says that the query there will use the spatial index. But when I run it on my machine, it does nothing of the sort.

    [Screenshot: the execution plan for the documentation’s example query – parallel, but still not using the spatial index]

    I’m not particularly impressed. But what we see here is that parallelism has kicked in. In my scenario, it’s split the data up into 4 threads, but it’s still slow, and not using my index. It’s disappointing.

    But I can persuade it with hints!

    If I tell it to FORCESEEK, or use my index, or even turn off the parallelism with MAXDOP 1, then I get the index being used, and it’s a thing of beauty! Part of the plan is here:

    [Screenshot: part of the execution plan that uses the spatial index]

    It’s massive, and it’s ugly, and it uses a TVF… but it’s quick.

    The way it works is to hook into the GeodeticTessellation function, which essentially finds where the point is and works out the spatial index cells that surround it. This then provides a framework for looking into the spatial index for the items we want. You can read more about it at http://msdn.microsoft.com/en-us/library/bb895265.aspx#tessellation – including a bunch of pretty diagrams. It’s one of those times when we have a much more complex-looking plan, but only because of the good work that’s going on.

    This tessellation stuff was introduced in SQL Server 2012. But my query isn’t using it.

    When I try to use the FORCESEEK hint on the Person.Address table, I get the friendly error:

    Msg 8622, Level 16, State 1, Line 1
    Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

    And I’m almost tempted to just give up and move back to the old method of checking increasingly large circles around my location. After all, I can even leverage multiple OUTER APPLY clauses just like I did in my recent Lookup post.

    WITH MyLocations AS
    (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),
                           ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))
                   ) t (Name, Geo))
    SELECT
        l.Name,
        COALESCE(a1.AddressLine1,a2.AddressLine1,a3.AddressLine1),
        COALESCE(a1.City,a2.City,a3.City),
        s.Name AS [State],
        c.Name AS Country
    FROM MyLocations AS l
    OUTER APPLY (
        SELECT TOP (1) *
        FROM Person.Address AS ad
        WHERE l.Geo.STDistance(ad.SpatialLocation) < 1000
        ORDER BY l.Geo.STDistance(ad.SpatialLocation)
        ) AS a1
    OUTER APPLY (
        SELECT TOP (1) *
        FROM Person.Address AS ad
        WHERE l.Geo.STDistance(ad.SpatialLocation) < 5000
        AND a1.AddressID IS NULL
        ORDER BY l.Geo.STDistance(ad.SpatialLocation)
        ) AS a2
    OUTER APPLY (
        SELECT TOP (1) *
        FROM Person.Address AS ad
        WHERE l.Geo.STDistance(ad.SpatialLocation) < 20000
        AND a2.AddressID IS NULL
        ORDER BY l.Geo.STDistance(ad.SpatialLocation)
        ) AS a3
    JOIN Person.StateProvince AS s
        ON s.StateProvinceID = COALESCE(a1.StateProvinceID,a2.StateProvinceID,a3.StateProvinceID)
    JOIN Person.CountryRegion AS c
        ON c.CountryRegionCode = s.CountryRegionCode
    ;

    But this isn’t friendly-looking at all, and I’d use the method recommended by Isaac Kunen, who uses a table of numbers for the expanding circles.

    It feels old-school though, when I’m dealing with SQL 2012 (and later) versions. So why isn’t my query doing what it’s supposed to? Remember the query...

    WITH MyLocations AS
    (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),
                           ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))
                   ) t (Name, Geo))
    SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country
    FROM MyLocations AS l
    CROSS APPLY (
        SELECT TOP (1) *
        FROM Person.Address AS ad
        ORDER BY l.Geo.STDistance(ad.SpatialLocation)
        ) AS a
    JOIN Person.StateProvince AS s
        ON s.StateProvinceID = a.StateProvinceID
    JOIN Person.CountryRegion AS c
        ON c.CountryRegionCode = s.CountryRegionCode
    ;

    Well, I just wasn’t reading http://msdn.microsoft.com/en-us/library/ff929109.aspx properly.

    The following requirements must be met for a Nearest Neighbor query to use a spatial index:

    1. A spatial index must be present on one of the spatial columns and the STDistance() method must use that column in the WHERE and ORDER BY clauses.

    2. The TOP clause cannot contain a PERCENT statement.

    3. The WHERE clause must contain a STDistance() method.

    4. If there are multiple predicates in the WHERE clause then the predicate containing STDistance() method must be connected by an AND conjunction to the other predicates. The STDistance() method cannot be in an optional part of the WHERE clause.

    5. The first expression in the ORDER BY clause must use the STDistance() method.

    6. Sort order for the first STDistance() expression in the ORDER BY clause must be ASC.

    7. All the rows for which STDistance returns NULL must be filtered out.

    Let’s start from the top.

    1. Needs a spatial index on one of the columns that’s in the STDistance call. Yup, got the index.

    2. No ‘PERCENT’. Yeah, I don’t have that.

    3. The WHERE clause needs to use STDistance(). Ok, but I’m not filtering, so that should be fine.

    4. Yeah, I don’t have multiple predicates.

    5. The first expression in the ORDER BY is my distance, that’s fine.

    6. Sort order is ASC, because otherwise we’d be starting with the ones that are furthest away, and that’s tricky.

    7. All the rows for which STDistance returns NULL must be filtered out. But I don’t have any NULL values, so that shouldn’t affect me either.

    ...but something’s wrong. I do actually need to satisfy #3. And I do need to make sure #7 is being handled properly, because there are some situations (eg, differing SRIDs) where STDistance can return NULL. It says so at http://msdn.microsoft.com/en-us/library/bb933808.aspx – “STDistance() always returns null if the spatial reference IDs (SRIDs) of the geography instances do not match.” So if I simply make sure that I’m filtering out the rows that return NULL…

    …then it’s blindingly fast, I get the right results, and I’ve got the complex-but-brilliant plan that I wanted.
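    For completeness, the adjusted query is simply the earlier one with that filter added inside the APPLY:

    WITH MyLocations AS
    (SELECT * FROM (VALUES ('LobsterPot Adelaide', geography::Point(-34.925806, 138.605073, 4326)),
                           ('LobsterPot USA', geography::Point(42.524929, -87.858244, 4326))
                   ) t (Name, Geo))
    SELECT l.Name, a.AddressLine1, a.City, s.Name AS [State], c.Name AS Country
    FROM MyLocations AS l
    CROSS APPLY (
        SELECT TOP (1) *
        FROM Person.Address AS ad
        WHERE l.Geo.STDistance(ad.SpatialLocation) IS NOT NULL   -- satisfies requirements 3 and 7
        ORDER BY l.Geo.STDistance(ad.SpatialLocation)
        ) AS a
    JOIN Person.StateProvince AS s
        ON s.StateProvinceID = a.StateProvinceID
    JOIN Person.CountryRegion AS c
        ON c.CountryRegionCode = s.CountryRegionCode
    ;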

    [Screenshot: the execution plan, now using the spatial index]

    It just wasn’t overly intuitive, despite being documented.

    @rob_farley

  • Nepotism In The SQL Family

    There’s a bunch of sayings about nepotism. It’s unpopular, unless you’re the family member who is getting the opportunity.

    But of course, so much in life (and career) is about who you know.

    From the perspective of the person who doesn’t get promoted (when the family member is), nepotism is simply unfair; even more so when the promoted one seems less than qualified, or incompetent in some way. We definitely get a bit miffed about that.

    But let’s also look at it from the other side of the fence – the person who did the promoting. To them, their son/daughter/nephew/whoever is just another candidate, but one in whom they have more faith. They’ve spent longer getting to know that person. They know their weaknesses and their strengths, and have seen them in all kinds of situations. They expect them to stay around in the company longer. And yes, they may have plans for that person to inherit one day. Sure, they have a vested interest, because they’d like their family members to have strong careers, but it’s not just about that – it’s often best for the company as well.

    I’m not announcing that the next LobsterPot employee is one of my sons (although I wouldn’t be opposed to the idea of getting them involved), but I will happily admit that almost all the LobsterPot employees are SQLFamily members… …which makes this post good for T-SQL Tuesday, this month hosted by Jeffrey Verheul (@DevJef).

    You see, SQLFamily is the concept that the people in the SQL Server community are close. We have something in common that goes beyond ordinary friendship. We might only see each other a few times a year, at events like the PASS Summit and SQLSaturdays, but the bonds that are formed are strong, going far beyond typical professional relationships.

    And these are the people that I am prepared to hire. People that I have got to know. I get to know their skill level, how well they explain things, how confident people are in their expertise, and what their values are. Of course there are people that I wouldn’t hire, but I’m a lot more comfortable hiring someone that I’ve already developed a feel for. I need to trust the LobsterPot brand to people, and that means they need to have a similar value system to me. They need to have a passion for helping people and doing what they can to make a difference. Above all, they need to have integrity.

    Therefore, I believe in nepotism. All the people I’ve hired so far are people from the SQL community. I don’t know whether I’ll always be able to hire that way, but I have no qualms admitting that the things I look for in an employee are things that I can recognise best in those that are referred to as SQLFamily.

    …like Ted Krueger (@onpnt), LobsterPot’s newest employee and the guy who is representing our brand in America. I’m completely proud of this guy. He’s everything I want in an employee. He’s an experienced consultant (even wrote a book on it!), loving husband and father, genuine expert, and incredibly respected by his peers.

    It’s not favouritism, it’s just choosing someone I’ve been interviewing for years.

    @rob_farley

  • LobsterPot Solutions in the USA

    We’re expanding!

    I’m thrilled to announce that Microsoft Gold Partner LobsterPot Solutions has started another branch, appointing the amazing Ted Krueger (5-time SQL MVP awardee) as the US lead. Ted is well-known in the SQL Server world, having written books on indexing, consulting and on being a DBA (not to mention contributing chapters to both MVP Deep Dives books). He is an expert on replication and high availability, and strong in the Business Intelligence space – vast experience which is both broad and deep.

    Ted is based in the south east corner of Wisconsin, just north of Chicago. He has been a consultant for eons and has helped many clients with their projects and problems, taking the role as both technical lead and consulting lead. He is also tireless in supporting and developing the SQL Server community, presenting at conferences across America, and helping people through his blog, Twitter and more.

    Despite all this – it’s neither his technical excellence with SQL Server nor his consulting skill that made me want him to lead LobsterPot’s US venture. I wanted Ted because of his values. In the time I’ve known Ted, I’ve found his integrity to be excellent, and found him to be morally beyond reproach. This is the biggest priority I have when finding people to represent the LobsterPot brand. I have no qualms in recommending Ted’s character or work ethic. It’s not just my thoughts on him – all my trusted friends that know Ted agree about this.

    So last week, LobsterPot Solutions LLC was formed in the United States, and in a couple of weeks, we will be open for business!

    LobsterPot Solutions can be contacted via email at contact@lobsterpotsolutions.com, on the web at either www.lobsterpot.com.au or www.lobsterpotsolutions.com, and on Twitter as @lobsterpot_au and @lobsterpot_us.

    Ted Krueger blogs at LessThanDot, and can also be found on Twitter and LinkedIn.

    This post is cross-posted from http://lobsterpotsolutions.com/lobsterpot-solutions-in-the-usa
