THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

  • In The Cloud: Azure Cosmos DB

    Azure Cosmos DB is Microsoft’s new global scale, distributed database as a service that supersedes Azure Document DB. Cosmos is a technology that enables you to create applications that are unimaginable with a single conventional relational database, even if the single relational database is stored in the cloud. Cosmos is something you need to know more about if you make your living working with data.

    Read more about Azure Cosmos DB here. If you want to try code against Cosmos and you don’t have an Azure account or don’t want to incur usage charges, you can download and install the Azure Cosmos DB Emulator. Some of the documentation refers to Document DB instead of Cosmos. Remember, Cosmos supersedes Document DB.

    image

    Figure 1. Azure Cosmos DB emulator.

  • In The Cloud: Azure Portal and HTML5 Storage Capacity Exceeded

    While preparing blog posts last week, the following message appeared in my browser window. I was in the Azure portal when the message appeared. It’s not an error with Azure or the browser. It could happen when using any website or browser. I tried again and everything was fine. The browser had many tabs open for days. A lot of things had been done in the browser.

    Because technical training is part of what I do from time to time, I saw this as a good educational opportunity. I think all technically oriented web users are aware of cookies. HTML5 has something different called localStorage and sessionStorage. If you’re interested in learning more about this, I highly recommend reading this from W3 Schools.

    image

    Figure 1. Not enough storage capacity in the browser when using the Azure portal.

    Seeing the message is a very rare event. My point in posting this is not about the message per se, but to make you aware of the modern storage features of HTML5.

  • [OT] Never Bet Against Cloud Computing

    Today at the Preakness horse racing event, Classic Empire dominated and led the race appearing to be the certain winner with Always Dreaming staying a close second and appearing to be a possible contender. At the end, Cloud Computing charged ahead leaving Classic Empire behind. Always Dreaming faded to eighth place. The victory of Cloud Computing was called an upset, but really people, had I been there, I would have put my money on Cloud Computing from the get go.

  • In The Cloud: Owner Role and Security Administration

    Azure uses Role Based Access Control (RBAC), which is something people generally don’t pay at lot of attention to when initially learning how to use the Azure portal. Take a close look at the screen capture shown below. The Delete button is disabled. There are definitely times you want to protect Azure resources from accidental deletion. That’s just basic good governance.

    image

    Figure 1. Delete button disabled for a Data Catalog resource.

    The reason that the Delete button is disabled can be understood by going to subscription management.

    image

    Figure 2. In the Azure portal, click the arrow to show more services, then select Subscriptions.

    image

    Figure 3. Notice that I was in the User Access Administrator role which doesn’t have delete privilege.

    The individual vertical sections in the Azure portal are called blades. Clicking the Add button causes the Add Permissions blade to appear. You need to select a role and then select the member(s) to add to the role.

    image

    Figure 4. The Add Permissions blade before selecting member(s) to add to the role.

    image

    Figure 5. The Add Permissions blade after selecting member(s). Click Save to make the RBAC change.

    image

    Figure 6. Notice that the Delete button is now enabled. It doesn’t immediately perform the deletion. It prompts you to make sure.

    image

    Figure 7. Prompt to make sure you really want to do this. Notice that the Delete button is disabled at this point.

    Typing the resource group name is a pain. I always move the mouse over the resource group name in the warning, double-click, copy, and paste.

    image

    Figure 8. Use copy and paste to simplify the deletion process.

    image

    Figure 9. The Delete button is enabled after entering it into the box.

    image

    Figure 10. Deletion in progress.

  • BI Beginner: Stacked Charts in Power BI

    People new to Power BI Desktop have asked me how to create a columnar chart where each column has different colors for different values stacked on top of each other. There’s no reason to be intimidated. Creating such a chart is very simple with only two additional requirements since yesterday’s port. First, your input data must classify the data into groups. You’ll either need groups specified in the data or be able to classify the data into groups based on some criteria. Today the sample data has two columns of sales data, one for sales of Widget A, and the other for sales of Widget B.

    image

    Figure 1. Sales data for Widget A and Widget B instead of total overall sales.

    Second, you must chose an appropriate visualization. Not all visualizations are suitable for stacking different data values on top of each other. You’ll need a visualization that has stacked within the visualization’s name. You want stacked data, so you need a stacked visualization. I told you this was easy!

    It is assumed that you understand the content in yesterday’s post. Those initial steps are not repeated here so that we can focus on the concept of stacked charts. What’s different from yesterday is that two columns are placed underneath Value.

    image

    Figure 2. You must drag both A Units Sold and B Units Sold to place them underneath Value.

    image

    Figure 3. Final stacked column chart.

    image

    Figure 4. You can change from a stacked column chart to a stacked bar chart by simply clicking the icon for a stacked bar chart.

    With Power BI, it’s very easy to change your visualization by clicking on a different visualization. I caution you that you do need to pay attention. Not all visualizations are equal. Take a close look at the Y-axis when you take this data and select the line chart visualization.

    image

    Figure 5. Notice that the line chart visualization isn’t stacked and has a maximum Y-axis value of 20.

    image

    Figure 6. The stacked area chart has a maximum Y-axis value of 30.

  • BI Beginner: Simple X-Y Plot

    When I worked for Microsoft giving presentations on data platform products, it was a common occurrence to have people tell me that they didn’t know how to use Power BI. This is the first in a series of posts showing how to do simple, useful tasks in Power BI Desktop. Power BI Desktop is free. Download the Power BI Desktop from here.

    You can either create a simple Excel file such as the one shown below or download the attached file to follow this tutorial. This tutorial begins with the steps to create a simple X-Y plot in Excel. Later the Excel file is used as input to Power BI.

    image

    Figure 1. Sales data for simple X-Y plot.

    It’s easy to create an X-Y plot in Excel as shown below.

    image

    Figure 2. Sample data saved in Excel to be loaded into Power BI.

    image

    Figure 3. Steps to create a simple X-Y plot in Excel.

    image

    Figure 4. Simple X-Y plot created by Excel.

    What follows is my answer to the question of how to do the same thing in Power BI. As you will see, it’s actually pretty easy. It will take you longer to read these instructions than it will take to actually do the work.

    Begin by loading your Excel file into Power BI Desktop.

    image

    Figure 5. Loading data into Power BI Desktop.

    You’ll be prompted to select your worksheet.

    image

    Figure 6. Select your worksheet and click Load.

    After your data is loaded, Power BI appears in Report View. It’s not like Excel where you’d immediately see your data values in a grid. You can see your data in a grid by clicking on Data, but that’s not what we need to do now.

    image

    Figure 7. Report View in Power BI Desktop.

    Move your mouse under Visualizations and click on the Line chart icon. This is the part where people tell me they have trouble. What to do next, what goes where?

    The Axis is the X-axis. So where to the Y-axis value belong? Under Values – where else, now that you think about it!

    image

    Figure 8. Line chart parameters.

    image

    Figure 9. Drag the Year to the spot just under Axis.

    image

    Figure 10. Drag Units Sold to the spot just under Values.

     

    image

    Figure 11. Finished line chart.

    image

    Figure 12. Line chart resized.

    It’s really easy to change the visualization type. You should experiment by clicking on various visualization to see what they look like.

    image

    Figure 13. Click on Stacked column chart to change how it looks.

    You probably want to do some formatting. Click the icon for the paint roller brush to explore the formatting options. Your formatting changes remain intact even if you change your visualization.

    image

    Figure 14. Click on the Format icon to expose the formatting properties.

    image

    Figure 15. Use the Title properties to change the title.

    image

    Figure 16. Extensive formatting just to show what is possible.

    image

    Figure 17. Notice that the formatting changes carry across different visualizations.

    Save your Power BI model when you’re done!

    The next post in this series is here where you can learn how to make stacked charts.

  • In The Cloud: The Importance of Being Organized

    People often ask me about learning how to use Azure SQL Database as well as many other Azure products. If you want to learn, you’ve got to have an Azure account. Get one for free or use your personal or corporate MSDN account.

    Where I see people struggling with Azure is in not being organized. Naming conventions are essential. Think about it. Once you go into production, you need good naming conventions and discipline enforcing them. You need to have good governance of your cloud resources starting with your early forays into learning how to work in the cloud. Regardless of who your cloud provider is, you need to have meaningful, descriptive names.

    You’re not just learning about, for example, Azure SQL Database when you create your first database in the cloud. You’re also learning and experimenting with governance. You’re figuring out what naming conventions make things better. When I started learning Azure, I tended to use suffixes to identify Azure resources. That didn’t work well over time. I switched to using prefixes, actually using what is known as Hungarian notation. What’s the best naming convention to use? The one your actually use! The point is that having some naming system is better than having none at all. So, to paraphrase Robert Duvall’s five P rule in The Killer Elite, proper planning prevents poor performance. By the way, the word affix can be used instead of saying “prefix or suffix” because an affix can be either. An affix can also appear in the body of a word, not just at either end.

    I tend to use the prefix db for Azure SQL Database, dw for Azure SQL Data Warehouse, rg for Resource Group, srvr for server. The server name is interesting. Azure resources that are exposed as URLs are lower case only. I recommend using mixed casing where possible. It makes your naming conventions easier to comprehend.

    image

    Figure 1. Using meaningful names for Azure resources.

    One last detail. When you create an Azure SQL Database, your server admin login can’t be something hackers love such as sa or administrator. But you can use essay as your login. Think about it, pronounce it out loud. Seriously though, you should consider making your server admin login something less likely to be found in a dictionary attack.

  • Visual Studio Code

    Visual Studio Code is a free, relatively new (late 2016), open source, cross platform tool for editing code. I like the Git integration. It doesn’t right out of the box support SQL, but it’s easy to add it as shown below.

    It’s really simple. Click the icon for Extensions, enter sql for your filter, then find the mssql extension and click Install.

    image

    Figure 1. Installing the Visual Studio Code extension for Microsoft SQL Server.

    The extension comes with extensive documentation, links, and animated tutorials. Click mssql in the list of extensions.

    image

    Figure 2. Click mssql to load the documentation.

    This is the direct link to the mssql extension tutorial, the same one that is the first link in the screen capture above. You should use Microsoft’s documentation. I’m just making you aware that it exists. One think I want to point out is what happens when you hold down the Ctrl key and then press K. The K is that start of a two key sequence called a chord.

    image

    Figure 3. Ctrl+K is the first of a two key sequence called a chord. Notice that is says “Waiting for second key of chord…”

    Pressing M completes the chord to display the language mode selection menu. I selected SQL. Notice SQL appears in the bottom right corner of the screen capture. It all sounds good to me, music to my ears.

    image

    Figure 4. Language mode set to SQL.

    My only point here is to make you aware of a new tool to help you get your work done from any platform.

  • Charles Bonnet Syndrome and Implications for Digital Image Processing

    Those of you who know me or have read my biography are aware that I live in two worlds, information technology and healthcare. This post is about image processing in the brain and in a computer. In particular, this discussion is about what happens when there is insufficient data. I think there are some lessons from healthcare that can be applied to information technology.

    First, some background. Charles Bonnet was a Swiss naturalist (scientist who studies the natural world) who in 1760 wrote about his 89-year old grandfather’s visual hallucinations. The grandfather had cataracts which rendered him almost blind. In other words, his retinas were sending insufficient data to his brain. He reported seeing people, animals, and objects. This syndrome was named after Charles Bonnet and is also known as visual release hallucinations. Retinas and digital imaging sensors are alike in that they generate signals that must be processed to be understood. Processing is not error free. When the retina receives less input, it’s similar to trying to make sense of a a very pixelated image.

    When discussing machine learning, we mention training a model. Data is fed into a algorithm and a model is trained. New data such a digital image is submitted to the model. If an approximate match is found, the model classifies and categorizes the new image. The brain does something similar. Consider the case of a person with retinal damage caused by age related macular generation, diabetic retinopathy, glaucoma, or some other pathology. If the person is able to see brown, white, and black next to each other, the trained model in the brain may find a match on a beagle. The brain says brown, white, and black must be a beagle so I’ll put a beagle into the person’s field of view. Putting this into information technology terms, a beagle might be algorithmically correct based on the input data, but if there isn’t a beagle present, the outcome is a false positive.

    It’s important to understand that this discussion is about visual hallucinations in a person without any psychiatric problems. In fact, the people who have this syndrome are aware that the hallucinations are not real. In other words, they have insight into what is happening. The take away is that no matter how good our digital image processing systems are, we still need oversight from a thinking human being who asks and answers the question: Does this make sense? The Indian ophthalmologist G. J. Menon published a framework for evaluating visual hallucinations. In his framework, the presence or absence of insight into the hallucination is very significant. I’m suggesting that our work isn’t done when a model is trained. We need to consider developing problem  specific frameworks for critically appraising the results of machine learning.

    It’s important to consider the consequences of incorrect processing. There are reports of people with Charles Bonnet Syndrome fretting about their hallucinations and fearing they may be losing their grip on their sanity. They may not bring it up because of a fear of no longer being allowed to live independently in case they are diagnosed as psychiatrically ill. It’s a great relief to these patients when they finally find out that their hallucinations are actually normal. How family members and healthcare professionals treat a patient is quite different when they think a person is psychiatrically normal instead of abnormal. Following procedures and jumping to easy conclusions can have devastating consequences for people.

    You can read more about Charles Bonnet Syndrome at the website of the Charles Bonnet Syndrome Foundation. There are similar auditory hallucinations in psychiatrically normal people. Once again, the presence of insight about the hallucinations is significant. We need critical human thinking skills providing the sanity checks for signal processing.

  • PolyBase Error After Uninstalling JRE

    The new PolyBase feature in SQL Server 2016 has a dependency on the 64-bit Java Runtime or JRE. It must be installed prior to installing PolyBase. If you uninstall the JRE and install a later version of the JRE, you may experience a failure of PolyBase. I won’t speculate as to what will happen if you reinstall the same version of JRE that you uninstalled.

    Here’s the error message displayed from attempting CREATE EXTERNAL TABLE after the uninstall old JRE and install new JRE was done.

    Msg 105019, Level 16, State 1, Line 63
    EXTERNAL TABLE access failed due to internal error: 'An unexpected error has occurred while creating or attaching to JVM. Details: System.DllNotFoundException: Unable to load DLL 'jvm.dll': The specified module could not be found. (Exception from HRESULT: 0x8007007E)
       at JNI.JavaVM.JNI_GetCreatedJavaVMs(IntPtr& pVM, Int32 jSize1, Int32& jSize2)
       at Microsoft.SqlServer.DataWarehouse.Hadoop.HadoopBridge.JavaBridge.LoadOrAttachJVM()'

    It’s tempting to interpret the error message as a path problem and start fiddling with environment variables. If you’ve uninstalled the JRE, you should do a repair installation of SQL Server instead of trying path changes. After the repair was done, PolyBase started working again without a reboot.

    image

    Figure 1. Select Maintenance, then Repair.

    image

    Figure 2. Select the instance to repair.

    image

    Figure 3. Click Repair.

  • In the Cloud: Free Azure Training

    Microsoft has many free Azure training courses found at https://openedx.microsoft.com/. Each course is estimated to take 16-24 hours to complete.

    Oh my, I just realized Free Azure Training as a TLA is FAT. As a nurse, I’d never tell anybody to get fat. But as an Azure architect, I want you to get FAT. Check it out and upgrade your skills!

  • In the Cloud: Visual Studio 2017

    Visual Studio 2017 has new functionality to help you build apps for both on-premise and cloud solutions. The screen capture of the Visual Studio 2017 Enterprise installer shows a different interface than the previous several versions of Visual Studio have used. Notice I selected the options for both Azure development and Data storage and processing. The default selections for each of those are shown. I ended up installing all of the options in both categories. You’ll want the PowerShell tools and Azure Storage AzCopy. Notice that some Redgate tools are included. Even the Community edition has Redgate SQL Prompt.

    image

    Figure 1. Visual Studio Enterprise 2017 installer.

    You should click on Individual components and look for other options to add to your installation.

    image

    Figure 2. Individual components default selections.

    If you uninstall Visual Studio 2015 and related components prior to installing Visual Studio 2017, it’s possible you may need to do a repair installation of SQL Server Management Studio (SSMS). It depends on whether or not you uninstalled a dependency. Keep in mind that SSMS as of SQL Server 2016 is based on the Visual Studio shell.

    When you first run SSMS after installing Visual Studio 2017, you will see a message box like the one shown below. Just click OK.

    image

    Figure 3. Normal informative message after updating Visual Studio. Just click OK.

  • FLAS

    The information technology field is full of obfuscation. There are many three letter acronyms, including the meta three letter acronym TLA. Now in the era of Big Data and Cloud, we have four letter acronyms, what I call FLAS, two of which are HTAP and ACID. People have asked me what these are, so here is an explanation.

    ACID has a new meaning. No longer is it just Atomicity, Consistency, Isolation, and Durability. It can also mean Algorithms, Cloud, IoT, and Data. HTAP is Hybrid Transactional Analytic Processing. Transactional processing systems are by default not optimized for analytical processing. Transactional processing works well when data is stored in indexed rows. That type of storage does not work well for analytic processing, which involves the aggregation of columns of data. By arranging and storing data in columns, aggregations can be done much faster. A tried and true technique for providing fast analytical processing and fast transactional processing is to have ETL or ELT processes to periodically build columnar OLAP storage from row oriented OLTP storage. The disadvantage is the latency introduced.

    HTAP solutions provide both OLAP and OLTP storage allowing analytical processing on current transactional data without slowing down the transactional processing.

    Of course I’m writing about FLAS because my job title as of this writing is DPSA, Data Platform Solution Architect.

  • Role Based Access Control in Azure

    From talking to customers and people new to Azure SQL Database, it is apparent that many people are unaware of how Role Based Access Control (RBAC) in Azure affects Azure SQL Database. Last week I gave a presentation on this at SQL Saturday and discussed it with customers at the SQL PASS Summit. This post is a brief summary and provides links to the official documentation.

    Definitions

    RBAC - Role Based Access Control

    IAM - Identity and Access Management

    CLI - Command Line Interface

    TDETransparent Data Encryption

    NOTE: Actions described as being performed in the Azure portal can also be accomplished via scripting with PowerShell, the Azure CLI, or the Azure REST API. Actions described as being accomplish in SSMS can also be scripted with T-SQL.

    Get started with access management in the Azure portal is where I suggest you start reading. Keep in mind that both Azure SQL Database and Azure SQL Datawarehouse have their Azure roles at the Azure database server level. An Azure database server is really a logical construct, a container of databases that exists for administrative purposes.

    image

    Figure 1. Adding a role in the Azure portal.

    Pay particular attention to Owner vs. Contributor. If you put a user in the Contributor role, the user cannot grant permissions to other users. That’s the difference between the two roles.

    Although you can administer Azure roles through the Azure portal, eventually for production use you’ll want to script everything so you can have repeatable processes. You’ll need Azure PowerShell (Windows users) or the Azure CLI (cross platform and Docker). You can also use the REST API. See Managing Role-Based Access Control with the REST API, which also includes links to PowerShell and Azure CLI instructions.

    Use role assignments to manage access to your Azure subscription resources

    Documentation on Azure roles can be found at RBAC: Built-in roles. Notice there are three roles that begin with SQL:

    · SQL DB Contributor

    · SQL Security Manager

    · SQL Server Contributor

    image

    Figure 2. SQL roles in Azure portal.

    It’s important to understand that these are Azure roles. They are limited in scope to actions that can be performed in the Azure portal. The Storage Account Contributor role in the Azure portal is relevant to Azure SQL Database if you have auditing enabled.

    The Azure portal is for administering Azure resources. SQL Server Management Studio (SSMS) or equivalent T-SQL scripts are used for administering the contents of Azure SQL Database. There is very little overlap between what can be done in the Azure portal and in SSMS. TDE can be enabled in either, see Transparent Data Encryption with Azure SQL Database. Dynamic Data Masking can be enabled in either although it is more flexible in SSMS. You can create an Azure SQL Database from either the Azure portal or SSMS.

  • Big Data V’s

    Originally there were three V’s of Big Data. Doug Laney introduced Volume, Velocity, and Variety as the three V’s of data management in 2001 when he worked for the Meta Group, which was purchased by Gartner in 2005. He did not use the term Big Data in his 2001 publication.

    IBM uses the same three V’s and added a fourth B of Veracity no later than 2013. The author Bernard Marr added a fifth V of Value. He describes his five V’s in these slides. Rob Livingstone added Validity and Visibility. In 2013, Doug Laney made the distinction between definitional and aspirational qualities of data. According to Doug, only Volume, Velocity, and Variety are definitional. All other V words are aspirational.

    Mike Gualtieri of Forrester in 2012 asserted that Doug’s three V’s are not actionable. Mike offered what he calls a pragmatic definition of Store, Process, and Access.

    Neil Biehn when still at PROS (now at Siemens) stated that the fourth and fifth V’s are Viability and Value.

    If you are willing to have more than three V’s, Visualization is an obvious V to add and multiple authors have written about it. It’s hard to make sense of Big Data and difficult to derive value or take action if you can’t see what the data is telling you.

    I think people should use as few or as many V’s as they find helpful. I will not go so far as to formally propose another V, but I will state that Vexing sometimes fits Big Data. It can be vexing because of the technical issues of Doug’s three V’s and it can be vexing to interpret and act upon.

This Blog

Syndication

Privacy Statement