<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tags 'Development' and 'Best Practices'</title><link>http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=Development,Best+Practices&amp;orTags=0</link><description>Search results matching tags 'Development' and 'Best Practices'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>Preparation is key to a successful cloud deployment</title><link>http://sqlblog.com/blogs/buck_woody/archive/2012/05/01/preparation-is-key-to-a-successful-cloud-deployment.aspx</link><pubDate>Tue, 01 May 2012 13:09:16 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:43122</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;If you want to be wise, watch the actions and outcomes of others. Emulate the successful actions, and avoid the actions that cause failure. That’s true in life in general - and in technology projects in specific.&amp;#160; &lt;/p&gt;  &lt;p&gt;I’ve worked with several clients who have created or migrated an application to “the cloud” - meaning using Microsoft Windows Azure or another provider. Although the statement in the title of this post is trite, I cannot over-emphasize how accurate it is. In every case of those who had a great experience with a distributed computing environment (which is thankfully the vast majority of my projects), &lt;/p&gt;  &lt;p&gt;What kind of preparation do you need to do? Here are some tips I’ve learned in the successful (and not-so-successful) deployments I’ve seen: &lt;/p&gt;  &lt;h3&gt;Follow standard recommendations for successful projects in general &lt;/h3&gt;  &lt;p&gt;You and your organization have probably done a few projects before - this one should have the same general attributes: a well-defined goal, a small, motivated team, a realistic timeline, and an adequate budget. I know, I know, you *never* seem to get those things - but if you don’t, you’ll fail. Simple as that. &lt;/p&gt;  &lt;h3&gt;Educate yourself&lt;/h3&gt;  &lt;p&gt;Computing technology started out on a single set of hardware for a single purpose - and realizing the limits of the hardware at hand, systems designers quickly realized that scale-out and virtualization was key. No, that’s not new - mainframes almost always worked on the concept of scale-out and virtual machines. But we switched in the 1980’s to single-user systems again, and we’ve been there ever since. By that I mean you install an OS on the things you work on. Now we move back to distributed system concepts, and there are some real differences. You’ll need to learn how those work, and do things a new way. Hey, we’re IT - we LOVE learning new things, right? &lt;/p&gt;  &lt;h3&gt;Get a partner if needed&lt;/h3&gt;  &lt;p&gt;There are a few of us white-haired Gandalf’s around that remember how to work in a distributed system, but if it’s new to you, that’s completely OK. You can save yourself a world of trouble by working with someone who’s done this before - a partner you hire, someone from Microsoft Consulting, whatever. &lt;/p&gt;  &lt;p&gt;And don’t forget support - who will handle each issue, what is the escalation model, who are your contacts at Microsoft, and what is your “light’s out” strategy?&lt;/p&gt;  &lt;p&gt;“A new broom sweeps clean”, the old adage goes, but the old brooms know where the dirt is. &lt;/p&gt;  &lt;h3&gt;Build a model&lt;/h3&gt;  &lt;p&gt;Take some time to do a Proof of Concept on your local system and using your Azure hours from your MSDN account if you have one. Going through this build - and being willing to throw it away and try it a different way - is invaluable. &lt;/p&gt;  &lt;h3&gt;Test your theories&lt;/h3&gt;  &lt;p&gt;Three statisticians are walking in a field. They see a rabbit - the first guy raises his gun, firing far in front of the rabbit. The second guy simultaneously raises his gun and fires far behind the rabbit. The third guy yells “We got him!”&lt;/p&gt;  &lt;p&gt;Not every theory is correct - not every attempt is the right one. Build in your success tests while you’re building your model. Then check them - don’t leave this step out. &lt;/p&gt;  &lt;h3&gt;Rinse, lather, repeat&lt;/h3&gt;  &lt;p&gt;This is advice from a shampoo bottle - which I’ve never used (I don’t really have that much hair - especially now). But in a “Cloud” project, it’s important. It’s an evolving system, that gains new improvements at an amazing rate. As soon as you deploy and stabilize you need to start the process over again. If you created your system in a Services model, with contracts for the APIs and abstracted code, this is far easier. &lt;/p&gt;  &lt;p&gt;It’s not hard to do a cloud project right. But it’s really simple to do it wrong. Follow these guidelines and you’ll learn from the successes - and mistakes - of others. &lt;/p&gt;</description></item><item><title>Cloud Computing Patterns: Using Data Transaction Commitment Models for Design</title><link>http://sqlblog.com/blogs/buck_woody/archive/2012/02/14/cloud-computing-patterns-using-data-transaction-commitment-models-for-design.aspx</link><pubDate>Tue, 14 Feb 2012 20:45:47 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:41744</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;There are multiple ways to store data in a cloud provider, specifically around Windows and SQL Azure. As part of a &amp;ldquo;Data First&amp;rdquo; architecture design, one decision vector &amp;ndash; assuming you&amp;rsquo;ve already done a data classification of the elements you want to store &amp;ndash; is to decide the transaction level you need for that datum.&amp;nbsp; Once you&amp;rsquo;ve decided on what level of transactional commitment you need, you can make intelligent decisions about the storage engine, method of access and storage, speed and other requirements.&lt;/p&gt;
&lt;p&gt;Although the list below is neither original nor exhaustive, these are the general considerations I use for a given data set. It&amp;rsquo;s important to note that in many on premises systems the engine choice at hand overrides these concerns. If you have a large Relational Database Management System (RDBMS) for instance, you might simply place all data there without further consideration. In a Platform as a Service (PaaS) like Windows and SQL Azure, however, selection of the proper engine for a particular dataset has implications ranging from cost to performance, and selecting the right engine is critical when you want to leverage the data across &amp;ldquo;Bid Data&amp;rdquo; analysis like Hadoop or other constructs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Monolithic Consistent Transactional&lt;/strong&gt;&lt;br /&gt;The first selection is analogous to a local RDBMS system. The dataset is retrieved in a functionally single, monolithic transaction, i.e. kept together with ACID properties in mind. This is the most reliable type of data design for datasets that require a high degree of safety in the read/write pattern. As an example, a bank ATM transaction should be modeled in a monolithic way. If I make a transfer of funds from one account to another, I want the money to be subtracted from one account if and only if it is successfully added to the other. The bank, on the other hand, wants the money added to the second account if and only if it is subtracted from the first. This is a prime example of a monolithic (single atomic transaction), Consistent (if and only if) and Transactional (as a unit, with provision for roll-back and reporting if unsuccessful) data requirement.&lt;/p&gt;
&lt;p&gt;The primary engine used for this type of data is often SQL Azure &amp;ndash; an RDMBS in the same datacenters as Windows Azure. Placing both the calling application, whether that is a Data Access Layer-based code widget or a direct call from a Web or Worker Role, means that data is retrieved quickly and in a monolithic way. The costs for this method is based on overall database size.&amp;nbsp; A consideration is how much data you can store this way. Database sizes have limits, although there are ways of overcoming size issues using technologies such as Sharding or SQL Azure Federations. There is also the consideration of performance. In an RDBMs that conforms to ACID properties, locking and other overhead for safety is at conflict with the highest possible read performance.&amp;nbsp; But in some cases the ACID properties are worth the cost, as in the banking example.&lt;/p&gt;
&lt;p&gt;You are not limited to SQL Azure in this model. Windows Azure Table storage, while similar to NoSQL offerings is different in that it is immediately consistent across all three replicated copies of data, offering a higher degree of safety. And while Table storage does not offer built-in support for transactions, there are ways to achieve certain transaction levels.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Monolithic Realtime&lt;/strong&gt;&lt;br /&gt;If consistency can be relaxed &amp;ndash; meaning that a guaranteed read/write patter is not essential &amp;ndash; then more options arise in Windows and SQL Azure. You can still use SQL Azure for this type of storage, with either automatic or programmatic hints allowing for &amp;ldquo;dirty reads&amp;rdquo;. Windows Azure Table storage is still consistent, but the selection of the method for querying the data such as separate copies of read and write data can be employed. Because of the relaxed transaction nature, higher speeds are possible by querying cached or separate datasets.&lt;/p&gt;
&lt;p&gt;An example here is that same transaction from the bank, but a statement inquiry. Just after the money is deposited, the user wishes to query the current balance. The current balance &amp;ndash; minus the transaction that just occurred &amp;ndash; is retrieved and shown to the user, perhaps even combining the amount with the latest transaction, perhaps saved as a local cached object, with a caveat to the user.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Distributed Realtime&lt;/strong&gt;&lt;br /&gt;At some point, the data becomes too large to fit inside a single processing session, and parallelism is used. In this case, either separate databases in SQL Azure or Windows Azure Tables, local data storage on the Web or Worker Role, or a combination of all with Caches is the right approach for the data design.&lt;/p&gt;
&lt;p&gt;The biggest implication in this type of system is speed &amp;ndash; a higher degree of data separation is essential, and so the dataset selection must fit the pattern. It is unacceptable to force an ACID-properties type workload into this environment. Typical examples here are the actual data asset payload for streaming video or music, read-only documents and so on. This pattern is often separated from the meta-data, which is kept in more of a transactional model.&lt;/p&gt;
&lt;p&gt;As an example, assume you log on to a website to watch a movie or listen to music. The provider needs to verify your identity and account balance, which are transactional data loads. After that process is complete, the workload shifts to a copy &amp;ndash; perhaps one of several &amp;ndash; of the asset to stream to your location.&lt;/p&gt;
&lt;p&gt;In this case, Windows Azure Blob storage, along with the Content Delivery Network (CDN &amp;ndash; a series of servers closer to the user) is employed along with the transactional realtime requirements for the metadata.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Distributed Eventual&lt;/strong&gt;&lt;br /&gt;At the furthest end of the data scale are large datasets that need deeper analysis, but not necessarily in realtime. Examples here are terrabytes of data requiring a Business Intelligence view, but with a tolerance of a few seconds to minutes or hours. In this case, Storage, Processing and Query methods, such as the Hadoop offering in Windows Azure, or perhaps the High Performance Computing (HPC) Windows Server in Windows Azure fit well.&amp;nbsp; Here, the design of the data is often dictated by the source, and more emphasis is placed on the algorithms around processing and re-assembling the data.&lt;/p&gt;
&lt;p&gt;There are, of course, other patterns. In many cases a single dataset may have needs in one or more of these categories &amp;ndash; in fact, sitting at 30,000 feet typing this entry, I&amp;rsquo;m having that very design discussion with a gentleman sitting next to me. The key is to design data-first, and fit the technology to the requirement for each datum. Allow each function and engine to handle the data in the most efficient, effective way for cost, performance and utility.&lt;/p&gt;</description></item><item><title>Application Lifecycle Management Overview for Windows Azure</title><link>http://sqlblog.com/blogs/buck_woody/archive/2012/02/07/application-lifecycle-management-overview-for-windows-azure.aspx</link><pubDate>Tue, 07 Feb 2012 14:58:39 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:41593</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;Developing in Windows Azure is at once not that much different from what you’re familiar with in on-premises systems, and different in significant ways. Because of these differences, developers often ask about the specific process to develop and deploy a Windows Azure application - more formally called an Application Lifecycle Management, or ALM. &lt;/p&gt;  &lt;p&gt;There are specific resources you can use to learn more about various parts of ALM - I’ve referenced those at the end of this post. But ALM has multiple definitions, from the governance of code injection, domain upgrade, testing, process flow and more. Many developers are interested in the finer-grained information, like how do I develop and deploy an application? What tools do I need, and how do I get the code running somewhere that I can test? &lt;/p&gt;  &lt;p&gt;I’ll cover the very high-level process here, and refer you to specifics at the end of each section, so that you can take it all in at one viewing, and then bookmark for more detail when you need more information. I won’t be covering processes like Continuous Integration or Agile and other methodologies in this post - I’ll blog those later. &lt;/p&gt;  &lt;h2&gt;Initial Development&lt;/h2&gt;  &lt;p&gt;You start with writing code. You have three ways to do this. You can use Visual Studio (even the Express Edition Works), Eclipse, or by &lt;a href="https://www.ibm.com/developerworks/webservices/library/ws-restful/" target="_blank"&gt;leveraging the REST API format&lt;/a&gt;. You can do this in a standalone (non-connected) environment like your laptop. &lt;/p&gt;  &lt;p align="left"&gt;Using Visual Studio is one of the simplest methods to create an Azure application, allowing you to combine the Azure components you want to leverage (Storage, Compute, SQL Azure, the Service Bus, etc.) along with the on-premises code you have now or are creating. Once you’ve installed and patched Visual Studio, just download and install the Windows Azure Software Development Kit (SDK) and you’ll have not only all the API’s you need to talk to Azure, but a fully functioning local environment to run and test your code before you deploy it. You’ll also get a robust set of samples. You can download what you need for all of that (free) here: &lt;a href="http://www.windowsazure.com/en-us/develop/downloads/"&gt;http://www.windowsazure.com/en-us/develop/downloads/&lt;/a&gt; . There’s a step-by-step process here: &lt;a href="http://msdn.microsoft.com/en-us/magazine/ee336122.aspx"&gt;&lt;u&gt;&lt;font color="#0066cc"&gt;http://msdn.microsoft.com/en-us/magazine/ee336122.aspx&lt;/font&gt;&lt;/u&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;You can also use Eclipse to develop for Windows Azure. You won’t get the full runtime environment in just that kit alone, but you can use this successfully on a Linux system. I have several folks using this method. The downloads and documentation for that is here: &lt;a href="http://www.windowsazure4e.org/"&gt;&lt;u&gt;&lt;font color="#0066cc"&gt;http://www.windowsazure4e.org/&lt;/font&gt;&lt;/u&gt;&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;You can use REST API’s to hit Azure Assets and control them. Not my preferred method, but possible. There are REST API’s for various sections of Azure. You can find the main reference for that here: &lt;a href="http://msdn.microsoft.com/en-us/library/windowsazure/ff800682.aspx"&gt;http://msdn.microsoft.com/en-us/library/windowsazure/ff800682.aspx&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;&lt;font color="#9bbb59"&gt;&lt;font color="#c0504d"&gt;&lt;strong&gt;&lt;em&gt;Note: &lt;/em&gt;&lt;/strong&gt;We recently demonstrated using a Cloud-based Integrated Development Environment (IDE) for Node.js deployment to Windows Azure. More on that here:&lt;/font&gt; &lt;/font&gt;&lt;a href="http://www.readwriteweb.com/cloud/2012/01/cloud9-ide-to-enable-nodejs-ap.php"&gt;&lt;u&gt;&lt;font color="#0066cc"&gt;http://www.readwriteweb.com/cloud/2012/01/cloud9-ide-to-enable-nodejs-ap.php&lt;/font&gt;&lt;/u&gt;&lt;/a&gt; &lt;/p&gt;  &lt;h2&gt;Deploying to a Test Instance&lt;/h2&gt;  &lt;p&gt;After you write the code, you’ll need to test it somewhere. The Azure Emulator on your development laptop is for a single user on that laptop, and it also has some subtle differences from the production fabric as you might imagine. Normally you’ll set up a small subscription to run and test the application, just like you would have a set of test servers. Each subscription has its own management keys and certificates, so this assists in keeping the testing environment separate for billing and control. &lt;/p&gt;  &lt;p&gt;More on that general information here: &lt;a href="http://msdn.microsoft.com/en-us/library/ff803362.aspx"&gt;http://msdn.microsoft.com/en-us/library/ff803362.aspx&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;h2&gt;Deploying to Production&lt;/h2&gt;  &lt;p&gt;Once you have developed the code and tested it, you need to move it to a location where users can access it. In reality, there is no physical difference in the type of machines, fabric or any other component in “Production” Windows Azure accounts and the “Test” accounts, but you’ll most often pick smaller systems to deploy on in testing, and you’ll probably keep the URL in the plain format.&lt;/p&gt;  &lt;p&gt;In the Production Windows Azure account, the team normally limits the access to the account for deployment to a separate set of developers. This ensures code flow and control. A DNS name is normally mapped to the longer, Microsoft-generated URL so that your users access the application or data the way you want them to. &lt;/p&gt;  &lt;p&gt;More on setting up an account here: &lt;a href="http://techinch.com/2010/06/14/setup-your-windows-azure-account/"&gt;http://techinch.com/2010/06/14/setup-your-windows-azure-account/&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;h2&gt;Managing Code Change&lt;/h2&gt;  &lt;p&gt;With the application deployed, there are two broad tasks you need to consider. One is managing changes through the application, and the other involves management, monitoring and performance tuning for an application.&lt;/p&gt;  &lt;p&gt;To make a code change, the standard ALM process is followed, just as above. You can use command-line tools to automate the process as you would with an on-premises system. A vide on that shows you how: &lt;a href="http://www.microsoftpdc.com/2009/SVC25"&gt;http://www.microsoftpdc.com/2009/SVC25&lt;/a&gt;. Normally this is used with an “In-Place” upgrade into Production Account, since your testing is completed in a separate account. More on that process here: &lt;a href="http://msdn.microsoft.com/en-us/library/windowsazure/ee517255.aspx"&gt;http://msdn.microsoft.com/en-us/library/windowsazure/ee517255.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;One difference is the “VIP Swap” process you can use for the final push to Production. In essence, this allows you to have two copies of the application running on the Production account, with a quick way to cut over and back when you’re ready. The process for that is detailed here: &lt;a href="http://msdn.microsoft.com/en-us/library/windowsazure/ee517253.aspx"&gt;http://msdn.microsoft.com/en-us/library/windowsazure/ee517253.aspx&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;For monitoring, you have several options. You should enable the Windows Azure Diagnostics in your code - more on that here: &lt;a href="http://archive.msdn.microsoft.com/WADiagnostics"&gt;http://archive.msdn.microsoft.com/WADiagnostics&lt;/a&gt;. &lt;/p&gt;  &lt;p&gt;You can observe uptime and other information on the Windows Azure Service Dashboard, where you can also consume the uptime as an RSS feed: &lt;a href="http://www.windowsazure.com/en-us/support/service-dashboard/"&gt;http://www.windowsazure.com/en-us/support/service-dashboard/&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;From there, you can also use System Center to monitor not only Windows Azure deployments but internal applications as well. The Management Pack and documentation for that is here: &lt;a href="http://www.microsoft.com/download/en/details.aspx?id=11324"&gt;http://www.microsoft.com/download/en/details.aspx?id=11324&lt;/a&gt;. &lt;/p&gt;  &lt;p&gt;There are also 3rd-party tools to manage Windows Azure. More on that here: &lt;a href="http://www.bing.com/search?q=monitor+Windows+Azure&amp;amp;form=OSDSRC"&gt;http://www.bing.com/search?q=monitor+Windows+Azure&amp;amp;form=OSDSRC&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;h3&gt;Other References: &lt;/h3&gt;  &lt;p&gt;There is a lot more detail in this official reference: &lt;a href="https://www.windowsazure.com/en-us/develop/net/fundamentals/deploying-applications/"&gt;https://www.windowsazure.com/en-us/develop/net/fundamentals/deploying-applications/&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;Bryan Group explains the ramifications of the Secure Development Lifecycle (SDL) with lots of collateral you can review: &lt;a href="http://blogs.msdn.com/b/bryang/archive/2011/04/26/applying-the-sdl-to-windows-azure.aspx"&gt;http://blogs.msdn.com/b/bryang/archive/2011/04/26/applying-the-sdl-to-windows-azure.aspx&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Rip and Replace or Extend and Embrace?</title><link>http://sqlblog.com/blogs/buck_woody/archive/2011/09/13/rip-and-replace-or-extend-and-embrace.aspx</link><pubDate>Tue, 13 Sep 2011 11:20:05 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:38437</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;As most of you know, I don&amp;rsquo;t like the term &amp;ldquo;cloud&amp;rdquo; very&lt;br /&gt;much. It isn&amp;rsquo;t defined, which means it can be anything. I prefer &amp;ldquo;distributed&lt;br /&gt;computing&amp;rdquo;, which is more technically accurate and describes what you&amp;rsquo;re doing&lt;br /&gt;in more concrete terms.&lt;/p&gt;
&lt;p&gt;So when you think about Windows and SQL Azure, you don&amp;rsquo;t&lt;br /&gt;have to think about an entire product &amp;ndash; you can use parts of the system&lt;br /&gt;together or independently to accomplish what you need to do. You can use the&lt;br /&gt;computing functions, storage, and more and more I see folks leverage the&lt;br /&gt;Service Bus to enable current applications to expose things to the web.&lt;/p&gt;
&lt;p&gt;And that brings up the point of this post. Once you decide&lt;br /&gt;that a distributed architecture works to solve a problem, you&amp;rsquo;re faced with a&lt;br /&gt;decision: should you completely re-write your architecture to take advantage of&lt;br /&gt;the current systems or should you just fold in new code that makes the data or&lt;br /&gt;function available to the web?&lt;/p&gt;
&lt;p&gt;Of course, the answer is always &amp;ldquo;it depends&amp;rdquo; on the situation&lt;br /&gt;&amp;ndash; and it does. But unless you&amp;rsquo;re fixing a problem with current code, I usually&lt;br /&gt;advocate a migration approach. That means at the very least retaining the&lt;br /&gt;business logic (again, unless it&amp;rsquo;s not currently working) and as much of the&lt;br /&gt;code as you can. In fact, if you follow this paradigm, you&amp;rsquo;re on your way to&lt;br /&gt;making a Service Bus out of the functions you currently have. You can expose&lt;br /&gt;the results of a system rather than opening the system up. Let&amp;rsquo;s take an&lt;br /&gt;example.&lt;/p&gt;
&lt;p&gt;Assume for a moment that you have an order-taking system&lt;br /&gt;on-premise. That system performs many functions, one of which might creating a&lt;br /&gt;Purchase Order. Your system might be enclosed, meaning that it has an&lt;br /&gt;application that talks to a middle-tier, and then from there to a database&lt;br /&gt;system. A query is generated from a screen, and passed along to eventually&lt;br /&gt;compute, store and return a Purchase Order Number, along with other&lt;br /&gt;information. Imagine now that you wire up the code not only to return the PO&lt;br /&gt;number to the client, but to make that number available on an endpoint &amp;ndash;&lt;br /&gt;actually really not that hard to do.&lt;/p&gt;
&lt;p&gt;Now you can make that PO number available to the web using&lt;br /&gt;Azure. You could restrict who can make that call to the system, or open it up&lt;br /&gt;to a broader audience. Or instead of the PO Number, you could make a product&lt;br /&gt;list available. And you can go further than that &amp;ndash; EBay, for instance, uses the&lt;br /&gt;OData protocol (which is very cool in and of itself) which you can query from&lt;br /&gt;the web. You could compare your company&amp;rsquo;s product catalog to what is on EBay,&lt;br /&gt;and list the items you have there if there are no competitors in that space.&lt;br /&gt;And on and on it goes.&lt;/p&gt;
&lt;p&gt;So the point is this &amp;ndash; where you can, retain what works.&lt;br /&gt;Fold in systems like Azure where they make sense. Extend and Embrace.&lt;/p&gt;</description></item><item><title>Plan for Diagnostics in Cloud Computing From the Git-Go</title><link>http://sqlblog.com/blogs/buck_woody/archive/2011/09/06/plan-for-diagnostics-in-cloud-computing-from-the-git-go.aspx</link><pubDate>Tue, 06 Sep 2011 13:11:22 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:38295</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;“Git-Go” is something we say in the South that means “right at the start”. I’ve seen several applications for on-premise systems that don’t have much in the way of diagnostics - the developers rely on a debugger, the event logs on the server and client workstation, and most of all, the ability to watch the system from end-to-end. &lt;/p&gt;  &lt;p&gt;This approach is a mistake for an on-premise system, and it’s definitely a problem for a distributed architecture. You simply do not own all of the components from end to end in a cloud environment, nor are you always able to attach a debugger or other remote monitoring tools to the various areas within the code path. So you need to make sure that from the very outset of your design that you build in diagnostics. My personal preference is to build a system such that a control file turns on deeper information gathering from the system, up to a minimal level.&lt;/p&gt;  &lt;p&gt;When I do that, I set a high level of logging, a medium level, and a moderate level. I normally use the deepest level of information during the testing and acceptance phase of the deployment, then switch to moderate and then the least level of information gathering. Also in my design I often set an error condition to begin gathering the deeper information along with the exception, where possible.&lt;/p&gt;  &lt;p&gt;There are decisions you need to make as to where to store the diagnostics (many operations in the cloud cost money), how often you collect them, and so on. You can get a quick overview on using the diagnostics that come with Windows Azure here: &lt;a href="http://www.azuresupport.com/2010/03/getting-started-with-windows-azure-diagnostics-and-monitoring/"&gt;http://www.azuresupport.com/2010/03/getting-started-with-windows-azure-diagnostics-and-monitoring/&lt;/a&gt; This is where you should start first. More detail on that: &lt;a href="http://msdn.microsoft.com/en-us/library/gg433048.aspx"&gt;http://msdn.microsoft.com/en-us/library/gg433048.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;My friend Dave Pallman has a great tool he’s released for free: &lt;a href="http://davidpallmann.blogspot.com/2009/03/azure-application-monitor-now-on.html"&gt;http://davidpallmann.blogspot.com/2009/03/azure-application-monitor-now-on.html&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;If the issue is in storage apps: &lt;a href="http://social.msdn.microsoft.com/Forums/en-US/windowsazuredata/thread/d84ba34b-b0e0-4961-a167-bbe7618beb83"&gt;http://social.msdn.microsoft.com/Forums/en-US/windowsazuredata/thread/d84ba34b-b0e0-4961-a167-bbe7618beb83&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;If you have System Center, this is the quickest and easiest way to implement the monitoring – really handy: &lt;a href="http://pinpoint.microsoft.com/en-us/applications/windows-azure-application-monitoring-management-pack-release-candidate-12884907699"&gt;http://pinpoint.microsoft.com/en-us/applications/windows-azure-application-monitoring-management-pack-release-candidate-12884907699&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Cloud Computing and the Importance of Code Diagrams</title><link>http://sqlblog.com/blogs/buck_woody/archive/2011/05/03/cloud-computing-and-the-importance-of-code-diagrams.aspx</link><pubDate>Tue, 03 May 2011 13:59:20 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:35407</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;Most mature development shops use various code diagrams to give a symbolic representation of high-level and database code structures. Standards such as &lt;a href="http://www.bpmb.de/images/BPMN2_0_Poster_EN.pdf" target="_blank"&gt;Business Process Model Notation&lt;/a&gt; (BPMN), &lt;a href="http://www.informit.com/guides/content.aspx?g=sqlserver&amp;amp;seqNum=62" target="_blank"&gt;Entity Relationship Diagrams&lt;/a&gt; (ERD) and the &lt;a href="http://uml.org/" target="_blank"&gt;Unified Modeling Language&lt;/a&gt; (UML) are a few I use all the time. &lt;/p&gt;  &lt;p&gt;In the Distributed Computing (Cloud Computing) paradigm, these three diagrams (or their equivalent) become essential. In the past, I’ve been able to rely on a single architecture where my code will run. I understand the servers, the networking and the path the code takes between the client and the components within that architecture.&lt;/p&gt;  &lt;p&gt;With Distributed Computing (DC), the architecture changes. In fact, the reason I use the term “Distributed Computing” instead of “Cloud Computing” most often (except in the title of this post, as you can see) is that I feel it’s more technically accurate about how we write code. I don’t view DC coding as an “all or nothing” exercise – I view it as just another option to solve a computing problem. A “hybrid” approach, where I mix in the strengths of a cloud provider is often a great way to leverage the best cost, performance and other advantages of each part of your solution. It can also help keep data secure, provide options for High Availability and Disaster Recovery, and more.&lt;/p&gt;  &lt;p&gt;To gain these advantages, we have to think more about the components of the application rather than a monolithic stack of components in a single architecture. And that brings us to the title of this post…&lt;/p&gt;  &lt;p&gt;For us to correctly identify code components, database objects, security paths and other elements, we have to be able to conceptualize them. And that’s where those diagrams come into play. Starting with some sort of business or organizational need, we can use BPMN or UML Actor diagrams to explain what the program needs to do. That helps segregate the security and location requirements. For instance, if&amp;#160; the BPMN shows a data access to Private Information, we can evaluate the need for an on-premise system that is federated to a DC provider. If the business users need global access, we can decide whether to set up a VPN to allow access to an on-premise system or whether a login component can be used on the web.&lt;/p&gt;  &lt;p&gt;After determining the flow of the program, move on to the data the system will store. In the case of Windows and SQL Azure, there are several options for storing data. In the past, I’ve often selected a single storage type, such as an RDBMS, and stored program data there. Now we can store in multiple formats, in multiple locations and more. The ERD is pivotal, because it defines data types, which can help decisions around where things go. Another important aspect to the data decision which is not covered in an ERD (but perhaps should be) is the estimated size and growth of a datum, since that can also drive the decision on where to put a data component.&lt;/p&gt;  &lt;p&gt;From there, the UML document helps me understand where each computing element can live. There are strengths for each type of computing, and using the UML diagram I can place each code component in the best environment for speed, security and other considerations.&lt;/p&gt;  &lt;p&gt;So in the new Distributed Computing world, these graphical documents do much more than just help design the application – they can help define the architecture as well.&lt;/p&gt;</description></item><item><title>SQL Server Best Practices: Protect CmdExec</title><link>http://sqlblog.com/blogs/buck_woody/archive/2009/12/03/sql-server-best-practices-protect-cmdexec.aspx</link><pubDate>Thu, 03 Dec 2009 15:49:34 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:19482</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;In SQL Server, there are times that you need to do things in the operating system, and to allow that there is a feature called CmdExec. This is not always a good thing –whenever you leave the confines of SQL Server and go out to the operating system, you can cause issues, not the least of which are security-related.&lt;/p&gt;  &lt;p&gt;This best practice is primarily aimed at SQL Server 2000 – in SQL Server 2005 and higher, you’ll have these as job step types in SQL Server Agent (or ActiveX). What you should to do is ensure that only the sysadmins role can run CmdExec job steps.&lt;/p&gt;  &lt;p&gt;In SQL Server 2005 and higher, you should use other methods to work with the operating system, such as SQL CLR or PowerShell to handle that with better safety and security. &lt;/p&gt;</description></item><item><title>What is a physical database?</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2009/06/11/what-is-a-physical-database.aspx</link><pubDate>Thu, 11 Jun 2009 22:26:36 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:14601</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;A bit of terminology that gets beaten to death is that of the “physical” database.&amp;#160; I would think most every DBA uses this term (I do), but…to mean what?&amp;#160; I think there are two common utilizations:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;The layer of tables, constraints, indexes, etc used to store data &lt;/li&gt;    &lt;li&gt;The actual on-disk structures. &lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;Frankly, until 3 years ago, I used the first interpretation.&amp;#160; However, I was beaten up pretty badly by a few people whom I don’t really remember (I think &lt;a href="http://www.simple-talk.com/author/anith-sen/" target="_blank"&gt;Anith Sen&lt;/a&gt; was one of them.)&amp;#160; The problem is, I was scolded, &lt;strong&gt;“physical”&lt;/strong&gt; already had a meaning, given it by the “founder” himself, EF Codd. &lt;/p&gt;  &lt;p&gt;So, checking his 12 Rules, Codd stated the following two things:&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Rule 8:&lt;/b&gt; &lt;i&gt;Physical data independence&lt;/i&gt;: &lt;/p&gt; Changes to the physical level (how the data is stored, whether in arrays or linked lists etc.) must not require a change to an application based on the structure.   &lt;p&gt;&lt;b&gt;Rule 9:&lt;/b&gt; &lt;i&gt;Logical data independence&lt;/i&gt;: &lt;/p&gt; Changes to the logical level (tables, columns, rows, and so on) must not require a change to an application based on the structure. Logical data independence is more difficult to achieve than physical data independence.   &lt;p&gt;And actually, the implementation layer really is the logical model if you follow his terminology since his rules were pertaining to the relational model and not the entire design process.&amp;#160; This article says it better than I can in a long blog, but I am not sure about that URL (mac.com?): &lt;/p&gt;  &lt;p&gt;&lt;a href="http://homepage.mac.com/s_lott/iblog/architecture/C465799452/E20080301143528/"&gt;http://homepage.mac.com/s_lott/iblog/architecture/C465799452/E20080301143528/&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;The physical layer of a relational database occurs down at the file system level.&amp;#160; Codd's &amp;quot;Rule 8&amp;quot; (Physical Data Independence) says that the things we're designing in ERwin (and similar tools) are the things our application depends on.&amp;#160; These are not physical in nature, but are the relational implementation.&amp;#160; &lt;/p&gt;  &lt;p&gt;So the thing I am trying to say is that physical means that a little 5 volt charge is sitting there representing a bit of data in the physical world.&amp;#160; I like the term logical to mean implementation platform non-specific.. The thing in the middle is the SQL Server/Relational&amp;#160; implementation specific model.&amp;#160; It may take liberties to optimize for SQL Server, but it is not physical. That is were partitioning. indexing, filegroups, etc come in. Changes to this layer ought never be noticable by the application.&amp;#160; &lt;/p&gt;  &lt;p&gt;I guess in the comments, I ought to expect a good number of replies that might start to answer the question.&amp;#160; Does it matter? Is it only semantics? Hey if you don’t think semantics matter, I hope that when you find yourself drowning that the person who has the choice of tossing you a life preserver or a sack of door knobs interprets the meaning of your cry for help in the way you intended. You would hate to find yourself at the bottom of a lake thinking “hmm, I wonder why they did that? Did they hate me, of just mis-interpret the meaning of my sentence?&amp;quot; &lt;/p&gt;</description></item><item><title>disallow results from triggers</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2009/06/06/disallow-results-from-triggers.aspx</link><pubDate>Sun, 07 Jun 2009 02:38:06 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:14514</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;A setting that I noticed a while back when looking at sys.configurations was &lt;strong&gt;disallow results from triggers.&lt;/strong&gt;&amp;#160; Triggers are one of my favorite subjects, and you will find a lot of good uses of them in my book (triggers are also well named, as poor usage of them will allow you to shoot your foot off in a heartbeat as I discussed in my earlier post: &lt;a title="http://sqlblog.com/blogs/louis_davidson/archive/2008/07/13/triggers-evil.aspx" href="http://sqlblog.com/blogs/louis_davidson/archive/2008/07/13/triggers-evil.aspx"&gt;http://sqlblog.com/blogs/louis_davidson/archive/2008/07/13/triggers-evil.aspx&lt;/a&gt;). &lt;/p&gt;  &lt;p&gt;One of the main ways you can mess up your system is to have triggers that return results, often by accident, that the client doesn’t know about.&amp;#160; There are three kinds of return values that are interesting:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Result sets &lt;/li&gt;    &lt;li&gt;Raiserror messages &lt;/li&gt;    &lt;li&gt;Rowcount messages &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Also note that this is an overall system configuration. So it will disallow results from triggers in all of your databases.&amp;#160; It is never a good idea to return data from a trigger, but it is actually pretty useful for testing.&amp;#160; In the following code, I will test this setting and show how it will affect returning data to the client from the aforementioned methods.&amp;#160; &lt;/p&gt;  &lt;p&gt;What will the setting do? It will raise an error if you try to do a result set.&amp;#160; It will automatically suppress rowcount messages (without the need for SET NOCOUNT ON), but will allow error messages (I won’t cover it here, but error messages can cause issues with transactions in triggers, particularly when coupled with TRY…CATCH blocks).&lt;/p&gt;  &lt;p&gt;To demonstrate the setting, I will use tempdb.&amp;#160; Checking the system setting for your server, use:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;SELECT value      &lt;br /&gt;FROM&amp;#160;&amp;#160;&amp;#160; sys.configurations       &lt;br /&gt;WHERE name = 'disallow results from triggers'&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;This will probably return, unless you have discovered the setting before I did, in which case you probably aren’t reading any longer, so there.&lt;/p&gt;  &lt;p&gt;---------    &lt;br /&gt;0 &lt;/p&gt;  &lt;p&gt;To show you the effect of this setting , let's build the following scenario. The trigger will return 3 types of values&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;the contents of the table named triggerResult &lt;/li&gt;    &lt;li&gt;the rowcount of rows affected from an insert statement like you would normally have in a database &lt;/li&gt;    &lt;li&gt;the rowcount of the rows from the insert statement into a temp table &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;To start, we create 2 tables, one as the “main” table, and another that will hold the results of a side effect causing query:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;--primary test table      &lt;br /&gt;create table triggerResult       &lt;br /&gt;(       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; triggerResultId int primary key       &lt;br /&gt;)&amp;#160;&amp;#160;&amp;#160; &lt;br /&gt;--holds our side effect to prove the trigger executed       &lt;br /&gt;create table triggerResultSideEffect       &lt;br /&gt;(       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; triggerResultId int       &lt;br /&gt;)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Then we will create the trigger that gives us several different types of output    &lt;br /&gt;    &lt;br /&gt;&lt;font size="2" face="Courier New"&gt;create trigger triggerResult$insertTrigger      &lt;br /&gt;on triggerResult       &lt;br /&gt;after insert       &lt;br /&gt;as       &lt;br /&gt;begin       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; --returns a result set       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; select triggerResultId       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; from&amp;#160;&amp;#160; triggerResult&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;&amp;#160;&amp;#160;&amp;#160; --side effect like you might expect in a trigger      &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; insert into triggerResultSideEffect (triggerResultId)       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; select triggerResultId       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; from&amp;#160;&amp;#160; inserted&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;&amp;#160;&amp;#160;&amp;#160; --just to get a fixed rowcount output      &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; declare @test table (value char(1))       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; insert into @test       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values (1)       &lt;br /&gt;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;&amp;#160;&amp;#160;&amp;#160; --and a couple of errors      &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; raiserror ('Low',10,1)       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; raiserror ('Normal',16,1) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;end      &lt;br /&gt;&lt;/font&gt;Now we will test out the trigger by inserting one row into the triggerResult table:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;insert into triggerResult      &lt;br /&gt;values (1)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;This returns (the final rows affected is from the original statement): &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;triggerResultId      &lt;br /&gt;---------------       &lt;br /&gt;1 &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(1 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(1 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(1 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;Low      &lt;br /&gt;Msg 50000, Level 16, State 1, Procedure triggerResult$insertTrigger, Line 22       &lt;br /&gt;Normal &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(1 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Now try a multi-row operation, to see the difference &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;insert into triggerResult      &lt;br /&gt;values (2),(3),(4)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;This returns the following (which you can clearly now see the different number of rows that were affected by each statement, 4 from the one that returns all of the rows in the table, 3 from the side effect query that inserts the rows from the inserted table.&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;triggerResultId      &lt;br /&gt;---------------       &lt;br /&gt;1       &lt;br /&gt;2       &lt;br /&gt;3       &lt;br /&gt;4 &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(4 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(3 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(1 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;Low      &lt;br /&gt;Msg 50000, Level 16, State 1, Procedure triggerResult$insertTrigger, Line 22       &lt;br /&gt;Normal &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(3 row(s) affected)&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;Now, change the setting to disallow trigger results (you may need to do allow advanced options) &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;exec sp_configure 'show advanced options',1      &lt;br /&gt;RECONFIGURE       &lt;br /&gt;exec sp_configure 'disallow results from triggers',1       &lt;br /&gt;RECONFIGURE&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Now, just to make sure of where our data is, we know that there are 4 rows in the triggerResult table. Let’s check the triggerResultSideEffect table contents:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;SELECT *      &lt;br /&gt;FROM&amp;#160;&amp;#160; triggerResultSideEffect       &lt;br /&gt;ORDER&amp;#160; BY triggerResultId &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;This returns, showing all of the values we have inserted: &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;triggerResultId      &lt;br /&gt;---------------       &lt;br /&gt;1       &lt;br /&gt;2       &lt;br /&gt;3       &lt;br /&gt;4&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;Now, trying to run the statement with the same trigger:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;insert into triggerResult      &lt;br /&gt;values (5)&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;This will cause the following error message: &lt;/p&gt;  &lt;p&gt;&lt;font color="#800000" size="2" face="Courier New"&gt;Msg 524, Level 16, State 1, Procedure triggerResult$insertTrigger, Line 6      &lt;br /&gt;A trigger returned a resultset and the server option 'disallow results from triggers' is true.&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;Altering the trigger, just remove the statement that returns a result set:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;alter trigger triggerResult$insertTrigger      &lt;br /&gt;on triggerResult       &lt;br /&gt;after insert       &lt;br /&gt;as       &lt;br /&gt;begin       &lt;br /&gt;&lt;strong&gt;&lt;em&gt;&amp;#160;&amp;#160;&amp;#160; ----returns a result set          &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; --select triggerResultId           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; --from&amp;#160;&amp;#160; triggerResult           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; --side effect like you might expect in a trigger&lt;/em&gt;&lt;/strong&gt;&amp;#160;&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;&amp;#160;&amp;#160;&amp;#160; insert into triggerResultSideEffect (triggerResultId)      &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; select triggerResultId       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; from&amp;#160;&amp;#160; inserted &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;&amp;#160;&amp;#160;&amp;#160; --just to get a fixed rowcount output      &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; declare @test table (value char(1))       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; insert into @test       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; values (1) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;&amp;#160;&amp;#160;&amp;#160; --and a couple of errors      &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; raiserror ('Low',10,1)       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; raiserror ('Normal',16,1) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;end &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;Now, re-executing the statement with no results being returned: &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;insert into triggerResult      &lt;br /&gt;values (5) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;This simply returns the error message that are returned, and the rows affected message from the insert statement: &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;Low      &lt;br /&gt;Msg 50000, Level 16, State 1, Procedure triggerResult$insertTrigger, Line 22       &lt;br /&gt;Normal &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(1 row(s) affected)&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;Just to make sure that the rows were created:&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;select *      &lt;br /&gt;from&amp;#160;&amp;#160; triggerResult &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;select *      &lt;br /&gt;from&amp;#160;&amp;#160; triggerResultSideEffect&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;This returns: &lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;triggerResultId      &lt;br /&gt;---------------       &lt;br /&gt;1       &lt;br /&gt;2       &lt;br /&gt;3       &lt;br /&gt;4       &lt;br /&gt;5 &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(5 row(s) affected) &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;triggerResultId      &lt;br /&gt;---------------       &lt;br /&gt;1       &lt;br /&gt;4       &lt;br /&gt;3       &lt;br /&gt;2       &lt;br /&gt;5 &lt;/font&gt;&lt;/p&gt;  &lt;p&gt;&lt;font size="2" face="Courier New"&gt;(5 row(s) affected)&lt;/font&gt; &lt;/p&gt;  &lt;p&gt;Which shows that the data was inserted..&lt;/p&gt;  &lt;p&gt;Note that books online states: This feature will be removed in the next version of Microsoft SQL Server. Do not use this feature in new development work, and modify applications that currently use this feature as soon as possible. We recommend that you set this value to 1. &lt;/p&gt;  &lt;p&gt;This means that the ability to return result sets from a trigger will go away in the next version of SQL Server.&amp;#160; Returning results from a trigger is certainly not something that has ever been a good practice, so if this does in fact occur, it is a good setting to set in your development servers and begin to use.&lt;/p&gt;</description></item><item><title>Data correctness during the development process</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2006/10/09/data-correctness-during-the-development-process.aspx</link><pubDate>Tue, 10 Oct 2006 00:02:09 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:280</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;During all phases of development, particularly when building a completely new system, it is very common to have a need for data to test with.&amp;nbsp;&amp;nbsp;As important as this is for a UI system, the need is doubled&amp;nbsp;is when you are working on a reporting system.&amp;nbsp; While Everyone knows that the process of getting test data can be a fickle thing during the development process.&amp;nbsp; All programmers, relational and functional alike, when we test our code, we tend to create some meaningless data, usually only a row or two, to validate whatever table constraints, triggers, procedures, form, screen, batch process, etc that we are building, but&amp;nbsp;that data, generally speaking of course, likely bears little resemblance to "real" data.&amp;nbsp; But it is fine for the development process.&amp;nbsp;&amp;nbsp; &lt;/p&gt; &lt;p&gt;To write queries, everyone (including myself, if I can be honest with you) want &lt;em&gt;&lt;strong&gt;some&lt;/strong&gt;&lt;/em&gt; reasonable data in a table to write queries against.&amp;nbsp; Consider that the overarching mantra of the forum/newsgroup&amp;nbsp;question answerer is: "provide me a table and some test data..."&amp;nbsp; so we can test/write queries against it.&amp;nbsp; Without some data, answering a question for a person can be like target shooting with little idea of where the target is actually located.&lt;/p&gt; &lt;p&gt;The important question is, what does "some" mean, as in "I need &lt;strong&gt;&lt;em&gt;some&lt;/em&gt;&lt;/strong&gt; data".&amp;nbsp; I have worked on this mentally for years, as I struggled to put a label on the data correctness factor,&amp;nbsp;because it it such a hard process balancing the needs of the programmers building the database and code that uses it with the programmers building reporting.&amp;nbsp; The process is a bit like having builders building the foundation of a house while another group of folks setting up for a full dinner party on the floor above.&amp;nbsp; Never going to happen in most any industry other than ours. &lt;/p&gt; &lt;p&gt;Lately, on a large reporting project I have been working on, this frustration has reached dizzying heights.&amp;nbsp; We have data conversions from an existing system&amp;nbsp;going on all of the time, working to get the final data right for the final cutover.&amp;nbsp; We have gone over and over throughout the process of the entire project with each "new" set of data it was questioned how "right" the data was.&amp;nbsp;&amp;nbsp;Each of these sets of data was good enough to use in some capacity, but rarely if you need fine, verifiable detail going back to the original system, generally to match up to some existing report.&amp;nbsp; The whole process can be&amp;nbsp;really frustrating, though somewhat necessary.&amp;nbsp; &lt;/p&gt; &lt;p&gt;So in order to label this data, I will attempt to coin a few names for some of the ways our data ends up in the system during the process, if for no other reason than I can begin to label data in our systems in this manner for developers who are stuck using data in various forms, even if it does drive them a bit nutty.&amp;nbsp; The terms I am tossing out to define test data quality are: Scribble, Reasonable, Representative and Live/Real.&amp;nbsp; I will define these as:&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Scribble &lt;em&gt;- Just data that is entered with no concern for other users.&amp;nbsp; &lt;/em&gt;&lt;/strong&gt;probably&amp;nbsp;very fake looking, including values like 'ahdflka', or 'aafdfiduah'&amp;nbsp; or even 'alsjdflaj;' (that's kind of fun...)&amp;nbsp;&amp;nbsp; No real value other than being there for you to mess around with to try updates, deletes, etc.&amp;nbsp; Reporting value is almost nil, even for the person who thinks very abstractly.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Reasonable&amp;nbsp;&lt;em&gt;- Valid data based on business rules, but likely may not resemble reality.&lt;/em&gt;&lt;/strong&gt;&amp;nbsp; The purpose of this type of data is often to test one area of functionality, like to see some data on the screen.&amp;nbsp;&amp;nbsp;It might just be a set of rows that have been hand manufactured to look like possible data, even if it is&amp;nbsp;not a combination of values that has ever, or will ever exist.&amp;nbsp;&amp;nbsp;This kind of data is built relatively easy, either using the user interface, or just by creating data using random number generation (like for keys based on numeric surrogates).&amp;nbsp; You must think abstractly when using this data, since nothing is likely to be correct.&amp;nbsp; Random numbers are certainly no great shakes when it comes to correctness, and a person typing data in for testing is very often going to get very lazy and start entering gibberish (like 'aslkfal', or 'askjakjcb'&amp;nbsp; (that never loses it's fun factor))&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Representative Subset &lt;em&gt;- A set of data that is incomplete, but resembles, or is a subset of, reality.&lt;/em&gt;&lt;/strong&gt;&amp;nbsp; This is the type of data that we all want as developers.&amp;nbsp; It looks good, feels good, and other than magnitude, it is real data.&amp;nbsp; It is also far harder to create.&amp;nbsp; For existing systems, it might be converted data, or for completely new systems, it is probably hand entered, likely for the QA team.&amp;nbsp; &lt;/p&gt; &lt;p&gt;&lt;strong&gt;Real &lt;/strong&gt;- &lt;em&gt;&lt;strong&gt;Real live data that, if you weren't in a development area would in fact be usable for business.&lt;/strong&gt;&lt;/em&gt;&amp;nbsp; It is likely a snapshot up to some point in time, but as of a given point in time, this is what could be in production.&amp;nbsp; This data is more likely in a system that you are building as an upgrade to another system.&amp;nbsp; If this is a wholly new system, there is some chance that your system will start out empty.&amp;nbsp; At best clerks might be hired to copy data that is currently held in file folders (paper, remember it?) into the online system, but they might not do that until after the "go live" date.&amp;nbsp; &lt;/p&gt; &lt;p&gt;Maybe by labeling data as "reasonable" it would be clear to folks that there is data to browse, try out, etc, but there is no way possible that they can validate this data against anything other than the business rules.&amp;nbsp; This is especially an issue with a reporting source like a data warehouse.&amp;nbsp; It is expected in the minds of all users that data warehouse data will be based on a verifiable source, though when in the development phase, this clearly might not be the case.&amp;nbsp; &lt;/p&gt; &lt;p&gt;Have better ideas?&amp;nbsp; Better names?&amp;nbsp; Want to show me up and look cool to your nerdy friends?&amp;nbsp; Please leave comments :)&amp;nbsp; &lt;/p&gt; &lt;p&gt;Crossposted to &lt;a href="http://drsql.spaces.live.com"&gt;http://drsql.spaces.live.com&lt;/a&gt;&lt;/p&gt;</description></item></channel></rss>