Time to share a lesson learned. While working with new GEOMETRY and GEOGRAPHY data types in SQL Server 2008 there nested set of issues.
- Unless you have spatial data to work with, the use-cases for these data types are limited.
- Loading data from typical data sources, such as shape files, is probably out of the reach of the typical developer or database administrator unless you have a specialty tool for it.
- As helpful as SQL Server Integration Services typically is, it becomes a choke point in this scenario.
What exactly is it the issue? Let us say that you have been tasked with developing a database to support decisions about where a company should build new storefront locations. A previous analysis of sales has shown that storefronts built in areas where a minimum population and a minimum percentage of the area’s population are within a 25-mile radius. How would you solve this problem?
Clearly you need geographically-bound data and demographic data. In the United States, our Census Bureau provides such data in a relatively easy to consume format – if you are using traditional Geographic Information System (GIS) tools. For example, you can go the Bureau’s Web site and download one set of files that define the geographic boundaries of census areas and download another dataset that has the desired demographics. A straight-forward process for extracting the demographic data into comma-separated values makes preparing the data for consumption by SSIS exists.
The Shapefile data is a different story. In SQL Server 2008 there is no out-of-the-box SSIS Data Source for reading them and I doubt we will see one before RTM. So you are left to find a third-party solution or write your own. While it is possible to write your own, this is not an option that should be considered lightly especially if the need to consume is one-off or few-off. Currently there is a relative dearth of open source/free software solutions available. One that I used with good success is Morten Nielsen’s stack off tools (see http://www.iter.dk/page/SQL-Server-2008-Spatial-Tools.aspx). Note, however, that this tool is not yet usable with RC0. When it comes doing serious spatial ETL, there is really one choice as far as I am concerned: Safe Software’s FME (http://www.safe.com/aboutus/news/2007/106/). I particularly like the fact they have gone to the work of writing SSIS data sources, transformations and transformations.
Yet even that solution has a significant issue, its price tag. Serious GIS shops might not bat at an eye at the acquisition cost for licensing FME, but I can imagine the looks of sticker shock when other types of business owners see the price. This is not Safe’s fault – they have a good product and they have obviously made a substantial investment in.
So what is the best solution? I hope the SSIS team realizes that the uptake of the spatial features in SQL Server is largely a function of how easy it is it to acquire and use data. This is really an unparalleled opportunity for the SSIS team. I cannot think of another pairing of SQL Server features where one team could so dramatically improve the usability of another -- and thus empower us more as customers -- than this. To do this, I hope the team does not take a “chicken and egg” mentality to the problem. That is, they need to do more than say “well, those features aren’t being used, so it doesn’t make any sense for us to invest much in them.” What this thinking fails to realize is that the features may not being used much is because of the acquisition problem discussed above. I am not saying “if they build it, the users will come.” But I am saying that “if they do not build, the SQL Team should not expect users to use these features.”
If you happen to agree with me on this topic, I'd appeciate your vote on Connect issue 357045 (https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=357045)