THE SQL Server Blog Spot on the Web

Andy Leonard

Andy Leonard is an author and engineer who enjoys building and automating data integration solutions. Andy is co-host of the Data Driven podcast. Andy is no longer updating this blog. His current blog is AndyLeonard.blog.

Data Wrangling Concepts: Latency and Staleness

Latency is the amount of time between data creation and load. Staleness is the amount of time since data in a target was refreshed.
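To make the two definitions concrete, here is a minimal sketch that computes both from timestamps. The function names and the sample timestamps are made up for illustration; they are not taken from any particular tool or framework.

```python
from datetime import datetime, timedelta


def latency(created_at: datetime, loaded_at: datetime) -> timedelta:
    """Time between a row's creation at the source and its load into the target."""
    return loaded_at - created_at


def staleness(last_refreshed_at: datetime, now: datetime) -> timedelta:
    """Time elapsed since the target was last refreshed."""
    return now - last_refreshed_at


# Hypothetical timestamps for a single row and a single target refresh.
created = datetime(2017, 4, 13, 8, 0)    # row created at the source
loaded = datetime(2017, 4, 13, 9, 30)    # row loaded (target refreshed)
checked = datetime(2017, 4, 13, 10, 14)  # moment we inspect the target

row_latency = latency(created, loaded)          # 1 hour 30 minutes
target_staleness = staleness(loaded, checked)   # 44 minutes
print(row_latency, target_staleness)
```

Note that staleness is measured against the current time, so it keeps growing until the next refresh, while the latency of an already-loaded row is fixed.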

In many data integration use cases, a latency or staleness of one day, one week, or even one month is acceptable. Enterprise Data & Analytics helps enterprise data integration teams reduce latency and staleness. We often help customers seek near-real-time solutions, that is, near-zero latency and staleness.

How do we approach a solution?

  1. Tuning
  2. Re-architect
  3. Redesign

Tuning

If the source schemata are stable, we begin tuning at the sources and destinations. Most targets and many sources are data stores. When SQL Server Integration Services (SSIS) is the data integration tool, many find it odd that we tune the sources and destinations before tuning the SSIS packages. We start with the databases because we often (not always) see more improvement from database tuning than from SSIS tuning.

If the source schemata are not stable, we examine how the enterprise is managing this volatility.

Re-architect

Some enterprises manage volatile source schemata manually. Employing a Biml-Driven Architecture (BDA) automates some of the manual effort. A BDA can often eliminate all of the manual work.

BDA can also facilitate loading a common target with dozens (or hundreds) of sources.

Perhaps most intriguing, a BDA can reduce the time-to-analyze for data, allowing data scientists to begin experimenting with the data sooner and, hopefully, expediting results.
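The idea behind loading a common target from many sources can be sketched in a few lines. This is not Biml (Biml is an XML-based markup that generates SSIS packages); it is only a Python illustration of the metadata-driven approach, in which one template is applied to every source description. The class, function, schema, and table names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTable:
    """Hypothetical metadata describing one source to load."""
    source_name: str
    table: str
    columns: tuple


def build_load_statement(src: SourceTable, target_table: str) -> str:
    """Apply one load template to a source description."""
    column_list = ", ".join(src.columns)
    return (
        f"INSERT INTO {target_table} ({column_list}, SourceName) "
        f"SELECT {column_list}, '{src.source_name}' "
        f"FROM {src.source_name}.{src.table};"
    )


# Adding a source means adding one metadata entry, not hand-building a package.
sources = [
    SourceTable("crm", "Customer", ("CustomerId", "Name")),
    SourceTable("erp", "Customer", ("CustomerId", "Name")),
]

statements = [build_load_statement(s, "stage.Customer") for s in sources]
for statement in statements:
    print(statement)
```

When a source schema changes, only its metadata entry changes; the load logic is regenerated from the same template rather than edited by hand.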

Redesign

Support and maintenance are too often overlooked when considering latency and staleness. Consider: while your team is troubleshooting an issue with the enterprise data integration solution, the data in the target is becoming more stale and latency is increasing at a rate of one minute per minute. Employing design patterns is one way to reduce support and maintenance overhead. If most (or all) of your data integration solution employs the same (or similar) design pattern(s), understanding one package means team members understand many (or all) of the packages. Implementing a BDA once design patterns are known adds even more efficiency to enterprise data integration support and maintenance.

Software design best practices cannot be overemphasized. Separation of concerns, decoupling, coding to contracts, testing, source control, and Data Integration Lifecycle Management (DILM) save more time and money than business owners realize.

Conclusion

At Enterprise Data & Analytics, we help enterprises build faster data integration solutions, build data integration solutions faster, and make data integration execution more manageable. If you are interested in learning more, please contact us.

:{>

Learn More:
Biml in the Enterprise Data Integration Lifecycle (Password: BimlRocks)
From Zero to Biml - 19-22 Jun 2017, London 
IESSIS1: Immersion Event on Learning SQL Server Integration Services – 2-6 Oct 2017, Chicago

Tools:
SSIS Framework Community Edition
Biml Express Metadata Framework
SSIS Catalog Compare
DILM Suite

Published Thursday, April 13, 2017 10:14 AM by andyleonard
