This post is the forty-ninth part of a ramble-rant about the software business. The current posts in this series can be found on the series landing page.
This post is about dashboards. Dashboards are where data meets decision-makers. The field of data visualization lives at this intersection of information and actors. Here, the numbers are translated and communicated clearly enough to drive action. Decisions are supported by these systems. Business and intelligence meet. The data is right there, represented numerically or graphically or both, waiting to be used.
This post is not about how to accomplish data visualization. This post is about the fact that data should be visualized.
Edward Tufte is one of the prominent names in the field of data visualization and information design. Well known for his criticism of Microsoft PowerPoint, Tufte has earned a reputation for clarity and insight. He participated in the investigation of the Space Shuttle Columbia disaster, and there is a ton of related information on his site. He has been appointed to presidential panels to ensure integrity in communications. Tufte invented sparklines and is generally considered a data visualization genius.
Data visualization specialists value integrity in communications. At least, the good ones value it.
This Isn’t Hard
Dashboards are elegant. They do not have to be complex. In fact, the most effective data visualizations are intuitive and almost instantly convey the desired information. Simple is good.
A long time ago (back when the years began with the number 1), in a place far, far away, I built a Manufacturing Execution System, or MES. It was called Plant-Wide Webs (catchy, eh?). It was one of the first MESs to exclusively use a browser for visualization. The idea of PWW was to convey – at a glance – the state of a manufacturing facility or enterprise. As I mentioned in Performance-Based Management Stinks, the only metrics that count are shipping and delighting customers. I believe that’s true for measuring employee performance. Behind that statement is a principle, and it is this: I believe it is possible to isolate or create an effective, single, accurate-enough metric for anything. Is this metric going to communicate everything that’s going on at all levels of your business at a glance? Goodness no. But I maintain it’s possible to glean way more than 80% of the important truth from a single number (I have more to say about this... later), and that’s what Plant-Wide Webs did. The dashboard? It was a modified stoplight:
I started with an electronic drawing of the facility, which I converted to an HTML image map. The map was completely green, yellow, or red. The numbers behind this were not simple, but they were available (via a click or two, max) and condensed into a single metric. And they were near-real-time and immediately recognizable. Back in the day, “near-real-time” meant accurate to about a minute. The plant manager could view their facility’s performance in near-real-time all day. History was provided (of course), and drill-through was supported as well. After all, drilling was as simple as linking – something at which HTML excels. Each click would take the manager to more detail. The first click on the plant image would load a copy of the image split into several shapes, each representing a section of the plant and each reflecting the red-yellow-green status of its section. And each section was drillable. And so on, and so on, until you reached a screen filled with readings from actual machines – data collected from data acquisition systems, Programmable Logic Controllers (PLCs), or Human-Machine Interface systems (HMIs).
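The stoplight logic itself is trivial, which is the point. Here’s a minimal sketch of the idea in Python; the thresholds and the worst-status-wins rollup rule are illustrative assumptions on my part, not what Plant-Wide Webs actually used:

```python
def stoplight(metric, green_at=0.95, yellow_at=0.85):
    """Collapse a single 0.0-1.0 performance metric into a stoplight color.
    Thresholds are illustrative, not the real PWW values."""
    if metric >= green_at:
        return "green"
    if metric >= yellow_at:
        return "yellow"
    return "red"

# Roll a section's color up from its machines: the worst status wins,
# so one red machine turns its whole section red on the plant image.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}

def rollup(colors):
    return max(colors, key=lambda color: SEVERITY[color])

plant_section = [stoplight(0.99), stoplight(0.91), stoplight(0.97)]
section_color = rollup(plant_section)  # one yellow machine -> "yellow"
```

The worst-status-wins choice is what makes a single glance trustworthy: a green section means every machine underneath it is green.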
This wasn’t then (and certainly isn’t today) earth-shattering graphics. But it was easy to maintain and scale, fast-rendering, and, best of all, simple and clear. It communicated. What did it communicate? The state of the plant? No. The state of the business!
Users aren’t stupid. They are your community. If you treat your community like they are stupid, you make more work for yourself. You also communicate that you distrust and disrespect them. Transparency isn’t merely the right thing to do, it is also the smart thing to do.
On my first data professional gig, I was hired to implement and manage the reporting solution. It was a web-based solution, and the sales demos must have been impressive. The product actually was impressive – as long as you ran it on the correct platform. We ran a ported version on the incorrect platform. </SadFace></WithASmallTear>. Back in those days I could hold my own as a web developer. The short version of a long story is: I fixed the ported code. I was just about done when the manager decided to move the current SQL Server person to another position. Since I was the only other person with the words “SQL” and “Server” near each other on my resume, I got the gig. Now mind you, I thought I could do the job. When my manager asked me if I could do it, I told him “Yes. How hard can it be to tell developers ‘No’?” </CaptainSnarkyIWas>. I learned a lot during that first real database person position.
I did a few things right, though. One of them was to trust my community. We started with a pilot of ten “power users”, all internal – part of the same company as us. But the next step was to expand to something like eighty users, and not all of them worked for us, so those external users didn’t have access to all the information available to the original group.
It sounds like a simple thing. Here’s why it wasn’t:
In ETL (Extract, Transform, and Load) operations there is this thing called Latency (an engineering idea) that is tightly coupled and inversely proportional to another thing called Throughput. The more ___ you can shove through a pipe, the less latency you experience. Back then, we were loading a ton of data, relatively speaking. It took days to load a couple of tables in our data warehouse. Since we didn’t want to wait around the clock, I found an old spreadsheet I had created to do predictive analytics. We would sample the current number of rows in the destination table every now and then, along with the time, then do some math, then do some more math, and then we’d have a science-backed wild guess about when the table would contain all the rows from the source. (The funny part of this story? I developed that spreadsheet while working as a 3rd-shift electrician in a hardware plant, to determine how long a large tank would take to fill. Fluid levels, data – it doesn’t matter; the math is the same. James Maxwell would be proud.)
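For the curious, the tank-fill/table-load math is simple linear extrapolation: measure the rate from two samples, divide the remaining rows by it. A minimal sketch of that kind of calculation (the numbers here are made up, and a real load rate is rarely this constant):

```python
from datetime import datetime, timedelta

def estimate_completion(samples, target_rows):
    """Estimate when the destination table will reach target_rows.
    samples: list of (timestamp, row_count) observations, oldest first.
    Assumes a roughly constant load rate (linear extrapolation)."""
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    rate = (r1 - r0) / (t1 - t0).total_seconds()  # rows per second
    return t1 + timedelta(seconds=(target_rows - r1) / rate)

# Two samples taken two hours apart: 1M rows, then 3M rows (1M rows/hour).
start = datetime(2024, 1, 1, 8, 0)
samples = [(start, 1_000_000), (start + timedelta(hours=2), 3_000_000)]
eta = estimate_completion(samples, 11_000_000)  # 8M rows to go -> 6 PM
```

Whether the level rising is water in a tank or rows in a table, the arithmetic is identical – which is exactly why the spreadsheet transferred from the plant floor to the data warehouse.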
That inspired another idea: I could build a web page to display the latency metadata that fed part of those calculations. It was a fantastic internal tool. That inspired yet another idea: why keep it internal? Most of the calls I was fielding were from users outside the company who had no idea why their report numbers were changing with each refresh. Everyone inside knew when an issue with the overnight batch processing increased latency. But those outside didn’t. So I placed a link to the latency page on the website and published the latency data with our next release.
I almost got fired.
My boss considered that data proprietary because it basically showed when we were not compliant with our SLA. I get that (now, I didn’t back then, but I do now). It never occurred to me that we should withhold information the users needed to make informed decisions about the validity of the information we provided. It still doesn’t occur to me. Transparency stopped my phone from ringing, allowing me to concentrate on more pressing (and valuable) matters. I probably would have been fired if the bosses of the external users hadn’t called my boss’s boss to tell her what an awesome idea that was. Just about the time my boss was ready to chew me out for releasing “proprietary performance and SLA data” he got a call from his boss, and she chewed him out for not letting her know about this cool new initiative that was saving our customers time and increasing the value of our data and service to them.
Elegant != Pretty
I tell every student that attends my From Zero To SSIS! class: “Anyone can build SSIS packages that work. I expect your SSIS packages to also be pretty.” But I leave them with this caveat: “If you have to choose between pretty and functional, always choose functional.” The same goes for dashboards. If you are afforded the time to delight the customer, do so. If not, opt for “working” over “pretty” every time. Make it as pretty and fast as you possibly can, right after you get it working. Remember:
Deliver quality late, no one remembers.
Deliver junk on time, no one forgets. – Andy, circa 2004
I’ve never had a customer or user come back to me after delivering quality late and say “Sure, Andy. This works well and all, but you were two days late.” They simply do not remember that it was late if it does what they want. But come in early and under budget with bugs? You will not hear the end of it.
A dashboard is simply a communication medium. It translates data into actionable information. It’s that simple. If your dashboard does amazing things but sacrifices any portion of this vital function, then your dashboard stinks. Get this part right. Communicate the state of the business quickly and accurately. Provide multiple levels (grains) of information. Trust and respect your community, for they can make your job easier or more difficult.
Dashboard development and implementation is more art than science. Treat it as such.