Sep 27, 2019

Data Shame and Data Blame

The term often used in the BI and analytics industry is “data-driven” – it’s one thing every business wants to be.

But we don’t hear too much of the flip side of this, which is when the data reveals some unpleasant information or a mistake. This is the difficult side of analytics, because it involves people, many of whom are deeply invested in their roles and the work they do. When the metrics reveal a problem or error, it can become a tricky situation.

Here’s a recent example from the Washington Post: the process they used to combine multiple data sources to build a list of best-selling books had errors, ultimately leading to underreported sales figures and titles being dropped from the list.

Kudos to the newspaper for its transparency on this error. But one has to wonder about the repercussions this kind of mistake has for authors and for the Post staff who only belatedly caught the error.

Data Shame

Marketing speaker and Timeshare CMO founder Melinda Byerley made some observations on Twitter about the power our data has to shame us:

“I have observed that one reason data isn’t used to its full capacity in orgs is because accountability and change will surely follow. Data has fear attached to it….

Us data nerds need to remember our data won’t help our companies unless people use it. And people are bundles of feelings: fear, shame, insecurity.”

Often, especially early in the move to analytics, we won’t like what we actually see in the data. And, given the power and emotions we connect with our data, it is never enough to simply broadcast a data error and its correction.

Data scientist Matt Mazur wrote an interesting blog post on the importance of diplomacy when revealing data errors. To a data analyst or scientist, finding an error like this can be one of those “Aha!” moments and tracking down its cause an intriguing intellectual exercise. But for the end users who may have based decisions on erroneous metrics, it can be a crisis.

In his two examples, Mazur explains how his casual communication of the problem (“Hey, there was a bug, and I fixed it”) created problems downstream. In hindsight, instead of simply posting to the company Slack channel, he notes, he should have:

  1. first communicated the issue to those directly involved, to assess its impact
  2. allowed the teams involved some time to check their calculations
  3. notified ahead of time anyone who may have used the wrong data to make decisions, so they didn’t learn about it at the same time as everyone else.

I would add a fourth item to this list: come up with a plan or process to ensure this type of mistake does not happen again, so that the error leads to better practices.
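As one illustration of what such a process might look like, here is a minimal Python sketch of an automated reconciliation check for the kind of multi-source combination described in the best-seller example. Everything in it is hypothetical: the function names `combine_sales` and `reconcile`, the `(title, units)` row shape, and the sample figures are my own assumptions, not the Post's actual pipeline. The idea is simply that totals and titles going in should match totals and titles coming out.

```python
from collections import defaultdict


def combine_sales(sources):
    """Sum units sold per title across several source feeds.

    `sources` is a list of feeds; each feed is a list of (title, units) pairs.
    This row shape is an assumption made for the sketch.
    """
    combined = defaultdict(int)
    for rows in sources:
        for title, units in rows:
            combined[title] += units
    return dict(combined)


def reconcile(sources, combined):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []

    # Check 1: no units were lost or invented while combining.
    expected_total = sum(units for rows in sources for _, units in rows)
    actual_total = sum(combined.values())
    if expected_total != actual_total:
        problems.append(
            f"total units mismatch: sources={expected_total}, combined={actual_total}"
        )

    # Check 2: every title seen in a source is still present afterwards.
    expected_titles = {title for rows in sources for title, _ in rows}
    missing = sorted(expected_titles - set(combined))
    if missing:
        problems.append("titles dropped: " + ", ".join(missing))

    return problems


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    feeds = [[("Book A", 10), ("Book B", 5)], [("Book A", 3), ("Book C", 7)]]
    result = combine_sales(feeds)
    for problem in reconcile(feeds, result):
        print(problem)
```

A check like this does not replace the conversations above; it just makes it more likely that the data team finds the problem before the people relying on the numbers do.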

Software developers (and data scientists) are used to errors. They’re not a big deal. In fact, iterating through errors and corrections is a major part of the job. However, it’s a mistake to assume that other departments operate the same way. Melinda Byerley wisely tweeted: “We can’t just drop data off. We have to talk to the scared little kid in every listener’s head: it will be okay. So many times our clients are sheepishly confessing their data shame to us.”

The Aftermath: Data Blame

And when the analytics don’t look right, or an error is found, what may follow is the blame game.

Last year, in a KPMG survey of more than 2,000 senior executives, 65% of respondents admitted to some lack of trust in their data analytics. This may just reflect a cynical (or realistic) understanding of the analytics challenges ahead. However, a full 62% of those surveyed believed data errors were the fault of IT. Given the complexity of analytics, it seems that IT should share some of the responsibility for analytics errors with other parts of the organization, including the C-suite.

The article cites recommendations from Gartner analyst Aashish Gupta, which ultimately aim to foster trust and a “psychologically safe team environment” – that is, one in which “co-workers and team members feel that they will not be punished or humiliated when taking interpersonal risks, such as asking for help, admitting mistakes and vulnerabilities, or expressing concerns.”

Organizations will never be data-driven if they are more concerned with assigning blame than with looking at the actual metrics or honing the analytical process. Instead, there will be more of the same misreporting or obfuscation of metrics up the ladder, and less true data discovery.

Additional reading

The Dataquest blog has a good article on the kinds of challenges (communication and others) faced by data scientists.

In this editorial, Steve Jones at the SQLServerCentral blog discusses an example (but hopefully not a trend) of IT staff taking the blame for a data breach.