Can You Trust Your Data?
Short answer: NO!
Many years ago, I was assigned the task of working on a global marketing data warehouse for a company that was losing money. After the new year, the data warehouse metrics indicated a huge profit during December. I warned the executives, "This number does not make sense"; companies typically do not make huge purchases at the end of the year. Their response was, "Numbers don't lie." Really? The executives reported the company was profitable and everyone would receive their first bonus in five years. After a week of intensive data analysis, I made a phone call to the offshore branch, which had coincidentally started integrating its data in December. I asked, "Was the revenue posted in dollars?" They confirmed my suspicion: no, the revenue was posted in local currency. After conversion, the company was still losing money. I reported my findings and became the villain who took away everyone's bonus.
Stakeholders have a 'metrics fixation', but metrics can be pseudo-science. Stakeholders find it easier to base their decisions on "quantitative" metrics rather than qualitative factors such as experience and judgment. Dashboard metrics may be based on the wrong data or the wrong measurement, or be interpreted incorrectly. This can be problematic, particularly when depending upon third-party companies or products for some or all of the metrics captured. One method for improving data accuracy is for the BA to ensure the stakeholders identify an ongoing auditing process to review metrics and confirm correctness.
Stakeholders are fond of dashboards, but dashboards can be misleading because they provide no context for the data. Many people today are innumerate; they lack basic knowledge of mathematics. I often argue with stakeholders over percentages because of misunderstandings about how percentages work. For example: a dashboard metric that drops by 10% from Monday morning to Monday afternoon, and then increases by 10% by Tuesday closing, has not returned to its Monday morning value, although users will argue it has. (If the Monday morning value is 100, a 10% drop gives 90; a 10% increase on 90 gives 99, not 100.) BAs who are innumerate themselves should seek assistance from those who are stronger in math.
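The percentage asymmetry above can be verified in a few lines. This is an illustrative sketch using the hypothetical dashboard values from the example (100 as the Monday morning baseline), not a real reporting system:

```python
# A 10% drop followed by a 10% gain does not return to the starting value,
# because the second percentage is applied to a smaller base.
monday_morning = 100.0
monday_afternoon = monday_morning * (1 - 0.10)   # 10% drop -> 90.0
tuesday_closing = monday_afternoon * (1 + 0.10)  # 10% gain on 90, not on 100

print(monday_afternoon)                # 90.0
print(round(tuesday_closing, 2))       # 99.0 -- still 1% below Monday morning
```

The general rule: a drop of p% followed by a gain of p% multiplies the original value by (1 - p)(1 + p) = 1 - p², which is always less than 1 for any nonzero p.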
Stakeholders have a hunger for data these days and will state, "We need more data for complete analysis" when this is rarely true. At one site, the managers wanted to stop a project because they were missing twelve weeks of data from a five-year period. I reminded them they still had 248 weeks of data; would the missing 12 weeks really skew the results? They decided it did not. For most data analysis, missing some data is acceptable, but there are circumstances where data must be 100% complete and correct. The BA must provide the facts to the stakeholders and allow them to determine how much data and what degree of accuracy is required.
Business analysts and developers both make mistakes, and data errors may not be encountered for weeks, months, or years after implementation. A business analyst performing data analysis must always ask, "Does this data make sense?" If it does not, the BA must have the courage to question and research the data's validity.
This was best stated by Euripides who said,
"Question everything. Learn something…."
# # #