For the purpose of this discussion, let me define data defects in a software development process as defects caused when improper data setup manifests itself as “application errors.” I ran into this situation recently when I was asked to define requirements to fix a couple of defects that had been logged in the system by business testers.
Preliminary investigations seemed to point towards something fundamentally wrong in the application. We had been provided with sample output from the testers that showed the results generated by the system under development compared against expected results. They were not even close. One is hard pressed to believe that the developers would actually release something that was so far off the mark. So, a natural conclusion would be that the requirements presented to the developers were just flat-out wrong.
This is where things got interesting. When we sat down with the Development Team and started performing a root cause analysis, they showed us scenario after scenario where the application behaved exactly as one would expect it to. Looking at the requirements they developed from, I could not detect anything obviously wrong either. Yet, Business kept insisting that the results were incorrect. And Business was right.
So, we dug deeper and eventually discovered that the base data on which the calculations were performed was improperly configured. In short, garbage in, garbage out.
What happened next was an eye opener for me. Development went to the Defect Management system and closed out the defect by rejecting it. They provided their rationale for rejection and moved on to the next defect. Case closed.
So, who then was responsible for fixing the underlying data defect? Not us, said Development. And rightly so because data was owned by Business. Who in Business is responsible for fixing the defect? Don’t know, said Development. Correct again, because they did not know.
As far as Development was concerned they had done everything they had to do. They had analyzed the problem. Determined root cause. Reproduced the “error.” Demonstrated beyond all reasonable doubt that the code worked as advertised. Now, it was someone else’s problem.
The problem was that the “error” was still out there. It had not been fixed. And Business could not and would not deploy the solution to their users with this “error.” Right again. How could they possibly be expected to use a system that produced erroneous output, regardless of the source of the error? One does not deploy an application simply because there are no code defects.
This episode revealed a fundamental gap that exists in most Software Development and Deployment methodologies. There is a gray area that exists between Development and Deployment that is left unaccounted for. It is in this space that data falls.
In most organizations, Business owns data. This is as it should be. For the most part, Business gets data right. However, when the complexity of data grows, it is almost impossible for Business to understand the implications of changes in data or, for that matter, even what constitutes properly formed data from an application viewpoint.
Simply put, it is not possible for Business to solve subtle data issues without the active participation of the Development team. The Developers need to guide and educate Business as to what constitutes correct data from an application perspective. They might also need to provide Business with simple tools to analyze the data, update it, scrub it and manage it. This is a collaborative effort between these two constituencies. And it needs to be managed.
From our experience we found it best to call out a data management effort as a separate task that was managed at the Program Level. This is where we have netted out.
1. Define the members of the Data Readiness team with representation from Business and Development.
2. Fund the effort.
3. Define a process to identify data errors.
4. Define a process to fix these errors.
5. Test explicitly to identify and expose data errors.
6. Track it like any project deliverable.
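To make step 5 a little more concrete, here is a minimal Python sketch of what an explicit data check might look like. It is not the tooling we used. The record shape (`RateRecord`), the rules, and the function name `find_data_errors` are all hypothetical, and the real rules would have to come from Development and Business working together.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


# Hypothetical shape of one row of base data; real base data will differ.
@dataclass(frozen=True)
class RateRecord:
    product_code: str
    rate: float
    effective_year: int


# A rule pairs a human-readable name with a check that returns True
# when the record is acceptable from the application's point of view.
@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[RateRecord], bool]


# Illustrative rules only; the thresholds are assumptions, not requirements.
RULES = [
    Rule("product_code is not blank", lambda r: bool(r.product_code.strip())),
    Rule("rate is between 0 and 1", lambda r: 0.0 <= r.rate <= 1.0),
    Rule("effective_year is plausible", lambda r: 1990 <= r.effective_year <= 2100),
]


def find_data_errors(
    records: Iterable[RateRecord], rules: Iterable[Rule] = RULES
) -> List[Tuple[int, str]]:
    """Return (record index, failed rule name) for every rule a record breaks."""
    rules = list(rules)  # allow any iterable, reuse it for every record
    errors = []
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                errors.append((index, rule.name))
    return errors


if __name__ == "__main__":
    sample = [
        RateRecord("A100", 0.05, 2020),  # well-formed
        RateRecord("", 5.0, 2020),       # blank code, rate out of range
    ]
    for index, rule_name in find_data_errors(sample):
        print(f"record {index}: failed '{rule_name}'")
```

Because each rule has a plain-language name, Business can read the output without reading the code, and every failure can be logged and tracked as a data defect rather than rejected as a code defect.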
It is a lot of hard work that is not necessarily fancy or exciting. But if it is not done, the entire project is at risk of failure.