“Anticipate the difficult by managing the easy.” – Lao Tzu
As software requirements analysts, we have crucial decisions to make about errors that are often considered rare. This is the last in a series of three posts on how to define requirements for error conditions, specifically data errors.
In the previous two articles of this series, we have defined Data Errors and categorized them. In this installment, let us look at some simple ways of defining requirements to proactively manage potential Data Error scenarios before they become big problems.
- Define Localization Requirements. If our application is going to be used in different parts of the world, then we must clearly specify the languages to be supported. Languages like Spanish and Portuguese natively use accented characters such as ñ, ç and ã, and this must be called out explicitly in the requirements. I have seen XML files used in data exchange break when characters that are a standard part of Spanish showed up in the data, because the development team assumed they were dealing only with English characters.
- Check Statutory and Legal Requirements for Potential Missing Data. Tax laws, statutes and regulations are typically very specific about how values are calculated and assessments made. Almost all of these requirements have very specific data needs, many of which are arcane and not immediately obvious or intuitive. These data needs also vary tremendously from locale to locale within a country, let alone across countries. The best approach, although not an easy one, is to sit down with the subject matter experts and slog through the Data Models to identify the needs in detail.
- Define Performance (Non-Functional) Requirements. Clearly define the MAXIMUM load that the application must be designed to handle. All too often, only AVERAGE loads are specified, with disastrous results. See my post here for a detailed discussion on this issue. Data Errors caused by excessive volumes are the easiest to handle during the requirements phase.
- Define User-Friendly Error Messages. This is essential for outward-facing applications like web sites. A common Data Error condition for all web sites is an invalid URL. We must provide clear requirements around how we want these requests to be handled. For example, show a simple message stating that the requested page is no longer available on the site, and redirect the user automatically to the home page or some other relevant page on the site. This is a much better response to a common condition than the standard ‘404 page not found’ error message that shows up by default. For other situations where errors are caused by missing or invalid data, we must tell the user clearly which data is missing or invalid and what steps to take to rectify the situation. The reality is that all Data Errors are going to be trapped by the application. By spending some time up front defining requirements around the behavior we want when they are encountered, we will end up with a much more usable application.
- Define Interface Requirements. Interfaces lie right at the intersection of requirements and design, so a lot of analysts choose not to define them at all. Even if we are not defining detailed interface requirements, we should at a minimum look at all the data that flows through the interface and pay particular attention to the data that is NOT required for our application. We can use the following filters to determine what to do with the extra data flowing through the interface:
a. Discard – the data is of no use to our application and we can safely ignore it.
b. Save – the data may have use for reporting or additional validation down the road. Provide explicit requirements about the data to be saved.
c. Associate – this is related to saving data. If we want any of the data to be explicitly associated with other data in our system prior to being saved, then we must call this out in the requirements.
- Review Data Critically for Potential Problems. Once we have identified the data needs of our application, we must review the Data Models we have created critically to see if there are any potential problems we need to address. Here is a list of questions that can be used:
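a. Can this data have different formats depending on the location or source? An example of this is ZIP / postal codes: a US ZIP code is five digits with an optional four-digit extension, while a Canadian postal code is six characters, usually written with a space in the middle.
b. Can this data be of a different type depending on location or source? ZIP / postal codes are an example here too: they are numeric in the US but alphanumeric in Canada and the UK, so a field defined as a number cannot hold them.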
c. Will the language in which the data is created cause it to behave differently? For example, special characters in language sets can cause problems for certain interfaces, if they are not identified up front and planned for by the developers.
d. Will the units of measurement used with the data cause problems? For example, kilograms versus pounds, or kilometers versus miles. Even if we get the data type correct, we could be way off if the proper units of measurement are not specified in the requirements.
e. How likely is this data to be missing, malformed, incomplete or different from what we are specifying? This is particularly important for applications that use a lot of third-party data.
- Define Clear Data Validation Requirements. One of the easiest ways to avoid having malformed or corrupt data down the road is to define clear data validation rules up front. This is far easier than attempting a massive data cleanup exercise later. Data validation rules are nothing more than the business rules that have been defined for data. Armed with these requirements, developers can easily create UI or validation routines that enforce these rules.
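To make this concrete, here is a minimal sketch of how a few of the rules discussed above (postal code formats that vary by country, explicit units of measurement, and error messages that tell the user what to fix) might look once a developer turns them into a validation routine. Everything in it is an assumption made for illustration: the shipment record, the field names, the two supported countries and the function names are hypothetical, and a real project would take its rules from its own requirements.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical per-country postal code rules taken from the requirements.
POSTAL_CODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),           # 12345 or 12345-6789
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),  # K1A 0B1 or K1A0B1
}

# Units the requirements allow, with factors to the internal unit (kilograms).
TO_KILOGRAMS = {"kg": 1.0, "lb": 0.45359237}


@dataclass
class ValidationError:
    field: str
    message: str  # user-facing text: what is wrong and how to fix it


def validate_shipment(record: dict) -> list[ValidationError]:
    """Return every rule violation found in one (hypothetical) shipment record."""
    errors = []

    # Rule 1: the postal code must match the format of the stated country.
    country = record.get("country")
    postal_code = (record.get("postal_code") or "").strip().upper()
    pattern = POSTAL_CODE_PATTERNS.get(country)
    if pattern is None:
        supported = ", ".join(sorted(POSTAL_CODE_PATTERNS))
        errors.append(ValidationError(
            "country", f"Country {country!r} is not supported. Choose one of: {supported}."))
    elif not postal_code:
        errors.append(ValidationError(
            "postal_code", "Postal code is missing. Enter the code for the delivery address."))
    elif not pattern.fullmatch(postal_code):
        errors.append(ValidationError(
            "postal_code", f"'{postal_code}' is not a valid postal code for {country}."))

    # Rule 2: a weight is only meaningful together with a known unit.
    unit = record.get("weight_unit")
    weight = record.get("weight")
    if unit not in TO_KILOGRAMS:
        allowed = ", ".join(sorted(TO_KILOGRAMS))
        errors.append(ValidationError(
            "weight_unit", f"Unit {unit!r} is not recognized. Use one of: {allowed}."))
    elif not isinstance(weight, (int, float)) or weight <= 0:
        errors.append(ValidationError(
            "weight", "Weight must be a number greater than zero."))

    return errors


def weight_in_kg(record: dict) -> float:
    """Convert a validated record's weight to kilograms."""
    return record["weight"] * TO_KILOGRAMS[record["weight_unit"]]


if __name__ == "__main__":
    bad = {"country": "US", "postal_code": "K1A 0B1", "weight": 10, "weight_unit": "stone"}
    for error in validate_shipment(bad):
        print(f"{error.field}: {error.message}")
```

The routine collects every violation instead of stopping at the first one, so the user can be told about all the missing or invalid data in a single pass. Keeping the rules in plain lookup tables also makes it easy to check them line by line against the written requirements.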
It is not difficult to write good requirements around the Data Errors that are bound to hit all applications. By getting in front of the issue, we can often eliminate errors related to missing, excessive or different data altogether.
I hope you found this series of articles useful to identify Data Errors in your own applications, and to deal with them proactively with good requirements. It is very easy to do, and the payoff is huge.