Data dictionaries have been around for a long time. I have a book on analysis, written in 1979, that covers them in great detail. Unfortunately, most of the techniques in that text focus on how to collate and manage the data rather than what data to include. There’s nothing quite like reading about how to organize your data dictionary using a card catalog file to make you realize how far we’ve come in terms of data management and organization.
The concept of a data dictionary is just as valid today as it was 27 years ago. A data dictionary is a model that allows you to look at the properties of all the data in your system in a very analytical and structured fashion.
To construct a data dictionary for your system, you first need to identify what the business objects are. This is a crucial step in the process. It is tempting for some users (and analysts) to think in terms of system objects and database tables, but that is not the focus here. Instead, you should think about the real-world objects your system deals with.
For example, you might be working on a shipping management system. This system focuses on the tracking and routing of packages. Those packages are real-world objects that have tangible attributes such as weight, dimensions, recipient address, and return address. By focusing on these objects, you drive towards the real business requirements and not some predefined implementation concept.
To construct the data dictionary table itself, you need three things for each business object: the attributes (for example, phone number), the characteristics of each attribute (for example, the length of the phone number), and the value of each characteristic (for example, phone numbers must be at least 10 characters in length).
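As a rough illustration of that three-level structure, here is a minimal Python sketch for a single business object. The attribute names, the "kg" unit, and the helper as_rows are all made up for this example; only the 10-character phone number minimum comes from the text above.

```python
# Hypothetical data dictionary for one business object, "Package".
# Layout: attribute -> characteristic -> value.
package = {
    "weight": {
        "type": "decimal",
        "unit": "kg",            # assumed unit, for illustration only
        "decimal_places": 2,     # assumed precision, for illustration only
    },
    "recipient_phone": {
        "type": "text",
        "min_length": 10,        # phone numbers must be at least 10 characters
    },
}


def as_rows(object_name, attributes):
    """Flatten one business object into (object, attribute, characteristic, value) rows."""
    rows = []
    for attribute, characteristics in attributes.items():
        for characteristic, value in characteristics.items():
            rows.append((object_name, attribute, characteristic, value))
    return rows


rows = as_rows("Package", package)
for row in rows:
    print(row)
```

Each printed row corresponds to one cell of the table: a business object, one of its attributes, one characteristic of that attribute, and the value of that characteristic.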
This step is where the value in producing a model becomes clear. It takes some extra time to sit down and think of all of the characteristics you want to specify for each attribute, but the payoff is that once you have these defined for one object, you can quickly fill out the values without having to worry about each individual attribute. You can also use the characteristics for one object as a starting point for the other business objects. For example, my business users have told me that the number of decimal places of each weight value tracked by the system is very important for monitoring and reporting. It stands to reason that other objects and attributes might require the same level of specification. If you figure it out once, you can use it in many places.
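To make the "figure it out once, use it in many places" idea concrete, here is a small Python sketch. It assumes one shared list of characteristics that every attribute starts from, plus a check that reports which values are still unfilled. The characteristic names, the "Shipment" object, and the helpers blank_entry and find_gaps are hypothetical; only the decimal-places example comes from the text above.

```python
# One shared list of characteristics, decided once and reused for every object.
CHARACTERISTICS = ("type", "required", "decimal_places")


def blank_entry():
    """Return a new attribute entry with every characteristic still unfilled (None)."""
    return dict.fromkeys(CHARACTERISTICS)


def find_gaps(data_dictionary):
    """List (object, attribute, characteristic) triples that have no value yet."""
    gaps = []
    for obj, attributes in data_dictionary.items():
        for attribute, characteristics in attributes.items():
            for name in CHARACTERISTICS:
                if characteristics.get(name) is None:
                    gaps.append((obj, attribute, name))
    return gaps


data_dictionary = {
    "Package": {
        "weight": {"type": "decimal", "required": True, "decimal_places": 2},
        "recipient_address": {"type": "text", "required": True, "decimal_places": "n/a"},
    },
    # A second, hypothetical object that starts from the same template.
    "Shipment": {
        "total_weight": {**blank_entry(), "type": "decimal"},
    },
}

gaps = find_gaps(data_dictionary)
for gap in gaps:
    print("missing:", gap)
```

The point of the template is that a new object begins with the same questions already asked, so anything left unanswered shows up as a visible gap instead of being forgotten.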
Why use these tables? It all goes back to the idea of structure. Without the table, you aren’t thinking in terms of objects, attributes, characteristics, and values. Everything is ad hoc, and that can lead to gaps and mistakes.
Would I use these on every project? No way. This kind of information is far too detailed for many applications. When your system is heavily data-driven, however, a thorough analysis of data properties can lead you to many hard-to-find requirements.