A batch process performs an automated manipulation of data, often a bulk transfer of data. Batch processes execute transactions automatically, without real-time user input.
You would typically implement a batch process if you needed to schedule the update or manipulation of data for a time when the systems and data involved were less “busy”.
As an example, we might need to update changed product data every day, from one system to another. It may be important to run this update when few users are hitting an application that uses that product data, because the update could slow down system response time for those users. In this case we would want to use a batch process. In the implementation, the changed product data would be extracted from the main system and imported into another system nightly. There may also be manipulations upon importing, such as concatenating the productName and productDescription fields into the productDescription field in the new repository.
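To make the example concrete, here is a minimal sketch of that nightly job in Python. Only productName and productDescription come from the example; the productId and lastModified fields, the function names, the use of a plain dict as the new repository, and the single space used to join the two fields are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Product:
    productId: str            # hypothetical key, not named in the example
    productName: str
    productDescription: str
    lastModified: datetime    # hypothetical field used to detect changes


def extract_changed(products, since):
    """Return only the products modified after the previous batch run."""
    return [p for p in products if p.lastModified > since]


def transform(product):
    """Concatenate productName and productDescription into productDescription.

    Joining with a single space is an assumption; the real separator
    would be one of the requirements to capture.
    """
    return {
        "productId": product.productId,
        "productDescription": f"{product.productName} {product.productDescription}",
    }


def run_batch(source_products, target_repository, last_run):
    """Extract changed products, transform them, and load them into the target.

    target_repository is a plain dict standing in for the new repository.
    Returns the number of records loaded.
    """
    changed = extract_changed(source_products, last_run)
    for product in changed:
        row = transform(product)
        target_repository[row["productId"]] = row
    return len(changed)
```

In practice a scheduler would call something like run_batch once a night, during the quiet period, passing the time of the previous successful run as last_run.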
There was a post on the message board recently about which models to use to capture batch process requirements. Use cases do not work well, and while process flows might sometimes work, the model choice really depends on the complexity of the transactions.
When I think about something like a batch process to load data, I start with a set of questions. To answer each of those questions, I employ the appropriate model:
Use data models, such as data dictionaries, ERDs, and state tables:
- What data needs to be manipulated?
- What data needs to be batched?
- What are the attributes on the data to be batched?
- What is the desired end state of the data after running the process? (This might be a state of the data or it might be an output format.)
- What manipulations are executed on the data as part of processing? (If the manipulations get complicated, then a process flow diagram might help describe those steps.)
Use system models, such as context diagrams:
- Where does the data come from?
- Where does the data need to go?
- What other systems have data that is used in the manipulation?
Use people or system models, such as actor lists or context diagrams:
- What or who triggers the batch process to run?
Use non-functional models to identify non-functional requirements:
- By when does the process execution need to be completed?
- How frequently does the process need to run?
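One way to keep track of the answers is to record them in a simple structure and check which questions are still open. The sketch below is only an illustration: the BatchRequirements class, its field names, and the unanswered helper are hypothetical, and the two questions about which data is manipulated and batched are folded into one field.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BatchRequirements:
    # Answered with data models
    data_to_batch: Optional[list] = None
    attributes: Optional[list] = None
    desired_end_state: Optional[str] = None
    manipulations: Optional[list] = None
    # Answered with system models
    source_system: Optional[str] = None
    target_system: Optional[str] = None
    supporting_systems: Optional[list] = None
    # Answered with people or system models
    trigger: Optional[str] = None
    # Non-functional requirements
    complete_by: Optional[str] = None
    frequency: Optional[str] = None

    def unanswered(self):
        """Return the names of the questions that have no answer yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Partial answers for the product data example
reqs = BatchRequirements(
    data_to_batch=["changed product data"],
    manipulations=["concatenate productName and productDescription"],
    frequency="nightly",
)
print(reqs.unanswered())
```

The open items that print are the questions still to take to stakeholders, each paired with the model listed above it.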
In capturing the requirements for batch processes, you need to start with a set of questions that have to be answered to gather the basic data, and then apply the known models to answer those questions.