Data flows: Note on Data-Driven Process Modeling

A Focus on Data
Data flow diagrams (DFDs)
Common DFD mistakes
Summary/next steps

A (partial) solution - focus on the data

About twenty years ago, systems analysts began to resolve the conflicts between managerial and information-systems design perspectives in an elegant way: they described processes by focusing almost exclusively on the data that each process generated. The idea was that by concentrating on data movement and the processing activities that occurred within a business process, one could simplify process descriptions in ways that supported successful information systems. To a large extent, this idea proved valuable: systems analysis techniques based on this perspective now dominate most software systems engineering work. Focusing on the data had at least these advantages:

  1. A focus on data minimized the distractions generated by other process characteristics. For example, knowing what data an order entry process required enabled system designers to build systems that would support order entry no matter who did the work, no matter where the work was done, and no matter when it was done. The system design, from this perspective, depended less on attributes of the existing process than on the data within the process. In this sense, a focus on data alone enabled systems analysts to move quickly beyond process details to recognize the minimum necessary characteristics of a process that their systems would have to support.
  2. A focus on data made it easier to recognize potentially new process designs. Abstracting a general picture of the data used in a process from the specific details that represented examples of that process tended to provide new insights for process redesign. Consider the ordering example we discussed above. In the more complex version of that process, Accounting matches two types of purchase orders with a shipping manifest. A systems analyst using data-driven techniques would quickly recognize that all three forms carried largely the same data, and would ask why they were all needed - and why the matching step needed to take place. In organizations where Accounting waited days or weeks to obtain copies of all three forms prior to sending out an order (not really so unlikely in many businesses), eliminating the redundant forms or the matching step could have important business benefits (e.g., shorter delivery times).
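
The observation that the three forms carry largely the same data can be checked mechanically. The following is a minimal sketch in Python, not part of the original analysis: the form names and field names are hypothetical, chosen only to illustrate how an analyst might compare the data content of the forms.

```python
# Hypothetical field sets for the three forms; the form and field names
# are illustrative assumptions, not taken from the ordering example.
FORMS = {
    "customer_purchase_order": {"order_id", "customer", "item", "quantity", "price"},
    "internal_purchase_order": {"order_id", "customer", "item", "quantity", "warehouse"},
    "shipping_manifest": {"order_id", "customer", "item", "quantity", "carrier"},
}


def shared_fields(forms):
    """Return the fields that every form carries."""
    return set.intersection(*forms.values())


def unique_fields(forms):
    """Map each form to the fields that no other form carries."""
    result = {}
    for name, fields in forms.items():
        others = set().union(*(f for n, f in forms.items() if n != name))
        result[name] = fields - others
    return result


common = shared_fields(FORMS)   # data duplicated on all three forms
extras = unique_fields(FORMS)   # data that justifies each form's existence
```

If the shared set is large and each form contributes only one or two fields of its own, that is a hint that the forms, and the step that matches them, may be candidates for redesign.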

In effect, focusing on data enabled systems analysts to extract only those details most necessary to the development of the information system from the masses of detail collected to describe organizational processes. This simplified process view often offered insights for potential process redesigns. Insights arose for at least two reasons:

  1. Developing software-based process support forces analysts to examine the logic by which organizations make choices within processes. Electronic support for order entry, for example, requires that a system "know" how orders are officially approved, how products are confirmed to be backordered, how packages are approved for shipment, etc.
  2. Developing the algorithms (i.e., the specific instructions) that software can use across a range of process variations forces analysts to abstract generic process characteristics from observed process specifics. Orders from both large and small customers, for example, are likely to have features in common as well as specific differences based upon customer size: an efficient software system would handle the common characteristics using the same software code, thereby minimizing the amount of customized code that would have to be produced to manage process variations.
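
The second point can be illustrated with a small sketch. It assumes, purely for illustration, that the only difference between large and small customers is a discount rate; the `Order` fields, the size categories, and the rates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer: str
    customer_size: str  # "large" or "small" (hypothetical categories)
    quantity: int
    unit_price: float


# Hypothetical size-specific rule: the only part that varies by customer.
DISCOUNT_BY_SIZE = {"large": 0.10, "small": 0.0}


def order_total(order: Order) -> float:
    """Shared logic for every order; only the discount lookup varies."""
    if order.quantity <= 0:
        raise ValueError("quantity must be positive")
    gross = order.quantity * order.unit_price
    discount = DISCOUNT_BY_SIZE[order.customer_size]
    return round(gross * (1 - discount), 2)
```

The validation and pricing steps are written once. Supporting a new customer category would mean adding an entry to the lookup table rather than writing a second version of the order-handling code.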

This interaction between process descriptions and the requirements of software engineering led to a standardized way of analyzing business processes. This perspective understood processes from the point of view of the data and processing steps that they generated. It led to a four-step analysis of data flows and their related processing steps:

  1. Data flows and processing steps were observed within the process as it was currently practiced in the organization. The resulting model was often referred to as the "physical" or "implementation" model of the system.
  2. The underlying logic represented by observed process steps was abstracted to build a generalized model of the process. This model was often referred to as the "logical" or "essential" model of the system. "Logical", in this sense, referred to the programming logic that analysts would have to use to write the software code needed to build the system.
  3. After closely examining patterns of data flows and processing that emerged from the physical and logical models, analysts could often suggest ways of executing process logic more effectively. In other words, they could suggest process improvements. The logical model of these improvements would comprise the key characteristics of a new process which, if implemented, would enable improvements in business performance based on a combination of process redesign and newly available information system support.
  4. If management decided to adopt the new process, the combination of process design and information system support would become the basis of new practice within the organization, giving rise to a new set of physical process models.
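
The move from step 1 to step 2 can be sketched in code. The example below is an illustration under stated assumptions rather than a standard notation: it records each observed flow together with who performed it and on what medium, then derives a logical model by discarding those implementation details. The class names, step names, and sample flows are all hypothetical.

```python
from typing import NamedTuple


class PhysicalFlow(NamedTuple):
    """A data flow as observed in current practice (step 1)."""
    source: str
    target: str
    data: str
    performed_by: str  # who does the work today
    medium: str        # how the data travels today


class LogicalFlow(NamedTuple):
    """The same flow with implementation details removed (step 2)."""
    source: str
    target: str
    data: str


def to_logical(physical):
    """Drop who/how details and merge flows that become identical."""
    logical_flows = []
    for flow in physical:
        logical = LogicalFlow(flow.source, flow.target, flow.data)
        if logical not in logical_flows:
            logical_flows.append(logical)
    return logical_flows


# Hypothetical observations of an order process.
observed = [
    PhysicalFlow("Take order", "Approve order", "order details", "sales clerk", "paper form"),
    PhysicalFlow("Take order", "Approve order", "order details", "web storefront", "electronic record"),
    PhysicalFlow("Approve order", "Ship order", "approved order", "accounting", "fax"),
]
logical_model = to_logical(observed)
```

Two observed flows that differ only in who does the work and how the data travels collapse into a single logical flow, which is the kind of simplification the logical model is meant to provide. Steps 3 and 4 depend on analyst and management judgment and are not modeled here.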

A summary of this approach to process analysis is shown below in Figure 2. It suggests how a systems analyst could analyze data flows to move from observations of current practice in an organization to understanding (a) key characteristics of current practice, (b) key characteristics of improved practice, and (c) potential redesigns that could become future practice to support improved business performance. The curved arrow in the figure suggests the analytical progression that would support such conclusions.

We will refer to the analytical perspective that focuses on data movement and the processing implications of work tasks as data flow analysis. The following pages describe a graphical approach to understanding data flows that provides a way to build data flow models quickly and consistently. These models offer a useful way to compress large amounts of process information into a two-dimensional space that assists in both understanding the logic of a process and identifying key data entities within the process. For this reason, data flow analysis can be seen as a technique complementary to entity-relationship diagramming. Indeed, the data flow diagrams (DFDs) that data flow analysis produces can be seen as a data-driven process representation that is one step further away from the database-design-specific characteristics of entity-relationship diagramming (see Figure 3).


© 1999 Charles Osborn