The term “business process” has a strong connotation of applying the technology to business procedures in support of large enterprises. However, long-running processes also occur in science and engineering, which also have developed some degree of automation support. Since their capabilities are very similar to those we described for business processes, we summarize them only briefly.
Software systems that support scientific experimentation need to deal with long-running processes. Scientists call these workflows rather than business processes, but the concept is the same. A typical scenario is to use a pipeline of tools that takes raw data from a physical scientific experiment and transforms it into a meaningful interpretation of the result of the experiment. For example, in bioinformatics, an experiment might involve putting the liquid result of a wet-lab experiment into a scientific instrument, such as a mass spectrometer or micro-array. The output of the instrument is a file. That file is then pipelined through a sequence of data analysis tools, ultimately producing results that can be interpreted by a scientist. The analysis may be run thousands of times on different samples.
There are several ways in which automation of workflows can help scientists, such as the following:
A scientist can write a workflow definition that drives the execution of the multistep experiment. The workflow management system maps the computational steps onto a multiprocessor computing facility and monitors and manages their execution.
A scientist can review the history of workflow executions. This history, which scientists usually call provenance, can give the exact steps that were executed to produce a particular output. The ability to run queries to find the provenance of certain experiments helps enable the reproducibility of experiments. This is especially valuable when the process has manual steps and different executions of the workflow have different manual steps.
A workflow system can capture the sequence of steps of a process so that it can be replayed many times. Initial experiments may involve many manual steps. But as the process is perfected, the same steps are executed in each replay. It is therefore helpful if the workflow system can transform an execution history into a script that can be re-executed many times.
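The three capabilities above can be illustrated with a minimal sketch. It is not modeled on any particular scientific workflow system; the names `Workflow`, `ProvenanceRecord`, `provenance`, and `script_from`, as well as the toy analysis steps, are assumptions made for illustration. The sketch runs a pipeline of steps, records the history of each execution, answers a simple provenance query, and turns a recorded execution back into a script that can be replayed on new data.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ProvenanceRecord:
    step: str    # name of the step that ran
    input: Any   # data handed to the step
    output: Any  # data the step produced


@dataclass
class Workflow:
    # Registry of available tools, keyed by step name (hypothetical).
    steps: Dict[str, Callable[[Any], Any]]
    # One list of provenance records per execution.
    history: List[List[ProvenanceRecord]] = field(default_factory=list)

    def run(self, step_names: List[str], data: Any) -> Any:
        """Run the named steps as a pipeline and record what happened."""
        records = []
        for name in step_names:
            result = self.steps[name](data)
            records.append(ProvenanceRecord(name, data, result))
            data = result
        self.history.append(records)
        return data

    def provenance(self, output: Any) -> List[List[str]]:
        """Step sequences of every execution whose final output equals `output`."""
        return [[r.step for r in run]
                for run in self.history
                if run and run[-1].output == output]

    def script_from(self, run_index: int) -> List[str]:
        """Turn a recorded execution into a list of steps that can be replayed."""
        return [r.step for r in self.history[run_index]]


# Toy analysis tools standing in for real instruments and analysis programs.
wf = Workflow(steps={
    "calibrate": lambda xs: [x - 1.0 for x in xs],
    "filter_noise": lambda xs: [x for x in xs if x > 0],
    "summarize": lambda xs: sum(xs) / len(xs),
})

result = wf.run(["calibrate", "filter_noise", "summarize"], [1.5, 0.5, 3.0])
script = wf.script_from(0)            # capture the first execution as a script
replay = wf.run(script, [2.0, 4.0])   # replay it on a different sample
```

A real system would also persist the history, record manual steps, and schedule the steps on a computing facility; the sketch keeps everything in memory to show only the shape of the idea.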
As of this writing, scientists have their own workflow management systems, which are different from those used for business processes. However, there is a growing awareness of the strong similarities of these two technologies. It therefore seems likely that more technology sharing between these two communities will develop.
Configuration management systems help engineers manage shared designs. A similar kind of system, called a product data management system, is used for discrete manufacturing. In these systems, design information typically is stored in files, which are grouped into configurations, each of which corresponds to some component being designed. The system offers check-out–check-in functionality. A user checks out the files he or she needs to work on. After the work is completed, the user checks them back in. The work that was done between the check-out and check-in can be thought of as a step in the design process. A design tool may be invoked to evaluate the result of that step. If the result passes the test, it has to be recorded in the project management system where the change request originated. If not, it has to be returned to the engineer to redo the design step.
For the most part, the steps of such a configuration management process are manual. However, they often follow a well-defined engineering process that could be codified as a business process definition. Thus, they can benefit from some degree of software automation to track the state of each process and to review its history long after it executed. Currently, this type of functionality usually is built as a special function in a configuration management product, rather than using general-purpose business process management tools.
Configuration management also is used to manage complex computer systems. This is more of an operational activity than a design activity. However, the business process functionality is largely the same. The steps required to perform certain system management functions are specified as a business process, such as steps to add a new user to the system or to add a new server to the network. Thus, some degree of automation to track process state is valuable here too.
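As a rough illustration of what codifying such a process might look like, the following sketch tracks the state of one design step and keeps its history for later review. The state names, event names, the `DesignStep` class, and the change request identifier are all hypothetical; a real configuration management product would define its own.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "requested"      # change request waiting for an engineer
    CHECKED_OUT = "checked_out"  # engineer is working on the files
    CHECKED_IN = "checked_in"    # work is done, awaiting evaluation
    RECORDED = "recorded"        # result recorded in project management


# Allowed transitions of the (hypothetical) design process definition.
TRANSITIONS = {
    (State.REQUESTED, "check_out"): State.CHECKED_OUT,
    (State.CHECKED_OUT, "check_in"): State.CHECKED_IN,
    (State.CHECKED_IN, "test_passed"): State.RECORDED,
    (State.CHECKED_IN, "test_failed"): State.REQUESTED,  # redo the design step
}


class DesignStep:
    def __init__(self, change_request: str):
        self.change_request = change_request
        self.state = State.REQUESTED
        self.history = []  # (event, state before, state after)

    def apply(self, event: str) -> None:
        """Advance the process, rejecting events the definition does not allow."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {self.state.name}")
        new_state = TRANSITIONS[key]
        self.history.append((event, self.state, new_state))
        self.state = new_state


# One design step that fails its evaluation once and is then redone.
step = DesignStep("CR-17")
for event in ["check_out", "check_in", "test_failed",
              "check_out", "check_in", "test_passed"]:
    step.apply(event)
```

The steps themselves remain manual; the automation only records which state each process is in and how it got there.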
One interesting aspect of configuration management compared to normal TP systems is that the steps of a configuration management process require application-specific logic to make them serializable, due to concurrent checkout steps. For example, suppose Alice checks out file F and then Bob checks out F too. Alice modifies F, thereby creating F′, and checks in F′. Then Bob modifies his copy of F, thereby creating F″, and checks in F″. At check-in time, the configuration management system knows that Bob’s initial state of F was overwritten by Alice. It therefore knows that it would be incorrect to overwrite Alice’s version F′ by Bob’s version F″. Instead, a configuration management system would ask that Bob’s changes to F be merged into F′. The system might help by finding the differences between F and F″, and then helping Bob add those changes to F′. Or it might find the differences between F and F′ and the differences between F and F″, merge those changes, and then apply the merged changes to F. In both solutions, the intent is to make it appear that Bob actually made his modifications to F′, not to F; that is, to make it appear that Alice’s and Bob’s modifications ran serially. We will see that this is an instance of a general problem that arises in TP when independent transactions modify different copies of the same data, in this case different copies of F. We discuss a variety of general-purpose solutions to the problem in Section 9.5, Multimaster Replication. Those solutions don’t solve the problem for configuration management per se, but they have the same property of identifying independent and hence conflicting changes and requiring that they be merged together in an application-specific way.
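The Alice and Bob scenario can be sketched in a few lines. The sketch follows the second approach described above: compute the differences between F and F′ and between F and F″, combine them, and apply the result to F. To keep the merge logic short, we assume F is a flat mapping of design parameters rather than a text file; that assumption, and the names `Repository`, `StaleCheckIn`, `MergeConflict`, `diff`, and `three_way_merge`, are ours and do not describe any particular product.

```python
class StaleCheckIn(Exception):
    """The checked-out version was overwritten by someone else's check-in."""


class MergeConflict(Exception):
    """Both parties changed the same parameter in different ways."""


_DELETED = object()  # marks a parameter that one side removed


class Repository:
    def __init__(self, content):
        self.versions = [dict(content)]  # versions[i] is the state after i check-ins

    def check_out(self):
        version = len(self.versions) - 1
        return version, dict(self.versions[version])

    def check_in(self, base_version, content):
        # Refuse to overwrite a version the user never saw.
        if base_version != len(self.versions) - 1:
            raise StaleCheckIn(f"version {base_version} is no longer current")
        self.versions.append(dict(content))


def diff(base, changed):
    """Changes that turn `base` into `changed`, as {parameter: new value}."""
    delta = {k: v for k, v in changed.items() if base.get(k, _DELETED) != v}
    delta.update({k: _DELETED for k in base if k not in changed})
    return delta


def three_way_merge(base, theirs, mine):
    """Apply both sets of changes to `base`; fail if they contradict each other."""
    d_theirs, d_mine = diff(base, theirs), diff(base, mine)
    conflicts = {k for k in d_theirs.keys() & d_mine.keys()
                 if d_theirs[k] != d_mine[k]}
    if conflicts:
        raise MergeConflict(sorted(conflicts))
    merged = dict(base)
    for k, v in {**d_theirs, **d_mine}.items():
        if v is _DELETED:
            merged.pop(k, None)
        else:
            merged[k] = v
    return merged


repo = Repository({"width": 10, "height": 5, "material": "steel"})  # F
v_alice, f_alice = repo.check_out()
v_bob, f_bob = repo.check_out()

f_alice["width"] = 12
repo.check_in(v_alice, f_alice)          # Alice checks in F'

f_bob["height"] = 7                      # Bob's copy is now F''
try:
    repo.check_in(v_bob, f_bob)
except StaleCheckIn:
    base = repo.versions[v_bob]          # the F that Bob started from
    v_latest, latest = repo.check_out()  # Alice's F'
    repo.check_in(v_latest, three_way_merge(base, latest, f_bob))
```

After the merge, the repository holds both Alice’s and Bob’s changes, as if Bob had made his modifications to F′. When both change the same parameter to different values, the sketch raises `MergeConflict`, which is the point at which application-specific logic or the user has to decide.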