Monday, June 6, 2011

BPM Data Model Architecture

One of the primary assets of a BPM suite is how it manages process data. Executing, monitoring and improving a process all depend heavily on the data that the process engine manages. BPM suites persist process data in one of two ways: some store it as BLOBs, while others save it in a relational data model. This article explores the two routes BPM vendors take.

A BPM suite that stores process data in a relational data model gives technical users great insight and transparency into that data. Users can also track the process data against each step of the process execution. Tracking and reporting on historical process data becomes a cakewalk, since the required data can be pulled with a plain SQL query.
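To make the "just a SQL query" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name purchase_order_history, its columns and the sample rows are all hypothetical; a real BPM suite will have its own schema.

```python
import sqlite3

# Hypothetical schema: one row per executed step of a "purchase order" process.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE purchase_order_history (
        instance_id  INTEGER NOT NULL,
        step_name    TEXT    NOT NULL,
        amount       REAL    NOT NULL,
        completed_at TEXT    NOT NULL
    )
    """
)
# Made-up sample data for two process instances.
conn.executemany(
    "INSERT INTO purchase_order_history VALUES (?, ?, ?, ?)",
    [
        (1, "Submit", 1200.0, "2011-06-01T09:00:00"),
        (1, "Approve", 1200.0, "2011-06-01T11:30:00"),
        (2, "Submit", 450.0, "2011-06-02T10:15:00"),
    ],
)


def step_history(connection, instance_id):
    """Return (step_name, amount, completed_at) rows for one instance, oldest first."""
    cursor = connection.execute(
        "SELECT step_name, amount, completed_at "
        "FROM purchase_order_history "
        "WHERE instance_id = ? ORDER BY completed_at",
        (instance_id,),
    )
    return cursor.fetchall()


history = step_history(conn, 1)
for step_name, amount, completed_at in history:
    print(step_name, amount, completed_at)
```

Because every step is an ordinary row, the same table can feed a report or an audit trail without any engine-specific decoding.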

A relational data model for process data also means that a third-party Business Intelligence tool can be plugged into the process database for advanced BI activities. Versioning and in-flight instance management of processes can also be handled better by the BPM suite. The performance of the process engine itself can improve as well, since the engine does not have to parse BLOBs to act on the process data; it only has to fetch the values from the database.
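The difference between parsing BLOBs and fetching columns can be sketched as follows. This is only an illustration under assumptions: the BLOB payload is taken to be JSON (real suites may use XML or a proprietary serialization), and the tables process_blob and loan_process are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# BLOB style: one generic table, process data hidden inside an opaque payload.
conn.execute(
    "CREATE TABLE process_blob (instance_id INTEGER PRIMARY KEY, payload BLOB)"
)
# Relational style: process data exposed as typed columns.
conn.execute(
    "CREATE TABLE loan_process ("
    "instance_id INTEGER PRIMARY KEY, applicant TEXT, amount REAL)"
)

records = [(1, "Alice", 5000.0), (2, "Bob", 25000.0), (3, "Carol", 40000.0)]
for instance_id, applicant, amount in records:
    payload = json.dumps({"applicant": applicant, "amount": amount}).encode("utf-8")
    conn.execute("INSERT INTO process_blob VALUES (?, ?)", (instance_id, payload))
    conn.execute(
        "INSERT INTO loan_process VALUES (?, ?, ?)", (instance_id, applicant, amount)
    )


def large_loans_from_blobs(connection, threshold):
    """Read every payload and parse it in application code before filtering."""
    matches = []
    rows = connection.execute(
        "SELECT instance_id, payload FROM process_blob ORDER BY instance_id"
    )
    for instance_id, payload in rows:
        data = json.loads(payload.decode("utf-8"))
        if data["amount"] > threshold:
            matches.append(instance_id)
    return matches


def large_loans_from_columns(connection, threshold):
    """Let the database filter on the column directly."""
    rows = connection.execute(
        "SELECT instance_id FROM loan_process WHERE amount > ? ORDER BY instance_id",
        (threshold,),
    )
    return [row[0] for row in rows]


print(large_loans_from_blobs(conn, 10000.0))
print(large_loans_from_columns(conn, 10000.0))
```

Both functions return the same instances, but the BLOB version has to read and decode every row, whereas the relational version hands the filter to the database, which is also the form a BI tool can query directly.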

Typically, in a relational data model, a process is represented as a database table. When a new process is created, a new database table is created and associated with it. So as the number of business processes grows, so does the number of database tables, and the data model grows along with them. In contrast, when process data is stored as BLOBs, a single monolithic database table keeps growing while the data model stays frozen, which heavily impacts database performance.
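The table-per-process idea can be sketched as below. This is a simplified illustration, not how any particular BPM product does it: the helper create_process_table, the process names and the field lists are all hypothetical.

```python
import re
import sqlite3

# Table and column names cannot be bound as SQL parameters, so validate them.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL"}


def create_process_table(connection, process_name, fields):
    """Create one table for a process; `fields` maps column name to SQL type."""
    if not _IDENTIFIER.fullmatch(process_name):
        raise ValueError(f"invalid process name: {process_name!r}")
    columns = ["instance_id INTEGER PRIMARY KEY"]
    for name, sql_type in fields.items():
        if not _IDENTIFIER.fullmatch(name) or sql_type not in _ALLOWED_TYPES:
            raise ValueError(f"invalid field: {name!r} {sql_type!r}")
        columns.append(f"{name} {sql_type}")
    connection.execute(f"CREATE TABLE {process_name} ({', '.join(columns)})")


def list_tables(connection):
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
# Each newly defined process adds one table to the data model.
create_process_table(conn, "leave_request", {"employee": "TEXT", "days": "INTEGER"})
create_process_table(conn, "expense_claim", {"employee": "TEXT", "total": "REAL"})
print(list_tables(conn))
```

Each process gets its own table with its own columns, so the data model grows with the number of processes instead of one generic table absorbing every instance.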

Moreover, the BPM product architecture would be better able to scale to future demands on BPM features. For example, if a new or existing feature of the XPDL (XML Process Definition Language) schema needs to be implemented, having the process data in database tables instead of BLOBs would greatly ease the implementation.

Process data stored as BLOBs does not hold any significant advantage over a relational data model. So why did some BPM suites go with this approach? That may be best answered by the BPM product vendors who chose to store process data as BLOBs.
