Over the last two decades, workflow technology has become the cornerstone of modern application systems, in particular those built upon Service Oriented Architecture (SOA). It helps implement business processes that can be easily adapted to the changing needs of a dynamic environment and provides the base for the two-level application development paradigm. Workflow Management Systems (WfMS) deliver the functions of workflow technology; they have become a critical middleware component whose performance characteristics significantly affect the overall performance of the applications built on them. In this thesis, a set of optimization techniques for a state-of-the-art WfMS has been developed that delivers the required robustness with the best achievable performance. This WfMS has been labeled Stuttgarter Workflow Maschine (SWoM) to emphasize its birthplace. Several novel approaches have been developed to achieve the desired goal: (1) the concept of transaction flows has been developed as the base for a flow optimizer that significantly improves the performance of the workflow engine; (2) caching is applied in virtually all areas of the components that make up the SWoM; (3) the SWoM is tied closely into the underlying infrastructure for optimal resource exploitation, where the infrastructure, IBM WebSphere as the application server and IBM DB2 as the database environment, provides the necessary robustness; and (4) a flow optimizer optimizes the execution of the transaction flows with respect to cache, database, and CPU cycle usage, based on user recommendations and statistical information that the SWoM collects during execution. These optimization techniques, which maintain middleware robustness, are complemented by a set of optimization techniques that improve performance by relaxing some of the stringent robustness requirements.
All techniques have been validated using a simple yet expressive benchmark that focuses on high-speed, message-based interactions, synchronous invocation, message transformation, and parallel execution. The benchmark results show that the SWoM scales almost linearly with respect to CPU load and the parallelism of requests. The maximum throughput that the SWoM achieved on a quad-core CPU running 64-bit Windows 7 was more than 100 process instances per second.