This project provides a lightweight set of utilities that make it simple to implement parallelized data processing systems. Data objects flowing through the pipeline are processed by a series of independent user-defined components called Stages . A pipeline may have a number of different branches of execution, each of which is a fully qualified Pipeline in its own right.
The Stage is the primary unit of execution in a processing pipeline. Each Stage has a well defined lifecycle that passes through setup, processing, and cleanup states. When a pipeline starts running, each Stage runs its setup method and then waits for an object to be passed to its process() method. This method is implemented by the user to provide some sort of useful operation on the data, after which the object or products of the processing can be passed to subsequent stages and/or branches.
A stage can act as a filter for data, or may have more complex responsibilities such as finding and allocating resources for subsequent stages.
Several threading models are available to choose from including synchronous processing (each object input to the pipeline passes through the entire pipeline before the next is processed), and a couple of different multithreaded processing models that run each stage in either a single dedicated thread or using a pool of threads to perform processing. The threading model chosen for each stage is independent of both the stage implementation and of the threading models used for other stages, providing a great deal of flexibility when it comes to configuring the processing system. Threading models are defined by implementations of the StageDriver interface.
In addition to sequential processing of objects, a simple event model is provided to enable asynchronous communication between stages in the pipeline and its branches.
This project has no released versions.
The JavaDoc API documentation is available online.