
A component based approach to scientific workflow management

Nigel Baker, +4 more
- Vol. 583, Iss: 1, pp 155-160

Available on CMS information server
CMS NOTE 2001/024
The Compact Muon Solenoid Experiment
Mailing address: CMS CERN, CH-1211 GENEVA 23, Switzerland
CMS Note
April 30, 2001
A Component Based Approach to Scientific Workflow
Management
J.-M. Le Goff, Z. Kovacs
CERN, Geneva, Switzerland
N. Baker, P. Brooks, R. McClatchey
Centre for Complex Cooperative Systems, Univ. West of England, Frenchay, Bristol BS16 1QY UK
Abstract
CRISTAL is a distributed scientific workflow system used in the manufacturing and production phases of HEP
experiment construction at CERN. The CRISTAL project has studied the use of a description driven approach,
using meta-modelling techniques, to manage the evolving needs of a large physics community. Interest from
such diverse communities as bio-informatics and manufacturing has motivated the CRISTAL team to re-engineer
the system to customize functionality according to end user requirements but maximize software reuse in the
process. The next generation CRISTAL vision is to build a generic component architecture from which a
complete software product line can be generated according to the particular needs of the target enterprise. This
paper discusses the issues of adopting a component product line based approach and our experiences of software
reuse.
Keywords: Components, Workflow Management, Multi-Layer Architectures, Meta-Objects

1. Introduction
As component technology evolves and matures, system developers will gradually migrate from
systems composed of interoperable objects to those composed of interoperable components. One of the main
motivations for this migration is the potential for software reuse and its associated benefits: lower cost and
shorter time to market for software products. Component-based software development is concerned with constructing
software artifacts by assembling prefabricated, configurable building blocks. However, software reuse is
concerned with more than binary components. For many organizations it means the generation and application of
generic software assets that are reusable across a family of target products. Binary components are just one view
of the software development process. The creation and evolution of graphical models to visualize specific aspects
of software artifacts is another view. What is required is a software development process that couples these
high-level development approaches with implementation approaches. This paper opens with a brief discussion of
the context and motivations for this research, followed by an outline of software product lines. The issues and the
team’s experience of software reuse are then discussed. The final part of the paper concentrates on the divide between
object-based modeling and component-based development, which is preventing software reuse from reaching its
full potential.
2. Motivation
CRISTAL is a scientific workflow system[1] that is being used to control the production and assembly process of
the CMS Electromagnetic Calorimeter (ECAL) detector at CERN, Geneva. Detector production is a collaborative
effort with production centres distributed across many institutes worldwide. The production process is unusual in
that only one final product is manufactured; however, the types of parts from which it is assembled may exist
in many versions. The evolution of the detector will take many years, and during this process the history of
versioned parts must be captured. The final detector will be part of a high energy physics experiment, so the
collection and storage of manufacturing and production data is just as important as control of the process.
This stored data not only gives the “as built” view of the final system but has been designed as a warehouse to
provide “calibration views”, “maintenance views” and other views not yet conceived by the designers. It is these
specialized aspects which characterize CRISTAL as a scientific workflow system. A general workflow system is
used to coordinate and manage the execution of the thousands of tasks and activities that occur in any complex
enterprise. Workflow management can be applied to diverse applications, from banking to manufacturing. In each
case the system must be capable of describing and storing the tasks and activities of the domain to be automated,
executing and coordinating the tasks, and storing the outcomes. CRISTAL has taken an object-oriented approach,
describing all parts, manufacturing and production tasks, and manufacturing and production data using meta-modeling
techniques. As a consequence of the uncertainty and specialized nature of the application, the core meta-model of
the CRISTAL workflow system has the potential to be applied to almost any workflow or enterprise resource
management application. The motivation to achieve this potential has stemmed from requests to apply the core
CRISTAL technology to the bio-informatics and general manufacturing domains. However, a number of problems
remain to be solved before our workflow software can cope with the demands of such a diverse
product family.
3. Software Product Lines
The product family problem is well known. A software product line [2,3] is a set of software systems that share a
common set of features that satisfy a specific market demand. The key idea is to build shared assets that can be
instantiated and combined to develop instances of the product line. As with a manufactured product line,
the software products will:
Pertain to an application domain and market
Share an architecture and
Be built from reusable components
The application domain is reasonably clear in our particular example, but the issues that surround a common
architecture and components are less so. The following subsections explore these issues in further depth, based on our
software engineering experience.

3.1 Software Architecture
A product line software architecture[4] is the central artifact in product line engineering because it provides the
framework for developing and integrating shared assets and must be common to all the products. Naturally, the
common user requirements map to the standard architecture, but product-specific requirements must map to
variations provided for by the architecture. It is these specific requirement variations that define the particular
product line. The problem is how best to manage and include mechanisms for this variation. Reference [5] discusses
methods to model and capture this variation. Standard computing mechanisms to cope with variation are:
Alternate selection using “if then else” flow control
Alternate selection using parameters
And in object orientation the use of inheritance, delegation and meta-models.
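
The mechanisms listed above can be sketched side by side. The following Python fragment is only an illustration: the inspection task, the product names and every class and function name are hypothetical and do not come from CRISTAL. Meta-models, the last mechanism in the list, are not shown in this fragment.

```python
from dataclasses import dataclass
from typing import Callable


# 1. Alternate selection using "if then else" flow control.
def inspect_flow_control(product: str, part_id: str) -> str:
    if product == "detector":
        return f"{part_id}: optical and dimensional inspection"
    else:
        return f"{part_id}: visual inspection"


# 2. Alternate selection using parameters.
def inspect_parameterised(part_id: str, checks: tuple[str, ...] = ("visual",)) -> str:
    return f"{part_id}: {' and '.join(checks)} inspection"


# 3a. Inheritance: a subclass overrides the step that varies.
class Inspection:
    def checks(self) -> tuple[str, ...]:
        return ("visual",)

    def run(self, part_id: str) -> str:
        return f"{part_id}: {' and '.join(self.checks())} inspection"


class DetectorInspection(Inspection):
    def checks(self) -> tuple[str, ...]:
        return ("optical", "dimensional")


# 3b. Delegation: the step that varies is handed to a collaborator.
@dataclass
class DelegatingInspection:
    check_provider: Callable[[], tuple[str, ...]]

    def run(self, part_id: str) -> str:
        return f"{part_id}: {' and '.join(self.check_provider())} inspection"
```

All four variants produce the same result for the same variation; they differ in where the variation is decided: in the calling code, in an argument, in the class hierarchy, or in a collaborating object.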
In our experience of building workflow systems, one of the benefits of meta-modeling is that, with careful analysis
and use of descriptive classes, a core generic software architecture can be developed to support almost any type of
workflow system. A discussion of the concepts and benefits of meta-modeling is outside the scope of this paper, but
more details can be found in [6].
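
As a rough illustration of what "descriptive classes" can mean, the sketch below separates descriptions (the meta level) from the concrete parts they describe. It is a minimal example under our own assumptions, not the CRISTAL meta-model: the class names, fields and the "weigh" and "polish" activities are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ActivityDescription:
    """Describes a kind of task (meta level), not one execution of it."""
    name: str
    required_outcome_fields: tuple[str, ...]


@dataclass(frozen=True)
class ItemDescription:
    """Describes a kind of part and the activities it must pass through."""
    name: str
    version: int
    activities: tuple[ActivityDescription, ...]


@dataclass
class Item:
    """A concrete part whose behaviour is driven by its description."""
    item_id: str
    description: ItemDescription
    outcomes: dict[str, dict[str, object]] = field(default_factory=dict)

    def pending(self) -> list[str]:
        """Names of described activities that have no stored outcome yet."""
        return [a.name for a in self.description.activities
                if a.name not in self.outcomes]

    def record(self, activity_name: str, outcome: dict[str, object]) -> None:
        """Store an outcome after checking it against the description."""
        by_name = {a.name: a for a in self.description.activities}
        if activity_name not in by_name:
            raise KeyError(f"activity not described: {activity_name}")
        required = by_name[activity_name].required_outcome_fields
        missing = [f for f in required if f not in outcome]
        if missing:
            raise ValueError(f"missing outcome fields: {missing}")
        self.outcomes[activity_name] = outcome


# Usage: a new kind of part is added by writing data, not new classes.
weigh = ActivityDescription("weigh", ("mass_g",))
polish = ActivityDescription("polish", ())
part_v1 = ItemDescription("part", 1, (weigh, polish))
item = Item("P-0001", part_v1)
item.record("weigh", {"mass_g": 1200.5})
```

The generic `Item` class never changes; supporting a different workflow means supplying different description objects.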
Applying the meta-modeling approach to a product line of workflow managers (that is, workflow managers for
the production of aircraft, cars, kitchens etc.) would necessitate describing the activities and items in order to configure the
architecture for the particular workflow manager in the product line. Compare this with a more software component
based approach, in which the actual product line goes through a software build process where variational
components are linked in or omitted according to the features of the target product. The former approach makes
for a configurable, adaptive architecture, but the latter is required to support a product line.
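
The component based build step described above can be pictured as follows. This is a sketch under stated assumptions: the component names, feature names and the `build_product` function are hypothetical, and a real build would link binaries rather than collect strings.

```python
from dataclasses import dataclass

# Components shared by every product in the line (hypothetical names).
CORE = ("workflow_engine", "data_store")

# Variation points: feature name -> component linked in when selected.
OPTIONAL = {
    "versioned_parts": "part_versioning",
    "calibration": "calibration_views",
    "sample_tracking": "sample_tracker",
}


@dataclass(frozen=True)
class ProductBuild:
    product: str
    components: tuple[str, ...]


def build_product(product: str, features: set[str]) -> ProductBuild:
    """Link in the core plus one component per selected feature."""
    unknown = features - set(OPTIONAL)
    if unknown:
        raise ValueError(f"unsupported features: {sorted(unknown)}")
    selected = tuple(OPTIONAL[f] for f in sorted(features))
    return ProductBuild(product, CORE + selected)


detector = build_product("detector", {"calibration", "versioned_parts"})
```

Here the variation is fixed when the product is built, whereas in the description driven approach it is supplied as data to an unchanged architecture.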
3.2 Reusable Components
Product line engineering practice advocates the generation and application of generic software assets that are
reusable across a family of target products. It suggests analyzing common and variable product characteristics to
define the scope of reuse and to identify reusable components with a suitable level of generality. The expected benefits are
the production of quality, cost-effective software, rapid application development and improved maintenance. It
emphasizes strategic, planned reuse rather than opportunistic reuse. That is, it is not just about libraries, class
hierarchies or configurable architectures. This is reuse at a very high level, disconnected from implementation
issues. The following section discusses the implementation issues of software reuse and its application to
components and component-based development, and the final part of the paper attempts to link the two together.
4. Software Reuse
Software reuse is not a new concept as Figure 1 illustrates. Early efforts focused on small-grained reuse of
software code. Our experience over the past 10 years of building object-oriented systems has convinced us that
most reuse has come from higher-level design artifacts.
Figure 1: A History of Software Reuse.
[Timeline: 1960’s Subroutines; 1970’s Modules; 1980’s Objects; 1990’s Components; 2000 Frameworks]

Very little code has been reused, apart from class library reuse, which has mainly been confined to client-side user interfaces. So why
so little reuse at the code level? One explanation appears to be that the cost of creating and using these
small-grained assets often outweighed the modest gains. Another important factor is that the underlying software
technology is moving so fast, which matters especially in software projects with long time scales. For example, object
technology has witnessed, in a short space of time, Smalltalk, Ada, C++, Java, EJB, COM+, ActiveX and OMG
CORBA.
Where we have experienced more success in the reuse of software artifacts is with visual modeling languages such as
the Object Modeling Technique (OMT) and the Unified Modeling Language (UML)[7]. The creation and evolution
of graphical models using UML has allowed us to specify, visualize, construct and document the artifacts of the
software systems we have built. Building UML models has provided a structure for problem solving and allowed
us to contemplate large-scale system problems. Derived from OMT, UML version 1.1 was adopted as an Object
Management Group (OMG) standard in November 1997, with a recent minor version, UML 1.3, adopted in
November 1999. Usually the great thing about standards is that there are lots to choose from. However, in contrast
to the rapidly changing implementation software technology, UML is the universal OAD modeling standard used
by OMG member organizations and Microsoft. Perhaps because of this stability we have, over the years, been able
to reuse large-grained architectural frameworks and patterns which have been captured in UML. The terms
pattern, framework and component are somewhat overloaded, and the following subsections provide working
definitions and discuss our experiences of reuse.
4.1 Patterns
A pattern[8] is a solution schema, expressed in terms of objects and classes, for recurring design problems within a
particular context. Patterns focus on reuse of abstract designs and software architecture, which are usually
described using graphical modeling notation. In UML this specification is done using interaction, class and
object diagrams. The patterns that we have reused in the construction of our workflow management system[9]
have evolved out of years of proven design experience. Although patterns are made up of graphical diagrams, the accompanying
documentation provides a shared vocabulary and understanding of concepts amongst the team. The documentation describes
heuristics for use and applicability, although this is not modeled in UML. In the object-oriented community, well-known
patterns are named, described and cataloged for reuse by the community as a whole. We have not only
used many well-known patterns but have also discovered new patterns in the domain of workflow management. This has
enabled us to make use of design patterns that were proven on previous projects and is a good example of reuse at
the larger grain level. UML diagrams are able to describe pattern structure but provide no support for describing
pattern behavior, nor any notation for the pattern template. UML 1.4, which is in draft stage, will enhance the
notation for patterns.
4.2 Frameworks
A framework is the term given to a more powerful and large grained object oriented reuse technique. It is a
reusable semi-complete application that can be specialized to produce custom applications [10]. It specifies a
reusable architecture for all or part of a system and may include reusable classes, patterns or templates.
Frameworks focus on reuse of concrete design algorithms and implementations in a particular programming
language. Frameworks can be viewed as the reification of families of design patterns. When a framework is specialized for a
particular application it is called an application framework, and Fayad[11] identifies three categories:
System Infrastructure, where frameworks are applied to operating systems, network communications and
GUIs.
Middleware, applied to ORBs and transactions.
Enterprise Frameworks, which address domains such as telecommunications, business and manufacturing.
Framework requirements are defined by software vendors or standards organizations, for example IBM’s San
Francisco Project, FASTech’s FACTORYworks, and Motorola’s CIM Baseline. Fingar[12] maintains that most
frameworks should capture workflows since they provide the necessary modeling capabilities for constructing
any business process. He states that workflow management is one of the elements common to all e-commerce
applications and is essential. Many proponents of frameworks go so far as to suggest that workflow mechanisms
should eliminate the need for most application programming in the workplace.
Frameworks can also be classified according to the techniques used to extend them. Whitebox frameworks rely
on OO language features such as inheritance and dynamic binding. Blackbox frameworks are structured and
extended using object composition and delegation.
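
The whitebox and blackbox distinction can be shown with a deliberately small framework. The sketch below is illustrative only; the class names and the "assembled" task are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable


# Whitebox: extend by subclassing and overriding a hook method
# (inheritance and dynamic binding).
class WhiteboxWorkflow(ABC):
    def run(self, task: str) -> str:
        return f"done: {self.execute(task)}"

    @abstractmethod
    def execute(self, task: str) -> str:
        """Hook supplied by the application-specific subclass."""


class AssemblyWorkflow(WhiteboxWorkflow):
    def execute(self, task: str) -> str:
        return f"assembled {task}"


# Blackbox: extend by composing the framework with a plug-in object
# and delegating the variable step to it.
class BlackboxWorkflow:
    def __init__(self, executor: Callable[[str], str]) -> None:
        self._executor = executor

    def run(self, task: str) -> str:
        return f"done: {self._executor(task)}"


def assemble(task: str) -> str:
    return f"assembled {task}"
```

`AssemblyWorkflow().run("module")` and `BlackboxWorkflow(assemble).run("module")` give the same result; the whitebox user needs to know the framework's class internals, while the blackbox user only needs to know the shape of the plug-in.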

Component frameworks are specialized frameworks that are designed to support components. D’Souza[13]
describes a component-based framework as a collaboration in which all the components are specified with type
models; some of them may come with their own implementations. To use the framework you plug in components
that fulfill the specifications. Three main industrial examples of component frameworks are OMG’s CORBA
Component Model (CCM), Enterprise JavaBeans (EJB) and Microsoft’s COM+.
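
The idea of plugging in components that fulfill a specification can be approximated in Python with `typing.Protocol`. This is a sketch under our own assumptions, not the API of CCM, EJB or COM+; the `Storage` specification and the `Framework` class are hypothetical, and the runtime check only verifies that the required methods exist, not their signatures.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class Storage(Protocol):
    """Type specification for the storage slot of the framework."""
    def save(self, key: str, value: str) -> None: ...
    def load(self, key: str) -> str: ...


class InMemoryStorage:
    """One component that fulfils the Storage specification."""
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]


class Framework:
    """A collaboration with a single slot, filled by plugging in."""
    def __init__(self) -> None:
        self._storage: Optional[Storage] = None

    def plug_in(self, component: object) -> None:
        # Accept only components that provide the specified operations.
        if not isinstance(component, Storage):
            raise TypeError("component does not fulfil the Storage specification")
        self._storage = component

    def record_outcome(self, task: str, outcome: str) -> None:
        if self._storage is None:
            raise RuntimeError("no storage component plugged in")
        self._storage.save(task, outcome)

    def outcome_of(self, task: str) -> str:
        if self._storage is None:
            raise RuntimeError("no storage component plugged in")
        return self._storage.load(task)
```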
4.3 Components
A component is defined as a package of software that can be independently replaced. It both provides and requires services
based on specified interfaces [13]. It conforms to architectural standards so that it can plug in and interoperate
with other components. The granularity of components can vary from an instance of a single class to many classes, and a component
can be a significant part of a system, consistent with the goal of reuse. Unlike classes, components contain
implementation elements such as source, binary executables or scripts. Components are binary-replaceable things,
and this distinction more than anything else sets them apart from classes. They package implementation and,
because of interface-based design, can be replaced. This means that when a new variant of a component is created
it can replace a previous one without recompiling other components, provided it conforms to the same interface.
Software developers can build applications by assembling components rather than designing and coding.
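
A small sketch may help to make "provides and requires services based on specified interfaces" concrete. Python has no notion of binary replacement, so the example only mimics the idea at source level; the `Scheduler` and notifier names are hypothetical.

```python
from typing import Protocol


class Notifier(Protocol):
    """Interface REQUIRED by the Scheduler component."""
    def notify(self, message: str) -> None: ...


class TaskRunner(Protocol):
    """Interface PROVIDED by the Scheduler component."""
    def run(self, task: str) -> None: ...


class ListNotifier:
    """First variant of the notifier component."""
    def __init__(self) -> None:
        self.messages: list[str] = []

    def notify(self, message: str) -> None:
        self.messages.append(message)


class UppercaseNotifier:
    """Replacement variant that conforms to the same interface."""
    def __init__(self) -> None:
        self.messages: list[str] = []

    def notify(self, message: str) -> None:
        self.messages.append(message.upper())


class Scheduler:
    """Provides TaskRunner; depends only on the Notifier interface."""
    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def run(self, task: str) -> None:
        self._notifier.notify(f"finished {task}")
```

Swapping `ListNotifier` for `UppercaseNotifier` requires no change to `Scheduler`, because `Scheduler` is written against the interface rather than against either implementation.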
In order to gain the payoff in software reuse advocated by product line engineering, there are some difficulties to
be resolved. Whereas classes and patterns are modeled at the analysis and design phase, components are
modeled at the implementation phase. This, in our experience, is a particular problem where we have so much
invested in graphical models. The essence of the problem is how a collection of classes modeled at the UML
OAD level becomes a set of implementation components. This raises the follow-up question: is UML capable of
component modeling? Although components have become the de facto standard for desktop development, this is
not the case for server development. In summary, leading component architectures have matured and evolved to
support enterprise applications. What is not clear is whether graphical modeling languages and tools can support
the leading component architectures to deliver the goals of product-line engineering. The following section
discusses these issues.
Figure 2: UML Component Notation
5. Modeling Components
A component in UML is a software artifact that exists at runtime. The notation for modeling components in UML
is shown in Figure 2. In the top part of the figure the longhand notation for a component is shown, complete with
attributes and operations. This particular component realizes two interfaces, interface One and interface
Two. Underneath is shown the shortened notation, where almost all of the detail is hidden. The two interfaces of
the component are shown as so-called “lollipops”. UML components are typically found in implementation-related
component diagrams and deployment diagrams.
The ability to model component frameworks is just as essential as being able to model components. Although
component frameworks vary, they do conform to a common architectural pattern. Figure 3, adapted from
Kobryn[14], illustrates this common pattern using UML notation. The pattern is represented by the UML 1.4
ellipse with dashed perimeter and contains a number of classifiers. The client represents an entity that requests a
[Figure 2 labels: <<Interface>> One; Component Name (Attributes; Operations: + op1( )); <<Interface>> Two]

References
Pattern-oriented Software Architecture: A System of Patterns
Objects, Components, and Frameworks with UML: The Catalysis Approach
PuLSE: a methodology to develop software product lines
Component-based frameworks for e-commerce
A systematic approach to derive the scope of software product lines