Showing papers in "Journal of Database Management in 2003"


Journal ArticleDOI
TL;DR: This paper presents a framework for managing data quality in e-business environments using the information product approach, posits the notion of a virtual business environment to support dynamic decision-making, and describes the role of the data quality framework in that environment.
Abstract: Large data volumes, widely distributed data sources, and multiple stakeholders characterize typical e-business settings. Mobile and wireless technologies have further increased data volumes and further distributed the data sources, while permitting access to data anywhere, anytime. Such environments both empower and require decision-makers to act and react more quickly to all decision tasks, including mission-critical ones. Decision support in such environments demands efficient data quality management. This paper presents a framework for managing data quality in such environments using the information product approach. It includes a modeling technique to explicitly represent the manufacture of an information product, quality dimensions and methods to compute the data quality of the product at any stage in its manufacture, and a set of capabilities to comprehensively manage data quality and implement total data quality management. The paper also posits the notion of a virtual business environment to support dynamic decision-making and describes the role of the data quality framework in this environment.
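
As a concrete, greatly simplified illustration of the information product view, the Python sketch below propagates quality scores for a few dimensions through the stages that manufacture an information product. The dimension names, the stage structure, and the aggregation rule (mean of inputs discounted by a per-stage processing factor) are assumptions made for illustration; they are not the computation methods defined in the paper.

    from dataclasses import dataclass, field

    @dataclass
    class Stage:
        """One step in the manufacture of an information product."""
        name: str
        inputs: list                      # upstream Stage objects; empty for raw sources
        processing_factor: float = 1.0    # fraction of input quality retained by this step
        source_quality: dict = field(default_factory=dict)  # used only by raw sources

        def quality(self) -> dict:
            if not self.inputs:           # a raw data source reports its own quality
                return self.source_quality
            dims = self.inputs[0].quality().keys()
            return {d: self.processing_factor *
                       sum(s.quality()[d] for s in self.inputs) / len(self.inputs)
                    for d in dims}

    orders = Stage("order feed", [], source_quality={"accuracy": 0.95, "timeliness": 0.80})
    mobile = Stage("mobile updates", [], source_quality={"accuracy": 0.85, "timeliness": 0.98})
    report = Stage("sales report", [orders, mobile], processing_factor=0.97)

    print(report.quality())   # roughly {'accuracy': 0.873, 'timeliness': 0.863}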

195 citations


Journal ArticleDOI
TL;DR: The paper discusses why ontological theories can be used to inform conceptual modelling research, practice, and pedagogy, and describes how a particular ontological theory has enabled the author to improve his understanding of certain conceptual modelling practices and grammars.
Abstract: Conceptual modelling is an activity undertaken during information systems development work to build a representation of selected semantics about some real-world domain. Ontological theories have been developed to account for the structure and behavior of the real world in general. In this paper, I discuss why ontological theories can be used to inform conceptual modelling research, practice, and pedagogy. I provide examples from my research to illustrate how a particular ontological theory has enabled me to improve my understanding of certain conceptual modelling practices and grammars. I describe, also, how some colleagues and I have used this theory to generate several counter-intuitive, sometimes surprising predictions about widely advocated conceptual modelling practices - predictions that subsequently were supported in empirical research we undertook. Finally, I discuss several possibilities and pitfalls I perceived to be associated with our using ontological theories to underpin research on conceptual modelling.

114 citations


Journal ArticleDOI
TL;DR: It is concluded that the transformation operations in UML can automate a substantial part of both forward and reverse engineering.
Abstract: The Unified Modeling Language (UML) provides various diagram types for describing a system from different perspectives or abstraction levels. Hence, UML diagrams describing the same system are dependent and strongly overlapping. In this paper we study how this can be exploited for specifying transformation operations between different diagram types. We discuss various general approaches and viewpoints of model transformations in UML. The source and target diagram types for useful transformations are analyzed and categorized. The potentially most interesting transformation operations are discussed in detail. It is concluded that the transformation operations can automate a substantial part of both forward and reverse engineering. These operations can be used, for example, for model checking, merging, slicing, and synthesis.
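
As a toy illustration of one synthesis-style transformation of the kind categorized here, the sketch below derives a class-diagram fragment from a sequence diagram by turning every message a lifeline receives into an operation on the corresponding class. The dictionary encoding of the diagrams and the example messages are assumptions made purely for illustration.

    sequence_diagram = {
        "lifelines": ["Customer", "OrderService", "PaymentGateway"],
        "messages": [                                   # (sender, receiver, message)
            ("Customer", "OrderService", "placeOrder"),
            ("OrderService", "PaymentGateway", "authorize"),
            ("PaymentGateway", "OrderService", "confirm"),
        ],
    }

    def synthesize_class_fragment(seq: dict) -> dict:
        """Map each lifeline to a class whose operations are the messages it receives."""
        classes = {name: set() for name in seq["lifelines"]}
        for _sender, receiver, operation in seq["messages"]:
            classes[receiver].add(operation)
        return {cls: sorted(ops) for cls, ops in classes.items()}

    print(synthesize_class_fragment(sequence_diagram))
    # {'Customer': [], 'OrderService': ['confirm', 'placeOrder'], 'PaymentGateway': ['authorize']}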

46 citations


Journal ArticleDOI
TL;DR: The comprehensive, unambiguous treatment of this basic electronic commerce process is formal, yet intuitive and clear, suggesting that OPM is a prime candidate for becoming a common standard vehicle for defining, specifying, and analyzing electronic commerce and supply chain management systems.
Abstract: Object-Process Methodology (OPM) is a system development and specification approach that combines the major system aspects (function, structure, and behavior) within a single graphic and textual model. Having applied OPM in a variety of domains, this chapter specifies an electronic commerce system in a hierarchical manner, at the top of which are the processes of managing a generic product supply chain before and after the product is manufactured. Focusing on the post-product supply chain management, we gradually refine the details of the fundamental, almost "classical" electronic commerce interaction between the retailer and the end-customer, namely payment over the Internet using the customer's credit card. The specification results in a set of Object-Process Diagrams and a corresponding equivalent set of Object-Process Language sentences. The synergy of combining structure and behavior within a single formal model, expressed both graphically and textually, yields a highly expressive system modeling and specification tool. The comprehensive, unambiguous treatment of this basic electronic commerce process is formal, yet intuitive and clear, suggesting that OPM is a prime candidate for becoming a common standard vehicle for defining, specifying, and analyzing electronic commerce and supply chain management systems.

42 citations


Journal ArticleDOI
TL;DR: This work proposes a comprehensive methodology called IAIS (Inter Agency Information Sharing) that uses XML to facilitate the definition of information that needs to be shared, the storage of such information, the access to this information and finally the maintenance of shared information.
Abstract: Recently, there has been increased interest in information sharing among government agencies, with a view toward improving security, reducing costs and offering better quality service to users of government services. In this work, the authors complement earlier work by proposing a comprehensive methodology called IAIS (Inter Agency Information Sharing) that uses XML to facilitate the definition of information that needs to be shared, the storage of such information, the access to this information and finally the maintenance of shared information. The authors compare IAIS with two alternate methodologies to share information among agencies, and analyze the pros and cons of each. They also show how IAIS leverages the recently proposed XML (extensible markup language) standard to allow for inclusion of various groups’ viewpoints when determining what information should be shared and how it should be structured.
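
The general mechanism of XML-based sharing can be pictured with the short Python sketch below: one agency serializes only the fields that the participating groups agreed to share, and another agency parses the document back. The element names, the agreed-field set, and the example record are illustrative assumptions, not the IAIS schema.

    import xml.etree.ElementTree as ET

    def export_shared_record(record: dict, agreed_fields: set) -> str:
        """Serialize only the fields the agencies agreed to share."""
        root = ET.Element("sharedRecord", attrib={"agency": "AgencyA"})
        for name, value in record.items():
            if name in agreed_fields:
                ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")

    def import_shared_record(xml_text: str) -> dict:
        """Read a shared document back into a simple field/value mapping."""
        return {child.tag: child.text for child in ET.fromstring(xml_text)}

    doc = export_shared_record(
        {"licenseNo": "B-1029", "status": "revoked", "homeAddress": "12 Elm St."},
        agreed_fields={"licenseNo", "status"},   # the address is not part of the agreement
    )
    print(doc)                        # <sharedRecord agency="AgencyA">...</sharedRecord>
    print(import_shared_record(doc))  # {'licenseNo': 'B-1029', 'status': 'revoked'}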

40 citations


Journal ArticleDOI
TL;DR: An evaluation framework is presented that highlights the extent to which a methodology is component oriented; the improved approach suggests the use of the standard RM-ODP as an underlying framework to provide consistent, systematic, and integrated CBD methodology engineering support throughout the lifecycle.
Abstract: Components are already prominent in the implementation and deployment of advanced distributed information systems. Part and parcel of this development is an effective Component Based Development (CBD) methodology encompassing the methods, tools, and techniques that target existing component-based technology. Current CBD methodologies lack a comprehensive component-based concept structure; they still handle components mainly at the implementation and deployment phases, which are heavily influenced by UML notations. In this paper, an evaluation framework is presented that highlights the extent to which a methodology is component oriented. A sample of CBD methods was evaluated against the framework's concepts and requirements, showing that current CBD methods and approaches do not provide full support for the various component concepts. CBD method improvements are proposed based on this evaluation. The improved approach suggests the use of the standard RM-ODP as an underlying framework to provide consistent, systematic, and integrated CBD methodology engineering support throughout the lifecycle.

35 citations


Journal ArticleDOI
TL;DR: If formal methods are to be given the place they deserve within UML, a more precise description of UML must be developed, and recent attempts to provide such a description are surveyed.
Abstract: Formal methods, whereby a system is described and/or analyzed using precise mathematical techniques, are a well-established and yet under-used approach for developing software systems. One of the reasons for this is that project deadlines often impose an unsatisfactory development strategy in which code is produced on an ad-hoc basis without proper thought about the requirements and design of the software. The result is a large, often poorly documented and un-modular monolith of code, which does not lend itself to formal analysis. Because of their complexity, formal methods work best when code is well structured, e.g., when they are applied at the modeling level. UML is a modeling language that is easily learned by system developers and, more importantly, an industry standard, which supports communication between the various project stakeholders. The increased popularity of UML provides a real opportunity for formal methods to be used on a daily basis within the software lifecycle. Unfortunately, the lack of preciseness of UML means that many formal techniques cannot be applied directly. If formal methods are to be given the place they deserve within UML, a more precise description of UML must be developed. This article surveys recent attempts to provide such a description, as well as techniques for analyzing UML models formally.

32 citations


Journal ArticleDOI
TL;DR: This paper empirically explores whether data protection provided by perturbation techniques adds a so-called data mining bias to the database and finds initial support for the existence of this bias.
Abstract: Data perturbation is a data security technique that adds ‘noise’ to databases allowing individual record confidentiality. This technique allows users to ascertain key summary information about the data that is not distorted and does not lead to a security breach. Four bias types have been proposed which assess the effectiveness of such techniques. However, these biases only deal with simple aggregate concepts (averages, etc.) found in the database. To compete in today’s business environment, it is critical that organizations utilize data mining approaches to discover additional knowledge about themselves ‘hidden’ in their databases. Thus, Database Administrators are faced with competing objectives: protection of confidential data versus data disclosure for data mining applications. This paper empirically explores whether data protection provided by perturbation techniques adds a so-called Data Mining Bias to the database. The results find initial support for the existence of this bias.
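
The tension described above can be seen in a few lines of Python: zero-mean noise leaves simple aggregates essentially intact, yet a statistic a data mining algorithm might rely on can drift. The noise level, the synthetic data, and the toy "mined" statistic are arbitrary choices for illustration, not the perturbation methods or bias measures studied in the paper.

    import random
    random.seed(7)

    salaries = [random.gauss(50_000, 8_000) for _ in range(1_000)]
    ages     = [random.randint(22, 65) for _ in range(1_000)]

    def perturb(values, noise_sd=4_000):
        """Add zero-mean Gaussian noise to each record for record-level confidentiality."""
        return [v + random.gauss(0, noise_sd) for v in values]

    noisy = perturb(salaries)
    mean = lambda xs: sum(xs) / len(xs)
    print(f"mean salary before: {mean(salaries):,.0f}   after: {mean(noisy):,.0f}")

    def age_group_gap(sal, age):
        """A toy 'mined' pattern: average salary of the over-50 group versus the rest."""
        older   = [s for s, a in zip(sal, age) if a > 50]
        younger = [s for s, a in zip(sal, age) if a <= 50]
        return mean(older) - mean(younger)

    print("gap in original data :", round(age_group_gap(salaries, ages)))
    print("gap in perturbed data:", round(age_group_gap(noisy, ages)))   # can differ noticeably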

29 citations


Journal ArticleDOI
TL;DR: The purpose of this paper is to introduce an alternative quantitative approach to information security evaluation that is suitable for information resources that are potential targets of intensive professional attacks.
Abstract: Simple probability theory is not a good basis for security in the case of high-stakes information resources that are subject to attacks upon national infrastructure or to battle space illumination. Probability theory induces us to believe that one cannot totally rule out all probabilities of an intrusion. An alternative theoretical base is possibility theory. While persistent, well-supported, and highly professional intrusion attacks will have a higher probability of success, operating instead against the possibility of intrusion places defenders in a theoretical framework more suitable for high-stakes protection. The purpose of this paper is to introduce an alternative quantitative approach to information security evaluation that is suitable for information resources that are potential targets of intensive professional attacks. This approach operates from the recognition that information resource security is only an opinion of the officials responsible for security.
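
For readers unfamiliar with possibility theory, the sketch below shows its standard max-min calculus (not necessarily the paper's specific evaluation procedure): the possibility of a multi-step attack path is the minimum of its step possibilities, and the possibility of intrusion is the maximum over the paths. The attack paths and the expert-assigned degrees are invented for illustration.

    # Expert-assigned possibility degrees for each step of each hypothetical attack path.
    ATTACK_PATHS = {
        "stolen credentials": {"phish an administrator": 0.4, "bypass two-factor auth": 0.2},
        "perimeter exploit":  {"find an unpatched host": 0.6, "escalate privileges": 0.3},
        "insider":            {"recruit an insider": 0.1},
    }

    def path_possibility(steps: dict) -> float:
        """Conjunction of steps: a path is only as possible as its least possible step."""
        return min(steps.values())

    def intrusion_possibility(paths: dict) -> float:
        """Disjunction of paths: the system is as open as its most possible path."""
        return max(path_possibility(steps) for steps in paths.values())

    print(intrusion_possibility(ATTACK_PATHS))   # 0.3, driven by the perimeter-exploit path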

25 citations


Journal ArticleDOI
TL;DR: It is argued that temporal dynamics are semantic rather than structural and that the existing constructs in the E-R model are sufficient to represent them, which supports methodologies that leverage narrative and human cognitive processing capabilities in the development and verification of data models.
Abstract: Research in temporal database management has viewed temporal dynamics from a structural perspective, posing extensions to the Entity-Relationship (E-R) model to represent the state history of time-dependent attributes and relationships. We argue that temporal dynamics are semantic rather than structural and that the existing constructs in the E-R model are sufficient to represent them. Practitioners have long used E-R models without temporal extensions to design systems with rich support for temporality by modeling both things and events as entities — a practice that is consistent with the original presentation of the E-R model. This approach supports methodologies that leverage narrative and human cognitive processing capabilities in the development and verification of data models. Furthermore it maintains modeling parsimony and facilitates the representation of causality — why a particular state exists.
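
The modeling style argued for here can be shown with plain SQL and no temporal extensions: record events (here, salary changes) as ordinary entities and derive any state, current or historical, from them. The table and column names are illustrative, and the Python/SQLite wrapper is only there to keep the example self-contained.

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE employee      (emp_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE salary_change (emp_id INTEGER REFERENCES employee(emp_id),
                                    effective_date TEXT,
                                    new_salary REAL,
                                    reason TEXT);   -- the event carries the 'why' of the state
    """)
    db.execute("INSERT INTO employee VALUES (1, 'Ada')")
    db.executemany("INSERT INTO salary_change VALUES (?, ?, ?, ?)", [
        (1, "2002-01-01", 60000, "hired"),
        (1, "2003-01-01", 66000, "annual review"),
    ])

    # The salary in force on any date is simply the latest change event on or before it.
    (salary_then,) = db.execute("""
        SELECT new_salary FROM salary_change
        WHERE emp_id = 1 AND effective_date <= '2002-06-30'
        ORDER BY effective_date DESC LIMIT 1
    """).fetchone()
    print(salary_then)   # 60000.0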

21 citations


Journal ArticleDOI
TL;DR: The empirical evidence presented indicates that the organizations with higher levels of centralized IT authority are likely to implement a more centralized data warehousing approach.
Abstract: Information systems (IS) strategic planners debate which data warehouse (DW) topology is most appropriate for an organization. The primary question is whether to start DW projects with enterprise-wide data warehouses (EDWs) or with smaller-scale data marts (DMs). This article examines the relationship between modes of IT governance and DW topology to determine whether or not the implementation differences in DW topology can be explained by differences in IT governance arrangements. Three primary modes of IT governance (centralized, decentralized, and hybrid) were used to characterize how key IT activities are arranged. A replicated case study approach coupled with a research survey was used to provide a comprehensive understanding of the relationship between modes of IT governance and DW topology. Utilizing information from six organizations, the empirical evidence presented indicates that organizations with higher levels of centralized IT authority are likely to implement a more centralized data warehousing approach. Key implications for theory and practice are discussed.

Journal ArticleDOI
TL;DR: This paper has developed a model for accurate and fast data recovery from information attacks in order to reduce denial-of-service while providing consistent values of data items.
Abstract: The survivability of database systems in the case of information attacks depends exclusively on the logging mechanism. Recovery methods specifically designed for recovery from information attacks require that the log record all operations of every transaction and that the log never be purged, thus incurring enormous growth of the log. In this paper, we have developed a model for accurate and fast data recovery from information attacks in order to reduce denial of service while providing consistent values of data items. Our model divides the log based on transaction relationships and stores each segment as a separate file, which can then be accessed independently as required. We have proved that only one of these segments will be accessed during the damage assessment and recovery process. Appropriate damage assessment and recovery methods are also presented. Through simulation, we have validated that our model significantly reduces recovery time.
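
The core idea of segmenting the log by transaction relationships can be sketched in a few lines: transactions that touch a common data item end up in the same segment, so assessing the damage caused by one malicious transaction only requires reading that one segment. The union-find grouping and the toy log below are illustrative assumptions, not the paper's actual algorithm.

    from collections import defaultdict

    LOG = [                       # (transaction id, data items it read or wrote)
        ("T1", {"x", "y"}),
        ("T2", {"y", "z"}),       # related to T1 through item y
        ("T3", {"p"}),
        ("T4", {"p", "q"}),       # related to T3 through item p
    ]

    parent = {}
    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    def union(a, b):
        parent[find(a)] = find(b)

    last_toucher = {}
    for tid, items in LOG:
        for item in items:
            if item in last_toucher:        # item shared with an earlier transaction
                union(tid, last_toucher[item])
            last_toucher[item] = tid

    segments = defaultdict(list)            # each segment would be stored as its own file
    for tid, _items in LOG:
        segments[find(tid)].append(tid)

    print(dict(segments))   # two independent segments: ['T1', 'T2'] and ['T3', 'T4']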

Journal ArticleDOI
TL;DR: This paper shows how the X-TIME methodology can be used to build cooperative environments for B2B platforms involving the integration of Web data and services; X-TIME allows the creation of adaptable, semantics-oriented metamodels to facilitate the design of wrappers or reconciliators (mediators).
Abstract: This paper presents a Web-based data integration methodology and tool framework, called X-TIME, for the development of business-to-business (B2B) design environments and applications. X-TIME provides a data model translator toolkit based on an extensible metamodel and XML. It allows the creation of adaptable, semantics-oriented metamodels that facilitate the design of wrappers or reconciliators (mediators) by taking into account several characteristics of interoperable information systems, such as extensibility and composability. X-TIME defines a set of meta-types for representing meta-level semantic descriptors of data models found on the Web. The meta-types are organized in a generalization hierarchy to capture semantic similarities among the modeling concepts of interoperable systems. We show how to use the X-TIME methodology to build cooperative environments for B2B platforms involving the integration of Web data and services.
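
The role of the meta-types can be pictured with the hypothetical Python hierarchy below: concepts from different data models are mapped onto shared meta-types, and a wrapper or reconciliator can treat two concepts as semantically comparable when one meta-type specializes the other. The concrete meta-type names and the mapping table are invented for illustration and are not X-TIME's actual meta-types.

    class MetaType: ...                       # root of the generalization hierarchy
    class MetaStructure(MetaType): ...        # structured, record-like concepts
    class MetaCollection(MetaStructure): ...  # repeating or set-valued structures
    class MetaLink(MetaType): ...             # concepts that relate other concepts

    # How each source data model's concepts map onto the shared meta-types.
    CONCEPT_MAP = {
        ("relational", "table"): MetaCollection,
        ("relational", "foreign key"): MetaLink,
        ("xml", "element"): MetaStructure,
        ("xml", "IDREF"): MetaLink,
    }

    def semantically_similar(a, b) -> bool:
        """Two concepts are comparable when one meta-type specializes the other."""
        ta, tb = CONCEPT_MAP[a], CONCEPT_MAP[b]
        return issubclass(ta, tb) or issubclass(tb, ta)

    print(semantically_similar(("relational", "foreign key"), ("xml", "IDREF")))   # True
    print(semantically_similar(("relational", "table"), ("xml", "element")))       # True
    print(semantically_similar(("relational", "table"), ("xml", "IDREF")))         # False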

Journal ArticleDOI
TL;DR: The paper proposes that any implementation of integration of digital signatures with relational databases needs to consider the organizational policies and the related rules, to ensure a successful integration and implementation.
Abstract: This paper explores the nature and scope of integrating digital signatures with relational databases. Such an integration is essential if Internet commerce is to succeed. While evaluating the pros and cons of the integration and the related technologies, this paper identifies challenges, both technological and organizational. The paper proposes that any implementation needs to consider the organizational policies and the related rules. Careful consideration of these aspects will ensure a successful integration and implementation.
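
One common way to realize such an integration is to serialize a row in a canonical order, sign the bytes, and store the signature as an extra column that can be verified on retrieval. The Python sketch below illustrates this with the standard library; for self-containment an HMAC stands in for a true public-key digital signature, and the table layout and key handling are illustrative assumptions only.

    import hashlib, hmac, json, sqlite3

    SIGNING_KEY = b"demo-key"   # in practice: a private key, certificates, and an organizational key policy

    def sign_row(row: dict) -> str:
        canonical = json.dumps(row, sort_keys=True).encode()   # canonical serialization
        return hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest()

    def verify_row(row: dict, signature: str) -> bool:
        return hmac.compare_digest(sign_row(row), signature)

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE purchase_order (po_no TEXT, amount REAL, signature TEXT)")
    order = {"po_no": "PO-77", "amount": 1250.0}
    db.execute("INSERT INTO purchase_order VALUES (?, ?, ?)",
               (order["po_no"], order["amount"], sign_row(order)))

    po_no, amount, sig = db.execute("SELECT * FROM purchase_order").fetchone()
    print(verify_row({"po_no": po_no, "amount": amount}, sig))    # True: row is intact
    print(verify_row({"po_no": po_no, "amount": 9999.0}, sig))    # False: tampered value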

Journal ArticleDOI
TL;DR: This paper proposes a novel approach that combines the reactive rule-based scheme of an active database management system (ADBMS) with the technology of digital watermarking to automatically protect digital data.
Abstract: In the past decade, the business community has embraced the capabilities of the Internet for a multitude of services that involve access to data and information. Of particular concern to these businesses have been the protection and authentication of digital data as it is distributed electronically. This paper proposes a novel approach that combines the reactive rule-based scheme of an active database management system (ADBMS) with the technology of digital watermarking to automatically protect digital data. The ADBMS technology facilitates the establishment of Event-Condition-Action (ECA) rules that define the actions to be triggered by events under certain conditions. These actions consist of the generation of unique watermarks and the tagging of digital data with unique signatures. Watermarking is a technology that embeds, within the digital data's content, information identifying its owner and/or creator. The integration of these two technologies is a powerful mechanism for protecting digital data in a consistent and formal manner, with applications in e-business such as establishing and authenticating the ownership of images, audio, video, and other digital materials.
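
The coupling of ECA rules with watermark generation can be sketched as follows; the rule format, the in-memory "database", and the use of a hash-based tag as a stand-in for an embedded watermark are all simplifying assumptions rather than the proposed system's design.

    import hashlib, time

    RULES = []   # each rule: (event name, condition(record) -> bool, action(record))

    def on(event, condition, action):
        RULES.append((event, condition, action))

    def signal(event, record):
        """The 'active' part: fire every rule whose event matches and whose condition holds."""
        for ev, cond, act in RULES:
            if ev == event and cond(record):
                act(record)

    def tag_with_watermark(record):
        payload = f"{record['owner']}|{record['asset_id']}|{time.time()}".encode()
        record["watermark"] = hashlib.sha256(payload).hexdigest()   # unique per asset and time

    # ECA rule: on insert (event), if the asset is marked for distribution (condition),
    # generate and attach a watermark (action).
    on("insert", lambda r: r.get("distribute", False), tag_with_watermark)

    asset = {"asset_id": "img-001", "owner": "ACME Media", "distribute": True}
    signal("insert", asset)
    print(asset["watermark"][:16], "...")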

Journal ArticleDOI
TL;DR: This work focuses on DBagents, which are wrapped into Java agents that provide various mechanisms for security and migration and can be used to perform schema updates on distributed databases as well as to extract information in order to create and refresh data warehouses.
Abstract: Database agents, in our context also called DBagents, can be utilized to establish a federated information base by integrating heterogeneous databases. Agents are especially well suited to also address the highly relevant issue of security. During the process of migration, DBagents are wrapped into Java agents that provide various mechanisms for security and migration. This special architecture can be used to perform schema updates on distributed databases as well as to extract information in order to create and refresh data warehouses. Moreover, queries that require data which is too detailed for data warehouses can be answered by propagating them from the data warehouse to the underlying operational databases wrapped in DBagents. Furthermore, a federation connected by DBagents is much more flexible; new databases may join or existing ones may leave the federation even while queries are executed.

Journal ArticleDOI
TL;DR: A prototype device driver, RORIB (Real-time Online Remote Information Backup) is presented and an experiment is conducted comparing the performance, in terms of response time, of the prototype and several current backup strategies.
Abstract: Data plays an essential role in business today. Most, if not all, e-business applications are database driven, and data backup is a necessary element of managing data. Backup and recovery techniques have always been critical to any database, and as real-time databases are used more often, real-time online backup strategies become critical to optimize performance. In this paper, current backup methods are discussed and evaluated for response time and cost. A prototype device driver, RORIB (Real-time Online Remote Information Backup), is presented and discussed. An experiment is conducted comparing the performance, in terms of response time, of the prototype and several current backup strategies. RORIB provides an economic and efficient solution for real-time online remote backup. Significant improvement in response time is demonstrated using this prototype device driver when compared to other types of software-driven backup protocols. Another advantage of RORIB is that its cost is negligible when compared to hardware solutions for backup, such as Storage Area Networks (SANs) and Private Backup Networks (PBNs). Additionally, this multi-layered device driver uses TCP/IP (Transmission Control Protocol/Internet Protocol), which allows the driver to be a "drop-in" filter between existing hardware layers and thus reduces implementation overhead and improves portability. Linux is used as the operating system in this experiment because of its open source nature and its similarity to UNIX. This also increases the portability of this approach. The driver is transparent to both the user and the database management system. Other potential applications and future research directions for this technology are presented.
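
A heavily reduced, user-space Python sketch of the underlying idea is given below: every block written locally is also streamed over TCP to a backup endpoint. RORIB itself is a kernel-level device driver; the socket protocol, the in-process "backup server", and all names here are assumptions made only to keep the example runnable.

    import socket, tempfile, threading, time

    def backup_server(received, host="127.0.0.1", port=9099):
        """Toy stand-in for the remote backup site: collect whatever blocks arrive."""
        srv = socket.socket()
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, _ = srv.accept()
        while chunk := conn.recv(4096):
            received.append(chunk)
        conn.close()
        srv.close()

    class MirroringWriter:
        """Write-path 'filter': apply each write locally, then forward it to the backup link."""
        def __init__(self, local_path, backup_addr=("127.0.0.1", 9099)):
            self.local = open(local_path, "wb")
            self.link = socket.create_connection(backup_addr)

        def write(self, block: bytes):
            self.local.write(block)    # the local write proceeds as usual...
            self.link.sendall(block)   # ...and is mirrored to the remote site in real time

        def close(self):
            self.local.close()
            self.link.close()

    received_blocks = []
    threading.Thread(target=backup_server, args=(received_blocks,), daemon=True).start()
    time.sleep(0.2)                    # give the toy server a moment to start listening

    writer = MirroringWriter(tempfile.NamedTemporaryFile(delete=False).name)
    writer.write(b"page 1 of the database file ")
    writer.write(b"page 2 of the database file")
    writer.close()
    time.sleep(0.2)                    # let the server drain the connection
    print(b"".join(received_blocks))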

Journal ArticleDOI
TL;DR: A federated process framework and its system architecture are proposed that provide a conceptual design for effective implementation of process information sharing supporting the autonomy and agility of the organizations.
Abstract: Process information sharing is a beneficial tool through which participating organizations in a virtual enterprise can improve their customer service and business performance. However, the autonomy and agility of the organizations have placed limitations on the development of process information sharing that previous research has not satisfactorily addressed. This paper proposes a federated process framework and a system architecture that provide a conceptual design for effective implementation of process information sharing while supporting the autonomy and agility of the organizations. First, in terms of autonomy, the federated process framework supports a flexible sharing policy to control the amount of shared data, so that the framework can be applied to a wide variety of practical situations, from loosely-coupled to tightly-coupled cases. Second, in terms of agility, the system architecture based on the federated process framework supports the entire life cycle of process information sharing by allowing sufficient adaptability to changes in the business environment. We develop the framework using an object-oriented database and the Extensible Markup Language (XML) to accommodate all the constructs and their interactions within an object-oriented message exchange model.
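
The notion of a flexible sharing policy can be illustrated with the minimal sketch below, in which each policy level simply controls which parts of a locally held process instance are exposed to the federation. The field names and policy levels are assumptions made for illustration, not the framework's actual constructs.

    PROCESS_INSTANCE = {          # what the member organization holds internally
        "process_id": "PO-4711",
        "status": "in_production",
        "milestones": {"order_received": "2003-02-01", "production_started": "2003-02-03"},
        "activities": [
            {"name": "cut parts", "resource": "line 3", "cost": 410.0},
            {"name": "assemble", "resource": "line 5", "cost": 980.0},
        ],
    }

    SHARING_POLICIES = {          # loosely coupled partners see less; tightly coupled see more
        "loose":  ["process_id", "status"],
        "medium": ["process_id", "status", "milestones"],
        "tight":  ["process_id", "status", "milestones", "activities"],
    }

    def shared_view(instance: dict, policy: str) -> dict:
        """Project the local process instance onto the fields the chosen policy allows."""
        return {field: instance[field] for field in SHARING_POLICIES[policy]}

    print(shared_view(PROCESS_INSTANCE, "loose"))    # only id and status leave the organization
    print(shared_view(PROCESS_INSTANCE, "medium"))   # milestones are shared as well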