
Showing papers in "Journal of Digital Information Management in 2007"


Journal Article
TL;DR: A study of the effects of various mobility models on the performance of two routing protocols, Dynamic Source Routing (DSR, a reactive protocol) and Destination-Sequenced Distance-Vector (DSDV, a proactive protocol), shows that the performance of the routing protocols varies across different mobility models, node densities and lengths of data paths.
Abstract: A Mobile Ad-Hoc Network (MANET) is a self-configuring network of mobile nodes connected by wireless links to form an arbitrary topology without the use of existing infrastructure. In this paper, we have studied the effects of various mobility models on the performance of two routing protocols: Dynamic Source Routing (DSR, a reactive protocol) and Destination-Sequenced Distance-Vector (DSDV, a proactive protocol). For experimental purposes, we have considered four mobility scenarios: Random Waypoint, Group Mobility, Freeway and Manhattan models. These four mobility models were selected to represent likely practical applications in the future. Performance comparison has also been conducted across varying node densities and numbers of hops. Experimental results illustrate that the performance of the routing protocols varies across different mobility models, node densities and lengths of data paths.
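For context on the first of the four scenarios, below is a minimal sketch of the Random Waypoint model: a node repeatedly picks a random destination and speed, travels there in a straight line, and pauses. The area size, speed range and pause time are illustrative, not the simulation settings used in the paper.

```python
import random

def random_waypoint(steps, area=(1000.0, 1000.0),
                    speed_range=(1.0, 20.0), pause=2.0, dt=1.0):
    """Trace of one node's position under the Random Waypoint mobility model."""
    x, y = random.uniform(0, area[0]), random.uniform(0, area[1])
    tx, ty = random.uniform(0, area[0]), random.uniform(0, area[1])
    speed = random.uniform(*speed_range)
    pausing = 0.0
    trace = []
    for _ in range(steps):
        if pausing > 0.0:
            pausing -= dt                          # wait at the reached waypoint
        else:
            dx, dy = tx - x, ty - y
            dist = (dx * dx + dy * dy) ** 0.5
            if dist <= speed * dt:                 # waypoint reached this step
                x, y = tx, ty
                pausing = pause
                tx, ty = random.uniform(0, area[0]), random.uniform(0, area[1])
                speed = random.uniform(*speed_range)
            else:                                  # keep moving toward the waypoint
                x += dx / dist * speed * dt
                y += dy / dist * speed * dt
        trace.append((x, y))
    return trace

positions = random_waypoint(steps=300)
```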

177 citations


Journal Article
TL;DR: This work benchmarks five alternative key phrase extraction methods, TFIDF, KEA, Keyword, Keyterm, and Mixture, in an automatic Web site summarization framework and demonstrates that Keyterm is the best choice for key phrase extraction while Mixture should be used to obtain key sentences.
Abstract: Web Site Summarization is the process of automatically generating a concise and informative summary for a given Web site. It has gained more and more attention in recent years as effective summarization could lead to enhanced Web information retrieval systems such as searching for Web sites. Extraction-based approaches to Web site summarization rely on the extraction of the most significant sentences from the target Web site based on the density of a list of key phrases that best describe the entire Web site. In this work, we benchmark five alternative key phrase extraction methods, TFIDF, KEA, Keyword, Keyterm, and Mixture, in an automatic Web site summarization framework we previously developed. We investigate the performance of these underlying methods via a formal user study and demonstrate that Keyterm is the best choice for key phrase extraction while Mixture should be used to obtain key sentences. We also discuss why one method performs better than another and what could be done to further improve the summarization system.
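As a concrete illustration of the simplest of the five benchmarked methods, here is a hedged sketch of TF-IDF key phrase scoring: term frequency on the target site weighted by rarity in a background corpus. The function and parameter names are illustrative; the framework's actual candidate extraction and the other four methods are not shown.

```python
import math
from collections import Counter

def tfidf_keyphrases(site_tokens, background_df, n_background_docs, top_k=10):
    """Score single-term key phrase candidates for one Web site by TF-IDF.

    site_tokens       : all tokens extracted from the target Web site
    background_df     : term -> number of reference-corpus documents containing it
    n_background_docs : size of that reference corpus
    """
    tf = Counter(site_tokens)
    scores = {term: count * math.log(n_background_docs / (1 + background_df.get(term, 0)))
              for term, count in tf.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

site = "web site summarization extracts significant sentences from the web site".split()
print(tfidf_keyphrases(site, background_df={"the": 950, "from": 900},
                       n_background_docs=1000, top_k=3))
```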

30 citations


Journal Article
TL;DR: A novel measure to evaluate the information content of a descriptor in terms of variance is introduced to reveal redundancies between state-of-the-art audio features and MPEG-7 audio descriptors.
Abstract: In this paper we perform statistical data analysis of a broad set of state-of-the-art audio features and low-level MPEG-7 audio descriptors. The investigation comprises data analysis to reveal redundancies between state-of-the-art audio features and MPEG-7 audio descriptors. We introduce a novel measure to evaluate the information content of a descriptor in terms of variance. Statistical data analysis reveals the amount of variance contained in a feature. It enables identification of independent and redundant features. This approach assists in efficient selection of orthogonal features for content-based retrieval. We believe that a good feature should provide descriptions with high variance for the underlying data. Combinations of features should consist of decorrelated features in order to increase expressiveness of the descriptions. Although MPEG-7 is a popular and widely used standard for multimedia description, only a few investigations exist that address the data quality of low-level MPEG-7 descriptions.
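A minimal sketch of this kind of analysis: per-feature variance as a crude proxy for information content, and pairwise correlation to flag redundant descriptor pairs. This is a generic illustration, not the novel variance measure the paper introduces, and the threshold is arbitrary.

```python
import numpy as np

def feature_report(X, names, corr_threshold=0.9):
    """Per-feature variance and highly correlated (redundant) feature pairs.

    X     : (n_samples, n_features) matrix of extracted audio descriptors
    names : one name per feature column
    """
    variances = dict(zip(names, X.var(axis=0)))
    corr = np.corrcoef(X, rowvar=False)
    redundant = [(names[i], names[j], float(corr[i, j]))
                 for i in range(len(names))
                 for j in range(i + 1, len(names))
                 if abs(corr[i, j]) >= corr_threshold]
    return variances, redundant

X = np.random.default_rng(0).standard_normal((200, 3))
X[:, 2] = X[:, 1] * 0.98          # make the third feature a scaled copy of the second
print(feature_report(X, ["centroid", "flux", "flux_copy"]))
```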

26 citations


Journal Article
TL;DR: In the application domain of authoring of digital photo albums, this paper shows where and how semantics emerge in the authoring process and how these contribute to an even richer media content pool, both for single media and also for the composition and its later use.
Abstract: Authoring of personalized multimedia content can be considered as a process consisting of selecting, composing, and assembling media elements into coherent multimedia presentations that meet the user’s or user group’s preferences, interests, current situation, and environment. In the approaches we find today, media items and semantically rich metadata information are used for the selection and composition task. However, most valuable semantics for the media elements and the resulting multimedia content that emerge with and in the authoring process are not considered any further. This means that the effort for semantically enriching media content comes to a sudden halt in the created multimedia document – which is very unfortunate. In this paper, we propose with the SemanticMM4U framework an integrated approach for deriving and exploiting multimedia semantics that arise with and from the creation of personalized multimedia content, and for making them available for further use and applications. In this approach, we consider not only the metadata that semantically evolve from the media elements of the newly created presentation; the actual usage of media elements during authoring can also give rise to new semantics for the individual media elements employed. In the application domain of authoring of digital photo albums we show where and how semantics emerge in the authoring process and how these contribute to an even richer media content pool, both for single media and also for the composition and its later use.

20 citations


Journal Article
TL;DR: The Infocious Web search engine, as discussed by the authors, improves the way people find information on the Web by resolving ambiguities present in natural language text, which is achieved by performing linguistic analysis on the content of the Web pages it indexes.
Abstract: In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities present in natural language text. This is achieved by performing linguistic analysis on the content of the Web pages we index, which is a departure from existing Web search engines that return results mainly based on keyword matching. This additional step of linguistic processing gives Infocious two main advantages. First, Infocious gains a deeper understanding of the content of Web pages so it can better match users' queries with indexed documents and therefore can improve relevancy of the returned results. Second, based on its linguistic processing, Infocious can organize and present the results to the user in more intuitive ways. In this paper we present the linguistic processing technologies that we incorporated in Infocious and how they are applied in helping users find information on the Web more efficiently. We discuss the various components in the architecture of Infocious and how each of them benefits from the added linguistic processing. Finally, we experimentally evaluate the performance of a component which leverages linguistic information in order to categorize Web pages.

16 citations



Journal Article
TL;DR: Results show that BBN was more effective than the other two methods in risk prediction, and that, compared with metrics such as lines of code and cyclomatic complexity, Halstead program difficulty, number of executable statements and Halstead program volume are more effective metrics as risk predictors.
Abstract: Software systems related to national projects are always crucial. Systems of this kind involve high-tech factors and require large budgets, so the quality and reliability of the software deserve further study. Hence, we propose to apply three classification techniques among the most used in data mining: Bayesian belief networks (BBN), nearest neighbor (NN) and decision tree (DT), to validate the usefulness of software metrics for risk prediction. Results show that, compared with metrics such as lines of code (LOC) and cyclomatic complexity (V(G)) which are traditionally used for risk prediction, Halstead program difficulty (D), number of executable statements (EXEC) and Halstead program volume (V) are more effective metrics as risk predictors. The analysis also showed that BBN was more effective than the other two methods in risk prediction.
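A hedged sketch of how such a validation might be set up with two of the three classifiers (decision tree and nearest neighbor; a Bayesian belief network would need an additional library). The metric values and risk labels below are illustrative placeholders, not the paper's dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Columns: LOC, V(G), Halstead D, EXEC, Halstead V  (illustrative values only)
X = np.array([
    [120, 10, 15.2,  80,  900.0],
    [ 40,  3,  4.1,  25,  210.0],
    [310, 25, 28.7, 200, 2600.0],
    [ 75,  6,  7.9,  50,  480.0],
])
y = np.array([1, 0, 1, 0])   # 1 = risky module, 0 = not risky (hypothetical labels)

for name, clf in [("decision tree", DecisionTreeClassifier(max_depth=3)),
                  ("nearest neighbor", KNeighborsClassifier(n_neighbors=1))]:
    scores = cross_val_score(clf, X, y, cv=2)
    print(name, scores.mean())
```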

13 citations


Journal Article
TL;DR: A method is proposed, based on data mining algorithms, which infers the "normal behavior" of objects by extracting frequent rules, expressed as quasi-functional dependencies, from a given dataset using association rules.
Abstract: Anomaly detection problems have been investigated in several research areas such as database, machine learning, knowledge discovery, and logic programming, with the main goal of identifying objects of a given population whose behavior is anomalous with respect to a set of commonly accepted rules that are part of the knowledge base. In this paper we focus our attention on the analysis of anomaly detection in databases. We propose a method, based on data mining algorithms, which allows one to infer the "normal behavior" of objects, by extracting frequent "rules" from a given dataset. These rules are described in the form of quasi-functional dependencies and mined from the dataset by using association rules. Our approach allows us to consequently analyze anomalies with respect to the previously inferred dependencies: given a quasi-functional dependency, it is possible to discover the related anomalies by querying either the original database or the association rules previously stored. By further investigating the nature of such anomalies, we can either derive the presence of erroneous data or highlight novel information which represents significant exceptions of frequent rules. Our method is independent of the considered database and directly infers rules from the data. The applicability of the proposed approach is validated through a set of experiments on XML databases, whose results are here reported.
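To make the idea concrete, the sketch below mines a quasi-functional dependency lhs -> rhs (a rule that holds for most, but not all, tuples) and returns the violating tuples as candidate anomalies. The support and confidence thresholds, and the example data, are illustrative and not taken from the paper.

```python
from collections import Counter, defaultdict

def quasi_fd_anomalies(rows, lhs, rhs, support=3, confidence=0.9):
    """Mine quasi-functional dependencies lhs -> rhs and report violating rows.

    A dependency is kept if the most frequent rhs value covers at least
    `confidence` of the rows sharing the same lhs value, with at least
    `support` such rows; the remaining rows are returned as anomalies.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[lhs]].append(row)
    anomalies = []
    for key, grp in groups.items():
        counts = Counter(r[rhs] for r in grp)
        value, freq = counts.most_common(1)[0]
        if freq >= support and freq / len(grp) >= confidence:
            anomalies += [r for r in grp if r[rhs] != value]
    return anomalies

data = [{"zip": "10115", "city": "Berlin"}] * 9 + [{"zip": "10115", "city": "Bern"}]
print(quasi_fd_anomalies(data, "zip", "city"))   # the single "Bern" row is flagged
```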

10 citations


Journal Article
TL;DR: This paper surveys most of the approaches used to map and schedule applications onto Network-on-Chip architectures, together with their different goals, and describes an approach based on a genetic algorithm, its implementation, and its design objectives.
Abstract: In order to improve the performance of current embedded systems, Network-on-Chip (NoC) offers many advantages, especially in terms of flexibility and low cost. Applications require more and more intensive computations, especially multimedia applications such as video encoding. Developing products and applications using such an architecture offers many challenges and opportunities. Many tools will be required to develop a NoC architecture for a specific application. A tool which can map and schedule an application or a set of applications to a given NoC architecture will be essential and must be able to satisfy many competing trade-offs (real-time, performance, low power consumption, time to market, re-usability, cost, area, etc.). In this paper, we survey most of the approaches used to solve this problem with different goals. We then describe our approach based on a genetic algorithm, its implementation and its design objective.

10 citations


Journal Article
TL;DR: This paper describes the control of trains in a sector of a moving block interlocking system using the approach of promotion, the Z technique used to link a local state with the global state.
Abstract: A railway interlocking system is a safety-critical system. Its failure can cause the loss of human life, severe injuries and loss of money. The complexity of this type of system therefore requires advanced methodologies which provide complete security and quality. One way of achieving this goal is by using formal methods, which are mathematically based languages, techniques and tools used for specifying and verifying such systems. This paper describes the control of trains in a sector of a moving block interlocking system using the approach of promotion. Promotion is the technique used in Z specifications to link a local state with a global state. The control comprises three components, i.e. sector, trains and security of a train in a sector.

9 citations


Journal Article
TL;DR: A general statistical framework is developed based on a model of how the individual rankers depend on the ground truth ranker, and how noise level and the misinformation of the rankers affect the performance of the aggregate ranker.
Abstract: The rank aggregation problem has been studied extensively in recent years with a focus on how to combine several different rankers to obtain a consensus aggregate ranker. We study the rank aggregation problem from a different perspective: how the individual input rankers impact the performance of the aggregate ranker. We develop a general statistical framework based on a model of how the individual rankers depend on the ground truth ranker. Within this framework, one can generate synthetic data sets and study the performance of different aggregation methods. The individual rankers, which are the inputs to the rank aggregation algorithm, are statistical perturbations of the ground truth ranker. With rigorous experimental evaluation, we study how noise level and the misinformation of the rankers affect the performance of the aggregate ranker. We introduce and study a novel
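The experimental setup described above can be mimicked in a few lines: perturb a ground truth ranking to obtain noisy input rankers, aggregate them (Borda count is used here purely as a stand-in aggregation method), and measure the distance of the aggregate from the truth. None of the constants or methods below are the paper's; they only illustrate the perturb-aggregate-evaluate loop.

```python
import random

def perturb(ranking, swaps):
    """Noisy ranker: random adjacent transpositions of the ground truth."""
    r = list(ranking)
    for _ in range(swaps):
        i = random.randrange(len(r) - 1)
        r[i], r[i + 1] = r[i + 1], r[i]
    return r

def borda(rankers):
    """Aggregate rankers by summing positions (lower total = better)."""
    scores = {item: 0 for item in rankers[0]}
    for r in rankers:
        for pos, item in enumerate(r):
            scores[item] += pos
    return sorted(scores, key=scores.get)

def kendall_distance(a, b):
    """Number of discordant pairs between two rankings of the same items."""
    pos = {item: i for i, item in enumerate(b)}
    return sum(1 for i in range(len(a)) for j in range(i + 1, len(a))
               if pos[a[i]] > pos[a[j]])

truth = list(range(20))
rankers = [perturb(truth, swaps=15) for _ in range(7)]
aggregate = borda(rankers)
print(kendall_distance(aggregate, truth))
```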

Journal Article
TL;DR: This paper presents a framework for achieving context-based multimedia semantic annotation and organisation towards personalised learning, by organising existing multimedia resources based on contexts towards providing a truly personalised eLearning experience.
Abstract: Recent developments of the Internet and the World Wide Web (WWW) have resulted in the proliferation of multimedia resources, and learning with existing multimedia web resources on the Web is becoming more prevalent and important. While recent standardization efforts in eLearning such as LOM, SCORM, and IMS Learning Design work towards learning content description, packaging, and delivery, existing eLearning solutions still lack the ability to adequately use multimedia resources to provide a learner with personalised learning resources. Effective use of multimedia for web-based learning provides a quality interactive learning experience, but current techniques do not adequately provide a semantic approach for organising multimedia resources. With the evolving trend of learning through the use of web technology, eLearning systems are expected to provide personalised learning resources for effective learning. We have accordingly proposed a way of organising existing multimedia resources based on contexts towards providing a truly personalised eLearning experience. This paper presents a framework for achieving context-based multimedia semantic annotation and organisation towards personalised learning. Categories and Subject Descriptors: H.5.1 (Multimedia Information Systems); Audio Input/Output; K.3 (Computers and Education)

Journal Article
TL;DR: An implementation of the inverse function of the Advanced Encryption Standard using the field of prime numbers instead of the Galois Field originally proposed by Rijndael is proposed, showing that the former approach is simpler and requires less execution time and implementation circuitry compared to the latter.
Abstract: This paper proposes an implementation of the inverse function of the Advanced Encryption Standard using the field of prime numbers instead of the Galois Field originally proposed by Rijndael. The paper will show that the former approach is simpler and requires less execution time and implementation circuitry compared to the latter. The authors analyzed several implementations of the inverse function for the S-Box using various approaches in search of an optimal one. In particular, simulation was used to analyze the performance of algorithms for computing the inverse function based on: arithmetic modulo a power of two; arithmetic modulo a power of two plus one; and arithmetic modulo a prime number. The simulation revealed that the modulo-a-prime-number approach has the best performance. Furthermore, the analysis revealed that using this approach may enhance security relative to the original approach. The proposed implementation will provide a better alternative that can be embedded in many systems. Categories and Subject Descriptors: E.3 (Data Encryption); F.2 (Analysis of Algorithms and Problem Complexity); G.1 (Numerical Analysis); Computer arithmetic
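The comparison at the heart of the paper, inversion modulo a prime versus inversion in the Rijndael Galois field GF(2^8), can be sketched as follows. This is a generic textbook construction of both operations, not the authors' optimized implementations; 257 is used here only as an example prime.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the Rijndael reduction polynomial

def inverse_mod_prime(x, p=257):
    """Multiplicative inverse modulo a prime via Fermat's little theorem."""
    return pow(x, p - 2, p) if x % p else 0

def gf_mul(a, b):
    """Multiplication in GF(2^8) with the Rijndael polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
        b >>= 1
    return result

def gf_inverse(x):
    """Inverse in GF(2^8): x^254, computed by repeated multiplication."""
    if x == 0:
        return 0
    y = x
    for _ in range(253):        # x^2, x^3, ..., x^254
        y = gf_mul(y, x)
    return y

assert gf_mul(gf_inverse(0x53), 0x53) == 1
assert (inverse_mod_prime(83) * 83) % 257 == 1
```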

Journal Article
TL;DR: A novel approach to clustering using a simple accuracy-based Learning Classifier System is presented, in which a modification to the original YCS fitness function has been found to improve the identification of less-separated data sets.
Abstract: This paper presents a novel approach to clustering using a simple accuracy-based Learning Classifier System; a modification to the original YCS fitness function has been found to improve the identification of less-separated data sets. Our approach achieves this by exploiting the evolutionary computing and reinforcement learning techniques inherent to such systems. The purpose of the work is to develop an approach to learning rules which accurately describe clusters without prior assumptions as to their number...

Journal Article
TL;DR: A knowledge-based approach that employs category theoretic models to formalize and mechanize object-oriented software design and synthesis by focusing concern on reasoning about the interdependency relationships at different levels of abstraction and granularity is proposed.
Abstract: To reuse previous knowledge of object-oriented design and adapt it to solve new problems, the collaboration relationships and the responsibility distribution among software objects need to be thoroughly understood and precisely formulated. The paper proposes a knowledge-based approach that employs category theoretic models to formalize and mechanize object-oriented software design and synthesis by focusing concern on reasoning about the interdependency relationships at different levels of abstraction and granularity. The major benefit of our approach is twofold: first, it provides an explicit semantics for formal object-oriented specifications, and therefore enables a high level of reusability and dynamic adaptability. Second, it utilizes the ability of categorical computations to support automated software composition and refinement. A prototype tool that demonstrates the feasibility and effectiveness of our approach is also presented.

Journal Article
TL;DR: This work proposes using a labeling scheme to encode each element in the XML database by its positional information, and proposes the TwigINLAB algorithm to optimize the query processing.
Abstract: With the popularity of XML as data exchange over the Web, querying XML data has become an important issue to be addressed. Since the logical structure of XML is a tree, establishing a parent-child (P-C), ancestor-descendant (A-D) or sibling relationship between nodes is essential for structural query processing. Thus, we propose using a labeling scheme to encode each element in the XML database by its positional information. Based on this labeling scheme, we further propose our TwigINLAB algorithm to optimize the query processing. Experimental results indicate that TwigINLAB can process both path queries and twig queries better than the TwigStack algorithm by an average of 27% and 14% respectively in terms of execution time using the XMARK benchmark dataset.
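For readers unfamiliar with positional labeling, the sketch below assigns a (start, end, level) region label to each element in one depth-first pass; ancestor-descendant and parent-child relationships can then be decided by interval containment alone, without walking the tree. This is the general region-encoding idea such schemes build on, not TwigINLAB's actual label format or matching algorithm.

```python
def label_tree(node, level=0, start=1, path="/", labels=None):
    """Depth-first numbering: every element gets a (start, end, level) label.

    `node` is a dict with an optional "children" list of (tag, child) pairs.
    """
    if labels is None:
        labels = {}
    nxt = start + 1
    for i, (tag, child) in enumerate(node.get("children", [])):
        _, nxt = label_tree(child, level + 1, nxt, f"{path}{tag}[{i}]/", labels)
    labels[path] = (start, nxt, level)
    return labels, nxt + 1

def is_ancestor(a, d):
    """A-D relationship: the ancestor's interval strictly contains the descendant's."""
    return a[0] < d[0] and d[1] < a[1]

def is_parent(a, d):
    """P-C relationship: an ancestor whose level is exactly one less."""
    return is_ancestor(a, d) and d[2] == a[2] + 1

doc = {"children": [("book", {"children": [("title", {}), ("author", {})]}),
                    ("book", {"children": [("title", {})]})]}
labels, _ = label_tree(doc)
print(is_parent(labels["/"], labels["/book[0]/"]))              # True
print(is_ancestor(labels["/"], labels["/book[0]/title[0]/"]))   # True
```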

Journal Article
TL;DR: A new taxonomy for classification of broadcast digital archives based on a novel theoretical approach is introduced and an unambiguous representation of multimedia informative content from the relevant points of view to the broadcasters community is presented.
Abstract: Multimedia content classification and retrieval are indispensable tools in the current convergence of audiovisual entertainment and information media. Thanks to the development of broadband networks, every consumer will have digital video programmes available on-line as well as through the traditional distribution channels. In this scenario, since the early ‘90s, the most important TV broadcasters in Europe have started projects whose aim was to achieve preservation, restoration and automatic documentation of their audiovisual archives. In particular, the association of low-level multimedia features to knowledge and semantics for the purpose of automatic classification of multimedia archives is currently the target of many researchers in both academic and IT industrial communities. This paper describes our research direction, which is focusing on three points: (a) We first introduce a new taxonomy for classification of broadcast digital archives based on a novel theoretical approach. The advantage of this taxonomy is that it can provide an unambiguous representation of multimedia informative content from the relevant points of view to the broadcasters community. (b) We secondly present a multilayer multimedia database model to represent both structure and content of multimedia objects. (c) We further propose a framework architecture for building a Multimedia Fuzzy Annotation System (MFAS), and a description of our experimental plan.

Journal Article
TL;DR: Correlation results indicate that the proposed VCA algorithm is feasible in detecting low quality as well as non-fingerprint images, and it has been compared with NIST fingerprint image quality results.
Abstract: Clarity of the fingerprint image structure is crucial for many fingerprint applications, and the performance of the systems built on them relies on the validity and quality of the captured images. A validity check eliminates invalid images before the fingerprint metadata enrolment and processing cycle starts; the overall benchmarking accuracy is therefore not affected, because invalid images are rejected before entering the system cycle. In this paper we propose a validity check algorithm (VCA). The VCA is based on a statistical weight calculation over the basic image elements, because the image element (pixel) describes an image object through its contrast, brightness, clarity and noise attributes. Our algorithm relies on fingerprint object segmentation, background subtraction, total image thresholding and pixel weight calculation. The VTC2000DB1_B and TIMA databases were used to evaluate the VCA, and it has been compared with NIST fingerprint image quality results. Correlation results indicate that the proposed algorithm is feasible in detecting low quality as well as non-fingerprint images.
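A hedged sketch of what such a validity check can look like: global thresholding separates foreground ridges from background, and simple statistics on the result decide whether the capture plausibly contains a fingerprint. The thresholds are illustrative guesses, not the calibrated values or the exact pixel-weight statistic used by the VCA.

```python
import numpy as np

def validity_check(image, fg_range=(0.15, 0.75), min_contrast=20.0):
    """Return True if `image` (2-D uint8 array) plausibly contains a fingerprint."""
    img = image.astype(float)
    threshold = img.mean()            # total image thresholding
    foreground = img < threshold      # ridges are darker than the background
    fg_ratio = float(foreground.mean())
    contrast = float(img.std())       # crude stand-in for the pixel weight statistic
    return fg_range[0] <= fg_ratio <= fg_range[1] and contrast >= min_contrast

blank = np.full((300, 300), 230, dtype=np.uint8)    # saturated/blank capture
print(validity_check(blank))                         # False: no ridge structure
```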

Journal Article
TL;DR: This system can be effectively used in application areas like e-governance, agriculture, rural health, education, national resource planning, disaster management, information kiosks etc where people from all walks of life are involved.
Abstract: The goal of this work was to develop a query processing system using software agents. The Open Agent Architecture framework is used for system development. The system supports queries in both Hindi and Malayalam, two prominent regional languages of India. Natural language processing techniques are used for meaning extraction from the plain query, and information from the database is given back to the user in their native language. The system architecture is designed in a structured way so that it can be adapted to other regional languages of India. This system can be effectively used in application areas like e-governance, agriculture, rural health, education, national resource planning, disaster management, information kiosks etc., where people from all walks of life are involved.

Journal Article
TL;DR: The decomposition-matching-merging approach is adopted and INLAB, a novel hybrid query processing approach merging both indexing and labeling technologies, is proposed, which can process XML path queries up to an order of magnitude faster than the conventional top-down approach.
Abstract: Due to its flexibility and efficiency in the transmission of data, XML has become the emerging standard for data transfer and exchange across the Internet. In native XML databases, XML documents are usually modeled as trees, and XML queries are typically specified as path expressions. The primitive structural relationships in a path expression are parent-child and ancestor-descendant. Thus, finding all occurrences of these relationships is crucial. We adopt the decomposition-matching-merging approach and propose INLAB, a novel hybrid query processing approach merging both indexing and labeling technologies. Experimental results show that INLAB can process XML path queries up to an order of magnitude faster than the conventional top-down approach.

Journal Article
TL;DR: This paper proposes to add a conceptual layer on top of MPEG-7 metadata layer, where the domain knowledge is represented using a formal language and serves as a bridge between the two layers.
Abstract: This paper describes an approach for semantic description and retrieval of multimedia data described by means of MPEG-7. This standard uses XML Schema to define the descriptions. Therefore, it lacks the ability to represent the data semantics in a formal and concise way, and it does not allow integration and use of domain-specific knowledge. Moreover, inference mechanisms are not provided and hence the extraction of implicit information is not (always) possible. To address these issues, we propose to add a conceptual layer on top of the MPEG-7 metadata layer, where the domain knowledge is represented using a formal language. A set of mapping rules is proposed; they serve as a bridge between the two layers. Querying MPEG-7 descriptions using XML query languages such as XPath or XQuery requires knowledge of the MPEG-7 syntax and document structure. To provide flexible query formulation, we exploit the conceptual layer vocabulary to express user queries. A user query, making reference to terms specified at the conceptual level, is rewritten into an XQuery expression over the MPEG-7 descriptions.
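The mapping-rule idea can be sketched as a simple table that binds conceptual terms to path expressions over the MPEG-7 descriptions, so that a query phrased with domain concepts can be rewritten mechanically. Both the concept names and the MPEG-7 paths below are invented for illustration and do not reflect the paper's rule language or actual descriptor paths.

```python
# Hypothetical mapping rules: conceptual-layer term -> path over MPEG-7 descriptions.
MAPPING_RULES = {
    "Goal":   "//VideoSegment[TextAnnotation/FreeTextAnnotation = 'goal']",
    "Player": "//Agent/Name",
}

def rewrite(concept_query):
    """Rewrite a comma-separated list of conceptual terms into path expressions."""
    concepts = [c.strip() for c in concept_query.split(",")]
    unknown = [c for c in concepts if c not in MAPPING_RULES]
    if unknown:
        raise ValueError(f"no mapping rule for: {unknown}")
    return [MAPPING_RULES[c] for c in concepts]

print(rewrite("Goal, Player"))
```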

Journal Article
TL;DR: This work analyzes the performance of a turbo-coded OFDM system; the results are simulated in a Rayleigh fading channel and evaluated in terms of bit error rate.
Abstract: data transmission, which requires robust and spectrally efficient communication techniques. The basic idea behind adaptive transmission is to improve spectral efficiency by varying the transmission power level, symbol transmission rate, constellation size, and coding rate scheme. In this paper adaptive modulation and turbo coding are applied to OFDM, which can provide a lower bit error rate than adaptive OFDM [9]. This work analyzed the performance of a turbo-coded OFDM system and the results are simulated in a Rayleigh fading channel. The performance is evaluated in terms of bit error rate.
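As a hedged illustration of the kind of bit-error-rate evaluation described above, the snippet below runs a Monte-Carlo simulation of an uncoded QPSK-OFDM link over a flat Rayleigh fading channel. The turbo coding and adaptive modulation that are the paper's actual subject are deliberately omitted; only the simulate-and-count-errors loop is shown, with arbitrary parameter choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SUB, N_SYM, SNR_DB = 64, 2000, 15
snr = 10 ** (SNR_DB / 10)

bits = rng.integers(0, 2, size=(N_SYM, N_SUB, 2))
symbols = ((2 * bits[..., 0] - 1) + 1j * (2 * bits[..., 1] - 1)) / np.sqrt(2)  # QPSK

tx = np.fft.ifft(symbols, axis=1) * np.sqrt(N_SUB)                  # OFDM modulation
h = (rng.standard_normal((N_SYM, 1)) + 1j * rng.standard_normal((N_SYM, 1))) / np.sqrt(2)
noise = (rng.standard_normal(tx.shape) + 1j * rng.standard_normal(tx.shape)) / np.sqrt(2 * snr)
rx = h * tx + noise                                                  # flat Rayleigh fading + AWGN

eq = np.fft.fft(rx, axis=1) / np.sqrt(N_SUB) / h                     # FFT + zero-forcing equalizer
rx_bits = np.stack([(eq.real > 0).astype(int), (eq.imag > 0).astype(int)], axis=-1)
print("BER:", np.mean(rx_bits != bits))
```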

Journal Article
TL;DR: The proposed algorithm is based on the concept of re-coloring the palette colors of 256-color images using the properties of the HSL color space, and segments true type images by taking a part of the image, generating a palette for it, and then applying the method to segment the selected region.
Abstract: This paper describes a palette-based colored image segmentation system. We analyze the properties of the Hue, Saturation, and Luminance (HSL) color space with emphasis on the visual perception of the variation in hue, saturation and luminance values of palette colors. The proposed algorithm is based on the concept of re-coloring the palette colors of 256-color images using the properties of the HSL color space, and on segmenting true type images by taking a part of the image, generating a palette for it and then applying the method to segment the selected region. After re-coloring of the palette colors, the colors in the image are re-colored according to the new palette, and as a result the image is segmented into different segments.
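A hedged sketch of the palette re-coloring step using Python's standard colorsys module: each palette entry of a 256-color image is converted to HSL and quantized into coarse hue/lightness/saturation bins, so that pixels whose palette entries share a bin fall into the same segment. The bin counts are illustrative, not the perceptual rules derived in the paper.

```python
import colorsys

def recolor_palette(palette, hue_bins=12, light_bins=4, sat_bins=3):
    """Map each (r, g, b) palette entry (0-255 components) to a quantized HSL bin id."""
    segments = []
    for r, g, b in palette:
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        segments.append((int(h * hue_bins) % hue_bins,
                         int(l * light_bins),
                         int(s * sat_bins)))
    return segments

def segment_image(pixels, palette):
    """Replace each palette index in `pixels` by the segment id of its color."""
    seg_of_entry = recolor_palette(palette)
    return [[seg_of_entry[idx] for idx in row] for row in pixels]

palette = [(200, 30, 30), (210, 40, 25), (20, 20, 180)]   # two reds, one blue
pixels = [[0, 1, 2], [2, 2, 0]]
print(segment_image(pixels, palette))   # the two reds collapse into one segment
```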


Journal Article
TL;DR: Unsupervised learning is an important supplementary method for categorizing data, since it can increase the precision of clustering results, and it is widely used on unlabeled data, for example to extract relevance relationships that exist in records.
Abstract: From a traditional point of view, knowledge exploration can be categorized into supervised learning and unsupervised learning (Jordan and Jacobs 1994). In the last decade, there have been research activities on supervised learning approaches and techniques, whereby class information is available before any knowledge exploration takes place. The most utilized approach is to obtain a predetermined independent measurement in order to preferentially target classes; a classification algorithm is then applied in the data pre-processing stage (Liu and Motoda 1998, Liu and Yu 2005). However, this approach is not robust enough to be applied effectively to features with irregular sizes or to nonrecurring, high-dimensional variables. Unsupervised learning is a more recent approach in knowledge exploration. It is widely used on unlabeled data, for example to extract relevance relationships that exist in records. Unsupervised learning is an important supplementary method for categorizing data, since it can increase the precision of clustering results. Unlike supervised learning, unsupervised learning attempts

Journal Article
TL;DR: This paper proposes a simple and efficient method for a layered MIMO-OFDM system with channel equalization, using a decision feedback equalizer (DFE) with the recursive least squares (RLS) algorithm for channel equalization.
Abstract: This paper proposes a simple and efficient method for a layered MIMO-OFDM system with channel equalization. Temporal variations in the channel are due to Doppler spread, a sign of relative motion between transmitter and receiver. Results are simulated in both a Rayleigh channel and an additive white Gaussian noise (AWGN) channel. A decision feedback equalizer (DFE) with the recursive least squares (RLS) algorithm is used for channel equalization. Different types of layered structures are applied to the MIMO-OFDM system and their performance is evaluated. Different modulation schemes are used with the same number of transmit and receive antennas. Simulation results are shown for both vertically and horizontally coded layered-structure MIMO-OFDM systems with different modulation schemes and different numbers of transmit antennas.


Journal Article
TL;DR: This work considers Byzantine faults, which encompass most common sensor node faults, and shows by simulation that the proposed strategy works well for two major classes of collaborative sensor network applications.
Abstract: In-network data aggregation is usually warranted for distributed wireless sensor networks, owing to reliability and energy-efficiency reasons. Sensor nodes are usually deployed in unattended and unsafe environments and hence are vulnerable to intentional or unintentional damage. Individual nodes are prone to different types of faults, such as hardware faults and crash faults, and to other security vulnerabilities wherein one or more nodes are compromised to produce bogus data so as to confuse the rest of the network in collaborative sensing applications. The constrained resources and the presence of faulty nodes make designing fault-tolerant information aggregation mechanisms in large sensor networks particularly challenging. In our work, we consider Byzantine faults, which encompass most of the common sensor node faults (9). Faulty nodes are assumed to send inconsistent and arbitrary values to other nodes during the information exchange process. These values are termed outliers, and we use a statistical test called the Modified Z-score method to reliably detect and remove them. We show by simulation that the proposed strategy works well for two major classes of collaborative sensor network applications, viz. (i) target/event detection and (ii) continuous data gathering. Categories and Subject Descriptors: C.2.1 (Network Architecture and Design); Wireless communication; C.4 (Performance of Systems); Fault tolerance; E.1 (Data Structures); Distributed data structures
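The outlier test named above is standard and easy to sketch: the Modified Z-score scales each reading's deviation from the median by the median absolute deviation (MAD), and readings beyond a cut-off are discarded before aggregation. The customary cut-off of 3.5 is used here; the paper's exact threshold and aggregation step are not reproduced.

```python
import numpy as np

def remove_outliers(readings, cutoff=3.5):
    """Drop sensor readings whose Modified Z-score exceeds `cutoff`."""
    x = np.asarray(readings, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:                       # all-identical data: nothing to flag
        return x
    m = 0.6745 * (x - median) / mad    # Modified Z-score
    return x[np.abs(m) <= cutoff]

reports = [20.1, 19.8, 20.3, 20.0, 87.5, 19.9]   # one Byzantine value
print(remove_outliers(reports))                  # the 87.5 report is removed
```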


Journal Article
TL;DR: A novel nested optimization technique based on a genetic algorithm is proposed to assign tasks to sensors with minimal cost while meeting the application's QoS requirements.
Abstract: Collaborative processing among sensors to fulfill given tasks is a promising way to save significant energy in resource-limited wireless sensor networks (WSN). Quality of Service (QoS) measures such as lifetime and latency are largely affected by how tasks are mapped to sensors in the network. Task allocation is a well-defined problem in the area of high performance computing and has been extensively studied in the past. Due to the limitations of WSNs, existing algorithms cannot be directly used. In this paper, a novel nested optimization technique based on a genetic algorithm is proposed to assign tasks to sensors with minimal cost while meeting the application's QoS requirements. An optimal solution can be achieved by incorporating task mapping, routing path allocation, communication scheduling, and dynamic voltage scaling. Performance is evaluated through experiments with randomly generated Directed Acyclic Graphs (DAG), and experimental results show better solutions compared with existing methods.
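A deliberately simplified sketch of the genetic-algorithm idea: a chromosome assigns each task to a sensor, fitness combines a made-up energy cost with a penalty for violating a latency bound, and the population evolves by one-point crossover and random reassignment. The nested optimization over routing, scheduling and dynamic voltage scaling described in the paper is not modelled here; all constants are illustrative.

```python
import random

N_TASKS, N_SENSORS, LATENCY_BOUND = 8, 4, 3
EXEC_COST = [[random.uniform(1, 5) for _ in range(N_SENSORS)] for _ in range(N_TASKS)]

def fitness(assign):
    """Energy cost of an assignment plus a penalty when the latency bound is missed."""
    energy = sum(EXEC_COST[t][s] for t, s in enumerate(assign))
    # crude latency model: tasks mapped to the same sensor run sequentially
    latency = max(assign.count(s) for s in range(N_SENSORS))
    return energy + (100 if latency > LATENCY_BOUND else 0)

def evolve(pop_size=30, generations=100, mutation=0.1):
    pop = [[random.randrange(N_SENSORS) for _ in range(N_TASKS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_TASKS)
            child = a[:cut] + b[cut:]                     # one-point crossover
            if random.random() < mutation:                # random reassignment mutation
                child[random.randrange(N_TASKS)] = random.randrange(N_SENSORS)
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

print(evolve())   # best task-to-sensor assignment found
```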