
Showing papers in "CSI Transactions on ICT in 2014"


Journal ArticleDOI
TL;DR: This paper introduces two hybrid ensemble-based models (bagging and Bayesian boosting based) for opinion classification and applies a pairwise statistical test to compare the significance of the classifiers.
Abstract: With the rapid expansion of e-commerce over the decades, more and more product reviews emerge on e-commerce sites. In order to effectively utilize the information available in the form of reviews, an automatic opinion mining system is needed to organize the reviews and to help users and organizations make informed decisions about products. Opinion mining systems based on machine learning approaches are used to categorize reviews containing customer opinions into positive or negative reviews. In this paper we explore the application of a hybrid combination of machine learning approaches tied with principal component analysis as a feature reduction technique. We introduce two hybrid ensemble-based models (bagging and Bayesian boosting based) for opinion classification. The results are compared with two individual classifier models based on statistical learning (logistic regression and support vector machine) using a dataset of product reviews. A further objective is to compare the influence of using different n-gram schemes (unigrams, bigrams and trigrams). We found that ensemble-based hybrid methods perform better in terms of various quality measures in classifying opinions into positive and negative reviews. We also applied a pairwise statistical test to compare the significance of the classifiers.
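As a rough illustration of the kind of pipeline described above (not the authors' exact configuration), the minimal sketch below builds unigram/bigram TF-IDF features, reduces them with PCA and trains a bagging ensemble over a logistic regression base learner; the in-line reviews, labels and parameter choices are all hypothetical.

```python
# Minimal sketch of an n-gram + PCA + bagging opinion classifier.
# Data, parameters and the base learner are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import PCA
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression

reviews = ["great product, works well", "terrible quality, broke fast",
           "very happy with this purchase", "waste of money, do not buy"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# n-gram feature extraction (unigrams + bigrams here).
X = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(reviews).toarray()

# Principal component analysis as the feature reduction step.
X_reduced = PCA(n_components=2).fit_transform(X)

# Bagging ensemble over a statistical base learner.
clf = BaggingClassifier(LogisticRegression(max_iter=1000), n_estimators=10)
clf.fit(X_reduced, labels)
print(clf.predict(X_reduced))
```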

29 citations


Journal ArticleDOI
TL;DR: None of the websites evaluated were completely accessible to people with disabilities, i.e., no website was free of violations of web accessibility guidelines, and no significant difference was found in the accessibility of public and private sector banking websites in India.
Abstract: Accessibility refers to making websites usable for people of all types of abilities and disabilities, regardless of what browsing technology they are using. Since the web is an important resource of information for millions of people at all levels, accessible websites can help people with disabilities participate and contribute more actively in society. The objective of this study is to analyze the accessibility status of banking websites, as accessible banking allows people with disabilities to be independent and more in control of their own financial requirements. The Web Content Accessibility Guidelines (WCAG) are universally accepted guidelines for website accessibility evaluation. An automatic evaluation tool was used to evaluate website accessibility against the WCAG 1.0 and WCAG 2.0 guidelines. To further assess the reasons for accessibility barriers, a complexity score was calculated. Accessibility scores for different disabilities were also computed, as was the difference between the mean accessibility errors of public and private sector banks in India. The correlation of accessibility with the popularity and importance of the websites was also evaluated. It was found that none of the evaluated websites was completely accessible to people with disabilities, i.e., no website was free of violations of web accessibility guidelines. No significant difference was found in the accessibility of public and private sector banking websites in India. A framework to categorize websites as fully accessible, partially accessible or inaccessible was also proposed.

24 citations


Journal ArticleDOI
TL;DR: In this paper, the authors derive threshold metric values for bad smells using risk analysis at five different levels and select one threshold value from the various risk levels (for bad smells) by determining the largest area under the receiver operating characteristic curve for faulty classes at the corresponding risk levels.
Abstract: Metrics can be applied by software maintenance, testing and evolution teams for a variety of purposes. Various research studies have designed metrics models for analyzing the quality of software. However, it is hard to assess the quality of software with a single metric value; a metric value alone is meaningless without its threshold values. The current study derives threshold metric values for bad smells using risk analysis at five different levels. Three versions of Mozilla Firefox were used as the dataset to validate the study. The results show that some metrics have threshold values at various risk levels that are of practical use in predicting faulty classes. Finally, one threshold value was selected from the various risk levels (for bad smells) by determining the largest area under the receiver operating characteristic (ROC) curve for faulty classes at the corresponding risk levels.
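A minimal sketch of the threshold selection step, assuming a per-class metric value and a faulty/non-faulty label are already available; the metric values, labels and the candidate thresholds standing in for the five risk levels are hypothetical.

```python
# Hypothetical sketch: pick the metric threshold whose binarized predictor
# yields the largest area under the ROC curve for predicting faulty classes.
import numpy as np
from sklearn.metrics import roc_auc_score

metric = np.array([3, 8, 15, 22, 35, 40, 5, 18])   # made-up metric values per class
faulty = np.array([0, 0, 1, 1, 1, 1, 0, 1])        # 1 = faulty class (made up)

candidate_thresholds = [5, 10, 20, 30, 40]          # stand-ins for the five risk levels

best = max(candidate_thresholds,
           key=lambda t: roc_auc_score(faulty, (metric >= t).astype(int)))
print("selected threshold:", best)
```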

19 citations


Journal ArticleDOI
TL;DR: The results of simulation experiments show the high precision of the proposed method and reveal that the dominance of one blur over another does not substantially affect the applied parameter estimation approach.
Abstract: Motion blur and defocus blur are common causes of image degradation. Blind restoration of such images demands identification of the accurate point spread function for these blurs. The identification of joint blur parameters in barcode images is considered in this paper using logarithmic power spectrum analysis. First, the Radon transform is utilized to identify the motion blur angle. Then we estimate the motion blur length and the defocus blur radius of the jointly blurred image with a generalized regression neural network (GRNN). The input of the GRNN is the sum of the amplitudes of the normalized logarithmic power spectrum along the vertical direction and along concentric circles for motion and defocus blurs, respectively. This scheme is tested on multiple barcode images with varying joint blur parameters. We have also analyzed the effect of joint blur when one blur has the same, a greater or a lesser extent than the other. The results of simulation experiments show the high precision of the proposed method and reveal that the dominance of one blur over another does not substantially affect the applied parameter estimation approach.
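The sketch below illustrates two ingredients mentioned above under simplifying assumptions: the motion blur angle is read off as the dominant direction in the Radon transform of the log power spectrum, and a generalized regression neural network is written out as a Gaussian-weighted average. The training pairs are random placeholders, not spectral features from real barcode images.

```python
# Illustrative sketch (not the paper's exact pipeline): Radon-based angle
# estimation from the log power spectrum plus a minimal GRNN regressor.
import numpy as np
from skimage.transform import radon

def motion_blur_angle(image):
    """Angle (degrees) of the dominant line in the log power spectrum."""
    spectrum = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(image))))
    theta = np.arange(180)
    sinogram = radon(spectrum, theta=theta, circle=False)
    # The blur direction shows up as the projection with the highest variance.
    return theta[np.argmax(sinogram.var(axis=0))]

def grnn_predict(train_x, train_y, query, sigma=0.1):
    """Minimal GRNN: Gaussian-kernel weighted average of training targets."""
    d2 = np.sum((train_x - query) ** 2, axis=1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return np.sum(w * train_y) / np.sum(w)

rng = np.random.default_rng(0)
print("angle on a random test image (deg):", motion_blur_angle(rng.random((64, 64))))

# Placeholder training data: spectral-sum features -> motion blur length.
train_x = rng.random((50, 8))
train_y = rng.uniform(5, 25, size=50)
print("predicted blur length:", grnn_predict(train_x, train_y, train_x[0]))
```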

18 citations


Journal ArticleDOI
TL;DR: The proposed method, CHSRDR, considers the heterogeneity in power and maintains a set of vice-heads chosen by randomness inside the cluster; these vice-heads can take over as head in future, when the main head runs out of power.
Abstract: Wireless sensor networks have been widely used in the field of surveillance and monitoring over the last few years. In such applications, clustering plays an important role in enhancing the life span and scalability of the network. Each cluster contains a cluster head that controls the working of the whole cluster, and much research focuses on good selection of the cluster head to improve the life of the WSN. Previous works on cluster head selection lack data recovery. In this paper we propose Cluster Head Selection by Randomness with Data Recovery in WSN (CHSRDR), a method for selecting the cluster head inside the cluster with data recovery. The proposed method, CHSRDR, considers the heterogeneity in power and maintains a set of vice-heads chosen by randomness inside the cluster; these vice-heads can take over as head in future, when the main head runs out of power. The headship circulates among the vice-heads of the cluster. We have simulated the method and observed an enhancement in throughput.
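Purely as a toy illustration of the headship rotation idea (not the CHSRDR protocol itself), the sketch below picks vice-heads at random inside a cluster and hands headship to a vice-head when the current head's energy is exhausted; node counts, energies and per-round costs are invented.

```python
# Toy illustration of random vice-head selection and headship rotation.
# Energies, costs and node counts are made up for demonstration.
import random

random.seed(1)
energy = {f"n{i}": random.uniform(0.5, 1.0) for i in range(10)}  # node -> energy

head = max(energy, key=energy.get)                       # initial head: most energy
vice_heads = random.sample([n for n in energy if n != head], k=3)

for rnd in range(20):
    energy[head] = max(0.0, energy[head] - 0.2)          # heads spend more energy
    if energy[head] == 0.0 and vice_heads:               # head exhausted: rotate
        head = vice_heads.pop(0)
        print(f"round {rnd}: headship passed to {head}")
```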

13 citations


Journal ArticleDOI
TL;DR: This work proposes a multivariate texture model that uses the multivariate discrete local texture pattern (MDLTP) supplemented with multivariate variance (MVAR); the classification accuracy of the resulting classified image is found to be 93.46 %.
Abstract: Texture features play a vital role in land cover classification of remotely sensed images. The local binary pattern (LBP) is a texture model that has been widely used in many applications, and many variants of LBP have also been proposed. Most of these texture models use only two or three discrete output levels for pattern characterization. In the case of remotely sensed images, texture models should be capable of capturing and discriminating even minute pattern differences. Therefore, a multivariate texture model with four discrete output levels is proposed for effective classification of land covers. Remotely sensed images have fuzzy land covers and boundaries. Support vector machines are well suited to classifying such images, as they can accurately classify pixels falling on the fuzzy boundary separating classes. In this work, texture features are extracted using the proposed multivariate descriptor, MDLTP/MVAR, which uses the multivariate discrete local texture pattern (MDLTP) supplemented with multivariate variance (MVAR). The classification accuracy of the resulting classified image is found to be 93.46 %.
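The abstract does not spell out the MDLTP/MVAR descriptor, so the sketch below only shows the general flavour of a local texture pattern with four discrete output levels feeding an SVM. The quantization thresholds, images and labels are purely illustrative and should not be read as the authors' descriptor.

```python
# Illustrative four-level local texture pattern (NOT the authors' MDLTP/MVAR):
# each neighbour-centre difference is quantised into one of four levels.
import numpy as np
from sklearn.svm import SVC

def four_level_pattern(patch, t=10):
    """Quantise the 8 neighbour-centre differences of a 3x3 patch into {0,1,2,3}."""
    centre = float(patch[1, 1])
    neighbours = np.delete(patch.astype(float).flatten(), 4)
    return np.digitize(neighbours - centre, bins=[-t, 0, t])

def texture_histogram(image, t=10):
    """4-bin histogram of the four-level patterns over the whole image."""
    hist = np.zeros(4)
    for i in range(1, image.shape[0] - 1):
        for j in range(1, image.shape[1] - 1):
            for level in four_level_pattern(image[i - 1:i + 2, j - 1:j + 2], t):
                hist[level] += 1
    return hist / hist.sum()

# Hypothetical two-class example: low-contrast vs high-contrast random "bands".
rng = np.random.default_rng(0)
smooth = [texture_histogram(rng.integers(100, 110, (16, 16))) for _ in range(10)]
rough = [texture_histogram(rng.integers(0, 256, (16, 16))) for _ in range(10)]
X, y = smooth + rough, [0] * 10 + [1] * 10
print(SVC(kernel="rbf").fit(X, y).predict(X[:3]))
```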

13 citations


Journal ArticleDOI
TL;DR: This paper proposes a method to automatically identify associations among the files in digital evidence at the syntactic and semantic levels using metadata, and applies it to collections of image files and word processing documents to elicit inter-file relationships and identify interesting or relevant files in large file collections in digital evidence.
Abstract: In the conventional system of analysis that is concerned with digital forensics, content is analyzed to describe the state of files in digital evidence and ascertain their relevance. Such content analysis is carried out using “searching”. When searching a file or for a file, use of keywords is the norm. When the exact words are not known, one may use regular expression search which uses a more flexible language for describing a set of keywords that fit a pattern. During analysis, there is also a need to identify all types of associations that exist between the files to answer the six fundamental questions of what, when, where, how, who and why. If the keywords and pattern have limited scope, an examiner often has very little to go on. Metadata contains information that represents the state of a file, even if partially. Besides, metadata based search is amenable to automation by virtue of the ubiquitous nature of metadata. During analysis, metadata can be used to ascertain the nature of digital photographs that were processed using software and identify digitally generated images that resemble original photographs. Metadata can also be used to identify word processing documents that were derived from other documents and stored as a duplicate or after modification in such a way that traditional techniques cannot detect. Often what is needed is the ability to identify section(s) of the evidence where relevant information appears to reside. Metadata based matches give rise to file relationships that encapsulate the event sequence among related files aiding in the discovery. This paper proposes a method to automatically identify associations among the files in digital evidence at the syntactic and semantic levels using metadata. We apply this method to identify metadata associations from collections of image files and word processing documents and elicit inter-file relationships for the purpose of identifying interesting or relevant files from large file collections in digital evidence. We demonstrate that the file relationships identified using metadata help in the identification of doctored photographs and copied documents.
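A bare-bones sketch of the syntactic side of the idea: files whose metadata records share a field value are linked into an association. The metadata records below are hypothetical dictionaries; in practice they would be extracted from EXIF tags, office document properties or filesystem timestamps.

```python
# Sketch: link files that share a metadata field value (syntactic association).
# The metadata records are hypothetical stand-ins for extracted file metadata.
from itertools import combinations

metadata = {
    "IMG_001.jpg": {"camera": "NIKON D90", "created": "2014-01-05"},
    "IMG_002.jpg": {"camera": "NIKON D90", "created": "2014-01-07"},
    "report.docx": {"author": "alice", "created": "2014-01-07"},
    "draft.docx":  {"author": "alice", "created": "2014-01-02"},
}

associations = {}
for (f1, m1), (f2, m2) in combinations(metadata.items(), 2):
    shared = sorted(k for k in m1 if k in m2 and m1[k] == m2[k])
    if shared:
        associations[(f1, f2)] = shared

for pair, fields in associations.items():
    print(pair, "share", fields)
```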

9 citations


Journal ArticleDOI
TL;DR: An algorithm for embedding a copyright mark into a host image based on the discrete cosine transform (DCT) and spread spectrum, with a set of systematic experiments, including JPEG compression, Gaussian filtering and addition of noise, performed to prove the robustness of the algorithm.
Abstract: In this paper, an algorithm for embedding a copyright mark into a host image based on the discrete cosine transform (DCT) and spread spectrum is proposed. The proposed algorithm works by dividing the cover into blocks of equal size and then embedding the watermark in the middle-band DCT coefficients of the cover image. Performance evaluation of the proposed algorithm has been carried out using the bit error rate and peak signal-to-noise ratio for different watermark sizes and images; the Lena, Girl and Tank images yield similar results. This algorithm is simple and does not require the original cover image for watermark recovery. A set of systematic experiments, including JPEG compression, Gaussian filtering and addition of noise, is performed to prove the robustness of the algorithm.
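A compact sketch of block-DCT, mid-band spread spectrum embedding, assuming an 8x8 block, a fixed mid-band coefficient set and a gain factor that are illustrative choices rather than the paper's parameters. A flat test block is used so that the blind correlation detector works on a single block; a real image would accumulate the correlation over many blocks.

```python
# Illustrative block-DCT spread-spectrum embedding; parameters are made up.
import numpy as np
from scipy.fft import dctn, idctn

MID_BAND = [(1, 2), (2, 1), (2, 2), (1, 3), (3, 1), (2, 3), (3, 2), (3, 3)]

def embed_bit(block, bit, key, alpha=5.0):
    """Embed one watermark bit into mid-band DCT coefficients of an 8x8 block."""
    pn = np.random.default_rng(key).choice([-1.0, 1.0], size=len(MID_BAND))
    coeffs = dctn(block, norm="ortho")
    for (u, v), p in zip(MID_BAND, pn):
        coeffs[u, v] += alpha * p if bit else -alpha * p
    return idctn(coeffs, norm="ortho")

def detect_bit(block, key):
    """Blind detection: correlate mid-band coefficients with the PN sequence."""
    pn = np.random.default_rng(key).choice([-1.0, 1.0], size=len(MID_BAND))
    coeffs = dctn(block, norm="ortho")
    corr = sum(coeffs[u, v] * p for (u, v), p in zip(MID_BAND, pn))
    return int(corr > 0)

block = np.full((8, 8), 128.0)          # flat block: mid-band coefficients are ~0
marked = embed_bit(block, bit=1, key=42)
print("recovered bit:", detect_bit(marked, key=42))
```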

6 citations


Journal ArticleDOI
TL;DR: The project tracking technique consists of a project state transition equation and a project status measurement equation in plan-space and is formulated with the Monte Carlo method in execution-space, showing the iteration-wise propagation of uncertainty, both individually and in total.
Abstract: Software projects need to be tracked during their execution in order to control them. According to the state-space approach, the tracking technique consists of a software project state transition equation and a software project status measurement equation. A key factor in tracking software projects is representing the project with the uncertainty involved in its parameters. A traditional and hybrid software project tracking technique is designed with the state-space approach and simulated using discrete event simulation in plan-space and execution-space. The uncertainty considered here is epistemic and is modeled as a normal distribution using an approximation method. The initial state of the project in execution-space also has an uncertainty associated with it. The project tracking technique consists of a project state transition equation and a project status measurement equation in plan-space and is formulated with the Monte Carlo method in execution-space. The project status is derived using the project measurement equation as a function of the project state, and the project state is derived using the project state transition equation. The software product is developed iteratively and incrementally. Monte Carlo simulation runs show the iteration-wise propagation of uncertainty, both individually and in total, and the effect of uncertainty on project status is shown by presenting the project status in execution-space and plan-space. The simulation also shows project completion occurring at some point during the last iteration.
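The abstract does not reproduce the actual state and measurement equations, so the sketch below only shows the general shape of a linear state transition / status measurement pair with normally distributed uncertainty propagated over Monte Carlo runs; every number in it is invented.

```python
# Toy Monte Carlo propagation of a project-state / status-measurement pair.
# The transition and measurement models and all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
runs, iterations = 1000, 8
planned_per_iter = 12.5                    # planned size units per iteration (made up)

state = np.zeros(runs)                     # completed size at iteration 0
for k in range(iterations):
    process_noise = rng.normal(0.0, 2.0, runs)        # epistemic uncertainty in execution
    state = state + planned_per_iter + process_noise  # state transition equation
    measurement_noise = rng.normal(0.0, 1.0, runs)
    status = state + measurement_noise                # status measurement equation
    print(f"iteration {k + 1}: status mean={status.mean():.1f}, std={status.std():.1f}")
```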

6 citations


Journal ArticleDOI
TL;DR: The proposed framework provides unobtrusive Continuous Authentication, by alternating between two modes which utilize hard and soft biometrics respectively, depending on certain confidence parameters, and uses facial features as the hard biometric trait for recognizing the user.
Abstract: Static authentication provides a secure framework for a one-time authentication session, but fails to authenticate the user throughout the session. This presents the possibility of an imposter gaining access when a user session is active and the user moves away from the system. The goal of continuous authentication is to authenticate the user right from the initial stages of log-in till log-out. Intuitively, this can be implemented by extrapolating the tried-and-tested static authentication techniques throughout the session. However, extrapolating one-time authentication techniques poses new challenges: it is computationally expensive, restricts the user's movement and postures in front of the system, depends on extra expensive hardware and distracts the user from their workflow. In these situations, the authentication process running in the background no longer leaves the user uninterrupted. The proposed framework provides unobtrusive continuous authentication by alternating between two modes which utilize hard and soft biometrics respectively, depending on certain confidence parameters. We use facial features as the hard biometric trait for recognizing the user. Employing face recognition for extended periods of time produces noise, which is dampened by using a supervised machine learning algorithm. The color of the user's clothing as the soft biometric trait relieves the CPU of comparatively heavy computation and relaxes constraints on the user's upper body movement.

5 citations


Journal ArticleDOI
TL;DR: The proposed protocol is designed for sensor nodes, which have limited resources, and provides better authentication by using a one-way hash function and smart cards, offering a cost-effective mechanism to defend against malicious attacks.
Abstract: Security in wireless sensor networks (WSNs) is a critical issue when it comes to malicious attacks or power loss. Recently, several security mechanisms have been proposed. In this paper, an efficient security mechanism is proposed to provide better authentication and to counter malicious attacks in WSNs. The proposed protocol is designed for sensor nodes, which have limited resources, and provides better authentication by using a one-way hash function and smart cards. We point out several pitfalls in previous schemes and propose an improvement that results in better resource utilization and better security. The security analysis shows that the proposed protocol defends better and provides a cost-effective mechanism against malicious attacks.

Journal ArticleDOI
TL;DR: This paper proposes a token-based mutual exclusion technique to solve the problem of distributed dynamic channel allocation and shows through simulation how message exchange between cells is reduced in token-based relaxed mutual exclusion as opposed to other relaxed mutual exclusion techniques.
Abstract: Channel allocation in a wireless communication network plays a crucial role in the performance of the network. A fixed channel allocation scheme does not account for non-uniform traffic in the network. A dynamic scheme, on the other hand, ensures that whenever a cell requires a channel, it is allocated subject to the frequency reuse constraints. Distributed dynamic channel allocation has recently gained popularity due to its high reliability and scalability. In this paper we propose a token-based mutual exclusion technique to solve the problem of distributed dynamic channel allocation. We show through simulation how message exchange between cells is reduced in token-based relaxed mutual exclusion as opposed to other relaxed mutual exclusion techniques.

Journal ArticleDOI
TL;DR: The proposed algorithm provides a wear leveling mechanism that enhances the life, endurance and reliability of NOR-type SSDs, which, unlike NAND-type SSDs, are not yet well equipped with such algorithms.
Abstract: The role and importance of solid state devices (SSDs) is rapidly increasing for the purpose of data storage. SSDs are rapidly replacing the old-fashioned, traditional magnetic storage media. A few factors responsible for this turnaround are the better performance and lower power requirements of SSDs. But, as with every pro, there are some associated cons. One limiting factor of SSDs in replacing traditional magnetic storage media is their low endurance: unlike magnetic storage media, SSDs can be re-programmed only a limited number of times. Internally, SSDs are organized either in bytes or in blocks, and memory access operations generally use either bytes or blocks. With each write operation, the byte gets worn out and its lifetime decreases. To guard against this limiting factor of SSDs, a wear leveling mechanism is used. Wear leveling is provided as a feature in flash-type SSDs; it evenly distributes write operations throughout the flash memory and prevents early wear-out of the memory. The existing wear leveling mechanisms are limited to flash-type SSDs. In this paper, we propose a wear leveling algorithm for SSDs, specifically NOR-type SSDs; NAND-type SSDs are already well equipped with algorithms to enhance their life and reliability. The proposed algorithm provides a mechanism that enhances the life, endurance and reliability of NOR-type SSDs.
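Since the abstract does not give the algorithm itself, the sketch below just illustrates the basic wear leveling idea it builds on: track per-block erase counts and steer each write to the least-worn free block. Block counts and the write/erase pattern are arbitrary.

```python
# Minimal wear-leveling illustration (not the paper's NOR-specific algorithm):
# writes are directed to the free block with the lowest erase count.
NUM_BLOCKS = 8
erase_count = [0] * NUM_BLOCKS
block_data = [None] * NUM_BLOCKS

def write(data):
    free = [b for b in range(NUM_BLOCKS) if block_data[b] is None]
    target = min(free, key=lambda b: erase_count[b])   # least-worn free block
    block_data[target] = data
    erase_count[target] += 1                           # programming wears the block
    return target

def erase(block):
    block_data[block] = None

for i in range(20):
    b = write(f"page-{i}")
    erase(b)                                           # free it again for the demo

print("erase counts:", erase_count)                    # wear spreads evenly
```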

Journal ArticleDOI
TL;DR: This paper presents an efficient method for estimating parasitaemia using digital images of thin blood smears that have been stained with Giemsa or an equivalent stain; it utilizes the 4-connected set properties of digital images to identify the various regions existing within the image.
Abstract: Digital image processing techniques are being explored for accurate and timely diagnosis of malaria, a serious parasitic infection of humans. A key decision factor in the diagnosis is the degree of infection, also called parasitaemia. This paper presents an efficient method for estimating parasitaemia using digital images of thin blood smears that have been stained with Giemsa or an equivalent stain. The method utilizes the 4-connected set properties of digital images to identify the various regions existing within the image. Properties of the different identified regions, such as centroids and major and minor axes, are used to arrive at the number of RBCs (good and infected) present in the image. The method addresses the issues of partially visible RBCs as well as overlapped RBCs. It also addresses image imperfections caused by dust on the slide, etc.
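A skeletal version of the region labelling step, assuming the smear image has already been binarised so that cells are foreground; 4-connectivity corresponds to connectivity=1 in scikit-image, and the area filter standing in for the partial-cell/dust logic is a placeholder.

```python
# Sketch of 4-connected region labelling and region-property extraction for
# counting cells; the binarisation and the size filter are placeholders.
import numpy as np
from skimage.measure import label, regionprops

# Hypothetical binary smear image: 1 = cell pixels, 0 = background.
binary = np.zeros((64, 64), dtype=np.uint8)
binary[10:20, 10:20] = 1
binary[30:42, 35:47] = 1

labelled = label(binary, connectivity=1)          # 4-connected components
total = 0
for region in regionprops(labelled):
    if region.area > 50:                          # placeholder dust/partial-cell filter
        total += 1
        print("centroid:", region.centroid,
              "major axis:", round(region.major_axis_length, 1),
              "minor axis:", round(region.minor_axis_length, 1))
print("RBC count:", total)
```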

Journal ArticleDOI
TL;DR: This work addresses enhancing the performance of an SRS by applying time normalization to the speech signal, and compares the proposed model with a baseline syllable-based SRS.
Abstract: Automatic speech recognition (ASR) is an active field of research. The performance of ASR can be degraded by various factors such as environmental noise, channel distortion and speech rate variability. Speech rate variability is one of the important factors affecting the accuracy of a speech recognition system (SRS). In this research work, the speech signal is categorized as slow, normal or fast speech using features such as the sound intensity level, time duration and root mean square value. This paper addresses enhancing the performance of an SRS by applying time normalization to the speech signal. The proposed model is compared with a baseline syllable-based SRS.
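A very small sketch of time normalization by resampling, assuming for illustration that duration alone flags fast or slow speech; the thresholds, reference duration and random signal are invented.

```python
# Toy time normalisation: resample a speech segment to a reference duration.
# The rate thresholds and reference duration are illustrative only.
import numpy as np
from scipy.signal import resample

fs = 16000
signal = np.random.randn(int(0.6 * fs))            # hypothetical 0.6 s fast utterance
reference_duration = 1.0                            # target duration in seconds

duration = len(signal) / fs
rate_class = "fast" if duration < 0.8 else "slow" if duration > 1.2 else "normal"

normalised = resample(signal, int(reference_duration * fs))
print(rate_class, len(signal), "->", len(normalised), "samples")
```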

Journal ArticleDOI
TL;DR: The results of the proposed binarization approach are seen to be better when compared with five existing well-known approaches proposed by Otsu, Gatos et al., Niblack, Sauvola et al. and Bernsen, using four evaluation measures.
Abstract: Degradations in document images appear due to shadows, non-uniform illumination, ink bleed-through and blur caused by humidity. Thresholding of such document images results either in broken characters or in the detection of false text. Numerous algorithms exist that can separate text and background efficiently in the textual regions of a document, but portions of the background are mistaken for text in areas that hardly contain any text. This paper presents a way to overcome these problems with a robust binarization technique that recovers the text from severely degraded document images and thereby increases the accuracy of optical character recognition systems. The proposed document recovery algorithm efficiently removes degradations from document images. The proposed work is based on the fusion of two well-known binarization methods, Gatos et al. and Niblack, using dilation and logical AND operations. The results of our proposed binarization approach are seen to be better when compared with five existing well-known approaches proposed by Otsu, Gatos et al., Niblack, Sauvola et al. and Bernsen, using four evaluation measures: execution time, F-measure, PSNR and NRM.
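The fusion idea can be sketched as below. Since the Gatos et al. binarization is not available off the shelf in scikit-image, Sauvola thresholding stands in for the adaptive component, so this is an approximation of the described fusion rather than a re-implementation.

```python
# Approximate sketch of the fusion idea: combine two binarisations with
# dilation and logical AND (Sauvola stands in for Gatos et al. here).
import numpy as np
from skimage import data
from skimage.filters import threshold_niblack, threshold_sauvola
from skimage.morphology import binary_dilation

image = data.page()                                   # sample degraded document image

niblack_text = image < threshold_niblack(image, window_size=25, k=0.2)
sauvola_text = image < threshold_sauvola(image, window_size=25)

# Dilate one mask so thin strokes survive, then AND to suppress false text.
fused_text = np.logical_and(binary_dilation(niblack_text), sauvola_text)

binarised = np.where(fused_text, 0, 255).astype(np.uint8)  # text = black
print("text pixels:", int(fused_text.sum()))
```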

Journal ArticleDOI
TL;DR: A robust and efficient method for face recognition using the histogram of Gabor phase pattern supplemented by phase only correlation (POC), comparable with the advanced face recognition algorithms reported in the literature.
Abstract: A robust and efficient method for face recognition using phase only correlation (POC) is proposed in this paper. To achieve an efficient recognition rate, it uses the concept of the histogram of Gabor phase pattern (HGPP) supplemented by the POC technique. In HGPP, quadrant-bit codes are first extracted from faces, and in order to encode the phase variations, the global Gabor phase pattern (GGPP) and local Gabor phase pattern (LGPP) are derived. GGPP and LGPP are then split into non-overlapping rectangular regions. From these regions, spatial histograms are extracted and concatenated into an extended histogram feature to represent the original image. The recognition is carried out with the nearest-neighbor classifier, using the histogram intersection as the similarity measure. Finally, face patterns are verified with a POC-based matching technique to improve the accuracy of the system. This method improves the result both distribution-wise and content-wise. Experiments are carried out on the large-scale ORL, YALE, FERET and DCSKU databases. Experimental results show that the proposed method is promising and is comparable with the advanced face recognition algorithms reported in the literature.
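Two pieces of this pipeline are simple enough to sketch: nearest-neighbour matching with histogram intersection, and phase only correlation between two images. The "histograms" below are random stand-ins for the extended HGPP features, not real Gabor phase patterns.

```python
# Sketch of histogram-intersection matching and phase-only correlation (POC).
import numpy as np

def histogram_intersection(h1, h2):
    return np.minimum(h1, h2).sum()

def poc_peak(img1, img2):
    """Peak of the phase-only correlation surface between two same-size images."""
    f1, f2 = np.fft.fft2(img1), np.fft.fft2(img2)
    cross = f1 * np.conj(f2)
    cross /= np.abs(cross) + 1e-12            # keep phase only
    return np.fft.ifft2(cross).real.max()

rng = np.random.default_rng(0)
gallery = rng.random((5, 256))                # stand-ins for extended HGPP histograms
probe = gallery[3] + rng.normal(0, 0.01, 256)

best = max(range(len(gallery)), key=lambda i: histogram_intersection(probe, gallery[i]))
print("nearest-neighbour match:", best)

face_a = rng.random((64, 64))
print("POC peak (identical images):", round(poc_peak(face_a, face_a), 3))
```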

Journal ArticleDOI
TL;DR: Experimental results show that the resulting sequences give significant improvement in terms of randomness quality and associated fault coverage in their generation procedures.
Abstract: In this paper, we compare the performance of different cellular automata based random number generators, emphasizing the quality of randomness with a focus on cost effectiveness for the fault coverage concerned. This research includes the study of a maximum-length cellular automata random number generator and the proposed equal-length cellular automata random number generator. Experimental results show that the resulting sequences give significant improvement in terms of randomness quality and associated fault coverage in their generation procedures. The complexities considered here for the generation of random numbers are space complexity, time complexity, design complexity and searching complexity.
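As a generic illustration of cellular automata based bit generation (not the maximum-length or equal-length CA designs compared in the paper), the sketch below taps the centre cell of an elementary rule 30 automaton.

```python
# Generic rule-30 cellular automaton bit generator (illustrative only; not the
# maximum-length / equal-length CA generators compared in the paper).
def rule30_step(cells):
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

def random_bits(seed_cells, count):
    cells, centre = list(seed_cells), len(seed_cells) // 2
    out = []
    for _ in range(count):
        cells = rule30_step(cells)
        out.append(cells[centre])          # tap the centre cell each generation
    return out

seed = [0] * 16 + [1] + [0] * 16           # single-seed initial configuration
print(random_bits(seed, 32))
```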

Journal ArticleDOI
TL;DR: An algorithm is introduced which generates a representative sequence for a group of time series that are clustered by any clustering technique, and gives a representative time sequence that is further used for trend analysis.
Abstract: Trend analysis is important in many applications such as business, weather and medicine, because it imparts knowledge about what has taken place in the past and what will take place in time to come. Trend analysis in time series is the practice of collecting data and attempting to spot patterns. Because a vast amount of data is present in such applications, much of it in the form of time series, we introduce an algorithm which generates a representative sequence for a group of time series that are clustered by any clustering technique. This algorithm hierarchically merges the time series present in a cluster based on their similarity and gives a representative time sequence that is further used for trend analysis. We also present verification and validation of the representative time sequence formed by our algorithm on a rainfall time series data set.
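A compact sketch of the hierarchical merging idea: repeatedly average the two most similar series in a cluster until one representative sequence remains; Euclidean distance is assumed as the similarity measure and the rainfall-like series are synthetic.

```python
# Sketch of hierarchically merging a cluster of equal-length time series into
# one representative sequence (Euclidean distance assumed as similarity).
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
base = np.sin(np.linspace(0, 4 * np.pi, 60))
series = [base + rng.normal(0, 0.1, 60) for _ in range(6)]   # synthetic cluster

while len(series) > 1:
    # Find the closest pair and replace it with the pair's point-wise mean.
    i, j = min(combinations(range(len(series)), 2),
               key=lambda p: np.linalg.norm(series[p[0]] - series[p[1]]))
    merged = (series[i] + series[j]) / 2.0
    series = [s for k, s in enumerate(series) if k not in (i, j)] + [merged]

representative = series[0]
print(representative[:5])
```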

Journal ArticleDOI
TL;DR: The creation of an advanced cyber infrastructure has been recognized as a critical knowledge resource for the further development of science and technology in India.
Abstract: The creation of an advanced cyber infrastructure has been recognized as a critical knowledge resource for the further development of science and technology in India. Interaction among various interest groups and brainstorming sessions have converged naturally on the conclusion that several scientific departments and institutes for higher education and research would benefit from this effort, and immediately so. E-Infrastructure seamlessly integrates heterogeneous partners and annihilates distance through Smart Ultra High Bandwidth networks.