Showing papers in "International Journal of Computer Applications in 2013"
TL;DR: This survey discusses the existing works on text similarity by partitioning them into three approaches: String-Based, Corpus-Based and Knowledge-Based similarities. Samples of combinations between these similarities are also presented.
Abstract: Measuring the similarity between words, sentences, paragraphs and documents is an important component in various tasks such as information retrieval, document clustering, word-sense disambiguation, automatic essay scoring, short answer grading, machine translation and text summarization. This survey discusses the existing works on text similarity by partitioning them into three approaches: String-Based, Corpus-Based and Knowledge-Based similarities. Furthermore, samples of combinations between these similarities are presented. General Terms: Text Mining, Natural Language Processing. Keywords: Text Similarity, Semantic Similarity, String-Based Similarity, Corpus-Based Similarity, Knowledge-Based Similarity. 1. INTRODUCTION Text similarity measures play an increasingly important role in text-related research and applications, in tasks such as information retrieval, text classification, document clustering, topic detection, topic tracking, question generation, question answering, essay scoring, short answer scoring, machine translation, text summarization and others. Finding similarity between words is a fundamental part of text similarity, which is then used as a primary stage for sentence, paragraph and document similarities. Words can be similar in two ways: lexically and semantically. Words are similar lexically if they have a similar character sequence. Words are similar semantically if they mean the same thing, are opposites of each other, are used in the same way, are used in the same context, or one is a type of another. Lexical similarity is introduced in this survey through different String-Based algorithms; semantic similarity is introduced through Corpus-Based and Knowledge-Based algorithms. String-Based measures operate on string sequences and character composition. A string metric is a metric that measures similarity or dissimilarity (distance) between two text strings for approximate string matching or comparison.
Corpus-Based similarity is a semantic similarity measure that determines the similarity between words according to information gained from large corpora. Knowledge-Based similarity is a semantic similarity measure that determines the degree of similarity between words using information derived from semantic networks. The most popular measures of each type are presented briefly. This paper is organized as follows: Section two presents String-Based algorithms by partitioning them into two types: character-based and term-based measures. Sections three and four introduce Corpus-Based and Knowledge-Based algorithms respectively. Samples of combinations between similarity algorithms are introduced in section five, and finally section six presents the conclusion of the survey.
596 citations
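The abstract above describes character-based string metrics only in general terms. As an illustration, here is a minimal sketch of one well-known character-based measure, the Levenshtein edit distance. The choice of measure, the function names and the normalisation into a similarity score are assumptions made for this example and are not taken from the survey.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def lexical_similarity(a: str, b: str) -> float:
    """Hypothetical normalisation: 1.0 for identical strings, 0.0 when
    every character position must change."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

For example, `levenshtein("kitten", "sitting")` counts two substitutions and one insertion.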
TL;DR: A survey of various Encryption Algorithms is presented and it is shown that the art of cryptography has become more complex in order to make information more secure.
Abstract: Encryption is the process of scrambling a message so that only the intended recipient can read it. Encryption can provide a means of securing information. As more and more information is stored on computers or communicated via computers, the need to ensure that this information is invulnerable to snooping and/or tampering becomes more relevant. With the fast progression of electronic data exchange, information security is becoming much more important in data storage and transmission. Information confidentiality has a prominent significance in the study of ethics, law and, most recently, information systems. With the evolution of human intelligence, the art of cryptography has become more complex in order to make information more secure. Arrays of encryption systems are being deployed in the world of information systems by various organizations. In this paper, a survey of various encryption algorithms is presented. General Terms: Information Security, Encryption
243 citations
TL;DR: The results obtained by implementing the k-means algorithm using three different distance metrics (Euclidean, Manhattan and Minkowski) are discussed, along with a comparative study against the basic k-means algorithm implemented with the Euclidean distance metric for two-dimensional data.
Abstract: The power of the k-means algorithm is due to its computational efficiency and the ease with which it can be used. Distance metrics are used to find similar data objects, which leads to robust algorithms for data mining functionalities such as classification and clustering. In this paper, the results obtained by implementing the k-means algorithm using three different distance metrics (Euclidean, Manhattan and Minkowski) are discussed, along with a comparative study against the basic k-means algorithm implemented with the Euclidean distance metric for two-dimensional data. Results are displayed with the help of histograms.
196 citations
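The three metrics named in this abstract are related: the Minkowski distance of order p reduces to the Manhattan distance for p = 1 and to the Euclidean distance for p = 2. The sketch below shows this, together with the k-means assignment step under a chosen metric. The function names and the sample points are hypothetical and are not taken from the paper.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length points.
    p=1 gives the Manhattan distance, p=2 the Euclidean distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def nearest_centroid(point, centroids, p=2):
    """Index of the centroid closest to point: the assignment step of
    k-means, here with a configurable Minkowski order."""
    return min(range(len(centroids)),
               key=lambda i: minkowski(point, centroids[i], p))


# Illustrative two-dimensional data: the chosen metric changes the assignment.
centroids = [(3, 3), (0, 5)]
by_manhattan = nearest_centroid((0, 0), centroids, p=1)  # distances 6 vs 5
by_euclidean = nearest_centroid((0, 0), centroids, p=2)  # distances ~4.24 vs 5
```

In this toy case the same point is assigned to different clusters under the two metrics, which is the kind of effect such a comparative study would examine.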
TL;DR: This paper presents an overview of research directions for applying supervised and unsupervised methods for managing the problem of anomaly detection, and covers the major theoretical issues.
Abstract: Intrusion detection has gained broad attention and become a fertile field for research, and it remains the subject of widespread interest among researchers. The intrusion detection community still confronts difficult problems even after many years of research. Reducing the large number of false alerts during the process of detecting unknown attack patterns remains an unresolved problem. However, several recent research results have shown that there are potential solutions to this problem. Anomaly detection is a key issue in intrusion detection, in which perturbations of normal behavior indicate the presence of intended or unintended induced attacks, faults, defects and others. This paper presents an overview of research directions for applying supervised and unsupervised methods to the problem of anomaly detection. The references cited cover the major theoretical issues, guiding the researcher toward interesting research directions.
136 citations
TL;DR: In this article, the proposed system combines entropy based intrusion detection system with anomaly detection system for providing multilevel Distributed Denial of Service (DDoS) in cloud environment. But, it fails to detect those attacks that are not included in database.
Abstract: Cloud computing is a recent computing model that provides consistent access to wide-area distributed resources. It has revolutionized the IT world with its service provision infrastructure, lower maintenance cost, data and service availability assurance, rapid accessibility and scalability. A Grid and Cloud Computing Intrusion Detection System (GCCIDS) detects encrypted node communication and finds hidden attack trails, inspecting and detecting attacks that network-based and host-based systems cannot identify. It incorporates knowledge and behavior analysis to identify specific intrusions. A signature-based IDS monitors the packets in the network and identifies threats by matching them against a database, but it fails to detect attacks that are not included in the database. A signature-based IDS also performs poorly when capturing a large volume of anomalies. Another problem is that the Cloud Service Provider (CSP) hides attacks caused by intruders; due to its distributed nature, the cloud environment has a high possibility of vulnerable resources. By impersonating legitimate users, intruders can use a service's abundant resources maliciously. The proposed system combines a few available concepts with new intrusion detection techniques, merging an entropy-based system with an anomaly detection system to address Distributed Denial of Service (DDoS) at multiple levels. This is done in two steps. First, users pass through a router at the network site, which incorporates a detection algorithm and checks for legitimate users. Second, they pass through a router placed at the cloud site, which incorporates a confirmation algorithm and checks against a threshold value: if the value is beyond the threshold, the user is considered legitimate; otherwise, an intruder has been found in the environment. This system is represented and maintained by a third party. When an attack happens in the environment, it sends a notification message to the client and an advisory report to the Cloud Service Provider (CSP).
123 citations
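The abstract does not say how the entropy value is computed or what threshold is used. Purely as an illustration, the sketch below computes the Shannon entropy of the source addresses seen in a traffic window and applies the decision rule as the abstract states it (a value beyond the threshold is treated as legitimate). The feature choice, the function names and the threshold value are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    total = len(items)
    counts = Counter(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def classify_window(source_addresses, threshold):
    """Decision rule as worded in the abstract: entropy beyond the
    threshold is treated as legitimate, anything else as an intruder.
    The threshold is a hypothetical tuning parameter."""
    entropy = shannon_entropy(source_addresses)
    return "legitimate" if entropy > threshold else "intruder"


# Illustrative windows: traffic spread over many sources versus one source.
spread = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
concentrated = ["10.0.0.9"] * 4
```

With a hypothetical threshold of 1.0 bit, the spread window (entropy 2.0) is classified as legitimate and the concentrated window (entropy 0.0) as an intruder.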
TL;DR: A detailed review has been conducted on the current situation of malware infection and the work done to improve anti-malware or malware detection systems and provides an up-to-date comparative reference for developers of malware detection systems.
Abstract: Over the last decades, many studies have been made on malware and its countermeasures. The most recent reports emphasize that the creation of malicious software is rapidly increasing. Moreover, the intensive use of networks and the Internet increases the ability of this kind of software to spread and its effectiveness. On the other hand, researchers and manufacturers are making great efforts to produce anti-malware systems with effective detection methods for better protection of computers. In this paper, a detailed review has been conducted on the current situation of malware infection and the work done to improve anti-malware or malware detection systems. Thus, it provides an up-to-date comparative reference for developers of malware detection systems.
114 citations
TL;DR: In this article, the authors identify the factors influencing the performance of students in final examinations and look for a suitable data mining algorithm to predict students' grades, so as to give a timely and appropriate warning to students who are at risk.
Abstract: Predicting the performance of a student is a great concern to higher education management. The scope of this paper is to identify the factors influencing the performance of students in final examinations and to find a suitable data mining algorithm to predict students' grades, so as to give a timely and appropriate warning to students who are at risk. In the present investigation, a survey-cum-experimental methodology was adopted to generate a database, which was constructed from a primary and a secondary source. The results obtained from hypothesis testing reveal that the type of school does not influence student performance and that parents' occupation plays a major role in predicting grades. This work will help educational institutions to identify the students who are at risk and to provide better additional training for weak students.
112 citations
TL;DR: Different approaches of ANPR are discussed by considering image size, success rate and processing time as parameters, and towards the end of this paper, an extension to ANPR is suggested.
Abstract: Traffic control and vehicle owner identification have become a major problem in every country. Sometimes it is difficult to identify a vehicle owner who violates traffic rules and drives too fast. It may not be possible to catch and punish such people, because traffic personnel might not be able to read the number of a moving vehicle at speed. Therefore, there is a need to develop an Automatic Number Plate Recognition (ANPR) system as one of the solutions to this problem. There are numerous ANPR systems available today. These systems are based on different methodologies, but it is still a really challenging task, as factors like high vehicle speed, non-uniform vehicle number plates, the language of the vehicle number and different lighting conditions can greatly affect the overall recognition rate. Most of the systems work under these limitations. In this paper, different approaches to ANPR are discussed by considering image size, success rate and processing time as parameters. Towards the end of this paper, an extension to ANPR is suggested.
104 citations
TL;DR: The firefly Optimization (FA or FFA) algorithm is an optimization method inspired by the flashing behavior of fireflies that ranks randomly generated solutions as fireflies, and brightness is assigned depending on their performance on the objective function.
Abstract: Bio-inspired optimization techniques have attracted great attention in recent years due to their robustness, simplicity and efficiency in solving complex optimization problems. The Firefly Optimization (FA or FFA) algorithm is an optimization method with these features. The algorithm is inspired by the flashing behavior of fireflies. In the algorithm, randomly generated solutions are considered as fireflies, and brightness is assigned depending on their performance on the objective function. The algorithm is analyzed on the basis of performance and success rate using five standard benchmark functions, from which guidelines for parameter selection are derived. The tradeoff between exploration and exploitation is illustrated and discussed.
101 citations
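The abstract says that random solutions act as fireflies and that brightness follows the objective value, but it gives no update rule. The sketch below uses the commonly cited movement rule, in which a firefly moves toward any brighter one with an attractiveness that decays with distance, plus a small random step. The parameter names (`alpha`, `beta0`, `gamma`), their default values and the sphere benchmark are assumptions for this example and are not taken from the paper.

```python
import math
import random


def sphere(x):
    """A simple benchmark objective; its minimum is 0 at the origin."""
    return sum(v * v for v in x)


def firefly_minimize(objective, dim, n_fireflies=15, iterations=50,
                     alpha=0.2, beta0=1.0, gamma=1.0,
                     bounds=(-5.0, 5.0), seed=0):
    """Minimise objective; lower cost means a brighter firefly.
    Returns (best position, best cost, best cost of the initial swarm)."""
    rng = random.Random(seed)
    lo, hi = bounds
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)]
             for _ in range(n_fireflies)]
    cost = [objective(x) for x in swarm]
    start = min(range(n_fireflies), key=cost.__getitem__)
    best_x, best_cost = list(swarm[start]), cost[start]
    initial_best = best_cost
    for _ in range(iterations):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is brighter than i
                    r2 = sum((a - b) ** 2
                             for a, b in zip(swarm[i], swarm[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    # Move i toward j, add a random step, clamp to bounds.
                    swarm[i] = [
                        min(hi, max(lo, a + beta * (b - a)
                                    + alpha * (rng.random() - 0.5)))
                        for a, b in zip(swarm[i], swarm[j])
                    ]
                    cost[i] = objective(swarm[i])
                    if cost[i] < best_cost:  # remember the best ever seen
                        best_x, best_cost = list(swarm[i]), cost[i]
    return best_x, best_cost, initial_best


best_x, best_cost, initial_best = firefly_minimize(sphere, dim=2)
```

Here `alpha` scales the random exploration step, while `beta0` and `gamma` control how strongly fireflies are drawn toward better solutions, which is one way to picture the exploration and exploitation tradeoff the abstract mentions.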
TL;DR: A survey of available literature of some methodologies employed by different researchers to utilize ANN for rainfall prediction reports that rainfall prediction using ANN technique is more suitable than traditional statistical and numerical methods.
Abstract: Rainfall prediction is one of the most important and challenging tasks in the modern world. In general, climate and rainfall are highly non-linear and complicated phenomena, which require advanced computer modeling and simulation for their accurate prediction. An Artificial Neural Network (ANN) can be used to predict the behavior of such nonlinear systems. ANN has been successfully used by most of the researchers in this field for the last twenty-five years. This paper provides a survey of available literature on some methodologies employed by different researchers to utilize ANN for rainfall prediction. The survey also reports that rainfall prediction using the ANN technique is more suitable than traditional statistical and numerical methods. General Terms: Rainfall, Artificial Neural Network, Prediction. Keywords: Rainfall, Neural Network, BPN, RBF, SVM, SOM, ANN. 1. INTRODUCTION Among all kinds of weather events, rainfall plays the most important role in human life. The effect of rainfall on human civilization is colossal. Rainfall is a natural climatic phenomenon whose prediction is challenging and demanding. Accurate information on rainfall is essential for the planning and management of water resources and is also crucial for reservoir operation and flood prevention. Additionally, rainfall has a strong influence on traffic, sewer systems and other human activities in urban areas. Nevertheless, rainfall is one of the most complex and difficult elements of the hydrological cycle to understand and to model, due to the complexity of the atmospheric processes that generate rainfall and the tremendous range of variation over a wide range of scales both in space and time. Thus, accurate rainfall prediction is one of the greatest challenges in operational hydrology, despite many advances in weather forecasting in recent decades. Rainfall means crops, and crops mean life.
Rainfall prediction is closely related to the agriculture sector, which contributes significantly to the economy of the nation. On a worldwide scale, a large number of attempts have been made by different researchers to predict rainfall accurately using various techniques. But due to the nonlinear nature of rainfall, the prediction accuracy obtained by these techniques is still below the satisfactory level. The artificial neural network algorithm has become an attractive inductive approach in rainfall prediction owing to its high nonlinearity, flexibility and data-driven learning in building models without any prior knowledge about catchment behavior and flow processes. Artificial neural networks have been successfully used in various aspects of science and engineering because of their ability to model both linear and non-linear systems without the need to make the assumptions that are implicit in most traditional statistical approaches. ANN has been used as an effective model over the simple linear regression model. This paper provides a literature survey on rainfall prediction using different neural networks used by different researchers. The paper also briefly discusses the concept of some neural network architectures, which will be helpful to new researchers in this field. The objective of this survey is to make the prediction of rainfall more accurate in the near future. The paper is organized as follows. Section II discusses the concept of neural networks. Different methodologies used by researchers to predict rainfall are discussed in Section III. Section IV discusses the literature survey of rainfall prediction all over the world. Finally, a conclusion is given in Section V.
93 citations
TL;DR: The experiments and results show that heavy amount of potential evidences and valuable data can be found on Android phones by forensic investigators.
Abstract: Modern smartphones have built-in apps like WhatsApp and Viber, which allow users to exchange instant messages and share videos, audio and images via smartphones instead of relying on their desktop computers or laptops, thereby increasing portability and convenience for a lay smartphone user. An Instant Messenger (IM) can serve as a very useful yet very dangerous platform for the victim and the suspect to communicate. The increased use of instant messengers on Android phones has turned out to be a goldmine for mobile and computer forensic experts. Traces and evidence left by applications can be held on Android phones, and retrieving that potential evidence with the right forensic technique is strongly required. This paper focuses on conducting forensic data analysis of 2 widely used IM applications on Android phones: WhatsApp and Viber. 5 Android phones were analyzed, covering 3 different versions of Android OS: Froyo (2.2), Gingerbread (2.3.x) and Ice Cream Sandwich (4.0.x). The tests and analysis were performed with the aim of determining what data and information can be found on the device's internal memory for instant messengers, e.g. chat messaging logs and history, sent and received image or video files, etc. The location of data found from the file system extraction of the device was also determined. The experiments and results show that a large amount of potential evidence and valuable data can be found on Android phones by forensic investigators.
TL;DR: This work studies and evaluates the performance of different distances that can be used in the K-NN algorithm and analyzes this distance by using different values of the parameter “k” and by using several rules of classification.
Abstract: Cancer diagnosis is one of the most studied problems in the medical domain. Several researchers have focused on improving performance and obtaining satisfactory results. Breast cancer is one of the leading cancer killers in the world. The diagnosis of this cancer is a major problem in cancer diagnosis research. In artificial intelligence, machine learning is a discipline which allows the machine to evolve through a process. Machine learning is widely used in bioinformatics and particularly in breast cancer diagnosis. One of the most popular methods is K-nearest neighbors (K-NN), which is a supervised learning method. Using K-NN in medical diagnosis is very interesting. The quality of the results depends largely on the distance and on the value of the parameter “k”, which represents the number of nearest neighbors. In this paper, we study and evaluate the performance of different distances that can be used in the K-NN algorithm. We also analyze these distances by using different values of the parameter “k” and by using several rules of classification (the rule used to decide how to classify a sample). Our work is performed on the WBCD database (Wisconsin Breast Cancer Database) obtained from the University of Wisconsin Hospital.
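The abstract identifies three things that drive K-NN results: the distance, the value of “k” and the classification rule. A minimal sketch with a pluggable distance and a simple majority-vote rule is given below. The two distances shown, the function names and the toy two-feature samples are assumptions for this example; they are not the WBCD data or the paper's exact setup.

```python
from collections import Counter


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def knn_predict(train, query, k, distance):
    """Classify query by majority vote among its k nearest training
    samples. train is a list of (features, label) pairs."""
    neighbours = sorted(train,
                        key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy samples with two features each.
train = [
    ((1.0, 1.0), "benign"), ((1.5, 2.0), "benign"), ((2.0, 1.0), "benign"),
    ((6.0, 6.0), "malignant"), ((7.0, 5.5), "malignant"),
    ((6.5, 7.0), "malignant"),
]
prediction = knn_predict(train, (1.2, 1.5), k=3, distance=euclidean)
```

Swapping `distance=manhattan` or changing `k` is enough to repeat the kind of comparison the abstract describes; majority voting is only one possible classification rule.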
TL;DR: The Query Recommendation technique provides alternative queries to the user to frame a meaningful and relevant query in the future and rapidly satisfies their information needs.
Abstract: The exhaustive information available in the World Wide Web indeed, unfolds the challenge of exploring the apposite, precise and relevant data in every search result. Apparently, in such instances of web-searching, Query Recommendations is the ultimate application in information retrieval. The Query Recommendation technique provides alternative queries to the user to frame a meaningful and relevant query in the future and rapidly satisfies their information needs. Similar query
TL;DR: The analysis, verification and experiment showed that a second-order high-pass filter can adequately suppress the low frequency noises and the proposed amplification and filtering circuit design is able to effectively clean the noises and collect the useful surface EMG signals from an upper limb.
Abstract: Electromyographic (EMG) signals have been widely employed as a control signal in rehabilitation and as a means of diagnosis in health care. Signal amplification and filtering is the first step in surface EMG signal processing and application systems. The characteristics of the amplifiers and filters determine the quality of EMG signals. Up until now, searching for a better amplification and filtering circuit design that is able to accurately capture the features of surface EMG signals for the intended applications has remained a challenge. With the fast development of computer sciences and technologies, EMG signals are expected to be used and integrated within small or even tiny intelligent, automatic, robotic and mechatronic systems. This research focused on small-size amplification and filtering circuit design for processing surface EMG signals from an upper limb, and aimed to fit the amplifiers and filters inside a robotic hand with limited space to command and control the robot hand movement. The research studied the commonly used methodologies for EMG signal processing and circuitry design and proposed a circuit design for EMG signal amplification and filtering. High-pass filters, including second-order and fourth-order filters that suppress low-frequency noise, are studied. The analysis, verification and experiment showed that a second-order high-pass filter can adequately suppress the low-frequency noise. The proposed amplification and filtering circuit design is able to effectively clean the noise and collect the useful surface EMG signals from an upper limb. The experiment also clearly revealed that power line interference needs to be carefully handled for a higher signal-to-noise ratio (SNR), as a notch filter might cause the loss of useful signal components. Commercial computer software such as LabVIEW and MATLAB was used for data acquisition software development and data analysis.
General Terms EMG data acquisition, Bio-signal processing, Bio-informatics
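The paper concerns an analog circuit, and the abstract gives no component values or cutoff frequency. As a software analogue only, the sketch below implements a digital second-order (biquad) high-pass filter with a Butterworth response, using the widely used bilinear-transform coefficient formulas. The 20 Hz cutoff, the 1000 Hz sample rate and the function name are assumptions chosen for illustration.

```python
import math


def highpass_biquad(samples, cutoff_hz, sample_rate_hz,
                    q=1 / math.sqrt(2)):
    """Second-order digital high-pass filter (direct form I).
    q = 1/sqrt(2) gives a Butterworth (maximally flat) response."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2 * q)
    b0 = (1 + cos_w0) / 2
    b1 = -(1 + cos_w0)
    b2 = (1 + cos_w0) / 2
    a0 = 1 + alpha
    a1 = -2 * cos_w0
    a2 = 1 - alpha
    x1 = x2 = y1 = y2 = 0.0  # previous inputs and outputs
    out = []
    for x in samples:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# A constant (DC) offset should be removed; a 200 Hz tone should pass.
fs = 1000.0
dc_out = highpass_biquad([1.0] * 2000, cutoff_hz=20.0, sample_rate_hz=fs)
tone = [math.sin(2 * math.pi * 200.0 * n / fs) for n in range(2000)]
tone_out = highpass_biquad(tone, cutoff_hz=20.0, sample_rate_hz=fs)
```

The constant input stands in for low-frequency noise such as baseline drift: after the initial transient the filter output settles to zero, while the higher-frequency tone keeps most of its amplitude.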
TL;DR: A new metric is proposed that is useful for measuring the improvement in contrast as well as sharpness of both general and medical images and can be used for all types of images.
Abstract: Evaluation of images, after processing, is an important step for determining how well the images are being processed. Quality of image is usually assessed using image quality metrics. Unfortunately, most of the commonly used metrics cannot adequately describe the visual quality of the enhanced image. There is no universal measure, which specifies both the objective and subjective validity of the enhancement for all types of images. This paper is a study of the various quantitative metrics for enhancement against changes in contrast and sharpness of both general and medical images. A new metric is proposed that is useful for measuring the improvement in contrast as well as sharpness. It is computationally simple and can be used for all types of images.
Journal Article
TL;DR: Five feature selection methods for sentiment analysis are investigated, along with their classification performance in terms of recall, precision and accuracy; the performance of the classifier is found to depend on selecting an appropriate number of representative features from the text.
Abstract: Sentiment analysis, or opinion mining, has become an open research domain following the proliferation of the Internet and Web 2.0 social media. People express their attitudes and opinions on social media including blogs, discussion forums, tweets, etc., and sentiment analysis is concerned with detecting and extracting sentiment or opinion from online text. Sentiment-based text classification is different from topical text classification since it involves discrimination based on the opinion expressed on a topic. Feature selection is significant for sentiment analysis, as opinionated text may have high dimensionality, which can adversely affect the performance of a sentiment analysis classifier. This paper explores the applicability of feature selection methods for sentiment analysis and investigates their performance for classification in terms of recall, precision and accuracy. Five feature selection methods (Document Frequency, Information Gain, Gain Ratio, Chi Squared, and Relief-F) and three popular sentiment feature lexicons (HM, GI and Opinion Lexicon) are investigated on a movie reviews corpus of 2000 documents. The experimental results show that Information Gain gave consistent results and Gain Ratio performed best overall for sentiment feature selection, while sentiment lexicons gave poor performance. Furthermore, we found that the performance of the classifier depends on selecting an appropriate number of representative features from the text.
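Information Gain, one of the five methods listed, scores a term by how much knowing whether it appears in a document reduces the entropy of the class labels. The sketch below is a minimal version for binary term presence. The function names and the four toy reviews are hypothetical; they are not drawn from the paper's corpus.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(documents, labels, term):
    """Entropy reduction from splitting the documents on whether they
    contain term. documents is a list of token sets."""
    with_term = [lab for doc, lab in zip(documents, labels) if term in doc]
    without = [lab for doc, lab in zip(documents, labels)
               if term not in doc]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without) if part)
    return entropy(labels) - remainder


# Hypothetical toy reviews represented as sets of tokens.
docs = [{"great", "film"}, {"great", "acting"},
        {"boring", "film"}, {"boring", "plot"}]
labels = ["pos", "pos", "neg", "neg"]
gain_great = information_gain(docs, labels, "great")  # separates classes
gain_film = information_gain(docs, labels, "film")    # carries no signal
```

Ranking terms by this score and keeping the top ones is a common way to apply such a measure as a feature selector.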
TL;DR: This paper showed that for similar SNR, L-PPM scheme offered improved performance, and their performance in terms of power and bandwidth efficiencies and the Bit Error Rate versus Signal-to-Noise Ratio (SNR) are compared analytically.
Abstract: As wireless communication systems become ever more important and pervasive parts of our everyday life, system capacity and quality of service issues are becoming more critical. In order to increase the system capacity and improve the quality of service, it is necessary that we pay closer attention to bandwidth and power efficiency issues. In this paper, the bandwidth and power efficiency issues in Free Space Optics (FSO) transmissions are addressed under Pulse Position Modulation (L-PPM) and Pulse Amplitude Modulation (M-PAM) schemes, and their performance in terms of power and bandwidth efficiencies and the Bit Error Rate (BER) versus Signal-to-Noise Ratio (SNR) are compared analytically. The comparative study of the L-PPM and M-PAM schemes shows that, for similar SNR, the L-PPM scheme offers improved performance. For FSO communication systems, the On-Off Keying (OOK) modulation scheme is more commonly used due to its efficient bandwidth usage, although its power efficiency is inferior to that of the L-PPM scheme. In this research, M-PAM is the bandwidth-efficient modulation scheme when more than 2 bits of information are sent, while L-PPM is the power-efficient modulation scheme as the number of bits increases, and performance may be improved by increasing the number of bits in the L-PPM scheme. General Terms: Optical Communications, Modulation schemes, Bit Error Rate (BER), Signal to Noise Ratio (SNR), Bandwidth Efficiency, Power Efficiency.
Journal Article
TL;DR: This paper presents a comprehensive comparative analysis of different existing cryptographic algorithms (symmetric) based on their Architecture, Scalability, Flexibility, Reliability, Security and Limitation that are essential for secure communication (Wired or Wireless).
Abstract: Information security has become an important issue in the modern world as the popularity and penetration of internet commerce and communication technologies have grown, making them a prospective medium for security threats. To surmount these security threats, modern data communications use cryptography, an effective, efficient and essential component for the secure transmission of information, implementing security parameters including confidentiality, authentication, accountability and accuracy. To achieve data security, different cryptographic algorithms (symmetric and asymmetric) are used that jumble data into a scrambled format that can only be reversed by a user who has the desired key. This paper presents a comprehensive comparative analysis of different existing cryptographic algorithms (symmetric) based on their Architecture, Scalability, Flexibility, Reliability, Security and Limitation, which are essential for secure communication (Wired or Wireless).
TL;DR: The applicability of Web Crawler in the field of web search and a review on Web crawler to different problem domains in web search is discussed.
Abstract: Information Retrieval deals with searching and retrieving information within documents; it also covers searching online databases and the internet. A web crawler is defined as a program or software which traverses the Web and downloads web documents in a methodical, automated manner. Based on the type of knowledge, web crawling is usually divided into three types of crawling techniques: General Purpose Crawling, Focused Crawling and Distributed Crawling. In this paper, the applicability of web crawlers in the field of web search is discussed, together with a review of web crawlers applied to different problem domains in web search.
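The methodical traversal described in the abstract can be pictured as a breadth-first walk over links starting from a seed page. The sketch below runs on an in-memory link graph instead of the real Web so that it stays self-contained. The `fetch_links` callback, the `max_pages` limit and the toy graph are hypothetical stand-ins for real downloading and link extraction.

```python
from collections import deque


def crawl(seed, fetch_links, max_pages=100):
    """Breadth-first traversal from seed. fetch_links(url) returns the
    URLs linked from a page. Returns pages in the order visited."""
    visited = []
    seen = {seed}              # URLs already queued, to avoid revisits
    frontier = deque([seed])   # FIFO queue gives breadth-first order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)    # a real crawler would download here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical link graph standing in for the Web.
toy_web = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
order = crawl("a", lambda url: toy_web.get(url, []))
```

A focused crawler would replace the plain FIFO queue with a priority based on topical relevance, and a distributed crawler would partition the frontier across machines; both are left out of this sketch.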
TL;DR: The current research work reviews the work carried out in last twenty years and a brief comparison has been performed to analyze the difficulties encountered by these systems, as well as the limitation.
Abstract: Gesture was the first mode of communication for primitive cave men. Later on, human civilization developed verbal communication very well, but nonverbal communication has not lost its importance. Such nonverbal communication is used not only by physically challenged people, but also for different applications in diverse areas, such as aviation, surveying, music direction, etc. It is the best method to interact with the computer without using other peripheral devices, such as a keyboard or mouse. Researchers around the world are actively engaged in the development of robust and efficient gesture recognition systems, and more specifically hand gesture recognition systems, for various applications. The major steps associated with a hand gesture recognition system are data acquisition, gesture modeling, feature extraction and hand gesture recognition. There are several sub-steps and methodologies associated with the above steps. Different researchers have followed different algorithms or have sometimes devised their own. The current research work reviews the work carried out in the last twenty years, and a brief comparison has been performed to analyze the difficulties encountered by these systems, as well as their limitations. Finally, the desired characteristics of a robust and efficient hand gesture recognition system are described. General Terms: Hand gesture recognition, comparison
TL;DR: In this paper, the authors present a critical evaluation of several categories of semantic similarity approaches based on two standard benchmarks and give an efficient evaluation of all these measures which help researcher and practitioners to select the measure that best fit for their requirements.
Abstract: In recent years, semantic similarity measures have attracted great interest in the Semantic Web and Natural Language Processing (NLP). Several similarity measures have been developed, given the existence of structured knowledge representations offered by ontologies and corpora, which enable semantic interpretation of terms. Semantic similarity measures compute the similarity between concepts/terms included in knowledge sources in order to perform estimations. This paper discusses the existing semantic similarity methods based on structure, information content and feature approaches. Additionally, we present a critical evaluation of several categories of semantic similarity approaches based on two standard benchmarks. The aim of this paper is to give an efficient evaluation of all these measures, which helps researchers and practitioners to select the measure that best fits their requirements. General Terms: Similarity Measures, Ontology, Semantic Web, NLP
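Of the three families named in the abstract, structure-based measures are the simplest to illustrate: they score two concepts by how far apart they sit in a taxonomy. The sketch below uses one common form, 1 / (1 + shortest path length), over a hand-made is-a hierarchy. The formula choice, the function names and the toy taxonomy are assumptions for this example and are not taken from the paper.

```python
def ancestors(concept, parent_of):
    """Chain from concept up to the root, starting with the concept."""
    chain = [concept]
    while concept in parent_of:
        concept = parent_of[concept]
        chain.append(concept)
    return chain


def path_similarity(a, b, parent_of):
    """1 / (1 + number of is-a edges on the shortest path between a and
    b through their lowest common ancestor); 0.0 if they share none."""
    chain_b = ancestors(b, parent_of)
    steps_in_b = {c: i for i, c in enumerate(chain_b)}
    for steps_a, concept in enumerate(ancestors(a, parent_of)):
        if concept in steps_in_b:  # first hit is the lowest common ancestor
            return 1.0 / (1 + steps_a + steps_in_b[concept])
    return 0.0


# Hypothetical toy taxonomy given as child -> parent links.
parent_of = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
}
sim_dog_wolf = path_similarity("dog", "wolf", parent_of)  # path of 2 edges
sim_dog_cat = path_similarity("dog", "cat", parent_of)    # path of 4 edges
```

Information-content and feature-based measures need corpus statistics or concept descriptions in addition to the hierarchy, so they are not sketched here.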
TL;DR: This paper aims to provide a better understanding of the fault tolerance techniques used in cloud environments, along with some existing models, and compares them on various parameters.
Abstract: Cloud computing is the result of the evolution of on-demand service in the computing paradigms of large-scale distributed computing. It is an adoptable technology, as it provides integration of software and resources that are dynamically scalable. These systems are more or less prone to failure. Fault tolerance assesses the ability of a system to respond gracefully to an unexpected hardware or software failure. In order to achieve robustness and dependability in cloud computing, failure should be assessed and handled effectively. This paper aims to provide a better understanding of the fault tolerance techniques used in cloud environments, along with some existing models, and compares them on various parameters.
TL;DR: The J48 decision tree algorithm is found to be the most suitable algorithm for model construction; the results may help identify weak students so that management can take appropriate action and the success rate of students can be increased.
Abstract: of an educational institute can be measured in terms of successful students of the institute. Analysis related to the prediction of students' academic performance in higher education is an essential requirement for improving the quality of education. Data mining techniques play an important role in data analysis. To construct a classification model that can predict the performance of students, particularly in engineering branches, a decision tree algorithm from data mining has been used in this research. A number of factors may affect the performance of students. Here, some significant factors have been considered while constructing the decision tree for classifying students according to their attributes (grades). In this paper, four decision tree algorithms (J48, NBTree, REPTree and SimpleCart) were compared, and the J48 decision tree algorithm was found to be the most suitable for model construction. The cross-validation method and the percentage split method were used to evaluate the efficiency of the different algorithms. The traditional KDD process has been used as the methodology. The WEKA (Waikato Environment for Knowledge Analysis) tool was used for analysis and prediction. Results obtained in the present study may help identify weak students so that management can take appropriate action and the success rate of students can be increased.
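The two evaluation schemes named in the abstract, cross-validation and percentage split, differ only in how records are divided between training and testing. The sketch below shows that division on record indices in plain Python; it is not WEKA's implementation, the function names are hypothetical, and it omits any shuffling or stratification a real tool may apply first.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds whose sizes differ by at most one."""
    folds = []
    start = 0
    for i in range(k):
        # The first (n % k) folds receive one extra record.
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validation_splits(n, k):
    """Yield (train, test) index lists; each fold serves as the test set exactly once."""
    folds = k_fold_indices(n, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def percentage_split(n, train_fraction):
    """Use the first train_fraction of the records for training and the rest for testing."""
    cut = int(n * train_fraction)
    return list(range(cut)), list(range(cut, n))


folds = k_fold_indices(10, 3)
splits = list(cross_validation_splits(10, 3))
train, test = percentage_split(10, 0.66)
```

With cross-validation every record is tested exactly once, whereas a percentage split tests only the held-out portion.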
TL;DR: An overview of concepts, applications, issues and tools used for text mining is given.
Abstract: Nowadays there is an increasing trend in the use of computers for storing documents. As a result, a substantial volume of data is stored on computers in the form of documents. The documents can be of any form: structured, semi-structured or unstructured. Retrieving useful information from a huge volume of documents is a very tedious task. Text mining is an inspiring research area, as it tries to discover knowledge from unstructured text. This paper gives an overview of the concepts, applications, issues and tools used for text mining.
TL;DR: Crime analysis is done by performing k-means clustering on a crime dataset using the RapidMiner tool; the knowledge gained from data mining approaches will be useful to and support the police force.
Abstract: In today's world, security is an aspect given high priority by political and government bodies worldwide, which aim to reduce crime incidence. Data mining is an appropriate field to apply to high-volume crime datasets, and the knowledge gained from data mining approaches will be useful to and support the police force. In this paper, crime analysis is done by performing k-means clustering on a crime dataset using the RapidMiner tool.
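For readers unfamiliar with k-means, the following is a minimal plain-Python sketch of the algorithm (Lloyd's iteration). It does not reproduce the paper's RapidMiner workflow; the `incidents` records, their two features and the explicitly supplied starting centroids are hypothetical values chosen for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=20):
    """Lloyd's algorithm; returns (centroids, labels) after a fixed number of passes."""
    centroids = list(initial_centroids)
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Hypothetical records: (burglaries, assaults) reported per district.
incidents = [(1, 2), (2, 1), (1, 1), (9, 8), (8, 9), (9, 9)]
centroids, labels = kmeans(incidents, initial_centroids=[incidents[0], incidents[3]])
```

The result depends on the starting centroids, so practical tools usually run several initialisations and keep the best clustering.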
TL;DR: An effective solution is proposed for DoS-based attacks that uses a redundancy elimination mechanism consisting of a rate decreasing algorithm and a state transition mechanism; it adds a level of security to existing solutions that use various alternative options to counteract DoS attacks.
Abstract: A Vehicular Ad hoc Network (VANET) is a special kind of mobile ad hoc network that provides communication among nearby vehicles and between vehicles and nearby fixed equipment. VANETs are mainly used for improving the efficiency and safety of (future) transportation. A number of attacks are possible in a VANET due to the open nature of the wireless medium. In this paper, we classify these security attacks and organize them logically based on the level of effect of a particular security attack on intelligent vehicular traffic. Also, an effective solution is proposed for DoS-based attacks that uses a redundancy elimination mechanism consisting of a rate decreasing algorithm and a state transition mechanism. This solution adds a level of security to existing solutions that use alternative options such as channel switching, frequency hopping, communication technology switching and multiple radio transceivers to counteract DoS attacks. The proposed scheme enhances security in VANETs without using any cryptographic scheme.
TL;DR: This paper provides an overview of some basic concepts of motion estimation, optical flow and the Lucas-Kanade method.
Abstract: Motion estimation is a demanding field for researchers; in its most general form it computes an independent estimate of motion at each pixel. Motion estimation is generally known as optical or optic flow. This paper provides an overview of some basic concepts of motion estimation, optical flow and the Lucas-Kanade method. The Lucas-Kanade method is one of the methods for optical flow measurement. It is a differential method for optical flow estimation.
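To make "differential method" concrete: Lucas-Kanade assumes one flow vector (u, v) per small window and solves the brightness-constancy equation Ix*u + Iy*v + It = 0 over that window by least squares. The sketch below is a single-window, single-scale version in plain Python; the synthetic quadratic `pattern`, the frame size and the chosen shift are assumptions for the example, not taken from the paper.

```python
def lucas_kanade(frame1, frame2, window):
    """Estimate one flow vector (u, v) for the pixels in `window`.

    frame1, frame2: 2-D lists of grey values indexed [y][x].
    window: list of (x, y) pixels, none of them on the image border.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for x, y in window:
        ix = (frame1[y][x + 1] - frame1[y][x - 1]) / 2.0  # spatial gradient in x
        iy = (frame1[y + 1][x] - frame1[y - 1][x]) / 2.0  # spatial gradient in y
        it = frame2[y][x] - frame1[y][x]                  # temporal gradient
        sxx += ix * ix
        sxy += ix * iy
        syy += iy * iy
        sxt += ix * it
        syt += iy * it
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has too little texture to determine the flow")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def pattern(x, y):
    """Synthetic smooth brightness pattern centred in a 9x9 frame (assumed for the demo)."""
    return (x - 4.0) ** 2 + (y - 4.0) ** 2


true_u, true_v = 0.2, -0.1  # sub-pixel shift applied between the two frames
frame1 = [[pattern(x, y) for x in range(9)] for y in range(9)]
frame2 = [[pattern(x - true_u, y - true_v) for x in range(9)] for y in range(9)]
window = [(x, y) for y in range(1, 8) for x in range(1, 8)]
u, v = lucas_kanade(frame1, frame2, window)
```

The determinant check reflects the aperture problem: in a window whose gradients all point the same way, the 2x2 system is singular and the flow cannot be recovered.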
TL;DR: The main objective of this paper is to present a study of various existing binarization algorithms and compare their measurements, to act as a guide for freshers starting work on binarization.
Abstract: Image binarization is an important step in OCR (Optical Character Recognition). Several methods have recently been used for image binarization, but there is no way to select a single best method that works for all images. The main objective of this paper is to present a study of various existing binarization algorithms and compare their measurements. This paper will act as a guide for freshers starting their work on binarization.
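As one concrete example of a binarization algorithm, the sketch below implements Otsu's global threshold, which picks the grey level that maximises the between-class variance of dark and bright pixels. The abstract does not list the algorithms the paper compares, so Otsu's method is chosen here only because it is widely known; the `pixels` values are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0..255) that maximises between-class variance.

    Pixels with value <= t form the dark class; the rest form the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels in the dark class so far
    sum_dark = 0  # sum of grey levels in the dark class so far
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet
        w_bright = total - w_dark
        if w_bright == 0:
            break     # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_bright = (sum_all - sum_dark) / w_bright
        between_var = w_dark * w_bright * (mean_dark - mean_bright) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (bright) if it exceeds the threshold, else 0 (dark)."""
    return [1 if p > threshold else 0 for p in pixels]


# Hypothetical flattened 8-bit image: dark ink values mixed with bright paper values.
pixels = [10, 12, 11, 200, 210, 205, 13, 198]
t = otsu_threshold(pixels)
binary = binarize(pixels, t)
```

A single global threshold like this struggles with uneven illumination, which is one reason no single method suits all images.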
TL;DR: Results show that Saudi Internet shoppers are very much influenced by eWOM, and that a larger percentage of them are dependent on such online forums when making decisions to purchase products through the Internet.
Abstract: Substantial growth in online social networks has vastly expanded the potential impact of electronic word of mouth (eWOM) on consumer purchasing decisions. A critical literature review revealed that there is limited research on the impact of online consumer reviews on the online purchasing decisions of Saudi Arabian consumers. This research reports the results of a study on the effects of online reviews on Saudi citizens' online purchasing decisions. The results show that Saudi Internet shoppers are very much influenced by eWOM, and that a larger percentage of them are dependent on such online forums when making decisions to purchase products through the Internet. General Terms E-commerce, online shopping reviews. Keywords word of mouth; eWOM; online WOM; online consumer reviews; Saudi Arabia.
TL;DR: A simple data protection model is proposed where data is encrypted using Advanced Encryption Standard (AES) before it is launched in the cloud, thus ensuring data confidentiality and security.
Abstract: With the tremendous growth of sensitive information on the cloud, cloud security is becoming more important than ever before. Cloud data and services reside in massively scalable data centers and can be accessed everywhere. The growth in cloud users has unfortunately been accompanied by a growth in malicious activity in the cloud. More and more vulnerabilities are discovered, and nearly every day new security advisories are published. Millions of users are surfing the cloud for various purposes, so they need highly safe and persistent services. The future of the cloud, especially in expanding the range of applications, involves a much deeper degree of privacy and authentication. We propose a simple data protection model where data is encrypted using the Advanced Encryption Standard (AES) before it is launched in the cloud, thus ensuring data confidentiality and security. General Terms Cloud Service Provider (CSP)