Author

Manisha Sharma

Bio: Manisha Sharma is an academic researcher from Banasthali Vidyapith. The author has contributed to research on the topics of sentiment analysis and deep learning. The author has an h-index of 6 and has co-authored 14 publications, which have received 181 citations.

Papers
Journal ArticleDOI
TL;DR: A novel deep learning architecture based on a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network is proposed, supported by introducing semantic information into the word representations with the help of knowledge bases such as WordNet and ConceptNet.
Abstract: As Internet use grows, people are increasingly connected through text messaging and social media platforms such as Facebook and Twitter. This has led to an increase in the spread of unsolicited messages, known as spam, which are used for marketing, collecting personal information, or simply to offend people. It is therefore crucial to have a strong spam detection architecture that can block such messages. Spam detection on a noisy platform such as Twitter remains a problem because of the short texts and the high variability of the language used on social media. In this paper, we propose a novel deep learning architecture based on a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network. The model is supported by introducing semantic information into the word representations with the help of knowledge bases such as WordNet and ConceptNet. Using these knowledge bases improves performance by providing better semantic vector representations for test words that would otherwise be assigned random values because they were not seen during training. Experimental results on two benchmark datasets show the effectiveness of the proposed approach in terms of accuracy and F1-score.

103 citations
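
The idea in the abstract above, giving an unseen test word a vector derived from related known words instead of a random one, can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the `synonyms` table is a hypothetical stand-in for WordNet or ConceptNet lookups, the toy `embeddings` are invented, and averaging the related vectors is an assumed combination rule.

```python
import random

def vector_for(word, embeddings, synonyms, dim=3, seed=0):
    """Return (vector, source) for a word.

    embeddings: dict mapping known words to vectors (toy data, assumption).
    synonyms:   dict mapping a word to related words; a hypothetical
                stand-in for a WordNet/ConceptNet lookup.
    """
    # Case 1: the word was seen in training, so use its learned vector.
    if word in embeddings:
        return embeddings[word], "embedding"

    # Case 2: unseen word; average the vectors of its known related words.
    known = [embeddings[s] for s in synonyms.get(word, []) if s in embeddings]
    if known:
        averaged = [sum(column) / len(known) for column in zip(*known)]
        return averaged, "synonyms"

    # Case 3: nothing related is known; fall back to a random vector.
    rng = random.Random(seed)
    return [rng.uniform(-0.25, 0.25) for _ in range(dim)], "random"


embeddings = {
    "free": [1.0, 0.0, 0.0],
    "prize": [0.0, 1.0, 0.0],
    "award": [0.0, 0.5, 0.5],
}
synonyms = {"trophy": ["prize", "award"]}

print(vector_for("trophy", embeddings, synonyms))
print(vector_for("xyzzy", embeddings, synonyms)[1])
```

Here "trophy" is absent from the embeddings but receives the mean of the "prize" and "award" vectors, while a word with no known relatives still falls back to a random vector.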

Journal ArticleDOI
TL;DR: The evaluation of the results shows that LSTM outperforms traditional machine learning methods for spam detection by a considerable margin.
Abstract: Classifying spam is a topic of ongoing research in natural language processing, especially with the growing use of the Internet for social networking. This growth has led to an increase in activity by spammers, who try to gain commercial or non-commercial advantage by sending spam messages. In this paper, we apply deep learning, an evolving family of techniques. A special architecture known as Long Short-Term Memory (LSTM), a variant of the Recurrent Neural Network (RNN), is used for spam classification. Unlike traditional classifiers, where the features are hand-crafted, it is able to learn abstract features. Before the LSTM is used for the classification task, the text is converted into semantic word vectors with the help of word2vec, WordNet and ConceptNet. The classification results are compared with benchmark classifiers such as SVM, Naive Bayes, ANN, k-NN and Random Forest. Two corpora are used for the comparison: the SMS Spam Collection dataset and a Twitter dataset. The results are evaluated using metrics such as accuracy and F-measure. The evaluation shows that LSTM outperforms traditional machine learning methods for spam detection by a considerable margin.

101 citations
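
The two metrics named in the abstract above, accuracy and F-measure, can be computed from predicted and true labels as in the sketch below. It uses the standard definitions with "spam" as the positive class. The labels are invented for illustration and are not results from the paper, and the function name `accuracy_and_f1` is hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive="spam"):
    """Compute accuracy and the F1 score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


# Invented labels for illustration only.
y_true = ["spam", "spam", "ham", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "ham", "spam", "spam", "ham"]

accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} f1={f1:.3f}")
```

With these toy labels there are two true positives, one false positive and one false negative, so precision and recall are both 2/3 and four of the six predictions are correct.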

Journal ArticleDOI
TL;DR: Describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network.
Abstract: This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. It is challenging, espec...

47 citations

Journal ArticleDOI
TL;DR: This paper proposes a novel method of extracting aspects using an ontology and then categorizing the sentiments as positive, negative or neutral using a supervised learning technique; efficiency is evaluated using information retrieval search strategies.
Abstract: Social networks have greatly increased the demand for text mining. Opinions are used to express views, and reviews are used to provide information about how a product is perceived. Online reviews can number in the thousands, so making the right decision when selecting a product becomes a very tedious task. Several research works have been proposed in the past, but they were limited by certain issues discussed in this paper. A dynamic system is proposed that extracts features using an ontology and then performs classification. Classifying information from such text is highly challenging. We propose a novel method of extracting aspects using an ontology and then categorizing the sentiments as positive, negative or neutral using a supervised learning technique. Opinion mining is a natural language processing task that mines information from various text forums and classifies it by polarity as positive, negative or neutral. In this paper, we demonstrate machine learning algorithms using the WEKA tool, and efficiency is evaluated using information retrieval search strategies.

15 citations

Journal ArticleDOI
01 Apr 2016
TL;DR: The present work focuses on the design and implementation of an Opinion Crawler, which downloads opinions from various sites while ignoring the rest of the web; on real data sets it proves much more accurate in terms of precision and recall.
Abstract: Due to the sudden and explosive growth of web technologies, a huge quantity of user-generated content is available online. People's experiences and opinions play an important role in the decision-making process. Although searching for facts on a topic is easy, retrieving opinions is still a difficult task. Studies on opinion mining must work efficiently in order to extract constructive opinionated information from these reviews. The present work focuses on the design and implementation of an Opinion Crawler, which downloads opinions from various sites while ignoring the rest of the web. It also detects web pages that are frequently updated and calculates a timestamp for revisiting them in order to extract relevant opinions. The performance of the Opinion Crawler is demonstrated on real data sets, where it proves much more accurate in terms of precision and recall.

13 citations


Cited by
Journal ArticleDOI
TL;DR: This paper aims to provide a comprehensive overview of the challenges that ML techniques face in protecting cyberspace against attacks, by presenting a literature review of ML techniques for cyber security, including intrusion detection, spam detection, and malware detection on computer networks and mobile networks, over the last decade.
Abstract: The pervasive growth and usage of the Internet and mobile applications have expanded cyberspace, which has become more vulnerable to automated and prolonged cyberattacks. Cyber security techniques provide enhanced security measures to detect and react to cyberattacks. Previously used security systems are no longer sufficient because cybercriminals are smart enough to evade conventional security systems, which lack efficiency in detecting previously unseen and polymorphic attacks. Machine learning (ML) techniques are playing a vital role in numerous applications of cyber security. However, despite the ongoing success, there are significant challenges in ensuring the trustworthiness of ML systems. Incentivized malicious adversaries in cyberspace are willing to game and exploit such ML vulnerabilities. This paper aims to provide a comprehensive overview of the challenges that ML techniques face in protecting cyberspace against attacks, by presenting a literature review of ML techniques for cyber security, including intrusion detection, spam detection, and malware detection on computer networks and mobile networks, over the last decade. It also provides brief descriptions of each ML method, frequently used security datasets, essential ML tools, and the metrics used to evaluate a classification model. Finally, it discusses the challenges of using ML techniques in cyber security. This paper provides an extensive, up-to-date bibliography and the current trends of ML in cyber security.

135 citations

Journal ArticleDOI
15 May 2020-Energies
TL;DR: A brief review of different machine learning techniques and of the developments made in detection methods for potential cybersecurity risks, including what the authors describe as the first attempt to compare the time complexity of commonly used ML models in cybersecurity.
Abstract: Cyberspace has become indispensable to all areas of the modern world, which depends more and more on the internet for everyday living. This growing dependency has also widened exposure to malicious threats. As a result, cybersecurity has become a pivotal element in the battle against cyber threats, attacks, and fraud. The expanding cyberspace is highly exposed to an intensifying risk of attack. The objective of this survey is to give a brief review of different machine learning (ML) techniques and of the developments made in detection methods for potential cybersecurity risks. These detection methods mainly comprise fraud detection, intrusion detection, spam detection, and malware detection. In this review paper, we build upon the existing literature on applications of ML models in cybersecurity and provide a comprehensive review of ML techniques in cybersecurity. To the best of our knowledge, we have made the first attempt to compare the time complexity of commonly used ML models in cybersecurity. We have comprehensively compared each classifier's performance based on frequently used datasets and sub-domains of cyber threats. This work also provides a brief introduction to machine learning models as well as commonly used security datasets. Despite its high priority, cybersecurity has its constraints, compromises, and challenges. This work also expounds on the current challenges and limitations faced when applying machine learning techniques in cybersecurity.

118 citations

Journal ArticleDOI
TL;DR: Deep learning is used to classify Spam and Not-Spam text messages using Convolutional Neural Network and Long Short-Term Memory models, which achieved a remarkable accuracy of 99.44% on a benchmark dataset.

118 citations
