Author

B. S. Harish

Sri Jayachamarajendra College of Engineering

Other affiliations: MVJ College of Engineering, University of Mysore

Bio: B. S. Harish is an academic researcher from Sri Jayachamarajendra College of Engineering. The author has contributed to research on topics including cluster analysis and feature selection, has an h-index of 13, and has co-authored 66 publications that have received 605 citations. Previous affiliations of B. S. Harish include MVJ College of Engineering and the University of Mysore.

Papers published on a yearly basis (chart covering 2009 to 2023; per-year counts not recoverable)

Papers

Journal Article

Representation and Classification of Text Documents: A Brief Review

B. S. Harish, Devanur S. Guru, S. S. Manjunath

17 Aug 2010-International Journal of Computer Applications

TL;DR: This review presents various text representation schemes, compares the classifiers used to assign text documents to predefined classes, and contrasts the existing methods on qualitative parameters.

Abstract: Text classification is one of the important research issues in the field of text mining, where documents are classified using supervised knowledge. The literature offers many text representation schemes and classifiers/learning algorithms for assigning text documents to predefined categories. In this paper, we present various text representation schemes and compare different classifiers used to classify text documents to the predefined classes. The existing methods are compared and contrasted based on qualitative parameters, viz. criteria used for classification, algorithms adopted and classification time complexities. General Terms: Pattern Recognition, Text Mining, Algorithms.

103 citations

Journal Article

Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method

H. M. Keerthi Kumar, B. S. Harish, H. K. Darshan

01 Jan 2019-International Journal of Interactive Multimedia and Artificial Intelligence

TL;DR: This work shows that the use of Hybrid features obtained by concatenating Machine Learning features (TF, TF-IDF) with Lexicon features (Positive-Negative word count, Connotation) gives better results when tested against classifiers like SVM, Naive Bayes, KNN and Maximum Entropy.

Abstract: Social networking sites have become popular and common places for sharing a wide range of emotions through short texts. These emotions include happiness, sadness, anxiety, fear, etc. Analyzing short texts helps in identifying the sentiment expressed by the crowd. Sentiment analysis on IMDb movie reviews identifies the overall sentiment or opinion expressed by a reviewer towards a movie. Many researchers are working on pruning the sentiment analysis model that clearly identifies and distinguishes between a positive review and a negative review. In the proposed work, we show that the use of hybrid features obtained by concatenating machine learning features (TF, TF-IDF) with lexicon features (positive-negative word count, connotation) gives better results, both in terms of accuracy and complexity, when tested against classifiers like SVM, Naive Bayes, KNN and Maximum Entropy. The proposed model clearly differentiates between a positive review and a negative review. Since understanding the context of the reviews plays an important role in classification, using hybrid features helps in capturing the context of the movie reviews and hence increases the accuracy of classification.
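As a rough illustration of the hybrid feature idea described in this abstract, the sketch below concatenates a hand-computed TF-IDF vector with positive and negative word counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the `POSITIVE` and `NEGATIVE` word sets, the function names, the whitespace tokenisation and the unsmoothed TF-IDF formula are all hypothetical choices made for the example.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the lexicon used in the paper is not given here.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "boring", "awful"}


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF rows) for a list of tokenised documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    rows = []
    for doc in docs:
        counts = Counter(doc)
        # Relative term frequency times unsmoothed inverse document frequency.
        row = [
            (counts[t] / len(doc)) * math.log(n_docs / df[t]) if doc else 0.0
            for t in vocab
        ]
        rows.append(row)
    return vocab, rows


def lexicon_features(doc):
    """Positive and negative word counts for one tokenised document."""
    pos = sum(tok in POSITIVE for tok in doc)
    neg = sum(tok in NEGATIVE for tok in doc)
    return [float(pos), float(neg)]


def hybrid_features(docs):
    """Concatenate each TF-IDF row with the two lexicon counts."""
    vocab, rows = tfidf_vectors(docs)
    return vocab, [row + lexicon_features(doc) for row, doc in zip(rows, docs)]


reviews = [
    "great film with excellent acting".split(),
    "boring plot and awful acting".split(),
]
vocab, features = hybrid_features(reviews)
```

Each row of `features` has one entry per vocabulary term followed by the two lexicon counts, so it can be passed to any classifier that accepts fixed-length numeric vectors.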

58 citations

Journal Article

A Survey on various Machine Learning Approaches for ECG Analysis

C. K. Roopa, B. S. Harish

17 Apr 2017-International Journal of Computer Applications

TL;DR: The main objective of this paper is to review the various machine learning approaches for diagnosing myocardial infarction and for differentiating arrhythmias (heartbeat variation), hypertrophy (increased thickness of the heart muscle) and enlargement of the heart.

Abstract: An electrocardiogram (ECG) is a trace of P, QRS and T waves showing the electrical activity of the heart. Feature extraction and segmentation in ECG play a significant role in diagnosing most cardiac diseases. The main objective of this paper is to review the various machine learning approaches for diagnosing myocardial infarction (heart attack) and for differentiating arrhythmias (heartbeat variation), hypertrophy (increased thickness of the heart muscle) and enlargement of the heart. Further, we also present various machine learning approaches and compare the different methods and results used to analyze the ECG. The existing methods are compared and contrasted based on qualitative and quantitative parameters, viz. purpose of the work, algorithms adopted and results obtained.

50 citations

Proceedings Article

Sentiment analysis for sarcasm detection on streaming short text data

Anukarsh G. Prasad¹, S. Sanjana¹, Skanda M. Bhat¹, B. S. Harish¹

Sri Jayachamarajendra College of Engineering¹

01 Oct 2017

TL;DR: This paper compares various classification algorithms such as Random Forest, Gradient Boosting, Decision Tree, Adaptive Boost, Logistic Regression and Gaussian Naïve Bayes to detect sarcasm in tweets from the Twitter Streaming API and chooses the best classifier to provide the best possible accuracy.

Abstract: The growth of social media has been exponential in recent years. An immense amount of data is being put into the public domain through social media. This huge body of publicly available data can be used for research and a variety of applications. The objective of this paper is to counter problems with social media datasets, namely: their short-text nature (a limited quantity of text, 140 to 160 characters), their continuous streaming nature, the use of short forms and modern slang, and the increasing use of sarcasm in messages and posts. Sarcastic tweets can mislead data mining activities and result in wrong classification. This paper compares various classification algorithms such as Random Forest, Gradient Boosting, Decision Tree, Adaptive Boost, Logistic Regression and Gaussian Naive Bayes to detect sarcasm in tweets from the Twitter Streaming API. The best classifier is chosen and paired with various pre-processing and filtering techniques using emoji and slang dictionary mapping to provide the best possible accuracy. The emoji and slang dictionary mapping is the novel idea introduced in this paper. The obtained results can be used as input to other research and applications.
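The emoji and slang dictionary mapping mentioned in this abstract could look roughly like the sketch below. It is only an illustration under assumptions: the `SLANG` and `EMOJI` dictionaries, the `normalise` function and the whitespace tokenisation are hypothetical, and the paper's actual dictionaries and pre-processing steps are not reproduced here.

```python
# Hypothetical dictionaries; the mappings used in the paper are not given here.
SLANG = {"gr8": "great", "u": "you", "idk": "i do not know"}
EMOJI = {"🙄": "eye_roll", "😂": "laughing"}


def normalise(tweet):
    """Lower-case a tweet and map known emoji and slang tokens to plain words."""
    # Pad each emoji replacement with spaces so it becomes its own token.
    for symbol, name in EMOJI.items():
        tweet = tweet.replace(symbol, f" {name} ")
    tokens = tweet.lower().split()
    # Punctuation handling is deliberately left out of this sketch.
    return " ".join(SLANG.get(tok, tok) for tok in tokens)


cleaned = normalise("Oh gr8 another Monday🙄")
```

The normalised text can then be fed to whichever feature extractor and classifier are being compared.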

47 citations

Proceedings Article

SSERBC 2017: Sclera segmentation and eye recognition benchmarking competition

Abhijit Das¹, Umapada Pal², Miguel Ferrer³, Michael Blumenstein¹, Dejan Stepec⁴, Peter Rot⁴, Ziga Emersic⁴, Peter Peer⁴, Vitomir Struc⁴, S. V. Aruna Kumar, B. S. Harish

Griffith University¹, Indian Statistical Institute², University of Las Palmas de Gran Canaria³, University of Ljubljana⁴

01 Oct 2017-International Joint Conference on Biometrics (IJCB)

TL;DR: The aim of this competition was to record the recent developments in sclera segmentation and eye recognition in the visible spectrum (using iris, sclera and peri-ocular, and their fusion), and also to gain the attention of researchers on this subject.

Abstract: This paper summarises the results of the Sclera Segmentation and Eye Recognition Benchmarking Competition (SSERBC 2017). It was organised in the context of the International Joint Conference on Biometrics (IJCB 2017). The aim of this competition was to record the recent developments in sclera segmentation and eye recognition in the visible spectrum (using iris, sclera and peri-ocular, and their fusion), and also to gain the attention of researchers on this subject. In this regard, we have used the Multi-Angle Sclera Dataset (MASD version 1). It comprises 2624 images taken from both eyes of 82 identities. Therefore, it consists of images of 164 (82×2) eyes. A manual segmentation mask of these images was created to baseline both tasks. Precision and recall based statistical measures were employed to evaluate the effectiveness of the segmentation and the ranks of the segmentation task. A recognition accuracy measure has been employed to measure the recognition task. Manually segmented sclera, iris and peri-ocular regions were used in the recognition task. Sixteen teams registered for the competition, and among them, six teams submitted their algorithms or systems for the segmentation task and two of them submitted their recognition algorithms or systems. The results produced by these algorithms or systems reflect current developments in the literature of sclera segmentation and eye recognition, employing cutting edge techniques. The MASD version 1 dataset with some of the ground truth will be freely available for research purposes. The success of the competition also demonstrates the recent interest of researchers from academia as well as industry in this subject.
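The abstract says only that precision and recall based measures were used to score segmentation against a manual mask. One common way to compute such measures is pixel by pixel, as in the sketch below. This is an assumption made for illustration: the function name `precision_recall`, the nested-list mask format and the toy masks are hypothetical, and the competition's exact scoring protocol is not stated here.

```python
def precision_recall(predicted, truth):
    """Pixel-wise precision and recall for two binary masks of equal shape."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # pixel correctly marked as sclera
            elif p and not t:
                fp += 1  # pixel wrongly marked as sclera
            elif t and not p:
                fn += 1  # sclera pixel that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy 2x3 masks: 1 marks a sclera pixel, 0 marks background.
truth = [[0, 1, 1], [0, 1, 0]]
predicted = [[0, 1, 0], [1, 1, 0]]
precision, recall = precision_recall(predicted, truth)
```

For these toy masks there are two true positives, one false positive and one false negative, so precision and recall both come to 2/3.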

40 citations

Cited by

Journal Article

Machine learning

Thomas G. Dietterich¹

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article

Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance

Abdelaziz Merghadi, Ali P. Yunus¹, Jie Dou²,³, J. Whiteley⁴,⁵, Binh Thai Pham, Dieu Tien Bui⁶, Ram Avtar⁷, Boumezbeur Abderrahmane

Chengdu University of Technology¹, Nagaoka University of Technology², China University of Geosciences (Wuhan)³, University of Bristol⁴, British Geological Survey⁵, Sewanee: The University of the South⁶, Hokkaido University⁷

01 Aug 2020-Earth-Science Reviews

TL;DR: An extensive analysis and comparison between different ML techniques using a case study from Algeria is undertaken, noting that tree-based ensemble algorithms achieve excellent results compared to other machine learning algorithms and that the Random Forest algorithm offers robust performance for accurate landslide susceptibility mapping with only a small number of adjustments required before training the model.

362 citations

Journal Article

Multi-co-training for document classification using various document representations: TF–IDF, LDA, and Doc2Vec

Donghwa Kim¹, Deokseong Seo¹, Suhyoun Cho¹, Pilsung Kang¹

Korea University¹

01 Mar 2019-Information Sciences

TL;DR: This paper transforms a document using three document representation methods: term frequency–inverse document frequency (TF–IDF) based on the bag-of-words scheme, topic distribution based on latent Dirichlet allocation (LDA), and neural-network-based document embedding known as document to vector (Doc2Vec).

270 citations

Journal Article

Text classification and classifiers: a survey

V Korde, C N Mahender

31 Mar 2012-International Journal of Artificial Intelligence & Applications

TL;DR: This paper introduces text classification and its process, gives an overview of classifiers, and compares some existing classifiers on a few criteria such as time complexity, principle and performance.

Abstract: As most information (over 80%) is stored as text, text mining is believed to have high commercial potential. Knowledge may be discovered from many sources of information; yet unstructured texts remain the largest readily available source of knowledge. Text classification assigns documents to predefined categories. In this paper we introduce text classification and its process, give an overview of classifiers, and compare some existing classifiers on a few criteria such as time complexity, principle and performance.

238 citations

Journal Article

The language of evaluation

Tara Black

01 Dec 2017-English in Aotearoa

TL;DR: In English, I'm going to argue that evaluation is about assessing different possibilities for meaning and drawing conclusions.

Abstract: Evaluation is hard. It's one of those command words that comes up in a range of contexts. In Math it is to do with substituting variables in order to solve an equation. In many subjects it's about looking at the pros and cons of a situation in order to make a decision. In English, I'm going to argue that it is about assessing different possibilities for meaning and drawing conclusions. In all cases, it's a process.

224 citations
