Author

Isaac Kor

Bio: Isaac Kor is an academic researcher. The author has contributed to research in topics: Ensemble forecasting & Regression analysis. The author has an h-index of 1 and has co-authored 1 publication receiving 2 citations.

Papers
Journal Article
TL;DR: It is concluded that the ensemble model brings the average correlation coefficient (one of the model's evaluation criteria) to 74.4 ± 16.4, which is an acceptable result.
Abstract: Users' comments on social media play an important role in creating or changing people's perceptions of certain topics, or in popularizing them. Such comments now have an important place in various fields, including education, sales, and prediction. In this paper, the Facebook social network is considered as a case study. The purpose of the study is to predict the volume of comments that Facebook users leave on published content (a post), so the problem is treated as a regression problem. In the method presented in this paper, three regression models (elastic net, the M5P model, and a radial basis function regression model) are combined into an ensemble model that predicts the volume of comments. To combine these base models, a strategy called stacked generalization is used: the outputs of the base models are provided as new features to a linear regression model, which combines the outputs of the three base models and determines the final output of the system. To evaluate the performance of the proposed model, a dataset from the UCI repository, which has 5 training sets and 10 test sets, is used. Each test set has 100 records. The efficiency of the base models and of the proposed ensemble model is evaluated on all of these sets. Finally, it is concluded that the ensemble model brings the average correlation coefficient (one of the model's evaluation criteria) to 74.4 ± 16.4, which is an acceptable result.
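The stacking strategy described in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the three base learners are simple stand-ins (two single-feature least-squares lines and a k-nearest-neighbour regressor) rather than elastic net, M5P, and an RBF model; the level-1 linear regression is fit on a held-out half of the data; and the data set is synthetic.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line y = a + b*x on one feature; returns a predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def fit_knn(rows, ys, k=3):
    """k-nearest-neighbour regressor: mean target of the k closest rows."""
    def predict(row):
        order = sorted(range(len(rows)),
                       key=lambda i: sum((p - q) ** 2 for p, q in zip(rows[i], row)))
        return mean(ys[i] for i in order[:k])
    return predict

def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_linear(features, ys):
    """Multivariate least squares with intercept, via the normal equations."""
    design = [[1.0] + list(f) for f in features]
    cols = len(design[0])
    ata = [[sum(r[i] * r[j] for r in design) for j in range(cols)] for i in range(cols)]
    aty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(cols)]
    w = solve(ata, aty)
    return lambda f: w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))

def fit_stack(rows, ys, holdout=0.5):
    """Fit base models on the first part of the data, the meta-model on the rest."""
    cut = int(len(rows) * (1 - holdout))
    base_rows, base_ys = rows[:cut], ys[:cut]
    line0 = fit_line([r[0] for r in base_rows], base_ys)
    line1 = fit_line([r[1] for r in base_rows], base_ys)
    bases = [lambda r: line0(r[0]), lambda r: line1(r[1]), fit_knn(base_rows, base_ys)]
    # Base-model outputs on unseen rows become the meta-model's input features.
    meta_features = [[b(r) for b in bases] for r in rows[cut:]]
    meta = fit_linear(meta_features, ys[cut:])
    return bases, (lambda r: meta([b(r) for b in bases]))

def mse(model, data_rows, data_ys):
    return mean((model(r) - y) ** 2 for r, y in zip(data_rows, data_ys))

# Synthetic data (hypothetical): two features and a noisy target.
random.seed(0)
rows = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(60)]
ys = [2 * a + 3 * b + 0.1 * a * b + random.gauss(0, 1) for a, b in rows]
bases, stacked = fit_stack(rows, ys)
print("stacked MSE on the meta set:", mse(stacked, rows[30:], ys[30:]))
```

Fitting the meta-model on rows the base models never saw keeps it from simply rewarding whichever base model memorized the training data best. Cross-validated out-of-fold predictions are a common alternative to the single holdout split used here.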

4 citations


Cited by
Journal Article
01 Apr 2022-Sensors
TL;DR: In this article, a novel algorithm is proposed that facilitates demodulation of the surrounding refractive index (SRI) via cladding mode interrogation and accelerates calibration and measurement of the SRI.
Abstract: For a fiber sensor based on surface plasmon resonance (SPR) in a tilted fiber Bragg grating (TFBG), a novel algorithm is proposed that facilitates demodulation of the surrounding refractive index (SRI) via cladding mode interrogation and accelerates calibration and measurement of the SRI. Refractive indices with a tiny index step of 2.2 × 10⁻⁵ are prepared by diluting an aqueous glucose solution for testing and calibrating the fiber sensor probe. To accelerate the calibration process, automatic selection of the most sensitive cladding mode is demonstrated. First, the peaks of the transmitted spectrum are identified and numbered. Then, the amplitude sensitivities of several potentially sensitive cladding modes adjacent to the left of the SPR area are calculated and compared. After that, we focus on the amplitudes of the cladding modes as a function of the SRI, and the highest sensitivity of −6887 dB/RIU (refractive index unit) is obtained with a scanning time of 15.77 s in the range from 1520 nm to 1620 nm. To speed up scanning by the optical spectrum analyzer (OSA), the wavelength resolution is coarsened from 0.028 nm to 0.07 nm, 0.14 nm, and 0.28 nm, which shortens the scanning time to 6.31 s, 3.15 s, and 1.58 s, respectively. However, compared to 0.028 nm, the SRI sensitivity for 0.07 nm, 0.14 nm, and 0.28 nm is reduced to −5685 dB/RIU (17.5% less), −5415 dB/RIU (21.4% less), and −4359 dB/RIU (36.7% less), respectively. Thanks to a parabolic-equation calculation and weighted Gauss fitting based on the original data, the sensitivity is improved to −6332 dB/RIU and −6721 dB/RIU, respectively, for 0.07 nm, and to −5850 dB/RIU and −6228 dB/RIU, respectively, for 0.14 nm.
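The trade-off reported in this abstract is easy to check numerically. The sketch below recomputes the percentage sensitivity losses from the quoted dB/RIU values and, as an assumption that happens to match the quoted figures, models the scan time as inversely proportional to the wavelength resolution step. The function and constant names are hypothetical.

```python
BASE_RESOLUTION_NM = 0.028   # finest resolution quoted in the abstract
BASE_SCAN_TIME_S = 15.77     # scan time at that resolution
BASE_SENSITIVITY = -6887.0   # dB/RIU at that resolution

def scan_time(resolution_nm):
    """Scan time in seconds, ASSUMING it scales inversely with the resolution step."""
    return BASE_SCAN_TIME_S * BASE_RESOLUTION_NM / resolution_nm

def sensitivity_loss_percent(sensitivity):
    """Relative loss in sensitivity magnitude versus the 0.028 nm baseline."""
    return 100.0 * (1.0 - sensitivity / BASE_SENSITIVITY)

# Sensitivities quoted in the abstract for the coarser resolutions (dB/RIU).
reported = {0.07: -5685.0, 0.14: -5415.0, 0.28: -4359.0}

for resolution, sensitivity in reported.items():
    print(f"{resolution} nm: {scan_time(resolution):.2f} s, "
          f"{sensitivity_loss_percent(sensitivity):.1f}% less sensitive")
```

Under that inverse-scaling assumption the computed scan times round to 6.31 s, 3.15 s, and 1.58 s, and the losses to 17.5%, 21.4%, and 36.7%, which agrees with the values stated in the abstract.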

4 citations

Journal Article
TL;DR: The sentiment analysis and the clusters formed indicate a very pronounced dispersion: the distances are not very similar, even though the data were standardized.
Abstract: Today, web content such as images, text, speech, and video is user-generated, and social networks have become increasingly popular as a means for people to share their ideas and opinions. One of the most popular social media platforms for expressing feelings about current events is Twitter. The main objective of this study is to classify and analyze the content published on Twitter by the affiliates of the Pension and Funds Administration (AFP). The study incorporates machine learning techniques for data mining, cleaning, tokenization, exploratory analysis, classification, and sentiment analysis. To collect and examine the data, tweets with the hashtag #afp were gathered, followed by descriptive and exploratory analysis, including metrics of the tweets. A content analysis was then carried out, including word frequency calculation, lemmatization, classification of words by sentiment and emotion, and a word cloud. The study uses tweets published in May 2022. Sentiment was distributed across three polarity classes: positive, neutral, and negative, representing 22%, 4%, and 74% respectively. Using unsupervised learning with the K-Means algorithm, we determined the number of clusters with the elbow method. Finally, the sentiment analysis and the clusters formed indicate a very pronounced dispersion: the distances are not very similar, even though the data were standardized. Keywords: Techniques; machine learning; classification; Twitter
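The elbow method mentioned in this abstract can be illustrated with a small plain-Python K-Means run. This is a sketch under assumptions, not the study's pipeline: the data are twelve toy 2-D points in three tight groups rather than tweet features, the initialization is a deterministic farthest-point rule, and the "elbow" is chosen as the k with the largest second difference of the inertia curve, which is only one of several possible heuristics.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centers(points, k):
    """Deterministic farthest-point initialisation (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans_inertia(points, k, iterations=50):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = init_centers(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def elbow(inertias):
    """Pick the k where the inertia curve bends most (largest second difference)."""
    candidates = sorted(inertias)[1:-1]
    return max(candidates,
               key=lambda k: inertias[k - 1] - 2 * inertias[k] + inertias[k + 1])

# Toy data (hypothetical): three tight groups of four points each.
group_centers = [(0, 0), (10, 0), (0, 10)]
offsets = [(-0.5, 0), (0.5, 0), (0, -0.5), (0, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in group_centers for dx, dy in offsets]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 7)}
print("inertia by k:", inertias)
print("elbow at k =", elbow(inertias))
```

On well-separated data like this the inertia drops sharply up to the true number of groups and then flattens. The abstract's finding of pronounced dispersion suggests the curve for the tweet data bends far less clearly.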

3 citations

Proceedings Article
01 Jan 2013
TL;DR: An efficient method for detecting masses in mammograms is introduced and evaluated on both the mini-MIAS and INBreast databases; the mini-MIAS results show that the algorithm outperforms competing methods.
Abstract: Mammography is the most effective procedure for early detection of breast abnormalities. Masses are a type of abnormality that is very difficult to detect visually on mammograms. In this paper, an efficient method for detecting masses in mammograms is introduced and tested. The algorithm is inspired by binary search and was evaluated on both the mini-MIAS and INBreast databases. The mini-MIAS results show that our algorithm outperforms other competing methods. For the INBreast database there are no other published mass detection results for comparison, but we believe that our algorithm has good performance.

3 citations

Journal Article
TL;DR: The numerical results indicate that the proposed method can be used as a promising method for detecting phishing websites, and the stacked generalization strategy is used as an ensemble strategy.
Abstract: Phishing is a social engineering technique that deceives users in order to obtain confidential information such as usernames, passwords, or bank account details. One of the most important challenges on the Internet today is the risk of phishing attacks and Internet scams. These attacks cost the United States billions of dollars a year, and researchers have made great efforts to identify and combat them. Accordingly, the present study aims to evaluate methods for identifying phishing websites. The research is applied in terms of its objectives and descriptive-analytical in nature. In this article, a classification approach is used to identify phishing websites. From a machine learning point of view, if a suitable strategy is used, the combined votes of different classifiers can increase classification accuracy. In the proposed method, three inherently different ensemble classifiers (bagging, AdaBoost, and rotation forest) are employed, and stacked generalization is used as the strategy for combining them. A relatively new dataset is used to evaluate the performance of the proposed method. The dataset was added to the UCI repository in 2015 and uses 30 features that appear to be appropriate for distinguishing phishing from non-phishing websites. The study uses 10-fold cross-validation as its evaluation strategy. The numerical results indicate that the proposed method is promising for detecting phishing websites: it achieves an F-score of 96.3, which is a good result for phishing detection.
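The two evaluation ingredients named in this abstract, 10-fold cross-validation and the F-score, can be sketched as follows. This is a minimal illustration, not the study's code: the fold assignment is a simple interleaved split, the function names are hypothetical, and the confusion counts in the example are made up, since the abstract reports only the final F-score.

```python
def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Hypothetical confusion counts for one fold: 90 phishing sites caught,
# 10 legitimate sites flagged by mistake, 10 phishing sites missed.
print(f"F-score: {f_score(tp=90, fp=10, fn=10):.3f}")

# Each of the 10 folds serves once as the test set.
for train, test in k_fold_indices(25, k=10):
    print(len(train), "train /", len(test), "test")
```

In practice the classifier is retrained on each training split and the per-fold scores are averaged. Shuffling or stratifying the samples before splitting is a common refinement that this sketch leaves out.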

1 citation