Author

Mochammad Yusa

Other affiliations: Magister, STMIK AMIKOM Yogyakarta
Bio: Mochammad Yusa is an academic researcher from the University of Bengkulu. The author has contributed to research on the topics Naive Bayes classifier and Tuple. The author has an h-index of 4 and has co-authored 13 publications receiving 30 citations. Previous affiliations of Mochammad Yusa include Magister and STMIK AMIKOM Yogyakarta.

Papers
17 Jul 2017
TL;DR: Classification is a data mining technique that aims to predict the target class accurately using related variables; this study explores the performance of several Decision Tree algorithms on a dataset.
Abstract: Classification is one of the techniques in data mining. The objective of classification is to predict the target class accurately using related variables. Data mining offers many algorithmic models for classification. Their performance differs and depends heavily on the number of attributes and records in the dataset. The dataset used here concerns the readmission process of diabetic patients. It still contained missing values, so a data preprocessing stage was carried out in this study. After preprocessing, the dataset consisted of 47 attributes and 49,735 records. This study also explores the performance of several Decision Tree algorithms on the dataset. The classification algorithms evaluated are ID3, C4.5, and CART. The validation technique used is 10-fold Cross Validation. The results show that the C4.5 classification model has the best performance: 54.13% accuracy and an execution time of 6 seconds.
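
The three algorithms named in the abstract above differ mainly in how they score a candidate split: ID3 uses information gain, C4.5 uses gain ratio, and CART uses Gini impurity. The sketch below illustrates those criteria on a toy categorical attribute. It is not the study's implementation: the function names and the toy data are assumptions, and real CART builds binary splits, whereas this sketch scores a multiway split for simplicity.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def partition(values, labels):
    """Group class labels by the attribute value of each record."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())

def information_gain(values, labels):
    """ID3 criterion: entropy before the split minus weighted entropy after it."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    """C4.5 criterion: information gain normalised by the split information."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0

def gini_gain(values, labels):
    """CART-style criterion: decrease in Gini impurity (multiway here, an assumption)."""
    total = len(labels)
    weighted = sum(len(g) / total * gini(g) for g in partition(values, labels))
    return gini(labels) - weighted

# Toy attribute that separates the two classes perfectly (hypothetical data).
attribute = ["a", "a", "b", "b"]
readmitted = ["yes", "yes", "no", "no"]
print(information_gain(attribute, readmitted),
      gain_ratio(attribute, readmitted),
      gini_gain(attribute, readmitted))
```

On a perfectly separating attribute all three criteria reach their maximum for this label distribution, which is why the choice of criterion matters mostly on noisy attributes or attributes with many distinct values.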

8 citations

Journal Article
TL;DR: The result shows that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, but Naive Bayes outperforms the other classifiers in terms of MAE.

6 citations

Proceedings Article
01 Sep 2018
TL;DR: The experiment showed that Scenario 2, a single-keyword search followed by filtering, was the better keyword-finding mechanism: it collected 1,323 data items, while Scenario 1, which uses keyword combinations, collected only 438.
Abstract: The effects of a drug can change when a patient takes two or more drugs, or takes a drug with a specific food or beverage, resulting in an unexpected adverse reaction. The abundance of data on social media can be a valuable basis for building drug-interaction knowledge that pharmaceutical experts can use to address this problem. The data in this study come from Twitter via its standard search API and were collected with the R software environment through its Twitter library. We compared two data-collection strategies to find which one retrieves the larger number of results. Scenario 1 was based on a combination of two or more keywords, while Scenario 2 used a single-keyword search followed by a filtering process. A list of keywords related to drug and food interactions was defined beforehand. There were 20 keywords to be tested, divided into 5 cases and categorized into 4 attributes for each case: Event (K), Drug (O), Effect (E), and Food/Beverage (M). The experiment showed that Scenario 2, a single-keyword search followed by filtering, was the better keyword-finding mechanism: it collected 1,323 data items, while Scenario 1, which uses keyword combinations, collected only 438.
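
The difference between the two scenarios can be sketched on an in-memory list of posts. This is only an illustration under stated assumptions: the abstract does not describe how the filter works, so the "any filter term" rule, the function names, the placeholder drug name `drugx`, and the sample posts are all hypothetical, and no Twitter API call is made.

```python
# Hypothetical sample posts; "drugx" is a placeholder, not a real drug name.
POSTS = [
    "took drugx with juice and felt dizzy",
    "drugx and headache again today",
    "juice is great in the morning",
    "my doctor changed my drugx dose",
]

def search_combined(posts, keywords):
    """Scenario 1 (sketch): a post matches only if it contains every keyword."""
    return [p for p in posts if all(k in p for k in keywords)]

def search_single_then_filter(posts, keyword, filter_terms):
    """Scenario 2 (sketch): broad single-keyword search, then a local filter.

    The filter rule (keep posts that mention any filter term) is an assumption.
    """
    hits = [p for p in posts if keyword in p]
    return [p for p in hits if any(term in p for term in filter_terms)]

scenario_1 = search_combined(POSTS, ["drugx", "juice", "dizzy"])
scenario_2 = search_single_then_filter(POSTS, "drugx", ["juice", "dizzy", "headache"])
print(len(scenario_1), len(scenario_2))
```

Under these assumptions the combined query is the stricter one, because every keyword must co-occur in a single post, which is one plausible reason a single-keyword search with later filtering would return more results.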

6 citations

Proceedings Article
Mochammad Yusa, Ema Utami
01 Oct 2017
TL;DR: The study indicates that as the number of tuples increases, MAE and Accuracy performance weakens, whereas Kappa statistic performance tends to fluctuate.
Abstract: The aim of this study is to compare the performance of several classifiers in relation to the number of tuples. Several performance metrics are considered: Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. Different numbers of tuples are considered as well. The dataset used in the experiments, on the readmission process of diabetic patients, consists of 47 features and 49,736 tuples. The methodology starts with a preprocessing phase. After that, the clean dataset is randomly divided into 5 subsets, one for every multiple of 10,000 tuples. Each subset is validated with three traditional classifiers: Naive Bayes, K-Nearest Neighbor (k-NN), and Decision Tree. We also vary some parameter settings of each classifier except Naive Bayes. The validation method used in this research is 10-Fold Cross-Validation. Finally, we compare the performance of the classifiers based on the number of tuples. Our study indicates that as the number of tuples increases, MAE and Accuracy performance weakens, whereas Kappa statistic performance tends to fluctuate. Our study also found that Naive Bayes outperforms k-NN and Decision Tree overall. The top classifier performances were reached in the 20,000-tuple evaluation.
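
The three metrics named in the abstract above can be computed as in the following sketch. The function names and toy predictions are assumptions, and the MAE definition used here (mean absolute difference between predicted class probabilities and the one-hot true class) is one common convention for classifiers, not necessarily the one the study used.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected) if expected < 1 else 0.0

def mean_absolute_error(y_true, probas, classes):
    """MAE over predicted class probabilities versus the one-hot true class (assumed convention)."""
    total = 0.0
    for truth, proba in zip(y_true, probas):
        total += sum(abs(proba[c] - (1.0 if c == truth else 0.0)) for c in classes)
    return total / (len(y_true) * len(classes))

# Toy predictions (hypothetical data).
y_true = ["yes", "yes", "no", "no"]
y_pred = ["yes", "no", "no", "no"]
probas = [
    {"yes": 0.9, "no": 0.1},
    {"yes": 0.4, "no": 0.6},
    {"yes": 0.2, "no": 0.8},
    {"yes": 0.0, "no": 1.0},
]
print(accuracy(y_true, y_pred),
      cohen_kappa(y_true, y_pred),
      mean_absolute_error(y_true, probas, ["yes", "no"]))
```

Kappa discounts the agreement a classifier would get by chance from the class distribution alone, which may be why it can move differently from raw accuracy as the subset size changes.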

5 citations

Proceedings Article
06 May 2020
TL;DR: This research collected 360° photos of each building and room, then built a web-based virtual tour application that displays the location of, and directions to, each building or office, with information on each building or outdoor area shown according to the shape of the 360° object model.
Abstract: Advances in photography technology have produced the 360° camera, which carries an ultra-wide twin-lens folded optical system (a small double lens with a folding design). This technology can produce spherical panoramic images. Spherical panoramas are borderless, seamless image objects used to create virtual tours. Bengkulu University, with the broadest environment in Southeast Asia, has more than 30 buildings with several floors and rooms, which makes it challenging for incoming guests to find the location of a building or event. The purpose of this study is to analyze how well a 360° virtual tour application generates a realistic perception when introducing the Bengkulu University buildings to guests, using usability testing. The research was carried out by collecting 360° photos of each building and room, then building a web-based virtual tour application that displays the location of, and directions to, each building or office; information on each building or outdoor area is displayed according to the shape of the 360° object model. Users can run the virtual tour application with a smartphone and a VR box to get a realistic perception of being in the building or room. Usability testing was done with a questionnaire based on three components of usability (effectiveness, efficiency, and satisfaction) to obtain the level of perception generated by buildings with 360° object modelling. Usability testing of the 360° Virtual Tour of Bengkulu University was carried out with a total of 100 respondents. The overall usability result was 94% (Effectiveness 96.1%, Efficiency 96.6%, and Satisfaction 89.2%).
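
The abstract does not say how the overall 94% was derived from the three components. As a quick check, and only as an assumption about the aggregation, an unweighted mean of the three reported percentages is consistent with that figure:

```python
# Component scores as reported in the abstract (percent).
components = {"effectiveness": 96.1, "efficiency": 96.6, "satisfaction": 89.2}

# Assumption: the overall score is the unweighted mean of the three components.
overall = sum(components.values()) / len(components)
print(round(overall, 2), round(overall))
```

The mean is about 93.97, which rounds to the reported 94%.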

3 citations


Cited by
Journal Article
31 Jul 2020
TL;DR: Testing the performance of various methods on a dataset is one way to determine the appropriate classification method; the problem addressed in this study is how to measure the performance of classification methods in handling a dataset of diabetes patients.
Abstract: Diabetes is a long-lasting, or chronic, disease characterized by blood sugar (glucose) levels that are high or above normal. If diabetes is not well controlled, … Testing the performance of various methods on a dataset is one way to determine the appropriate classification method; the problem addressed in this study is how to measure the performance of classification methods in handling a dataset of diabetes patients. The method used is the K-Nearest Neighbor (KNN) algorithm, which classifies an object based on the training data closest in distance to that object. In the final results of this study, the highest accuracy was 39% at K=3, the highest precision was 65% at K=3 and K=5, the highest recall was 36% at K=3, and the highest F-Measure was 46% at K=3.
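
A minimal sketch of the KNN idea and of the reported metrics follows. It assumes Euclidean distance, majority voting, and a single positive class for precision, recall, and F-measure; the abstract does not state the distance function or the averaging scheme, and the toy data and function names are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Two well-separated toy clusters (hypothetical data).
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["neg", "neg", "neg", "pos", "pos", "pos"]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))
print(knn_predict(train_X, train_y, (5.0, 5.1), k=3))
print(precision_recall_f1(["pos", "pos", "neg", "neg"], ["pos", "neg", "pos", "neg"], "pos"))
```

Trying several values of K, as the study did with K=3 and K=5, only requires calling `knn_predict` with a different `k` and recomputing the metrics.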

28 citations

Proceedings Article
13 Mar 2019
TL;DR: This study focuses on developing sentiment analysis using lexicons and multiplication polarity in Twitter, which needs to be improved in terms of its semantics.
Abstract: Twitter is a widely used type of social media. Users use Twitter to share their tweets with the general public. The number of Twitter users has reached 330 million people worldwide. Tweets can also express sentiment. Sentiment itself can be defined as policy, opinion, logic, or mud, etc. Sentiment analysis therefore determines the polarity or type of opinion in a given text or subject. NLP (Natural Language Processing) techniques are used for the initial processing of a text. The techniques used in the analysis are tokenization, elimination of stop words, and stemming. This study focuses on developing sentiment analysis using lexicons and multiplication polarity. The accuracy results are still lower than those obtained with machine learning. Therefore, this lexicon needs to be improved in terms of its semantics.
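
The preprocessing steps listed in the abstract (tokenization, stop-word removal, stemming) followed by a lexicon lookup can be sketched as below. Everything specific here is an assumption: the tiny stop-word list, the stem-keyed lexicon and its scores, the naive suffix stemmer, and in particular the reading of "multiplication polarity" as multiplying the next sentiment word's polarity by -1 after a negation, which the abstract does not define.

```python
import re

STOP_WORDS = {"the", "a", "is", "this", "was", "and"}                    # tiny illustrative list
LEXICON = {"good": 1, "great": 2, "bad": -1, "delay": -1, "lov": 2}      # hypothetical, keyed by stem
NEGATIONS = {"not", "never", "no"}

def stem(token):
    """Very naive suffix stripping, for illustration only."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def sentiment_score(text):
    """Sum lexicon polarities; a negation flips the sign of the next sentiment word."""
    tokens = re.findall(r"[a-z']+", text.lower())        # tokenization
    multiplier = 1
    score = 0
    for token in tokens:
        if token in NEGATIONS:
            multiplier = -1                              # assumed polarity multiplier
            continue
        if token in STOP_WORDS:                          # stop-word removal
            continue
        polarity = LEXICON.get(stem(token), 0)           # stemming + lexicon lookup
        if polarity:
            score += multiplier * polarity
            multiplier = 1
    return score

print(sentiment_score("The flight was not good"))
print(sentiment_score("I loved the great service"))
```

A positive total is read as positive sentiment and a negative total as negative sentiment. A purely lexical approach like this has no notion of context, which fits the abstract's remark that the lexicon's semantics need improvement.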

15 citations

Journal Article
28 Feb 2020
TL;DR: The results indicate that four factors influence the prediction of a patient's DM status: Fasting Blood Glucose, LDL Cholesterol, Triglycerides, and Body Weight.
Abstract: Diabetes mellitus (DM) is one of the chronic and deadly diseases widely observed in various countries today. Its prevalence continues to increase and has reached a very alarming stage. This study aims to identify the factors that influence DM and the relationships between them. The method used in this research is the C4.5 algorithm, one of the algorithms used to build predictive classifications. Classification is a data mining process that aims to find patterns in relatively large data, here represented in the form of decision trees. The method is applied to medical records of patients with DM from 2014 to 2018, taken from the Hasanuddin University Teaching Hospital. The results indicate that four factors influence the prediction of a patient's DM status: Fasting Blood Glucose (GDP), LDL Cholesterol, Triglycerides, and Body Weight.

12 citations

Journal Article
23 Dec 2019
TL;DR: This study compares the performance of the Naive Bayes method and C4.5 Decision Tree in predicting readmissions of diabetic patients, especially patients who have undergone HbA1c examination with a number of scenarios involving a combination of preprocessing methods.
Abstract: Diabetes is a metabolic disorder disease in which the pancreas does not produce enough insulin or the body cannot use insulin produced effectively. The HbA1c examination, which measures the average glucose level of patients during the last 2-3 months, has become an important step to determine the condition of diabetic patients. Knowledge of the patient's condition can help medical staff to predict the possibility of patient readmissions, namely the occurrence of a patient requiring hospitalization services back at the hospital. The ability to predict patient readmissions will ultimately help the hospital to calculate and manage the quality of patient care. This study compares the performance of the Naive Bayes method and C4.5 Decision Tree in predicting readmissions of diabetic patients, especially patients who have undergone HbA1c examination. As part of this study we also compare the performance of the classification model from a number of scenarios involving a combination of preprocessing methods, namely Synthetic Minority Over-Sampling Technique (SMOTE) and Wrapper feature selection method, with both classification techniques. The scenario of C4.5 method combined with SMOTE and feature selection method produces the best performance in classifying readmissions of diabetic patients with an accuracy value of 82.74 %, precision value of 87.1 %, and recall value of 82.7 %.
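
SMOTE, named in the abstract above, balances a dataset by creating synthetic minority-class samples along the line segments between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step only, not the study's pipeline: the function name, parameters, and toy points are assumptions, and it needs at least two minority samples.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen sample (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Toy minority-class samples (hypothetical data).
minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
new_points = smote(minority, n_new=5)
print(new_points)
```

Because each synthetic point lies between two existing minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating records exactly.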

8 citations

Proceedings Article
01 Nov 2019
TL;DR: The results show that training data built from features selected using complementary information gives better results. However, this mutual information when compared with the selection of other features such as Chi Square and ANOVA F to choose the old process and verify use for both methods.
Abstract: Social media is a medium that many users need in order to connect and communicate with other users. One of the most widely used social media platforms is Twitter, which contains opinions or short messages called tweets. The invited company also needs feedback from its customers to find out their view of the requested service. Sentiment analysis is therefore needed to classify the sentiment toward the company. This research uses a dataset of tweets about US Airlines. Because the dataset is available on Kaggle and already includes several metadata fields, the feature selection experiment can be done quickly. Feature selection in this research uses the Mutual Information method. The method was chosen because, according to a previous reference, it is effective at measuring correlation from one attribute to another. The results show that training data built from features selected using complementary information gives better results. However, this mutual information when compared with the selection of other features such as Chi Square and ANOVA F to choose the old process and verify use for both methods.
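
Mutual Information feature selection scores each feature by how much knowing its value reduces uncertainty about the class label, then keeps the top-scoring features. The sketch below computes the score for discrete features and ranks them. The feature names and toy data are hypothetical, and the study's actual tooling is not stated in the abstract.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """I(X;Y) in bits between a discrete feature and the class labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    return sum(
        (count / n) * math.log2(count * n / (px[x] * py[y]))
        for (x, y), count in joint.items()
    )

# Toy word-presence features for four tweets (hypothetical data).
labels = ["pos", "pos", "neg", "neg"]
features = {
    "contains_thanks": [1, 1, 0, 0],   # perfectly aligned with the label
    "contains_flight": [1, 0, 1, 0],   # independent of the label
}

ranking = sorted(features, key=lambda name: mutual_information(features[name], labels), reverse=True)
print(ranking)
```

A feature that is independent of the label scores 0 bits, so ranking by this score pushes uninformative words to the bottom of the list.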

7 citations