Author
Mohammed Nasser
Other affiliations: University of Rajshahi, University of Malaya
Bio: Mohammed Nasser is an academic researcher from the University of A Coruña. The author has contributed to research in topics: Outlier & Regression analysis. The author has an h-index of 12, and has co-authored 33 publications receiving 445 citations. Previous affiliations of Mohammed Nasser include University of Rajshahi & University of Malaya.
Papers
TL;DR: This work builds two models for the classification purpose, one based on Support Vector Machines (SVM) and the other on Random Forests (RF); experimental results show that either classifier is effective.
Abstract: The success of any Intrusion Detection System (IDS) is a complicated problem due to its nonlinearity and the quantitative or qualitative network traffic data stream with many features. To address this problem, several types of intrusion detection methods have been proposed, showing different levels of accuracy. This is why the choice of an effective and robust method for IDS is a very important topic in information security. In this work, we have built two models for the classification purpose. One is based on Support Vector Machines (SVM) and the other on Random Forests (RF). Experimental results show that either classifier is effective. SVM is slightly more accurate, but more expensive in terms of time. RF produces similar accuracy in a much faster manner if given modeling parameters. These classifiers can contribute to an IDS as one source of analysis and increase its accuracy. In this paper, the KDD'99 dataset is used to find out which classifier is the better intrusion detector for this dataset. Statistical analysis of the KDD'99 dataset revealed important issues that highly affect the performance of evaluated systems and result in a very poor evaluation of anomaly detection approaches. The most important deficiency in the KDD'99 dataset is the huge number of redundant records. To solve these issues, we have developed a new dataset, consisting of KDD99Train+ and KDD99Test+, which does not include any redundant records in either the train set or the test set, so the classifiers will not be biased towards more frequent records. The numbers of records in the train and test sets are now reasonable, which makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. The findings of this paper will be very useful for applying SVM and RF in a more meaningful way in order to maximize the performance rate and minimize the false negative rate.
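The comparison described above (deduplicate the records, then train SVM and RF and compare accuracy and training time) can be sketched as follows. This is a minimal illustration using scikit-learn and synthetic data, not the paper's actual KDD'99 pipeline; the feature set and parameters are placeholders.

```python
# Sketch of the SVM-vs-RF comparison on a deduplicated dataset.
# Synthetic data stands in for KDD'99; real file paths and feature
# names from the paper are not reproduced here.
import time
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for network-traffic records with many features.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
df = pd.DataFrame(X)
df["label"] = y

# Deduplication step analogous to building KDD99Train+/KDD99Test+:
# drop exact duplicate records so frequent rows do not bias the classifiers.
df = df.drop_duplicates()

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="label"), df["label"], test_size=0.3, random_state=0
)

for name, clf in [("SVM", SVC()), ("RF", RandomForestClassifier(random_state=0))]:
    t0 = time.time()
    clf.fit(X_train, y_train)
    acc = clf.score(X_test, y_test)
    print(f"{name}: accuracy={acc:.3f}, train_time={time.time() - t0:.2f}s")
```

On larger datasets the timing gap the abstract mentions (RF much faster than SVM at similar accuracy) typically becomes pronounced, since SVM training scales poorly with sample count.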
131 citations
TL;DR: Results show that the proposed Random Forest based approach can select the most important and relevant features useful for classification, which not only reduces the number of input features and the processing time but also increases the classification accuracy.
Abstract: An intrusion detection system collects and analyzes information from different areas within a computer or a network to identify possible security threats, including threats from both outside and inside the organization. It deals with a large amount of data, which contains various irrelevant and redundant features and results in increased processing time and a low detection rate. Therefore, feature selection should be treated as an indispensable pre-processing step to improve the overall system performance significantly while mining huge datasets. In this context, in this paper, we focus on a two-step approach to feature selection based on Random Forest. The first step selects the features with higher variable importance scores and guides the initialization of the search process for the second step, which outputs the final feature subset for classification and interpretation. The effectiveness of this algorithm is demonstrated on the KDD'99 intrusion detection dataset, which is based on the DARPA 98 dataset and provides labeled data for researchers working in the field of intrusion detection. An important deficiency in the KDD'99 dataset is the huge number of redundant records, as observed earlier. Therefore, we have derived a dataset, RRE-KDD, by eliminating redundant records from the KDD'99 train and test datasets, so the classifiers and feature selection method will not be biased towards more frequent records. RRE-KDD consists of the KDD99Train+ and KDD99Test+ datasets for training and testing purposes, respectively. The experimental results show that the proposed Random Forest based approach can select the most important and relevant features useful for classification, which, in turn, not only reduces the number of input features and the processing time but also increases the classification accuracy.
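The two-step idea above (rank features by Random Forest variable importance, then search for a small high-accuracy subset) can be sketched as follows. This is a simplified stand-in on synthetic data: the greedy importance-ordered search here replaces the paper's actual guided search, and the dataset is not RRE-KDD.

```python
# Sketch of two-step feature selection: importance ranking, then a
# greedy search over importance-ordered feature subsets.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=15, n_informative=5,
                           random_state=0)

# Step 1: rank features by Random Forest variable importance score.
rf = RandomForestClassifier(random_state=0).fit(X, y)
order = np.argsort(rf.feature_importances_)[::-1]

# Step 2 (simplified stand-in for the guided search): grow the candidate
# subset in importance order and keep the size with the best CV accuracy.
best_score, best_k = 0.0, 0
for k in range(1, X.shape[1] + 1):
    score = cross_val_score(RandomForestClassifier(random_state=0),
                            X[:, order[:k]], y, cv=3).mean()
    if score > best_score:
        best_score, best_k = score, k

print(f"selected {best_k} of {X.shape[1]} features, CV accuracy {best_score:.3f}")
```

The payoff matches the abstract's claim: a smaller input set usually trains faster and, by discarding irrelevant features, can classify at least as accurately as the full set.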
88 citations
Luleå University of Technology1, University of Cyprus2, Slovak University of Technology in Bratislava3, Vienna University of Technology4, Mario Negri Institute for Pharmacological Research5, James I University6, Swedish University of Agricultural Sciences7, National and Kapodistrian University of Athens8, Norwegian Institute for Water Research9, University of A Coruña10, University of Antwerp11, University of South Australia12, Comenius University in Bratislava13, Cranfield University14, Lehigh University15, University of Bath16, King Abdullah University of Science and Technology17, University of Belgrade18, Aristotle University of Thessaloniki19, University of Alcalá20, Norwegian University of Life Sciences21, Innsbruck Medical University22, National Institute for Health and Welfare23, Uppsala University24, Slovak Academy of Sciences25
TL;DR: The NORMAN SCORE “SARS-CoV-2 in sewage” database provides a platform for rapid, open access data sharing, validated by the uploading of 276 data sets from nine countries to date, and is a resource for the development of recommendations on minimum data requirements for wastewater pathogen surveillance.
43 citations
TL;DR: In this paper, statistical regression models were developed from the viral load detected in the wastewater and the epidemiological data from the A Coruna health system, which allowed the number of infected people, including symptomatic and asymptomatic individuals, to be estimated with reliability close to 90%.
Abstract: The quantification of the SARS-CoV-2 RNA load in wastewater has emerged as a useful tool to monitor COVID-19 outbreaks in the community. This approach was implemented in the metropolitan area of A Coruna (NW Spain), where wastewater from a treatment plant was analyzed to track the epidemic dynamics in a population of 369,098 inhabitants. Statistical regression models were developed from the viral load detected in the wastewater and the epidemiological data from the A Coruna health system, which allowed us to estimate the number of infected people, including symptomatic and asymptomatic individuals, with reliability close to 90%. These models can help to understand the real magnitude of the epidemic in a population at any given time and can be used as an effective early warning tool for predicting outbreaks. The methodology of the present work could be used to develop a similar wastewater-based epidemiological model to track the evolution of the COVID-19 epidemic anywhere in the world.
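The modeling idea above (regress case counts on wastewater viral load) can be sketched as a simple log-log linear fit. All data and coefficients below are synthetic placeholders; the paper's actual variables, transforms, and fitted parameters are not reproduced here.

```python
# Hedged sketch: ordinary least squares relating wastewater viral load
# to case counts on a log-log scale. Data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
log_viral_load = rng.uniform(3, 7, size=60)                     # log10 RNA copies/L (synthetic)
log_cases = 1.2 * log_viral_load - 1.0 + rng.normal(0, 0.2, 60) # synthetic case counts

# Fit log_cases = slope * log_viral_load + intercept by least squares.
slope, intercept = np.polyfit(log_viral_load, log_cases, 1)
pred = slope * log_viral_load + intercept

# R^2 as a rough analogue of the paper's ~90% reliability figure.
ss_res = np.sum((log_cases - pred) ** 2)
ss_tot = np.sum((log_cases - log_cases.mean()) ** 2)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={1 - ss_res / ss_tot:.3f}")
```

In practice such a model is refit as new sampling data arrive, and rising predicted case counts serve as the early-warning signal the abstract describes.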
28 citations
TL;DR: The finite mixture of ARMA-GARCH model is applied, instead of AR or ARMA models, to compare with the standard BP and SVM in forecasting financial time series (daily stock market index returns and exchange rate returns); only the SVM model shows the long-memory property in forecasting financial returns.
Abstract: The use of GARCH type models and computational-intelligence-based techniques for forecasting financial time series has proved extremely successful in recent times. In this article, we apply the finite mixture of ARMA-GARCH model instead of AR or ARMA models to compare with the standard BP and SVM in forecasting financial time series (daily stock market index returns and exchange rate returns). We do not apply the pure GARCH model as the finite mixture of the ARMA-GARCH model outperforms the pure GARCH model. These models are evaluated on five performance metrics or criteria. Our experiment shows that the SVM model outperforms both the finite mixture of ARMA-GARCH and BP models in deviation performance criteria. In direction performance criteria, the finite mixture of ARMA-GARCH model performs better. The memory property of these forecasting techniques is also examined using the behavior of forecasted values vis-a-vis the original values. Only the SVM model shows long memory property in forecasting financial returns.
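The two families of criteria the abstract distinguishes can be made concrete: deviation criteria measure how close forecasts are to actual values (e.g. RMSE, MAE), while direction criteria measure how often the forecast gets the sign of the return right. A toy computation on a synthetic return series (not the paper's data or full set of five metrics):

```python
# Deviation criteria (RMSE, MAE) vs a direction criterion, on a
# synthetic daily-return series and a noisy forecast of it.
import numpy as np

rng = np.random.default_rng(1)
actual = rng.normal(0, 0.01, size=250)              # synthetic daily returns
forecast = actual + rng.normal(0, 0.005, size=250)  # noisy forecast of them

rmse = np.sqrt(np.mean((actual - forecast) ** 2))   # deviation criterion
mae = np.mean(np.abs(actual - forecast))            # deviation criterion
# Direction criterion: fraction of days the forecast's sign matches.
direction = np.mean(np.sign(actual) == np.sign(forecast))

print(f"RMSE={rmse:.5f}, MAE={mae:.5f}, direction accuracy={direction:.2f}")
```

A model can win on one family and lose on the other, which is exactly the split the experiment reports: SVM best on deviation criteria, the ARMA-GARCH mixture best on direction criteria.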
26 citations
Cited by
01 May 1981
TL;DR: This chapter discusses Detecting Influential Observations and Outliers, a method for assessing Collinearity, and its applications in medicine and science.
Abstract: 1. Introduction and Overview. 2. Detecting Influential Observations and Outliers. 3. Detecting and Assessing Collinearity. 4. Applications and Remedies. 5. Research Issues and Directions for Extensions. Bibliography. Author Index. Subject Index.
4,948 citations
01 Mar 2001
TL;DR: Using singular value decomposition in transforming genome-wide expression data from genes x arrays space to reduced diagonalized "eigengenes" x "eigenarrays" space gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype.
Abstract: We describe the use of singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized “eigengenes” × “eigenarrays” space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.
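The decomposition described can be sketched with NumPy. This is an illustration of the SVD and noise-filtering steps only, on a synthetic matrix, not the authors' expression-analysis pipeline; "drop the weakest component" stands in for their inference of which eigengenes represent noise.

```python
# SVD of a genes-by-arrays matrix into eigenarrays (columns of U) and
# eigengenes (rows of Vt), then filtering out one component as "noise".
import numpy as np

rng = np.random.default_rng(0)
genes, arrays = 100, 8
X = rng.normal(size=(genes, arrays))        # synthetic expression matrix

# X = U @ diag(s) @ Vt: the rows of Vt are orthonormal patterns across
# arrays ("eigengenes"); the columns of U are patterns across genes.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# "Normalization" in the paper's sense: remove a component inferred to be
# noise (here, simply the weakest one) and reconstruct the data.
k = arrays - 1
X_filtered = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print("rank after filtering:", np.linalg.matrix_rank(X_filtered))
```

Dropping the smallest singular component removes exactly `s[-1]` of Frobenius-norm energy, which is why filtering weak eigengenes changes the data least while suppressing low-variance artifacts.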
1,815 citations
TL;DR: In this article, the authors used three state-of-the-art data mining techniques, namely, logistic model tree (LMT), random forest (RF), and classification and regression tree (CART) models, to map landslide susceptibility.
Abstract: The main purpose of the present study is to use three state-of-the-art data mining techniques, namely, logistic model tree (LMT), random forest (RF), and classification and regression tree (CART) models, to map landslide susceptibility. Long County was selected as the study area. First, a landslide inventory map was constructed using history reports, interpretation of aerial photographs, and extensive field surveys. A total of 171 landslide locations were identified in the study area. Twelve landslide-related parameters were considered for landslide susceptibility mapping, including slope angle, slope aspect, plan curvature, profile curvature, altitude, NDVI, land use, distance to faults, distance to roads, distance to rivers, lithology, and rainfall. The 171 landslides were randomly separated into two groups with a 70/30 ratio for training and validation purposes, and different ratios of non-landslides to landslides grid cells were used to obtain the highest classification accuracy. The linear support vector machine algorithm (LSVM) was used to evaluate the predictive capability of the 12 landslide conditioning factors. Second, LMT, RF, and CART models were constructed using training data. Finally, the applied models were validated and compared using receiver operating characteristics (ROC), and predictive accuracy (ACC) methods. Overall, all three models exhibit reasonably good performances; the RF model exhibits the highest predictive capability compared with the LMT and CART models. The RF model, with a success rate of 0.837 and a prediction rate of 0.781, is a promising technique for landslide susceptibility mapping. Therefore, these three models are useful tools for spatial prediction of landslide susceptibility.
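The validation step described (train the models, then compare them by ROC on held-out data) can be sketched as follows. Synthetic features stand in for the twelve landslide conditioning factors, and a plain decision tree plays the role of CART; this is not the study's actual data or tuning.

```python
# Sketch of model comparison by ROC AUC on a 70/30 train/validation split,
# mirroring the study's evaluation of RF vs CART.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 12 features ~ the twelve conditioning factors.
X, y = make_classification(n_samples=500, n_features=12, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "RF": RandomForestClassifier(random_state=0),
    "CART": DecisionTreeClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: ROC AUC = {auc:.3f}")
```

ROC AUC is split-size-independent and threshold-free, which makes it a natural common yardstick when the candidate models output probabilities on different scales.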
591 citations
TL;DR: The interval-valued HFSs and the corresponding correlation coefficient formulas are developed, and their application in clustering with interval-valued hesitant fuzzy information is demonstrated through a specific numerical example.
449 citations