Aly A. Fahmy
Other affiliations: Zagazig University
Bio: Aly A. Fahmy is an academic researcher from Cairo University. The author has contributed to research in topics: Support vector machine & Optimization problem. The author has an h-index of 19 and has co-authored 80 publications receiving 1,740 citations. Previous affiliations of Aly A. Fahmy include Zagazig University.
TL;DR: This survey discusses existing work on text similarity by partitioning it into three approaches: String-based, Corpus-based, and Knowledge-based similarities; samples of combinations of these similarities are also presented.
Abstract: Measuring the similarity between words, sentences, paragraphs and documents is an important component in various tasks such as information retrieval, document clustering, word-sense disambiguation, automatic essay scoring, short answer grading, machine translation and text summarization. This survey discusses the existing works on text similarity by partitioning them into three approaches: String-based, Corpus-based and Knowledge-based similarities. Furthermore, samples of combinations of these similarities are presented. General Terms: Text Mining, Natural Language Processing. Keywords: Text Similarity, Semantic Similarity, String-Based Similarity, Corpus-Based Similarity, Knowledge-Based Similarity. 1. INTRODUCTION Text similarity measures play an increasingly important role in text-related research and applications in tasks such as information retrieval, text classification, document clustering, topic detection, topic tracking, question generation, question answering, essay scoring, short answer scoring, machine translation, text summarization and others. Finding similarity between words is a fundamental part of text similarity, which is then used as a primary stage for sentence, paragraph and document similarities. Words can be similar in two ways: lexically and semantically. Words are similar lexically if they have a similar character sequence. Words are similar semantically if they mean the same thing, are opposites of each other, are used in the same way, are used in the same context, or one is a type of the other. Lexical similarity is introduced in this survey through different String-Based algorithms, while semantic similarity is introduced through Corpus-Based and Knowledge-Based algorithms. String-Based measures operate on string sequences and character composition. A string metric is a metric that measures similarity or dissimilarity (distance) between two text strings for approximate string matching or comparison.
Corpus-Based similarity is a semantic similarity measure that determines the similarity between words according to information gained from large corpora. Knowledge-Based similarity is a semantic similarity measure that determines the degree of similarity between words using information derived from semantic networks. The most popular measures of each type will be presented briefly. This paper is organized as follows: Section two presents String-Based algorithms by partitioning them into two types: character-based and term-based measures. Sections three and four introduce Corpus-Based and Knowledge-Based algorithms, respectively. Samples of combinations of similarity algorithms are introduced in section five, and finally section six presents the conclusion of the survey.
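The character-based family surveyed here includes edit-distance measures; as an illustration (not code from the survey itself), a minimal sketch of the classic Levenshtein distance:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits (insertion, deletion,
    substitution) needed to turn string a into string b."""
    prev = list(range(len(b) + 1))            # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                            # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # → 3
```

A distance of 0 means lexically identical strings; normalizing by the longer string's length turns the distance into a similarity score in [0, 1].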
TL;DR: A deep learning method is proposed to recognize emotion from raw EEG signals using Long-Short Term Memory (LSTM) and the dense layer classifies these features into low/high arousal, valence, and liking.
Abstract: Emotion is the most important component in daily interaction between people. Nowadays, it is important in human-computer interaction (HCI) systems to make computers understand the emotions of the users who interact with them. Electroencephalogram (EEG) signals are the main source of emotion in our body. Recently, emotion recognition based on EEG signals has attracted many researchers, and many methods have been reported. Different types of features were extracted from EEG signals, and then different types of classifiers were applied to these features. In this paper, a deep learning method is proposed to recognize emotion from raw EEG signals. Long Short-Term Memory (LSTM) is used to learn features from EEG signals, and then a dense layer classifies these features into low/high arousal, valence, and liking. The DEAP dataset is used to verify this method, which gives an average accuracy of 85.65%, 85.45%, and 87.99% for the arousal, valence, and liking classes, respectively. The proposed method achieves high average accuracy in comparison with traditional techniques.
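The recurrence at the heart of the LSTM used in this paper can be sketched as a single NumPy time step (a generic LSTM cell, not the authors' exact architecture; the gate ordering and weight layout here are assumptions):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W (4n x d), U (4n x n) and b (4n,) stack the
    four gate pre-activations (input, forget, candidate, output) along
    the first axis; h is the hidden state, c the cell state."""
    n = h.shape[0]
    z = W @ x + U @ h + b                   # stacked gate pre-activations
    i = 1.0 / (1.0 + np.exp(-z[0:n]))       # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))     # forget gate
    g = np.tanh(z[2*n:3*n])                 # candidate cell update
    o = 1.0 / (1.0 + np.exp(-z[3*n:4*n]))   # output gate
    c_new = f * c + i * g                   # blend old cell state with candidate
    h_new = o * np.tanh(c_new)              # gated hidden output
    return h_new, c_new
```

In the paper's setup, the hidden state after the last EEG time step would be fed to a dense layer that predicts the low/high arousal, valence, and liking labels.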
01 Feb 2020-Data in Brief
TL;DR: The data presented in this article comprise medical ultrasound images of breast cancer; the Breast Ultrasound Dataset is categorized into three classes: normal, benign, and malignant images.
Abstract: Breast cancer is one of the most common causes of death among women worldwide. Early detection helps in reducing the number of early deaths. The data presented in this article comprise medical ultrasound images of breast cancer. The Breast Ultrasound Dataset is categorized into three classes: normal, benign, and malignant images. Breast ultrasound images can produce great results in classification, detection, and segmentation of breast cancer when combined with machine learning.
TL;DR: This paper investigates constructing a comprehensive feature set to compensate for the lack of structural parsing output in the Arabic language, and presents pioneering research on opinion holder extraction in Arabic news that is independent of any lexical parser.
Abstract: Opinion mining aims at extracting useful subjective information from large amounts of text. Opinion holder recognition is a task that has not yet been considered in the Arabic language. This task essentially requires a deep understanding of clause structures. Unfortunately, the lack of a robust, publicly available Arabic parser further complicates the research. This paper presents pioneering research on opinion holder extraction in Arabic news that is independent of any lexical parser. We investigate constructing a comprehensive feature set to compensate for the lack of structural parsing output. The proposed feature set is adapted from previous English-language work, coupled with our proposed semantic field and named entity features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different research models are evaluated via cross-validation experiments, achieving an F-measure of 54.03. We publicly release our own research outcome corpus and lexicon for the opinion mining community to encourage further research.
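A parser-free CRF feature set of the kind described here is typically built as per-token feature dictionaries from surface cues and a context window. The sketch below is illustrative only; the concrete feature names and the semantic-field/named-entity flags are hypothetical stand-ins for the paper's features:

```python
def token_features(tokens, i):
    """Feature dict for token i: surface shape plus a one-word context
    window. A real system would add semantic-field and named-entity
    flags as extra boolean features in the same dict."""
    w = tokens[i]
    return {
        "word": w,
        "is_digit": w.isdigit(),
        "prefix2": w[:2],                                     # first two characters
        "suffix2": w[-2:],                                    # last two characters
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",     # left context
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",  # right context
    }
```

Such dictionaries are the standard input format for CRF sequence-labeling toolkits, which learn weights over (feature, label) pairs across whole sentences.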
01 Jan 2010
TL;DR: The proposed approach integrates two machine learning techniques, namely bootstrapping semi-supervised pattern recognition and a Conditional Random Fields (CRF) classifier as a supervised technique, and outperforms previous CRF-only work.
Abstract: The Named Entity Recognition (NER) task has become essential to improving the performance of many NLP tasks. Its aim is to accurately identify the named entities extracted from text. This paper presents a novel solution to the Arabic Named Entity Recognition (ANER) problem. The solution is an integration of two machine learning techniques, namely bootstrapping semi-supervised pattern recognition and a Conditional Random Fields (CRF) classifier as a supervised technique. The contributions of this solution are the use of pattern and word semantic fields as CRF features, the novel application of bootstrapping semi-supervised pattern recognition to the Arabic language, and the successful integration of the two techniques to improve on the performance of each component. Moreover, to our knowledge, this integration has not been applied to the NER task in other natural languages. In 6-fold cross-validation experiments, the solution is shown to outperform previous CRF-only work.
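The bootstrapping half of such a pipeline alternates between inducing patterns from known entities and harvesting new candidate entities with those patterns. A deliberately toy sketch (the paper's actual pattern scoring and filtering are not reproduced here):

```python
def bootstrap(seed_entities, corpus_sents, max_rounds=3):
    """Toy pattern-bootstrapping loop: seed entities induce contexts
    (here, just the preceding word), and those contexts then harvest
    new candidate entities. A real system scores and filters both
    patterns and candidates before accepting them."""
    entities = set(seed_entities)
    for _ in range(max_rounds):
        patterns = set()
        for sent in corpus_sents:                  # step 1: learn contexts
            toks = sent.split()
            for i, t in enumerate(toks):
                if t in entities and i > 0:
                    patterns.add(toks[i - 1])      # word preceding a known entity
        new = set()
        for sent in corpus_sents:                  # step 2: harvest candidates
            toks = sent.split()
            for i, t in enumerate(toks):
                if i > 0 and toks[i - 1] in patterns:
                    new.add(t)
        if new <= entities:                        # fixed point: nothing new found
            break
        entities |= new
    return entities
```

In the integrated approach described above, the harvested patterns and semantic fields would then feed the CRF classifier as additional features rather than being used as a standalone recognizer.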
01 Jan 2002
01 Jun 2016
TL;DR: SemEval-2016 Task 4, the fourth year of the Sentiment Analysis in Twitter task, comprises five subtasks, three of which represent a significant departure from previous editions; the new subtasks focus on two variants of the basic sentiment classification in Twitter task.
Abstract: This paper discusses the fourth year of the "Sentiment Analysis in Twitter Task". SemEval-2016 Task 4 comprises five subtasks, three of which represent a significant departure from previous editions. The first two subtasks are reruns from prior years and ask to predict the overall sentiment, and the sentiment towards a topic in a tweet. The three new subtasks focus on two variants of the basic "sentiment classification in Twitter" task. The first variant adopts a five-point scale, which confers an ordinal character to the classification task. The second variant focuses on the correct estimation of the prevalence of each class of interest, a task which has been called quantification in the supervised learning literature. The task continues to be very popular, attracting a total of 43 teams.
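A standard baseline for the quantification variant mentioned above is adjusted classify-and-count (ACC), which corrects the raw predicted positive rate using the classifier's true and false positive rates. This is a generic baseline from the quantification literature, not necessarily what task participants used:

```python
def adjusted_classify_and_count(pred_pos_rate, tpr, fpr):
    """Adjusted classify-and-count: estimate the true prevalence p from
    the observed predicted-positive rate, given the classifier's true
    positive rate (tpr) and false positive rate (fpr), via
        pred_pos_rate = p * tpr + (1 - p) * fpr  =>  solve for p."""
    if tpr == fpr:                       # degenerate classifier: no correction possible
        return pred_pos_rate
    p = (pred_pos_rate - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))         # clip to a valid prevalence
```

The correction matters because a classifier tuned for per-tweet accuracy can still be badly biased as a prevalence estimator when class proportions shift between training and test data.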
TL;DR: In this article, the notions of intuitionistic fuzzy t-norms and t-conorms are introduced, and it is investigated under which conditions a representation theorem analogous to the classical one, in which such a t-norm T is the φ-transform of the Lukasiewicz t-norm, can be obtained.
Abstract: Intuitionistic fuzzy sets form an extension of fuzzy sets: while fuzzy sets give a degree to which an element belongs to a set, intuitionistic fuzzy sets give both a membership degree and a nonmembership degree. The only constraint on those two degrees is that their sum must be smaller than or equal to 1. In fuzzy set theory, an important class of triangular norms and conorms is the class of continuous Archimedean nilpotent triangular norms and conorms. It has been shown that for such t-norms T there exists a permutation φ of [0,1] such that T is the φ-transform of the Lukasiewicz t-norm. In this paper we introduce the notion of intuitionistic fuzzy t-norm and t-conorm, and investigate under which conditions a similar representation theorem can be obtained.
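The two ingredients of the representation theorem are easy to state concretely: the Lukasiewicz t-norm T_L(x, y) = max(0, x + y − 1), and the φ-transform T_φ(x, y) = φ⁻¹(T(φ(x), φ(y))) for an increasing bijection φ of [0, 1]. A small sketch (the choice φ(x) = x² is just an illustrative example):

```python
import math

def lukasiewicz(x, y):
    """Lukasiewicz t-norm: T_L(x, y) = max(0, x + y - 1)."""
    return max(0.0, x + y - 1.0)

def phi_transform(t_norm, phi, phi_inv):
    """phi-transform of a t-norm: T_phi(x, y) = phi_inv(T(phi(x), phi(y))),
    where phi is an increasing bijection of [0, 1] with inverse phi_inv."""
    return lambda x, y: phi_inv(t_norm(phi(x), phi(y)))

# Example: phi(x) = x**2 yields another continuous Archimedean nilpotent t-norm.
T = phi_transform(lukasiewicz, lambda x: x * x, math.sqrt)
```

The transformed T still satisfies the t-norm boundary condition T(1, y) = y, since φ(1) = 1 and φ⁻¹ undoes φ; the theorem says every continuous Archimedean nilpotent t-norm arises this way from T_L.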
01 Mar 2019-Neural Networks
TL;DR: The emerging picture is that SNNs still lag behind ANNs in terms of accuracy, but the gap is decreasing and can even vanish on some tasks, while SNNs typically require many fewer operations and are the better candidates for processing spatio-temporal data.
Abstract: In recent years, deep learning has revolutionized the field of machine learning, for computer vision in particular. In this approach, a deep (multilayer) artificial neural network (ANN) is trained, most often in a supervised manner using backpropagation. Vast amounts of labeled training examples are required, but the resulting classification accuracy is truly impressive, sometimes outperforming humans. Neurons in an ANN are characterized by a single, static, continuous-valued activation. Yet biological neurons use discrete spikes to compute and transmit information, and the spike times, in addition to the spike rates, matter. Spiking neural networks (SNNs) are thus more biologically realistic than ANNs, and are arguably the only viable option if one wants to understand how the brain computes at the neuronal description level. The spikes of biological neurons are sparse in time and space, and event-driven. Combined with bio-plausible local learning rules, this makes it easier to build low-power, neuromorphic hardware for SNNs. However, training deep SNNs remains a challenge. Spiking neurons' transfer function is usually non-differentiable, which prevents using backpropagation. Here we review recent supervised and unsupervised methods to train deep SNNs, and compare them in terms of accuracy and computational cost. The emerging picture is that SNNs still lag behind ANNs in terms of accuracy, but the gap is decreasing, and can even vanish on some tasks, while SNNs typically require many fewer operations and are the better candidates to process spatio-temporal data.
TL;DR: In this paper, the authors present a review of 154 studies that apply deep learning to EEG, published between 2010 and 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring.
Abstract: Context Electroencephalography (EEG) is a complex signal and can require several years of training, as well as advanced signal processing and feature extraction methodologies to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. Objective In this work, we review 154 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches from this large body of literature in order to inform future research and formulate recommendations. Methods Major databases spanning the fields of science and engineering were queried to identify relevant studies published in scientific journals, conferences, and electronic preprint repositories. Various data items were extracted for each study pertaining to (1) the data, (2) the preprocessing methodology, (3) the DL design choices, (4) the results, and (5) the reproducibility of the experiments. These items were then analyzed one by one to uncover trends. Results Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours, while the number of samples seen during training by a network varies from a few dozens to several millions, depending on how epochs are extracted. Interestingly, we saw that more than half the studies used publicly available data and that there has also been a clear shift from intra-subject to inter-subject approaches over the last few years. 
About [Formula: see text] of the studies used convolutional neural networks (CNNs), while [Formula: see text] used recurrent neural networks (RNNs), most often with a total of 3-10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was [Formula: see text] across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. Significance To help the community progress and share work more effectively, we provide a list of recommendations for future studies and emphasize the need for more reproducible research. We also make our summary table of DL and EEG papers available and invite authors of published work to contribute to it directly. A planned follow-up to this work will be an online public benchmarking portal listing reproducible results.