
Showing papers in "Cognitive Computation in 2013"


Journal ArticleDOI
TL;DR: A new multi-class classification algorithm, called Twin-KSVC, is proposed in this paper, which combines the advantages of both TSVM and K-SVCR and evaluates all the training points within a “1-versus-1-versus-rest” structure, so it generates ternary outputs.
Abstract: Twin support vector machine (TSVM) is a novel machine learning algorithm that aims at finding two nonparallel planes, one for each class. To do so, one solves a pair of smaller-sized quadratic programming problems rather than a single large one. Classical TSVM was proposed for the binary classification problem; however, multi-class classification problems often arise in the real world. For this setting, a new multi-class classification algorithm, called Twin-KSVC, is proposed in this paper. It combines the advantages of both TSVM and K-SVCR (support vector classification-regression machine for k-class classification) and evaluates all the training points within a “1-versus-1-versus-rest” structure, so it generates ternary outputs {−1, 0, +1}. As all the samples are utilized in constructing the classification hyperplane, the proposed algorithm yields higher classification accuracy than the other two algorithms. Experimental results on eleven benchmark datasets demonstrate the feasibility and validity of the proposed algorithm.
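The “1-versus-1-versus-rest” structure described above can be sketched in a few lines. The helper below is hypothetical (not from the paper) and shows how, for one ordered class pair, every training point receives one of the ternary outputs {−1, 0, +1}:

```python
def one_vs_one_vs_rest_labels(y, pos_class, neg_class):
    """Ternary relabelling for one class pair: +1 for pos_class,
    -1 for neg_class, and 0 for every remaining ("rest") class."""
    return [1 if c == pos_class else (-1 if c == neg_class else 0) for c in y]

labels = ['cat', 'dog', 'bird', 'cat', 'bird']
print(one_vs_one_vs_rest_labels(labels, 'cat', 'dog'))  # [1, -1, 0, 1, 0]
```

A full Twin-KSVC classifier would train one such ternary sub-problem per class pair and combine their outputs; the point of the structure is that the "rest" samples still participate in every sub-problem.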

95 citations


Journal ArticleDOI
TL;DR: It is demonstrated that the revised BPNN model can be used to predict and calculate landslide deformation, speed up network learning, and improve prediction precision.
Abstract: In this paper, a modified method for landslide prediction is presented. The method is based on the back-propagation neural network (BPNN), and a combination of a genetic algorithm and a simulated annealing algorithm is used to optimize the weights and biases of the network. The improved BPNN model can capture complex nonlinear relations by learning from the available data. This paper demonstrates that the revised BPNN model can be used to predict and calculate landslide deformation, speed up network learning, and improve prediction precision. Applying this method to a landslide in the Three Gorges reservoir area demonstrates the validity and practical value of the model, and also shows that dynamic prediction of landslide deformation is crucial.
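The simulated annealing half of such a hybrid optimizer hinges on the Metropolis acceptance rule. The sketch below is an illustrative form only (the paper's exact hybridization with the genetic algorithm is not reproduced here):

```python
import math
import random

def sa_accept(delta_error, temperature, rng=random.random):
    """Metropolis rule: always keep a candidate weight vector that lowers
    the network error; keep a worse one with probability
    exp(-delta_error / T), so early (hot) iterations can escape local minima."""
    if delta_error <= 0:
        return True
    return rng() < math.exp(-delta_error / temperature)

print(sa_accept(-0.2, 1.0))  # True: the error decreased
```

As the temperature is lowered across generations, worse candidates are accepted less and less often, which is what lets the hybrid search settle into a good weight configuration.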

92 citations


Journal ArticleDOI
TL;DR: Experimental results show a significant improvement when combining the three spectra, even when using a simple classifier and feature extractor, and it is found that the three studied spectral bands contribute in nearly equal proportion to a combined system.
Abstract: In this paper, we present a new hand database called the Tecnocampus Hand Image Database, which includes right-hand palm and dorsal images. All the images have been acquired with three different sensors (visible, near-infrared and thermal). The database consists of 100 people acquired in five different acquisition sessions, with two images per session for each of the palm and dorsal sides. The total number of images is 6,000, and the database is mainly intended for hand image biometric recognition purposes. In addition, the database has been studied from the information theory point of view, and we found that the highest level of information is achieved in the thermal spectrum. Furthermore, a low level of mutual information between the different spectra is also demonstrated. This opens an interesting research field in multi-sensor data fusion.

85 citations


Journal ArticleDOI
TL;DR: Three methods are introduced to supplement the standard n-gram language model, and the results show that the adoption of common sense knowledge yields improvements in recognition performance, despite the reduced concept list employed.
Abstract: Compared to human intelligence, computers largely lack the common sense knowledge that people normally acquire during the formative years of their lives. This paper investigates the effects of employing common sense knowledge as a new linguistic context in handwritten Chinese text recognition. Three methods are introduced to supplement the standard n-gram language model: an embedding model, a direct model, and an ensemble of the two. The embedding model uses semantic similarities from common sense knowledge to make n-gram probability estimation more reliable, especially for n-grams unseen in the training text corpus. The direct model, in turn, considers the linguistic context of the whole document to make up for the short context limit of the n-gram model. The three models are evaluated on a large unconstrained handwriting database, CASIA-HWDB, and the results show that the adoption of common sense knowledge yields improvements in recognition performance, despite the reduced concept list employed.

75 citations


Journal ArticleDOI
TL;DR: This paper proposes an intelligent management scheme for renewable resources combined with a battery, implemented with a faster and simpler dynamic programming scheme that considers only one critic network and a set of optimization policies to satisfy the load demand.
Abstract: Intelligent energy management systems can reduce consumption and thus save money for consumers. The residential load demand must be met, and further advantages can be obtained if specific optimization policies are adopted. With efficient use of renewable sources and power imported from the grid, an intelligent, adaptive system that manages the battery can satisfy the load demand and minimize the overall energy cost of the scenario under study. In this paper, an adaptive dynamic programming-based algorithm is presented to handle dynamic situations in which environmental conditions or customer habits vary with time, especially when using renewable energy. Building on the idea of the smart grid, we propose an intelligent management scheme for renewable resources combined with a battery, implemented with a faster and simpler dynamic programming scheme that considers only one critic network and a set of optimization policies to satisfy the load demand. Since this class of problem allows the training of an action network to be avoided, the training loop between the two neural networks is removed and the training process is greatly simplified. Computer simulations confirm the effectiveness of this self-learning design in a typical residential scenario.

70 citations


Journal ArticleDOI
TL;DR: An efficient method for fusing the saliency maps is proposed, and the resulting fused master saliency map is shown to be a good predictor of participants’ eye positions.
Abstract: Faces play an important role in guiding visual attention, and thus, the inclusion of face detection into a classical visual attention model can improve eye movement predictions. In this study, we proposed a visual saliency model to predict eye movements during free viewing of videos. The model is inspired by the biology of the visual system and breaks down each frame of a video database into three saliency maps, each earmarked for a particular visual feature. (a) A ‘static’ saliency map emphasizes regions that differ from their context in terms of luminance, orientation and spatial frequency. (b) A ‘dynamic’ saliency map emphasizes moving regions with values proportional to motion amplitude. (c) A ‘face’ saliency map emphasizes areas where a face is detected with a value proportional to the confidence of the detection. In parallel, a behavioral experiment was carried out to record eye movements of participants when viewing the videos. These eye movements were compared with the model’s saliency maps to quantify their efficiency. We also examined the influence of center bias on the saliency maps and incorporated it into the model in a suitable way. Finally, we proposed an efficient fusion method of all these saliency maps. Consequently, the fused master saliency map developed in this research is a good predictor of participants’ eye positions.
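A minimal fusion sketch, shown only to make the idea concrete: an illustrative weighted per-pixel sum of the static, dynamic and face maps (the paper derives its own, more elaborate fusion scheme):

```python
def fuse_saliency(static_map, dynamic_map, face_map, weights=(1.0, 1.0, 1.0)):
    """Weighted per-pixel sum of the three saliency maps, normalised so
    the master map peaks at 1 (maps given here as flat lists of floats)."""
    ws, wd, wf = weights
    fused = [ws * s + wd * d + wf * f
             for s, d, f in zip(static_map, dynamic_map, face_map)]
    peak = max(fused)
    return [v / peak for v in fused] if peak > 0 else fused
```

A learned or frame-adaptive choice of `weights` is where a real model would encode, e.g., the stronger pull of faces over static contrast.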

61 citations


Journal ArticleDOI
TL;DR: A novel scheme for fast face recognition via extreme learning machine (ELM) and sparse coding is presented, which is comparable to state-of-the-art techniques at a much higher speed.
Abstract: Most face recognition approaches developed so far regard sparse coding as one of their essential means, yet sparse coding models have been hampered by extremely expensive computational cost in implementation. In this paper, a novel scheme for fast face recognition is presented via extreme learning machine (ELM) and sparse coding. The common feature hypothesis is first introduced to extract the basis function from local universal images, and then a single hidden layer feedforward network (SLFN) is established to simulate the sparse coding process for face images by the ELM algorithm. Some developments have been made to maintain the efficient inherent information embedding in ELM learning. The resulting local sparse coding coefficients are then grouped into a global representation and further fed into an ELM ensemble, composed of a number of SLFNs, for face recognition. Simulation results show that the proposed approach performs comparably to state-of-the-art techniques at a much higher speed.

57 citations


Journal ArticleDOI
TL;DR: Overall, this paper highlights some interesting and exciting research areas as well as possible synergies between different applications using biometric information and investigates the potential of utilizing biometrics beyond the presently limited field of security applications.
Abstract: The use of biometrics has been successfully applied to security applications for some time. However, the extension of other potential applications with the use of biometric information is a very recent development. This paper summarizes the field of biometrics and investigates the potential of utilizing biometrics beyond the presently limited field of security applications. There are some synergies that can be established within security-related applications. These can also be relevant in other fields such as health and ambient intelligence. This paper describes these synergies. Overall, this paper highlights some interesting and exciting research areas as well as possible synergies between different applications using biometric information.

57 citations


Journal ArticleDOI
Jing An, Qi Kang, Lei Wang, Qidi Wu
TL;DR: In this article, the authors presented a novel metaheuristic algorithm called mussels wandering optimization (MWO), inspired by mussels' leisurely locomotion behavior when they form bed patterns in their habitat.
Abstract: Over the last decade, we have encountered various complex optimization problems in the engineering and research domains. Some of them are so hard that we must turn to heuristic algorithms to obtain approximate optimal solutions. In this paper, we present a novel metaheuristic algorithm called mussels wandering optimization (MWO). MWO is inspired by mussels’ leisurely locomotion behavior when they form bed patterns in their habitat. It is an ecologically inspired optimization algorithm that mathematically formulates a landscape-level evolutionary mechanism of the distribution pattern of mussels through a stochastic decision and a Lévy walk. We obtain the optimal shape parameter μ of the movement strategy and demonstrate the algorithm’s convergence performance on eight benchmark functions. MWO shows competitive performance compared with four existing metaheuristics, providing a new approach for solving complex optimization problems.
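The Lévy-walk component can be sketched with the standard heavy-tailed (Pareto) step-length draw. This is an assumed textbook form for illustration only; the paper's exact sampling scheme and the precise role of the shape parameter μ may differ:

```python
import random

def levy_step(mu, min_step=1.0, rng=random.random):
    """Heavy-tailed step length for a Levy walk: mostly short moves with
    occasional long relocations, governed by the shape exponent mu."""
    u = 1.0 - rng()  # uniform in (0, 1], avoids u == 0
    return min_step * u ** (-1.0 / (mu - 1.0))
```

Small μ makes long relocations more frequent (better exploration), while large μ keeps the mussels near their current patch (better exploitation) — which is why tuning μ matters for convergence.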

46 citations


Journal ArticleDOI
TL;DR: The Fractal Dimension of the observed time series is combined with the traditional MFCCs in the feature vector in order to enhance the performance of two different ASR systems.
Abstract: Mel frequency cepstral coefficients (MFCCs) are a standard tool for automatic speech recognition (ASR), but they fail to capture part of the dynamics of speech. The nonlinear nature of speech suggests that extra information provided by some nonlinear features could be especially useful when training data are scarce or when the ASR task is very complex. In this paper, the Fractal Dimension of the observed time series is combined with the traditional MFCCs in the feature vector in order to enhance the performance of two different ASR systems. The first is a simple system of digit recognition in Chinese, with very few training examples, and the second is a large vocabulary ASR system for Broadcast News in Spanish.

31 citations


Journal ArticleDOI
TL;DR: It is shown how a model of human motivation can play an important role in understanding human moral judgment, which opens the door for new possibilities in developing theories of moral judgment.
Abstract: The topic of moral judgment is of interest to many fields. Studying moral judgment based on currently available computational intelligence methods and techniques may shed new light on the matter. This article reviews computational models of moral judgment on the basis of the CLARION cognitive architecture, emphasizing a motivation-based approach. It attempts to integrate the understanding of moral judgment with other areas of cognition. In particular, it shows how a model of human motivation can play an important role in understanding human moral judgment, which opens the door for new possibilities in developing theories of moral judgment.

Journal ArticleDOI
TL;DR: This paper presents a two-stage asynchronous protocol for a steady-state visual-evoked potential-based BCI, estimating a threshold using canonical correlation analysis coefficients in synchronous mode and combining it with a sliding window strategy to continuously detect the mental state of the user.
Abstract: Asynchronous brain–computer interface (BCI) systems are more practicable than synchronous ones in real-world applications. A key challenge in asynchronous BCI design is to discriminate intentional control (IC) and non-intentional control (NC) states. In this paper, we present a two-stage asynchronous protocol for a steady-state visual-evoked potential-based BCI. First, we estimate a threshold using canonical correlation analysis coefficients in synchronous mode; then, we combine it with a sliding window strategy to continuously detect the mental state of the user. If the current state is judged as an IC state, the system outputs a command. Our results show that the average positive predictive value of the system is 77.06 % and that its average false-positive rates in the NC and IC states are 2.37 % and 12.05 %, respectively.
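The second stage can be sketched as a sliding-window check against the threshold learned in synchronous mode. This is a simplified reading: the window length and the all-scores-above-threshold rule below are illustrative, not the paper's exact decision logic:

```python
def detect_ic(cca_scores, threshold, window=3):
    """Flag an intentional-control (IC) state only when every canonical
    correlation score inside the sliding window exceeds the threshold;
    otherwise the user is assumed to be in the NC state."""
    return [all(s > threshold for s in cca_scores[i:i + window])
            for i in range(len(cca_scores) - window + 1)]

print(detect_ic([0.1, 0.9, 0.8, 0.7, 0.2], 0.5))  # [False, True, False]
```

Requiring several consecutive supra-threshold scores is what keeps the false-positive rate low in the NC state, at the cost of a short output delay.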

Journal ArticleDOI
TL;DR: A novel technique for characterizing hypernasal vowels and words using nonlinear dynamics is presented considering different complexity measures that are mainly based on the analysis of the time-delay embedded space.
Abstract: A novel technique for characterizing hypernasal vowels and words using nonlinear dynamics is presented, considering different complexity measures that are mainly based on the analysis of the time-delay embedded space. After the characterization stage, feature selection is performed by means of two different strategies: principal component analysis and sequential floating feature selection. The final decision about the presence or absence of hypernasality is made using a soft-margin support vector machine. The database used in the study is composed of the five Spanish vowels uttered by 266 children, 110 healthy and 156 labeled as hypernasal by an experienced voice therapist. The database also includes the words /coco/ and /gato/ uttered by 119 children, 65 of whom were diagnosed as hypernasal and the remaining 54 as healthy. The results are presented in terms of accuracy, sensitivity and specificity. ROC curves are also included as a widely accepted way to measure the performance of a detection system. The experiments show that the proposed methodology achieves an accuracy of up to 92.08 % using, together, the best subset of features extracted from every vowel, and 89.09 % using the combination of the most relevant features in the case of words.

Journal ArticleDOI
TL;DR: It is hypothesized that some of the underlying neurological mechanisms affecting phonation produce observable correlates in vocal fold biomechanics and that these correlates behave differently in neurological diseases than in organic pathologies.
Abstract: The dramatic impact of degenerative neurological pathologies on quality of life is a growing concern nowadays. Many techniques have been designed for the detection, diagnosis, and monitoring of neurological disease. Most of them are too expensive or complex to be used by primary care medical services. On the other hand, it is well known that many neurological diseases leave a signature in voice and speech. In the present paper, a new method to trace some neurological diseases at the level of phonation is presented. In this way, the detection and grading of a neurological disease could be based on a simple voice test. The methodology benefits from advances achieved during recent years in detecting and grading organic pathologies in phonation. The paper hypothesizes that some of the underlying neurological mechanisms affecting phonation produce observable correlates in vocal fold biomechanics, and that these correlates behave differently in neurological diseases than in organic pathologies. A general description of the main hypotheses involved and their validation by acoustic voice analysis based on biomechanical correlates of neurological disease is given. The validation is carried out on a balanced database of normal and organic dysphonic patients of both genders. Selected study cases are presented to illustrate the possibilities offered by this methodology.

Journal ArticleDOI
TL;DR: Aspects of brain processing are shown on how visual perception, recognition, attention, cognitive control, value attribution, decision-making, affordances and action can be melded together in a coherent cognitive control architecture of the perception–action cycle for visually guided reaching and grasping of objects by a robot or an agent.
Abstract: We show aspects of brain processing on how visual perception, recognition, attention, cognitive control, value attribution, decision-making, affordances and action can be melded together in a coherent manner in a cognitive control architecture of the perception–action cycle for visually guided reaching and grasping of objects by a robot or an agent. The work is based on the notion that separate visuomotor channels are activated in parallel by specific visual inputs and are continuously modulated by attention and reward, which control a robot’s/agent’s action repertoire. The suggested visual apparatus allows the robot/agent to recognize both the object’s shape and location, extract affordances and formulate motor plans for reaching and grasping. A focus-of-attention signal plays an instrumental role in selecting the correct object in its corresponding location, as well as in selecting the most appropriate arm-reaching and hand-grasping configuration from a list of alternatives based on the success of previous experiences. The cognitive control architecture consists of a number of neurocomputational mechanisms heavily supported by experimental brain evidence: spatial saliency, object selectivity, invariance to object transformations, focus of attention, resonance, motor priming, spatial-to-joint direction transformation and volitional scaling of movement.

Journal ArticleDOI
TL;DR: This article describes the first developments in relation to the learning and reasoning capabilities of DARWIN robots and shows, through the resulting behaviors of the robot, that from a computational viewpoint the biological inspiration plays a central role in facilitating “functional segregation and global integration,” thus endowing the cognitive architecture with “small-world” properties.
Abstract: In Professor Taylor’s own words, the most striking feature of any cognitive system is its ability to “learn and reason” cumulatively throughout its lifetime, the structure of its inferences both emerging from and constrained by the structure of its bodily experiences. Understanding the computational/neural basis of embodied intelligence by reenacting the “developmental learning” process in cognitive robots, and in turn endowing them with primitive capabilities to learn, reason and survive in “unstructured” environments (domestic and industrial), is the vision of the EU-funded DARWIN project, one of the last adventures Prof. Taylor embarked upon. This journey is about a year old at present, and our article describes the first developments in relation to the learning and reasoning capabilities of DARWIN robots. The novelty in the computational architecture stems from the incorporation of recent ideas, firstly from the field of “connectomics”, which attempts to explore the large-scale organization of the cerebral cortex, and secondly from recent functional imaging and behavioral studies in support of the embodied simulation hypothesis. We show through the resulting behaviors of the robot that, from a computational viewpoint, the former biological inspiration plays a central role in facilitating “functional segregation and global integration,” thus endowing the cognitive architecture with “small-world” properties. The latter, on the other hand, promotes the incessant interleaving of “top-down” and “bottom-up” information flows (that share computational/neural substrates), hence allowing learning and reasoning to “cumulatively” drive each other. How the robot learns about “objects” and simulates perception, and how it learns about “action” and simulates action (in this case learning to “push”, which follows pointing, reaching and grasping behaviors), are used to illustrate the central ideas.
Finally, an example of how simulation of perception and action leads the robot to reason about how its world can change such that it becomes a little more conducive toward realization of its internal goal (an assembly task) is used to describe how “object,” “action,” and “body” meet in the DARWIN architecture and how inference emerges through embodied simulation.

Journal ArticleDOI
TL;DR: Investigation of low-variance multitaper spectrum estimation methods to compute the mel-frequency cepstral coefficient (MFCC) features for robust speech and speaker recognition systems shows that the multitaper methods perform better than the Hamming-windowed spectrum estimation method.
Abstract: In this paper, we investigate low-variance multitaper spectrum estimation methods to compute the mel-frequency cepstral coefficient (MFCC) features for robust speech and speaker recognition systems. In speech and speaker recognition, MFCC features are usually computed from a single-tapered (e.g., Hamming window) direct spectrum estimate, that is, the squared magnitude of the Fourier transform of the observed signal. Compared with the periodogram, a power spectrum estimate that uses a smooth window function, such as the Hamming window, can reduce spectral leakage. Windowing may help to reduce spectral bias, but variance often remains high. A multitaper spectrum estimation method that uses well-selected tapers can gain from the bias-variance trade-off, giving an estimate that has small bias compared with a single-taper spectrum estimate but substantially lower variance. Speech recognition and speaker verification experimental results on the AURORA-2 and AURORA-4 corpora and the NIST 2010 speaker recognition evaluation corpus (telephone as well as microphone speech), respectively, show that the multitaper methods perform better than the Hamming-windowed spectrum estimation method. In a speaker verification task, compared with the Hamming window technique, the sinusoidal weighted cepstrum estimator, multi-peak, and Thomson multitaper techniques provide a relative improvement of 20.25, 18.73, and 12.83 %, respectively, in equal error rate.
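A minimal sketch of a multitaper estimate using sinusoidal (sine) tapers with uniform weights; the paper additionally evaluates multi-peak and Thomson tapers and weighted combinations, which this sketch does not cover:

```python
import numpy as np

def multitaper_psd(x, num_tapers=6):
    """Average the squared magnitudes of several independently tapered
    FFTs; each sine taper trades a little bias for a large reduction in
    variance relative to a single Hamming-windowed periodogram."""
    n = len(x)
    k = np.arange(1, num_tapers + 1)[:, None]   # taper index
    t = np.arange(1, n + 1)[None, :]            # sample index
    tapers = np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * k * t / (n + 1))
    eigenspectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    return eigenspectra.mean(axis=0)
```

MFCCs would then be computed as usual on top of this spectrum: mel filterbank, log, and DCT.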

Journal ArticleDOI
TL;DR: This paper revisits the work of John G Taylor on neural ‘bubble’ dynamics in two-dimensional neural field models and makes use of the fact that mathematical treatments are much simpler when the firing rate function is chosen to be a Heaviside.
Abstract: In this paper, we revisit the work of John G Taylor on neural ‘bubble’ dynamics in two-dimensional neural field models. This builds on original work of Amari in a one-dimensional setting and makes use of the fact that mathematical treatments are much simpler when the firing rate function is chosen to be a Heaviside. In this case, the dynamics of an excited or active region, defining a ‘bubble’, reduce to the dynamics of the boundary. The focus of John’s work was on the properties of radially symmetric ‘bubbles’, including existence and radial stability, with applications to the theory of topographic map formation in self-organising neural networks. As well as reviewing John’s work in this area, we also include some recent results that treat more general classes of perturbations.

Journal ArticleDOI
TL;DR: The segmentation is used as a region of interest in JPEG2000 compression to achieve superior image quality over the automatically selected salient image regions while reducing the image file size to as little as 25 % of the original.
Abstract: Image feature point algorithms and their associated regional descriptors can be viewed as primitive detectors of visually salient information. In this paper, a new method for constructing a visual attention probability map using features is proposed. (Throughout this work we use SURF features, yet the algorithm is not limited to SURF alone.) This technique is validated using comprehensive human eye-tracking experiments. We call this algorithm “visual interest” (VI), since the resultant segmentation reveals image regions that are visually salient during the performance of multiple observer search tasks. We demonstrate that it works on generic, eye-level photographs and is not dependent on heuristic tuning. We further show that the descriptor-matching property of the SURF feature points can be exploited via object recognition to modulate the context of the attention probability map for a given object search task, refining the salient area. We fully validate the VI algorithm by applying it to salient compression using a pre-blur of non-salient regions prior to JPEG and conducting comprehensive observer performance tests. When using the object contextualisation, we conclude that JPEG files are around 33 % larger than they need to be to fully represent the task-relevant information within them. We finally demonstrate the utility of the segmentation as a region of interest in JPEG2000 compression to achieve superior image quality (measured statistically using PSNR and SSIM) over the automatically selected salient image regions while reducing the image file size to as little as 25 % of the original. Our technique therefore delivers superior compression performance through the detection and selective preservation of visually salient information relevant to multiple observer tasks. In contrast to the state of the art in task-directed visual attention models, the VI algorithm reacts only to the image content and requires no detailed prior knowledge of the scene or of the ultimate observer task.

Journal ArticleDOI
TL;DR: An advanced particle swarm optimization algorithm is proposed to solve multi-modal function optimization problems and an artificial repulsive potential field on local search space is set up to prevent multiple swarms converging to the same areas.
Abstract: In this paper, an advanced particle swarm optimization (PSO) algorithm is proposed to solve multi-modal function optimization problems. Multiple swarms are used for parallel search, and an artificial repulsive potential field over the local search space is set up to prevent multiple swarms from converging to the same areas. In addition, this paper provides a theoretical analysis of the multi-swarm parallel search strategy. Finally, the proposed algorithm has been tested on three benchmark functions, and the results show superior performance compared with other PSO variants.
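The artificial repulsive potential can be sketched as a short-range force between a particle and a rival swarm's centre. This is an illustrative form only; the paper's exact potential field is not reproduced here:

```python
import math

def repulsive_velocity(particle, other_center, strength=1.0, radius=2.0):
    """Zero outside `radius`; inside, a push directed away from the other
    swarm's centre that grows as the distance shrinks, discouraging two
    swarms from settling on the same optimum."""
    diff = [p - c for p, c in zip(particle, other_center)]
    d = math.sqrt(sum(v * v for v in diff))
    if d == 0.0 or d >= radius:
        return [0.0] * len(particle)
    scale = strength * (1.0 / d - 1.0 / radius) / d
    return [scale * v for v in diff]
```

In a multi-swarm PSO loop, such a term would simply be added to each particle's usual inertia + cognitive + social velocity update.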

Journal ArticleDOI
TL;DR: The expectation maximization algorithm for learning a multivariate Gaussian mixture model and a multiple kernel density estimator based on the propensity scores are proposed to avoid listwise deletion or mean imputation for solving classification tasks with incomplete data.
Abstract: In this paper, we address the Bayesian classification with incomplete data. The common approach in the literature is to simply ignore the samples with missing values or impute missing values before classification. However, these methods are not effective when a large portion of the data have missing values and the acquisition of samples is expensive. Motivated by these limitations, the expectation maximization algorithm for learning a multivariate Gaussian mixture model and a multiple kernel density estimator based on the propensity scores are proposed to avoid listwise deletion (LD) or mean imputation (MI) for solving classification tasks with incomplete data. We illustrate the effectiveness of our proposed algorithms on some artificial and benchmark UCI data sets by comparing with LD and MI methods. We also apply these algorithms to solve the practical classification tasks on the lithology identification of hydrothermal minerals and license plate character recognition. The experimental results demonstrate their good performance with high classification accuracies.
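The two baseline strategies the paper argues against are simple to state; a minimal sketch with `None` marking a missing entry:

```python
def listwise_delete(rows):
    """Listwise deletion (LD): drop every sample with a missing value."""
    return [r for r in rows if None not in r]

def mean_impute(rows):
    """Mean imputation (MI): fill each missing entry with the mean of the
    observed values in its column."""
    cols = list(zip(*rows))
    means = [sum(v for v in c if v is not None) /
             sum(1 for v in c if v is not None) for c in cols]
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]
```

When many rows contain gaps, LD discards most of the (expensive) data and MI distorts the feature distributions, which is exactly the regime where the proposed EM and propensity-score kernel density estimators are claimed to help.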

Journal ArticleDOI
TL;DR: A method of modified K-means-based support vector machine (SVM) classification is proposed to use a hybrid sample selection that leverages the informativeness and representativeness of training samples to classify real multi/hyperspectral images.
Abstract: The definition of valuable training samples and automatic classification of land cover with remote sensing data are both classical problems, which are known to be difficult and have attracted major research efforts. In this paper, a method of modified K-means-based support vector machine (SVM) classification is proposed to use a hybrid sample selection that leverages the informativeness and representativeness of training samples to classify real multi/hyperspectral images. The hybrid sample selection (close-to-cluster-border sampling and near-cluster-center sampling) is constructed on the reduced convex hulls (RCHs) of clustering structure and can reduce the risk of overtraining caused by active sample selection of active learning methods. Numerical results obtained on the classification of three challenging remote sensing images (Landsat-7 ETM+, AVIRIS Indian pines, and KSC) by comparing the proposed technique with random sampling (RS) and margin sampling (MS) demonstrate the good efficiency and high accuracy of our approach.

Journal ArticleDOI
TL;DR: An advanced real-time speech processing front-end, aimed at automatically reducing the distortions introduced by room reverberation in distant speech signals while also considering the presence of background noise, is proposed to achieve a significant improvement in speech quality for each speaker.
Abstract: This paper deals with speech enhancement in noisy reverberated environments where multiple speakers are active. The authors propose an advanced real-time speech processing front-end aimed at automatically reducing the distortions introduced by room reverberation in distant speech signals, also considering the presence of background noise, and thus achieving a significant improvement in speech quality for each speaker. The overall framework is composed of three cooperating blocks, each one fulfilling a specific task: speaker diarization, room impulse response identification and speech dereverberation. In particular, the speaker diarization algorithm pilots the operations performed in the other two algorithmic stages, which have been suitably designed and parametrized to operate with noisy speech observations. Extensive computer simulations have been performed using a subset of the AMI database under different realistic noisy and reverberated conditions. The obtained results show the effectiveness of the approach.

Journal ArticleDOI
TL;DR: A novel bio-inspired approach aiming at significantly reducing motion chattering phenomena inherent with traditional methods is presented, motivated by two famous Chinese sayings “haste does not bring success” and “ride softly then you may get home sooner”.
Abstract: Wheeled mobile robots (WMRs) have gained wide application in civilian and military fields. Smooth and stable motion of a WMR is crucial not only for enhancing control accuracy and facilitating mission completion, but also for reducing mechanical wear and tear. In this paper, we present a novel bio-inspired approach aimed at significantly reducing the motion chattering phenomena inherent in traditional methods. The main idea of the proposed smooth motion controller is motivated by two famous Chinese sayings, “haste does not bring success” and “ride softly then you may get home sooner”, which inspire pre-processing the speed commands with the help of fuzzy rules to generate movement more favorable for the actuation device, so as to effectively avoid the jitter problem that has not yet been adequately solved by traditional methods. Detailed formulas and algorithms are derived with consideration of the kinematics and dynamics of the WMR. Smooth and asymptotically stable tracking of the WMR along the desired position and orientation is ensured, and a real-time experiment demonstrates the effectiveness and simplicity of the proposed method.
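The command pre-processing idea can be sketched as follows (this is not the paper's actual rule base, whose membership functions and gains are not given here; speeds are assumed normalized): fuzzy rules map the magnitude of the speed error to a bounded increment, so the commanded speed ramps smoothly instead of jumping.

```python
def mu_small(m):  return max(0.0, 1.0 - m / 0.5)              # |error| near zero
def mu_medium(m): return max(0.0, min(m / 0.5, (1.0 - m) / 0.5))
def mu_large(m):  return max(0.0, min((m - 0.5) / 0.5, 1.0))  # |error| >= 1

def fuzzy_step(v_cmd, v_cur, max_step=0.2):
    """One control tick: Sugeno-style rule blending maps the speed-error
    magnitude to a bounded increment (gentle near the target, faster far away)."""
    e = v_cmd - v_cur
    m = abs(e)
    s, md, lg = mu_small(m), mu_medium(m), mu_large(m)
    # rule consequents: fractions of the maximum allowed increment
    frac = (0.1 * s + 0.5 * md + 1.0 * lg) / (s + md + lg)
    step = min(max_step * frac, m)          # never overshoot the target speed
    return v_cur + step if e >= 0 else v_cur - step
```

Iterating `fuzzy_step` toward a unit step command yields a smooth ramp whose per-tick change never exceeds `max_step`, which is exactly the chattering-avoidance behavior described above.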

Journal ArticleDOI
TL;DR: It is demonstrated that a previously established neural network model of hippocampal region CA3 contains a mechanism that could explain discounting in downstream reward-prediction systems (e.g., basal ganglia) at a rate that is consistent with hyperbolic discounting.
Abstract: Decision-making often requires taking into consideration immediate gains as well as delayed rewards. Studies of behavior have established that anticipated rewards are discounted according to a decreasing hyperbolic function. Although mathematical explanations for reward delay discounting have been offered, little has been proposed in terms of the neural network mechanisms underlying discounting. There has been much recent interest in the potential role of the hippocampus. Here we demonstrate that a previously established neural network model of hippocampal region CA3 contains a mechanism that could explain discounting in downstream reward-prediction systems (e.g., basal ganglia). As part of its normal function, the model forms codes for stimuli that are similar to the codes of future, predicted stimuli. This similarity provides a means for reward predictions associated with future stimuli to influence current decision-making. Simulations show that this "predictive similarity" decreases as the stimuli are separated in time, at a rate that is consistent with hyperbolic discounting.
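The hyperbolic form the behavioral studies refer to is V = A / (1 + kD) for amount A, delay D, and discount rate k. A few lines suffice to show its signature "preference reversal", which an exponential discounter cannot produce (the amounts, delays, and k below are arbitrary illustrative choices):

```python
def hyperbolic_value(amount, delay, k=0.1):
    """Hyperbolically discounted value V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

# Smaller-sooner reward (50) vs. larger-later reward (100, 20 time units further away).
def prefers_smaller(front_delay, k=0.1):
    return hyperbolic_value(50, front_delay, k) > hyperbolic_value(100, front_delay + 20, k)
```

When both rewards are far away the larger one wins, but as the pair draws near, the smaller-sooner reward overtakes it, matching the reversal pattern observed in behavioral studies.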

Journal ArticleDOI
TL;DR: The role of distributed input bias is studied, and methods for learning the input patterns of coupled neural oscillatory arrays are developed; the results correspond to Freeman’s 6th building block of neurodynamics.
Abstract: We analyze spatio-temporal dynamics of coupled neural oscillatory arrays. The interconnected oscillators can produce a wide range of dynamics, including quasi-periodic limit cycles, chaotic waveforms, and intermittent chaotic oscillations. We study the role of distributed input bias and develop methods for learning the input patterns. After learning, the coupled oscillators produce large-scale synchronized, narrow-band oscillations in response to the learned patterns. We study patterns of amplitude modulations that span the whole lattice graph. The presented results correspond to Freeman’s 6th building block of neurodynamics.
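The large-scale synchronization described here can be illustrated, in a much-reduced form unrelated to the authors' specific lattice model, with the classic Kuramoto model of coupled phase oscillators: above a critical coupling strength the population phase-locks and the order parameter r approaches 1.

```python
import numpy as np

def kuramoto(theta0, omega, K, steps=2000, dt=0.01):
    """Euler integration of d(theta_i)/dt = omega_i + (K/N) * sum_j sin(theta_j - theta_i)."""
    theta = theta0.copy()
    n = len(theta)
    for _ in range(steps):
        diffs = theta[None, :] - theta[:, None]   # diffs[i, j] = theta_j - theta_i
        theta = theta + dt * (omega + (K / n) * np.sin(diffs).sum(axis=1))
    return theta

def order_parameter(theta):
    """r = |mean(exp(i*theta))|: near 0 for scattered phases, 1 for full synchrony."""
    return abs(np.exp(1j * theta).mean())
```

With zero coupling the phases stay scattered; with strong coupling the same population produces the kind of synchronized, narrow-band collective oscillation the abstract describes.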

Journal ArticleDOI
TL;DR: This study is concerned with several modifications to the expectation-maximization-based algorithm, which iteratively estimates the mixing and source parameters by considering a locally smooth temporal and frequency structure in the source power spectrograms.
Abstract: Convolutive and under-determined blind audio source separation from noisy recordings is a challenging problem. Several computational strategies have been proposed to address it. This study is concerned with several modifications to the expectation-maximization-based algorithm, which iteratively estimates the mixing and source parameters. That strategy models each entry of a source spectrogram using superimposed Gaussian components that are mutually and individually independent across frequency and time bins, an assumption that ignores the local smoothness of real spectrograms. In our approach, we address this by imposing a locally smooth temporal and frequency structure on the source power spectrograms. Local smoothness is enforced by incorporating a Gibbs prior into the complete-data likelihood function, which models the interactions between neighboring spectrogram bins using a Markov random field. Simulations using audio files derived from the stereo audio source separation evaluation campaign 2008 demonstrate the high efficiency of the proposed improvement.
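The smoothness prior can be made concrete with a toy energy function: a Gaussian-style MRF over the log-power spectrogram that charges a cost for every difference between 4-connected time-frequency neighbors (lower energy means smoother, hence higher prior probability). This is an illustrative sketch of the idea, not the authors' exact potential.

```python
import numpy as np

def gibbs_energy(P, beta=1.0, eps=1e-12):
    """MRF energy of a power spectrogram P (freq x time): sum of squared
    log-power differences over 4-connected neighboring bins."""
    L = np.log(P + eps)
    e_time = ((L[:, 1:] - L[:, :-1]) ** 2).sum()   # horizontal (time) cliques
    e_freq = ((L[1:, :] - L[:-1, :]) ** 2).sum()   # vertical (frequency) cliques
    return beta * (e_time + e_freq)
```

In the EM iterations, a term of this kind (negated and weighted by beta) would be added to the complete-data log-likelihood, biasing the parameter estimates toward locally smooth source spectrograms.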

Journal ArticleDOI
TL;DR: New auditory-inspired speech processing methods are presented, combining spectral subtraction and two-dimensional non-linear filtering techniques originally conceived for image processing purposes, to improve recognition rates in a noise-contaminated version of the Isolet database.
Abstract: New auditory-inspired speech processing methods are presented in this paper, combining spectral subtraction with two-dimensional non-linear filtering techniques originally conceived for image processing purposes. In particular, mathematical morphology operations, such as erosion and dilation, are applied to noisy speech spectrograms using specifically designed structuring elements inspired by the masking properties of the human auditory system. This is effectively complemented by a pre-processing stage comprising the conventional spectral subtraction procedure and auditory filterbanks. These methods were tested on both speech enhancement and automatic speech recognition tasks. For the former, time-frequency anisotropic structuring elements over grey-scale spectrograms were found to provide better perceptual quality than isotropic ones, proving more appropriate, under a number of perceptual quality estimation measures and at several signal-to-noise ratios on the Aurora database, for retaining the structure of speech while removing background noise. For the latter, the combination of spectral subtraction and auditory-inspired morphological filtering was found to improve recognition rates on a noise-contaminated version of the Isolet database.
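To make the morphological operations concrete, here is a small numpy sketch of grey-scale erosion, dilation, and their composition (opening) with a flat structuring element, plus a conventional spectral subtraction step: an opening removes spectrogram peaks narrower than the element while leaving broad speech-like regions intact. The anisotropic, auditory-masking-shaped elements of the paper would simply be different boolean masks.

```python
import numpy as np

def grey_erode(S, se):
    """Grey-scale erosion: minimum over the structuring element's footprint."""
    h, w = se.shape
    padded = np.pad(S, ((h // 2, h // 2), (w // 2, w // 2)), mode="edge")
    out = np.empty_like(S)
    for i in range(S.shape[0]):
        for j in range(S.shape[1]):
            out[i, j] = padded[i:i + h, j:j + w][se].min()
    return out

def grey_dilate(S, se):
    """Grey-scale dilation: maximum over the structuring element's footprint."""
    h, w = se.shape
    padded = np.pad(S, ((h // 2, h // 2), (w // 2, w // 2)), mode="edge")
    out = np.empty_like(S)
    for i in range(S.shape[0]):
        for j in range(S.shape[1]):
            out[i, j] = padded[i:i + h, j:j + w][se].max()
    return out

def morph_open(S, se):
    """Opening = erosion then dilation: suppresses isolated narrow peaks."""
    return grey_dilate(grey_erode(S, se), se)

def spectral_subtract(S, noise, floor=0.05):
    """Conventional spectral subtraction with a spectral floor."""
    return np.maximum(S - noise, floor * S)
```

Applied to a magnitude spectrogram, `spectral_subtract` strips the stationary noise estimate, and the opening then removes residual isolated noise peaks smaller than the structuring element.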

Journal ArticleDOI
TL;DR: This paper presents a new time-frequency approach to the underdetermined blind source separation using the parallel factor decomposition of third-order tensors that can directly separate the sources as long as the uniqueness condition of parallel factors decomposition is satisfied.
Abstract: This paper presents a new time-frequency approach to underdetermined blind source separation using the parallel factor decomposition of third-order tensors. Without any constraint on the number of active sources at an auto-term time-frequency point, this approach can directly separate the sources as long as the uniqueness condition of the parallel factor decomposition is satisfied. Compared with the existing two-stage methods, in which the mixing matrix must first be estimated and then used to recover the sources, our approach yields better source separation performance in the presence of noise. Moreover, the mixing matrix can be estimated during the source separation process itself. Numerical simulations are presented to show the superior performance of the proposed approach over some of the existing two-stage blind source separation methods that likewise use the time-frequency representation.
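For readers unfamiliar with it, the parallel factor (CP/PARAFAC) decomposition writes a third-order tensor as a sum of R rank-one terms, T[i,j,k] ≈ Σ_r A[i,r] B[j,r] C[k,r]; in the BSS setting one factor carries the mixing matrix and the others the source signatures. A generic alternating-least-squares fit (not the authors' algorithm) can be written compactly:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: shape (T.shape[mode], product of the remaining dims)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker product of (m, R) and (n, R) -> (m*n, R)."""
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

def cp_als(T, rank, iters=200, seed=0):
    """Alternating least squares for T[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r]."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((s, rank)) for s in T.shape)
    for _ in range(iters):
        A = unfold(T, 0) @ np.linalg.pinv(khatri_rao(B, C).T)
        B = unfold(T, 1) @ np.linalg.pinv(khatri_rao(A, C).T)
        C = unfold(T, 2) @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C

def cp_reconstruct(A, B, C):
    """Rebuild the tensor from its CP factors."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

When the uniqueness condition holds, the recovered factors match the true ones up to the usual permutation and scaling ambiguities, which is what allows the sources to be read off directly.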

Journal ArticleDOI
TL;DR: Tests showed that nonlinear features and MFCCs are weakly correlated on sustained speech, but uncorrelated on continuous speech, and suggest the existence of nonlinear effects in OSA patients’ voices, which should be found in continuous speech.
Abstract: We present a novel approach for the detection of severe obstructive sleep apnea (OSA) based on patients’ voices, introducing nonlinear measures to describe sustained speech dynamics. The nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over a cepstral parameterization (MFCCs) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10 % relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33 % when fusing the nonlinear features with both the sustained and continuous MFCC-GMM systems. Accuracy reached 88.5 %, allowing the system to be used for early OSA detection. Tests showed that the nonlinear features and MFCCs are weakly correlated on sustained speech, but uncorrelated on continuous speech. The results also suggest the existence of nonlinear effects in OSA patients’ voices, which should be found in continuous speech.
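The MFCC-GMM scoring used as the baseline can be sketched with a diagonal-covariance GMM fitted by EM: one model per class, and a test utterance is assigned to the class whose model gives the higher total frame log-likelihood. This is a generic illustration with synthetic features, not the paper's trained system.

```python
import numpy as np

def component_logpdf(X, pi, mu, var):
    """log(pi_k) + log N(x | mu_k, diag(var_k)) for every frame/component pair."""
    sq = ((X[:, None, :] - mu[None]) ** 2 / var[None]).sum(-1)
    return np.log(pi) - 0.5 * (sq + np.log(2 * np.pi * var).sum(-1))

def fit_gmm(X, k=2, iters=50, seed=0):
    """EM for a diagonal-covariance Gaussian mixture model."""
    rng = np.random.default_rng(seed)
    n = len(X)
    pi = np.full(k, 1.0 / k)
    mu = X[rng.choice(n, k, replace=False)].astype(float)
    var = np.tile(X.var(0) + 1e-6, (k, 1))
    for _ in range(iters):
        lp = component_logpdf(X, pi, mu, var)          # E-step
        lp -= lp.max(1, keepdims=True)
        r = np.exp(lp)
        r /= r.sum(1, keepdims=True)                   # responsibilities
        nk = r.sum(0) + 1e-12                          # M-step
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X ** 2) / nk[:, None] - mu ** 2 + 1e-6
    return pi, mu, var

def loglik(X, model):
    """Total frame log-likelihood under the mixture (log-sum-exp over components)."""
    lp = component_logpdf(X, *model)
    m = lp.max(1, keepdims=True)
    return float((m[:, 0] + np.log(np.exp(lp - m).sum(1))).sum())
```

The fused system in the abstract would combine such per-class likelihood scores with the nonlinear sustained-speech measures before making the final OSA/control decision.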