
Showing papers by "Paul Sajda published in 2010"


Journal ArticleDOI
04 Mar 2010
TL;DR: Describes efforts in developing brain-computer interfaces (BCIs) that synergistically integrate computer vision and human vision to construct a system for image triage, and presents two architectures for this type of cortically coupled computer vision.
Abstract: Our society's information technology advancements have resulted in the increasingly problematic issue of information overload, i.e., we have more access to information than we can possibly process. This is nowhere more apparent than in the volume of imagery and video that we can access on a daily basis, whether for the general public, with the availability of YouTube video and Google Images, or for the image analysis professional tasked with searching security video or satellite reconnaissance. Deciding which images to look at, and ensuring we see the images that are of most interest to us, raises the question of whether there are smart ways to triage this volume of imagery. Over the past decade, computer vision research has focused on the issue of ranking and indexing imagery. However, computer vision is limited in its ability to identify interesting imagery, particularly as "interesting" might be defined by an individual. In this paper we describe our efforts in developing brain-computer interfaces (BCIs) which synergistically integrate computer vision and human vision so as to construct a system for image triage. Our approach exploits machine learning for real-time decoding of brain signals which are recorded noninvasively via electroencephalography (EEG). The signals we decode are specific for events related to imagery attracting a user's attention. We describe two architectures we have developed for this type of cortically coupled computer vision and discuss potential applications and challenges for the future.
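The triage loop the paper describes reduces, at its core, to scoring each single-trial EEG epoch with a learned decoder and re-ranking images by that score. A minimal sketch of that step, assuming epochs have already been segmented per image; the array shapes, random data, and the logistic-regression decoder are illustrative stand-ins, not the authors' exact pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Illustrative data: 500 images, one EEG epoch per image,
# 64 channels x 50 time samples, flattened into a feature vector.
n_images, n_channels, n_samples = 500, 64, 50
epochs = rng.standard_normal((n_images, n_channels * n_samples))
labels = rng.integers(0, 2, n_images)  # 1 = image attracted attention

# Train a linear decoder on a labeled calibration session ...
decoder = LogisticRegression(max_iter=1000).fit(epochs, labels)

# ... then triage a new image stream: score each epoch and present
# images to the analyst in order of decreasing predicted interest.
scores = decoder.predict_proba(epochs)[:, 1]
triage_order = np.argsort(-scores)
print("top-10 images for review:", triage_order[:10])
```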

151 citations


Journal ArticleDOI
TL;DR: It is concluded that most microstructural and mechanical properties of the distal tibia can be derived efficiently from µMR images and can provide additional information regarding bone quality.
Abstract: Micro magnetic resonance imaging (µMRI) is an in vivo imaging method that permits 3D quantification of cortical and trabecular bone microstructure. µMR images can also be used for building microstructural finite element (µFE) models to assess bone stiffness, which highly correlates with bone's resistance to fracture. In order for µMRI-based microstructural and µFE analyses to become standard clinical tools for assessing bone quality, validation with a current gold standard, namely high-resolution micro computed tomography (µCT), is required. Microstructural measurements of 25 human cadaveric distal tibias were performed on the registered µMR and µCT images. Next, whole bone stiffness, trabecular bone stiffness, and elastic moduli of cubic subvolumes of trabecular bone in both µMR and µCT images were determined by voxel-based µFE analysis. The bone volume fraction (BV/TV), trabecular number (Tb.N*), trabecular spacing (Tb.Sp*), cortical thickness (Ct.Th), and structure model index (SMI) based on µMRI showed strong correlations with µCT measurements (r2 = 0.67 to 0.97), and bone surface-to-volume ratio (BS/BV), connectivity density (Conn.D), and degree of anisotropy (DA) had significant but moderate correlations (r2 = 0.33 to 0.51). Each of these measurements also contributed to one or more of the µFE-predicted mechanical properties. However, model-independent trabecular thickness (Tb.Th*) based on µMRI had no correlation with the µCT measurement and did not contribute to any mechanical measurement. Furthermore, the whole bone and trabecular bone stiffness based on µMRI were highly correlated with those of µCT images (r2 = 0.86 and 0.96), suggesting that µMRI-based µFE analyses can directly and accurately quantify whole bone mechanical competence. In contrast, the elastic moduli of the µMRI trabecular bone subvolume had significant but only moderate correlations with their gold standards (r2 = 0.40 to 0.58). We conclude that most microstructural and mechanical properties of the distal tibia can be derived efficiently from µMR images and can provide additional information regarding bone quality. © 2010 American Society for Bone and Mineral Research.
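The validation throughout this study is correlation of paired measurements against the µCT gold standard. A tiny sketch of how one such r2 value is computed, using made-up BV/TV values in place of the actual specimen data:

```python
import numpy as np
from scipy import stats

# Illustrative paired measurements (e.g., BV/TV) for the same
# specimens from registered µMRI and µCT images; values are made up.
bvtv_uct = np.array([0.08, 0.11, 0.09, 0.14, 0.12, 0.10, 0.15, 0.13])
bvtv_umri = np.array([0.09, 0.12, 0.10, 0.15, 0.12, 0.11, 0.16, 0.14])

# Agreement with the gold standard is summarized by the squared
# Pearson correlation (the r^2 values reported in the abstract).
res = stats.linregress(bvtv_uct, bvtv_umri)
print(f"r^2 = {res.rvalue**2:.2f}, slope = {res.slope:.2f}")
```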

113 citations


Journal Article
TL;DR: Proposes a novel hybrid algorithm that combines two types of optimization iterations, one very fast and memory friendly and the other slower but more accurate, and shows that it has global convergence at a geometric rate (a Q-linear rate in optimization terminology).
Abstract: l1-regularized logistic regression, also known as sparse logistic regression, is widely used in machine learning, computer vision, data mining, bioinformatics and neural signal processing. The use of l1 regularization confers attractive properties on the classifier, such as feature selection, robustness to noise, and, as a result, classifier generality in the context of supervised learning. When a sparse logistic regression problem has large-scale data in high dimensions, it is computationally expensive to minimize the non-differentiable l1-norm in the objective function. Motivated by recent work (Koh et al., 2007; Hale et al., 2008), we propose a novel hybrid algorithm based on combining two types of optimization iterations: one being very fast and memory friendly while the other being slower but more accurate. Called hybrid iterative shrinkage (HIS), the resulting algorithm comprises a fixed point continuation phase and an interior point phase. The first phase is based completely on memory-efficient operations such as matrix-vector multiplications, while the second phase is based on a truncated Newton's method. Furthermore, we show that various optimization techniques, including line search and continuation, can significantly accelerate convergence. The algorithm has global convergence at a geometric rate (a Q-linear rate in optimization terminology). We present a numerical comparison with several existing algorithms, including an analysis using benchmark data from the UCI machine learning repository, and show that our algorithm is the most computationally efficient without loss of accuracy.
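The memory-friendly first phase is an iterative shrinkage (soft-thresholding) iteration. A minimal sketch of that phase for l1-regularized logistic regression, assuming labels in {-1, +1}; the fixed step size, iteration count, and the omission of line search and continuation over the regularization parameter are simplifications of the HIS algorithm, not its exact form:

```python
import numpy as np

def soft_threshold(v, t):
    """Shrinkage operator: the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_logreg_shrinkage(X, y, lam, step=1e-2, iters=500):
    """Sketch of the shrinkage phase for mean logistic loss + lam*||w||_1.

    Uses only matrix-vector products, which is what makes this phase
    memory friendly. y must be in {-1, +1}.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n  # logistic gradient
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Example on random data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1000))
y = np.sign(rng.standard_normal(200))
w = l1_logreg_shrinkage(X, y, lam=0.1)
print("nonzero weights:", int((w != 0).sum()))
```

In the full algorithm, this phase quickly identifies the sparse support; the slower interior point (truncated Newton) phase then refines the surviving coefficients to high accuracy.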

79 citations


Journal ArticleDOI
TL;DR: It is argued that weighted ML is the preferred cost-sensitive technique, since under model misspecification thresholded ML is shown to be suboptimal and the risk-minimizing solution varies with the misclassification cost ratio.
Abstract: The presence of asymmetry in the misclassification costs or class prevalences is a common occurrence in the pattern classification domain. While much interest has been devoted to the study of cost-sensitive learning techniques, the relationship between cost-sensitive learning and the specification of the model set in a parametric estimation framework remains somewhat unclear. To that end, we differentiate between the case in which the model includes the true posterior and that in which the model is misspecified. In the former case, it is shown that thresholding the maximum likelihood (ML) estimate is an asymptotically optimal solution to the risk minimization problem. On the other hand, under model misspecification, it is demonstrated that thresholded ML is suboptimal and that the risk-minimizing solution varies with the misclassification cost ratio. Moreover, we analytically show that the negative weighted log likelihood (Elkan, 2001) is a tight, convex upper bound of the empirical loss. Coupled with empirical results on several real-world data sets, we argue that weighted ML is the preferred cost-sensitive technique.
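The two strategies the paper contrasts can be made concrete with scikit-learn: thresholded ML fits by plain maximum likelihood and moves the decision threshold to the risk-minimizing value, while weighted ML reweights each class's log likelihood by its cost. A sketch under an assumed 5:1 false-negative/false-positive cost ratio and synthetic data (both are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative setup: cost of a false negative is 5x a false positive.
c_fn, c_fp = 5.0, 1.0
X, y = make_classification(n_samples=2000, weights=[0.8], random_state=0)

# (a) Thresholded ML: fit by plain maximum likelihood, then move the
# decision threshold to the risk-minimizing value c_fp / (c_fp + c_fn).
ml = LogisticRegression(max_iter=1000).fit(X, y)
y_thresh = (ml.predict_proba(X)[:, 1] >= c_fp / (c_fp + c_fn)).astype(int)

# (b) Weighted ML: reweight each class's log likelihood by its cost,
# which shifts the fitted model itself, not just the threshold.
wml = LogisticRegression(max_iter=1000,
                         class_weight={0: c_fp, 1: c_fn}).fit(X, y)
y_weighted = wml.predict(X)

# Under a misspecified model the two decision rules generally disagree.
print("decisions differing:", int((y_thresh != y_weighted).sum()))
```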

62 citations


Proceedings ArticleDOI
11 Nov 2010
TL;DR: Preliminary results show that snapshot hyperspectral imaging in combination with NMF is able to detect biochemically meaningful components of drusen and the macular pigment; this is the first reported demonstration in vivo of the separate absorbance peaks for lutein and zeaxanthin in macular pigment.
Abstract: Drusen, the hallmark lesions of age-related macular degeneration (AMD), are biochemically heterogeneous, and the identification of their biochemical distribution is key to the understanding of AMD. The challenge is to develop imaging technology and analytics that respect the physical generation of the hyperspectral signal in the presence of noise, artifacts, and multiple mixed sources, while maximally exploiting the full data dimensionality to uncover clinically relevant spectral signatures. This paper reports on the statistical analysis of hyperspectral signatures of drusen and anatomical regions of interest using snapshot hyperspectral imaging and non-negative matrix factorization (NMF). We propose physically meaningful priors as initialization schemes for NMF to find low-rank decompositions that capture the underlying physiology of drusen and the macular pigment. Preliminary results show that snapshot hyperspectral imaging in combination with NMF is able to detect biochemically meaningful components of drusen and the macular pigment. To our knowledge, this is the first reported demonstration in vivo of the separate absorbance peaks for lutein and zeaxanthin in macular pigment.
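The initialization idea can be sketched with scikit-learn's NMF, which accepts a user-supplied starting factorization via init='custom'. In practice the rows of H would be seeded with known absorbance spectra; here they are random stand-ins, as are the cube dimensions:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Illustrative hyperspectral cube: 1000 pixels x 30 spectral bands.
n_pixels, n_bands, n_sources = 1000, 30, 4
X = rng.random((n_pixels, n_bands))

# Physically motivated initialization: seed rows of H with known
# absorbance spectra (e.g., macular pigment); random stand-ins here.
H0 = rng.random((n_sources, n_bands))
W0 = rng.random((n_pixels, n_sources))

# init='custom' makes NMF start from the supplied prior rather than a
# random factorization, steering the low-rank decomposition toward
# physiologically interpretable components.
model = NMF(n_components=n_sources, init="custom", max_iter=500)
W = model.fit_transform(X, W=W0, H=H0)  # per-pixel abundances
H = model.components_                   # recovered spectral signatures
print("reconstruction error:", model.reconstruction_err_)
```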

27 citations


Journal Article
TL;DR: A method is proposed that provides a unified framework for the analysis of EEG, combining first- and second-order spatial and temporal features based on a bilinear model, which outperforms state-of-the-art techniques for single-trial classification over a broad range of signal-to-noise ratios.
Abstract: Traditional analysis methods for single-trial classification of electroencephalography (EEG) focus on two types of paradigms: phase-locked methods, in which the amplitude of the signal is used as the feature for classification, that is, event-related potentials; and second-order methods, in which the feature of interest is the power of the signal, that is, event-related (de)synchronization. The process of deciding which paradigm to use is ad hoc and is driven by assumptions regarding the underlying neural generators. Here we propose a method that provides a unified framework for the analysis of EEG, combining first- and second-order spatial and temporal features based on a bilinear model. Evaluation of the proposed method on simulated data shows that the technique outperforms state-of-the-art techniques for single-trial classification over a broad range of signal-to-noise ratios. Evaluations on human EEG, including one benchmark data set from the Brain Computer Interface (BCI) competition, show statistically significant gains in classification accuracy, with a reduction in overall classification error from 26%-28% to 19%.
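The first-order part of the bilinear model can be sketched as a rank-one discriminant u'Xv, with a spatial filter u and a temporal filter v estimated by alternating logistic regressions. This is a simplified illustration (random data, no second-order/power features), not the paper's exact estimator:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Illustrative trials: 200 epochs, 64 channels x 50 time samples.
n_trials, n_ch, n_t = 200, 64, 50
X = rng.standard_normal((n_trials, n_ch, n_t))
y = rng.integers(0, 2, n_trials)

# Bilinear discriminant u' X v: alternate between fitting the spatial
# filter u (with v fixed) and the temporal filter v (with u fixed).
u = rng.standard_normal(n_ch)
v = rng.standard_normal(n_t)
for _ in range(5):
    # Fix v: each trial reduces to a channel vector X @ v.
    u = LogisticRegression(max_iter=1000).fit(X @ v, y).coef_.ravel()
    # Fix u: each trial reduces to a time course u' X.
    feats = np.einsum("c,ict->it", u, X)
    v = LogisticRegression(max_iter=1000).fit(feats, y).coef_.ravel()

scores = (X @ v) @ u  # bilinear projection, one score per trial
```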

22 citations


Book ChapterDOI
01 Jan 2010
TL;DR: This chapter describes two system architectures for C3Vision, an EEG-based BCI system designed to leverage the relative advantages of human and computer, with brain signals serving as the medium of communication of the user’s intentions and cognitive state.
Abstract: We have developed EEG-based BCI systems which couple human vision and computer vision for speeding the search of large images and image/video databases. We term these types of BCI systems "cortically-coupled computer vision" (C3Vision). C3Vision exploits (1) the ability of the human visual system to get the "gist" of a scene with brief (10s–100s of ms) and rapid serial (10 Hz) image presentations and (2) our ability to decode from the EEG whether, based on the gist, the scene is relevant, informative and/or grabs the user's attention. In this chapter we describe two system architectures for C3Vision that we have developed. The systems are designed to leverage the relative advantages, in both speed and recognition capabilities, of human and computer, with brain signals serving as the medium of communication of the user's intentions and cognitive state.

15 citations


Proceedings ArticleDOI
03 Mar 2010
TL;DR: A novel method is proposed that distinguishes between subject-invariant features and subject-specific features, based on a bilinear formulation, and extracts neurological components never before reported for the RSVP task, demonstrating the method's ability to extract novel neural signatures from the data.
Abstract: A major challenge in single-trial electroencephalography (EEG) analysis and brain-computer interfacing (BCI) is the so-called inter-subject/inter-session variability, i.e., the large variability in measurements obtained during different recording sessions. This variability restricts the number of samples available for single-trial analysis to the limited number that can be obtained during a single session. Here we propose a novel method that distinguishes between subject-invariant features and subject-specific features, based on a bilinear formulation. The method allows one to combine multiple EEG recordings to estimate the subject-invariant parameters, hence addressing the issue of inter-subject variability, while reducing the complexity of estimation for the subject-specific parameters. The method is demonstrated on 34 datasets from two different experimental paradigms: a perception categorization task and a Rapid Serial Visual Presentation (RSVP) task. We show significant improvements in classification performance over state-of-the-art methods. Further, our method extracts neurological components never before reported for RSVP, thus demonstrating its ability to extract novel neural signatures from the data.
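One plausible reading of the bilinear split, as a sketch: pool trials across subjects to estimate a subject-invariant spatial filter (more data, so a more stable estimate), then fit the much smaller subject-specific temporal weights per subject. All shapes and data below are stand-ins, and the scheme is a simplification of the paper's estimator:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Illustrative multi-subject data: 5 subjects, 100 trials each,
# 64 channels x 50 time samples.
n_sub, n_trials, n_ch, n_t = 5, 100, 64, 50
X = [rng.standard_normal((n_trials, n_ch, n_t)) for _ in range(n_sub)]
y = [rng.integers(0, 2, n_trials) for _ in range(n_sub)]

# Subject-invariant spatial filter u: fit once on trials pooled
# across all subjects.
v = rng.standard_normal(n_t)  # shared temporal profile for this sketch
pooled = np.concatenate([Xi @ v for Xi in X])
u = LogisticRegression(max_iter=1000).fit(pooled, np.concatenate(y)).coef_.ravel()

# Subject-specific temporal weights v_i: fit per subject on the small
# time-course features left after applying the shared filter u.
v_subject = []
for Xi, yi in zip(X, y):
    feats = np.einsum("c,ict->it", u, Xi)
    v_subject.append(LogisticRegression(max_iter=1000).fit(feats, yi).coef_.ravel())
```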

13 citations


Book ChapterDOI
01 Jan 2010
TL;DR: The simultaneous acquisition of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) is a potentially powerful multimodal imaging technique for measuring the functional activity of the human brain.
Abstract: The simultaneous acquisition of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) is a potentially powerful multimodal imaging technique for measuring the functional activity of the human brain. Given that EEG measures the electrical activity of neural populations while fMRI measures hemodynamics via a blood oxygenation-level-dependent (BOLD) signal related to neuronal activity, simultaneous EEG/fMRI (hereafter referred to as EEG/fMRI) offers a modality to investigate the relationship between these two phenomena within the context of noninvasive neuroimaging. Though fMRI is widely used to study cognitive and perceptual function, there is still substantial debate regarding the relationship between local neuronal activity and hemodynamic changes. Another rationale for EEG/fMRI is that, despite the fact that the individual modalities measure markedly different physiological phenomena, in terms of spatial and temporal resolution they are quite complementary. EEG offers millisecond temporal resolution; however, the spatial sampling density and ill-posed nature of the inverse model problem limit its spatial resolution. On the other hand, fMRI provides millimeter spatial resolution, but because of scanning rates and the low-pass nature of the BOLD response, the temporal resolution is limited. One approach that has been adopted to take advantage of this complementarity is to use fMRI activations to seed EEG source localization.
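That seeding strategy is commonly implemented as a weighted minimum-norm inverse, in which fMRI-active source locations receive a larger prior variance. A sketch with a random lead field; the boost factor, regularization, and all data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative inverse problem: 64 EEG sensors, 500 cortical sources.
n_sensors, n_sources = 64, 500
L = rng.standard_normal((n_sensors, n_sources))  # lead field (forward model)
x = rng.standard_normal(n_sensors)               # EEG at one time point

# fMRI seeding as a source prior: boost the variance of sources at
# fMRI-active locations (indices and the 10x boost are arbitrary).
prior = np.ones(n_sources)
prior[rng.choice(n_sources, 50, replace=False)] = 10.0
R = np.diag(prior)

# Weighted minimum-norm estimate: s = R L' (L R L' + lambda I)^(-1) x
lam = 1e-2
s = R @ L.T @ np.linalg.solve(L @ R @ L.T + lam * np.eye(n_sensors), x)
print("estimated source amplitudes:", s[:5])
```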

13 citations


Journal ArticleDOI
TL;DR: It is concluded that the electroencephalogram can identify neural signatures of detection both before and after the saccade, indicating that subjects anticipate the target before the final saccade, which serves to foveate the target and confirm its identity.
Abstract: We investigated neural correlates of target detection in the electroencephalogram (EEG) during a free-viewing search task and analyzed signals locked to saccadic events. We adopted stimuli similar to ones we used previously to study target detection in serial presentations of briefly flashed images. Subjects performed the search task for multiple random scenes while we simultaneously recorded 64 channels of EEG and tracked subjects' eye position. For each subject we identified target saccades (TS) and distractor saccades (DS). For TS we used saccades that were aimed directly at the target and were followed by a correct behavioral response (button press); for DS, we used saccades in correctly responded trials having no target (these were 28% of the trials). We sampled the sets of TS and DS saccades such that they were equalized/matched for saccade direction and duration, ensuring that no information in the saccade properties themselves was discriminating for their type. We aligned the EEG to saccade onset and used logistic regression (LR), in the space of the 64 electrodes, to identify activity discriminating a TS from a DS on a single-trial basis. Specifically, LR was applied to the signals from 50 ms time windows preceding and following saccade onset for varying latencies. We found that there is significant discriminating activity in the EEG both before and after the saccade: average discriminability across 7 subjects was AUC = 0.64 at 80 ms before the saccade and AUC = 0.68 at 60 ms after the saccade (p < 0.01, established using bootstrap resampling). Between these time periods we saw a substantial reduction in discriminating activity (mean AUC = 0.59). We conclude that we can identify neural signatures of detection both before and after the saccade, indicating that subjects anticipate the target before the final saccade, which serves to foveate the target and confirm its identity.
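The windowed analysis can be sketched as follows: average the saccade-locked EEG within each 50 ms window, fit a logistic regression over the 64 electrodes, and report an AUC per window. This toy version scores on the training data and omits the bootstrap significance test the paper uses; all data are random stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Illustrative saccade-locked epochs: 300 saccades, 64 channels,
# sampled at 1 kHz from -200 ms to +200 ms around saccade onset.
n_trials, n_ch = 300, 64
times = np.arange(-200, 200)        # ms relative to saccade onset
X = rng.standard_normal((n_trials, n_ch, times.size))
y = rng.integers(0, 2, n_trials)    # 1 = target saccade (TS), 0 = DS

# Discriminate TS vs DS in sliding 50 ms windows.
for start in range(-200, 150, 50):
    win = (times >= start) & (times < start + 50)
    feats = X[:, :, win].mean(axis=2)   # one value per electrode
    lr = LogisticRegression(max_iter=1000).fit(feats, y)
    auc = roc_auc_score(y, lr.predict_proba(feats)[:, 1])
    print(f"window {start:+4d} ms: AUC = {auc:.2f}")
```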

10 citations


Proceedings ArticleDOI
11 Nov 2010
TL;DR: This paper considers how the precision of the EEG scores affects the resulting precision of images retrieved by a graph-based transductive learning model designed to propagate image class labels based on image feature similarity and sparse labels.
Abstract: Our group has been investigating the development of BCI systems for improving information delivery to a user, specifically systems for triaging image content based on what captures a user's attention. One of the systems we have developed uses single-trial EEG scores as noisy labels for a computer vision image retrieval system. In this paper we investigate how the noisy nature of the EEG-derived labels affects the resulting accuracy of the computer vision system. Specifically, we consider how the precision of the EEG scores affects the resulting precision of images retrieved by a graph-based transductive learning model designed to propagate image class labels based on image feature similarity and sparse labels.
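The effect of noisy EEG labels on graph-based propagation can be sketched with scikit-learn's LabelSpreading, used here as a stand-in for the paper's transductive graph model: give a small fraction of images labels, flip some of them to mimic imperfect single-trial EEG scores, and propagate over a feature-similarity graph:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Illustrative image features: 500 images in a 2-class feature space.
feats, true_labels = make_blobs(n_samples=500, centers=2, random_state=0)

# Sparse, noisy labels from EEG: only 40 images get a label, and 25%
# of those labels are flipped to mimic imperfect single-trial scores.
labels = np.full(500, -1)                    # -1 = unlabeled
seen = rng.choice(500, 40, replace=False)
labels[seen] = true_labels[seen]
flip = rng.random(40) < 0.25
labels[seen[flip]] = 1 - labels[seen[flip]]

# Graph-based transductive learning: propagate the sparse labels over
# a similarity graph built from the image features.
model = LabelSpreading(kernel="rbf", gamma=0.5).fit(feats, labels)
acc = (model.transduction_ == true_labels).mean()
print(f"label accuracy after propagation: {acc:.2f}")
```

Lowering the flip rate in this toy model plays the role of improving EEG score precision, which is exactly the dependence the paper quantifies for retrieval precision.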

Patent
05 May 2010
TL;DR: A system and method for evaluating and predicting visual perception of animals are provided, in which retinal image data or other data representing the pathology of an animal are provided as input to a nonlinear computational model of a primary visual cortex.
Abstract: A system and method for evaluating and predicting visual perception of animals are provided. Retinal image data or other data representing the pathology of an animal are provided as input to a nonlinear computational model of a primary visual cortex of an animal. A plurality of images of a predetermined set of recognizable objects is applied to the computational model input and then made accessible to a processor for generation of the model. A confidence estimate of a recognition by the model of the object represented by the selected image is generated. The system and method allow for quantifying the impact on visual perception of a predetermined condition, as well as determining a strategy for ameliorating the effect of a visual impairment. Weights in the computational model regression can be adjusted to provide the best prediction results. This can be done by iterative optimization, for example, using psychometric data.
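The weight-adjustment step the patent describes, fitting a regression readout so the model's confidence matches psychometric data, might look like the following sketch; the logistic readout, the V1-feature stand-ins, and the BFGS optimizer are all assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Illustrative setup: responses of a simulated V1 stage to 200 images
# (50 features each), plus measured recognition rates to match.
features = rng.standard_normal((200, 50))   # V1-model outputs
psychometric = rng.random(200)              # observed P(recognized)

def loss(w):
    """Squared error between model confidence and psychometric data."""
    confidence = 1.0 / (1.0 + np.exp(-features @ w))  # logistic readout
    return np.mean((confidence - psychometric) ** 2)

# Iterative optimization of the regression weights; BFGS is just one
# convenient choice of optimizer for this sketch.
result = minimize(loss, np.zeros(50), method="BFGS")
print("fitted weight norm:", np.linalg.norm(result.x))
```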

01 Jan 2010
TL;DR: This chapter discusses information overload in modern society, an increasingly problematic issue resulting from advancements in information technology.
Abstract: Our society's information technology advancements have resulted in the increasingly problematic issue of information overload, i.e., we have more access to information than we can possibly process...