
Showing papers by "Carnegie Mellon University" published in 2014


Journal ArticleDOI
Keith A. Olive1, Kaustubh Agashe2, Claude Amsler3, Mario Antonelli  +222 moreInstitutions (107)
TL;DR: The Review, as discussed by the authors, summarizes much of particle physics and cosmology, using data from previous editions plus 3,283 new measurements from 899 papers to evaluate the properties of gauge bosons, the recently discovered Higgs boson, leptons, quarks, mesons, and baryons.
Abstract: The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 3,283 new measurements from 899 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as heavy neutrinos, supersymmetric and technicolor particles, axions, dark photons, etc. All the particle properties and search limits are listed in Summary Tables. We also give numerous tables, figures, formulae, and reviews of topics such as Supersymmetry, Extra Dimensions, Particle Detectors, Probability, and Statistics. Among the 112 reviews are many that are new or heavily revised including those on: Dark Energy, Higgs Boson Physics, Electroweak Model, Neutrino Cross Section Measurements, Monte Carlo Neutrino Generators, Top Quark, Dark Matter, Dynamical Electroweak Symmetry Breaking, Accelerator Physics of Colliders, High-Energy Collider Parameters, Big Bang Nucleosynthesis, Astrophysical Constants and Cosmological Parameters.

7,337 citations


Journal ArticleDOI
TL;DR: In this article, the optical properties and applications of various two-dimensional materials including transition metal dichalcogenides are reviewed with an emphasis on nanophotonic applications, and two different approaches for enhancing their interactions with light: through their integration with external photonic structures, and through intrinsic polaritonic resonances.
Abstract: The optical properties of graphene and emerging two-dimensional materials including transition metal dichalcogenides are reviewed with an emphasis on nanophotonic applications. Two-dimensional materials exhibit diverse electronic properties, ranging from insulating hexagonal boron nitride and semiconducting transition metal dichalcogenides such as molybdenum disulphide, to semimetallic graphene. In this Review, we first discuss the optical properties and applications of various two-dimensional materials, and then cover two different approaches for enhancing their interactions with light: through their integration with external photonic structures, and through intrinsic polaritonic resonances. Finally, we present a narrow-bandgap layered material — black phosphorus — that serendipitously bridges the energy gap between the zero-bandgap graphene and the relatively large-bandgap transition metal dichalcogenides. The plethora of two-dimensional materials and their heterostructures, together with the array of available approaches for enhancing the light–matter interaction, offers the promise of scientific discoveries and nanophotonics technologies across a wide range of the electromagnetic spectrum.

2,414 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide a brief review of both theoretical and experimental advances in this field and uncover the interplay between real spin and pseudospins in layered transition metal dichalcogenides.
Abstract: The recent emergence of two-dimensional layered materials — in particular the transition metal dichalcogenides — provides a new laboratory for exploring the internal quantum degrees of freedom of electrons and their potential for new electronics. These degrees of freedom are the real electron spin, the layer pseudospin, and the valley pseudospin. New methods for the quantum control of the spin and these pseudospins arise from the existence of Berry phase-related physical properties and strong spin–orbit coupling. The former leads to the versatile control of the valley pseudospin, whereas the latter gives rise to an interplay between the spin and the pseudospins. Here, we provide a brief review of both theoretical and experimental advances in this field. Understanding the physics of two-dimensional materials beyond graphene is of both fundamental and practical interest. Recent theoretical and experimental advances uncover the interplay between real spin and pseudospins in layered transition metal dichalcogenides.

2,363 citations


Journal ArticleDOI
Silvia De Rubeis1, Xin-Xin He2, Arthur P. Goldberg1, Christopher S. Poultney1, Kaitlin E. Samocha3, A. Ercument Cicek2, Yan Kou1, Li Liu2, Menachem Fromer1, Menachem Fromer3, R. Susan Walker4, Tarjinder Singh5, Lambertus Klei6, Jack A. Kosmicki3, Shih-Chen Fu1, Branko Aleksic7, Monica Biscaldi8, Patrick Bolton9, Jessica M. Brownfeld1, Jinlu Cai1, Nicholas G. Campbell10, Angel Carracedo11, Angel Carracedo12, Maria H. Chahrour3, Andreas G. Chiocchetti, Hilary Coon13, Emily L. Crawford10, Lucy Crooks5, Sarah Curran9, Geraldine Dawson14, Eftichia Duketis, Bridget A. Fernandez15, Louise Gallagher16, Evan T. Geller17, Stephen J. Guter18, R. Sean Hill3, R. Sean Hill19, Iuliana Ionita-Laza20, Patricia Jiménez González, Helena Kilpinen, Sabine M. Klauck21, Alexander Kolevzon1, Irene Lee22, Jing Lei2, Terho Lehtimäki, Chiao-Feng Lin17, Avi Ma'ayan1, Christian R. Marshall4, Alison L. McInnes23, Benjamin M. Neale24, Michael John Owen25, Norio Ozaki7, Mara Parellada26, Jeremy R. Parr27, Shaun Purcell1, Kaija Puura, Deepthi Rajagopalan4, Karola Rehnström5, Abraham Reichenberg1, Aniko Sabo28, Michael Sachse, Stephen Sanders29, Chad M. Schafer2, Martin Schulte-Rüther30, David Skuse31, David Skuse22, Christine Stevens24, Peter Szatmari32, Kristiina Tammimies4, Otto Valladares17, Annette Voran33, Li-San Wang17, Lauren A. Weiss29, A. Jeremy Willsey29, Timothy W. Yu3, Timothy W. Yu19, Ryan K. C. Yuen4, Edwin H. Cook18, Christine M. Freitag, Michael Gill16, Christina M. Hultman34, Thomas Lehner35, Aarno Palotie3, Aarno Palotie36, Aarno Palotie24, Gerard D. Schellenberg17, Pamela Sklar1, Matthew W. State29, James S. Sutcliffe10, Christopher A. Walsh3, Christopher A. Walsh19, Stephen W. Scherer4, Michael E. Zwick37, Jeffrey C. Barrett5, David J. Cutler37, Kathryn Roeder2, Bernie Devlin6, Mark J. Daly24, Mark J. Daly3, Joseph D. Buxbaum1 
13 Nov 2014-Nature
TL;DR: Using exome sequencing, it is shown that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate of < 0.05, plus a set of 107 genes strongly enriched for those likely to affect risk (FDR < 0.30).
Abstract: The genetic architecture of autism spectrum disorder involves the interplay of common and rare variants and their impact on hundreds of genes. Using exome sequencing, here we show that analysis of rare coding variation in 3,871 autism cases and 9,937 ancestry-matched or parental controls implicates 22 autosomal genes at a false discovery rate (FDR) < 0.05, plus a set of 107 autosomal genes strongly enriched for those likely to affect risk (FDR < 0.30). These 107 genes, which show unusual evolutionary constraint against mutations, incur de novo loss-of-function mutations in over 5% of autistic subjects. Many of the genes implicated encode proteins for synaptic formation, transcriptional regulation and chromatin-remodelling pathways. These include voltage-gated ion channels regulating the propagation of action potentials, pacemaking and excitability-transcription coupling, as well as histone-modifying enzymes and chromatin remodellers-most prominently those that mediate post-translational lysine methylation/demethylation modifications of histones.

2,228 citations


Journal ArticleDOI
TL;DR: Whole-brain analyses reconciled seemingly disparate themes of both hypo- and hyperconnectivity in the ASD literature; both were detected, although hypoconnectivity dominated, particularly for corticocortical and interhemispheric functional connectivity.
Abstract: Autism spectrum disorders (ASDs) represent a formidable challenge for psychiatry and neuroscience because of their high prevalence, lifelong nature, complexity and substantial heterogeneity. Facing these obstacles requires large-scale multidisciplinary efforts. Although the field of genetics has pioneered data sharing for these reasons, neuroimaging had not kept pace. In response, we introduce the Autism Brain Imaging Data Exchange (ABIDE)-a grassroots consortium aggregating and openly sharing 1112 existing resting-state functional magnetic resonance imaging (R-fMRI) data sets with corresponding structural MRI and phenotypic information from 539 individuals with ASDs and 573 age-matched typical controls (TCs; 7-64 years) (http://fcon_1000.projects.nitrc.org/indi/abide/). Here, we present this resource and demonstrate its suitability for advancing knowledge of ASD neurobiology based on analyses of 360 male subjects with ASDs and 403 male age-matched TCs. We focused on whole-brain intrinsic functional connectivity and also survey a range of voxel-wise measures of intrinsic functional brain architecture. Whole-brain analyses reconciled seemingly disparate themes of both hypo- and hyperconnectivity in the ASD literature; both were detected, although hypoconnectivity dominated, particularly for corticocortical and interhemispheric functional connectivity. Exploratory analyses using an array of regional metrics of intrinsic brain function converged on common loci of dysfunction in ASDs (mid- and posterior insula and posterior cingulate cortex), and highlighted less commonly explored regions such as the thalamus. The survey of the ABIDE R-fMRI data sets provides unprecedented demonstrations of both replication and novel discovery. By pooling multiple international data sets, ABIDE is expected to accelerate the pace of discovery setting the stage for the next generation of ASD studies.

1,939 citations


Proceedings ArticleDOI
01 Jun 2014
TL;DR: Meteor Universal brings language specific evaluation to previously unsupported target languages by automatically extracting linguistic resources from the bitext used to train MT systems and using a universal parameter set learned from pooling human judgments of translation quality from several language directions.
Abstract: This paper describes Meteor Universal, released for the 2014 ACL Workshop on Statistical Machine Translation. Meteor Universal brings language specific evaluation to previously unsupported target languages by (1) automatically extracting linguistic resources (paraphrase tables and function word lists) from the bitext used to train MT systems and (2) using a universal parameter set learned from pooling human judgments of translation quality from several language directions. Meteor Universal is shown to significantly outperform baseline BLEU on two new languages, Russian (WMT13) and Hindi (WMT14).
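
The resource-extraction step lends itself to a compact illustration. Below is a toy Python sketch of how a function word list might be pulled from the target side of a training bitext by frequency thresholding; the cutoff and the helper name are illustrative assumptions, and the paraphrase-table extraction that Meteor Universal also performs is omitted.

```python
# Toy sketch: build a "function word" list for a new target language from the
# most frequent tokens on the bitext's target side. The cutoff fraction is an
# illustrative assumption, not Meteor Universal's exact threshold.
from collections import Counter

def extract_function_words(target_sentences, top_fraction=0.01):
    freq = Counter(tok for sent in target_sentences for tok in sent.lower().split())
    n_keep = max(1, int(len(freq) * top_fraction))
    return {word for word, _ in freq.most_common(n_keep)}

# The target side of a real training bitext would go here:
bitext_target = ["der Hund schläft", "der Mann sieht den Hund", "die Frau liest"]
print(extract_function_words(bitext_target, top_fraction=0.2))  # high-frequency closed-class words
```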

1,893 citations


Proceedings ArticleDOI
12 Jul 2014
TL;DR: The method achieves both low-drift and low-computational complexity without the need for high accuracy ranging or inertial measurements and can achieve accuracy at the level of state of the art offline batch methods.
Abstract: We propose a real-time method for odometry and mapping using range measurements from a 2-axis lidar moving in 6-DOF. The problem is hard because the range measurements are received at different times, and errors in motion estimation can cause mis-registration of the resulting point cloud. To date, coherent 3D maps can be built by off-line batch methods, often using loop closure to correct for drift over time. Our method achieves both low-drift and low-computational complexity without the need for high accuracy ranging or inertial measurements. The key idea in obtaining this level of performance is the division of the complex problem of simultaneous localization and mapping, which seeks to optimize a large number of variables simultaneously, between two algorithms. One algorithm performs odometry at a high frequency but low fidelity to estimate velocity of the lidar. Another algorithm runs at a frequency of an order of magnitude lower for fine matching and registration of the point cloud. Combination of the two algorithms allows the method to map in real-time. The method has been evaluated by a large set of experiments as well as on the KITTI odometry benchmark. The results indicate that the method can achieve accuracy at the level of state of the art offline batch methods.

1,879 citations


Journal ArticleDOI
Peter A. R. Ade1, Nabila Aghanim2, M. I. R. Alves2, C. Armitage-Caplan3  +469 moreInstitutions (89)
TL;DR: The European Space Agency's Planck satellite, dedicated to studying the early Universe and its subsequent evolution, was launched 14 May 2009 and has been scanning the microwave and submillimetre sky continuously since 12 August 2009 as discussed by the authors.
Abstract: The European Space Agency’s Planck satellite, dedicated to studying the early Universe and its subsequent evolution, was launched 14 May 2009 and has been scanning the microwave and submillimetre sky continuously since 12 August 2009. In March 2013, ESA and the Planck Collaboration released the initial cosmology products based on the first 15.5 months of Planck data, along with a set of scientific and technical papers and a web-based explanatory supplement. This paper gives an overview of the mission and its performance, the processing, analysis, and characteristics of the data, the scientific results, and the science data products and papers in the release. The science products include maps of the cosmic microwave background (CMB) and diffuse extragalactic foregrounds, a catalogue of compact Galactic and extragalactic sources, and a list of sources detected through the Sunyaev-Zeldovich effect. The likelihood code used to assess cosmological models against the Planck data and a lensing likelihood are described. Scientific results include robust support for the standard six-parameter ΛCDM model of cosmology and improved measurements of its parameters, including a highly significant deviation from scale invariance of the primordial power spectrum. The Planck values for these parameters and others derived from them are significantly different from those previously determined. Several large-scale anomalies in the temperature distribution of the CMB, first detected by WMAP, are confirmed with higher confidence. Planck sets new limits on the number and mass of neutrinos, and has measured gravitational lensing of CMB anisotropies at greater than 25σ. Planck finds no evidence for non-Gaussianity in the CMB. Planck’s results agree well with results from the measurements of baryon acoustic oscillations. Planck finds a lower Hubble constant than found in some more local measures. Some tension is also present between the amplitude of matter fluctuations (σ8) derived from CMB data and that derived from Sunyaev-Zeldovich data. The Planck and WMAP power spectra are offset from each other by an average level of about 2% around the first acoustic peak. Analysis of Planck polarization data is not yet mature; therefore, polarization results are not released, although the robust detection of E-mode polarization around CMB hot and cold spots is shown graphically.

1,719 citations


OtherDOI
29 Sep 2014
TL;DR: The CPS, a monthly survey conducted by the Census Bureau for the Bureau of Labor Statistics (BLS) for more than 50 years, is used as discussed by the authors to estimate employment, unemployment, earnings, hours of work, and other indicators.
Abstract: The CPS is a monthly survey of about 50,000 households conducted by the Bureau of the Census for the Bureau of Labor Statistics. The survey has been conducted for more than 50 years. Estimates obtained from the CPS include employment, unemployment, earnings, hours of work, and other indicators. They are available by a variety of demographic characteristics including age, sex, race, marital status, and educational attainment. They are also available by occupation, industry, and class of worker. Supplemental questions to produce estimates on a variety of topics including school enrollment, income, previous work experience, health, employee benefits, and work schedules are also often added to the regular CPS questionnaire.

1,713 citations


Journal ArticleDOI
30 Jan 2014-Blood
TL;DR: Iron-deficiency anemia was the top cause globally, although 10 different conditions were among the top 3 in regional rankings, and malaria, schistosomiasis, and chronic kidney disease-related anemia were the only conditions to increase in prevalence.

1,427 citations


Journal ArticleDOI
TL;DR: It is concluded that sampling high-reputation workers can ensure high-quality data without having to resort to using attention check questions (ACQs), which may lead to selection bias if participants who fail ACQs are excluded post-hoc.
Abstract: Data quality is one of the major concerns of using crowdsourcing websites such as Amazon Mechanical Turk (MTurk) to recruit participants for online behavioral studies. We compared two methods for ensuring data quality on MTurk: attention check questions (ACQs) and restricting participation to MTurk workers with high reputation (above 95% approval ratings). In Experiment 1, we found that high-reputation workers rarely failed ACQs and provided higher-quality data than did low-reputation workers; ACQs improved data quality only for low-reputation workers, and only in some cases. Experiment 2 corroborated these findings and also showed that more productive high-reputation workers produce the highest-quality data. We concluded that sampling high-reputation workers can ensure high-quality data without having to resort to using ACQs, which may lead to selection bias if participants who fail ACQs are excluded post-hoc.

Journal ArticleDOI
TL;DR: The 10th public data release (DR10) from the Sloan Digital Sky Survey (SDSS-III), released in 2013 as mentioned in this paper, includes the first spectroscopic data from the Apache Point Observatory Galaxy Evolution Experiment (APOGEE), along with spectroscopic data from the Baryon Oscillation Spectroscopic Survey (BOSS) taken through 2012 July.
Abstract: The Sloan Digital Sky Survey (SDSS) has been in operation since 2000 April. This paper presents the Tenth Public Data Release (DR10) from its current incarnation, SDSS-III. This data release includes the first spectroscopic data from the Apache Point Observatory Galaxy Evolution Experiment (APOGEE), along with spectroscopic data from the Baryon Oscillation Spectroscopic Survey (BOSS) taken through 2012 July. The APOGEE instrument is a near-infrared (R ~ 22,500), 300-fiber spectrograph covering 1.514–1.696 μm. The APOGEE survey is studying the chemical abundances and radial velocities of roughly 100,000 red giant star candidates in the bulge, bar, disk, and halo of the Milky Way. DR10 includes 178,397 spectra of 57,454 stars, each typically observed three or more times, from APOGEE. Derived quantities from these spectra (radial velocities, effective temperatures, surface gravities, and metallicities) are also included. DR10 also roughly doubles the number of BOSS spectra over those included in the Ninth Data Release. DR10 includes a total of 1,507,954 BOSS spectra comprising 927,844 galaxy spectra, 182,009 quasar spectra, and 159,327 stellar spectra selected over 6373.2 deg².

Proceedings ArticleDOI
06 Oct 2014
TL;DR: In this paper, the authors propose a parameter server framework for distributed machine learning problems, where both data and workloads are distributed over worker nodes, while the server nodes maintain globally shared parameters, represented as dense or sparse vectors and matrices.
Abstract: We propose a parameter server framework for distributed machine learning problems. Both data and workloads are distributed over worker nodes, while the server nodes maintain globally shared parameters, represented as dense or sparse vectors and matrices. The framework manages asynchronous data communication between nodes, and supports flexible consistency models, elastic scalability, and continuous fault tolerance. To demonstrate the scalability of the proposed framework, we show experimental results on petabytes of real data with billions of examples and parameters on problems ranging from Sparse Logistic Regression to Latent Dirichlet Allocation and Distributed Sketching.
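
Since the abstract describes the framework's core abstraction, here is a minimal single-process sketch of the push/pull pattern it refers to. Nothing below is the paper's actual API: the class and method names (ParameterServer, Worker, push, pull) are invented for illustration, and true asynchrony is stood in for by a sequential loop.

```python
# A single-process cartoon of the parameter-server pattern: server nodes hold
# the shared parameter vector; workers hold data shards, pull the keys they
# need, and push gradients back. All names here are illustrative.
import numpy as np

class ParameterServer:
    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)                 # globally shared parameters
        self.lr = lr

    def pull(self, keys):
        return self.w[keys]                    # workers fetch only the keys they need

    def push(self, keys, grad):
        self.w[keys] -= self.lr * grad         # apply a (possibly stale) update

class Worker:
    def __init__(self, X, y, server):
        self.X, self.y, self.server = X, y, server

    def step(self, rng, batch=32):
        idx = rng.integers(0, len(self.y), size=batch)
        keys = np.arange(self.X.shape[1])      # dense here; sparse in general
        w = self.server.pull(keys)
        p = 1.0 / (1.0 + np.exp(-self.X[idx] @ w))  # logistic regression, one of the example workloads
        self.server.push(keys, self.X[idx].T @ (p - self.y[idx]) / batch)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)); w_true = rng.normal(size=20)
y = (X @ w_true > 0).astype(float)
server = ParameterServer(dim=20)
workers = [Worker(X[i::4], y[i::4], server) for i in range(4)]  # 4 data shards
for _ in range(200):
    for wk in workers:                         # sequential stand-in for async pushes
        wk.step(rng)
print(((X @ server.w > 0) == (y > 0.5)).mean())  # training accuracy
```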

Journal ArticleDOI
TL;DR: Autism's genetic architecture is characterized: its narrow-sense heritability is ∼52.4%, with most due to common variation; rare de novo mutations contribute substantially to individual liability, yet their contribution to variance in liability, 2.6%, is modest compared to that of heritable variation.
Abstract: Joseph Buxbaum and colleagues use an epidemiological sample from Sweden to investigate the genetic architecture of autism spectrum disorders. They conclude that most inherited risk for autism is determined by common variation and that rare variation explains a smaller fraction of total heritability.

Journal ArticleDOI
14 Jun 2014
TL;DR: This paper exposes the vulnerability of commodity DRAM chips to disturbance errors, showing that repeatedly reading from the same address in DRAM, which activates the same row over and over, can corrupt data stored in nearby addresses.
Abstract: Memory isolation is a key property of a reliable and secure computing system--an access to one memory address should not have unintended side effects on data stored in other addresses. However, as DRAM process technology scales down to smaller dimensions, it becomes more difficult to prevent DRAM cells from electrically interacting with each other. In this paper, we expose the vulnerability of commodity DRAM chips to disturbance errors. By reading from the same address in DRAM, we show that it is possible to corrupt data in nearby addresses. More specifically, activating the same row in DRAM corrupts data in nearby rows. We demonstrate this phenomenon on Intel and AMD systems using a malicious program that generates many DRAM accesses. We induce errors in most DRAM modules (110 out of 129) from three major DRAM manufacturers. From this we conclude that many deployed systems are likely to be at risk. We identify the root cause of disturbance errors as the repeated toggling of a DRAM row's wordline, which stresses inter-cell coupling effects that accelerate charge leakage from nearby rows. We provide an extensive characterization study of disturbance errors and their behavior using an FPGA-based testing platform. Among our key findings, we show that (i) it takes as few as 139K accesses to induce an error and (ii) up to one in every 1.7K cells is susceptible to errors. After examining various potential ways of addressing the problem, we propose a low-overhead solution to prevent the errors.

Journal ArticleDOI
TL;DR: This Perspective presents recent advances in macromolecular engineering enabled by ATRP with emphasis on various catalytic/initiation systems that use parts-per-million concentrations of Cu catalysts and can be run in environmentally friendly media, e.g., water.
Abstract: This Perspective presents recent advances in macromolecular engineering enabled by ATRP. They include the fundamental mechanistic and synthetic features of ATRP with emphasis on various catalytic/initiation systems that use parts-per-million concentrations of Cu catalysts and can be run in environmentally friendly media, e.g., water. The roles of the major components of ATRP—monomers, initiators, catalysts, and various additives—are explained, and their reactivity and structure are correlated. The effects of media and external stimuli on polymerization rates and control are presented. Some examples of precisely controlled elements of macromolecular architecture, such as chain uniformity, composition, topology, and functionality, are discussed. Syntheses of polymers with complex architecture, various hybrids, and bioconjugates are illustrated. Examples of current and forthcoming applications of ATRP are covered. Future challenges and perspectives for macromolecular engineering by ATRP are discussed.

Journal ArticleDOI
TL;DR: This model is used to identify ∼1,000 genes that are significantly lacking in functional coding variation in non-ASD samples and are enriched for de novo loss-of-function mutations identified in ASD cases, suggesting that the role of de novo mutations in ASDs might reside in fundamental neurodevelopmental processes.
Abstract: Mark Daly and colleagues present a statistical framework to evaluate the role of de novo mutations in human disease by calibrating a model of de novo mutation rates at the individual gene level. The mutation probabilities defined by their model and list of constrained genes can be used to help identify genetic variants that have a significant role in disease.
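
To make the idea concrete, here is a minimal sketch of the kind of test a calibrated per-gene mutation-rate model enables. The function name and the numbers are illustrative assumptions, and this is not the paper's code.

```python
# Given a calibrated probability mu of a de novo loss-of-function mutation in
# a gene (per chromosome, per generation), the expected count across n trios
# is about lambda = 2 * n * mu, and an observed excess can be scored with a
# Poisson tail probability. A sketch of the approach, not the paper's code.
from scipy.stats import poisson

def denovo_excess_pvalue(observed, mu, n_trios):
    lam = 2 * n_trios * mu                   # two parental haplotypes per trio
    return poisson.sf(observed - 1, lam)     # P(count >= observed)

# e.g. 4 de novo LoF mutations in a gene with mu = 1e-5, across 1,000 trios:
print(denovo_excess_pvalue(4, 1e-5, 1000))   # ~7e-9, a striking excess
```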

Posted Content
TL;DR: In this paper, the authors present the design principles and properties of Bitcoin for a non-technical audience, reviews its past, present and future uses, and points out risks and regulatory issues as Bitcoin interacts with the conventional financial system and real economy.
Abstract: Bitcoin is an online communication protocol that facilitates the use of a virtual currency, including electronic payments. Since its inception in 2009 by an anonymous group of developers, Bitcoin has served tens of millions of transactions with total dollar value in the billions. Users have been drawn to Bitcoin for its decentralization, intentionally relying on no single server or set of servers to store transactions and also avoiding any single party that can ban certain participants or certain types of transactions. Bitcoin is of interest to economists in part for its potential to disrupt existing payment systems and perhaps monetary systems, and also for the wealth of data it provides about agents’ behavior and about the Bitcoin system itself. This article presents the platform’s design principles and properties for a non-technical audience, reviews its past, present and future uses, and points out risks and regulatory issues as Bitcoin interacts with the conventional financial system and the real economy.

Journal ArticleDOI
TL;DR: This review examines three important motivations for population studies: single-trial hypotheses requiring statistical power, hypotheses of population response structure and exploratory analyses of large data sets, and practical advice about selecting methods and interpreting their outputs.
Abstract: Most sensory, cognitive and motor functions depend on the interactions of many neurons. In recent years, there has been rapid development and increasing use of technologies for recording from large numbers of neurons, either sequentially or simultaneously. A key question is what scientific insight can be gained by studying a population of recorded neurons beyond studying each neuron individually. Here, we examine three important motivations for population studies: single-trial hypotheses requiring statistical power, hypotheses of population response structure and exploratory analyses of large data sets. Many recent studies have adopted dimensionality reduction to analyze these populations and to find features that are not apparent at the level of individual neurons. We describe the dimensionality reduction methods commonly applied to population activity and offer practical advice about selecting methods and interpreting their outputs. This review is intended for experimental and computational researchers who seek to understand the role dimensionality reduction has had and can have in systems neuroscience, and who seek to apply these methods to their own data.
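
As a concrete illustration of the simplest method the review covers, here is a PCA sketch on simulated population activity. The shapes, the two planted latent factors, and the noise level are all illustrative assumptions.

```python
# PCA, the most basic of the reviewed methods, applied to simulated population
# activity: many neurons driven by two shared latent factors plus noise.
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_bins = 80, 200
t = np.linspace(0, 4 * np.pi, n_bins)
latents = np.stack([np.sin(t), np.cos(t / 2)])            # 2 shared factors
rates = rng.normal(size=(n_neurons, 2)) @ latents \
        + 0.3 * rng.normal(size=(n_neurons, n_bins))      # neurons x time bins

X = rates - rates.mean(axis=1, keepdims=True)             # center each neuron
U, S, _ = np.linalg.svd(X, full_matrices=False)           # PCA via SVD
print(S**2 / np.sum(S**2))                                # first ~2 PCs dominate
trajectory = U[:, :2].T @ X                               # population state over time
```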

Book
05 Jun 2014
TL;DR: This text gives a thorough overview of Boolean functions, beginning with the most basic definitions and proceeding to advanced topics such as hypercontractivity and isoperimetry, and includes a "highlight application" such as Arrow's theorem from economics.
Abstract: Boolean functions are perhaps the most basic objects of study in theoretical computer science. They also arise in other areas of mathematics, including combinatorics, statistical physics, and mathematical social choice. The field of analysis of Boolean functions seeks to understand them via their Fourier transform and other analytic methods. This text gives a thorough overview of the field, beginning with the most basic definitions and proceeding to advanced topics such as hypercontractivity and isoperimetry. Each chapter includes a "highlight application" such as Arrow's theorem from economics, the Goldreich-Levin algorithm from cryptography/learning theory, Håstad's NP-hardness of approximation results, and "sharp threshold" theorems for random graph properties. The book includes roughly 450 exercises and can be used as the basis of a one-semester graduate course. It should appeal to advanced undergraduates, graduate students, and researchers in computer science theory and related mathematical fields.
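
For readers new to the subject, the central object is easy to compute directly: every f : {-1,1}^n → {-1,1} has a unique expansion f(x) = Σ_S f̂(S) Π_{i∈S} x_i with f̂(S) = E[f(x)·Π_{i∈S} x_i]. The brute-force sketch below recovers the well-known coefficients of Majority on three bits.

```python
# Brute-force Fourier coefficients of Maj3 over the uniform distribution on
# {-1,1}^3: fhat(S) = E[f(x) * prod_{i in S} x_i]. The expansion comes out to
# Maj3(x) = (x1 + x2 + x3)/2 - (x1*x2*x3)/2.
import itertools
import numpy as np

n = 3
maj = lambda x: 1 if sum(x) > 0 else -1
cube = list(itertools.product([-1, 1], repeat=n))

for k in range(n + 1):
    for S in itertools.combinations(range(n), k):
        fhat = np.mean([maj(x) * np.prod([x[i] for i in S]) for x in cube])
        print(S, fhat)   # 0.5 on singletons, -0.5 on (0, 1, 2), 0 elsewhere
```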

Proceedings ArticleDOI
28 Nov 2014
TL;DR: An approach to automatically extract discriminative features for activity recognition based on Convolutional Neural Networks, which can capture local dependency and scale invariance of a signal as it has been shown in speech recognition and image recognition domains is proposed.
Abstract: A variety of real-life mobile sensing applications are becoming available, especially in the life-logging, fitness tracking and health monitoring domains. These applications use mobile sensors embedded in smart phones to recognize human activities in order to get a better understanding of human behavior. While progress has been made, human activity recognition remains a challenging task. This is partly due to the broad range of human activities as well as the rich variation in how a given activity can be performed. Using features that clearly separate between activities is crucial. In this paper, we propose an approach to automatically extract discriminative features for activity recognition. Specifically, we develop a method based on Convolutional Neural Networks (CNN), which can capture local dependency and scale invariance of a signal as it has been shown in speech recognition and image recognition domains. In addition, a modified weight sharing technique, called partial weight sharing, is proposed and applied to accelerometer signals to get further improvements. The experimental results on three public datasets, Skoda (assembly line activities), Opportunity (activities in kitchen), Actitracker (jogging, walking, etc.), indicate that our novel CNN-based approach is practical and achieves higher accuracy than existing state-of-the-art methods.
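
A minimal sketch of a network in this spirit, written with PyTorch: one-dimensional convolutions over a window of tri-axial accelerometer samples, pooling for local translation invariance, and a linear classifier. Layer sizes are illustrative assumptions, and the paper's partial weight sharing modification is not reproduced here.

```python
# A small 1-D CNN for windows of tri-axial accelerometer data. The
# architecture is a plain illustration, not the paper's exact model.
import torch
import torch.nn as nn

class ActivityCNN(nn.Module):
    def __init__(self, n_classes, window=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(3, 32, kernel_size=9, padding=4),  # 3 accelerometer axes
            nn.ReLU(),
            nn.MaxPool1d(4),        # pooling gives local translation invariance
            nn.Conv1d(32, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(64 * (window // 16), n_classes)

    def forward(self, x):           # x: (batch, 3, window)
        return self.classifier(self.features(x).flatten(1))

model = ActivityCNN(n_classes=6)    # e.g. jogging, walking, sitting, ...
print(model(torch.randn(8, 3, 128)).shape)   # -> torch.Size([8, 6])
```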

Journal ArticleDOI
TL;DR: This article considers product graphs as a graph model that helps extend the application of DSPG methods to large data sets through efficient implementation based on parallelization and vectorization and relates the presented framework to existing methods for large-scale data processing.
Abstract: Analysis and processing of very large data sets, or big data, poses a significant challenge. Massive data sets are collected and studied in numerous domains, from engineering sciences to social networks, biomolecular research, commerce, and security. Extracting valuable information from big data requires innovative approaches that efficiently process large amounts of data as well as handle and, moreover, utilize their structure. This article discusses a paradigm for large-scale data analysis based on the discrete signal processing (DSP) on graphs (DSPG). DSPG extends signal processing concepts and methodologies from the classical signal processing theory to data indexed by general graphs. Big data analysis presents several challenges to DSPG, in particular, in filtering and frequency analysis of very large data sets. We review fundamental concepts of DSPG, including graph signals and graph filters, graph Fourier transform, graph frequency, and spectrum ordering, and compare them with their counterparts from the classical signal processing theory. We then consider product graphs as a graph model that helps extend the application of DSPG methods to large data sets through efficient implementation based on parallelization and vectorization. We relate the presented framework to existing methods for large-scale data processing and illustrate it with an application to data compression.
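
The core DSPG objects the article reviews reduce to a few lines of linear algebra. The sketch below builds a graph Fourier transform from the adjacency matrix of a toy undirected graph and applies a simple polynomial-in-the-shift graph filter; the graph, the signal, and the filter taps are illustrative assumptions.

```python
# Toy DSPG: a graph signal s indexed by 4 nodes, the graph Fourier transform
# from the adjacency (shift) matrix's eigendecomposition, and a graph filter
# that is a polynomial in the shift.
import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)    # 4-cycle graph as shift operator
s = np.array([1.0, 2.0, 1.5, 0.5])           # one signal value per node

lam, V = np.linalg.eigh(A)                   # graph frequencies / Fourier basis
s_hat = V.T @ s                              # graph Fourier transform of s
H = 0.5 * np.eye(4) + 0.5 * A / np.abs(lam).max()   # filter h(A) = 0.5*I + 0.5*A_norm
print(s_hat, H @ s)                          # spectrum, and the filtered signal
```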

Proceedings ArticleDOI
24 Aug 2014
TL;DR: It is proved that the convergence rate does not decrease with increasing minibatch size, and with suitable implementations of approximate optimization, the resulting algorithm can outperform standard SGD in many scenarios.
Abstract: Stochastic gradient descent (SGD) is a popular technique for large-scale optimization problems in machine learning. In order to parallelize SGD, minibatch training needs to be employed to reduce the communication cost. However, an increase in minibatch size typically decreases the rate of convergence. This paper introduces a technique based on approximate optimization of a conservatively regularized objective function within each minibatch. We prove that the convergence rate does not decrease with increasing minibatch size. Experiments demonstrate that with suitable implementations of approximate optimization, the resulting algorithm can outperform standard SGD in many scenarios.
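
A sketch of the core idea under illustrative hyperparameters: rather than a single gradient step per minibatch, approximately minimize the minibatch loss plus a conservative proximal term (γ/2)·||w − w_t||² via a few inner gradient steps. This paraphrases the technique; it is not the authors' implementation.

```python
# Conservatively regularized minibatch step: approximately solve
#   w_{t+1} = argmin_w  f_B(w) + (gamma/2) * ||w - w_t||^2
# with a few inner gradient steps, instead of one plain SGD step per minibatch.
# Hyperparameters (gamma, inner_steps, lr) are illustrative.
import numpy as np

def logistic_grad(w, X, y):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (p - y) / len(y)

def conservative_step(w_t, X_b, y_b, gamma=1.0, inner_steps=10, lr=0.1):
    w = w_t.copy()
    for _ in range(inner_steps):             # approximate inner minimization
        w -= lr * (logistic_grad(w, X_b, y_b) + gamma * (w - w_t))
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 10)); w_true = rng.normal(size=10)
y = (X @ w_true > 0).astype(float)
w = np.zeros(10)
for _ in range(200):
    idx = rng.integers(0, len(y), size=1024)  # large minibatch, cheap to parallelize
    w = conservative_step(w, X[idx], y[idx])
print(((X @ w > 0) == (y > 0.5)).mean())      # training accuracy
```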

Posted Content
TL;DR: A comprehensive survey of the state-of-the-art methods for anomaly detection in data represented as graphs can be found in this article, where the authors highlight the effectiveness, scalability, generality, and robustness aspects of the methods.
Abstract: Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we provide a comprehensive exploration of both data mining and machine learning algorithms for these detection tasks. We give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the 'why', of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.

Proceedings ArticleDOI
01 Apr 2014
TL;DR: This paper argues that lexico-semantic content should additionally be invariant across languages and proposes a simple technique based on canonical correlation analysis (CCA) for incorporating multilingual evidence into vectors generated monolingually.
Abstract: The distributional hypothesis of Harris (1954), according to which the meaning of words is evidenced by the contexts they occur in, has motivated several effective techniques for obtaining vector space semantic representations of words using unannotated text corpora. This paper argues that lexico-semantic content should additionally be invariant across languages and proposes a simple technique based on canonical correlation analysis (CCA) for incorporating multilingual evidence into vectors generated monolingually. We evaluate the resulting word representations on standard lexical semantic evaluation tasks and show that our method produces substantially better semantic representations than monolingual techniques.
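
A minimal sketch of the CCA step, assuming monolingual vectors are already trained and aligned by a bilingual dictionary of translation pairs; scikit-learn's CCA and the random matrices below stand in for the paper's actual resources.

```python
# CCA over rows that correspond to translation pairs: X holds English vectors,
# Y the other language's vectors for the same word pairs. Random data stands in
# for real monolingual embeddings.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_pairs, d_en, d_fr, k = 500, 50, 40, 20
X = rng.normal(size=(n_pairs, d_en))       # English vectors of dictionary entries
Y = rng.normal(size=(n_pairs, d_fr))       # e.g. French vectors of translations

cca = CCA(n_components=k)
cca.fit(X, Y)
X_c, Y_c = cca.transform(X, Y)             # maximally correlated projections
# cca.x_rotations_ can then project the *entire* English vocabulary, not just
# the dictionary words, into the shared space.
print(X_c.shape, Y_c.shape)                # (500, 20) (500, 20)
```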

Journal ArticleDOI
TL;DR: In this paper, the concepts of low and high frequencies on graphs, and low-, high-, and band-pass graph signals and graph filters are defined and applied to sensor malfunction detection and data classification.
Abstract: Signals and datasets that arise in physical and engineering applications, as well as social, genetics, biomolecular, and many other domains, are becoming increasingly larger and more complex. In contrast to traditional time and image signals, data in these domains are supported by arbitrary graphs. Signal processing on graphs extends concepts and techniques from traditional signal processing to data indexed by generic graphs. This paper studies the concepts of low and high frequencies on graphs, and low-, high- and band-pass graph signals and graph filters. In traditional signal processing, these concepts are easily defined because of a natural frequency ordering that has a physical interpretation. For signals residing on graphs, in general, there is no obvious frequency ordering. We propose a definition of total variation for graph signals that naturally leads to a frequency ordering on graphs and defines low-, high-, and band-pass graph signals and filters. We study the design of graph filters with specified frequency response, and illustrate our approach with applications to sensor malfunction detection and data classification.
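
The proposed total variation is simple to compute and makes the frequency ordering concrete: eigenvectors whose eigenvalues lie near λ_max vary slowly on the graph, while those near −λ_max vary fastest. A small numeric sketch follows, on an illustrative random graph.

```python
# Total variation of a graph signal, TV(s) = || s - A s / |lambda_max| ||_1.
# For an eigenvector v with eigenvalue lam, TV(v) = |1 - lam/lambda_max| * ||v||_1,
# so ordering eigenvalues from lambda_max downward orders frequencies low-to-high.
import numpy as np

rng = np.random.default_rng(0)
A = np.triu(rng.integers(0, 2, size=(8, 8)), 1)
A = (A + A.T).astype(float)                   # random undirected graph
lam, V = np.linalg.eigh(A)
A_norm = A / np.abs(lam).max()

def total_variation(s):
    return np.abs(s - A_norm @ s).sum()

for l, v in sorted(zip(lam, V.T), key=lambda p: -p[0]):
    print(f"eigenvalue {l:+.2f}   TV {total_variation(v):.3f}")
    # low TV near lambda_max, high TV near -lambda_max
```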

Journal ArticleDOI
TL;DR: Grain boundary complexion transitions are identified as discussed by the authors as the root cause of a wide variety of materials phenomena, such as abnormal grain growth, grain boundary embrittlement and activated sintering, that have defied mechanistic explanation for years.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed two fast distributed gradient algorithms based on the centralized Nesterov gradient algorithm and established their convergence rates in terms of the per-node communications and the pernode gradient evaluations.
Abstract: We study distributed optimization problems when N nodes minimize the sum of their individual costs subject to a common vector variable. The costs are convex, have Lipschitz continuous gradient (with constant L), and bounded gradient. We propose two fast distributed gradient algorithms based on the centralized Nesterov gradient algorithm and establish their convergence rates in terms of the per-node communications K and the per-node gradient evaluations k. Our first method, Distributed Nesterov Gradient, achieves rates O(log K / K) and O(log k / k). Our second method, Distributed Nesterov gradient with Consensus iterations, assumes that all nodes know L and μ(W), the second largest singular value of the N × N doubly stochastic weight matrix W. It achieves rates O(1/K^(2−ξ)) and O(1/k^2) (ξ > 0 arbitrarily small). Further, we give for both methods explicit dependence of the convergence constants on N and W. Simulation examples illustrate our findings.
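
Here is a sketch of the first method's iteration on a toy problem, assuming each node holds a simple quadratic cost and W is the complete-graph averaging matrix. The cost functions and the step-size constant are illustrative, though the α_k = c/(k+1) schedule and the k/(k+3) momentum follow the paper's description.

```python
# Distributed Nesterov Gradient (sketch): each node mixes its neighbors'
# iterates through a doubly stochastic W, then takes a diminishing gradient
# step alpha_k = c/(k+1) with Nesterov momentum k/(k+3). Toy quadratic costs
# f_i(x) = 0.5*||x - t_i||^2, whose sum is minimized at the mean of the t_i.
import numpy as np

rng = np.random.default_rng(0)
N, d = 5, 3
targets = rng.normal(size=(N, d))
W = np.full((N, N), 1.0 / N)               # complete-graph, doubly stochastic mixing
x = np.zeros((N, d)); x_prev = x.copy()

for k in range(300):
    y = x + (k / (k + 3)) * (x - x_prev)   # per-node Nesterov extrapolation
    grads = y - targets                    # gradient of 0.5*||y_i - t_i||^2
    x_prev = x
    x = W @ y - (1.0 / (k + 1)) * grads    # mix with neighbors, then descend

print(np.abs(x - targets.mean(axis=0)).max())  # small: nodes near the consensus optimum
```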

Journal ArticleDOI
TL;DR: This review concentrates on the use of electric fields within catalyst particles to mitigate the effects of recombination and back-reaction and to increase photochemical reactivity.
Abstract: The photocatalytic activity of materials for water splitting is limited by the recombination of photogenerated electron–hole pairs as well as the back-reaction of intermediate species. This review concentrates on the use of electric fields within catalyst particles to mitigate the effects of recombination and back-reaction and to increase photochemical reactivity. Internal electric fields in photocatalysts can arise from ferroelectric phenomena, p–n junctions, polar surface terminations, and polymorph junctions. The manipulation of internal fields through the creation of charged interfaces in hierarchically structured materials is a promising strategy for the design of improved photocatalysts.

Journal ArticleDOI
TL;DR: Sailfish, a computational method for quantifying the abundance of previously annotated RNA isoforms from RNA-seq data, exemplifies the potential of lightweight algorithms for efficiently processing sequencing reads.
Abstract: A new algorithm speeds up the quantification of transcripts from RNA-seq data by doing away with read mapping.
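
To give a sense of what "doing away with read mapping" means in practice, here is a toy sketch of alignment-free k-mer counting against a transcript index. Sailfish's actual pipeline adds a perfect hash, bias correction, and an EM step to resolve k-mers shared between transcripts, none of which appears in this cartoon.

```python
# Toy illustration of the alignment-free idea: index transcripts by their
# k-mers, then accumulate k-mer counts from the reads without mapping any
# read as a whole. Sequences and k are illustrative.
from collections import Counter, defaultdict

def kmers(seq, k):
    return (seq[i:i + k] for i in range(len(seq) - k + 1))

transcripts = {"t1": "ACGTACGTGGA", "t2": "TTGACGTACCA"}
k = 5
index = defaultdict(set)                     # k-mer -> transcripts containing it
for name, seq in transcripts.items():
    for km in kmers(seq, k):
        index[km].add(name)

reads = ["ACGTACG", "GACGTAC"]
counts = Counter()
for read in reads:
    for km in kmers(read, k):
        for name in index.get(km, ()):       # credit each transcript sharing the k-mer
            counts[name] += 1
print(counts)                                # raw k-mer evidence per transcript
```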