
Showing papers by "Carnegie Mellon University" published in 2010


Proceedings ArticleDOI
13 Jun 2010
TL;DR: The Extended Cohn-Kanade (CK+) database is presented, along with baseline results from Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier, evaluated with leave-one-subject-out cross-validation for both AU and emotion detection on the posed data.
Abstract: In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) While AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed, 2) The lack of a common performance metric against which to evaluate new algorithms, and 3) Standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22% and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition to this, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leave-one-out subject cross-validation for both AU and emotion detection for the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks will be made available July 2010.

3,439 citations
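
A minimal sketch of the kind of baseline evaluation described above: a linear SVM scored with leave-one-subject-out cross-validation. This is not the CK+ authors' code; the features, labels, and subject IDs below are random stand-ins for AAM features and CK+ annotations.

    # Hypothetical illustration of leave-one-subject-out evaluation of a linear SVM.
    import numpy as np
    from sklearn.model_selection import LeaveOneGroupOut
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))            # stand-in for AAM shape/appearance features
    y = rng.integers(0, 7, size=200)          # stand-in for 7 emotion labels
    subjects = rng.integers(0, 25, size=200)  # subject ID of each sequence

    clf = make_pipeline(StandardScaler(), LinearSVC())
    scores = []
    for train, test in LeaveOneGroupOut().split(X, y, groups=subjects):
        clf.fit(X[train], y[train])
        scores.append(clf.score(X[test], y[test]))
    print(f"mean leave-one-subject-out accuracy: {np.mean(scores):.3f}")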


Journal ArticleDOI
Koji Nakamura1, K. Hagiwara, Ken Ichi Hikasa2, Hitoshi Murayama3  +180 moreInstitutions (92)
TL;DR: In this article, a biennial review summarizes much of particle physics. Using data from previous editions plus 2158 new measurements from 551 papers, the authors list, evaluate, and average measured properties of gauge bosons, leptons, quarks, mesons, and baryons.
Abstract: This biennial Review summarizes much of particle physics. Using data from previous editions, plus 2158 new measurements from 551 papers, we list, evaluate, and average measured properties of gauge bosons, leptons, quarks, mesons, and baryons. We also summarize searches for hypothetical particles such as Higgs bosons, heavy neutrinos, and supersymmetric particles. All the particle properties and search limits are listed in Summary Tables. We also give numerous tables, figures, formulae, and reviews of topics such as the Standard Model, particle detectors, probability, and statistics. Among the 108 reviews are many that are new or heavily revised including those on neutrino mass, mixing, and oscillations, QCD, top quark, CKM quark-mixing matrix, V-ud & V-us, V-cb & V-ub, fragmentation functions, particle detectors for accelerator and non-accelerator physics, magnetic monopoles, cosmological parameters, and big bang cosmology.

2,788 citations


Proceedings Article
11 Jul 2010
TL;DR: This work proposes an approach and a set of design principles for an intelligent computer agent that runs forever, and describes a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs.
Abstract: We consider here the problem of building a never-ending language learner; that is, an intelligent computer agent that runs forever and that each day must (1) extract, or read, information from the web to populate a growing structured knowledge base, and (2) learn to perform this task better than on the previous day. In particular, we propose an approach and a set of design principles for such an agent, describe a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs with an estimated precision of 74% after running for 67 days, and discuss lessons learned from this preliminary attempt to build a never-ending learning agent.

2,010 citations


Journal ArticleDOI
29 Oct 2010-Science
TL;DR: A psychometric methodology is presented for quantifying a factor termed “collective intelligence” (c), which reflects how well groups perform on a diverse set of group problem-solving tasks; the studies find converging evidence of a general collective intelligence factor that explains a group’s performance on a wide variety of tasks.
Abstract: Psychologists have repeatedly shown that a single statistical factor—often called “general intelligence”—emerges from the correlations among people's performance on a wide variety of cognitive tasks. But no one has systematically examined whether a similar kind of “collective intelligence” exists for groups of people. In two studies with 699 individuals, working in groups of two to five, we find converging evidence of a general collective intelligence factor that explains a group's performance on a wide variety of tasks. This “c factor” is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group. As research, management, and many other kinds of tasks are increasingly accomplished by groups—both those working face-to-face and "virtually" (1-3)—it is becoming even more important to understand the determinants of group performance.

1,941 citations
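
As a rough illustration of the factor-analytic idea behind the “c factor” (not the study's data or method), the sketch below simulates group scores on several tasks driven by one latent ability and extracts the first principal component as a simple proxy for a general factor.

    # Illustrative only: simulated group-by-task scores with a single latent ability.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(1)
    n_groups, n_tasks = 192, 6
    ability = rng.normal(size=(n_groups, 1))               # latent group-level ability
    loadings = rng.uniform(0.4, 0.8, size=(1, n_tasks))
    scores = ability @ loadings + rng.normal(scale=0.7, size=(n_groups, n_tasks))

    pca = PCA(n_components=1).fit(scores)                  # first component as a "general factor" proxy
    print("variance explained by the first component:", round(pca.explained_variance_ratio_[0], 2))
    print("task loadings on that component:", pca.components_[0].round(2))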


Proceedings Article
16 May 2010
TL;DR: This work connects measures of public opinion from polls with sentiment measured from text, and finds that several surveys on consumer confidence and political opinion over the 2008 to 2009 period correlate with sentiment word frequencies in contemporaneous Twitter messages.
Abstract: We connect measures of public opinion measured from polls with sentiment measured from text. We analyze several surveys on consumer confidence and political opinion over the 2008 to 2009 period, and find they correlate to sentiment word frequencies in contemporaneous Twitter messages. While our results vary across datasets, in several cases the correlations are as high as 80%, and capture important large-scale trends. The results highlight the potential of text streams as a substitute and supplement for traditional polling.

1,940 citations
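
A toy sketch of the general approach the abstract describes: compute a day-level sentiment score from counts of positive and negative words, smooth it, and correlate it with a polling series. The word lists, messages, and poll values here are invented placeholders, not the paper's lexicon or data.

    # Hypothetical example of correlating a text-derived sentiment series with a poll.
    import numpy as np

    POSITIVE = {"good", "great", "hopeful"}
    NEGATIVE = {"bad", "worried", "awful"}

    def day_sentiment(messages):
        pos = sum(any(w in POSITIVE for w in m.lower().split()) for m in messages)
        neg = sum(any(w in NEGATIVE for w in m.lower().split()) for m in messages)
        return pos / max(neg, 1)              # ratio of positive to negative messages

    def smooth(x, k=3):
        return np.convolve(x, np.ones(k) / k, mode="valid")

    daily_messages = [["good day", "feeling hopeful"], ["bad news", "so worried"]] * 10
    poll = np.linspace(60.0, 40.0, len(daily_messages))   # placeholder poll values

    sentiment = smooth(np.array([day_sentiment(d) for d in daily_messages]))
    print("correlation with the poll series:",
          round(np.corrcoef(sentiment, smooth(poll))[0, 1], 2))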


Journal ArticleDOI
Dalila Pinto1, Alistair T. Pagnamenta2, Lambertus Klei3, Richard Anney4  +178 moreInstitutions (46)
15 Jul 2010-Nature
TL;DR: The genome-wide characteristics of rare (<1% frequency) copy number variation in ASD are analysed using dense genotyping arrays to reveal many new genetic and functional targets in ASD that may lead to final connected pathways.
Abstract: The autism spectrum disorders (ASDs) are a group of conditions characterized by impairments in reciprocal social interaction and communication, and the presence of restricted and repetitive behaviours. Individuals with an ASD vary greatly in cognitive development, which can range from above average to intellectual disability. Although ASDs are known to be highly heritable (approximately 90%), the underlying genetic determinants are still largely unknown. Here we analysed the genome-wide characteristics of rare (<1% frequency) copy number variation in ASD using dense genotyping arrays. When comparing 996 ASD individuals of European ancestry to 1,287 matched controls, cases were found to carry a higher global burden of rare, genic copy number variants (CNVs) (1.19 fold, P = 0.012), especially so for loci previously implicated in either ASD and/or intellectual disability (1.69 fold, P = 3.4 x 10^-4). Among the CNVs there were numerous de novo and inherited events, sometimes in combination in a given family, implicating many novel ASD genes such as SHANK2, SYNGAP1, DLGAP2 and the X-linked DDX53-PTCHD1 locus. We also discovered an enrichment of CNVs disrupting functional gene sets involved in cellular proliferation, projection and motility, and GTPase/Ras signalling. Our results reveal many new genetic and functional targets in ASD that may lead to final connected pathways.

1,919 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: The design, construction and verification of cyber-physical systems pose a multitude of technical challenges that must be addressed by a cross-disciplinary community of researchers and educators.
Abstract: Cyber-physical systems (CPS) are physical and engineered systems whose operations are monitored, coordinated, controlled and integrated by a computing and communication core. Just as the internet transformed how humans interact with one another, cyber-physical systems will transform how we interact with the physical world around us. Many grand challenges await in the economically vital domains of transportation, health-care, manufacturing, agriculture, energy, defense, aerospace and buildings. The design, construction and verification of cyber-physical systems pose a multitude of technical challenges that must be addressed by a cross-disciplinary community of researchers and educators.

1,692 citations


Journal ArticleDOI
TL;DR: The challenges associated with the application of both group-based trajectory and growth mixture modeling are discussed, and a set of preliminary guidelines is proposed for applied researchers to follow when reporting model results.
Abstract: Group-based trajectory models are increasingly being applied in clinical research to map the developmental course of symptoms and assess heterogeneity in response to clinical interventions. In this review, we provide a nontechnical overview of group-based trajectory and growth mixture modeling alongside a sampling of how these models have been applied in clinical research. We discuss the challenges associated with the application of both types of group-based models and propose a set of preliminary guidelines for applied researchers to follow when reporting model results. Future directions in group-based modeling applications are discussed, including the use of trajectory models to facilitate causal inference when random assignment to treatment condition is not possible.

1,644 citations
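
As a much-simplified illustration of the trajectory-grouping idea (not a true group-based trajectory or growth mixture model, which jointly estimates class membership and within-class growth), the sketch below summarizes each simulated subject's symptom course with an intercept and slope and then clusters those coefficients.

    # Simplified stand-in for trajectory grouping: cluster per-subject growth coefficients.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(2)
    t = np.arange(6.0)                                     # assessment waves
    latent_class = rng.integers(0, 2, size=100)            # improvers vs. stable (simulated)
    slope = np.where(latent_class == 0, -1.5, 0.0)
    y = 10 + slope[:, None] * t + rng.normal(scale=1.0, size=(100, t.size))

    design = np.column_stack([np.ones_like(t), t])
    coefs = np.linalg.lstsq(design, y.T, rcond=None)[0].T  # per-subject (intercept, slope)

    gmm = GaussianMixture(n_components=2, random_state=0).fit(coefs)
    print("estimated class sizes:", np.bincount(gmm.predict(coefs)))
    print("class means (intercept, slope):")
    print(gmm.means_.round(2))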


Book
17 May 2010
TL;DR: This book bridges learning research and teaching practice, organizing its chapters around seven principles of how students learn: prior knowledge, knowledge organization, motivation, developing mastery, practice and feedback, student development and course climate, and becoming self-directed learners.
Abstract: List of Figures, Tables, and Exhibits. Foreword (Richard E. Mayer). Acknowledgments. About the Authors. Introduction Bridging Learning Research and Teaching Practice. 1 How Does Students' Prior Knowledge Affect Their Learning? 2 How Does the Way Students Organize Knowledge Affect Their Learning? 3 What Factors Motivate Students to Learn? 4 How Do Students Develop Mastery? 5 What Kinds of Practice and Feedback Enhance Learning? 6 Why Do Student Development and Course Climate Matter for Student Learning? 7 How Do Students Become Self-Directed Learners? Conclusion Applying the Seven Principles to Ourselves. Appendices. Appendix A What Is Student Self-Assessment and How Can We Use It? Appendix B What Are Concept Maps and How Can We Use Them? Appendix C What Are Rubrics and How Can We Use Them? Appendix D What Are Learning Objectives and How Can We Use Them? Appendix E What Are Ground Rules and How Can We Use Them? Appendix F What Are Exam Wrappers and How Can We Use Them? Appendix G What Are Checklists and How Can We Use Them? Appendix H What Is Reader Response/Peer Review and How Can We Use It? References. Name Index. Subject Index.

1,567 citations


Journal ArticleDOI
TL;DR: The results strongly suggest that DeepQA is an effective and extensible architecture that may be used as a foundation for combining, deploying, evaluating and advancing a wide range of algorithmic techniques to rapidly advance the field of QA.
Abstract: IBM Research undertook a challenge to build a computer system that could compete at the human champion level in real time on the American TV Quiz show, Jeopardy! The extent of the challenge includes fielding a real-time automatic contestant on the show, not merely a laboratory exercise. The Jeopardy! Challenge helped us address requirements that led to the design of the DeepQA architecture and the implementation of Watson. After 3 years of intense research and development by a core team of about 20 researchers, Watson is performing at human expert levels in terms of precision, confidence and speed at the Jeopardy! Quiz show. Our results strongly suggest that DeepQA is an effective and extensible architecture that may be used as a foundation for combining, deploying, evaluating and advancing a wide range of algorithmic techniques to rapidly advance the field of QA.

1,446 citations


Journal ArticleDOI
TL;DR: This paper introduces the Multi-PIE face database, describes the recording procedure, and presents results from baseline experiments using PCA and LDA classifiers to highlight similarities and differences between PIE and Multi-PIE.

Journal ArticleDOI
TL;DR: Recent research advances have opened an avenue to achieving the precise control of Aun(SR)m nanoclusters at the ultimate atomic level, and may stimulate a long-lasting and wider scientific and technological interest in this special type of Au nanoparticles.
Abstract: The scientific study of gold nanoparticles (typically 1–100 nm) has spanned more than 150 years since Faraday's time and will apparently last longer. This review will focus on a special type of ultrasmall (<2 nm) yet robust gold nanoparticles that are protected by thiolates, so-called gold thiolate nanoclusters, denoted as Aun(SR)m (where, n and m represent the number of gold atoms and thiolate ligands, respectively). Despite the past fifteen years' intense work on Aun(SR)m nanoclusters, there is still a tremendous amount of science that is not yet understood, which is mainly hampered by the unavailability of atomically precise Aun(SR)m clusters and by their unknown structures. Nonetheless, recent research advances have opened an avenue to achieving the precise control of Aun(SR)m nanoclusters at the ultimate atomic level. The successful structural determination of Au102(SPhCOOH)44 and [Au25(SCH2CH2Ph)18]q (q = −1, 0) by X-ray crystallography has shed some light on the unique atomic packing structure adopted in these gold thiolate nanoclusters, and has also permitted a precise correlation of their structure with properties, including electronic, optical and magnetic properties. Some exciting research is anticipated to take place in the next few years and may stimulate a long-lasting and wider scientific and technological interest in this special type of Au nanoparticles.

Journal ArticleDOI
05 Aug 2010-Nature
TL;DR: Foldit is described, a multiplayer online game that engages non-scientists in solving hard prediction problems and shows that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues.
Abstract: A natural polypeptide chain can fold into a native protein in microseconds, but predicting such stable three-dimensional structure from any given amino-acid sequence and first physical principles remains a formidable computational challenge. Aiming to recruit human visual and strategic powers to the task, Seth Cooper, David Baker and colleagues turned their 'Rosetta' structure-prediction algorithm into an online multiplayer game called Foldit, in which thousands of non-scientists competed and collaborated to produce a rich set of new algorithms and search strategies for protein structure refinement. The work shows that even computationally complex scientific problems can be effectively crowd-sourced using interactive multiplayer games. People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games (1-3), but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology (4), while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.

Posted Content
TL;DR: In this article, a no-regret algorithm is proposed that trains a stationary deterministic policy with good performance under the distribution of observations it induces in sequential prediction settings such as imitation learning, and it outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
Abstract: Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. This leads to poor performance in theory and often in practice. Some recent approaches provide stronger guarantees in this setting, but remain somewhat unsatisfactory as they train either non-stationary or stochastic policies and require a large number of iterations. In this paper, we propose a new iterative algorithm, which trains a stationary deterministic policy, that can be seen as a no regret algorithm in an online learning setting. We show that any such no regret algorithm, combined with additional reduction assumptions, must find a policy with good performance under the distribution of observations it induces in such sequential settings. We demonstrate that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
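A minimal sketch of the iterative scheme the abstract describes, assuming a toy 1-D control task and a scripted expert (both invented here): run the current learned policy, ask the expert to label the states it visits, aggregate those labels into the dataset, and retrain.

    # Toy illustration of the run / label / aggregate / retrain loop.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(3)

    def expert_action(state):
        return 0 if state > 0 else 1                 # push the state toward zero

    def rollout(policy, horizon=30):
        states, state = [], rng.normal()
        for _ in range(horizon):
            states.append(state)
            step = -0.5 if policy(state) == 0 else 0.5
            state = state + step + rng.normal(scale=0.1)
        return states

    data_s = rollout(expert_action)                  # iteration 0: expert demonstrations
    data_a = [expert_action(s) for s in data_s]
    clf = DecisionTreeClassifier(max_depth=3)

    for _ in range(5):                               # dataset-aggregation iterations
        clf.fit(np.array(data_s).reshape(-1, 1), data_a)
        visited = rollout(lambda s: int(clf.predict([[s]])[0]))  # states the learner induces
        data_s += visited
        data_a += [expert_action(s) for s in visited]            # expert labels on those states

    print("final aggregated dataset size:", len(data_s))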

Journal ArticleDOI
TL;DR: This work explicitly shows that the surface ligands (-SR) play a major role in enhancing the fluorescence of gold nanoparticles, and demonstrates strategies to enhance the fluorescence of thiolate ligand-protected gold nanoparticles.
Abstract: The fluorescence of metal nanoparticles (such as gold and silver) has long been an intriguing topic and has drawn considerable research interest. However, the origin of fluorescence still remains unclear. In this work, on the basis of atomically monodisperse, 25-atom gold nanoclusters we present some interesting results on the fluorescence from [Au25(SR)18]q (where q is the charge state of the particle), which has shed some light on this issue. Our work explicitly shows that the surface ligands (-SR) play a major role in enhancing the fluorescence of gold nanoparticles. Specifically, the surface ligands can influence the fluorescence in two different ways: (i) charge transfer from the ligands to the metal nanoparticle core (i.e., LMNCT) through the Au−S bonds, and (ii) direct donation of delocalized electrons of electron-rich atoms or groups of the ligands to the metal core. Following these two mechanisms, we have demonstrated strategies to enhance the fluorescence of thiolate ligand-protected gold nanoparticles.

Book ChapterDOI
15 Aug 2010
TL;DR: Verifiable computation, as introduced in this paper, enables a computationally weak client to outsource the computation of a function F on various dynamically-chosen inputs x1, ..., xk to one or more workers.
Abstract: We introduce and formalize the notion of Verifiable Computation, which enables a computationally weak client to "outsource" the computation of a function F on various dynamically-chosen inputs x1, ..., xk to one or more workers. The workers return the result of the function evaluation, e.g., yi = F(xi), as well as a proof that the computation of F was carried out correctly on the given value xi. The primary constraint is that the verification of the proof should require substantially less computational effort than computing F(xi) from scratch. We present a protocol that allows the worker to return a computationally-sound, non-interactive proof that can be verified in O(m·poly(λ)) time, where m is the bit-length of the output of F, and λ is a security parameter. The protocol requires a one-time pre-processing stage by the client which takes O(|C|·poly(λ)) time, where C is the smallest known Boolean circuit computing F. Unlike previous work in this area, our scheme also provides (at no additional cost) input and output privacy for the client, meaning that the workers do not learn any information about the xi or yi values.

Proceedings ArticleDOI
06 Dec 2010
TL;DR: A high-level image representation, called the Object Bank, is proposed, where an image is represented as a scale-invariant response map of a large number of pre-trained generic object detectors, blind to the testing dataset or visual task.
Abstract: Robust low-level image features have been proven to be effective representations for a variety of visual recognition tasks such as object recognition and scene classification; but pixels, or even local image patches, carry little semantic meanings. For high level visual tasks, such low-level image representations are potentially not enough. In this paper, we propose a high-level image representation, called the Object Bank, where an image is represented as a scale-invariant response map of a large number of pre-trained generic object detectors, blind to the testing dataset or visual task. Leveraging on the Object Bank representation, superior performances on high level visual recognition tasks can be achieved with simple off-the-shelf classifiers such as logistic regression and linear SVM. Sparsity algorithms make our representation more efficient and scalable for large scene datasets, and reveal semantically meaningful feature patterns.
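A conceptual sketch of an object-bank-style feature (not the authors' implementation): stack the response maps of many pre-trained detectors computed at a couple of scales, max-pool each map over a coarse spatial grid, and concatenate the pooled values into one fixed-length vector. The detector bank below is a random stand-in.

    # Conceptual stand-in for an object-detector response-map representation.
    import numpy as np

    rng = np.random.default_rng(4)

    def detector_responses(image, n_detectors=8):
        """Stand-in for pre-trained object detectors: one response map per detector."""
        return rng.random((n_detectors,) + image.shape)

    def grid_max_pool(response, grid=2):
        h, w = response.shape
        return np.array([response[i * h // grid:(i + 1) * h // grid,
                                  j * w // grid:(j + 1) * w // grid].max()
                         for i in range(grid) for j in range(grid)])

    def object_bank_feature(image, steps=(1, 2)):
        feats = []
        for step in steps:                            # crude multi-scale via subsampling
            for resp in detector_responses(image[::step, ::step]):
                feats.append(grid_max_pool(resp))
        return np.concatenate(feats)                  # fixed-length input for an SVM, etc.

    print(object_bank_feature(np.zeros((64, 64))).shape)   # 2 scales * 8 detectors * 4 cells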

Journal ArticleDOI
01 Feb 2010
TL;DR: In this paper, the authors provide an overview of the historical development of statistical network modeling and then introduce a number of examples that have been studied in the network literature; their subsequent discussion focuses on some prominent static and dynamic network models and their interconnections.
Abstract: Networks are ubiquitous in science and have become a focal point for discussion in everyday life. Formal statistical models for the analysis of network data have emerged as a major topic of interest in diverse areas of study, and most of these involve a form of graphical representation. Probability models on graphs date back to 1959. Along with empirical studies in social psychology and sociology from the 1960s, these early works generated an active "network community" and a substantial literature in the 1970s. This effort moved into the statistical literature in the late 1970s and 1980s, and the past decade has seen a burgeoning network literature in statistical physics and computer science. The growth of the World Wide Web and the emergence of online "networking communities" such as Facebook, MySpace, and LinkedIn, and a host of more specialized professional network communities has intensified interest in the study of networks and network data. Our goal in this review is to provide the reader with an entry point to this burgeoning literature. We begin with an overview of the historical development of statistical network modeling and then we introduce a number of examples that have been studied in the network literature. Our subsequent discussion focuses on a number of prominent static and dynamic network models and their interconnections. We emphasize formal model descriptions, and pay special attention to the interpretation of parameters and their estimation. We end with a description of some open problems and challenges for machine learning and statistics.

Proceedings ArticleDOI
10 Apr 2010
TL;DR: It is found that directed communication is associated with greater feelings of bonding social capital and lower loneliness, but has only a modest relationship with bridging social capital, which is primarily related to overall friend network size.
Abstract: Previous research has shown a relationship between use of social networking sites and feelings of social capital. However, most studies have relied on self-reports by college students. The goals of the current study are to (1) validate the common self-report scale using empirical data from Facebook, (2) test whether previous findings generalize to older and international populations, and (3) delve into the specific activities linked to feelings of social capital and loneliness. In particular, we investigate the role of directed interaction between pairs---such as wall posts, comments, and "likes" --- and consumption of friends' content, including status updates, photos, and friends' conversations with other friends. We find that directed communication is associated with greater feelings of bonding social capital and lower loneliness, but has only a modest relationship with bridging social capital, which is primarily related to overall friend network size. Surprisingly, users who consume greater levels of content report reduced bridging and bonding social capital and increased loneliness. Implications for designs to support well-being are discussed.

Proceedings ArticleDOI
10 Apr 2010
TL;DR: A stage-based model of personal informatics systems composed of five stages (preparation, collection, integration, reflection, and action) is derived and barriers in each of the stages are identified.
Abstract: People strive to obtain self-knowledge. A class of systems called personal informatics is appearing that help people collect and reflect on personal information. However, there is no comprehensive list of problems that users experience using these systems, and no guidance for making these systems more effective. To address this, we conducted surveys and interviews with people who collect and reflect on personal information. We derived a stage-based model of personal informatics systems composed of five stages (preparation, collection, integration, reflection, and action) and identified barriers in each of the stages. These stages have four essential properties: barriers cascade to later stages; they are iterative; they are user-driven and/or system-driven; and they are uni-faceted or multi-faceted. From these properties, we recommend that personal informatics systems should 1) be designed in a holistic manner across the stages; 2) allow iteration between stages; 3) apply an appropriate balance of automated technology and user control within each stage to facilitate the user experience; and 4) explore support for associating multiple facets of people's lives to enrich the value of systems.

Journal ArticleDOI
TL;DR: The total structure of Au38(SC2H4Ph)24 nanoparticles determined by single crystal X-ray crystallography is reported, which is based upon a face-fused Au23 biicosahedral core and capped by three monomeric Au(SR)2 staples at the waist of the Au23 rod.
Abstract: We report the total structure of Au38(SC2H4Ph)24 nanoparticles determined by single crystal X-ray crystallography. This nanoparticle is based upon a face-fused Au23 biicosahedral core, which is further capped by three monomeric Au(SR)2 staples at the waist of the Au23 rod and six dimeric staples with three on the top icosahedron and other three on the bottom icosahedron. The six Au2(SR)3 staples are arranged in a staggered configuration, and the Au38S24 framework has a C3 rotation axis.

Book
19 Apr 2010
TL;DR: The Mechanics and Thermodynamics of Continua provides a unified treatment of continuum mechanics and thermodynamics that emphasises the universal status of the basic balances and the entropy imbalance.
Abstract: The Mechanics and Thermodynamics of Continua presents a unified treatment of continuum mechanics and thermodynamics that emphasises the universal status of the basic balances and the entropy imbalance. These laws are viewed as fundamental building blocks on which to frame theories of material behaviour. As a valuable reference source, this book presents a detailed and complete treatment of continuum mechanics and thermodynamics for graduates and advanced undergraduates in engineering, physics and mathematics. The chapters on plasticity discuss the standard isotropic theories and, in addition, crystal plasticity and gradient plasticity.

Journal ArticleDOI
09 Aug 2010
TL;DR: This paper presents an overview of recent work on gossip algorithms, including convergence rate results, which are related to the number of transmitted messages and thus the amount of energy consumed in the network for gossiping, as well as the use of gossip algorithms for canonical signal processing tasks including distributed estimation, source localization, and compression.
Abstract: Gossip algorithms are attractive for in-network processing in sensor networks because they do not require any specialized routing, there is no bottleneck or single point of failure, and they are robust to unreliable wireless network conditions. Recently, there has been a surge of activity in the computer science, control, signal processing, and information theory communities, developing faster and more robust gossip algorithms and deriving theoretical performance guarantees. This paper presents an overview of recent work in the area. We describe convergence rate results, which are related to the number of transmitted messages and thus the amount of energy consumed in the network for gossiping. We discuss issues related to gossiping over wireless links, including the effects of quantization and noise, and we illustrate the use of gossip algorithms for canonical signal processing tasks including distributed estimation, source localization, and compression.
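The core gossip primitive is simple enough to show directly: at each step a random pair of neighboring nodes replaces both of their values with the pairwise average, which preserves the network-wide sum and drives every node toward the mean. A minimal sketch on a ring topology (chosen here only for illustration):

    # Randomized pairwise-averaging gossip on a ring; values converge to the mean.
    import numpy as np

    rng = np.random.default_rng(5)
    n = 20
    values = rng.random(n)                             # e.g., local sensor measurements
    edges = [(i, (i + 1) % n) for i in range(n)]       # ring topology
    target = values.mean()

    for _ in range(5000):
        i, j = edges[rng.integers(len(edges))]
        values[i] = values[j] = (values[i] + values[j]) / 2

    print("max deviation from the true mean:", np.abs(values - target).max())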

Journal ArticleDOI
TL;DR: In this article, the authors present results from the factor analysis of 43 AMS datasets (27 of the datasets are reanalyzed in this work) and provide a holistic overview of Northern Hemisphere organic aerosol (OA) and its evolution in the atmosphere.
Abstract: . In this study we compile and present results from the factor analysis of 43 Aerosol Mass Spectrometer (AMS) datasets (27 of the datasets are reanalyzed in this work). The components from all sites, when taken together, provide a holistic overview of Northern Hemisphere organic aerosol (OA) and its evolution in the atmosphere. At most sites, the OA can be separated into oxygenated OA (OOA), hydrocarbon-like OA (HOA), and sometimes other components such as biomass burning OA (BBOA). We focus on the OOA components in this work. In many analyses, the OOA can be further deconvolved into low-volatility OOA (LV-OOA) and semi-volatile OOA (SV-OOA). Differences in the mass spectra of these components are characterized in terms of the two main ions m/z 44 (CO2+) and m/z 43 (mostly C2H3O+), which are used to develop a new mass spectral diagnostic for following the aging of OA components in the atmosphere. The LV-OOA component spectra have higher f44 (ratio of m/z 44 to total signal in the component mass spectrum) and lower f43 (ratio of m/z 43 to total signal in the component mass spectrum) than SV-OOA. A wide range of f44 and O:C ratios are observed for both LV-OOA (0.17±0.04, 0.73±0.14) and SV-OOA (0.07±0.04, 0.35±0.14) components, reflecting the fact that there is a continuum of OOA properties in ambient aerosol. The OOA components (OOA, LV-OOA, and SV-OOA) from all sites cluster within a well-defined triangular region in the f44 vs. f43 space, which can be used as a standardized means for comparing and characterizing any OOA components (laboratory or ambient) observed with the AMS. Examination of the OOA components in this triangular space indicates that OOA component spectra become increasingly similar to each other and to fulvic acid and HULIS sample spectra as f44 (a surrogate for O:C and an indicator of photochemical aging) increases. This indicates that ambient OA converges towards highly aged LV-OOA with atmospheric oxidation. The common features of the transformation between SV-OOA and LV-OOA at multiple sites potentially enable a simplified description of the oxidation of OA in the atmosphere. Comparison of laboratory SOA data with ambient OOA indicates that laboratory SOA are more similar to SV-OOA and rarely become as oxidized as ambient LV-OOA, likely due to the higher loadings employed in the experiments and/or limited oxidant exposure in most chamber experiments.
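The f44 and f43 diagnostics used above are simple fractions of a component mass spectrum, so a short sketch suffices; the spectrum below is made up for illustration.

    # f44 (f43): fraction of the total signal in a component spectrum at m/z 44 (43).
    spectrum = {43: 0.06, 44: 0.15, 55: 0.03, 57: 0.02}   # m/z -> signal, arbitrary units
    total = sum(spectrum.values())
    f44 = spectrum.get(44, 0.0) / total
    f43 = spectrum.get(43, 0.0) / total
    print(f"f44 = {f44:.2f}, f43 = {f43:.2f}")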

Journal ArticleDOI
TL;DR: This article surveys techniques developed in civil engineering and computer science that can be utilized to automate the process of creating as-built building information models (BIMs), and outlines the main methods used by these algorithms for representing knowledge about shape, identity, and relationships.

Proceedings ArticleDOI
16 May 2010
TL;DR: The algorithms for dynamic taint analysis and forward symbolic execution are described as extensions to the run-time semantics of a general language, and important implementation choices, common pitfalls, and considerations for using these techniques in a security context are highlighted.
Abstract: Dynamic taint analysis and forward symbolic execution are quickly becoming staple techniques in security analyses. Example applications of dynamic taint analysis and forward symbolic execution include malware analysis, input filter generation, test case generation, and vulnerability discovery. Despite the widespread usage of these two techniques, there has been little effort to formally define the algorithms and summarize the critical issues that arise when these techniques are used in typical security contexts. The contributions of this paper are two-fold. First, we precisely describe the algorithms for dynamic taint analysis and forward symbolic execution as extensions to the run-time semantics of a general language. Second, we highlight important implementation choices, common pitfalls, and considerations when using these techniques in a security context.
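A toy sketch of the dynamic-taint idea (not the paper's formal semantics): values carry a taint bit, operations propagate it, and a policy check fires when a tainted value reaches a security-sensitive sink. The class and sink below are invented for illustration.

    # Toy dynamic taint propagation with a check at a control-flow sink.
    class Tainted:
        def __init__(self, value, tainted=False):
            self.value, self.tainted = value, tainted

        def __add__(self, other):
            other_value = other.value if isinstance(other, Tainted) else other
            other_taint = other.tainted if isinstance(other, Tainted) else False
            return Tainted(self.value + other_value, self.tainted or other_taint)

    def jump_to(address):
        if address.tainted:                        # taint policy at the sink
            raise RuntimeError("tainted value used as a jump target")

    user_input = Tainted(0x41414141, tainted=True) # attacker-controlled source
    offset = Tainted(8)
    try:
        jump_to(offset + user_input)
    except RuntimeError as err:
        print("policy violation:", err)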

Journal ArticleDOI
TL;DR: In this paper, the problem of estimating the graph associated with a binary Ising Markov random field is considered, where the neighborhood of any given node is estimated by performing logistic regression subject to an l1-constraint.
Abstract: We consider the problem of estimating the graph associated with a binary Ising Markov random field. We describe a method based on l1-regularized logistic regression, in which the neighborhood of any given node is estimated by performing logistic regression subject to an l1-constraint. The method is analyzed under high-dimensional scaling in which both the number of nodes p and maximum neighborhood size d are allowed to grow as a function of the number of observations n. Our main results provide sufficient conditions on the triple (n, p, d) and the model parameters for the method to succeed in consistently estimating the neighborhood of every node in the graph simultaneously. With coherence conditions imposed on the population Fisher information matrix, we prove that consistent neighborhood selection can be obtained for sample sizes n = Ω(d^3 log p) with exponentially decaying error. When these same conditions are imposed directly on the sample matrices, we show that a reduced sample size of n = Ω(d^2 log p) suffices for the method to estimate neighborhoods consistently. Although this paper focuses on the binary graphical models, we indicate how a generalization of the method of the paper would apply to general discrete Markov random fields.
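The per-node procedure is easy to sketch: regress each node on all the others with an l1-penalized logistic regression and read the estimated neighborhood off the nonzero coefficients. The data below are independent random spins rather than draws from an Ising model, so the recovered neighborhood should be (nearly) empty; this only illustrates the mechanics.

    # Neighborhood selection for one node via l1-penalized logistic regression.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(6)
    n, p = 500, 10
    X = np.where(rng.random((n, p)) < 0.5, -1, 1)      # placeholder +/-1 data

    node = 0
    others = np.delete(np.arange(p), node)
    model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    model.fit(X[:, others], X[:, node])
    neighborhood = others[np.abs(model.coef_[0]) > 1e-6]
    print("estimated neighbors of node 0:", neighborhood)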

Proceedings Article
08 Jul 2010
TL;DR: The expressiveness of the GraphLab framework is demonstrated by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso, and Compressed Sensing; it is shown that using GraphLab the authors can achieve excellent parallel performance on large-scale real-world problems.
Abstract: Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and Compressed Sensing. We show that using GraphLab we can achieve excellent parallel performance on large scale real-world problems.
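A sequential sketch of the vertex-update abstraction the abstract refers to (not GraphLab's actual API): an update function reads a vertex's neighbors, recomputes the vertex's value, and re-schedules neighbors when the value changes, shown here on a tiny PageRank-style example.

    # Vertex-program style updates with a work queue (sequential stand-in for the runtime).
    from collections import deque

    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}   # adjacency lists
    rank = {v: 1.0 for v in graph}

    def update(v):
        """PageRank-style update; returns True if the value changed noticeably."""
        new = 0.15 + 0.85 * sum(rank[u] / len(graph[u]) for u in graph[v])
        changed = abs(new - rank[v]) > 1e-6
        rank[v] = new
        return changed

    schedule = deque(graph)                      # start with every vertex scheduled
    while schedule:
        v = schedule.popleft()
        if update(v):
            schedule.extend(u for u in graph[v] if u not in schedule)

    print({v: round(r, 3) for v, r in rank.items()})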

Journal ArticleDOI
TL;DR: The authors demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects and that very simple models of imitation can produce substantial correlations between an individual’s enduring traits and his or her choices, even when there is no intrinsic affinity between them.
Abstract: We consider processes on social networks that can potentially involve three factors: homophily, or the formation of social ties due to matching individual traits; social contagion, also known as social influence; and the causal effect of an individual's covariates on their behavior or other measurable responses. We show that, generically, all of these are confounded with each other. Distinguishing them from one another requires strong assumptions on the parametrization of the social process or on the adequacy of the covariates used (or both). In particular we demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects, and that very simple models of imitation (a form of social contagion) can produce substantial correlations between an individual's enduring traits and their choices, even when there is no intrinsic affinity between them. We also suggest some possible constructive responses to these results.
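The imitation point can be made concrete with a small simulation (invented here, not taken from the paper): ties form only between people with similar values of a latent trait, a choice spreads only by copying neighbors, and yet the trait and the choice end up correlated even though the adoption rule never looks at the trait.

    # Homophily plus imitation produces a trait-choice correlation with no causal link.
    import numpy as np

    rng = np.random.default_rng(7)
    n = 300
    trait = rng.normal(size=n)                         # enduring individual trait

    # homophily: ties only between people with similar traits
    adj = np.abs(trait[:, None] - trait[None, :]) < 0.3
    np.fill_diagonal(adj, False)

    # contagion: one early adopter, then a few rounds of pure imitation
    choice = np.zeros(n, dtype=bool)
    choice[np.argsort(trait)[n // 10]] = True          # arbitrary early adopter
    for _ in range(3):
        choice = choice | adj[:, choice].any(axis=1)

    print("adopters:", int(choice.sum()))
    print("corr(trait, choice):", round(np.corrcoef(trait, choice.astype(float))[0, 1], 2))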

Journal ArticleDOI
TL;DR: The proposed GQI method can be applied to grid or shell sampling schemes and can provide directional and quantitative information about the crossing fibers.
Abstract: Based on the Fourier transform relation between diffusion magnetic resonance (MR) signals and the underlying diffusion displacement, a new relation is derived to estimate the spin distribution function (SDF) directly from diffusion MR signals. This relation leads to an imaging method called generalized q-sampling imaging (GQI), which can obtain the SDF from the shell sampling scheme used in q-ball imaging (QBI) or the grid sampling scheme used in diffusion spectrum imaging (DSI). The accuracy of GQI was evaluated by a simulation study and an in vivo experiment in comparison with QBI and DSI. The simulation results showed that the accuracy of GQI was comparable to that of QBI and DSI. The simulation study of GQI also showed that an anisotropy index, named quantitative anisotropy, was correlated with the volume fraction of the resolved fiber component. The in vivo images of GQI demonstrated that SDF patterns were similar to the ODFs reconstructed by QBI or DSI. The tractography generated from GQI was also similar to those generated from QBI and DSI. In conclusion, the proposed GQI method can be applied to grid or shell sampling schemes and can provide directional and quantitative information about the crossing fibers.