scispace - formally typeset
Search or ask a question
Author

Emre Kiciman

Bio: Emre Kiciman is an academic researcher from Microsoft. The author has contributed to research in topics: Social media & The Internet. The author has an hindex of 38, co-authored 156 publications receiving 6297 citations. Previous affiliations of Emre Kiciman include Stanford University & University of California, Berkeley.


Papers
More filters
Proceedings ArticleDOI
23 Jun 2002
TL;DR: This work presents a dynamic analysis methodology that automates problem determination in these environments by coarse-grained tagging of numerous real client requests as they travel through the system and using data mining techniques to correlate the believed failures and successes of these requests to determine which components are most likely to be at fault.
Abstract: Traditional problem determination techniques rely on static dependency models that are difficult to generate accurately in today's large, distributed, and dynamic application environments such as e-commerce systems. We present a dynamic analysis methodology that automates problem determination in these environments by 1) coarse-grained tagging of numerous real client requests as they travel through the system and 2) using data mining techniques to correlate the believed failures and successes of these requests to determine which components are most likely to be at fault. To validate our methodology, we have implemented Pinpoint, a framework for root cause analysis on the J2EE platform that requires no knowledge of the application components. Pinpoint consists of three parts: a communications layer that traces client requests, a failure detector that uses traffic-sniffing and middleware instrumentation, and a data analysis engine. We evaluate Pinpoint by injecting faults into various application components and show that Pinpoint identifies the faulty components with high accuracy and produces few false-positives.

910 citations

Proceedings ArticleDOI
07 May 2016
TL;DR: This paper develops a statistical methodology to infer which individuals could undergo transitions from mental health discourse to suicidal ideation, and utilizes semi-anonymous support communities on Reddit as unobtrusive data sources to infer the likelihood of these shifts.
Abstract: History of mental illness is a major factor behind suicide risk and ideation. However research efforts toward characterizing and forecasting this risk is limited due to the paucity of information regarding suicide ideation, exacerbated by the stigma of mental illness. This paper fills gaps in the literature by developing a statistical methodology to infer which individuals could undergo transitions from mental health discourse to suicidal ideation. We utilize semi-anonymous support communities on Reddit as unobtrusive data sources to infer the likelihood of these shifts. We develop language and interactional measures for this purpose, as well as a propensity score matching based statistical approach. Our approach allows us to derive distinct markers of shifts to suicidal ideation. These markers can be modeled in a prediction framework to identify individuals likely to engage in suicidal ideation in the future. We discuss societal and ethical implications of this research.

513 citations

Proceedings Article
29 Mar 2004
TL;DR: The approach enables significantly stronger capabilities in failure detection, failure diagnosis, impact analysis, and understanding system evolution, and is explored with three real implementations, two of which service millions of requests per day.
Abstract: We present a new approach to managing failures and evolution in large, complex distributed systems using runtime paths. We use the paths that requests follow as they move through the system as our core abstraction, and our "macro" approach focuses on component interactions rather than the details of the components themselves. Paths record component performance and interactions, are user- and request-centric, and occur in sufficient volume to enable statistical analysis, all in a way that is easily reusable across applications. Automated statistical analysis of multiple paths allows for the detection and diagnosis of complex failures and the assessment of evolution issues. In particular, our approach enables significantly stronger capabilities in failure detection, failure diagnosis, impact analysis, and understanding system evolution. We explore these capabilities with three real implementations, two of which service millions of requests per day. Our contributions include the approach; the maintainable, extensible, and reusable architecture; the various statistical analysis engines; and the discussion of our experience with a high-volume production service over several years.

385 citations

Journal ArticleDOI
11 Jul 2019
TL;DR: A framework for identifying a broad range of menaces in the research and practices around social data is presented, including biases and inaccuracies at the source of the data, but also introduced during processing.
Abstract: Social data in digital form—including user-generated content, expressed or implicit relations between people, and behavioral traces—are at the core of popular applications and platforms, driving the research agenda of many researchers. The promises of social data are many, including understanding “what the world thinks” about a social issue, brand, celebrity, or other entity, as well as enabling better decision-making in a variety of fields including public policy, healthcare, and economics. Many academics and practitioners have warned against the naive usage of social data. There are biases and inaccuracies occurring at the source of the data, but also introduced during processing. There are methodological limitations and pitfalls, as well as ethical boundaries and unexpected consequences that are often overlooked. This paper recognizes the rigor with which these issues are addressed by different researchers varies across a wide range. We identify a variety of menaces in the practices around social data use, and organize them in a framework that helps to identify them. “For your own sanity, you have to remember that not all problems can be solved. Not all problems can be solved, but all problems can be illuminated.” –Ursula Franklin1

379 citations


Cited by
More filters
Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 2012

3,692 citations

Journal Article
TL;DR: Thaler and Sunstein this paper described a general explanation of and advocacy for libertarian paternalism, a term coined by the authors in earlier publications, as a general approach to how leaders, systems, organizations, and governments can nudge people to do the things the nudgers want and need done for the betterment of the nudgees, or of society.
Abstract: NUDGE: IMPROVING DECISIONS ABOUT HEALTH, WEALTH, AND HAPPINESS by Richard H. Thaler and Cass R. Sunstein Penguin Books, 2009, 312 pp, ISBN 978-0-14-311526-7This book is best described formally as a general explanation of and advocacy for libertarian paternalism, a term coined by the authors in earlier publications. Informally, it is about how leaders, systems, organizations, and governments can nudge people to do the things the nudgers want and need done for the betterment of the nudgees, or of society. It is paternalism in the sense that "it is legitimate for choice architects to try to influence people's behavior in order to make their lives longer, healthier, and better", (p. 5) It is libertarian in that "people should be free to do what they like - and to opt out of undesirable arrangements if they want to do so", (p. 5) The built-in possibility of opting out or making a different choice preserves freedom of choice even though people's behavior has been influenced by the nature of the presentation of the information or by the structure of the decisionmaking system. I had never heard of libertarian paternalism before reading this book, and I now find it fascinating.Written for a general audience, this book contains mostly social and behavioral science theory and models, but there is considerable discussion of structure and process that has roots in mathematical and quantitative modeling. One of the main applications of this social system is economic choice in investing, selecting and purchasing products and services, systems of taxes, banking (mortgages, borrowing, savings), and retirement systems. Other quantitative social choice systems discussed include environmental effects, health care plans, gambling, and organ donations. Softer issues that are also subject to a nudge-based approach are marriage, education, eating, drinking, smoking, influence, spread of information, and politics. There is something in this book for everyone.The basis for this libertarian paternalism concept is in the social theory called "science of choice", the study of the design and implementation of influence systems on various kinds of people. The terms Econs and Humans, are used to refer to people with either considerable or little rational decision-making talent, respectively. The various libertarian paternalism concepts and systems presented are tested and compared in light of these two types of people. Two foundational issues that this book has in common with another book, Network of Echoes: Imitation, Innovation and Invisible Leaders, that was also reviewed for this issue of the Journal are that 1 ) there are two modes of thinking (or components of the brain) - an automatic (intuitive) process and a reflective (rational) process and 2) the need for conformity and the desire for imitation are powerful forces in human behavior. …

3,435 citations

04 Mar 2010
TL;DR: Recording of presentation introducing narrative analysis, outlining what it is, why it can be a useful approach, how to do it and where to find out more.
Abstract: Recording of presentation introducing narrative analysis, outlining what it is, why it can be a useful approach, how to do it and where to find out more. Presentation given at methods@manchester seminar at University of Manchester on 4 March 2010.

3,188 citations