Institution

Yahoo!

Company, London, United Kingdom
About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions in the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications receiving 732583 citations. The organization is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.


Papers
Journal Article
03 Dec 2012
TL;DR: This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA).
Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. The increased representational power comes at the cost of a more challenging unsupervised learning problem for estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of multi-view models and topic models, including latent Dirichlet allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method is based on an efficiently computable orthogonal tensor decomposition of low-order moments.

271 citations

Proceedings Article
20 Jul 2008
TL;DR: Experiments on large-scale tagging datasets of scientific documents and web pages (del.icio.us) indicate that the proposed framework for real-time tag recommendation is capable of making tag recommendations efficiently and effectively.
Abstract: Tags are user-generated labels for entities. Existing research on tag recommendation either focuses on improving its accuracy or on automating the process, while ignoring the efficiency issue. We propose a highly-automated novel framework for real-time tag recommendation. The tagged training documents are treated as triplets of (words, docs, tags), and represented in two bipartite graphs, which are partitioned into clusters by Spectral Recursive Embedding (SRE). Tags in each topical cluster are ranked by our novel ranking algorithm. A two-way Poisson Mixture Model (PMM) is proposed to model the document distribution into mixture components within each cluster and aggregate words into word clusters simultaneously. A new document is classified by the mixture model based on its posterior probabilities so that tags are recommended according to their ranks. Experiments on large-scale tagging datasets of scientific documents (CiteULike) and web pages (del.icio.us) indicate that our framework is capable of making tag recommendations efficiently and effectively. The average tagging time for testing a document is around 1 second, with over 88% of test documents correctly labeled with the top nine tags we suggested.

271 citations

Patent
Ronald Martinez, Marc Davis
15 Nov 2007
TL;DR: In this paper, the authors present a system for collecting data and presenting media to a user, which generally includes a data gathering module associated with an electronic device and a management module, which manages at least one user profile based on the gathered data.
Abstract: Systems, methods and apparatus for collecting data and presenting media to a user are provided. The systems generally include a data gathering module associated with an electronic device. The data gathering module communicates gathered data to a management module, which manages at least one user profile based on the gathered data. The management module may select media for presentation to a user based on the user profile, and the selected media may be displayed to the user via a media output device co-located with the user, such as a display of the user's mobile electronic device or a television, computer, billboard or other display co-located with the user. Related methods are also provided.

271 citations

Journal Article
Alamelu Raja
TL;DR: The review brings out the complexity of the host-pathogen interaction and underlines the importance of identifying the mechanisms involved in protection, in order to design vaccine strategies and find out surrogate markers to be measured as in vitro correlate of protective immunity.
Abstract: Tuberculosis is a major health problem throughout the world causing a large number of deaths, more than any other single infectious disease. The review attempts to summarize the information available on host immune response to Mycobacterium tuberculosis. Since the main route of entry of the causative agent is the respiratory route, alveolar macrophages are the important cell types, which combat the pathogen. Various aspects of macrophage-mycobacterium interactions and the role of macrophage in host response such as binding of M. tuberculosis to macrophages via surface receptors, phagosome-lysosome fusion, mycobacterial growth inhibition/killing through free radical based mechanisms such as reactive oxygen and nitrogen intermediates; cytokine-mediated mechanisms; recruitment of accessory immune cells for local inflammatory response and presentation of antigens to T cells for development of acquired immunity have been described. The role of macrophage apoptosis in containing the growth of the bacilli is also discussed. The role of other components of innate immune response such as natural resistance associated macrophage protein (Nramp), neutrophils, and natural killer cells has been discussed. The specific acquired immune response through CD4 T cells, mainly responsible for protective Th1 cytokines and through CD8 cells bringing about cytotoxicity, also has been described. The role of CD-1 restricted CD8(+) T cells and non-MHC restricted gamma/delta T cells has been described although it is incompletely understood at the present time. Humoral immune response is seen though not implicated in protection. The value of cytokine therapy has also been reviewed. Influence of the host human leucocyte antigens (HLA) on the susceptibility to disease is discussed. Mycobacteria are endowed with mechanisms through which they can evade the onslaught of host defense response. These mechanisms are discussed including diminishing the ability of antigen presenting cells to present antigens to CD4(+) T cells; production of suppressive cytokines; escape from fused phagosomes and inducing T cell apoptosis. The review brings out the complexity of the host-pathogen interaction and underlines the importance of identifying the mechanisms involved in protection, in order to design vaccine strategies and find out surrogate markers to be measured as in vitro correlate of protective immunity.

270 citations

Proceedings Article
12 Aug 2012
TL;DR: A large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process, and learns for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears.
Abstract: Prior work on computing semantic relatedness of words focused on representing their meaning in isolation, effectively disregarding inter-word affinities. We propose a large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process. We learn for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears. Our method, called CLEAR, is shown to significantly outperform previously published approaches. The proposed method is based on first principles, and is generic enough to exploit diverse types of text corpora, while having the flexibility to impose constraints on the derived word similarities. We also make publicly available a new labeled dataset for evaluating word relatedness algorithms, which we believe to be the largest such dataset to date.

270 citations


Authors


Name                    H-index  Papers  Citations
Ashok Kumar             151      5654    164086
Alexander J. Smola      122      434     110222
Howard I. Maibach       116      1821    60765
Sanjay Jain             103      881     46880
Amirhossein Sahebkar    100      1307    46132
Marc Davis              99       412     50243
Wenjun Zhang            96       976     38530
Jian Xu                 94       1366    52057
Fortunato Ciardiello    94       695     47352
Tong Zhang              93       414     36519
Michael E. J. Lean      92       411     30939
Ashish K. Jha           87       503     30020
Xin Zhang               87       1714    40102
Theunis Piersma         86       632     34201
George Varghese         84       253     28598
Network Information
Related Institutions (5)
University of Toronto
294.9K papers, 13.5M citations

85% related

University of California, San Diego
204.5K papers, 12.3M citations

85% related

University College London
210.6K papers, 9.8M citations

84% related

Cornell University
235.5K papers, 12.2M citations

84% related

University of Washington
305.5K papers, 17.7M citations

84% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  2
2022  47
2021  1,088
2020  1,074
2019  1,568
2018  1,352