Institution

Yahoo!

Company • London, United Kingdom

About: Yahoo! is a company based in London, United Kingdom. It is known for research contributions on the topics Population and Web search query. The organization has 26749 authors who have published 29915 publications, receiving 732583 citations. It is also known as Yahoo! Inc. and Maudwen-Yahoo! Inc.

Topics: Population, Web search query, Web page, Web query classification, Query expansion

Papers

Proceedings Article

Deep learning of binary hash codes for fast image retrieval

Kevin Lin¹, Huei-Fang Yang¹, Jen-Hao Hsiao², Chu-Song Chen¹•Institutions (2)

Academia Sinica¹, Yahoo!²

07 Jun 2015

TL;DR: This work proposes an effective deep learning framework to generate binary hash codes for fast image retrieval by employing a hidden layer for representing the latent concepts that dominate the class labels in convolutional neural networks.

Abstract: Approximate nearest neighbor search is an efficient strategy for large-scale image retrieval. Encouraged by the recent advances in convolutional neural networks (CNNs), we propose an effective deep learning framework to generate binary hash codes for fast image retrieval. Our idea is that when the data labels are available, binary codes can be learned by employing a hidden layer for representing the latent concepts that dominate the class labels. The utilization of the CNN also allows for learning image representations. Unlike other supervised methods that require pair-wised inputs for binary code learning, our method learns hash codes and image representations in a point-wised manner, making it suitable for large-scale datasets. Experimental results show that our method outperforms several state-of-the-art hashing algorithms on the CIFAR-10 and MNIST datasets. We further demonstrate its scalability and efficacy on a large-scale dataset of 1 million clothing images.

605 citations
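
The retrieval step described in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the hidden-layer activations are already computed and lie in [0, 1], and the 0.5 threshold, the function names (`binarize`, `hamming`, `retrieve`), and the activation values are all hypothetical.

```python
def binarize(activations, threshold=0.5):
    """Turn real-valued hidden-layer activations into a tuple of 0/1 bits.

    The 0.5 threshold is an assumption made for this sketch.
    """
    return tuple(1 if a >= threshold else 0 for a in activations)


def hamming(a, b):
    """Count the positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code, database_codes, k=2):
    """Return the indices of the k database codes closest in Hamming distance."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )
    return ranked[:k]


# Made-up activations standing in for the output of a trained network.
database = [
    [0.9, 0.1, 0.8, 0.2],
    [0.2, 0.7, 0.1, 0.9],
    [0.8, 0.2, 0.6, 0.4],
]
codes = [binarize(v) for v in database]
query = binarize([0.7, 0.3, 0.9, 0.1])
nearest = retrieve(query, codes, k=2)
print(nearest)
```

Comparing short binary codes by Hamming distance is cheap, which is why hashing suits approximate nearest neighbor search over large image collections.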

Journal Article

A Space-Time Spectral Method for the Time Fractional Diffusion Equation

Xianjuan Li¹, Chuanju Xu•Institutions (1)

Yahoo!¹

01 Apr 2009-SIAM Journal on Numerical Analysis

TL;DR: Thanks to the spectral accuracy in both space and time of the proposed method, the storage requirement due to the “global time dependence” can be considerably relaxed, and therefore calculation of the long-time solution becomes possible.

Abstract: In this paper, we consider the numerical solution of the time fractional diffusion equation. Essentially, the time fractional diffusion equation differs from the standard diffusion equation in the time derivative term. In the former case, the first-order time derivative is replaced by a fractional derivative, making the problem global in time. We propose a spectral method in both temporal and spatial discretizations for this equation. The convergence of the method is proven by providing a priori error estimate. Numerical tests are carried out to confirm the theoretical results. Thanks to the spectral accuracy in both space and time of the proposed method, the storage requirement due to the “global time dependence” can be considerably relaxed, and therefore calculation of the long-time solution becomes possible.

599 citations
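
For orientation, the equation the abstract above refers to can be written as follows. This is a sketch under assumptions not stated in the listing: a Caputo-type fractional derivative of order 0 < α < 1, a unit diffusion coefficient, one space dimension, and no source term.

```latex
\frac{\partial^{\alpha} u}{\partial t^{\alpha}}(x,t)
  = \frac{\partial^{2} u}{\partial x^{2}}(x,t),
  \qquad 0 < \alpha < 1,
\qquad\text{where}\qquad
\frac{\partial^{\alpha} u}{\partial t^{\alpha}}(x,t)
  = \frac{1}{\Gamma(1-\alpha)}
    \int_{0}^{t} \frac{\partial u}{\partial s}(x,s)\,
    \frac{ds}{(t-s)^{\alpha}}.
```

The integral runs over the whole history from 0 to t, which is the sense in which the problem is global in time and why storage becomes a concern for long-time solutions.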

Journal Article

Parallel Spectral Clustering in Distributed Systems

Wen-Yen Chen¹, Yangqiu Song², Hongjie Bai³, Chih-Jen Lin⁴, Edward Y. Chang³•Institutions (4)

Yahoo!¹, Microsoft², Google³, National Taiwan University⁴

01 Mar 2011-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work compares two representative ways of approximating the dense similarity matrix, chooses the strategy of sparsifying the matrix by retaining nearest neighbors, and parallelizes it so that it can effectively handle large problems.

Abstract: Spectral clustering algorithms have been shown to be more effective in finding clusters than some traditional algorithms, such as k-means. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the size of a data set is large. To perform clustering on large data sets, we investigate two representative ways of approximating the dense similarity matrix. We compare one approach by sparsifying the matrix with another by the Nyström method. We then pick the strategy of sparsifying the matrix via retaining nearest neighbors and investigate its parallelization. We parallelize both memory use and computation on distributed computers. Through an empirical study on a document data set of 193,844 instances and a photo data set of 2,121,863, we show that our parallel algorithm can effectively handle large problems.

591 citations
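
The sparsification step named in the abstract above can be sketched on a single machine. This is not the paper's parallel algorithm: it assumes a Gaussian similarity with a hypothetical bandwidth `sigma`, keeps the `t` nearest neighbors of each point, and symmetrizes the result; the function name `sparse_similarity` and the sample points are made up, and the eigenvector and k-means stages of spectral clustering are omitted.

```python
import math


def sparse_similarity(points, t=2, sigma=1.0):
    """Build a sparse, symmetric similarity matrix as a dict keyed by (i, j).

    Only each point's t most similar neighbors are kept, so the number of
    stored entries grows with n * t rather than n * n.
    """
    n = len(points)

    def sim(i, j):
        # Gaussian similarity; this kernel choice is an assumption of the sketch.
        d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        return math.exp(-d2 / (2 * sigma ** 2))

    S = {}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        neighbors = sorted(others, key=lambda j: -sim(i, j))[:t]
        for j in neighbors:
            # Store both directions so the matrix stays symmetric.
            S[(i, j)] = sim(i, j)
            S[(j, i)] = sim(i, j)
    return S


# Two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
S = sparse_similarity(points, t=2)
print(len(S), "stored entries instead of", len(points) * (len(points) - 1))
```

In this toy case every retained entry links points within the same group, which is the structure the later spectral steps rely on.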

Proceedings Article

A Machine Learning Approach to Twitter User Classification

Marco Pennacchiotti¹, Ana-Maria Popescu¹•Institutions (1)

Yahoo!¹

05 Jul 2011

TL;DR: This paper automatically infers the values of user attributes such as political orientation or ethnicity by leveraging observable information such as the user behavior, network structure and the linguistic content of the user’s Twitter feed through a machine learning approach.

Abstract: This paper addresses the task of user classification in social media, with an application to Twitter. We automatically infer the values of user attributes such as political orientation or ethnicity by leveraging observable information such as the user behavior, network structure and the linguistic content of the user’s Twitter feed. We employ a machine learning approach which relies on a comprehensive set of features derived from such user information. We report encouraging experimental results on 3 tasks with different characteristics: political affiliation detection, ethnicity identification and detecting affinity for a particular business. Finally, our analysis shows that rich linguistic features prove consistently valuable across the 3 tasks and show great promise for additional user classification needs.

584 citations

Journal Article

Using internet searches for influenza surveillance.

Philip M. Polgreen¹, Yiling Chen², David M. Pennock³, Forrest D. Nelson⁴•Institutions (4)

Roy J. and Lucille A. Carver College of Medicine¹, Harvard University², Yahoo!³, University of Iowa⁴

01 Dec 2008-Clinical Infectious Diseases

TL;DR: This work counted daily unique queries originating in the United States that contained influenza-related search terms from the Yahoo! search engine from March 2004 through May 2008, and estimated linear models, using searches with 1-10-week lead times as explanatory variables to predict the percentage of cultures positive for influenza and deaths attributable to pneumonia and influenza in the US.

Abstract: The Internet is an important source of health information. Thus, the frequency of Internet searches may provide information regarding infectious disease activity. As an example, we examined the relationship between searches for influenza and actual influenza occurrence. Using search queries from the Yahoo! search engine (http://search.yahoo.com) from March 2004 through May 2008, we counted daily unique queries originating in the United States that contained influenza-related search terms. Counts were divided by the total number of searches, and the resulting daily fraction of searches was averaged over the week. We estimated linear models, using searches with 1-10-week lead times as explanatory variables to predict the percentage of cultures positive for influenza and deaths attributable to pneumonia and influenza in the United States. With use of the frequency of searches, our models predicted an increase in cultures positive for influenza 1-3 weeks in advance of when they occurred (P < .001), and similar models predicted an increase in mortality attributable to pneumonia and influenza up to 5 weeks in advance (P < .001). Search-term surveillance may provide an additional tool for disease surveillance.

584 citations
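
The data preparation described in the abstract above (daily fraction of searches, averaged over the week, then a linear model with a lead time) can be sketched as follows. This is a simplification, not the study's model: it fits a single lead time with one explanatory variable, whereas the abstract describes using 1-10-week leads, and the function names and all numbers are made up.

```python
def weekly_search_fraction(flu_counts, total_counts):
    """Divide daily flu-related counts by daily totals, then average per week.

    Trailing days that do not fill a complete 7-day week are dropped.
    """
    daily = [f / t for f, t in zip(flu_counts, total_counts)]
    return [sum(daily[i:i + 7]) / 7 for i in range(0, len(daily) - 6, 7)]


def fit_lagged_line(search, outcome, lead):
    """Least-squares fit of outcome[w] = a + b * search[w - lead].

    Both series must have the same length; returns the pair (a, b).
    """
    x = search[:len(search) - lead]
    y = outcome[lead:]
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b


# One made-up week of daily counts.
fractions = weekly_search_fraction([1] * 7, [100] * 7)

# Made-up weekly series in which the outcome follows the search fraction
# two weeks later as outcome = 1 + 100 * search.
search = [0.01, 0.02, 0.04, 0.03, 0.05, 0.02]
outcome = [0.0, 0.0, 2.0, 3.0, 5.0, 4.0]
a, b = fit_lagged_line(search, outcome, lead=2)
print(fractions, a, b)
```

Shifting the search series by `lead` weeks before fitting is what lets such a model anticipate the outcome rather than merely track it.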

Authors

Showing all 26766 results

Name	H-index	Papers	Citations
Ashok Kumar	151	5654	164086
Alexander J. Smola	122	434	110222
Howard I. Maibach	116	1821	60765
Sanjay Jain	103	881	46880
Amirhossein Sahebkar	100	1307	46132
Marc Davis	99	412	50243
Wenjun Zhang	96	976	38530
Jian Xu	94	1366	52057
Fortunato Ciardiello	94	695	47352
Tong Zhang	93	414	36519
Michael E. J. Lean	92	411	30939
Ashish K. Jha	87	503	30020
Xin Zhang	87	1714	40102
Theunis Piersma	86	632	34201
George Varghese	84	253	28598