scispace - formally typeset

Showing papers on "Ranking (information retrieval) published in 2014"


Posted Content
TL;DR: A deep ranking model that employs deep learning techniques to learn a similarity metric directly from images has higher learning capability than models based on hand-crafted features and deep classification models.
Abstract: Learning fine-grained image similarity is a challenging task. It needs to capture between-class and within-class image differences. This paper proposes a deep ranking model that employs deep learning techniques to learn a similarity metric directly from images. It has higher learning capability than models based on hand-crafted features. A novel multiscale network structure has been developed to describe the images effectively. An efficient triplet sampling algorithm is proposed to learn the model with distributed asynchronized stochastic gradient. Extensive experiments show that the proposed algorithm outperforms models based on hand-crafted visual features and deep classification models.

967 citations


Proceedings ArticleDOI
23 Jun 2014
TL;DR: This paper proposes a deep ranking model that employs deep learning techniques to learn a similarity metric directly from images, which has higher learning capability than models based on hand-crafted features.
Abstract: Learning fine-grained image similarity is a challenging task. It needs to capture between-class and within-class image differences. This paper proposes a deep ranking model that employs deep learning techniques to learn similarity metric directly from images. It has higher learning capability than models based on hand-crafted features. A novel multiscale network structure has been developed to describe the images effectively. An efficient triplet sampling algorithm is also proposed to learn the model with distributed asynchronized stochastic gradient. Extensive experiments show that the proposed algorithm outperforms models based on hand-crafted visual features and deep classification models.
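As a rough illustration of the objective behind such deep ranking models, the triplet hinge loss below penalizes an embedding when an anchor image is not closer to its similar (positive) image than to its dissimilar (negative) image by a margin. This is a minimal NumPy sketch; the margin value, distance, and embedding dimensionality are illustrative assumptions, not values from the paper:

```python
import numpy as np

def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that pushes the anchor's distance to the positive image
    below its distance to the negative image by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to negative
    return max(0.0, margin + d_pos - d_neg)

# Toy 2-D embeddings: the positive is already much closer, so the loss is zero.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])
n = np.array([2.0, 0.0])
print(triplet_hinge_loss(a, p, n))  # -> 0.0
```

Triplet sampling then amounts to choosing (a, p, n) triples for which this loss is informative, which is what the paper's sampling algorithm addresses at scale.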

935 citations


Proceedings ArticleDOI
01 Jan 2014
TL;DR: The authors present LDAvis, a web-based interactive visualization of topics estimated using Latent Dirichlet Allocation, built using a combination of R and D3, and propose a novel method for choosing which terms to present to a user to aid in the task of topic interpretation.
Abstract: We present LDAvis, a web-based interactive visualization of topics estimated using Latent Dirichlet Allocation that is built using a combination of R and D3. Our visualization provides a global view of the topics (and how they differ from each other), while at the same time allowing for a deep inspection of the terms most highly associated with each individual topic. First, we propose a novel method for choosing which terms to present to a user to aid in the task of topic interpretation, in which we define the relevance of a term to a topic. Second, we present results from a user study that suggest that ranking terms purely by their probability under a topic is suboptimal for topic interpretation. Last, we describe LDAvis, our visualization system that allows users to flexibly explore topic-term relationships using relevance to better understand a fitted LDA model.
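The relevance measure described above weights a term's probability under a topic against its lift (topic probability divided by overall corpus probability). A minimal sketch of that weighted combination; the λ value here is illustrative:

```python
import numpy as np

def relevance(phi_tw, p_w, lam=0.6):
    """Relevance of term w to topic t: lam * log p(w|t)
    + (1 - lam) * log(p(w|t) / p(w)), i.e., a weighted mix of the term's
    topic probability and its lift over the corpus-wide probability."""
    return lam * np.log(phi_tw) + (1 - lam) * np.log(phi_tw / p_w)

# A topic-frequent term that is rare corpus-wide gets a large lift component.
print(relevance(phi_tw=0.05, p_w=0.001, lam=0.6))
```

At λ = 1 this reduces to ranking terms purely by their probability under the topic, the baseline the user study found suboptimal; smaller λ gives more weight to topic-specific terms.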

836 citations


Posted Content
TL;DR: This paper proposes the triplet network model, which aims to learn useful representations by distance comparisons, and demonstrates on various datasets that the model learns a better representation than that of its immediate competitor, the Siamese network.
Abstract: Deep learning has proven itself as a successful set of models for learning useful semantic representations of data. These, however, are mostly implicitly learned as part of a classification task. In this paper we propose the triplet network model, which aims to learn useful representations by distance comparisons. A similar model was defined by Wang et al. (2014), tailor made for learning a ranking for image information retrieval. Here we demonstrate using various datasets that our model learns a better representation than that of its immediate competitor, the Siamese network. We also discuss future possible usage as a framework for unsupervised learning.

824 citations


Proceedings ArticleDOI
07 Apr 2014
TL;DR: This paper presents a series of new latent semantic models based on a convolutional neural network to learn low-dimensional semantic vectors for search queries and Web documents, which significantly outperform other semantic models in retrieval performance.
Abstract: This paper presents a series of new latent semantic models based on a convolutional neural network (CNN) to learn low-dimensional semantic vectors for search queries and Web documents. By using the convolution-max pooling operation, local contextual information at the word n-gram level is modeled first. Then, salient local features in a word sequence are combined to form a global feature vector. Finally, the high-level semantic information of the word sequence is extracted to form a global vector representation. The proposed models are trained on clickthrough data by maximizing the conditional likelihood of clicked documents given a query, using stochastic gradient ascent. The new models are evaluated on a Web document ranking task using a large-scale, real-world data set. Results show that our model significantly outperforms other semantic models, which were state-of-the-art in retrieval performance prior to this work.
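The convolution-max pooling step described above can be sketched in a few lines: each word n-gram window is projected to a local feature vector, and a feature-wise max over windows yields the global representation. All shapes, the window size, and the tanh nonlinearity below are illustrative assumptions, not the paper's architecture details:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_max_pool(word_vecs, W, n=3):
    """Toy convolution-max pooling over word n-grams: concatenate each
    window of n word vectors, project it with W to a local feature
    vector, then take the feature-wise max over all windows."""
    windows = [np.concatenate(word_vecs[i:i + n])
               for i in range(len(word_vecs) - n + 1)]
    local = np.tanh(np.stack(windows) @ W)  # local n-gram features
    return local.max(axis=0)                # max pooling -> global vector

words = [rng.standard_normal(4) for _ in range(6)]  # 6 words, embedding dim 4
W = rng.standard_normal((12, 8))                    # (n * dim) -> 8 features
print(conv_max_pool(words, W).shape)  # -> (8,)
```

Max pooling makes the global vector invariant to where in the sequence a salient n-gram occurs, which is why it is used to pick out the most informative local features.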

706 citations


Book ChapterDOI
06 Sep 2014
TL;DR: A novel model is presented that automatically selects the most discriminative video fragments from noisy image sequences of people, where more reliable space-time features can be extracted, whilst simultaneously learning a video ranking function for person re-id.
Abstract: Current person re-identification (re-id) methods typically rely on single-frame imagery features and ignore space-time information from image sequences. Single-frame (single-shot) visual appearance matching is inherently limited for person re-id in public spaces due to visual ambiguity arising from non-overlapping camera views, where viewpoint and lighting changes can cause significant appearance variation. In this work, we present a novel model to automatically select the most discriminative video fragments from noisy image sequences of people, where more reliable space-time features can be extracted, whilst simultaneously learning a video ranking function for person re-id. We also introduce a new image sequence re-id dataset (iLIDS-VID) based on the i-LIDS MCT benchmark data. Using the iLIDS-VID and PRID 2011 sequence re-id datasets, we conducted extensive comparative evaluations to demonstrate the advantages of the proposed model over contemporary gait recognition, holistic image sequence matching, and state-of-the-art single-shot/multi-shot based re-id methods.

600 citations


Patent
07 Apr 2014
TL;DR: In this paper, a user who is composing or reading a document can identify and link multiple sets of key words into separate search queries by highlighting and assigning either unique search numbers, colors or other readily ascertained indicators of their logical relation.
Abstract: Systems and methods allow a user of a text or graphics editor to quickly create multiple robust internet search queries by selecting and ranking groups or individual key words from a document. A user who is composing or reading a document can identify and link multiple sets of key words into separate search queries by highlighting and assigning either unique search numbers, colors, or other readily ascertained indicators of their logical relation. Each individual search query is routed to selected internet search engines, and the results are returned to the user in the same viewed document. The user may select the form in which the results are displayed. For example, results may be listed within the document by way of footnotes, endnotes, or separate hover or pull-down windows accessible from the search terms. In addition, the user can browse, sort, rank, edit, or eliminate portions of the results.

269 citations


Journal ArticleDOI
TL;DR: This paper presents a verifiable privacy-preserving multi-keyword text search (MTS) scheme with similarity-based ranking and proposes two secure index schemes to meet the stringent privacy requirements under strong threat models.
Abstract: With the growing popularity of cloud computing, huge amounts of documents are outsourced to the cloud for reduced management cost and ease of access. Although encryption helps protect user data confidentiality, it makes well-functioning yet practically efficient secure search over encrypted data a challenging problem. In this paper, we present a verifiable privacy-preserving multi-keyword text search (MTS) scheme with similarity-based ranking to address this problem. To support multi-keyword search and search result ranking, we propose to build the search index based on term frequency and the vector space model with the cosine similarity measure to achieve higher search result accuracy. To improve the search efficiency, we propose a tree-based index structure and various adaptive methods for the multi-dimensional (MD) algorithm so that the practical search efficiency is much better than that of linear search. To further enhance the search privacy, we propose two secure index schemes to meet the stringent privacy requirements under strong threat models, i.e., the known ciphertext model and the known background model. In addition, we devise a scheme upon the proposed index tree structure to enable authenticity checks over the returned search results. Finally, we demonstrate the effectiveness and efficiency of the proposed schemes through extensive experimental evaluation.
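The similarity-based ranking ingredient, cosine similarity over term-frequency vectors in the vector space model, can be sketched as below. This operates on plaintext vectors; the paper's contribution is performing the same ranking securely over encrypted indexes:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted by similarity to the query, best first."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: -scores[i])

# Toy term-frequency vectors over a 3-term vocabulary.
q = np.array([1.0, 1.0, 0.0])
docs = [np.array([1.0, 0.0, 0.0]),
        np.array([1.0, 1.0, 0.1]),
        np.array([0.0, 0.0, 1.0])]
print(rank_documents(q, docs))  # -> [1, 0, 2]
```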

243 citations


Journal ArticleDOI
TL;DR: A context-aware sensor search, selection, and ranking model, called CASSARAM, is proposed to address the challenge of efficiently selecting a subset of relevant sensors out of a large set of sensors with similar functionality and capabilities.
Abstract: The Internet of Things (IoT) is part of the Internet of the future and will comprise billions of intelligent communicating “things” or Internet Connected Objects (ICOs) that will have sensing, actuating, and data processing capabilities. Each ICO will have one or more embedded sensors that will capture potentially enormous amounts of data. The sensors and related data streams can be clustered physically or virtually, which raises the challenge of searching and selecting the right sensors for a query in an efficient and effective way. This paper proposes a context-aware sensor search, selection, and ranking model, called CASSARAM, to address the challenge of efficiently selecting a subset of relevant sensors out of a large set of sensors with similar functionality and capabilities. CASSARAM considers user preferences and a broad range of sensor characteristics such as reliability, accuracy, location, battery life, and many more. This paper highlights the importance of sensor search, selection and ranking for the IoT, identifies important characteristics of both sensors and data capture processes, and discusses how semantic and quantitative reasoning can be combined together. This paper also addresses challenges such as efficient distributed sensor search and relational-expression based filtering. CASSARAM testing and performance evaluation results are presented and discussed.

189 citations


Book
Hang Li, Jun Xu
20 Jun 2014
TL;DR: This survey gives a systematic and detailed introduction to newly developed machine learning technologies for query document matching (semantic matching) in search, particularly web search, and focuses on the fundamental problems, as well as the state-of-the-art solutions.
Abstract: Relevance is the most important factor to assure users' satisfaction in search, and the success of a search engine heavily depends on its performance on relevance. It has been observed that most of the dissatisfaction cases in relevance are due to term mismatch between queries and documents (e.g., the query "NY times" does not match well with a document only containing "New York Times"), because term matching, i.e., the bag-of-words approach, still functions as the main mechanism of modern search engines. It is no exaggeration to say, therefore, that mismatch between query and document poses the most critical challenge in search. Ideally, one would like to see query and document match with each other if they are topically relevant. Recently, researchers have expended significant effort to address the problem. The major approach is to conduct semantic matching, i.e., to perform more query and document understanding to represent their meanings, and to perform better matching between the enriched query and document representations. With the availability of large amounts of log data and advanced machine learning techniques, this becomes more feasible, and significant progress has been made recently. This survey gives a systematic and detailed introduction to newly developed machine learning technologies for query document matching (semantic matching) in search, particularly web search. It focuses on the fundamental problems, as well as the state-of-the-art solutions, of query document matching on the form aspect, phrase aspect, word sense aspect, topic aspect, and structure aspect. The ideas and solutions explained may motivate industrial practitioners to turn the research results into products. The methods introduced and the discussions made may also stimulate academic researchers to find new research directions and approaches.
Matching between query and document is not limited to search and similar problems can be found in question answering, online advertising, cross-language information retrieval, machine translation, recommender systems, link prediction, image annotation, drug design, and other applications, as the general task of matching between objects from two different spaces. The technologies introduced can be generalized into more general machine learning techniques, which is referred to as learning to match in this survey.

179 citations


Proceedings ArticleDOI
29 Sep 2014
TL;DR: Experimental results show that Multric localizes faults more effectively than state-of-the-art metrics, such as Tarantula, Ochiai, and Ample.
Abstract: Fault localization is an inevitable step in software debugging. Spectrum-based fault localization consists in computing a ranking metric on execution traces to identify faulty source code. Existing empirical studies on fault localization show that there is no optimal ranking metric for all faults in practice. In this paper, we propose Multric, a learning-based approach to combining multiple ranking metrics for effective fault localization. In Multric, the suspiciousness score of a program entity is a combination of existing ranking metrics. Multric consists of two major phases: learning and ranking. Based on training faults, Multric builds a ranking model by learning from pairs of faulty and non-faulty source code elements. When a new fault appears, Multric computes the final ranking with the learned model. Experiments are conducted on 5386 seeded faults in ten open-source Java programs. We empirically compare Multric against four widely-studied metrics and three recently-proposed ones. Our experimental results show that Multric localizes faults more effectively than state-of-the-art metrics, such as Tarantula, Ochiai, and Ample.
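For context, two of the classic spectrum-based ranking metrics that approaches like Multric build on, Tarantula and Ochiai, score a code element from its execution spectrum: ef and ep are the numbers of failing and passing tests that execute the element, and F and P are the total failing and passing tests. A minimal sketch of these standard formulas (not Multric's learned combination):

```python
import math

def tarantula(ef, ep, F, P):
    """Tarantula: fraction of failing coverage relative to the combined
    failing and passing coverage of the element."""
    if ef == 0:
        return 0.0
    return (ef / F) / (ef / F + ep / P)

def ochiai(ef, ep, F, P):
    """Ochiai: ef normalized by the geometric mean of the total failures
    and the element's total coverage."""
    if ef == 0:
        return 0.0
    return ef / math.sqrt(F * (ef + ep))

# An element executed by both failing tests but only 1 of 8 passing tests
# looks highly suspicious under both metrics.
print(tarantula(ef=2, ep=1, F=2, P=8))  # -> 0.888...
print(ochiai(ef=2, ep=1, F=2, P=8))
```

Multric's suspiciousness score is then a learned weighted combination of many such metric values rather than any single formula.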

Journal ArticleDOI
01 Jun 2014
TL;DR: A novel two-phase search algorithm is proposed that carefully selects a set of expansion centers from the query trajectory and exploits upper and lower bounds to prune the search space in the spatial and temporal domains.
Abstract: With the increasing availability of moving-object tracking data, trajectory search and matching is increasingly important. We propose and investigate a novel problem called personalized trajectory matching (PTM). In contrast to conventional trajectory similarity search by spatial distance only, PTM takes into account the significance of each sample point in a query trajectory. A PTM query takes a trajectory with user-specified weights for each sample point in the trajectory as its argument. It returns the trajectory in an argument data set with the highest similarity to the query trajectory. We believe that this type of query may bring significant benefits to users in many popular applications such as route planning, carpooling, friend recommendation, traffic analysis, urban computing, and location-based services in general. PTM query processing faces two challenges: how to prune the search space during the query processing and how to schedule multiple so-called expansion centers effectively. To address these challenges, a novel two-phase search algorithm is proposed that carefully selects a set of expansion centers from the query trajectory and exploits upper and lower bounds to prune the search space in the spatial and temporal domains. An efficiency study reveals that the algorithm explores the minimum search space in both domains. Second, a heuristic search strategy based on priority ranking is developed to schedule the multiple expansion centers, which can further prune the search space and enhance the query efficiency. The performance of the PTM query is studied in extensive experiments based on real and synthetic trajectory data sets.

Proceedings ArticleDOI
07 Apr 2014
TL;DR: The experiments indicate that a mixture of local low-rank matrices, each of which was trained to minimize a ranking loss, outperforms many of the currently used state-of-the-art recommendation systems.
Abstract: Personalized recommendation systems are used in a wide variety of applications such as electronic commerce, social networks, web search, and more. Collaborative filtering approaches to recommendation systems typically assume that the rating matrix (e.g., movie ratings by viewers) is low-rank. In this paper, we examine an alternative approach in which the rating matrix is locally low-rank. Concretely, we assume that the rating matrix is low-rank within certain neighborhoods of the metric space defined by (user, item) pairs. We combine a recent approach for local low-rank approximation based on the Frobenius norm with a general empirical risk minimization for ranking losses. Our experiments indicate that a mixture of local low-rank matrices, each of which was trained to minimize a ranking loss, outperforms many of the currently used state-of-the-art recommendation systems. Moreover, our method is easy to parallelize, making it a viable approach for large-scale real-world rank-based recommendation systems.

Proceedings ArticleDOI
26 Nov 2014
TL;DR: This investigation finds that once trained (using particle swarm optimization) there is very little difference in performance between these functions, that relevance feedback is effective, that stemming is effective, and that it remains unclear which function is best overall.
Abstract: Recent work on search engine ranking functions reports improvements on BM25 and Language Models with Dirichlet Smoothing. In this investigation 9 recent ranking functions (BM25, BM25+, BM25T, BM25-adpt, BM25L, TF1∘δ∘p×IDF, LM-DS, LM-PYP, and LM-PYP-TFIDF) are compared by training on the INEX 2009 Wikipedia collection and testing on INEX 2010 and 9 TREC collections. We find that once trained (using particle swarm optimization) there is very little difference in performance between these functions, that relevance feedback is effective, that stemming is effective, and that it remains unclear which function is best overall.
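As a reference point for the functions compared above, plain BM25 scores one query term roughly as follows. The k1 and b defaults and the IDF variant used here are conventional choices, and the paper's BM25 variants each modify pieces of this formula:

```python
import math

def bm25_term(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
    """One query term's BM25 contribution for one document.
    tf: term frequency in the document; df: document frequency of the term;
    N: collection size; dl/avgdl: document length and average length."""
    idf = math.log((N - df + 0.5) / (df + 0.5) + 1.0)
    return idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))

# A term occurring twice in an average-length document, appearing in 5 of 100 docs.
print(bm25_term(tf=2, df=5, N=100, dl=100, avgdl=100))
```

Tuning k1 and b per collection (e.g., by particle swarm optimization, as in the paper) is what makes these function families perform so similarly once trained.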

Journal ArticleDOI
TL;DR: A product aspect ranking framework, which automatically identifies the important aspects of products from online consumer reviews, aiming at improving the usability of the numerous reviews, and develops a probabilistic aspect ranking algorithm to infer the importance of aspects.
Abstract: Numerous consumer reviews of products are now available on the Internet. Consumer reviews contain rich and valuable knowledge for both firms and users. However, the reviews are often disorganized, leading to difficulties in information navigation and knowledge acquisition. This article proposes a product aspect ranking framework, which automatically identifies the important aspects of products from online consumer reviews, aiming at improving the usability of the numerous reviews. The important product aspects are identified based on two observations: 1) the important aspects are usually commented on by a large number of consumers and 2) consumer opinions on the important aspects greatly influence their overall opinions on the product. In particular, given the consumer reviews of a product, we first identify product aspects by a shallow dependency parser and determine consumer opinions on these aspects via a sentiment classifier. We then develop a probabilistic aspect ranking algorithm to infer the importance of aspects by simultaneously considering aspect frequency and the influence of consumer opinions given to each aspect over their overall opinions. The experimental results on a review corpus of 21 popular products in eight domains demonstrate the effectiveness of the proposed approach. Moreover, we apply product aspect ranking to two real-world applications, i.e., document-level sentiment classification and extractive review summarization, and achieve significant performance improvements, which demonstrate the capacity of product aspect ranking in facilitating real-world applications.

Patent
06 May 2014
TL;DR: In this article, a computer-implemented method and system for enabling communication between networked users based on search queries and common characteristics is disclosed, which relates to receiving a search query from a first user and establishing a communication link between the first user and a second user based on the first user's search query.
Abstract: A computer-implemented method and system for enabling communication between networked users based on search queries and common characteristics is disclosed. Particular embodiments relate to receiving a search query from a first user and establishing a communication link between the first user and a second user based on the first user's search query. Particular embodiments relate to receiving a first search query from a first user, receiving a second search query from a second user, determining if the first user and the second user fit within match criteria, and establishing a communication link between the first user and the second user if the first user and the second user fit within match criteria. Particular embodiments relate to receiving a first search query from a first user, receiving a second search query from a second user, determining if the first search query and the second search query fit within match criteria, determining if the first user and the second user fit within match criteria, and establishing a communication link between the first user and the second user if the first search query and the second search query fit within match criteria and if the first user and the second user fit within match criteria.

Journal ArticleDOI
TL;DR: This paper proposes a novel method for object detection based on structural feature description and query expansion that is evaluated on high-resolution satellite images and demonstrates its clear advantages over several other object detection methods.
Abstract: Object detection is an important task in very high-resolution remote sensing image analysis. Traditional detection approaches are often not sufficiently robust in dealing with the variations of targets and sometimes suffer from limited training samples. In this paper, we tackle these two problems by proposing a novel method for object detection based on structural feature description and query expansion. The feature description combines both local and global information of objects. After initial feature extraction from a query image and representative samples, these descriptors are updated through an augmentation process to better describe the object of interest. The object detection step is implemented using a ranking support vector machine (SVM), which converts the detection task to a ranking query task. The ranking SVM is first trained on a small subset of training data with samples automatically ranked based on similarities to the query image. Then, a novel query expansion method is introduced to update the initial object model by active learning with human inputs on ranking of image pairs. Once the query expansion process is completed, which is determined by measuring entropy changes, the model is then applied to the whole target data set in which objects in different classes shall be detected. We evaluate the proposed method on high-resolution satellite images and demonstrate its clear advantages over several other object detection methods.

Journal ArticleDOI
TL;DR: This paper proposes a query expansion technique for image search that is faster and more precise than the existing ones and significantly outperforms the visual query expansion state of the art on popular benchmarks.

Proceedings ArticleDOI
Suqi Cheng, Huawei Shen, Junming Huang, Wei Chen, Xueqi Cheng
03 Jul 2014
TL;DR: This paper develops an iterative ranking framework, IMRank, to efficiently solve the influence maximization problem under the independent cascade model: starting from an initial ranking, e.g., one obtained from an efficient heuristic algorithm, IMRank finds a self-consistent ranking by iteratively reordering nodes in terms of their ranking-based marginal influence spread computed according to the current ranking.
Abstract: Influence maximization, fundamental for word-of-mouth marketing and viral marketing, aims to find a set of seed nodes maximizing influence spread on social network. Early methods mainly fall into two paradigms with certain benefits and drawbacks: (1) Greedy algorithms, selecting seed nodes one by one, give a guaranteed accuracy relying on the accurate approximation of influence spread with high computational cost; (2) Heuristic algorithms, estimating influence spread using efficient heuristics, have low computational cost but unstable accuracy. We first point out that greedy algorithms are essentially finding a self-consistent ranking, where nodes' ranks are consistent with their ranking-based marginal influence spread. This insight motivates us to develop an iterative ranking framework, i.e., IMRank, to efficiently solve influence maximization problem under independent cascade model. Starting from an initial ranking, e.g., one obtained from efficient heuristic algorithm, IMRank finds a self-consistent ranking by reordering nodes iteratively in terms of their ranking-based marginal influence spread computed according to current ranking. We also prove that IMRank definitely converges to a self-consistent ranking starting from any initial ranking. Furthermore, within this framework, a last-to-first allocating strategy and a generalization of this strategy are proposed to improve the efficiency of estimating ranking-based marginal influence spread for a given ranking. In this way, IMRank achieves both remarkable efficiency and high accuracy by leveraging simultaneously the benefits of greedy algorithms and heuristic algorithms. As demonstrated by extensive experiments on large scale real-world social networks, IMRank always achieves high accuracy comparable to greedy algorithms, while the computational cost is reduced dramatically, about 10-100 times faster than other scalable heuristics.
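The self-consistency idea can be illustrated with a deliberately simplified stand-in for ranking-based marginal influence spread: score each node by how much of its reach is not already covered by nodes ranked above it, and re-sort until the order is stable. The `reach` sets and coverage score below are assumptions for illustration, not the paper's estimator, but they show the iterate-to-fixed-point structure:

```python
def im_rank(nodes, reach, max_iter=50):
    """Toy self-consistent ranking loop in the spirit of IMRank: a node's
    score is its marginal coverage given the nodes ranked above it; nodes
    are re-sorted by this ranking-dependent score until the order is stable."""
    ranking = list(nodes)
    for _ in range(max_iter):
        covered = set()
        scores = {}
        for n in ranking:
            scores[n] = len(reach[n] - covered)  # marginal coverage
            covered |= reach[n]
        new = sorted(ranking, key=lambda n: (-scores[n], n))
        if new == ranking:          # self-consistent: order reproduces itself
            return new
        ranking = new
    return ranking

# Toy "reach" sets standing in for influence spread.
reach = {"a": {"x", "y"}, "b": {"x", "y", "z"}, "c": {"w"}}
print(im_rank(["c", "a", "b"], reach))  # -> ['a', 'b', 'c']
```

As in the paper, the fixed point reached can depend on the initial ranking, which is why IMRank seeds the iteration with a good heuristic ranking.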

Proceedings ArticleDOI
03 Jul 2014
TL;DR: By modeling comments as a time-aware bipartite graph, this work proposes a regularization-based ranking algorithm that accounts for temporal, social influence and current popularity factors to predict the future popularity of items.
Abstract: In the current Web 2.0 era, the popularity of Web resources fluctuates ephemerally, based on trends and social interest. As a result, content-based relevance signals are insufficient to meet users' constantly evolving information needs in searching for Web 2.0 items. Incorporating future popularity into ranking is one way to counter this. However, predicting popularity as a third party (as in the case of general search engines) is difficult in practice, due to their limited access to item view histories. To enable popularity prediction externally without excessive crawling, we propose an alternative solution by leveraging user comments, which are more accessible than view counts. Due to the sparsity of comments, traditional solutions that are solely based on view histories do not perform well. To deal with this sparsity, we mine comments to recover additional signal, such as social influence. By modeling comments as a time-aware bipartite graph, we propose a regularization-based ranking algorithm that accounts for temporal, social influence and current popularity factors to predict the future popularity of items. Experimental results on three real-world datasets --- crawled from YouTube, Flickr and Last.fm --- show that our method consistently outperforms competitive baselines in several evaluation tasks.

Proceedings ArticleDOI
01 Jun 2014
TL;DR: The UvA-ILLC submission of the BEER metric to WMT 14 metrics task is presented, with novel contributions of efficient tuning of a large number of features for maximizing correlation with human system ranking and novel features that give smoother sentence level scores.
Abstract: We present the UvA-ILLC submission of the BEER metric to WMT 14 metrics task. BEER is a sentence level metric that can incorporate a large number of features combined in a linear model. Novel contributions are (1) efficient tuning of a large number of features for maximizing correlation with human system ranking, and (2) novel features that give smoother sentence level scores.

Proceedings ArticleDOI
03 Nov 2014
TL;DR: This paper proposes using "Bag-of-Concepts" in short text representation, aiming to avoid the surface mismatching and handle the synonym and polysemy problem, and proposes a novel framework for lightweight short text classification applications.
Abstract: Most existing approaches for text classification represent texts as vectors of words, namely "Bag-of-Words." This text representation results in a very high dimensionality of feature space and frequently suffers from surface mismatching. Short texts make these issues even more serious, due to their shortness and sparsity. In this paper, we propose using "Bag-of-Concepts" in short text representation, aiming to avoid the surface mismatching and handle the synonym and polysemy problem. Based on "Bag-of-Concepts," a novel framework is proposed for lightweight short text classification applications. By leveraging a large taxonomy knowledgebase, it learns a concept model for each category, and conceptualizes a short text to a set of relevant concepts. A concept-based similarity mechanism is presented to classify the given short text to the most similar category. One advantage of this mechanism is that it facilitates short text ranking after classification, which is needed in many applications, such as query or ad recommendation. We demonstrate the usage of our proposed framework through a real online application: Channel-based Query Recommendation. Experiments show that our framework can map queries to channels with a high degree of precision (avg. precision=90.3%), which is critical for recommendation applications.

Journal ArticleDOI
TL;DR: This paper addresses the problem of predicting the popularity of news articles based on user comments as a ranking problem and indicates that popularity prediction methods are adequate solutions for this ranking task and could be considered as a valuable alternative for automatic online news ranking.
Abstract: News articles are an engaging type of online content that captures the attention of a significant number of Internet users. They are particularly enjoyed by mobile users and massively spread through online social platforms. As a result, there is an increased interest in discovering the articles that will become popular among users. This objective falls under the broad scope of content popularity prediction and has direct implications for the development of new services for online advertisement and content distribution. In this paper, we address the problem of predicting the popularity of news articles based on user comments. We formulate the prediction task as a ranking problem, where the goal is not to infer the precise attention that a piece of content will receive but to accurately rank articles based on their predicted popularity. Using data obtained from two important news sites in France and the Netherlands, we analyze the ranking effectiveness of two prediction models. Our results indicate that popularity prediction methods are adequate solutions for this ranking task and could be considered a valuable alternative for automatic online news ranking.
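Since the goal is correct ordering rather than exact popularity values, a natural way to evaluate such a model is rank correlation between predicted and final popularity. A sketch under assumed data (the comment counts and the constant-scaling predictor below are illustrative, not the paper's datasets or models):

```python
def kendall_tau(pred, actual):
    """Kendall rank correlation: (concordant - discordant) pairs,
    normalized by the total number of pairs."""
    n = len(pred)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (pred[i] - pred[j]) * (actual[i] - actual[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical example: comments observed in the first hours vs. final
# counts. A constant-scaling predictor ranks exactly like the early counts.
early = [12, 3, 45, 7]
final = [130, 20, 600, 90]
pred = [c * 10 for c in early]
```

A model can be useless at predicting absolute counts yet still achieve tau near 1, which is exactly the property the ranking formulation rewards.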

Journal ArticleDOI
TL;DR: This paper proposes a flexible multi-keyword query scheme, called MKQE, which greatly reduces the maintenance overhead during the keyword dictionary expansion and takes keyword weights and user access history into consideration when generating the query result.

Patent
Erick Tseng1
01 Oct 2014
TL;DR: In this article, a user of a social networking system requests to look up an address book maintained by the social network system, which improves the look up search results by ranking one or more contacts in the address book based on social graph, social relationship and communication history information.
Abstract: In one embodiment, a user of a social networking system requests to look up an address book maintained by the social networking system. The social networking system improves the look up search results by ranking one or more contacts in the address book based on social graph, social relationship and communication history information.
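The patent describes ranking address-book contacts by social graph, social relationship, and communication history signals. As a hedged illustration (the signal names and weights below are hypothetical, not taken from the patent), such a ranking could blend per-contact features into a single score:

```python
def rank_contacts(contacts, weights=(0.5, 0.3, 0.2)):
    """Order address-book contacts by a weighted blend of (hypothetical)
    social-graph, relationship, and communication-history signals."""
    w_graph, w_rel, w_comm = weights

    def score(c):
        return (w_graph * c["mutual_friends"]        # social graph signal
                + w_rel * c["relationship_strength"]  # relationship signal
                + w_comm * c["recent_messages"])      # communication history
    return sorted(contacts, key=score, reverse=True)
```

In practice the weights would be learned or tuned, and the raw signals normalized; the point is simply that lookup results are reordered by social evidence rather than returned alphabetically.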

BookDOI
31 Dec 2014
TL;DR: This volume points to a number of advances topically subdivided into four parts: estimation of importance of characteristic features, their relevance, dependencies, weighting and ranking; rough set approach to attribute reduction with focus on relative reducts; construction of rules and their evaluation; and data- and domain-oriented methodologies.
Abstract: This research book provides the reader with a selection of high-quality texts dedicated to current progress, new developments and research trends in feature selection for data and pattern recognition. Even though it has been a subject of interest for some time, feature selection remains one of the most actively pursued avenues of investigation due to its importance and its bearing on other problems and tasks. This volume points to a number of advances topically subdivided into four parts: estimation of importance of characteristic features, their relevance, dependencies, weighting and ranking; rough set approach to attribute reduction with focus on relative reducts; construction of rules and their evaluation; and data- and domain-oriented methodologies.

Proceedings ArticleDOI
23 Jun 2014
TL;DR: This work forms the color enhancement task as a learning-to-rank problem in which ordered pairs of images are used for training, and then various color enhancements of a novel input image can be evaluated from their corresponding rank values.
Abstract: We present a machine-learned ranking approach for automatically enhancing the color of a photograph. Unlike previous techniques that train on pairs of images before and after adjustment by a human user, our method takes into account the intermediate steps taken in the enhancement process, which provide detailed information on the person's color preferences. To make use of this data, we formulate the color enhancement task as a learning-to-rank problem in which ordered pairs of images are used for training, and then various color enhancements of a novel input image can be evaluated from their corresponding rank values. From the parallels between the decision tree structures we use for ranking and the decisions made by a human during the editing process, we posit that breaking a full enhancement sequence into individual steps can facilitate training. Our experiments show that this approach compares well to existing methods for automatic color enhancement.
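The paper trains decision-tree rankers on ordered pairs of images drawn from editing sequences. As a minimal illustration of the underlying pairwise learning-to-rank idea (using a linear scorer with a RankNet-style logistic loss rather than the paper's trees, and hypothetical feature vectors), training on (better, worse) pairs looks like:

```python
import math

def train_pairwise_ranker(pairs, dim, lr=0.1, epochs=50):
    """Fit a linear scoring function w.x from ordered pairs (better, worse),
    minimizing the pairwise logistic loss log(1 + exp(-margin)).
    Each intermediate editing step yields such a pair: the later
    (user-preferred) state should score higher."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            margin = sum(wi * (b - c) for wi, b, c in zip(w, better, worse))
            grad = -1.0 / (1.0 + math.exp(margin))  # dLoss/dMargin
            for i in range(dim):
                w[i] -= lr * grad * (better[i] - worse[i])
    return w

def score(w, x):
    """Rank candidate enhancements of a novel image by this value."""
    return sum(wi * xi for wi, xi in zip(w, x))
```

At test time, candidate color enhancements of a new photo are generated, scored, and the top-ranked one applied, matching the "evaluate enhancements by their rank values" usage in the abstract.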

Patent
08 Dec 2014
TL;DR: In this paper, a social-networking system accesses a social graph with a plurality of user nodes, receives a search query with location parameters, identifies a set of location nodes based on the search query, and determines a value for each location node in the set based on edges connected to the location nodes in the social graph.
Abstract: In one embodiment, a social-networking system accesses a social graph with a plurality of user nodes and a plurality of location nodes, receives a search query with location parameters, identifies a set of location nodes based on the search query, and determines a value for each location node in the set based on the edges connected to the location nodes in the social graph.

Journal ArticleDOI
TL;DR: This paper introduces three novel ranking algorithms for signed networks and compares their ability in predicting signs of edges with already existing ones and identifies a number of ranking algorithms that result in higher prediction accuracy compared to others.
Abstract: Social networks are an inevitable part of modern life. One class of social networks comprises those with both positive (friendship or trust) and negative (enmity or distrust) links. Ranking nodes in signed networks remains a hot topic in computer science. In this manuscript, we review different algorithms for ranking the nodes in signed networks and apply them to the sign prediction problem. Ranking scores are used to obtain reputation and optimism, which are used as features in the sign prediction problem. The reputation of a node reflects the pattern of voting towards it, and its optimism reflects how positively the node votes about others. To assess the performance of different ranking algorithms, we apply them to three signed networks: Epinions, Slashdot and Wikipedia. In this paper, we introduce three novel ranking algorithms for signed networks and compare their ability to predict signs of edges with that of existing algorithms. We use logistic regression as the predictor, with the reputation and optimism values of the trustee and trustor as features (obtained from the different ranking algorithms). We find that ranking algorithms that produce correlated ranking scores lead to almost the same prediction accuracy. Furthermore, our analysis identifies a number of ranking algorithms that yield higher prediction accuracy than others.
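In the simplest case, reputation and optimism can be read directly off the signed adjacency structure: incoming signs characterize how a node is voted on, outgoing signs how it votes. A sketch of this baseline feature extraction (the mean-of-signs definition here is one simple choice, not necessarily the paper's exact formulation):

```python
from collections import defaultdict

def reputation_optimism(edges):
    """From signed edges (u, v, sign) with sign in {+1, -1}, compute each
    node's reputation (mean sign of incoming votes) and optimism (mean
    sign of outgoing votes). Nodes with no such edges default to 0.0."""
    inc, out = defaultdict(list), defaultdict(list)
    for u, v, s in edges:
        out[u].append(s)  # u votes on v: contributes to u's optimism
        inc[v].append(s)  # ... and to v's reputation
    nodes = set(inc) | set(out)
    rep = {n: sum(inc[n]) / len(inc[n]) if inc[n] else 0.0 for n in nodes}
    opt = {n: sum(out[n]) / len(out[n]) if out[n] else 0.0 for n in nodes}
    return rep, opt
```

For an edge (u, v) to be predicted, the four values (rep[u], opt[u], rep[v], opt[v]) would then be fed to a logistic regression classifier, as the abstract describes.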

Journal ArticleDOI
TL;DR: In MODE-RMO, the ranking-based mutation operator is integrated into the MODE algorithm to accelerate convergence and thus enhance performance; this variant can generate Pareto optimal fronts with satisfactory convergence and diversity.