
Showing papers in "ACM Transactions on Information Systems in 2019"


Journal ArticleDOI
TL;DR: This article proposes a more expressive ICF solution by accounting for the nonlinear and higher-order relationships among items, treating this solution as a deep variant of ICF and thus terming it DeepICF.
Abstract: Item-based Collaborative Filtering (ICF) has been widely adopted in industrial recommender systems, owing to its strength in user interest modeling and ease of online personalization. By constructing a user's profile with the items that the user has consumed, ICF recommends items that are similar to the user's profile. With the prevalence of machine learning in recent years, significant progress has been made for ICF by learning item similarity (or representation) from data. Nevertheless, we argue that most existing works have only considered linear and shallow relationships between items, which are insufficient to capture the complicated decision-making process of users. In this article, we propose a more expressive ICF solution by accounting for the nonlinear and higher-order relationships among items. Going beyond modeling only the second-order interaction (e.g., similarity) between two items, we additionally consider the interaction among all interacted item pairs by using nonlinear neural networks. By doing this, we can effectively model the higher-order relationship among items, capturing more complicated effects in user decision-making. For example, it can differentiate which historical itemsets in a user's profile are more important in affecting the user to make a purchase decision on an item. We treat this solution as a deep variant of ICF, thus terming it DeepICF. To justify our proposal, we perform empirical studies on two public datasets from MovieLens and Pinterest. Extensive experiments verify the highly positive effect of higher-order item interaction modeling with nonlinear neural networks. Moreover, we demonstrate that with more fine-grained second-order interaction modeling via an attention network, the performance of our DeepICF method can be further improved.
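A minimal numpy sketch of the idea (the embedding sizes, the tiny attention net, and the two-layer MLP are illustrative assumptions, not the paper's exact architecture): each consumed item interacts with the target item through an element-wise product, attention pools the pairwise interactions, and nonlinear layers model the higher-order effects.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_items = 16, 100
P = rng.normal(scale=0.1, size=(n_items, d))  # embeddings of history items
Q = rng.normal(scale=0.1, size=(n_items, d))  # embeddings of target items

def deep_icf_score(history, target, W1, b1, W2, b2, h):
    pair = P[history] * Q[target]            # second-order item interactions
    logits = pair @ h                        # toy attention network (assumption)
    att = np.exp(logits - logits.max())
    att /= att.sum()
    pooled = att @ pair                      # attention pooling over history pairs
    hidden = np.maximum(0.0, pooled @ W1 + b1)   # nonlinear layers capture
    hidden = np.maximum(0.0, hidden @ W2 + b2)   # higher-order relationships
    return float(hidden.sum())               # stand-in for the prediction layer

W1 = rng.normal(scale=0.1, size=(d, d)); b1 = np.zeros(d)
W2 = rng.normal(scale=0.1, size=(d, d)); b2 = np.zeros(d)
h = rng.normal(scale=0.1, size=d)
print(deep_icf_score([3, 17, 42], 7, W1, b1, W2, b2, h))
```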

181 citations


Journal ArticleDOI
TL;DR: The authors propose a new deep learning solution, named Relational Stock Ranking (RSR), for stock prediction, which jointly models the temporal evolution and relation network of stocks.
Abstract: Stock prediction aims to predict the future trends of a stock in order to help investors make good investment decisions. Traditional solutions for stock prediction are based on time-series models. With the recent success of deep neural networks in modeling sequential data, deep learning has become a promising choice for stock prediction. However, most existing deep learning solutions are not optimized toward the target of investment, i.e., selecting the best stock with the highest expected revenue. Specifically, they typically formulate stock prediction as a classification (to predict stock trends) or a regression problem (to predict stock prices). More importantly, they largely treat the stocks as independent of each other. The valuable signal in the rich relations between stocks (or companies), such as two stocks being in the same sector or two companies having a supplier-customer relation, is not considered. In this work, we contribute a new deep learning solution, named Relational Stock Ranking (RSR), for stock prediction. Our RSR method advances existing solutions in two major aspects: (1) tailoring the deep learning models for stock ranking, and (2) capturing the stock relations in a time-sensitive manner. The key novelty of our work is the proposal of a new component in neural network modeling, named Temporal Graph Convolution, which jointly models the temporal evolution and relation network of stocks. To validate our method, we perform back-testing on the historical data of two stock markets, NYSE and NASDAQ. Extensive experiments demonstrate the superiority of our RSR method: it outperforms state-of-the-art stock prediction solutions, achieving an average return ratio of 98% and 71% on NYSE and NASDAQ, respectively.
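A hedged numpy sketch of a temporal-graph-convolution-style step (the bilinear relation strength and all dimensions are assumptions, not the paper's exact design): relation weights are computed from the current sequential embeddings, so the propagation is time-sensitive, and the propagated embeddings feed a ranking score.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stocks, d = 5, 8
E = rng.normal(scale=0.5, size=(n_stocks, d))  # per-stock sequential embeddings,
                                               # e.g., the last state of an LSTM
A = (rng.random((n_stocks, n_stocks)) < 0.5).astype(float)  # relation graph
np.fill_diagonal(A, 0.0)

def temporal_graph_convolution(E, A, Wg):
    # Relation strength depends on the *current* embeddings (time-sensitive),
    # here via a bilinear form, standing in for the paper's actual design.
    S = np.exp(E @ Wg @ E.T) * A
    S = S / (S.sum(axis=1, keepdims=True) + 1e-12)   # normalize over neighbors
    return S @ E                                      # propagate relational info

Wg = rng.normal(scale=0.1, size=(d, d))
R = temporal_graph_convolution(E, A, Wg)
w = rng.normal(scale=0.1, size=2 * d)
scores = np.concatenate([E, R], axis=1) @ w   # ranking score per stock
print(np.argsort(-scores))                    # buy the top-ranked stocks
```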

176 citations


Journal ArticleDOI
TL;DR: The authors propose a multi-modal aspect-aware topic model (MATM) on text reviews and item images to model users' preferences and items' features from different aspects, and to estimate the aspect importance of a user toward an item.
Abstract: Personalized rating prediction is an important research problem in recommender systems. Although the latent factor model (e.g., matrix factorization) achieves good accuracy in rating prediction, it suffers from many problems including cold-start, non-transparency, and suboptimal results for individual user-item pairs. In this article, we exploit textual reviews and item images together with ratings to tackle these limitations. Specifically, we first apply a proposed multi-modal aspect-aware topic model (MATM) on text reviews and item images to model users' preferences and items' features from different aspects, and also estimate the aspect importance of a user toward an item. Then, the aspect importance is integrated into a novel aspect-aware latent factor model (ALFM), which learns users' and items' latent factors based on ratings. In particular, ALFM introduces a weight matrix to associate those latent factors with the same set of aspects in MATM, such that the latent factors could be used to estimate aspect ratings. Finally, the overall rating is computed via a linear combination of the aspect ratings, which are weighted by the corresponding aspect importance. In this way, our model can alleviate the data sparsity problem and gain good interpretability for recommendation. In addition, every aspect rating is weighted by its aspect importance, which is dependent on the targeted user's preferences and the targeted item's features. Therefore, it is expected that the proposed method can model a user's preferences on an item more accurately for each user-item pair. Comprehensive experimental studies have been conducted on the Yelp 2017 Challenge dataset and Amazon product datasets. Results show that (1) our method achieves significant improvement compared to strong baseline methods, especially for users with only a few ratings; (2) item visual features can improve the prediction performance—the effects of item image features on improving the prediction results depend on the importance of the visual features for the items; and (3) our model can explicitly interpret the predicted results in great detail.
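The final combination step can be stated compactly; a small numpy sketch under assumed dimensions (the aspect importances rho would come from MATM, and the per-aspect latent factors from ALFM):

```python
import numpy as np

rng = np.random.default_rng(2)
n_aspects, d = 4, 8
U = rng.normal(scale=0.3, size=(n_aspects, d))  # user latent factors, per aspect
V = rng.normal(scale=0.3, size=(n_aspects, d))  # item latent factors, per aspect
rho = rng.random(n_aspects)
rho /= rho.sum()     # aspect importance for this user-item pair (from MATM)

aspect_ratings = np.einsum("ad,ad->a", U, V)    # one predicted rating per aspect
overall = float(rho @ aspect_ratings)           # importance-weighted combination
print(aspect_ratings.round(3), round(overall, 3))
```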

156 citations


Journal ArticleDOI
TL;DR: This article proposes a spatiotemporal context-aware and translation-based recommender framework (STA) to model the third-order relationship among users, POIs, and spatiotemporal contexts for large-scale POI recommendation and demonstrates that the STA framework achieves superior performance in terms of recommendation accuracy, robustness to data sparsity, and effectiveness in handling the cold-start problem.
Abstract: The increasing proliferation of location-based social networks brings about a huge volume of user check-in data, which facilitates the recommendation of points of interest (POIs). Time and location are the two most important contextual factors in the user's decision-making for choosing a POI to visit. In this article, we focus on the spatiotemporal context-aware POI recommendation, which considers the joint effect of time and location for POI recommendation. Inspired by the recent advances in knowledge graph embedding, we propose a spatiotemporal context-aware and translation-based recommender framework (STA) to model the third-order relationship among users, POIs, and spatiotemporal contexts for large-scale POI recommendation. Specifically, we embed both users and POIs into a "transition space" where spatiotemporal contexts (i.e., a <time, location> pair) are modeled as translation vectors operating on users and POIs. We further develop a series of strategies to exploit various correlation information to address the data sparsity and cold-start issues for new spatiotemporal contexts, new users, and new POIs. We conduct extensive experiments on two real-world datasets. The experimental results demonstrate that our STA framework achieves superior performance in terms of recommendation accuracy, robustness to data sparsity, and effectiveness in handling the cold-start problem.
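A minimal sketch of translation-based scoring (the L1 distance and the dimensions are assumptions): the spatiotemporal context acts as a translation vector, so candidate POIs are ranked by how close user + context lands to them in the transition space.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_pois = 16, 50
user = rng.normal(scale=0.1, size=d)
ctx = rng.normal(scale=0.1, size=d)            # embedding of a <time, location> pair
POIs = rng.normal(scale=0.1, size=(n_pois, d))

# A good candidate POI p should sit near user + ctx in the transition space,
# so rank POIs by -||user + ctx - p||_1 (the choice of L1 is an assumption).
scores = -np.abs(user + ctx - POIs).sum(axis=1)
print(np.argsort(-scores)[:5])                 # top-5 recommended POIs
```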

119 citations


Journal ArticleDOI
TL;DR: The authors propose CARL, a context-aware user-item representation learning model for rating prediction, which adopts Factorization Machines to further model higher-order feature interactions on the basis of the user-item pair.
Abstract: Both reviews and user-item interactions (i.e., rating scores) have been widely adopted for user rating prediction. However, these existing techniques mainly extract the latent representations for users and items in an independent and static manner. That is, a single static feature vector is derived to encode user preference without considering the particular characteristics of each candidate item. We argue that this static encoding scheme is incapable of fully capturing users' preferences, because users usually exhibit different preferences when interacting with different items. In this article, we propose a novel context-aware user-item representation learning model for rating prediction, named CARL. CARL derives a joint representation for a given user-item pair based on their individual latent features and latent feature interactions. Then, CARL adopts Factorization Machines to further model higher-order feature interactions on the basis of the user-item pair for rating prediction. Specifically, two separate learning components are devised in CARL to exploit review data and interaction data, respectively: review-based feature learning and interaction-based feature learning. In the review-based learning component, with convolution operations and an attention mechanism, the pair-based relevant features for the given user-item pair are extracted by jointly considering their corresponding reviews. However, these features are only review-driven and may not be comprehensive. Hence, an interaction-based learning component further extracts complementary features from interaction data alone, also on the basis of user-item pairs. The final rating score is then derived with a dynamic linear fusion mechanism. Experiments on seven real-world datasets show that CARL achieves significantly better rating prediction accuracy than existing state-of-the-art alternatives. Also, with the attention mechanism, we show that the pair-based relevant information (i.e., context-aware information) in reviews can be highlighted to interpret the rating prediction for different user-item pairs.

113 citations


Journal ArticleDOI
TL;DR: The J-NCF model couples deep feature learning and deep interaction modeling and introduces a new loss function that takes both implicit and explicit feedback, as well as point-wise and pair-wise loss, into account; it remains competitive for inactive users and under different degrees of data sparsity when compared to state-of-the-art baselines.
Abstract: We propose a Joint Neural Collaborative Filtering (J-NCF) method for recommender systems. The J-NCF model applies a joint neural network that couples deep feature learning and deep interaction modeling with a rating matrix. Deep feature learning extracts feature representations of users and items with a deep learning architecture based on a user-item rating matrix. Deep interaction modeling captures non-linear user-item interactions with a deep neural network using the feature representations generated by the deep feature learning process as input. J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training, which leads to improved recommendation performance. In addition, we design a new loss function for optimization that takes both implicit and explicit feedback, point-wise and pair-wise loss into account. Experiments on several real-world datasets show significant improvements of J-NCF over state-of-the-art methods, with improvements of up to 8.24% on the MovieLens 100K dataset, 10.81% on the MovieLens 1M dataset, and 10.21% on the Amazon Movies dataset in terms of HR@10. NDCG@10 improvements are 12.42%, 14.24%, and 15.06%, respectively. We also conduct experiments to evaluate the scalability and sensitivity of J-NCF. Our experiments show that the J-NCF model has a competitive recommendation performance with inactive users and different degrees of data sparsity when compared to state-of-the-art baselines.
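A hedged sketch of such a joint loss (the alpha-weighted combination and the rating-scaled target are assumptions, not the paper's exact formulation): a pair-wise BPR-style term on the positive-negative score gap plus a point-wise cross-entropy term whose target folds in the explicit rating.

```python
import numpy as np

def hybrid_loss(y_pos, y_neg, rating, r_max, alpha=0.5):
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    # Pair-wise term: the observed item should outscore the sampled negative.
    pairwise = -np.log(sigmoid(y_pos - y_neg) + 1e-12)
    # Point-wise term: cross-entropy whose target folds in the explicit rating.
    target = rating / r_max
    p = sigmoid(y_pos)
    pointwise = -(target * np.log(p + 1e-12) + (1 - target) * np.log(1 - p + 1e-12))
    return alpha * pairwise + (1 - alpha) * pointwise  # alpha balances the terms

print(hybrid_loss(y_pos=2.0, y_neg=0.5, rating=4, r_max=5))
```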

87 citations


Journal ArticleDOI
TL;DR: It is found that RippleNet provides a new perspective of explainability for the recommended results in terms of the KG, and both versions of RippleNet achieve substantial gains in a variety of scenarios, including movie, book, and news recommendations, over several state-of-the-art baselines.
Abstract: To address the sparsity and cold-start problem of collaborative filtering, researchers usually make use of side information, such as social networks or item attributes, to improve the performance of recommendation. In this article, we consider the knowledge graph (KG) as the source of side information. To address the limitations of existing embedding-based and path-based methods for KG-aware recommendation, we propose RippleNet, an end-to-end framework that naturally incorporates the KG into recommender systems. RippleNet has two versions: (1) The outward propagation version, which is analogous to the actual ripples on water, stimulates the propagation of user preferences over the set of knowledge entities by automatically and iteratively extending a user's potential interests along links in the KG. The multiple "ripples" activated by a user's historically clicked items are thus superposed to form the preference distribution of the user with respect to a candidate item. (2) The inward aggregation version aggregates and incorporates neighborhood information in a biased manner when computing the representation of a given entity. The neighborhood can be extended to multiple hops away to model high-order proximity and capture users' long-distance interests. In addition, we intuitively demonstrate how a KG assists with recommender systems in RippleNet, and we also find that RippleNet provides a new perspective of explainability for the recommended results in terms of the KG. Through extensive experiments on real-world datasets, we demonstrate that both versions of RippleNet achieve substantial gains in a variety of scenarios, including movie, book, and news recommendations, over several state-of-the-art baselines.
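A simplified single-hop numpy sketch of the outward propagation described above (sizes and initialization are illustrative): each knowledge triple around the user's clicked items is weighted by its relevance to the candidate item, and the weighted sum of tail entities gives the user's propagated preference for this hop.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_triples = 8, 6
# 1-hop "ripple set": (head, relation, tail) triples reachable in the KG
# from the user's historically clicked items.
heads = rng.normal(scale=0.3, size=(n_triples, d))
rels = rng.normal(scale=0.3, size=(n_triples, d, d))
tails = rng.normal(scale=0.3, size=(n_triples, d))
item = rng.normal(scale=0.3, size=d)           # candidate item embedding

def ripple_hop(item, heads, rels, tails):
    # Relevance of each triple to the candidate item, then a weighted sum of
    # tails gives the user's propagated preference for this hop.
    logits = np.array([item @ R @ h for R, h in zip(rels, heads)])
    p = np.exp(logits - logits.max()); p /= p.sum()
    return p @ tails

u = ripple_hop(item, heads, rels, tails)       # the full model superposes hops
prob = 1.0 / (1.0 + np.exp(-(u @ item)))       # predicted click probability
print(round(float(prob), 3))
```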

85 citations


Journal ArticleDOI
TL;DR: The authors propose an Attentive Aspect-based Recommendation Model (AARM) that, to enrich the aspect connections between user and product, models the interactions between synonymous and similar aspects in addition to common aspects.
Abstract: In recent years, many studies extract aspects from user reviews and integrate them with ratings for improving the recommendation performance. The common aspects mentioned in a user’s reviews and a product’s reviews indicate indirect connections between the user and product. However, these aspect-based methods suffer from two problems. First, the common aspects are usually very sparse, which is caused by the sparsity of user-product interactions and the diversity of individual users’ vocabularies. Second, a user’s interests on aspects could be different with respect to different products, which are usually assumed to be static in existing methods. In this article, we propose an Attentive Aspect-based Recommendation Model (AARM) to tackle these challenges. For the first problem, to enrich the aspect connections between user and product, besides common aspects, AARM also models the interactions between synonymous and similar aspects. For the second problem, a neural attention network which simultaneously considers user, product, and aspect information is constructed to capture a user’s attention toward aspects when examining different products. Extensive quantitative and qualitative experiments show that AARM can effectively alleviate the two aforementioned problems and significantly outperforms several state-of-the-art recommendation methods on the top-N recommendation task.

81 citations


Journal ArticleDOI
TL;DR: The authors propose an attention mechanism to integrate both long-term and short-term user preferences with the given query for personalized product search, which captures users' current search intentions more accurately.
Abstract: E-commerce users may expect different products even for the same query, due to their diverse personal preferences. It is well known that there are two types of preferences: long-term ones and short-term ones. The former refers to users' inherent purchasing bias and evolves slowly. By contrast, the latter reflects users' purchasing inclination in a relatively short period. They both affect users' current purchasing intentions. However, few research efforts have been dedicated to jointly modeling them for personalized product search. To this end, we propose a novel Attentive Long Short-Term Preference model, dubbed ALSTP, for personalized product search. Our model adopts a neural network approach to learn and integrate the long- and short-term user preferences with the current query for personalized product search. In particular, two attention networks are designed to distinguish which factors in the short-term as well as long-term user preferences are more relevant to the current query. This unique design enables our model to capture users' current search intentions more accurately. Our work is the first to apply attention mechanisms to integrate both long- and short-term user preferences with the given query for personalized search. Extensive experiments over four Amazon product datasets show that our model significantly outperforms several state-of-the-art product search methods in terms of different evaluation metrics.

81 citations


Journal ArticleDOI
TL;DR: An automated, supervised classifier that uses multi-task learning to classify the stance expressed in each individual tweet in a conversation around a rumour as either supporting, denying or questioning the rumour is developed.
Abstract: Social media tend to be rife with rumours while new reports are released piecemeal during breaking news. Interestingly, one can mine multiple reactions expressed by social media users in those situations, exploring their stance towards rumours, ultimately enabling the flagging of highly disputed rumours as being potentially false. In this work, we set out to develop an automated, supervised classifier that uses multi-task learning to classify the stance expressed in each individual tweet in a conversation around a rumour as either supporting, denying or questioning the rumour. Using a Gaussian Process classifier, and exploring its effectiveness on two datasets with very different characteristics and varying distributions of stances, we show that our approach consistently outperforms competitive baseline classifiers. Our classifier is especially effective in estimating the distribution of different types of stance associated with a given rumour, which we set forth as a desired characteristic for a rumour-tracking system that will show both ordinary users of Twitter and professional news practitioners how others orient to the disputed veracity of a rumour, with the final aim of establishing its actual truth value.
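A toy sketch of the classification setup with scikit-learn (the TF-IDF features and the tiny dataset are illustrative stand-ins; the paper uses a multi-task Gaussian Process over richer tweet representations):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.gaussian_process import GaussianProcessClassifier

tweets = ["confirmed by police, this is true",
          "no way, this report is fake",
          "does anyone have a source for this?",
          "I was there, it really happened"]
stances = ["supporting", "denying", "questioning", "supporting"]

vec = TfidfVectorizer()
X = vec.fit_transform(tweets).toarray()      # the GP classifier needs dense input
clf = GaussianProcessClassifier().fit(X, stances)
print(clf.predict(vec.transform(["can anyone confirm this?"]).toarray()))
```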

42 citations


Journal ArticleDOI
TL;DR: This work proposes to model the “search and purchase” behavior as a dynamic relation between users and items, and create a dynamic knowledge graph based on both the multi-relational product data and the context of the search session, which significantly outperforms the state-of-the-art baselines and has the ability to produce reasonable explanations for search results.
Abstract: Product search is one of the most popular methods for customers to discover products online. Most existing studies on product search focus on developing effective retrieval models that rank items by their likelihood to be purchased. However, they ignore the problem that there is a gap between how systems and customers perceive the relevance of items. Without explanations, users may not understand why product search engines retrieve certain items for them, which consequentially leads to imperfect user experience and suboptimal system performance in practice. In this work, we tackle this problem by constructing explainable retrieval models for product search. Specifically, we propose to model the “search and purchase” behavior as a dynamic relation between users and items, and create a dynamic knowledge graph based on both the multi-relational product data and the context of the search session. Ranking is conducted based on the relationship between users and items in the latent space, and explanations are generated with logic inferences and entity soft matching on the knowledge graph. Empirical experiments show that our model, which we refer to as the Dynamic Relation Embedding Model (DREM), significantly outperforms the state-of-the-art baselines and has the ability to produce reasonable explanations for search results.

Journal ArticleDOI
TL;DR: This article examines an academic paper recommender that sends out paper recommendations in email newsletters, based on the users’ browsing history on the academic search engine, and proposes an approach to reranking candidate recommendations that utilizes both paper content and user behavior.
Abstract: Academic search engines have been widely used to access academic papers, where users' information needs are explicitly represented as search queries. Some modern recommender systems have taken one step further by predicting users' information needs without the presence of an explicit query. In this article, we examine an academic paper recommender that sends out paper recommendations in email newsletters, based on the users' browsing history on the academic search engine. Specifically, we look at users who regularly browse papers on the search engine, and who sign up for the recommendation newsletters for the first time. We address the task of reranking the recommendation candidates that are generated by a production system for such users. We face the challenge that the users on whom we focus have not interacted with the recommender system before, which is a common scenario that every recommender system encounters when new users sign up. We propose an approach to reranking candidate recommendations that utilizes both paper content and user behavior. The approach is designed to suit the characteristics unique to our academic recommendation setting. For instance, content similarity measures can be used to find the closest match between candidate recommendations and the papers previously browsed by the user. To this end, we use a knowledge graph derived from paper metadata to compare entity similarities (papers, authors, and journals) in the embedding space. Since the users on whom we focus have no prior interactions with the recommender system, we propose a model to learn a mapping from users' browsed articles to user clicks on the recommendations. We combine both content and behavior into a hybrid reranking model that outperforms the production baseline significantly, providing a relative 13% increase in Mean Average Precision and 28% in Precision@1. Moreover, we provide a detailed analysis of the model components, highlighting where the performance boost comes from. The obtained insights reveal useful components for the reranking process and can be generalized to other academic recommendation settings as well, such as the utility of graph embedding similarity. Also, recent papers browsed by users provide stronger evidence for recommendation than historical ones.

Journal ArticleDOI
TL;DR: This article proposes to utilize personalized latent behavior patterns learned from contextual features, e.g., time of day, day of week, and location category, to improve the effectiveness of next and next new point-of-interest (POI) recommendations.
Abstract: Next and next new point-of-interest (POI) recommendation are essential instruments in promoting customer experiences and business operations related to locations. However, due to the sparsity of the check-in records, they still remain insufficiently studied. In this article, we propose to utilize personalized latent behavior patterns learned from contextual features, e.g., time of day, day of week, and location category, to improve the effectiveness of the recommendations. Two variations of models are developed, including GPDM, which learns a fixed pattern distribution for all users; and PPDM, which learns personalized pattern distribution for each user. In both models, a soft-max function is applied to integrate the personalized Markov chain with the latent patterns, and a sequential Bayesian Personalized Ranking (S-BPR) is applied as the optimization criterion. Then, Expectation Maximization (EM) is in charge of finding optimized model parameters. Extensive experiments on three large-scale commonly adopted real-world LBSN data sets prove that the inclusion of location category and latent patterns helps to boost the performance of POI recommendations. Specifically, our models in general significantly outperform other state-of-the-art methods for both next and next new POI recommendation tasks. Moreover, our models are capable of making accurate recommendations regardless of the short/long duration or distance.

Journal ArticleDOI
TL;DR: This work proposes a novel solution to query fusion by splitting the computation into two parts: one phase that is carried out offline, to generate pre-computed centroid answers for queries addressing broadly similar information needs, and then a second online phase that uses the corresponding topic centroid to compute a result page for each query.
Abstract: Rank fusion is a powerful technique that allows multiple sources of information to be combined into a single result set. Query variations covering the same information need represent one way in which different sources of information might arise. However, when implemented in the obvious manner, fusion over query variations is not cost-effective, at odds with the usual web-search requirement for strict per-query efficiency guarantees. In this work, we propose a novel solution to query fusion by splitting the computation into two parts: one phase that is carried out offline, to generate pre-computed centroid answers for queries addressing broadly similar information needs, and then a second online phase that uses the corresponding topic centroid to compute a result page for each query. To achieve this, we make use of score-based fusion algorithms whose costs can be amortized via the pre-processing step and that can then be efficiently combined during subsequent per-query re-ranking operations. Experimental results using the ClueWeb12B collection and the UQV100 query variations demonstrate that centroid-based approaches allow improved retrieval effectiveness at little or no loss in query throughput or latency and within reasonable pre-processing requirements. We additionally show that queries that do not match any of the pre-computed clusters can be accurately identified and efficiently processed in our proposed ranking pipeline.
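A minimal sketch of the offline phase under an assumed CombSUM fusion with min-max score normalization (the paper evaluates several score-based fusion algorithms; the document IDs and scores here are made up):

```python
from collections import defaultdict

def combsum_centroid(runs):
    # Fuse the ranked lists of all query variations for one topic into a
    # single centroid answer via CombSUM over min-max normalized scores.
    fused = defaultdict(float)
    for run in runs:                       # run: [(doc_id, score), ...]
        scores = [s for _, s in run]
        lo, hi = min(scores), max(scores)
        for doc, s in run:
            fused[doc] += (s - lo) / (hi - lo + 1e-12)
    return sorted(fused.items(), key=lambda kv: -kv[1])

runs = [[("d1", 9.2), ("d2", 7.1), ("d3", 4.0)],
        [("d2", 8.5), ("d1", 8.0), ("d4", 2.2)]]
print(combsum_centroid(runs))   # the online phase re-ranks this per query
```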

Journal ArticleDOI
TL;DR: The authors propose MAD, a memory-augmented dialogue management model that employs a memory controller and two additional memory structures (i.e., a slot-value memory and an external memory).
Abstract: Dialogue management (DM) is responsible for predicting the next action of a dialogue system according to the current dialogue state and thus plays a central role in task-oriented dialogue systems. Since DM requires having access not only to local utterances but also to the global semantics of the entire dialogue session, modeling the long-range history information is a critical issue. To this end, we propose MAD, a novel memory-augmented dialogue management model that employs a memory controller and two additional memory structures (i.e., a slot-value memory and an external memory). The slot-value memory tracks the dialogue state by memorizing and updating the values of semantic slots (e.g., cuisine, price, and location), and the external memory augments the representation of hidden states of traditional recurrent neural networks by storing more context information. To update the dialogue state efficiently, we also propose slot-level attention on user utterances to extract specific semantic information for each slot. Experiments show that our model can obtain state-of-the-art performance and outperforms existing baselines.

Journal ArticleDOI
TL;DR: This article designs a novel transfer learning solution termed role-based transfer to rank (RoToR), which contains two variants, i.e., integrative RoToR and sequential RoToR; the sequential variant simplifies the integrative one by decomposing it into two dependent phases according to a typical shopping process, and both variants are instantiated using different preference learning paradigms.
Abstract: Heterogeneous one-class collaborative filtering is an emerging and important problem in recommender systems, where two different types of one-class feedback, i.e., purchases and browses, are available as input data. The associated challenges include ambiguity of browses, scarcity of purchases, and heterogeneity arising from different feedback. In this article, we propose to model purchases and browses from a new perspective, i.e., users' roles of mixer, browser, and purchaser. Specifically, we design a novel transfer learning solution termed role-based transfer to rank (RoToR), which contains two variants, i.e., integrative RoToR and sequential RoToR. In integrative RoToR, we leverage browses into the preference learning task of purchases, in which we take each user as a sophisticated customer (i.e., mixer) that is able to take different types of feedback into consideration. In sequential RoToR, we aim to simplify the integrative one by decomposing it into two dependent phases according to a typical shopping process. Furthermore, we instantiate both variants using different preference learning paradigms such as pointwise preference learning and pairwise preference learning. Finally, we conduct extensive empirical studies with various baseline methods on three large public datasets and find that our RoToR performs significantly more accurately than the state-of-the-art methods.

Journal ArticleDOI
TL;DR: This study investigated users' real-time interactions with facets over the course of a search from both data science and human factors perspectives and adopted a Random Forest model to successfully predict facet use using search dynamics variables.
Abstract: Faceted search has become a common feature on most search interfaces in e-commerce websites, digital libraries, government's open information portals, and so on. Beyond the existing studies on developing algorithms for faceted search and empirical studies on facet usage, this study investigated users' real-time interactions with facets over the course of a search from both data science and human factors perspectives. It adopted a Random Forest (RF) model to successfully predict facet use using search dynamics variables. In addition, the RF model provided a ranking of variables by their predictive power, which suggests that the search process follows a rhythmic sequential flow within which facet addition is mostly influenced by its immediately preceding action. In the follow-up user study, we found that participants used facets at critical points from the beginning to the end of search sessions. Participants used facets for distinctive reasons at different stages. They also used facets implicitly without applying the facets to their search. Most participants liked the faceted search, although a few participants were concerned about the choice overload introduced by facets. The results of this research can be used to understand information seekers and to propose or refine a set of practical design guidelines for faceted search.
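A toy scikit-learn sketch of the modeling step (the feature names and data are hypothetical; the study's variable set is richer): a Random Forest predicts whether the next action adds a facet and ranks the variables by predictive power.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical search-dynamics features for eight observed actions.
df = pd.DataFrame({
    "secs_since_last_action": [3, 45, 7, 120, 15, 60, 5, 90],
    "prev_action_was_facet":  [1, 0, 1, 0, 0, 1, 1, 0],
    "queries_so_far":         [1, 3, 2, 5, 2, 4, 1, 6],
    "adds_facet_next":        [1, 0, 1, 0, 0, 1, 1, 0],   # target
})
X, y = df.drop(columns="adds_facet_next"), df["adds_facet_next"]
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# The fitted model also yields the variable ranking by predictive power.
print(sorted(zip(rf.feature_importances_, X.columns), reverse=True))
```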

Journal ArticleDOI
TL;DR: A new summarization approach is proposed that exploits frequent itemsets to describe all of the latent concepts covered by the documents under analysis and LSA to reduce the potentially redundant set of itemsets to a compact set of uncorrelated concepts.
Abstract: Sentence-based summarization aims at extracting concise summaries of collections of textual documents. Summaries consist of a worthwhile subset of document sentences. The most effective multilingual strategies rely on Latent Semantic Analysis (LSA) and on frequent itemset mining, respectively. LSA-based summarizers pick the document sentences that cover the most important concepts. Concepts are modeled as combinations of single-document terms and are derived from a term-by-sentence matrix by exploiting Singular Value Decomposition (SVD). Itemset-based summarizers pick the sentences that contain the largest number of frequent itemsets, which represent combinations of frequently co-occurring terms. The main drawbacks of existing approaches are (i) the inability of LSA to consider the correlation between combinations of multiple-document terms and the underlying concepts, (ii) the inherent redundancy of frequent itemsets because similar itemsets may be related to the same concept, and (iii) the inability of itemset-based summarizers to correlate itemsets with the underlying document concepts. To overcome the issues of both of the abovementioned algorithms, we propose a new summarization approach that exploits frequent itemsets to describe all of the latent concepts covered by the documents under analysis and LSA to reduce the potentially redundant set of itemsets to a compact set of uncorrelated concepts. The summarizer selects the sentences that cover the latent concepts with minimal redundancy. We tested the summarization algorithm on both multilingual and English-language benchmark document collections. The proposed approach performed significantly better than both itemset- and LSA-based summarizers, and better than most of the other state-of-the-art approaches.
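A rough scikit-learn sketch of the pipeline (the sentences, the support threshold, and the restriction to term pairs are illustrative simplifications): mine frequent term pairs, build an itemset-by-sentence matrix, collapse it with SVD into a few latent concepts, and pick the best-covering sentence per concept.

```python
import numpy as np
from itertools import combinations
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD

sentences = ["neural networks achieve strong results on text summarization",
             "text summarization with neural networks requires large corpora",
             "frequent itemsets capture co-occurring terms in documents",
             "latent semantic analysis groups co-occurring terms into concepts",
             "itemsets and latent concepts together give compact summaries"]

vec = CountVectorizer()
X = (vec.fit_transform(sentences).toarray() > 0).astype(float)

# Frequent term pairs (support >= 2) as a crude stand-in for itemset mining.
pairs = [(i, j) for i, j in combinations(range(X.shape[1]), 2)
         if (X[:, i] * X[:, j]).sum() >= 2]

# Itemset-by-sentence matrix, then LSA (SVD) collapses redundant itemsets
# into a compact set of uncorrelated latent concepts.
M = np.array([X[:, i] * X[:, j] for i, j in pairs])
concepts = TruncatedSVD(n_components=2, random_state=0).fit_transform(M.T).T

# Pick, for each latent concept, the sentence that covers it best.
summary = sorted({int(np.abs(row).argmax()) for row in concepts})
print([sentences[k] for k in summary])
```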

Journal ArticleDOI
TL;DR: This work uses the general linear mixed model framework and presents a model that encompasses the experimental factors of system, topic, shard, and their interaction effects and discovers that the topic*shard interaction effect is a large effect almost globally across all datasets.
Abstract: Despite the bulk of research studying how to more accurately compare the performance of IR systems, less attention is devoted to better understanding the different factors that play a role in such performance and how they interact. This is the case of shards, i.e., partitioning a document collection into sub-parts, which are used for many different purposes, ranging from efficiency to selective search or making test collection evaluation more accurate. In all these cases, there is empirical knowledge supporting the importance of shards, but we lack actual models that allow us to measure the impact of shards on system performance and how they interact with topics and systems. We use the general linear mixed model framework and present a model that encompasses the experimental factors of system, topic, shard, and their interaction effects. This detailed model allows us to more accurately estimate differences between the effect of various factors. We study shards created by a range of methods used in prior work and better explain observations noted in prior work in a principled setting and offer new insights. Notably, we discover that the topic*shard interaction effect, in particular, is a large effect almost globally across all datasets, an observation that, to our knowledge, has not been measured before.
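A much-simplified statsmodels sketch of fitting such a model (toy data; shard enters only as a variance component, and the interaction terms of the article's full model are omitted):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
rows = [(s, t, h)
        for s in ["sysA", "sysB", "sysC"]
        for t in ["t1", "t2", "t3", "t4"]
        for h in ["shard1", "shard2"]]
df = pd.DataFrame(rows, columns=["system", "topic", "shard"])
df["ap"] = 0.3 + rng.normal(scale=0.05, size=len(df))   # toy AP scores

# System as a fixed effect, topic as the grouping factor, and shard as a
# variance component: a simplified stand-in for the full mixed model with
# system*topic*shard interaction effects.
md = smf.mixedlm("ap ~ system", df, groups="topic",
                 vc_formula={"shard": "0 + C(shard)"})
print(md.fit().summary())
```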

Journal ArticleDOI
TL;DR: This work proposes a novel and general neural collaborative filtering framework, ConvNCF, which is featured with two designs: (1) applying outer product on user embedding and item embedding to explicitly model the pairwise correlations between embedding dimensions, and (2) employing convolutional neural network above the outer product to learn the high-order correlations amongembedding dimensions.
Abstract: As the core of recommender systems, collaborative filtering (CF) models the affinity between a user and an item from historical user-item interactions, such as clicks, purchases, and so on. Benefiting from the strong representation power, neural networks have recently revolutionized the recommendation research, setting up a new standard for CF. However, existing neural recommender models do not explicitly consider the correlations among embedding dimensions, making them less effective in modeling the interaction function between users and items. In this work, we emphasize modeling the correlations among embedding dimensions in neural networks to pursue higher effectiveness for CF. We propose a novel and general neural collaborative filtering framework, namely ConvNCF, which is featured with two designs: (1) applying outer product on user embedding and item embedding to explicitly model the pairwise correlations between embedding dimensions, and (2) employing convolutional neural network above the outer product to learn the high-order correlations among embedding dimensions. To justify our proposal, we present three instantiations of ConvNCF by using different inputs to represent a user and conduct experiments on two real-world datasets. Extensive results verify the utility of modeling embedding dimension correlations with ConvNCF, which outperforms several competitive CF methods.
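A minimal numpy sketch of the two designs (one hand-rolled conv layer stands in for the CNN stack; sizes and weights are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
d = 8
p_u = rng.normal(scale=0.1, size=d)   # user embedding
q_i = rng.normal(scale=0.1, size=d)   # item embedding

E = np.outer(p_u, q_i)                # d x d interaction map: all pairwise
                                      # correlations between embedding dims

def conv2d_valid(x, k):
    # Plain single-channel valid convolution, standing in for the CNN stack.
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

h = np.maximum(0.0, conv2d_valid(E, rng.normal(scale=0.1, size=(2, 2))))
# ConvNCF stacks such layers until a single value remains; we shortcut here.
print(float(h.sum()))
```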

Journal ArticleDOI
TL;DR: In this paper, the authors present a trie data structure in which each word of an n-gram following a context of fixed length k, that is, its preceding k words, is encoded as an integer whose value is proportional to the number of words that follow such context.
Abstract: Two fundamental problems concern the handling of large n-gram language models: indexing, that is, compressing the n-grams and associated satellite values without compromising their retrieval speed, and estimation, that is, computing the probability distribution of the n-grams extracted from a large textual source. Performing these two tasks efficiently is vital for several applications in the fields of Information Retrieval, Natural Language Processing, and Machine Learning, such as auto-completion in search engines and machine translation. Regarding the problem of indexing, we describe compressed, exact, and lossless data structures that simultaneously achieve high space reductions and no time degradation with respect to the state-of-the-art solutions and related software packages. In particular, we present a compressed trie data structure in which each word of an n-gram following a context of fixed length k, that is, its preceding k words, is encoded as an integer whose value is proportional to the number of words that follow such context. Since the number of words following a given context is typically very small in natural languages, we lower the space of representation to compression levels that were never achieved before, allowing the indexing of billions of strings. Despite the significant savings in space, our technique introduces a negligible penalty at query time. Specifically, the most space-efficient competitors in the literature, which are both quantized and lossy, do not take less space than our trie data structure and are up to 5 times slower. Conversely, our trie is as fast as the fastest competitor but also retains an advantage of up to 65% in absolute space. Regarding the problem of estimation, we present a novel algorithm for estimating modified Kneser-Ney language models, which have emerged as the de-facto choice for language modeling in both academia and industry thanks to their relatively low perplexity performance. Estimating such models from large textual sources poses the challenge of devising algorithms that make a parsimonious use of the disk. The state-of-the-art algorithm uses three sorting steps in external memory: we show an improved construction that requires only one sorting step by exploiting the properties of the extracted n-gram strings. With an extensive experimental analysis performed on billions of n-grams, we show an average improvement of 4.5 times on the total runtime of the previous approach.
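One way to realize the context-based encoding in a few lines (a k = 1 sketch; the actual structure is a compressed trie with sophisticated integer coding): map each word to its first-seen rank among the distinct successors of its context, so codes are bounded by the context's fan-out rather than by the vocabulary size.

```python
from collections import defaultdict

corpus = "the cat sat on the mat the cat ran".split()
successor_ids = defaultdict(dict)          # context -> {word: small integer}
encoded = []
for prev, word in zip(corpus, corpus[1:]):
    ids = successor_ids[prev]
    ids.setdefault(word, len(ids))         # first-seen rank as the code
    encoded.append((prev, ids[word]))

# Successor sets are tiny in natural language, so these integers are far
# smaller than raw vocabulary IDs and compress much better.
print(dict(successor_ids)["the"])   # {'cat': 0, 'mat': 1}
print(encoded)
```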

Journal ArticleDOI
TL;DR: This work proposes a model that performs query expansion on the tweet (query) using two novel approaches, temporal query expansion and visitation query expansion, combines them via a novel fusion framework, and overlays them on a Hidden Markov Model to account for sequential information.
Abstract: In fine-grained tweet geolocation, tweets are linked to the specific venues (e.g., restaurants, shops) from which they were posted. This explicitly recovers the venue context that is essential for applications such as location-based advertising or user profiling. For this geolocation task, we focus on geolocating tweets that are contained in tweet sequences. In a tweet sequence, tweets are posted from some latent venue(s) by the same user and within a short time interval. This scenario arises from two observations: (1) It is quite common that users post multiple tweets in a short time and (2) most tweets are not geocoded. To more accurately geolocate a tweet, we propose a model that performs query expansion on the tweet (query) using two novel approaches. The first approach, temporal query expansion, considers users' staying behavior around venues. The second approach, visitation query expansion, leverages users' revisiting of the same or similar venues in the past. We combine both query expansion approaches via a novel fusion framework and overlay them on a Hidden Markov Model to account for sequential information. In our comprehensive experiments across multiple datasets and metrics, we show our proposed model to be more robust and accurate than other baselines.

Journal ArticleDOI
TL;DR: This article investigates how a learning-to-rank recommender system can best take advantage of implicit feedback signals from multiple channels, and proposes a multi-channel sampling method, which exploits multiple feedback channels during the sampling process of training.
Abstract: User interactions can be considered to constitute different feedback channels, for example, view, click, like or follow, that provide implicit information on users’ preferences. Each implicit feedback channel typically carries a unary, positive-only signal that can be exploited by collaborative filtering models to generate lists of personalized recommendations. This article investigates how a learning-to-rank recommender system can best take advantage of implicit feedback signals from multiple channels. We focus on Factorization Machines (FMs) with Bayesian Personalized Ranking (BPR), a pairwise learning-to-rank method, that allows us to experiment with different forms of exploitation. We perform extensive experiments on three datasets with multiple types of feedback to arrive at a series of insights. We compare conventional, direct integration of feedback types with our proposed method, which exploits multiple feedback channels during the sampling process of training. We refer to our method as multi-channel sampling. Our results show that multi-channel sampling outperforms conventional integration, and that sampling with the relative “level” of feedback is always superior to a level-blind sampling approach. We evaluate our method experimentally on three datasets in different domains and observe that with our multi-channel sampler the accuracy of recommendations can be improved considerably compared to the state-of-the-art models. Further experiments reveal that the appropriate sampling method depends on particular properties of datasets such as popularity skewness.
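A hedged sketch of level-aware sampling (the channel ordering and the weighting-by-level rule are assumptions; the article studies several sampling variants): the positive item is drawn favoring higher-level channels, and the "negative" must sit at a strictly lower feedback level, or be unobserved.

```python
import random

# Channels ordered by assumed "level" of commitment (dataset-specific).
level = {"follow": 3, "like": 2, "click": 1, "view": 0}
user_feedback = {"i1": "follow", "i2": "view", "i3": "click"}
all_items = ["i1", "i2", "i3", "i4", "i5", "i6"]

def sample_bpr_pair(user_feedback, all_items):
    # Positive sampled proportionally to its channel level; the "negative"
    # is any item observed at a strictly lower level (or not at all).
    items, weights = zip(*[(i, 1 + level[c]) for i, c in user_feedback.items()])
    pos = random.choices(items, weights=weights, k=1)[0]
    pos_level = level[user_feedback[pos]]
    pool = [i for i in all_items
            if level.get(user_feedback.get(i), -1) < pos_level]
    return pos, random.choice(pool)

random.seed(0)
print([sample_bpr_pair(user_feedback, all_items) for _ in range(3)])
```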

Journal ArticleDOI
TL;DR: This article proposes a two-tier ensemble learning method for cross-lingual text classification, where all documents, irrespective of language, are classified by the same (second-tier) classifier, and all documents are represented in a common, language-independent feature space consisting of the posterior probabilities generated by first-tier, language-dependent classifiers.
Abstract: Cross-lingual Text Classification (CLC) consists of automatically classifying, according to a common set C of classes, documents each written in one of a set of languages L, and doing so more accurately than when “naively” classifying each document via its corresponding language-specific classifier. To obtain an increase in the classification accuracy for a given language, the system thus needs to also leverage the training examples written in the other languages. We tackle “multilabel” CLC via funnelling, a new ensemble learning method that we propose here. Funnelling consists of generating a two-tier classification system where all documents, irrespective of language, are classified by the same (second-tier) classifier. For this classifier, all documents are represented in a common, language-independent feature space consisting of the posterior probabilities generated by first-tier, language-dependent classifiers. This allows the classification of all test documents, of any language, to benefit from the information present in all training documents, of any language. We present substantial experiments, run on publicly available multilingual text collections, in which funnelling is shown to significantly outperform a number of state-of-the-art baselines. All code and datasets (in vector form) are made publicly available.
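A compact scikit-learn sketch of funnelling (toy bilingual data; funnelling proper trains the second tier on cross-validated posteriors, which is skipped here for brevity): language-dependent first-tier classifiers emit posterior probabilities over the shared label set, and those posteriors form the common space for a single second-tier classifier.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny bilingual corpus (hypothetical); the label set C must be shared so
# that posterior columns align across languages.
docs = {"en": (["cheap flights and hotels", "parliament passed the budget",
                "the striker scored twice"],
               ["travel", "politics", "sport"]),
        "it": (["voli economici e alberghi", "il parlamento approva la legge",
                "doppietta dell'attaccante in trasferta"],
               ["travel", "politics", "sport"])}

first_tier, meta_X, meta_y = {}, [], []
for lang, (texts, labels) in docs.items():
    vec = TfidfVectorizer()
    clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(texts), labels)
    first_tier[lang] = (vec, clf)
    # Posterior probabilities are the common, language-independent space.
    meta_X.extend(clf.predict_proba(vec.transform(texts)))
    meta_y.extend(labels)

second_tier = LogisticRegression(max_iter=1000).fit(np.array(meta_X), meta_y)

vec, clf = first_tier["it"]
post = clf.predict_proba(vec.transform(["nuovo stadio per la squadra"]))
print(second_tier.predict(post))   # every language funnels into one classifier
```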

Journal ArticleDOI
TL;DR: A laboratory study that investigated the effects of three different cognitive abilities (perceptual speed, working memory, and inhibition) in the context of aggregated search found different main and interaction effects.
Abstract: Prior work has studied how different characteristics of individual users (e.g., personality traits and cognitive abilities) can impact search behaviors and outcomes. We report on a laboratory study (N = 32) that investigated the effects of three different cognitive abilities (perceptual speed, working memory, and inhibition) in the context of aggregated search. Aggregated search systems combine results from multiple heterogeneous sources (or verticals) in a unified presentation. Participants in our study interacted with two different aggregated search interfaces (a within-subjects design) that differed based on the extent to which the layout distinguished between results originating from different verticals. The interleaved interface merged results from different verticals in a fairly unconstrained fashion. Conversely, the blocked interface displayed results from the same vertical as a group, displayed each group of vertical results in the same region on the SERP for every query, and used a border around each group of vertical results to help distinguish among results from different sources. We investigated three research questions (RQ1--RQ3). Specifically, we investigated the effects of the interface condition and each cognitive ability on three types of outcomes: (RQ1) participants’ levels of workload, (RQ2) participants’ levels of user engagement, and (RQ3) participants’ search behaviors. Our results found different main and interaction effects. Perceptual speed and inhibition did not significantly affect participants’ workload and user engagement but significantly affected their search behaviors. Specifically, with the interleaved interface, participants with lower perceptual speed had more difficulty finding relevant results on the SERP, and participants with lower inhibitory attention control searched at a slower pace. Working memory did not have a strong effect on participants’ behaviors but had several significant effects on the levels of workload and user engagement reported by participants. Specifically, participants with lower working memory reported higher levels of workload and lower levels of user engagement. We discuss implications of our results for designing aggregated search interfaces that are well suited for users with different cognitive abilities.

Journal ArticleDOI
TL;DR: In this article, the authors propose to evaluate FS for L2R with an additional objective in mind, namely risk-sensitiveness, and present novel single and multi-objective criteria to optimize feature reduction, effectiveness, and risk-sensitiveness.
Abstract: Learning to Rank (L2R) is one of the main research lines in Information Retrieval. Risk-sensitive L2R is a sub-area of L2R that tries to learn models that are good on average while at the same time reducing the risk of performing poorly in a few but important queries (e.g., medical or legal queries). One way of reducing risk in learned models is by selecting and removing noisy, redundant features, or features that promote some queries to the detriment of others. This is exacerbated by learning methods that usually maximize an average metric (e.g., mean average precision (MAP) or Normalized Discounted Cumulative Gain (NDCG)). However, historically, feature selection (FS) methods have focused only on effectiveness and feature reduction as the main objectives. Accordingly, in this work, we propose to evaluate FS for L2R with an additional objective in mind, namely risk-sensitiveness. We present novel single and multi-objective criteria to optimize feature reduction, effectiveness, and risk-sensitiveness, all at the same time. We also introduce a new methodology to explore the search space, suggesting effective and efficient extensions of a well-known Evolutionary Algorithm (SPEA2) for FS applied to L2R. Our experiments show that explicitly including risk as an objective criterion is crucial to achieving a more effective and risk-sensitive performance. We also provide a thorough analysis of our methodology and experimental results.

Journal ArticleDOI
TL;DR: Modern web search engines use many signals to select and rank results in response to queries, but searchers' mental models of search are relatively unsophisticated, hindering their ability to use search engines efficiently and effectively.
Abstract: Modern web search engines use many signals to select and rank results in response to queries. However, searchers’ mental models of search are relatively unsophisticated, hindering their ability to use search engines efficiently and effectively. Annotating results with more in-depth explanations could help, but search engine providers need to know what to explain. To this end, we report on a study of searchers’ mental models of web selection and ranking, with more than 400 respondents to an online survey and 11 face-to-face interviews. Participants volunteered a range of factors and showed good understanding of important concepts such as popularity, wording, and personalization. However, they showed little understanding of recency or diversity and incorrect ideas of payment for ranking. Where there are already explanatory annotations on the results page—such as “ad” markers and keyword highlighting—participants were familiar with ranking concepts. This suggests that further explanatory annotations may be useful.

Journal ArticleDOI
Zijie Zeng, Jing Lin, Lin Li, Weike Pan, Zhong Ming
TL;DR: A simple yet effective similarity measurement called bidirectional item similarity (BIS) that is able to capture sequential patterns even from noisy data and a compound weighting function that leverages the complementarity between two well-known time-aware weighting functions.
Abstract: Exploiting temporal effect has empirically been recognized as a promising way to improve recommendation performance in recent years. In real-world applications, one-class data in the form of (user, item, timestamp) are usually more accessible and abundant than numerical ratings. In this article, we focus on exploiting such one-class data in order to provide personalized next-item recommendation services. Specifically, we base our work on the framework of time-aware item-based collaborative filtering and propose a simple yet effective similarity measurement called bidirectional item similarity (BIS) that is able to capture sequential patterns even from noisy data. Furthermore, we extend BIS via some factorization techniques and obtain an adaptive version, i.e., adaptive BIS (ABIS), in order to better fit the behavioral data. We also design a compound weighting function that leverages the complementarity between two well-known time-aware weighting functions. With the proposed similarity measurements and weighting function, we obtain two novel collaborative filtering methods that are able to achieve significantly better performance than the state-of-the-art methods, showcasing their effectiveness for next-item recommendation.
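A hedged sketch of a bidirectional, time-decayed similarity in the spirit of BIS (the symmetric decay with a single lambda is a simplification; the published measure treats the two temporal directions separately):

```python
from collections import defaultdict

# One-class (user, item, timestamp) records, as in the article's setting.
events = [("u1", "a", 1), ("u1", "b", 5), ("u2", "a", 2), ("u2", "b", 3),
          ("u3", "b", 1), ("u3", "a", 4)]

def bis(events, i, j, lam=0.9):
    # Co-consuming users contribute lam**gap whether they consumed i before
    # j or after it, so sequential patterns survive noisy orderings.
    seen = defaultdict(dict)
    for u, item, t in events:
        seen[u][item] = t
    score, n = 0.0, 0
    for items in seen.values():
        if i in items and j in items:
            score += lam ** abs(items[i] - items[j])
            n += 1
    return score / n if n else 0.0

print(round(bis(events, "a", "b"), 4))
```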

Journal ArticleDOI
TL;DR: To tackle the unsupervised learning limitation, a Bayesian probabilistic graphical model is designed to capture the IPC structure for recommendation and both rating behaviors and social connections are introduced into the model by parameter sharing.
Abstract: Recommender systems aim to capture user preferences and provide accurate recommendations to users accordingly. For each user, there usually exist others with similar preferences, and a collection of users may also have similar preferences with each other, thus forming a community. However, such communities may not necessarily be explicitly given, and the users inside the same communities may not know each other; they are formally defined and named Implicit Preference Communities (IPCs) in this article. By enriching user preferences with the information of other users in the communities, the performance of recommender systems can also be enhanced. Historical explicit ratings are a good resource to construct the IPCs of users but is usually sparse. Meanwhile, user preferences are easily affected by their social connections, which can be jointly used for IPC modeling with the ratings. However, this imposes two challenges for model design. First, the rating and social domains are heterogeneous; thus, it is challenging to coordinate social information and rating behaviors for a same learning task. Therefore, transfer learning is a good strategy for IPC modeling. Second, the communities are not explicitly labeled, and existing supervised learning approaches do not fit the requirement of IPC modeling. As co-clustering is an effective unsupervised learning approach for discovering block structures in high-dimensional data, it is a cornerstone for discovering the structure of IPCs. In this article, we propose a recommendation model with Implicit Preference Communities from user ratings and social connections. To tackle the unsupervised learning limitation, we design a Bayesian probabilistic graphical model to capture the IPC structure for recommendation. Meanwhile, following the spirit of transfer learning, both rating behaviors and social connections are introduced into the model by parameter sharing. Moreover, Gibbs sampling-based algorithms are proposed for parameter inferences of the models. Furthermore, to meet the need for online scenarios when the data arrive sequentially as a stream, a novel online sampling-based parameter inference algorithm for recommendation is proposed. To the best of our knowledge, this is the first attempt to propose and formally define the concept of IPC.

Journal ArticleDOI
Yunqiu Shao, Yiqun Liu, Fan Zhang, Min Zhang, Shaoping Ma
TL;DR: This work is the first to systematically study how diverse factors in data annotation impact image search evaluation, and suggests different strategies for exploiting crowdsourcing to get data annotated under different conditions.
Abstract: Image search engines differ significantly from general web search engines in the way they present search results. The difference leads to different interaction and examination behavior patterns, and therefore requires changes in evaluation methodologies. However, evaluation of image search still utilizes the methods for general web search. In particular, offline metrics are calculated based on coarse-fine topical relevance judgments with the assumption that users examine results in a sequential manner. In this article, we investigate annotation methods via crowdsourcing for image search evaluation based on a lab-based user study. Using user satisfaction as the gold standard, we make several interesting findings. First, instead of item-based annotation, annotating relevance in a row-based way is more efficient without hurting performance. Second, besides topical relevance, image quality plays a crucial role when evaluating the image search results, and the importance of image quality changes with search intent. Third, the fine-grained annotation method performs significantly better than traditional four-level scales. To the best of our knowledge, our work is the first to systematically study how diverse factors in data annotation impact image search evaluation. Our results suggest different strategies for exploiting crowdsourcing to get data annotated under different conditions.