Author

Ke Ji

Other affiliations: Beijing Jiaotong University

Bio: Ke Ji is an academic researcher at the University of Jinan. The author has contributed to research on collaborative filtering and recommender systems. The author has an h-index of 8 and has co-authored 29 publications receiving 197 citations. Previous affiliations of Ke Ji include Beijing Jiaotong University.

Papers

Journal Article•DOI•

Addressing cold-start

[...]

Ke Ji¹, Hong Shen²•Institutions (2)

Beijing Jiaotong University¹, University of Adelaide²

01 Jul 2015-Knowledge Based Systems

TL;DR: A novel method for alleviating the cold-start problem for new users and new items by incorporating content-based information about users and items, i.e., tags and keywords; it not only outperforms other state-of-the-art CF algorithms on historical data, but also scales well to new data.

Abstract: The cold-start problem for new users and new items is a major challenge facing most collaborative filtering systems. Existing collaborative filtering (CF) methods emphasize scaling to large, sparse datasets but lack a scalable approach to handling new data. In this paper, we consider a novel method for alleviating the problem by incorporating content-based information about users and items, i.e., tags and keywords. The user-item ratings imply the relevance of users' tags to items' keywords, so we convert the direct prediction on the user-item rating matrix into an indirect prediction on the tag-keyword relation matrix, which adapts to the emergence of new data. We first propose a novel neighborhood approach for building the tag-keyword relation matrix based on the statistics of tag-keyword pairs in the ratings. Then, with the relation matrix, we propose a 3-factor matrix factorization model over the rating matrix, for learning every user's interest vector for selected tags and every item's correlation vector for extracted keywords. Finally, we integrate the relation matrix with the two kinds of vectors to make recommendations. Experiments on a real dataset demonstrate that our method not only outperforms other state-of-the-art CF algorithms on historical data, but also scales well to new data.
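
The three-factor idea in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the relation statistic used here (the mean rating per tag-keyword pair), the function names `build_relation_matrix` and `predict`, and the toy data are all assumptions made for illustration, and the user and item vectors are supplied by hand rather than learned by matrix factorization.

```python
import numpy as np

def build_relation_matrix(ratings, user_tags, item_keywords, n_tags, n_keywords):
    """Mean rating observed for each (tag, keyword) pair (an assumed statistic)."""
    total = np.zeros((n_tags, n_keywords))
    count = np.zeros((n_tags, n_keywords))
    for user, item, rating in ratings:
        for tag in user_tags[user]:
            for keyword in item_keywords[item]:
                total[tag, keyword] += rating
                count[tag, keyword] += 1
    # Pairs that never co-occur in a rating keep a relation value of 0.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

def predict(user_vec, relation, item_vec):
    """Three-factor score: user tag interests x relation matrix x item keyword weights."""
    return float(user_vec @ relation @ item_vec)

# Toy data (hypothetical): (user, item, rating) triples, tag ids per user,
# keyword ids per item.
ratings = [(0, 0, 5.0), (1, 0, 3.0)]
user_tags = {0: [0], 1: [0, 1]}
item_keywords = {0: [1]}
relation = build_relation_matrix(ratings, user_tags, item_keywords, 2, 2)

# A brand-new user who only has tag 1, scored against an item with keyword 1.
new_user = np.array([0.0, 1.0])
new_item = np.array([0.0, 1.0])
score = predict(new_user, relation, new_item)
```

Because the score depends only on tags and keywords, a user or item with no rating history can still be scored, which is the cold-start property the abstract describes.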

45 citations

Journal Article•DOI•

Deep and broad URL feature mining for android malware detection

Shanshan Wang¹, Zhenxiang Chen¹, Qiben Yan², Ke Ji¹, Lizhi Peng¹, Bo Yang¹, Mauro Conti³•Institutions (3)

University of Jinan¹, Michigan State University², University of Padua³

01 Mar 2020-Information Sciences

TL;DR: A malware detection method that uses the URLs visited by apps to identify malware; it can not only effectively detect malware discovered in different months of a given year, but also detect potentially malicious apps in third-party app markets.

38 citations

Journal Article•DOI•

Improving matrix approximation for recommendation via a clustering-based reconstructive method

Ke Ji¹, Runyuan Sun¹, Xiang Li², Wenhao Shu³•Institutions (3)

University of Jinan¹, Renmin University of China², East China Jiaotong University³

15 Jan 2016-Neurocomputing

TL;DR: A reconstructive method that compresses a low-rank approximation into a cluster-level rating pattern, referred to as a codebook, and then constructs an improved approximation by expanding the codebook; it improves the prediction accuracy of state-of-the-art matrix factorization and social recommendation models.

37 citations

Proceedings Article•DOI•

Deep and Broad Learning Based Detection of Android Malware via Network Traffic

Shanshan Wang¹, Zhenxiang Chen¹, Qiben Yan², Ke Ji¹, Lin Wang¹, Bo Yang¹, Mauro Conti³•Institutions (3)

University of Jinan¹, University of Nebraska–Lincoln², University of Padua³

04 Jun 2018

TL;DR: A method that uses the URLs visited by applications to identify malicious apps and can not only effectively detect malware discovered in different months of a certain year, but also detect potentially malicious apps in the third-party app market.

Abstract: In recent years, the scale and diversity of malicious software on mobile networks are constantly increasing, thereby causing considerable danger to users' property and personal privacy. In this study, we devise a method that uses the URLs visited by applications to identify malicious apps. A multi-view neural network is used to create a malware detection model that emphasizes depth and width. This neural network can create multiple views of the input automatically and distribute soft attention weights to focus on different features of input. Multiple views preserve rich semantic information from input for classification without requiring complicated feature engineering. In addition, we conduct comprehensive experiments to compare the proposed method with others and verify the validity of the detection model. The experimental results show that our method has a certain timeliness. It can not only effectively detect malware discovered in different months of a certain year, but also detect potentially malicious apps in the third-party app market. We also compare the detection results of the proposed method on wild apps with 10 popular anti-virus scanners, and the final result shows that our approach ranks second in terms of detection performance.

23 citations

Journal Article•DOI•

GIST: A generative model with individual and subgroup-based topics for group recommendation

Ke Ji¹, Zhenxiang Chen¹, Runyuan Sun¹, Kun Ma¹, Zhongjie Yuan², Guandong Xu³•Institutions (3)

University of Jinan¹, Shandong Normal University², University of Technology, Sydney³

15 Mar 2018-Expert Systems With Applications

TL;DR: A topic-based probabilistic model named GIST is proposed to infer group activities and make group recommendations; experiments show that GIST significantly improves recommendation accuracy compared with state-of-the-art methods.

Abstract: In this paper, a topic-based probabilistic model named GIST is proposed to infer group activities and make group recommendations. Compared with existing individual-based aggregation methods, it considers not only individual members’ interests but also the interests of some subgroups. Intuitively, when a group of users wants to take part in an activity, not every group member is decisive; instead, subgroups of members with close relationships are more likely to drive the final activity decision. That motivates our study on jointly considering individual members’ choices and subgroups’ choices for group recommendations. Based on this, our model uses two kinds of unshared topics to model individual members’ interests and subgroups’ interests separately, and then makes final recommendations according to the choices from the two aspects with a weight-based scheme. Moreover, the link information in the graph topology of the groups can be used to optimize the weights of our model. The experimental results on real-life data show that GIST significantly improves recommendation accuracy compared with state-of-the-art methods.
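
The weight-based combination mentioned in this abstract might look roughly like the sketch below. It is an illustrative assumption rather than the GIST model itself: a plain convex mix of two score tables stands in for the topic-model probabilities, and the names `blend_scores` and `recommend`, the single scalar `weight`, and the toy scores are all hypothetical.

```python
def blend_scores(individual, subgroup, weight):
    """Convex mix of per-activity scores; `weight` is the share given to individuals."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    activities = set(individual) | set(subgroup)
    return {
        activity: weight * individual.get(activity, 0.0)
        + (1.0 - weight) * subgroup.get(activity, 0.0)
        for activity in activities
    }

def recommend(individual, subgroup, weight, top_n=1):
    """Return the top_n activities by blended score (ties broken alphabetically)."""
    scores = blend_scores(individual, subgroup, weight)
    return sorted(scores, key=lambda activity: (-scores[activity], activity))[:top_n]

# Hypothetical scores aggregated from individual members and from subgroups.
individual = {"hiking": 0.6, "cinema": 0.4}
subgroup = {"hiking": 0.2, "cinema": 0.8}
balanced = recommend(individual, subgroup, weight=0.5)
individual_only = recommend(individual, subgroup, weight=1.0)
```

With equal weighting the subgroup preference for the cinema wins, while `weight=1.0` falls back to the individual-only choice; the abstract suggests the real model tunes such weights using the link structure of the group.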

23 citations

Cited by

Journal Article•DOI•

Social Media and Fake News in the 2016 Election

Hunt Allcott¹, Matthew Gentzkow²•Institutions (2)

New York University¹, Stanford University²

19 Jan 2017-Journal of Economic Perspectives

TL;DR: The authors found that people are much more likely to believe stories that favor their preferred candidate, especially if they have ideologically segregated social media networks, and that the average American adult saw on the order of one or perhaps several fake news stories in the months around the 2016 U.S. presidential election, with just over half of those who recalled seeing them believing them.

Abstract: Following the 2016 U.S. presidential election, many have expressed concern about the effects of false stories (“fake news”), circulated largely through social media. We discuss the economics of fake news and present new data on its consumption prior to the election. Drawing on web browsing data, archives of fact-checking websites, and results from a new online survey, we find: (i) social media was an important but not dominant source of election news, with 14 percent of Americans calling social media their “most important” source; (ii) of the known false news stories that appeared in the three months before the election, those favoring Trump were shared a total of 30 million times on Facebook, while those favoring Clinton were shared 8 million times; (iii) the average American adult saw on the order of one or perhaps several fake news stories in the months around the election, with just over half of those who recalled seeing them believing them; and (iv) people are much more likely to believe stories that favor their preferred candidate, especially if they have ideologically segregated social media networks.

3,959 citations

Journal Article•DOI•

CatBoost for big data: an interdisciplinary review

John Hancock¹, Taghi M. Khoshgoftaar¹•Institutions (1)

Florida Atlantic University¹

19 Aug 2020-Journal of Big Data

TL;DR: This survey takes an interdisciplinary approach to cover studies related to CatBoost in a single work, giving researchers an in-depth understanding that helps clarify the proper application of CatBoost in solving problems.

Abstract: Gradient Boosted Decision Trees (GBDT’s) are a powerful tool for classification and regression tasks in Big Data. Researchers should be familiar with the strengths and weaknesses of current implementations of GBDT’s in order to use them effectively and make successful contributions. CatBoost is a member of the family of GBDT machine learning ensemble techniques. Since its debut in late 2018, researchers have successfully used CatBoost for machine learning studies involving Big Data. We take this opportunity to review recent research on CatBoost as it relates to Big Data, and learn best practices from studies that cast CatBoost in a positive light, as well as studies where CatBoost does not outshine other techniques, since we can learn lessons from both types of scenarios. Furthermore, as a Decision Tree based algorithm, CatBoost is well-suited to machine learning tasks involving categorical, heterogeneous data. Recent work across multiple disciplines illustrates CatBoost’s effectiveness and shortcomings in classification and regression tasks. Another important issue we expose in literature on CatBoost is its sensitivity to hyper-parameters and the importance of hyper-parameter tuning. One contribution we make is to take an interdisciplinary approach to cover studies related to CatBoost in a single work. This provides researchers an in-depth understanding to help clarify proper application of CatBoost in solving problems. To the best of our knowledge, this is the first survey that studies all works related to CatBoost in a single publication.

247 citations

Journal Article•DOI•

A Survey of Collaborative Filtering-Based Recommender Systems: From Traditional Methods to Hybrid Methods Based on Social Networks

Rui Chen¹, Qingyi Hua¹, Yan-Shuo Chang, Bo Wang¹, Lei Zhang², Xiangjie Kong³•Institutions (3)

Northwest University (China)¹, Yuncheng University², Dalian University of Technology³

24 Oct 2018-IEEE Access

TL;DR: The recent hybrid CF-based recommendation techniques fusing social networks to solve data sparsity and high dimensionality are introduced and provide a novel point of view to improve the performance of RS, thereby presenting a useful resource in the state-of-the-art research result for future researchers.

Abstract: In the era of big data, recommender system (RS) has become an effective information filtering tool that alleviates information overload for Web users. Collaborative filtering (CF), as one of the most successful recommendation techniques, has been widely studied by various research institutions and industries and has been applied in practice. CF makes recommendations for the current active user using lots of users’ historical rating information without analyzing the content of the information resource. However, in recent years, data sparsity and high dimensionality brought by big data have negatively affected the efficiency of the traditional CF-based recommendation approaches. In CF, the context information, such as time information and trust relationships among the friends, is introduced into RS to construct a training model to further improve the recommendation accuracy and user’s satisfaction, and therefore, a variety of hybrid CF-based recommendation algorithms have emerged. In this paper, we mainly review and summarize the traditional CF-based approaches and techniques used in RS and study some recent hybrid CF-based recommendation approaches and techniques, including the latest hybrid memory-based and model-based CF recommendation algorithms. Finally, we discuss the potential impact that may improve the RS and future direction. In this paper, we aim at introducing the recent hybrid CF-based recommendation techniques fusing social networks to solve data sparsity and high dimensionality and provide a novel point of view to improve the performance of RS, thereby presenting a useful resource in the state-of-the-art research result for future researchers.

177 citations

Journal Article•DOI•

Characterizing context-aware recommender systems

Norha M. Villegas¹, Cristian Sánchez¹, Javier Díaz-Cely¹, Gabriel Tamura¹•Institutions (1)

ICESI University¹

15 Jan 2018-Knowledge Based Systems

TL;DR: A framework that characterizes context-aware recommendation processes in terms of the recommendation techniques used at every stage of the process and the techniques used to incorporate context, providing a clear understanding of how context is integrated into recommender systems.

Abstract: Context-aware recommender systems leverage the value of recommendations by exploiting context information that affects user preferences and situations, with the goal of recommending items that are really relevant to changing user needs. Despite the importance of context-awareness in the recommender systems realm, researchers and practitioners lack guides that help them understand the state of the art and how to exploit context information to smarten up recommender systems. This paper presents the results of a comprehensive systematic literature review we conducted to survey context-aware recommenders and their mechanisms to exploit context information. The main contribution of this paper is a framework that characterizes context-aware recommendation processes in terms of: i) the recommendation techniques used at every stage of the process, ii) the techniques used to incorporate context, and iii) the stages of the process where context is integrated into the system. This systematic literature review provides a clear understanding about the integration of context into recommender systems, including context types more frequently used in the different application domains and validation mechanisms explained in terms of the used datasets, properties, metrics, and evaluation protocols. The paper concludes with a set of research opportunities in this field.

152 citations

Journal Article•DOI•

A Survey of Android Malware Detection with Deep Neural Models

Junyang Qiu¹, Jun Zhang², Wei Luo¹, Lei Pan¹, Surya Nepal, Yang Xiang²•Institutions (2)

Deakin University¹, Swinburne University of Technology²

06 Dec 2020-ACM Computing Surveys

TL;DR: This survey aims to address the challenges in DL-based Android malware detection and classification by systematically reviewing the latest progress and organizing the literature according to DL architecture, including FCN, CNN, RNN, DBN, AE, and hybrid models.

Abstract: Deep Learning (DL) is a disruptive technology that has changed the landscape of cyber security research. Deep learning models have many advantages over traditional Machine Learning (ML) models, particularly when there is a large amount of data available. Android malware detection or classification qualifies as a big data problem because of the fast booming number of Android malware, the obfuscation of Android malware, and the potential protection of huge values of data assets stored on the Android devices. It seems a natural choice to apply DL on Android malware detection. However, there exist challenges for researchers and practitioners, such as choice of DL architecture, feature extraction and processing, performance evaluation, and even gathering adequate data of high quality. In this survey, we aim to address the challenges by systematically reviewing the latest progress in DL-based Android malware detection and classification. We organize the literature according to the DL architecture, including FCN, CNN, RNN, DBN, AE, and hybrid models. The goal is to reveal the research frontier, with the focus on representing code semantics for Android malware detection. We also discuss the challenges in this emerging field and provide our view of future research opportunities and directions.

151 citations
