
Showing papers by "Magdalini Eirinaki" published in 2019


Proceedings Article
10 Jun 2019
TL;DR: The main focus is to perform an in-depth analysis of the major types of crimes that occurred in the city, observe the trend over the years, and determine how various attributes contribute to specific crimes.
Abstract: Crime has been prevalent in our society for a very long time and it continues to be so even today. Currently, many cities have released crime-related data as part of an open data initiative. Using this as input, we can apply analytics to predict and hopefully prevent crime in the future. In this work, we applied big data analytics to the San Francisco crime dataset, as collected by the San Francisco Police Department and available through the Open Data initiative. The main focus is to perform an in-depth analysis of the major types of crimes that occurred in the city, observe the trend over the years, and determine how various attributes contribute to specific crimes. Furthermore, we leverage the results of the exploratory data analysis to inform the data preprocessing process, prior to training various machine learning models for crime type prediction. More specifically, the model predicts the type of crime that will occur in each district of the city. We observe that the provided dataset is highly imbalanced, thus metrics used in previous research focus mainly on the majority class, disregarding the performance of the classifiers in minority classes, and propose a methodology to address this issue. The proposed model finds applications in resource allocation of law enforcement in a Smart City.
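The imbalance concern raised in this abstract can be illustrated with a minimal sketch. It does not reproduce the paper's methodology, which the abstract does not spell out; it only shows why plain accuracy can hide poor minority-class performance while a macro-averaged metric exposes it. The class labels and counts are hypothetical toy values.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly (dominated by the majority class)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall computed separately for each class that appears in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Hypothetical toy data: 90 majority-class samples, 10 minority-class samples.
y_true = ["theft"] * 90 + ["arson"] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = ["theft"] * 100

acc = accuracy(y_true, y_pred)      # high, despite never predicting "arson"
mac = macro_recall(y_true, y_pred)  # penalized for the missed minority class
print(acc, mac)
```

Reporting a macro-averaged or per-class metric alongside accuracy is one common way to make minority-class behavior visible.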

14 citations


Proceedings Article
13 May 2019
TL;DR: This work creates neighborhoods of influence leveraging only the social graph structure that are introduced in the recommendation process both as a pre-processing step and as a social regularization factor of the matrix factorization algorithm.
Abstract: Social recommendations have been a very intriguing domain for researchers in the past decade. The main premise is that the social network of a user can be leveraged to enhance the rating-based recommendation process. This has been achieved in various ways, and under different assumptions about the network characteristics, structure, and availability of other information (such as trust, content, etc.). In this work, we create neighborhoods of influence leveraging only the social graph structure. These are in turn introduced in the recommendation process both as a pre-processing step and as a social regularization factor of the matrix factorization algorithm. Our experimental evaluation using real-life datasets demonstrates the effectiveness of the proposed technique.
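For readers unfamiliar with social regularization, one commonly used form of a socially regularized matrix factorization objective is sketched below. This is a generic formulation and not necessarily the one used in the paper. All symbols are assumptions introduced here: r_{ui} is an observed rating, p_u and q_i are user and item latent vectors, K is the set of observed ratings, N(u) stands in for the neighborhood of influence of user u, s_{uf} is a weight on the tie between u and f, and lambda and beta are regularization strengths.

```latex
\min_{P,Q}\;
  \sum_{(u,i)\in\mathcal{K}} \bigl(r_{ui} - p_u^{\top} q_i\bigr)^2
  + \lambda \bigl(\lVert P \rVert_F^2 + \lVert Q \rVert_F^2\bigr)
  + \beta \sum_{u} \sum_{f \in \mathcal{N}(u)} s_{uf}\, \lVert p_u - p_f \rVert^2
```

The last term pulls each user's latent vector toward those of the users in that user's neighborhood, which is how the graph structure can influence the learned factors.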

8 citations


Proceedings Article
01 Sep 2019
TL;DR: This work proposes ViSeR, a visual search engine architecture using deep learning and machine learning techniques, with a proof-of-concept implementation focusing on the fashion e-retail industry, and presents in detail the experimental results with different deep and machine learning algorithms.
Abstract: Web search engines play a significant role in many applications in our everyday lives. Among them, online shopping is one major technological advancement which has made our life easier and more comfortable. Currently, most e-commerce websites support either text-based or voice-based search. The problem with text and voice-based approaches is that they need an appropriate item name or description for the search results to be accurate. Also, with the huge variety of items available online, it is not easy to find the desired object in the top results. Lately, some top e-commerce websites started supporting visual search, where the user can submit an image of an item they'd like to find. However, this domain is still in its infancy. In this work, we propose ViSeR, a visual search engine architecture using deep learning and machine learning techniques, with a proof-of-concept implementation focusing on the fashion e-retail industry. ViSeR first classifies the query image to the right category using image classification. Then all the images in that category are ranked based on their similarity and the top images are retrieved as recommendations. We present in detail the experimental results with different deep and machine learning algorithms and provide additional details with regard to deploying this model to achieve high accuracy and low latency (in terms of training and recommendation time).
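The two-stage flow described in this abstract (classify the query, then rank items within the predicted category by similarity) can be sketched as follows. This is an illustrative toy, not ViSeR itself: the feature vectors, the catalog, the nearest-centroid stand-in for the image classifier, and the use of cosine similarity are all assumptions made for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def classify(query, centroids):
    """Stand-in classifier (assumption): pick the category with the closest centroid."""
    return max(centroids, key=lambda c: cosine(query, centroids[c]))

def visual_search(query, catalog, centroids, k=2):
    """Stage 1: predict the category. Stage 2: rank only that category's items."""
    category = classify(query, centroids)
    candidates = [(item, vec) for item, (cat, vec) in catalog.items() if cat == category]
    ranked = sorted(candidates, key=lambda iv: cosine(query, iv[1]), reverse=True)
    return category, [item for item, _ in ranked[:k]]

# Hypothetical catalog: item id -> (category, image feature vector).
catalog = {
    "shirt_1": ("shirt", [0.9, 0.1, 0.0]),
    "shirt_2": ("shirt", [0.7, 0.3, 0.1]),
    "shirt_3": ("shirt", [0.5, 0.5, 0.0]),
    "shoe_1": ("shoe", [0.0, 0.2, 0.9]),
}
centroids = {"shirt": [0.7, 0.3, 0.03], "shoe": [0.0, 0.2, 0.9]}

category, top = visual_search([1.0, 0.1, 0.0], catalog, centroids)
print(category, top)
```

Restricting the similarity ranking to one category keeps the number of comparisons small, which is one plausible reason such a design helps with recommendation latency.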

3 citations


Proceedings Article
01 Sep 2019
TL;DR: A graph-based recommender system that does not rely on explicit item ratings to generate recommendations and employs neighborhood-based and graph mining techniques to generate item profiles using their reviews' text is presented.
Abstract: In this paper we present a graph-based recommender system that does not rely on explicit item ratings to generate recommendations. Instead, it employs neighborhood-based and graph mining techniques to generate item profiles using their reviews' text. The proposed algorithm uses user review feedback to find products related to each other and tries to find a balance between similar products and highly popular products. It achieves this balance by ranking the products based on the similarity to the target product as well as its connectivity among similar products. The algorithm breaks the entire dataset into subgroups of similar products, which makes the proposed algorithm scalable as well. We present a proof-of-concept implementation of the proposed algorithm in the food product domain and present some preliminary results.
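The idea of balancing similarity to the target product against connectivity among similar products can be sketched roughly as below. The abstract does not give the actual scoring rule, so everything specific here is an assumption: item profiles are modeled as sets of review terms, similarity is Jaccard overlap, and the final score is a weighted sum controlled by a hypothetical parameter alpha.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of review terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, profiles, threshold=0.2, alpha=0.5):
    """Rank items by a mix of similarity to the target and connectivity in its subgroup."""
    sims = {i: jaccard(profiles[target], terms)
            for i, terms in profiles.items() if i != target}
    # Subgroup of products similar enough to the target.
    group = [i for i, s in sims.items() if s >= threshold]

    def connectivity(i):
        # Fraction of the other subgroup members this item is linked to.
        others = [j for j in group if j != i]
        if not others:
            return 0.0
        links = sum(jaccard(profiles[i], profiles[j]) >= threshold for j in others)
        return links / len(others)

    scores = {i: alpha * sims[i] + (1 - alpha) * connectivity(i) for i in group}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical item profiles built from review text.
profiles = {
    "granola": {"oats", "honey", "crunchy", "breakfast"},
    "muesli": {"oats", "breakfast", "fruit", "crunchy"},
    "oat_bar": {"oats", "honey", "snack", "crunchy"},
    "cereal": {"breakfast", "sweet", "crunchy", "fruit"},
    "soda": {"sweet", "fizzy", "drink"},
}

ranking = recommend("granola", profiles)
print(ranking)
```

In this toy data, "muesli" and "oat_bar" are equally similar to the target, and the connectivity term is what separates them.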

2 citations


Proceedings Article
22 Jul 2019
TL;DR: A local influence model is introduced, which relies on the formation of local user networks based on common interests, and results show a promising improvement on the similarity between a target user and the users recommended based on the users selected to influence the recommendations for that target user.
Abstract: The quality of recommendations on social networks is a combination of the richness of the available information and the ability of algorithms and architectures to take advantage of this information in favor of the users. Recommendation algorithms have to address several problems, such as information sparsity, scalability of algorithms, concept drift, etc. In this dynamic and complex environment, it is important to provide solutions that enrich information when it is necessary to fill the gaps and at the same time to scale solutions so that they can handle the ever-increasing data sizes and flows. In this work, we extend our previous work on recommender systems for social networks that studied global influence and trust metrics and their applications. More specifically, we introduce a local influence model, which relies on the formation of local user networks based on common interests, and study the performance of the new model, both stand-alone and in combination with the global one. Results show a promising improvement on the similarity between a target user and the users recommended based on the users selected to influence the recommendations for that target user.

1 citation



Proceedings Article
01 Dec 2019
TL;DR: This work presents and evaluates a machine learning framework that takes as input a domain name and outputs the content category it belongs to and proposes a SERP (Search Engine Response Pages)-mining approach to collect and label an appropriate dataset.
Abstract: DNS request classification is an area that has received a lot of attention, mostly as part of the network security process, in order to classify requests into malicious and non-malicious. However, there exist several categories of web pages that, even though not malicious, belong to “borderline” categories and need to be monitored. For instance, websites selling illegal substances or weapons might be of interest for any public or private organization to monitor as outgoing traffic. In this work, we treat this as a topic classification problem. We present and evaluate a machine learning framework that takes as input a domain name (based on the respective DNS request) and outputs the content category it belongs to. We evaluate several options for feature engineering and classification to find the most appropriate setup for the specific problem domain. We also address the problem of data collection and preprocessing. While there exist several labelled datasets with malicious/non-malicious requests, a similar labelled dataset does not exist for general web content categories. We therefore propose a SERP (Search Engine Response Pages)-mining approach to collect and label an appropriate dataset. Our experimental evaluation uncovers several interesting insights and forms the basis for further work into this interesting domain.

1 citation