Book Chapter

Optimal Placement of Taxis in a City Using Dominating Set Problem

29 Jan 2021, pp. 111-124
TL;DR: In this article, the authors propose a dominating set based solution for finding local hotspots that together cover the whole city area, helping drivers find the next nearby customer in whichever region they drop off their last customer.
Abstract: Mobile application based ride-hailing systems, e.g., DiDi and Uber, have become part of day-to-day life and natural choices of transport for urban commuters. However, the pick-up demand in an area does not always match the supply, i.e., the drop-off requests, in the same area. Urban planners and researchers are working hard to balance this demand and supply situation for taxi requests. Existing approaches have mainly focused on clustering of spatial regions to identify hotspots, which refer to locations with a high demand for pick-up requests. In our study, we determined that if the hotspots are derived from clustering of high pick-up demand, most of them pivot near the city center or two or three spatial regions, ignoring the other parts of the city. In this work, we propose a method that helps find local hotspots covering the whole city area. We propose a dominating set problem based solution, which covers every part of the city. This will help drivers look for the next nearby customer in the region where they drop off their last customer. It will also reduce the waiting time for customers as well as for drivers looking for the next pick-up request. This would maximize their profit as well as help improve their services.
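The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, under assumptions of our own: model city zones as vertices of a graph, connect zones within a short drive of one another, and grow a dominating set greedily so every zone either is a hotspot or borders one. The greedy rule is the standard logarithmic-factor approximation for this NP-hard problem, not necessarily the authors' algorithm, and the zone names and adjacency are hypothetical.

```python
def greedy_dominating_set(adjacency):
    """Greedy approximation for the NP-hard minimum dominating set.

    adjacency: dict mapping each zone to the set of zones reachable
    from it within a short drive. Returns a set of hotspot zones such
    that every zone is a hotspot or borders one.
    """
    uncovered = set(adjacency)
    hotspots = set()
    while uncovered:
        # Pick the zone whose closed neighborhood covers the most
        # still-uncovered zones.
        best = max(adjacency,
                   key=lambda z: len(({z} | adjacency[z]) & uncovered))
        hotspots.add(best)
        uncovered -= {best} | adjacency[best]
    return hotspots

# Hypothetical toy city: six zones; an edge means "a short drive apart".
city = {
    "A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"},
    "D": {"B", "C", "E"}, "E": {"D", "F"}, "F": {"E"},
}
print(greedy_dominating_set(city))  # a small cover, e.g. {'A', 'D', 'E'}
```

Every zone is then adjacent to some hotspot, so a driver dropping off anywhere in the city has a nearby place to wait for the next request.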
References
Proceedings Article
01 Apr 2001
TL;DR: A set of techniques for the rank aggregation problem is developed and compared against well-known methods, with the goal of designing rank aggregation techniques that can combat spam in Web searches.
Abstract: We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building meta-search engines, combining ranking functions, selecting documents based on multiple criteria, and improving search precision through word associations. We develop a set of techniques for the rank aggregation problem and compare their performance to that of well-known methods. A primary goal of our work is to design rank aggregation techniques that can effectively combat "spam," a serious problem in Web searches. Experiments show that our methods are simple, efficient, and effective.

1,982 citations
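The abstract does not name the techniques themselves, so the sketch below illustrates only the rank aggregation problem, using the classical Borda count, a well-known baseline of the kind such methods are compared against rather than this paper's own method. The result lists are hypothetical.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate several ranked lists with the classical Borda count.

    rankings: list of lists, each a full ranking of the same items,
    best first. Items earn more points the higher they are ranked;
    the aggregate ranking sorts by total points.
    """
    scores = defaultdict(int)
    n = len(rankings[0])
    for ranking in rankings:
        for position, item in enumerate(ranking):
            scores[item] += n - position  # top place earns n points
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from three search engines for one query.
lists = [
    ["page1", "page2", "page3"],
    ["page2", "page1", "page3"],
    ["page2", "page3", "page1"],
]
print(borda_aggregate(lists))  # ['page2', 'page1', 'page3']
```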

Journal Article
TL;DR: This work almost settles a long-standing conjecture of Bang-Jensen and Thomassen and shows that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.
Abstract: We address optimization problems in which we are given contradictory pieces of input information and the goal is to find a globally consistent solution that minimizes the extent of disagreement with the respective inputs. Specifically, the problems we address are rank aggregation, the feedback arc set problem on tournaments, and correlation and consensus clustering. We show that for all these problems (and various weighted versions of them), we can obtain improved approximation factors using essentially the same remarkably simple algorithm. Additionally, we almost settle a long-standing conjecture of Bang-Jensen and Thomassen and show that unless NP⊆BPP, there is no polynomial time algorithm for the problem of minimum feedback arc set in tournaments.

740 citations
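The "remarkably simple algorithm" the abstract refers to is, in the published version of this work, a randomized quicksort-style pivot scheme. A sketch of that idea for feedback arc set on a tournament follows; the tournament encoding and names are our assumptions.

```python
import random

def pivot_order(items, beats):
    """Randomized pivot ordering for a tournament (quicksort-style).

    items: list of vertices. beats(u, v) -> True if the tournament
    edge points u -> v, i.e., u should precede v. In expectation this
    yields a constant-factor approximation to minimum feedback arc
    set on tournaments.
    """
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)
    left = [v for v in items if v != pivot and beats(v, pivot)]
    right = [v for v in items if v != pivot and beats(pivot, v)]
    return pivot_order(left, beats) + [pivot] + pivot_order(right, beats)

# Hypothetical tournament on 4 vertices, given as directed edges.
edges = {(0, 1), (1, 2), (0, 2), (2, 3), (0, 3), (3, 1)}
order = pivot_order([0, 1, 2, 3], lambda u, v: (u, v) in edges)
print(order)  # an ordering with few back edges, e.g. [0, 2, 3, 1]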

Proceedings Article
09 Jun 2003
TL;DR: This work proposes a novel approach to performing efficient similarity search and classification in high dimensional data and proves that with high probability, it produces a result that is a (1 + ε) factor approximation to the Euclidean nearest neighbor.
Abstract: We propose a novel approach to performing efficient similarity search and classification in high dimensional data. In this framework, the database elements are vectors in a Euclidean space. Given a query vector in the same space, the goal is to find elements of the database that are similar to the query. In our approach, a small number of independent "voters" rank the database elements based on similarity to the query. These rankings are then combined by a highly efficient aggregation algorithm. Our methodology leads both to techniques for computing approximate nearest neighbors and to a conceptually rich alternative to nearest neighbors.

One instantiation of our methodology is as follows. Each voter projects all the vectors (database elements and the query) on a random line (different for each voter), and ranks the database elements based on the proximity of the projections to the projection of the query. The aggregation rule picks the database element that has the best median rank. This combination has several appealing features. On the theoretical side, we prove that with high probability, it produces a result that is a (1 + ε) factor approximation to the Euclidean nearest neighbor. On the practical side, it turns out to be extremely efficient, often exploring no more than 5% of the data to obtain very high-quality results. This method is also database-friendly, in that it accesses data primarily in a pre-defined order without random accesses, and, unlike other methods for approximate nearest neighbors, requires almost no extra storage. Also, we extend our approach to deal with the k nearest neighbors.

We conduct two sets of experiments to evaluate the efficacy of our methods. Our experiments include two scenarios where nearest neighbors are typically employed: similarity search and classification problems. In both cases, we study the performance of our methods with respect to several evaluation criteria, and conclude that they are uniformly excellent, both in terms of quality of results and in terms of efficiency.

442 citations
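The abstract describes its instantiation concretely enough to sketch: random-line voters, ranking by projection proximity, and selection by best median rank. The brute-force version below computes full rank lists for clarity; the paper's instantiation scans the ranked lists incrementally and so touches only a small fraction of the data. Data shapes and the seed are made up.

```python
import numpy as np

def median_rank_nn(database, query, num_voters=10, seed=0):
    """Nearest-neighbor search by median rank aggregation.

    Each 'voter' is a random line; database points are ranked by how
    close their projections fall to the query's projection, and the
    point with the best (lowest) median rank wins.
    """
    rng = np.random.default_rng(seed)
    n, dim = database.shape
    ranks = np.empty((num_voters, n))
    for v in range(num_voters):
        line = rng.standard_normal(dim)        # a random direction
        proj = database @ line                 # project all points
        dist = np.abs(proj - query @ line)     # proximity to query
        ranks[v, np.argsort(dist)] = np.arange(n)
    return int(np.argmin(np.median(ranks, axis=0)))

# Hypothetical data: 1000 points in 50 dimensions.
data = np.random.default_rng(1).standard_normal((1000, 50))
q = data[42] + 0.01  # a query very near point 42
print(median_rank_nn(data, q))  # almost certainly 42
```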

Journal Article
01 Aug 2008
TL;DR: The authors created baseline implementations of the most important algorithms for frequent items and used these to perform a thorough experimental study of their properties, giving empirical evidence that there is considerable variation in the performance of frequent items algorithms.
Abstract: The frequent items problem is to process a stream of items and find all items occurring more than a given fraction of the time. It is one of the most heavily studied problems in data stream mining, dating back to the 1980s. Many applications rely directly or indirectly on finding the frequent items, and implementations are in use in large scale industrial systems. However, there has not been much comparison of the different methods under uniform experimental conditions. It is common to find papers touching on this topic in which important related work is mischaracterized, overlooked, or reinvented.

In this paper, we aim to present the most important algorithms for this problem in a common framework. We have created baseline implementations of the algorithms, and used these to perform a thorough experimental study of their properties. We give empirical evidence that there is considerable variation in the performance of frequent items algorithms. The best methods can be implemented to find frequent items with high accuracy using only tens of kilobytes of memory, at rates of millions of items per second on cheap modern hardware.

334 citations
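The survey spans many algorithms; one of the simplest counter-based methods it covers, usually credited to Misra and Gries, conveys the flavor in a few lines. The example stream is made up.

```python
def misra_gries(stream, k):
    """Misra-Gries summary: one pass, at most k - 1 counters.

    Every item occurring more than n/k times in a stream of length n
    is guaranteed to survive in the returned dict (counts are
    undercounts; a second pass can verify exact frequencies).
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # Decrement every counter; drop the ones that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# Hypothetical stream: 'a' is frequent, everything else is noise.
stream = list("aabacadaaeafaagaaha")
print(misra_gries(stream, k=3))  # {'a': 9, 'h': 1}: 'a' dominates
```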

Proceedings Article
27 Jun 2006
TL;DR: This paper presents two processing techniques: the first one computes the new answer of a query whenever some of the current top-k points expire; the second one partially pre-computes the future changes in the result, achieving better running time at the expense of slightly higher space requirements.
Abstract: Given a dataset P and a preference function f, a top-k query retrieves the k tuples in P with the highest scores according to f. Even though the problem is well-studied in conventional databases, the existing methods are inapplicable to highly dynamic environments involving numerous long-running queries. This paper studies continuous monitoring of top-k queries over a fixed-size window W of the most recent data. The window size can be expressed either in terms of the number of active tuples or time units. We propose a general methodology for top-k monitoring that restricts processing to the sub-domains of the workspace that influence the result of some query. To cope with high stream rates and provide fast answers in an on-line fashion, the data in W reside in main memory. The valid records are indexed by a grid structure, which also maintains book-keeping information. We present two processing techniques: the first one computes the new answer of a query whenever some of the current top-k points expire; the second one partially pre-computes the future changes in the result, achieving better running time at the expense of slightly higher space requirements. We analyze the performance of both algorithms and evaluate their efficiency through extensive experiments. Finally, we extend the proposed framework to other query types and a different data stream model.

261 citations
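The paper's grid-indexed algorithms are designed precisely to avoid full recomputation; as a point of reference, a naive baseline that only captures the query semantics (a count-based window, top-k recomputed on demand) can be sketched as follows. The class name and scoring function are illustrative assumptions, not the paper's API.

```python
from collections import deque
import heapq

class TopKWindow:
    """Naive continuous top-k over a count-based sliding window.

    Keeps the last `window` tuples and recomputes the k highest-scoring
    ones on demand; this illustrates the problem setting only, without
    the paper's grid index or partial pre-computation.
    """
    def __init__(self, window, k, score):
        self.window, self.k, self.score = window, k, score
        self.buffer = deque()

    def insert(self, tup):
        self.buffer.append(tup)
        if len(self.buffer) > self.window:
            self.buffer.popleft()  # the oldest tuple expires

    def topk(self):
        return heapq.nlargest(self.k, self.buffer, key=self.score)

# Hypothetical stream of (price, rating) tuples; preference = sum.
mon = TopKWindow(window=5, k=2, score=sum)
for t in [(3, 4), (9, 1), (5, 6), (2, 2), (8, 8), (1, 1)]:
    mon.insert(t)
print(mon.topk())  # [(8, 8), (5, 6)] over the 5 most recent tuples
```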