Author
Keqin Li
Other affiliations: Hunan University, Pace University, University of Grenoble ...read more
Bio: Keqin Li is an academic researcher from State University of New York System. The author has contributed to research in topics: Cloud computing & Scheduling (computing). The author has an hindex of 56, co-authored 741 publications receiving 14178 citations. Previous affiliations of Keqin Li include Hunan University & Pace University.
Papers published on a yearly basis
Papers
More filters
TL;DR: A comprehensive survey on NFV is presented, which starts from the introduction of NFV motivations, and provides an extensive and in-depth discussion on state-of-the-art VNF algorithms including VNF placement, scheduling, migration, chaining and multicast.
Abstract: Today’s networks are filled with a massive and ever-growing variety of network functions that coupled with proprietary devices, which leads to network ossification and difficulty in network management and service provision. Network Function Virtualization (NFV) is a promising paradigm to change such situation by decoupling network functions from the underlying dedicated hardware and realizing them in the form of software, which are referred to as Virtual Network Functions (VNFs). Such decoupling introduces many benefits which include reduction of Capital Expenditure (CAPEX) and Operation Expense (OPEX), improved flexibility of service provision, etc. In this paper, we intend to present a comprehensive survey on NFV, which starts from the introduction of NFV motivations. Then, we explain the main concepts of NFV in terms of terminology, standardization and history, and how NFV differs from traditional middle-box based network. After that, the standard NFV architecture is introduced using a bottom up approach, based on which the corresponding use cases and solutions are also illustrated. In addition, due to the decoupling of network functionalities and hardware, people’s attention is gradually shifted to the VNFs. Next, we provide an extensive and in-depth discussion on state-of-the-art VNF algorithms including VNF placement, scheduling, migration, chaining and multicast. Finally, to accelerate the NFV deployment and avoid pitfalls as far as possible, we survey the challenges faced by NFV and the trend for future directions. In particular, the challenges are discussed from bottom up, which include hardware design, VNF deployment, VNF life cycle control, service chaining, performance evaluation, policy enforcement, energy efficiency, reliability and security, and the future directions are discussed around the current trend towards network softwarization.
361 citations
TL;DR: In this paper, a Parallel Random Forest (PRF) algorithm for big data on the Apache Spark platform is presented. And the PRF algorithm is optimized based on a hybrid approach combining dataparallel and task-parallel optimization, and a dual parallel approach is carried out in the training process of RF and a task Directed Acyclic Graph (DAG) is created according to the parallel training process.
Abstract: With the emergence of the big data age, the issue of how to obtain valuable knowledge from a dataset efficiently and accurately has attracted increasingly attention from both academia and industry. This paper presents a Parallel Random Forest (PRF) algorithm for big data on the Apache Spark platform. The PRF algorithm is optimized based on a hybrid approach combining data-parallel and task-parallel optimization. From the perspective of data-parallel optimization, a vertical data-partitioning method is performed to reduce the data communication cost effectively, and a data-multiplexing method is performed is performed to allow the training dataset to be reused and diminish the volume of data. From the perspective of task-parallel optimization, a dual parallel approach is carried out in the training process of RF, and a task Directed Acyclic Graph (DAG) is created according to the parallel training process of PRF and the dependence of the Resilient Distributed Datasets (RDD) objects. Then, different task schedulers are invoked for the tasks in the DAG. Moreover, to improve the algorithm's accuracy for large, high-dimensional, and noisy data, we perform a dimension-reduction approach in the training process and a weighted voting approach in the prediction process prior to parallelization. Extensive experimental results indicate the superiority and notable advantages of the PRF algorithm over the relevant algorithms implemented by Spark MLlib and other studies in terms of the classification accuracy, performance, and scalability. With the expansion of the scale of the random forest model and the Spark cluster, the advantage of the PRF algorithm is more obvious.
308 citations
TL;DR: The experimental results for large-sized problems from a large set of randomly generated graphs as well as graphs of real-world problems with various characteristics show that the proposed MPQGA algorithm outperforms two non-evolutionary heuristics and a random search method in terms of schedule quality.
Abstract: On parallel and distributed heterogeneous computing systems, a heuristic-based task scheduling algorithm typically consists of two phases: task prioritization and processor selection. In a heuristic based task scheduling algorithm, different prioritization will produce different makespan on a heterogeneous computing system. Therefore, a good scheduling algorithm should be able to efficiently assign a priority to each subtask depending on the resources needed to minimize makespan. In this paper, a task scheduling scheme on heterogeneous computing systems using a multiple priority queues genetic algorithm (MPQGA) is proposed. The basic idea of our approach is to exploit the advantages of both evolutionary-based and heuristic-based algorithms while avoiding their drawbacks. The proposedalgorithm incorporates a genetic algorithm (GA) approach to assign a priority to each subtask while using a heuristic-based earliest finish time (EFT) approach to search for a solution for the task-to-processor mapping. The MPQGA method also designs crossover, mutation, and fitness function suitable for the scenario of directed acyclic graph (DAG) scheduling. The experimental results for large-sized problems from a large set of randomly generated graphs as well as graphs of real-world problems with various characteristics show that the proposed MPQGA algorithm outperforms two non-evolutionary heuristics and a random search method in terms of schedule quality.
305 citations
01 Mar 2016
TL;DR: Based on the amount of randomly generated DAGs workflows, the experimental results show that DEWTS can reduce the total power consumption by up to 46.5 % for various parallel applications as well as balance the scheduling performance.
Abstract: The growth of energy consumption has been explosive in current data centers, super computers, and public cloud systems. This explosion has led to greater advocacy of green computing, and many efforts and works focus on the task scheduling in order to reduce energy dissipation. In order to obtain more energy reduction as well as maintain the quality of service by meeting the deadlines, this paper proposes a DVFS-enabled Energy-efficient Workflow Task Scheduling algorithm: DEWTS. Through merging the relatively inefficient processors by reclaiming the slack time, DEWTS can leverage the useful slack time recurrently after severs are merged. DEWTS firstly calculates the initial scheduling order of all tasks, and obtains the whole makespan and deadline based on Heterogeneous-Earliest-Finish-Time (HEFT) algorithm. Through resorting the processors with their running task number and energy utilization, the underutilized processors can be merged by closing the last node and redistributing the assigned tasks on it. Finally, in the task slacking phase, the tasks can be distributed in the idle slots under a lower voltage and frequency using DVFS technique, without violating the dependency constraints and increasing the slacked makespan. Based on the amount of randomly generated DAGs workflows, the experimental results show that DEWTS can reduce the total power consumption by up to 46.5 % for various parallel applications as well as balance the scheduling performance.
226 citations
TL;DR: It is proved that the task offloading scheduling problem is NP-hard, and centralized and distributed Greedy Maximal Scheduling algorithms are introduced to resolve the problem efficiently.
Abstract: Mobile Edge Cloud Computing (MECC) has becoming an attractive solution for augmenting the computing and storage capacity of Mobile Devices (MDs) by exploiting the available resources at the network edge. In this work, we consider computation offloading at the mobile edge cloud that is composed of a set of Wireless Devices (WDs), and each WD has an energy harvesting equipment to collect renewable energy from the environment. Moreover, multiple MDs intend to offload their tasks to the mobile edge cloud simultaneously. We first formulate the multi-user multi-task computation offloading problem for green MECC, and use Lyaponuv Optimization Approach to determine the energy harvesting policy: how much energy to be harvested at each WD; and the task offloading schedule: the set of computation offloading requests to be admitted into the mobile edge cloud, the set of WDs assigned to each admitted offloading request, and how much workload to be processed at the assigned WDs. We then prove that the task offloading scheduling problem is NP-hard, and introduce centralized and distributed Greedy Maximal Scheduling algorithms to resolve the problem efficiently. Performance bounds of the proposed schemes are also discussed. Extensive evaluations are conducted to test the performance of the proposed algorithms.
200 citations
Cited by
More filters
[...]
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality.
Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …
33,785 citations
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.
29,323 citations
Journal Article•
28,685 citations