SciSpace (formerly Typeset)
Author

Ligang He

Bio: Ligang He is an academic researcher from the University of Warwick. The author has contributed to research in topics: Scheduling (computing) & Cloud computing. The author has an h-index of 24 and has co-authored 157 publications receiving 1,967 citations. Previous affiliations of Ligang He include Hunan University & University of Cambridge.


Papers
Journal ArticleDOI
TL;DR: In this paper, a semi-asynchronous federated learning (SAFA) protocol is proposed to mitigate the impacts of stragglers, crashes and model staleness in order to boost efficiency and improve the quality of the global model.
Abstract: Federated learning (FL) has attracted increasing attention as a promising approach to driving a vast number of end devices with artificial intelligence. However, it is very challenging to guarantee the efficiency of FL considering the unreliable nature of end devices while the cost of device-server communication cannot be neglected. In this article, we propose SAFA, a semi-asynchronous FL protocol, to address the problems in federated learning such as low round efficiency and poor convergence rate in extreme conditions (e.g., clients dropping offline frequently). We introduce novel designs in the steps of model distribution, client selection and global aggregation to mitigate the impacts of stragglers, crashes and model staleness in order to boost efficiency and improve the quality of the global model. We have conducted extensive experiments with typical machine learning tasks. The results demonstrate that the proposed protocol is effective in terms of shortening federated round duration, reducing local resource wastage, and improving the accuracy of the global model at an acceptable communication cost.
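To make the idea concrete, here is a minimal, illustrative sketch of staleness-tolerant aggregation in the spirit of the abstract: updates whose base version lags too far behind the current global round are dropped, and the rest are merged FedAvg-style. The class and function names, the weighting scheme and the lag tolerance are assumptions for illustration, not code from the paper.

```python
from dataclasses import dataclass

@dataclass
class ClientUpdate:
    weights: list        # flattened model parameters (illustrative encoding)
    num_samples: int     # size of the client's local dataset
    base_round: int      # global round the client's training started from

def aggregate(global_weights, global_round, updates, lag_tolerance=2):
    """Merge only the updates whose staleness is within the tolerance."""
    fresh = [u for u in updates
             if global_round - u.base_round <= lag_tolerance]
    if not fresh:                       # every received update was too stale
        return global_weights
    total = sum(u.num_samples for u in fresh)
    merged = [0.0] * len(global_weights)
    for u in fresh:                     # FedAvg-style weighting by data size
        for i, w in enumerate(u.weights):
            merged[i] += w * u.num_samples / total
    return merged

# Example: at round 10, one fresh update is kept and one stale update dropped.
updates = [ClientUpdate([1.0, 1.0], num_samples=50, base_round=9),
           ClientUpdate([3.0, 3.0], num_samples=50, base_round=4)]
print(aggregate([0.0, 0.0], 10, updates))   # -> [1.0, 1.0]
```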

150 citations

Journal ArticleDOI
TL;DR: An improved hybrid version of the CRO method called HCRO (hybrid CRO) is developed for solving the DAG-based task scheduling problem, and a new selection strategy is proposed that reduces the chance of cloning before new molecules are generated.
Abstract: Scheduling for directed acyclic graph (DAG) tasks with the objective of minimizing makespan has become an important problem in a variety of applications on heterogeneous computing platforms, which involves making decisions about the execution order of tasks and task-to-processor mapping. Recently, the chemical reaction optimization (CRO) method has proved to be very effective in many fields. In this paper, an improved hybrid version of the CRO method called HCRO (hybrid CRO) is developed for solving the DAG-based task scheduling problem. In HCRO, the CRO method is integrated with the novel heuristic approaches, and a new selection strategy is proposed. More specifically, the following contributions are made in this paper. (1) A Gaussian random walk approach is proposed to search for optimal local candidate solutions. (2) A left or right rotating shift method based on the theory of maximum Hamming distance is used to guarantee that our HCRO algorithm can escape from local optima. (3) A novel selection strategy based on the normal distribution and a pseudo-random shuffle approach are developed to keep the molecular diversity. Moreover, an exclusive-OR (XOR) operator between two strings is introduced to reduce the chance of cloning before new molecules are generated. Both simulation and real-life experiments have been conducted in this paper to verify the effectiveness of HCRO. The results show that the HCRO algorithm schedules the DAG tasks much better than the existing algorithms in terms of makespan and speed of convergence.
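As a rough illustration of two of the moves described above, the sketch below applies a Gaussian random walk to a priority-vector encoding of a schedule (local search) and a rotating shift to escape local optima. The encoding, parameter values and function names are assumptions for this example, not the authors' implementation.

```python
import random

def gaussian_walk(priorities, sigma=0.1):
    """Local search: perturb each task priority with Gaussian noise."""
    return [p + random.gauss(0.0, sigma) for p in priorities]

def rotating_shift(priorities, k=None):
    """Escape move: rotate the priority vector left by k positions."""
    if k is None:
        k = len(priorities) // 2   # a large shift yields a large Hamming distance
    return priorities[k:] + priorities[:k]

def decode(priorities):
    """Map priorities to a task execution order (highest priority first)."""
    return sorted(range(len(priorities)), key=lambda t: -priorities[t])

p = [0.9, 0.1, 0.5, 0.7]
print(decode(p))                     # [0, 3, 2, 1]
print(decode(rotating_shift(p)))     # order after the escape move
print(decode(gaussian_walk(p)))      # order after a small local perturbation
```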

144 citations

Journal ArticleDOI
TL;DR: The state-of-the-art research on GPU-based graph processing is summarized, the existing challenges are analyzed in detail, and the research opportunities for the future are explored.
Abstract: In the big data era, much real-world data can be naturally represented as graphs. Consequently, many application domains can be modeled as graph processing. Graph processing, especially the processing of large-scale graphs with the number of vertices and edges in the order of billions or even hundreds of billions, has attracted much attention in both industry and academia. It remains a great challenge to process such large-scale graphs, and researchers have been seeking new solutions. Because of the massive degree of parallelism and the high memory access bandwidth of GPUs, utilizing GPUs to accelerate graph processing proves to be a promising solution. This article surveys the key issues of graph processing on GPUs, including data layout, memory access pattern, workload mapping, and specific GPU programming. In this article, we summarize the state-of-the-art research on GPU-based graph processing, analyze the existing challenges in detail, and explore the research opportunities for the future.
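As a concrete example of the data-layout issue the survey highlights, the sketch below builds a compressed sparse row (CSR) representation, a layout commonly used in GPU graph processing because it packs each vertex's neighbours contiguously. This is a plain-Python illustration, not code from the article.

```python
def to_csr(num_vertices, edges):
    """edges: iterable of (src, dst) pairs -> (row_offsets, col_indices)."""
    adj = [[] for _ in range(num_vertices)]
    for src, dst in edges:
        adj[src].append(dst)
    row_offsets, col_indices = [0], []
    for neighbours in adj:
        col_indices.extend(sorted(neighbours))
        row_offsets.append(len(col_indices))
    return row_offsets, col_indices

# Neighbours of vertex v live in col_indices[row_offsets[v]:row_offsets[v+1]],
# so one thread (or warp) per vertex reads a contiguous slice of memory.
offsets, cols = to_csr(4, [(0, 1), (0, 2), (2, 3)])
print(offsets, cols)   # [0, 2, 2, 3, 3] [1, 2, 3]
```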

129 citations

Journal ArticleDOI
TL;DR: The CRO scheme is used to formulate the scheduling of Directed Acyclic Graph (DAG) jobs in heterogeneous computing systems, and a Double Molecular Structure-based Chemical Reaction Optimization (DMSCRO) method is developed.

97 citations

Journal ArticleDOI
TL;DR: An estimation of the model exchange time between each client and the server is proposed, based on which a fairness-guaranteed algorithm termed RBCS-F is designed for problem-solving.
Abstract: The issue of potential privacy leakage during centralized AI's model training has drawn intensive concern from the public. A Parallel and Distributed Computing (or PDC) scheme, termed Federated Learning (FL), has emerged as a new paradigm to cope with the privacy issue by allowing clients to perform model training locally, without the necessity to upload their personal sensitive data. In FL, the number of clients could be sufficiently large, but the bandwidth available for model distribution and re-upload is quite limited, making it sensible to only involve part of the volunteers to participate in the training process. The client selection policy is critical to an FL process in terms of training efficiency, the final model's quality as well as fairness. In this article, we model the fairness-guaranteed client selection as a Lyapunov optimization problem, and a $\mathbf{C^2MAB}$-based method is proposed for estimation of the model exchange time between each client and the server, based on which we design a fairness-guaranteed algorithm termed RBCS-F for problem-solving. The regret of RBCS-F is strictly bounded by a finite constant, justifying its theoretical feasibility. Barring the theoretical results, more empirical data can be derived from our real training experiments on public datasets.
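The sketch below illustrates, under stated assumptions, the flavour of fairness-guaranteed selection described in the abstract: each client carries a virtual-queue-style "fairness debt" that grows while the client is not selected, and selection trades this debt off against an estimated model-exchange time. The scoring rule, update rule and constants are illustrative guesses, not the RBCS-F algorithm itself.

```python
# Illustrative sketch (not RBCS-F itself): fairness-aware client selection
# with a virtual "fairness debt" queue per client.
def select_clients(est_exchange_time, fairness_debt, k, beta=1.0):
    """Pick k clients, trading estimated exchange time against fairness debt."""
    scores = {c: fairness_debt[c] - beta * est_exchange_time[c]
              for c in est_exchange_time}
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def update_fairness_debt(fairness_debt, selected, required_rate=0.1):
    """Lyapunov-style virtual queue: grows each round, drains when selected."""
    for c in fairness_debt:
        served = 1.0 if c in selected else 0.0
        fairness_debt[c] = max(fairness_debt[c] + required_rate - served, 0.0)

# One round with three clients (exchange times in seconds, debts start at zero):
times = {"c1": 2.0, "c2": 5.0, "c3": 1.0}
debt = {c: 0.0 for c in times}
chosen = select_clients(times, debt, k=2)
update_fairness_debt(debt, chosen)
print(chosen, debt)   # slow client c2 is skipped but its debt grows
```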

78 citations


Cited by
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently: those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers: the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90%, and a 1840-node Intel Paragon performs up to 165 times faster than a single Cray C90 processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.
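As a toy illustration of the third (spatial-decomposition) strategy, the snippet below assigns each atom to the process that owns its slab of the simulation box. The one-dimensional slab decomposition, box length and rank count are made-up values for illustration, not the paper's implementation.

```python
def owner_rank(position, box_length=10.0, num_ranks=4):
    """Map an atom's x-coordinate to the rank owning that spatial slab."""
    slab_width = box_length / num_ranks
    return min(int(position[0] // slab_width), num_ranks - 1)

# Each rank would compute forces only for the atoms currently in its slab,
# exchanging boundary ("ghost") atoms with neighbouring ranks each step.
atoms = [(0.5, 1.0, 1.0), (9.7, 2.0, 3.0), (5.1, 0.0, 0.0)]
print([owner_rank(a) for a in atoms])   # -> [0, 3, 2]
```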

29,323 citations

Journal ArticleDOI
01 May 1975
TL;DR: Fundamentals of Queueing Theory, Fourth Edition, as discussed by the authors, provides a comprehensive overview of simple and more advanced queueing models, with a self-contained presentation of key concepts and formulae.
Abstract: Praise for the Third Edition: "This is one of the best books available. Its excellent organizational structure allows quick reference to specific models and its clear presentation . . . solidifies the understanding of the concepts being presented." (IIE Transactions on Operations Engineering) Thoroughly revised and expanded to reflect the latest developments in the field, Fundamentals of Queueing Theory, Fourth Edition continues to present the basic statistical principles that are necessary to analyze the probabilistic nature of queues. Rather than presenting a narrow focus on the subject, this update illustrates the wide-reaching, fundamental concepts in queueing theory and its applications to diverse areas such as computer science, engineering, business, and operations research. This update takes a numerical approach to understanding and making probabilistic estimations relating to queues, with a comprehensive outline of simple and more advanced queueing models. Newly featured topics of the Fourth Edition include: retrial queues; approximations for queueing networks; numerical inversion of transforms; and determining the appropriate number of servers to balance quality and cost of service. Each chapter provides a self-contained presentation of key concepts and formulae, allowing readers to work with each section independently, while a summary table at the end of the book outlines the types of queues that have been discussed and their results. In addition, two new appendices have been added, discussing transforms and generating functions as well as the fundamentals of differential and difference equations. New examples are now included along with problems that incorporate QtsPlus software, which is freely available via the book's related Web site. With its accessible style and wealth of real-world examples, Fundamentals of Queueing Theory, Fourth Edition is an ideal book for courses on queueing theory at the upper-undergraduate and graduate levels. It is also a valuable resource for researchers and practitioners who analyze congestion in the fields of telecommunications, transportation, aviation, and management science.
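As a small worked example of the "how many servers" question the book addresses, the sketch below uses the standard Erlang C formula for an M/M/c queue to find the smallest number of servers that keeps the probability of waiting below a target. The arrival rate, service rate and 20% target are arbitrary illustrative values.

```python
from math import factorial

def erlang_c(arrival_rate, service_rate, servers):
    """P(wait) for an M/M/c queue; requires arrival_rate < servers * service_rate."""
    a = arrival_rate / service_rate                      # offered load (Erlangs)
    rho = a / servers                                    # server utilisation
    top = a**servers / (factorial(servers) * (1 - rho))
    bottom = sum(a**k / factorial(k) for k in range(servers)) + top
    return top / bottom

# Smallest number of servers keeping P(wait) below 20% for lambda=4, mu=1:
c = 1
while c * 1 <= 4 or erlang_c(4, 1, c) > 0.2:   # stability first, then quality
    c += 1
print(c)   # -> 7
```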

2,562 citations

01 Jan 2016
Input-Output Analysis: Foundations and Extensions

1,316 citations

Journal ArticleDOI
TL;DR: This exhaustive literature review provides a concrete definition of Industry 4.0 and defines its six design principles: interoperability, virtualization, decentralization, real-time capability, service orientation and modularity.
Abstract: The manufacturing industry profoundly impacts economic and societal progress. The Industry 4.0 initiative has received considerable attention from both the business and research communities. Although the underlying ideas are not new and have been on the academic research agenda for years under different names, the term "Industry 4.0" has only recently been introduced, yet it is now widely accepted in academia and, to some extent, in industry as well. While academic research focuses on understanding and defining the concept and on developing related systems, business models and methodologies, industry concentrates on the transformation of industrial machinery and intelligent products, and on how potential customers respond to this progress. It is therefore important for companies to first understand the features and content of Industry 4.0 before attempting the transformation from machine-dominated manufacturing to digital manufacturing. To achieve a successful transformation, they should review their current positions and potentials against the basic requirements set out for the Industry 4.0 standard; this allows them to generate a well-defined roadmap. Several approaches and roadmaps along this line have already been proposed, some of which are reviewed in this paper. However, the literature clearly indicates a lack of corresponding assessment methodologies. Since the theories and definitions outlined for the fourth industrial revolution are not yet mature enough for most real-life implementations, a systematic approach to assessment and evaluation is urgently needed by those intending to accelerate this transformation. It is now the responsibility of the research community to develop the technological infrastructure, including physical systems, management models, business models and well-defined Industry 4.0 scenarios, in order to make life easier for practitioners. Experts estimate that Industry 4.0 and the related progress will have an enormous effect on social life. As outlined in the introduction, some social transformation is also expected: robots are assumed to become more dominant in manufacturing, while implanted technologies, cooperating and coordinating machines, self-decision-making systems, autonomous problem solvers, learning machines, 3D printing and similar technologies will dominate the production process, and the wearable internet, big data analysis, sensor-based living and smart-city implementations will be the main concerns of the community. This social transformation will naturally push manufacturers to improve their manufacturing suites to cope with customer requirements and sustain competitive advantage. A summary of the potential progress along this line is given in the introduction of the paper. It is obvious that future manufacturing systems will have a different vision composed of products, intelligence, communications and information networks, which will bring new business models to dominate industrial life. Another important issue to take into account is that the time span of this so-called revolution will be short, triggering a continuous transformation process from which new industrial areas will emerge.
This clearly puts great pressure on manufacturers to learn, understand, design and implement the transformation process. Since the main motivation is finding the best way to carry out this transformation, a comprehensive literature review provides remarkable support. This paper presents such a review to highlight the progress made and to help improve awareness of the best experiences. It is intended to give a clear picture to those wishing to generate a roadmap for digitizing their manufacturing suites, and to provide a hands-on library of Industry 4.0 for both academics and industrial practitioners. The top 100 headings, abstracts and keywords (a total of 619 publications of any kind) for each search term were independently analyzed in order to ensure the reliability of the review process. Note that this exhaustive literature review provides a concrete definition of Industry 4.0 and defines its six design principles: interoperability, virtualization, decentralization, real-time capability, service orientation and modularity. These principles appear to have drawn scientists' attention towards a wider variety of research on the subject and towards developing implementable and appropriate scenarios. A comprehensive taxonomy of Industry 4.0 can also be developed by analyzing the results of this review.

1,011 citations

Journal ArticleDOI
TL;DR: To better understand critical infrastructure systems (CISs) and to support planning, maintenance and emergency decision making, modeling and simulation of interdependencies across CISs has recently become a key field of study; this paper reviews the studies in the field and broadly groups the existing modeling and simulation approaches into six types.

891 citations