Author

Kao-Shing Hwang

Bio: Kao-Shing Hwang is an academic researcher at National Sun Yat-sen University. His research focuses on reinforcement learning and robotics. He has an h-index of 22 and has co-authored 206 publications receiving 1,713 citations. Previous affiliations of Kao-Shing Hwang include Northwestern University and National Chiao Tung University.


Papers
Journal ArticleDOI
TL;DR: Simulation and experimental results on UAV control demonstrate that the proposed IBVS method with Q-learning offers better stability and convergence than competing methods.
Abstract: Visual servoing aims to control an object's motion using visual feedback and has become popular recently. Visual servoing methods often suffer from complex modeling and instability, and little work has addressed the selection of the servoing gain in image-based visual servoing (IBVS). This paper proposes an IBVS method with Q-learning in which the learning rate is adjusted by a fuzzy system. A synthetic preprocessing step performs feature extraction by combining a color-based recognition algorithm with an improved contour-based recognition algorithm. To handle the underactuated dynamics of unmanned aerial vehicles (UAVs), a decoupled controller is designed: velocity and attitude are decoupled by attenuating the effects of underactuation in roll and pitch, and two independent servoing gains, one for linear and one for angular motion, replace the single gain of traditional methods. To further improve convergence and stability, Q-learning is used for adaptive servoing-gain adjustment, with two independent learning agents adjusting the two servoing gains. To improve the performance of the Q-learning, a fuzzy-based method tunes the learning rate. Simulation and experimental results on UAV control demonstrate that the proposed method offers better stability and convergence than competing methods. (A minimal sketch of the fuzzy-tuned gain adjustment follows this entry.)

90 citations
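The following is a minimal, self-contained sketch of the gain-adjustment idea described in the abstract above: a tabular Q-learning agent selects a discrete servoing gain, and its learning rate is tuned by a simple fuzzy-style rule on the TD-error magnitude. The state coding, candidate gain values, and the particular fuzzy rule are illustrative assumptions, not the paper's actual design.

import numpy as np

class FuzzyQGainAgent:
    """Q-learning agent that selects a discrete servoing gain.

    The learning rate is tuned by a simple fuzzy-style rule on the
    magnitude of the TD error (an illustrative stand-in for the
    paper's fuzzy system)."""

    def __init__(self, n_states, gains, gamma=0.9, epsilon=0.1):
        self.q = np.zeros((n_states, len(gains)))
        self.gains = gains          # candidate servoing gains (actions)
        self.gamma = gamma
        self.epsilon = epsilon

    def _fuzzy_alpha(self, td_error):
        # Larger TD error -> larger learning rate, saturated at 0.9.
        e = min(abs(td_error), 1.0)
        return 0.1 + 0.8 * e

    def act(self, s):
        if np.random.rand() < self.epsilon:
            return np.random.randint(len(self.gains))
        return int(np.argmax(self.q[s]))

    def update(self, s, a, r, s_next):
        td = r + self.gamma * self.q[s_next].max() - self.q[s, a]
        self.q[s, a] += self._fuzzy_alpha(td) * td

# Two independent agents, as in the paper: one for the linear-motion gain
# and one for the angular-motion gain (state and gain values are hypothetical).
linear_agent  = FuzzyQGainAgent(n_states=10, gains=[0.2, 0.5, 1.0])
angular_agent = FuzzyQGainAgent(n_states=10, gains=[0.1, 0.3, 0.6])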

Journal ArticleDOI
TL;DR: In this article, a mixed-integer hybrid differential evolution (MIHDE) is developed for mixed-integer optimization problems; the algorithm includes a migration operation to keep candidate individuals from clustering together.
Abstract: In this paper, a mixed-integer hybrid differential evolution (MIHDE) is developed to deal with mixed-integer optimization problems. The hybrid algorithm contains a migration operation to keep candidate individuals from clustering together, and a population diversity measure determines when migration should be performed, so the user can obtain a global solution with a smaller population size. A mixed coding representation and a rounding operation allow the algorithm to solve not only mixed-integer nonlinear optimization problems but also purely real and purely integer nonlinear optimization problems. Numerical examples illustrate the performance of the proposed algorithm and show that it converges to better solutions than conventional genetic algorithms. (A toy sketch of the mixed coding and migration ideas follows this entry.)

87 citations
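A toy sketch of the two MIHDE ingredients highlighted above, mixed coding with rounding for integer variables and a diversity-triggered migration step, is given below. All parameter values, the diversity measure, and the re-seeding rule are illustrative guesses rather than the algorithm from the paper.

import numpy as np

def mihde_sketch(f, bounds, int_mask, pop_size=20, gens=100, F=0.7, CR=0.9,
                 diversity_tol=1e-3, rng=np.random.default_rng(0)):
    """Toy mixed-integer hybrid DE: rounding handles integer variables,
    and a migration step re-seeds the population around the best member
    when diversity collapses."""
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(lo)

    def repair(x):
        x = np.clip(x, lo, hi)
        x[int_mask] = np.round(x[int_mask])          # rounding operation
        return x

    pop = np.array([repair(rng.uniform(lo, hi)) for _ in range(pop_size)])
    fit = np.array([f(x) for x in pop])
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = a + F * (b - c)                 # DE mutation
            cross = rng.random(dim) < CR
            trial = repair(np.where(cross, mutant, pop[i]))
            ft = f(trial)
            if ft < fit[i]:
                pop[i], fit[i] = trial, ft
        best = pop[fit.argmin()]
        # Migration: if the population has clustered, re-seed around the best.
        if pop.std(axis=0).mean() < diversity_tol:
            pop = np.array([repair(best + rng.uniform(-1, 1, dim) * (hi - lo) * 0.1)
                            for _ in range(pop_size)])
            pop[0] = best
            fit = np.array([f(x) for x in pop])
    return pop[fit.argmin()], fit.min()

# Example: minimize (x - 2.3)^2 + (n - 4)^2 with n constrained to be integer.
best_x, best_f = mihde_sketch(lambda v: (v[0] - 2.3) ** 2 + (v[1] - 4) ** 2,
                              bounds=[(-5, 5), (0, 10)],
                              int_mask=np.array([False, True]))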

Journal ArticleDOI
01 Oct 1998
TL;DR: A neuro-fuzzy system embedded in conventional control theory, based on the self-organizing fuzzy cerebellar model articulation controller and the adaptive heuristic critic, is proposed to tackle physical learning control problems.
Abstract: A neuro-fuzzy system embedded in conventional control theory is proposed to tackle physical learning control problems. The control scheme is composed of two elements. The first, a fuzzy sliding-mode controller (FSMC), drives the state variables to a specific switching hyperplane or a desired trajectory. The second is developed from the concepts of the self-organizing fuzzy cerebellar model articulation controller (FCMAC) and the adaptive heuristic critic (AHC). Together they compose a forward compensator that reduces the chattering effect or cancels the influence of system uncertainties. A geometrical explanation of how the FCMAC algorithm works is provided, along with some refined procedures for the AHC. Simulations of smooth motion of a three-link robot illustrate the performance and applicability of the proposed control scheme. (A minimal sliding-surface sketch follows this entry.)

80 citations
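As a rough illustration of the sliding-mode part of the scheme (the FCMAC/AHC compensator is omitted), the sketch below computes a sliding surface from the tracking error and its derivative and softens the switching term inside a boundary layer, the usual way chattering is limited. The constants and the saturation rule are assumptions, not taken from the paper.

import numpy as np

def fuzzy_sliding_control(error, d_error, lam=2.0, k=5.0, phi=0.5):
    """Toy fuzzy sliding-mode control law for a single joint.

    s = d_error + lam * error defines the switching surface; instead of a
    hard sign(s) term, the output is saturated inside a boundary layer of
    width phi, which limits chattering."""
    s = d_error + lam * error
    switching = np.clip(s / phi, -1.0, 1.0)   # smooth stand-in for sign(s)
    return -k * switching

u = fuzzy_sliding_control(error=0.3, d_error=-0.1)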

Journal ArticleDOI
TL;DR: An end-to-end navigation planner translates sparse laser ranging results into movement actions and achieves map-less navigation in complex environments; with a reward signal enhanced by intrinsic motivation, the agent explores more efficiently and the learned strategy is more reliable.
Abstract: In this article, we develop a navigation strategy based on deep reinforcement learning (DRL) for mobile robots. Because of the large gap between simulation and reality, most trained DRL models cannot be directly transferred to real robots. Moreover, how to explore in a sparsely rewarded environment is a long-standing problem in DRL. This article proposes an end-to-end navigation planner that translates sparse laser ranging results into movement actions. Using this highly abstract data as input, agents trained in simulation can be extended to real scenes for practical application. For map-less navigation across obstacles and traps, it is difficult to reach the target through random exploration. Curiosity is therefore used to encourage agents to explore states of the environment that have not been visited, serving as an additional reward for exploratory behavior. The agent relies on a self-supervised model to predict the next state from the current state and the executed action, and the prediction error is used as the measure of curiosity. Experimental results demonstrate that, without any manually designed features or prior demonstrations, the proposed method accomplishes map-less navigation in complex environments. With a reward signal enhanced by intrinsic motivation, the agent explores more efficiently and the learned strategy is more reliable. (A minimal curiosity-reward sketch follows this entry.)

75 citations
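A minimal sketch of the curiosity signal described above: a self-supervised forward model predicts the next state from the current state and action, and its prediction error is added to the sparse extrinsic reward. The linear model, learning rate, and bonus weight are illustrative simplifications, not the paper's network architecture.

import numpy as np

class ForwardModelCuriosity:
    """Curiosity bonus from the prediction error of a linear forward model.

    The model predicts the next state from (state, action); the squared
    prediction error is returned as an intrinsic reward to be added to the
    sparse extrinsic reward."""

    def __init__(self, state_dim, action_dim, lr=1e-2, beta=0.1):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr
        self.beta = beta            # weight of the intrinsic reward

    def intrinsic_reward(self, state, action, next_state):
        x = np.concatenate([state, action])
        err = next_state - self.W @ x
        self.W += self.lr * np.outer(err, x)   # self-supervised model update
        return self.beta * float(err @ err)

curiosity = ForwardModelCuriosity(state_dim=4, action_dim=2)
# total_reward = extrinsic_reward + curiosity.intrinsic_reward(s, a, s_next)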

Journal ArticleDOI
TL;DR: A simple ant experiment shows that the adaptive Q-learning method is more effective than traditional techniques, and it is also successfully applied to learning the cooperative strategy.
Abstract: The objective of this paper is to develop a self-learning cooperative strategy for robot soccer systems. The strategy enables robots to cooperate and coordinate with each other to achieve the objectives of offense and defense. Through the learning mechanism, the robots can learn from experiences of both success and failure and use these experiences to gradually improve performance. The cooperative strategy is built on a hierarchical architecture. The first layer is responsible for role distribution, that is, deciding how many defenders and sidekicks should be fielded according to the positional states. The second layer handles role assignment based on the decision from the previous layer; we develop two algorithms for assigning the roles of attacker, defenders, and sidekicks. The last layer is the behavior layer, in which robots execute behavior commands and tasks according to their roles: the attacker chases the ball and attacks, the sidekicks seek good positions, and the defenders prevent the opponent from scoring. The robots' roles are not fixed; they can dynamically exchange roles with each other. For learning, we develop an adaptive Q-learning method modified from traditional Q-learning. A simple ant experiment shows that the adaptive Q-learning is more effective than traditional techniques, and the method is also successfully applied to learning the cooperative strategy. (A minimal role-selection sketch follows this entry.)

62 citations
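The sketch below illustrates the first-layer idea in tabular form: a Q-learning agent picks a formation, i.e., how many defenders and sidekicks to field, given a discretized positional state, with a visit-count-based learning rate standing in for the paper's adaptive Q-learning. The formation list, state coding, and learning-rate rule are hypothetical.

import numpy as np

class RoleSelectionQ:
    """Q-learning over formations: each action is a (defenders, sidekicks)
    split for the non-attacker robots."""

    FORMATIONS = [(0, 3), (1, 2), (2, 1), (3, 0)]   # (defenders, sidekicks)

    def __init__(self, n_states, gamma=0.95, epsilon=0.1):
        self.q = np.zeros((n_states, len(self.FORMATIONS)))
        self.visits = np.zeros_like(self.q)
        self.gamma, self.epsilon = gamma, epsilon

    def choose(self, state):
        if np.random.rand() < self.epsilon:
            return np.random.randint(len(self.FORMATIONS))
        return int(np.argmax(self.q[state]))

    def update(self, s, a, reward, s_next):
        self.visits[s, a] += 1
        alpha = 1.0 / self.visits[s, a]       # visit-count-based adaptive rate
        target = reward + self.gamma * self.q[s_next].max()
        self.q[s, a] += alpha * (target - self.q[s, a])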


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Nov 1981
TL;DR: In this work, local derivatives are used to detect intensity edges in images, with the local difference of intensities computed at each pixel.
Abstract: Most of the signal processing that we will study in this course involves local operations on a signal, namely transforming the signal by applying linear combinations of values in the neighborhood of each sample point. You are familiar with such operations from calculus (taking derivatives) and from optics (blurring a signal). We will be looking at sampled signals only. Let's start with a few basic examples. Local difference: suppose we have a 1D image and we take the local difference of intensities, DI(x) = (1/2)(I(x + 1) − I(x − 1)), which gives a discrete approximation to a partial derivative. (We compute this for each x in the image.) What is the effect of such a transformation? One key idea is that such a derivative is useful for marking positions where the intensity changes. Such a change is called an edge. It is important to detect edges in images because they often mark locations at which object properties change. These can include changes in illumination along a surface due to a shadow boundary, a material (pigment) change, or a change in depth where one object ends and another begins. The computational problem of finding intensity edges in images is called edge detection. We could look for positions at which DI(x) has a large negative or positive value: large positive values indicate an edge that goes from low to high intensity, and large negative values indicate an edge that goes from high to low intensity. Example: suppose the image consists of a single (slightly sloped) edge. (A minimal sketch of this local difference follows this entry.)

1,829 citations
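The local difference DI(x) = (1/2)(I(x + 1) − I(x − 1)) described in the notes above can be computed directly; the short sketch below applies it to a 1-D row and shows that the peak response marks the edge location. The example intensity values are illustrative.

import numpy as np

def local_difference(signal):
    """Central difference DI(x) = (I(x+1) - I(x-1)) / 2 for a 1-D image row.

    Large positive values mark low-to-high intensity edges, large negative
    values mark high-to-low edges; boundary samples are left at zero."""
    I = np.asarray(signal, dtype=float)
    D = np.zeros_like(I)
    D[1:-1] = 0.5 * (I[2:] - I[:-2])
    return D

row = np.array([10, 10, 10, 40, 80, 80, 80])   # a slightly sloped step edge
print(local_difference(row))                   # peak response marks the edge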

Journal ArticleDOI
TL;DR: Numerical results show that, among the algorithms considered in this study, the most efficient additional components in a DE framework appear to be the population size reduction and the scale factor local search.
Abstract: Differential Evolution (DE) is a simple and efficient optimizer, especially for continuous optimization, and has therefore often been employed to solve various engineering problems. On the other hand, the DE structure has some limitations in its search logic, since it contains too narrow a set of exploration moves. This fact has inspired many computer scientists to improve upon DE by proposing modifications to the original algorithm. This paper presents a survey of DE and its recent advances. The DE modifications are classified into two macro-groups: (1) algorithms that integrate additional components within the DE structure, and (2) algorithms that employ a modified DE structure. For each macro-group, four algorithms representative of the state of the art in DE have been selected for an in-depth description of their working principles. To compare their performance, these eight algorithms have been tested on a set of benchmark problems, with experiments repeated for a (relatively) low-dimensional case and a (relatively) high-dimensional case. The working principles, differences, and similarities of these recently proposed DE-based algorithms are highlighted throughout the paper. Although within both macro-groups it is unclear whether any one algorithm is superior to the others, some conclusions can be drawn. First, to improve DE performance, a modification that adds some alternative search moves to those contained in standard DE is necessary; these extra moves should assist the DE framework in detecting new promising search directions, and a limited employment of them appears to be the best option. The successful extra moves are obtained in two ways: an increase in exploitative pressure and the introduction of some randomization. This randomization should not be excessive, though, since it would jeopardize the search; a proper increase in randomization is crucial for obtaining significant improvements in DE functioning. Numerical results show that, among the algorithms considered in this study, the most efficient additional components in a DE framework appear to be population size reduction and the scale factor local search. Regarding the modified DE structures, the global and local neighborhood search and the self-adaptive control parameter scheme, recently proposed in the literature, seem to be the most promising modifications. (A toy population-size-reduction sketch follows this entry.)

884 citations
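To make the population-size-reduction component concrete, here is a toy DE/rand/1/bin loop that halves the population at fixed stages, keeping the fittest members. The staging schedule and control parameters are illustrative; this is not any specific algorithm from the survey.

import numpy as np

def de_with_pop_reduction(f, bounds, pop_size=40, gens=200, F=0.5, CR=0.9,
                          n_stages=4, rng=np.random.default_rng(1)):
    """DE/rand/1/bin with stepwise population size reduction: the population
    is halved at the end of each stage, keeping the fittest members."""
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(lo)
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([f(x) for x in pop])
    stage_len = gens // n_stages
    for g in range(gens):
        for i in range(len(pop)):
            a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True          # ensure one mutated gene
            trial = np.where(cross, mutant, pop[i])
            ft = f(trial)
            if ft <= fit[i]:
                pop[i], fit[i] = trial, ft
        # Population size reduction: halve the population at each stage end.
        if (g + 1) % stage_len == 0 and len(pop) > 4:
            keep = np.argsort(fit)[: len(pop) // 2]
            pop, fit = pop[keep], fit[keep]
    return pop[fit.argmin()], fit.min()

best, value = de_with_pop_reduction(lambda x: float(np.sum(x ** 2)),
                                    bounds=[(-5, 5)] * 3)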

Journal ArticleDOI
TL;DR: In this paper, the basic strategies towards charged and non-charged iridium(III) complexes are summarized, and a wide range of assemblies are discussed, with special emphasis on the latter with respect to synthesis, characterization, electro-optical properties, processing technologies, and performance.
Abstract: The recent developments in using iridium(III) complexes as phosphorescent emitters in electroluminescent devices, such as (white) organic light-emitting diodes and light-emitting electrochemical cells, are discussed. Additionally, applications in the emerging fields of molecular sensors, biolabeling, and photocatalysis are briefly evaluated. The basic strategies towards charged and non-charged iridium(III) complexes are summarized, and a wide range of assemblies is discussed. Small-molecule- and polymer-based materials are under intense investigation as emissive systems in electroluminescent devices, and special emphasis is placed on the latter with respect to synthesis, characterization, electro-optical properties, processing technologies, and performance.

682 citations