Author
Y. Maeda
Bio: Y. Maeda is an academic researcher from Osaka Electro-Communication University. The author has contributed to research on adaptive learning and autonomous agents. The author has an h-index of 1 and has co-authored 1 publication receiving 13 citations.
Papers
07 Aug 2002
TL;DR: This research proposes a modified Q-learning method in which the reward values are tuned according to the agent's state and which can deal with multiple purposes in a continuous state space by using fuzzy reasoning.
Abstract: Reinforcement learning can be considered an adaptive learning method for autonomous agents. It is important to balance exploring unknown knowledge against exploiting knowledge already obtained. However, because the learning parameters in ordinary Q-learning are constant, learning is not always efficient at every stage of the search. To address this problem, we have already proposed an adaptive Q-learning method whose learning parameters are tuned by fuzzy rules. Furthermore, ordinary reinforcement learning has difficulty dealing with continuous states and behaviors, and with problems that have multiple purposes. Therefore, in this research, we propose a modified Q-learning method in which the reward values are tuned according to the agent's state and which can deal with multiple purposes in a continuous state space by using fuzzy reasoning. We also report some simulation results for object-chasing agents using this method.
13 citations
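The idea in the abstract above, a Q-learning update whose reward is tuned by fuzzy reasoning over a continuous state, can be illustrated with a minimal sketch. This is not the paper's algorithm: the ramp membership function, the two purposes (approach a target, avoid an obstacle), the rule rewards of +1 and -1, and every name below (`near`, `fuzzy_reward`, `q_update`) are assumptions chosen for illustration. Only the tabular Q-learning update itself is the standard one.

```python
def near(distance, width=5.0):
    """Hypothetical ramp membership for the fuzzy set "near":
    1.0 at distance 0, falling linearly to 0.0 at `width`."""
    return max(0.0, 1.0 - distance / width)


def fuzzy_reward(dist_target, dist_obstacle):
    """Blend two purposes into one reward by a weighted average of rule outputs.

    Assumed rules (not taken from the paper):
      IF target is near   THEN reward = +1
      IF obstacle is near THEN reward = -1
    """
    w_target = near(dist_target)
    w_obstacle = near(dist_obstacle)
    total = w_target + w_obstacle
    if total == 0.0:
        return 0.0  # no rule fires: neutral reward
    return (w_target * 1.0 + w_obstacle * -1.0) / total


def q_update(q, state, action, reward, next_state, n_actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict keyed by (state, action).
    Unseen entries are treated as 0.0."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(n_actions))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    q = {}
    # The agent is on the target and far from the obstacle, so the tuned reward is +1.
    r = fuzzy_reward(dist_target=0.0, dist_obstacle=10.0)
    print(r, q_update(q, state=0, action=1, reward=r, next_state=1, n_actions=2))
```

In this sketch the continuous distances enter only through the reward; the Q-table is still indexed by discrete states, which is a simplification relative to the continuous state space the abstract describes.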
Cited by
TL;DR: In this paper, the authors present a framework for the performance analysis of transmission scheduling with the QoS support along with the issues involved in short data packet transmission in the mMTC scenario and provide a detailed overview of the existing and emerging solutions toward addressing RAN congestion problem.
Abstract: The ever-increasing number of resource-constrained machine-type communication (MTC) devices is leading to the critical challenge of fulfilling diverse communication requirements in dynamic and ultra-dense wireless environments. Among different application scenarios that the upcoming 5G and beyond cellular networks are expected to support, such as enhanced mobile broadband (eMBB), massive machine type communications (mMTCs), and ultra-reliable and low latency communications (URLLCs), the mMTC brings the unique technical challenge of supporting a huge number of MTC devices in cellular networks, which is the main focus of this paper. The related challenges include quality of service (QoS) provisioning, handling highly dynamic and sporadic MTC traffic, huge signalling overhead, and radio access network (RAN) congestion. In this regard, this paper aims to identify and analyze the involved technical issues, to review recent advances, to highlight potential solutions and to propose new research directions. First, starting with an overview of mMTC features and QoS provisioning issues, we present the key enablers for mMTC in cellular networks. Along with the highlights on the inefficiency of the legacy random access (RA) procedure in the mMTC scenario, we then present the key features and channel access mechanisms in the emerging cellular IoT standards, namely, LTE-M and narrowband IoT (NB-IoT). Subsequently, we present a framework for the performance analysis of transmission scheduling with the QoS support along with the issues involved in short data packet transmission. Next, we provide a detailed overview of the existing and emerging solutions toward addressing RAN congestion problem, and then identify potential advantages, challenges, and use cases for the applications of emerging machine learning (ML) techniques in ultra-dense cellular networks. 
Out of several ML techniques, we focus on the application of the low-complexity Q-learning approach in the mMTC scenario along with the recent advances toward enhancing its learning performance and convergence. Finally, we discuss some open research challenges and promising future research directions.
290 citations
01 Jun 2004
TL;DR: A dynamic fuzzy Q-learning method that is capable of tuning fuzzy inference systems (FIS) online is presented, and a novel online self-organizing learning algorithm is developed so that structure and parameter identification are accomplished automatically and simultaneously based only on Q-learning.
Abstract: This paper presents a dynamic fuzzy Q-learning (DFQL) method that is capable of tuning fuzzy inference systems (FIS) online. A novel online self-organizing learning algorithm is developed so that structure and parameter identification are accomplished automatically and simultaneously based only on Q-learning. Self-organizing fuzzy inference is introduced to calculate actions and Q-functions so as to enable us to deal with continuous-valued states and actions. Fuzzy rules provide a natural means of incorporating the bias components for rapid reinforcement learning. Experimental results and comparative studies with the fuzzy Q-learning (FQL) and continuous-action Q-learning in the wall-following task of mobile robots demonstrate that the proposed DFQL method is superior.
142 citations
TL;DR: A simple ant experiment shows that Q-learning is more effective than the traditional techniques, and it is also successfully applied to the learning of the cooperative strategy.
Abstract: The objective of this paper is to develop a self-learning cooperative strategy for robot soccer systems. The strategy enables robots to cooperate and coordinate with each other to achieve the objectives of offense and defense. Through the mechanism of learning, the robots can learn from experiences of either success or failure, and utilize these experiences to improve performance gradually. The cooperative strategy is built using a hierarchical architecture. The first layer of the structure is responsible for assigning each role, that is, how many defenders and sidekicks should play according to the positional states. The second layer is for the role assignment related to the decision from the previous layer. We develop two algorithms for assignment of the roles: the attacker, the defenders, and the sidekicks. The last layer is the behavior layer, in which robots execute their behavior commands and tasks based on their roles. The attacker is responsible for chasing the ball and attacking. The sidekicks are responsible for finding good positions, and the defenders are responsible for preventing the opponent from scoring. The robots' roles are not fixed. They can dynamically exchange their roles with each other. For learning, we develop an adaptive Q-learning method that is modified from traditional Q-learning. A simple ant experiment shows that Q-learning is more effective than the traditional techniques, and it is also successfully applied to the learning of the cooperative strategy.
62 citations
TL;DR: This paper proposes an amalgamated framework, AIBFC-FSQL, which is capable of learning human behavior patterns in a nonsupervised manner and predicting subsequent human actions and outperforms several well-known methods.
Abstract: In designing autonomous service systems such as assistive robots for the aged and the disabled, discovery and prediction of human actions are important and often crucial. Patterns of human behavior, however, involve ambiguity, uncertainty, complexity, and inconsistency caused by physical, logical, and emotional factors, and thus their modeling and recognition are known to be difficult. In this paper, a nonsupervised learning framework of human behavior patterns is suggested in consideration of human behavioral characteristics. Our approach consists of two steps. In the first step, a meaningful structure of data is discovered by using Agglomerative Iterative Bayesian Fuzzy Clustering (AIBFC) with a newly proposed cluster validity index. In the second step, the sequence of actions is learned on the basis of the structure discovered in the first step and by utilizing the proposed Fuzzy-state Q-learning (FSQL) process. These two learning steps are incorporated in an amalgamated framework, AIBFC-FSQL, which is capable of learning human behavior patterns in a nonsupervised manner and predicting subsequent human actions. Through a number of simulations with typical benchmark data sets, we show that the proposed learning method outperforms several well-known methods. We further conduct experiments with two challenging real-world databases to demonstrate its usefulness from a practical perspective.
24 citations
21 Jun 2007
TL;DR: The present paper analyzes prerequisites for user-centred prediction of future context and presents an algorithm for autonomous context recognition and prediction, based on the proposed Fuzzy-State Q-Learning technique as well as on some established methods for data-based prediction.
Abstract: In an Assistive Environment (AE), explicit or obtrusive interfaces for human-computer interaction can demand exclusive user attention, and it is often desirable to replace them with implicit ones embedded into real-world artifacts for intuitive and unobtrusive use. As part of the solution, context awareness can be utilized to recognize the current context from a combination of low-level sensed contexts. Assuming the current context is recognized, this paper tackles the next logical step, "the prediction of future contexts". This information allows the system to know patterns and their interrelations in user behaviour, which are not apparent at the lower levels of raw sensor data. The present paper analyzes prerequisites for user-centred prediction of future context and presents an algorithm for autonomous context recognition and prediction, based on our proposed Fuzzy-State Q-Learning technique as well as on some established methods for data-based prediction.
12 citations