Journal ArticleDOI

A Q-Learning Scheme for Fair Coexistence Between LTE and Wi-Fi in Unlicensed Spectrum

15 May 2018-IEEE Access (Institute of Electrical and Electronics Engineers (IEEE))-Vol. 6, pp 27278-27293
TL;DR: The system model of the mLTE-U scheme in coexistence with Wi-Fi is studied, and mLTE-U is enhanced with a Q-learning technique used for autonomous selection of the appropriate combinations of TXOP and muting period that can provide fair coexistence between co-located mLTE-U and Wi-Fi networks.
Abstract: In recent years, the growth of wireless traffic has pushed the wireless community to search for solutions that can assist in more efficient management of the spectrum. Toward this direction, the operation of long term evolution (LTE) in unlicensed spectrum (LTE-U) has been proposed. Targeting a global solution that respects the regional regulations worldwide, 3GPP has published the LTE licensed assisted access (LAA) standard. According to LTE LAA, a listen before talk (LBT) procedure must precede any LTE transmission burst in the unlicensed spectrum. However, the proposed standard may cause coexistence issues between LTE and Wi-Fi, especially when the latter does not use frame aggregation. Toward balanced channel access, we have proposed mLTE-U, an adaptive LTE LBT scheme. According to mLTE-U, LTE uses a variable transmission opportunity (TXOP), followed by a variable muting period. This muting period can be exploited by co-located Wi-Fi networks to gain access to the medium. In this paper, the system model of the mLTE-U scheme in coexistence with Wi-Fi is studied. In addition, mLTE-U is enhanced with a Q-learning technique that is used for autonomous selection of the appropriate combinations of TXOP and muting period that can provide fair coexistence between co-located mLTE-U and Wi-Fi networks. Simulation results showcase the performance of the proposed model and reveal the benefit of using Q-learning for self-adaptation of mLTE-U to the changes of the dynamic wireless environment, toward fair coexistence with Wi-Fi. Finally, the Q-learning mechanism is compared with conventional selection schemes, showing the superior performance of the proposed model over less complex mechanisms.
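As a rough illustration of the approach described in the abstract, the sketch below shows a tabular Q-learning agent that picks (TXOP, muting period) combinations under an epsilon-greedy policy. The concrete action values, the state encoding, and the Jain's-fairness reward are illustrative assumptions for the sake of the example, not the paper's exact formulation.

```python
import random
from collections import defaultdict

# Hypothetical discrete action space: (TXOP, muting period) pairs in ms.
# These values and the fairness-based reward are assumptions, not the paper's design.
ACTIONS = [(txop, mute) for txop in (4, 6, 8, 10) for mute in (2, 4, 6, 8)]
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate
q_table = defaultdict(float)            # Q[(state, action)] -> estimated value


def fairness_reward(lte_tput, wifi_tput):
    """Illustrative reward: Jain's fairness index over the two throughputs."""
    total = lte_tput + wifi_tput
    if total == 0:
        return 0.0
    return total ** 2 / (2 * (lte_tput ** 2 + wifi_tput ** 2))


def select_action(state):
    """Epsilon-greedy choice of a (TXOP, muting period) combination."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def update(state, action, reward, next_state):
    """One tabular Q-learning update after observing the reward."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    q_table[(state, action)] += ALPHA * (reward + GAMMA * best_next
                                         - q_table[(state, action)])
```

In each decision epoch the agent would call select_action, apply the chosen TXOP and muting configuration, measure the resulting LTE and Wi-Fi throughputs, compute fairness_reward, and call update before moving to the next epoch.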
Citations
Journal ArticleDOI
TL;DR: The fundamental concepts of supervised, unsupervised, and reinforcement learning are established, taking a look at what has been done so far in the adoption of ML in the context of mobile and wireless communication, and the promising approaches for how ML can contribute to supporting each target 5G network requirement are discussed.
Abstract: Driven by the demand to accommodate today’s growing mobile traffic, 5G is designed to be a key enabler and a leading infrastructure provider in the information and communication technology industry by supporting a variety of forthcoming services with diverse requirements. Considering the ever-increasing complexity of the network, and the emergence of novel use cases such as autonomous cars, industrial automation, virtual reality, e-health, and several intelligent applications, machine learning (ML) is expected to be essential to assist in making the 5G vision conceivable. This paper focuses on the potential solutions for 5G from an ML perspective. First, we establish the fundamental concepts of supervised, unsupervised, and reinforcement learning, taking a look at what has been done so far in the adoption of ML in the context of mobile and wireless communication, organizing the literature in terms of the types of learning. We then discuss the promising approaches for how ML can contribute to supporting each target 5G network requirement, emphasizing its specific use cases and evaluating the impact and limitations they have on the operation of the network. Lastly, this paper investigates the potential features of Beyond 5G (B5G), providing future research directions for how ML can contribute to realizing B5G. This article is intended to stimulate discussion on the role that ML can play to overcome the limitations for a wide deployment of autonomous 5G/B5G mobile and wireless communications.

249 citations


Cites background or methods from "A Q-Learning Scheme for Fair Coexis..."

  • ...In [80], Q-learning is applied for the fair coexistence between LTE and Wi-Fi in the unlicensed spectrum....

    [...]

  • ...The learning approach that accounts for the coexistence of LTE and LTE-U to model the resource allocation problem in LTE-U small base stations (SBSs) has been studied in [80], [95]....

    [...]

Proceedings ArticleDOI
25 Jun 2019
TL;DR: An end-to-end automatic CDB tuning system, CDBTune, using deep reinforcement learning (RL), which enables end-to-end learning, accelerates the convergence speed of the model, and improves the efficiency of online tuning.
Abstract: Configuration tuning is vital to optimize the performance of a database management system (DBMS). It becomes more tedious and urgent for cloud databases (CDB) due to the diverse database instances and query workloads, which make the database administrator (DBA) incompetent. Although there are some studies on automatic DBMS configuration tuning, they have several limitations. Firstly, they adopt a pipelined learning model but cannot optimize the overall performance in an end-to-end manner. Secondly, they rely on large-scale high-quality training samples which are hard to obtain. Thirdly, there are a large number of knobs that are in continuous space and have unseen dependencies, and they cannot recommend reasonable configurations in such a high-dimensional continuous space. Lastly, in a cloud environment, they can hardly cope with changes in hardware configurations and workloads, and have poor adaptability. To address these challenges, we design an end-to-end automatic CDB tuning system, CDBTune, using deep reinforcement learning (RL). CDBTune utilizes the deep deterministic policy gradient method to find the optimal configurations in high-dimensional continuous space. CDBTune adopts a trial-and-error strategy to learn knob settings with a limited number of samples to accomplish the initial training, which alleviates the difficulty of collecting massive high-quality samples. CDBTune adopts the reward-feedback mechanism in RL instead of traditional regression, which enables end-to-end learning, accelerates the convergence speed of the model, and improves the efficiency of online tuning. We conducted extensive experiments under 6 different workloads on real cloud databases to demonstrate the superiority of CDBTune. Experimental results showed that CDBTune had good adaptability and significantly outperformed the state-of-the-art tuning tools and DBA experts.

197 citations


Cites background or methods from "A Q-Learning Scheme for Fair Coexis..."

  • ...As a result, applying Q-Learning to database configuration tuning is impractical....

    [...]

  • ...According to the Q-Learning algorithm, V_{t+1} is multiplied by the discount factor γ and added to the reward at time t, which gives an estimate of the value V′_t of the current state s_t....

    [...]

  • ...Q-Learning is effective in a relatively small state space....

    [...]

  • ...Q-Learning....

    [...]

  • ...Nevertheless, DQN still adopts Q-Learning to update the Q-value, so we can describe the relationship between them as follows: Q(s,a,ω) → Q(s,a), where ω in Q(s,a,ω) represents the weights of the neural network in DQN....

    [...]
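The excerpts above describe how DQN keeps the Q-learning update but replaces the table Q(s,a) with a parameterized function Q(s,a,ω). A minimal PyTorch illustration of that relationship follows; the state dimension, network size, and hyperparameters are assumptions made for the example, not details from either paper.

```python
import torch
import torch.nn as nn

# Q(s, ·; ω): a small network mapping a state to one value per action,
# standing in for the tabular Q(s, a). Dimensions are illustrative.
STATE_DIM, N_ACTIONS, GAMMA = 4, 8, 0.9

q_net = nn.Sequential(
    nn.Linear(STATE_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)


def dqn_step(state, action, reward, next_state, done):
    """One gradient step toward the Q-learning target r + γ max_a' Q(s', a'; ω).

    `state` and `next_state` are 1-D float tensors of length STATE_DIM.
    """
    with torch.no_grad():
        target = reward + (1.0 - done) * GAMMA * q_net(next_state).max()
    prediction = q_net(state)[action]       # Q(s, a; ω)
    loss = (prediction - target) ** 2       # squared temporal-difference error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Usage example with random tensors standing in for observed transitions.
s, s2 = torch.randn(STATE_DIM), torch.randn(STATE_DIM)
dqn_step(s, action=3, reward=1.0, next_state=s2, done=0.0)
```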

Journal ArticleDOI
01 Aug 2019
TL;DR: A query-aware database tuning system QTune with a deep reinforcement learning (DRL) model, which can efficiently and effectively tune the database configurations based on both the query vector and database states, and which outperforms the state-of-the-art tuning methods.
Abstract: Database knob tuning is important to achieve high performance (e.g., high throughput and low latency). However, knob tuning is an NP-hard problem and existing methods have several limitations. First, DBAs cannot tune a lot of database instances in different environments (e.g., different database vendors). Second, traditional machine-learning methods either cannot find good configurations or rely on a lot of high-quality training examples which are rather hard to obtain. Third, they only support coarse-grained tuning (e.g., workload-level tuning) but cannot provide fine-grained tuning (e.g., query-level tuning). To address these problems, we propose a query-aware database tuning system QTune with a deep reinforcement learning (DRL) model, which can efficiently and effectively tune the database configurations. QTune first featurizes the SQL queries by considering rich features of the SQL queries. Then QTune feeds the query features into the DRL model to choose suitable configurations. We propose a Double-State Deep Deterministic Policy Gradient (DS-DDPG) model to enable query-aware database configuration tuning, which utilizes the actor-critic networks to tune the database configurations based on both the query vector and database states. QTune provides three database tuning granularities: query-level, workload-level, and cluster-level tuning. We deployed our techniques onto three real database systems, and experimental results show that QTune achieves high performance and outperforms the state-of-the-art tuning methods.

150 citations


Cites methods from "A Q-Learning Scheme for Fair Coexis..."

  • ...Note that existing DRL models [16, 19, 12] cannot utilize the query features as they ignore the query's effects on the environment state, and we propose a Double-State Deep Deterministic Policy Gradient (DS-DDPG) model to enable query-aware tuning....

    [...]

Journal ArticleDOI
TL;DR: An intelligent algorithm for network slices is proposed based on Monte Carlo tree search with a new cross-entropy metric, which is able to allocate resources to match the traffic load in the time-space domain.
Abstract: Modern transportation systems are facing a sharp transformation since the Internet of Vehicles (IoV) has activated intense information exchange among vehicles, infrastructure, and pedestrians. Existing approaches fail to efficiently handle the heterogeneous network traffic because of the complicated network environment and dynamic vehicle density. Recently, the fog-radio access network with network slicing has emerged as a promising solution to fulfill the demands of the maldistributed network traffic. However, available fog resources as well as network traffic are all dynamic and unpredictable due to the high mobility of vehicles, which results in poor resource utilization. To address this problem, we propose a smart slice scheduling scheme for vehicular fog radio access networks. This scheduling scheme is formulated as a Markov decision process. Accordingly, an intelligent algorithm for network slices is proposed based on Monte Carlo tree search with a new cross-entropy metric, which is able to allocate resources to match the traffic load in the time-space domain. This slice scheduling algorithm does not require any prior knowledge of the network traffic. Furthermore, this paper first reveals the relationship between road traffic and the IoV resource based on the perception-reaction time metric. A collaborative scheduling scheme is proposed to tune the road traffic speed to further release available IoV resources under heavy traffic load. Simulation results indicate that the proposed algorithm outperforms several baselines in terms of throughput and delay with low complexity.

53 citations


Cites background from "A Q-Learning Scheme for Fair Coexis..."

  • ...ς < 1 represents a discount factor [28], and an action a_i at time i is an element of the set {C_j+ = 1, C_j− = 1, F_j+ = 1, F_j− = 1}....

    [...]

Journal ArticleDOI
TL;DR: A convolutional neural network (CNN) is proposed that is trained to perform identification of LTE and Wi-Fi transmissions and can identify the hidden terminal effect caused by multiple LTE transmissions, multiple Wi-Fi transmissions, or concurrent LTE and Wi-Fi broadcasts.
Abstract: Over the last years, the ever-growing wireless traffic has pushed the mobile community to investigate solutions that can assist in more efficient management of the wireless spectrum. Towards this direction, the long-term evolution (LTE) operation in the unlicensed spectrum has been proposed. Targeting a global solution that respects the regional requirements, 3GPP announced the standard of LTE licensed assisted access (LAA). However, LTE LAA may result in unfair coexistence with Wi-Fi, especially when Wi-Fi does not use frame aggregation. Targeting a technique that enables fair channel access, the mLTE-U scheme has been proposed. According to mLTE-U, LTE uses a variable transmission opportunity, followed by a variable muting period that can be exploited by other networks to transmit. For the selection of the appropriate mLTE-U configuration, information about the dynamically changing wireless environment is required. To this end, this paper proposes a convolutional neural network (CNN) that is trained to perform identification of LTE and Wi-Fi transmissions. In addition, it can identify the hidden terminal effect caused by multiple LTE transmissions, multiple Wi-Fi transmissions, or concurrent LTE and Wi-Fi transmissions. The designed CNN has been trained and validated using commercial off-the-shelf LTE and Wi-Fi hardware equipment and for two wireless signal representations, namely, in-phase and quadrature samples and a frequency-domain representation through the fast Fourier transform. The classification accuracy of the two resulting CNNs is tested for different signal-to-noise ratio values. The experimentation results show that the data representation affects the accuracy of the CNN. The information obtained from the CNN can be exploited by the mLTE-U scheme in order to provide fair coexistence between the two wireless technologies.
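The abstract above describes a CNN classifier trained on either raw in-phase/quadrature (IQ) samples or their FFT representation. The following PyTorch sketch shows what such a 1-D convolutional classifier over IQ windows might look like; the window length, layer sizes, and the four example classes are assumptions for illustration, not the architecture from the paper.

```python
import torch
import torch.nn as nn

# Illustrative 1-D CNN for technology identification from IQ samples
# (2 input channels: in-phase and quadrature). The 128-sample window and
# the 4 hypothetical classes (LTE only, Wi-Fi only, concurrent, idle)
# are assumptions, not the paper's design.
N_CLASSES, WINDOW = 4, 128

classifier = nn.Sequential(
    nn.Conv1d(2, 32, kernel_size=7, padding=3),   # 2 channels: I and Q
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),                       # collapse the time axis
    nn.Flatten(),
    nn.Linear(64, N_CLASSES),                      # logits per class
)

# Usage: a batch of 8 IQ windows -> class logits.
iq_batch = torch.randn(8, 2, WINDOW)
logits = classifier(iq_batch)
print(logits.shape)  # torch.Size([8, 4])
```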

41 citations


Cites background or methods from "A Q-Learning Scheme for Fair Coexis..."

  • ...In [10] and [12], we assumed that the information of the...

    [...]

  • ...In [12], we further extended our previous work by introducing a Q-learning procedure that is able to provide automatic and autonomous selection of the appropriate TXOP and muting period combinations that can enable fair coexistence between the co-located networks....

    [...]

References
Book
01 Jan 1998
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

37,989 citations

Journal ArticleDOI
TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.
Abstract: Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
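For reference, the update rule whose convergence is established here is the standard one-step Q-learning rule; the notation below (learning rate α_t, discount factor γ, reward r_t) is the common textbook form rather than the paper's exact symbols.

```latex
Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t)
    + \alpha_t \Bigl[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \Bigr]
```

Under the conditions stated in the abstract (every action repeatedly sampled in every state, discretely represented action-values) and suitably decaying learning rates α_t, the iterates Q_t converge to the optimal action-values with probability 1.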

3,294 citations

01 Jan 1993

2,697 citations

Proceedings ArticleDOI
09 Jun 2013
TL;DR: This paper considers two of the most prominent wireless technologies available today, namely Long Term Evolution (LTE) and WiFi, addresses some problems that arise from their coexistence in the same band, and proposes a simple coexistence scheme that reuses the concept of almost blank subframes in LTE.
Abstract: The recent development of regulatory policies that permit the use of TV bands spectrum on a secondary basis has motivated discussion about coexistence of primary (e.g. TV broadcasts) and secondary users (e.g. WiFi users in TV spectrum). However, much less attention has been given to coexistence of different secondary wireless technologies in the TV white spaces. Lack of coordination between secondary networks may create severe interference situations, resulting in less efficient usage of the spectrum. In this paper, we consider two of the most prominent wireless technologies available today, namely Long Term Evolution (LTE) and WiFi, and address some problems that arise from their coexistence in the same band. We perform exhaustive system simulations and observe that WiFi is hampered much more significantly than LTE in coexistence scenarios. A simple coexistence scheme that reuses the concept of almost blank subframes in LTE is proposed, and it is observed that it can improve the WiFi throughput per user by up to 50 times in the studied scenarios.

324 citations


"A Q-Learning Scheme for Fair Coexis..." refers background in this paper

  • ...[16] propose a coexistence scheme that exploits periodically blank LTE subframes during an LTE frame in order to give transmission opportunities to Wi-Fi....

    [...]

Proceedings ArticleDOI
25 Oct 2012
TL;DR: This paper investigates deploying LTE on a license-exempt band as part of the pico-cell underlay and shows that LTE can deliver significant capacity even while sharing the spectrum with WiFi systems.
Abstract: Mobile broadband data usage in Long Term Evolution (LTE) networks is growing exponentially and capacity constraints are becoming an issue. Heterogeneous networks, WiFi offload, and the acquisition of additional radio spectrum can be used to address this capacity constraint. Licensed spectrum, however, is limited and can be costly to obtain. This paper investigates deploying LTE on a license-exempt band as part of the pico-cell underlay. Coexistence mechanisms and other modifications to LTE are discussed. Performance analysis shows that LTE can deliver significant capacity even while sharing the spectrum with WiFi systems.

211 citations


"A Q-Learning Scheme for Fair Coexis..." refers background in this paper

  • ...Several other studies [13] [14] [15] evaluate the impact of LTE on Wi-Fi through experiments, mathematical models and simulations, all coming to the same conclusion, namely that coexistence mechanisms are required to render LTE fair towards other co-located technologies, like Wi-Fi....

    [...]
