
Machine Learning Paradigms for Next-Generation Wireless Networks

Abstract
Next-generation wireless networks are expected to support extremely high data rates and radically new applications, which require a new wireless radio technology paradigm. The challenge is that of assisting the radio in intelligent adaptive learning and decision making, so that the diverse requirements of next-generation wireless networks can be satisfied. Machine learning is one of the most promising artificial intelligence tools, conceived to support smart radio terminals. Future smart 5G mobile terminals are expected to autonomously access the most meritorious spectral bands with the aid of sophisticated spectral efficiency learning and inference, in order to control the transmission power, while relying on energy efficiency learning/inference and simultaneously adjusting the transmission protocols with the aid of quality of service learning/inference. Hence we briefly review the rudimentary concepts of machine learning and propose their employment in the compelling applications of 5G networks, including cognitive radios, massive MIMOs, femto/small cells, heterogeneous networks, smart grid, energy harvesting, device-to-device communications, and so on. Our goal is to assist the readers in refining the motivation, problem formulation, and methodology of powerful machine learning algorithms in the context of future networks in order to tap into hitherto unexplored applications and services.


IEEE Wireless Communications • Accepted for Publication
1536-1284/16/$25.00 © 2016 IEEE
Chunxiao Jiang is with the Tsinghua Space Center.
Yong Ren is with Tsinghua University.
Haijun Zhang is with the University of Science and Technology Beijing, China.
Zhu Han is with the University of Houston.
Kwang-Cheng Chen is with the University of South Florida.
Lajos Hanzo is with the University of Southampton.
Introduction
Radical and sometimes even unorthodox next-generation networking concepts have received substantial attention in both the academic and industrial communities. One of their driving forces is that of providing unprecedented data rates for supporting radically new applications. Specifically, next-generation networks are expected to learn the diverse characteristics of both the users' ambience and human behavior, in order to autonomously determine the optimal system configurations. The smart mobile terminals of such networks have to rely on sophisticated learning and decision-making. Machine learning, as one of the most powerful artificial intelligence tools, constitutes a promising solution [1]. As shown in Fig. 1, we may envision an intelligent radio that is capable of autonomously accessing the available spectrum with the aid of learning, and of altruistically controlling its transmission power for the sake of conserving energy, as well as adjusting its transmission protocols.
Machine learning has found wide-ranging applications in image/audio processing, finance and economics, social behavior analysis, project management, and so on [2]. Explicitly, a machine learns the execution of a particular task T, with the goal of maintaining a specific performance metric P, based on a particular experience E, where the system aims to reliably improve its performance P while executing task T, again by exploiting its experience E. Depending on how we specify T, P, and E, the learning might also be referred to as data mining, autonomous discovery, database updating, programming by example, and so on [3]. Machine learning algorithms can be simply categorized as supervised and unsupervised learning, where the adjectives "supervised/unsupervised" indicate whether there are labeled samples in the database. Later, reinforcement learning emerged as a new category that was inspired by behavioral psychology. It is concerned with an agent that is connected to its environment via perception and action, and that seeks a certain form of reward/utility. The family of machine learning algorithms can also be categorized based on their similarity in terms of their functionality and structure, yielding regression algorithms, instance-based algorithms, regularization algorithms, decision tree algorithms, Bayesian algorithms, clustering algorithms, association rule based learning algorithms, artificial neural networks, deep learning algorithms, dimension reduction algorithms, ensemble algorithms, and so on. In this article, we introduce the basic concepts of machine learning algorithms and the corresponding applications according to the categories of supervised, unsupervised, and reinforcement learning.
Machine learning can be widely used in modeling various technical problems of next-generation systems, such as large-scale MIMOs, device-to-device (D2D) networks, heterogeneous networks constituted by femtocells and small cells, and so on. Figure 2 portrays the family tree of machine learning techniques and their potential applications in 5G. Against this background, we embark on investigating the family of learning techniques. Specifically, in the following sections we consider supervised learning, unsupervised learning, and reinforcement learning. Each section consists of several subsections, discussing specific learning models: regression models, the k-nearest neighbor (KNN) algorithm, support vector machines (SVM), and Bayesian learning; k-means clustering as well as principal and independent component analysis; and partially observed Markov decision processes, Q-learning, and the multi-armed bandit technique. Each subsection commences with the introduction of the learning model, followed by its applications in 5G networks. Finally, our conclusions are drawn.
Chunxiao Jiang, Haijun Zhang, Yong Ren, Zhu Han, Kwang-Cheng Chen, and Lajos Hanzo
Digital Object Identifier: 10.1109/MWC.2016.1500356WC
Accepted from Open Call
Supervised Learning in Wireless Communications
Regression Models, KNN and SVM: MIMO Channel and Energy Learning
Models: Regression analysis relies on a statistical process for estimating the relationships among variables. The goal of regression analysis is to predict the value of one or more continuous-valued estimation targets, given the value of a D-dimensional vector x of input variables. The estimation target is a function of the independent variables. In linear regression, the regression function is linear, while in logistic regression, it is a logistic function assuming a common sigmoid curve. The KNN and SVM algorithms are mainly utilized for the classification of points/objects. In KNN, an object is classified by a majority vote of its neighbors, with the object being assigned to the class that is most common among its k nearest neighbors. Alternatively, the output may be a specific property of the object, such as the average of the values of its k nearest neighbors. By contrast, the SVM algorithm relies on a nonlinear mapping, which transforms the original training data into a higher dimension where it becomes separable, and then searches for the optimal linear separating hyperplane that is capable of separating one class from another in this higher dimension. SVMs hence correspond to nonlinear classification methods relying on the family of kernel methods. It was shown that with the aid of an appropriate nonlinear mapping to a sufficiently high dimension, the data from two classes can always be separated by a hyperplane [3, pp. 21, 185, 239, 349].
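The two KNN outputs described above (a majority vote for classification and a neighbor average for estimation) can be sketched in a few lines. This is a minimal illustration rather than the method of any cited work; the function names and the toy data are hypothetical, and Euclidean distance is assumed as the neighbor metric.

```python
from collections import Counter
import math

def k_nearest(train, query, k):
    """Return the k (features, target) pairs closest to `query` in Euclidean distance."""
    return sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]

def knn_classify(train, query, k=3):
    """Assign the class that is most common among the k nearest neighbors."""
    votes = Counter(label for _, label in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Return the average target value of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(value for _, value in neighbors) / len(neighbors)

# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
labeled = [((0.0, 0.0), "low"), ((0.1, 0.2), "low"), ((0.2, 0.1), "low"),
           ((1.0, 1.0), "high"), ((0.9, 1.1), "high"), ((1.1, 0.9), "high")]
# Hypothetical 1-D regression data with one distant outlier.
valued = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]

print(knn_classify(labeled, (0.15, 0.15)))  # low
print(knn_regress(valued, (1.0,)))          # 2.0
```

The choice of k trades noise sensitivity (small k) against blurring of class boundaries (large k); the value 3 used here is arbitrary.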
Applications: These models can be used for estimating or predicting radio parameters that are associated with specific users. For example, in massive MIMO systems associated with hundreds of antennas, both detection and channel estimation lead to high-dimensional search problems, which can be addressed by the above-mentioned learning models. In order to generalize the SVM for employment in data classification problems, its hierarchical version, referred to as H-SVM, was proposed in [4], where each hierarchical level consists of a finite number of SVM classifiers. This regime was used for estimating the Gaussian channel's noise level in a MIMO-aided wireless network having t transmit antennas and r receive antennas. By exploiting the training data, the H-SVM model was trained for the estimation of the channel noise statistics.
In heterogeneous networks constituted by diverse cells, handovers may be frequent, and both the KNN and SVM can be applied to finding the optimal handover solutions. At the application layer, these models can also be used for learning the mobile terminal's specific usage pattern in diverse spatio-temporal and device contexts, as discussed in [5]. This may then be exploited for predicting the configuration to be used in the location-specific interface. Given a set of contextual input cues, machine learning algorithms are capable of exploiting the learned user context for dynamically classifying the cues into a system state for the sake of saving energy, while maintaining a high level of user satisfaction. Donohoo et al. [5] also conducted experiments using five real user profiles, including the user locations and energy consumption, but their data is not accessible to the public. The experiments showed that up to 90 percent successful energy demand prediction is possible with the aid of the KNN algorithm.

Figure 1. Intelligent radio learning paradigm. [Block diagram: smart antenna, RF module, and ADC/DAC, connected to a radio learning unit comprising observations, learning algorithm, utility and cost evaluation, action selection, and control.]

Figure 2. Radio learning architecture. [Family tree of machine learning in 5G, summarized below.]
Technologies: massive MIMO, femto/small cells and heterogeneous networks (HetNets), cloud radio access networks, cognitive radio, full duplex, energy harvesting, etc.
Machine learning applications: channel estimation/detection, spectrum sensing/access, cell/user clustering, switch and handover among HetNets, signal dimension reduction, energy modeling, user behavior analysis, location prediction, intrusion/fault/anomaly detection, cell/channel selection association.
Supervised learning (regression model, KNN, SVM), apps in 5G: massive MIMO channel estimation/detection; user location/behavior learning/classification.
Supervised learning (Bayesian learning), apps in 5G: massive MIMO channel estimation; spectrum sensing/detection and learning in CR.
Unsupervised learning (K-means clustering), apps in 5G: small cell clustering; WiFi association; device-to-device user clustering; HetNet clustering.
Unsupervised learning (PCA and ICA), apps in 5G: spectrum sensing; anomaly/fault/intrusion detection; signal dimension reduction; smart grid user classification.
Reinforcement learning (MDP, POMDP, Q-learning, multi-armed bandit), apps in 5G: decision making under unknown network conditions; resource competition in femto/small cell channel selection and spectrum sharing for device-to-device networks; energy modeling in energy harvesting; HetNet selection/association.
Bayesian Learning: Massive MIMO and Cognitive Radio
Models: The philosophy of Bayesian learning is to compute the a posteriori probability distribution of the target variables conditioned on their input signals and on all of the training instances. Some simple examples of generative models that may be learned with the aid of Bayesian techniques include, but are not limited to, the Gaussian mixture model (GM), expectation maximization (EM), and hidden Markov models (HMM) [3, p. 445]. GM is a model where each data point belongs to one of several clusters or groups, and the data points within each cluster are Gaussian distributed. EM is a generalization of maximum likelihood estimation, which iteratively finds the most likely solutions or parameters. It is characterized by two steps: the "E" step, which chooses a function representing a lower bound of the likelihood, and the "M" step, which finds the parameters maximizing the chosen function.
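To make the two EM steps concrete, the following sketch fits a one-dimensional Gaussian mixture. It is an illustrative assumption-laden example, not the estimator of any cited paper: the function names, the initial guesses, the variance floor, and the toy samples are all hypothetical. The E step computes the responsibilities that define the lower bound, and the M step re-estimates the weights, means, and variances that maximize it.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Run a fixed number of EM iterations for a 1-D Gaussian mixture."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M step: closed-form updates that maximize the expected log-likelihood.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # assumed floor to avoid a collapsing component
    return means, variances, weights

# Hypothetical samples drawn around -2 and +3.
samples = [-2.1, -1.9, -2.0, -2.2, -1.8, 2.9, 3.1, 3.0, 3.2, 2.8]
est_means, est_vars, est_weights = em_gmm_1d(samples, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])
print([round(m, 3) for m in est_means], [round(w, 3) for w in est_weights])
```

A fixed iteration count is used for brevity; a practical implementation would more likely stop when the log-likelihood no longer improves.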
HMM is a tool designed for representing probability distributions of sequences of observations. It can be considered a generalization of a mixture-based model, where the hidden variables, which control the specific mixture component to be selected for each observation, are related to each other through a Markov process, rather than being independent of each other.
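As a small illustration of how an HMM assigns probabilities to observation sequences, the sketch below implements the standard forward recursion. The two-state "idle/busy" interpretation and all numerical values are hypothetical assumptions chosen for the example.

```python
def forward_filter(observations, initial, transition, emission):
    """Forward algorithm: return the sequence likelihood and the posterior
    distribution of the final hidden state given all observations."""
    n_states = len(initial)
    # alpha[i] = P(o_1, ..., o_t, state_t = i)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    likelihood = sum(alpha)
    posterior = [a / likelihood for a in alpha]
    return likelihood, posterior

# Hypothetical model: hidden state 0 = idle, 1 = busy; observation 0/1 = sensed idle/busy.
initial = [0.5, 0.5]
transition = [[0.9, 0.1],   # P(next state | current state = idle)
              [0.2, 0.8]]   # P(next state | current state = busy)
emission = [[0.9, 0.1],     # P(observation | state = idle)
            [0.2, 0.8]]     # P(observation | state = busy)

likelihood, posterior = forward_filter([1, 1, 0], initial, transition, emission)
print(likelihood, posterior)
```

Because the hidden states are coupled through the transition matrix, each new observation updates the belief about the current state using the whole history, which is what distinguishes the HMM from an independent mixture.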
Applications: The Bayesian learning model may be readily invoked for spectral characteristic learning and estimation in next-generation networks. To address the pilot contamination problem encountered in massive MIMO systems, the authors of [6] estimated both the channel parameters of the desired links in a target cell and those of the interfering links of the adjacent cells, where channel estimation was carried out with the aid of sparse Bayesian learning techniques. Based on the observation of the received signals, the channel component was first modeled by a GM, namely by a weighted sum of Gaussian distributions having different variances, and then estimated with the aid of the EM algorithm.
Three further closely related applications may be found in cognitive radio networks. In [7], a cooperative wideband spectrum sensing scheme based on the EM algorithm was proposed for the detection of a primary user (PU) supported by a multi-antenna assisted cognitive radio network. This iterative technique first created the log-likelihood function of the unknown spectrum occupancy, of the channel information, and of the noise in the "E" step. Then, it maximized the log-likelihood function for the sake of inferring the unknown information during the "M" step, which was carried out by jointly detecting the PU signal and estimating both the channel's unknown frequency response and the noise variance of multiple subbands.
In contrast to [7], the authors of [8] constructed an HMM relying on a two-state hidden Markov process, whose states indicate whether the PUs are present or absent, and on a two-state observation space, which indicates whether the PUs are observed to be present or absent. Furthermore, the EM algorithm was invoked for finding the true channel parameters, such as the sojourn time of the available channels, the inactive states of the PUs, and the PUs' signal strength. Finally, the third application of Bayesian learning was advocated in [9], where a tomography model, belonging to the Bayesian inference framework, was proposed for conceiving and statistically characterizing a range of techniques that are capable of extracting the prevalent parameters and traffic/interference patterns for employment in cognitive radio networks at both the link layer and the network layer. The parameters collected included both the path delay and the proportion of successful packet receptions, while the estimated parameter was the link's successful transmission probability. The Bayesian estimators were derived for single/multiple transmissions in single/multiple path scenarios. In Table 1, we summarize the basic characteristics and applications of supervised machine learning algorithms.
Table 1. Supervised machine learning algorithms.
Category | Learning techniques | Key characteristics | Application in 5G
Supervised learning | Regression models | Estimate the variables' relationships; linear and logistic regression | Energy learning [5]
Supervised learning | K-nearest neighbor | Majority vote of neighbors | Energy learning [5]
Supervised learning | Support vector machines | Non-linear mapping to high dimension; separating hyperplane classification | MIMO channel learning [4]
Supervised learning | Bayesian learning | A posteriori distribution calculation; GM, EM, and HMM | Massive MIMO learning [6]; cognitive spectrum learning [7–9]

Unsupervised Learning in Wireless Communications
K-Means Clustering: Heterogeneous Networks
Models: K-means clustering aims for partitioning n observations into k clusters, where each observation belongs to the closest cluster. It defines the centroid of a cluster as its center of gravity, that is, the mean value of the points within the cluster. The clustering algorithm proceeds in an iterative manner: each object is assigned to the specific cluster whose centroid is nearest to it according to the Euclidean distance "similarity metric", and then the in-cluster differences are minimized by updating the cluster centroids, until "convergence" is achieved. Explicitly, convergence is deemed to be achieved when the assignment becomes stable, that is, the clusters formed in the current round are the same as those formed in the previous round [3, pp. 161, 317].
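The assign-then-update loop and the stable-assignment stopping rule described above can be sketched as follows. This is a minimal illustration with hypothetical names and toy points; the initial centroids are simply passed in, and no particular initialization criterion is implied.

```python
import math

def k_means(points, initial_centroids, max_rounds=100):
    """Plain k-means: assign each point to its nearest centroid, recompute
    centroids as cluster means, and stop once the assignment is stable."""
    centroids = list(initial_centroids)
    assignment = None
    for _ in range(max_rounds):
        new_assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # same clusters as in the previous round: converged
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignment

# Hypothetical 2-D positions forming two groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = k_means(points, [(0, 0), (10, 10)])
print(assignment)  # [0, 0, 0, 1, 1, 1]
```

The result of k-means generally depends on the initial centroids, which is why the choice of the starting set matters in practice.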
Applications: Clustering is a common problem in 5G networks, especially in heterogeneous scenarios associated with diverse cell sizes as well as WiFi and D2D networks. For example, the small cells have to be carefully clustered to avoid interference using coordinated multi-point transmission (CoMP); the mobile users are clustered to obey an optimal offloading policy; the devices are clustered in D2D networks to achieve high energy efficiency; the WiFi users are clustered to maintain an optimal access point association; and so on. In [10], the authors considered a hybrid optical/wireless network scenario, in order to reduce the overall wireless tele-traffic by encouraging the utilization of the high-capacity optical infrastructure. A mixed integer programming (MIP) problem was formulated to jointly optimize both the gateway partitioning and the virtual-channel allocation based on classic k-means clustering, which was employed to partition the mesh access points (MAPs) into several groups. The proposed scheme commences its operation from an initial gateway access point (GAP) set, which can be chosen by random selection from the set of MAPs, or can be more astutely determined using a meritorious initialization criterion. Next, each MAP is assigned to its nearest GAP. If several eligible GAPs are in the vicinity, then the specific GAP that has a readily available virtual channel is chosen. Finally, by using the classic k-means clustering algorithm, the MAPs are divided into k groups associated with the closest GAPs.
Principal and Independent Component Analysis: Smart Grid and Cognitive Radio
Models: Principal component analysis (PCA) transforms a set of potentially correlated variables into a set of uncorrelated variables, referred to as the principal components, where the number of principal components is less than or equal to the number of original variables. Basically, the first principal component has the largest possible variance (i.e., accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to (i.e., uncorrelated with) the preceding components. The principal components are orthogonal, because they are the eigenvectors of the covariance matrix, which is symmetric. By contrast, independent component analysis (ICA) is a statistical technique conceived to reveal hidden factors that underlie sets of random variables, measurements, or signals. In the ICA model, the data variables are assumed to be linear mixtures of some unknown latent variables, and the mixing system is also unknown. The latent variables are assumed to be non-Gaussian and mutually independent, and they are referred to as the independent components of the observed data, which can be found by ICA [3, p. 115].
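Since the principal components are the eigenvectors of the covariance matrix, the first one can be approximated by power iteration, as sketched below. This is one of several possible ways of computing it and is only an illustration: the function names, the all-ones starting vector, the iteration count, and the toy samples are assumptions, and the method presumes that the starting vector is not orthogonal to the dominant eigenvector.

```python
import math

def covariance_matrix(samples):
    """Sample covariance matrix (normalized by n - 1) of a list of d-dimensional points."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(d)]
    centered = [[s[i] - mean[i] for i in range(d)] for s in samples]
    return [[sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(samples, iterations=200):
    """Approximate the dominant eigenvector of the covariance matrix by power
    iteration and return it together with the variance it captures."""
    cov = covariance_matrix(samples)
    d = len(cov)
    v = [1.0] * d  # assumed starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, variance

# Hypothetical, perfectly correlated data lying on the line y = 2x.
samples = [(x, 2.0 * x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
direction, variance = first_principal_component(samples)
print(direction, variance)
```

For these samples all of the variability lies along one direction, so a single principal component suffices, which is the essence of PCA-based dimension reduction.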
Applications: Both the PCA and ICA constitute powerful statistical signal processing techniques devised to recover statistically independent source signals from their linear mixtures. One of their major applications may be found in the area of anomaly-detection, fault-detection, and intrusion-detection problems of wireless networks, which rely on traffic monitoring. Furthermore, similar problems may also be solved in sensor networks, mesh networks, and so on. They can also be invoked for the physical layer signal dimension reduction of massive MIMO systems or to classify the primary users' behaviors in cognitive radio networks. As a further example, in [11] PCA and ICA were applied in a smart grid scenario to recover the simultaneous wireless transmissions of smart utility meters installed in each home. At the power utility station, the signals received from all the smart meters had to be separated before they could be decoded. The statistical properties of the signals were exploited to blindly separate them using ICA. This operation is capable of enhancing both the transmission efficiency, by avoiding channel estimation in each frame, and the data security, by eliminating any wideband interference or jamming signals. More explicitly, a substantial security enhancement was achieved by a robust version of the PCA-based method, which exploited the sparse, low-rank nature of the auto-covariance matrices of the smart metering signal and of the wideband interferer, respectively, in order to confidently separate them prior to ICA processing. Another pertinent example is found in cognitive radio scenarios, where the so-called Boolean ICA relied on the Boolean mixing of OR, XOR, and other functions of binary signals [12]. It was also incorporated into the PU separation problem often encountered in cognitive radio networks for the sake of distinguishing and characterizing the activities of PUs in the context of collaborative spectrum sensing. Furthermore, the observations of the secondary users (SUs) were modeled as Boolean OR mixtures of the underlying binary PU sources. An iterative algorithm, called Binary ICA, was developed to determine the activities of the underlying latent signal sources, such as the PUs. It was demonstrated that given m monitors or SUs, the activities of up to $2^m - 1$ distinct PUs can be inferred. In Table 2, we summarize the basic characteristics and applications of unsupervised machine learning algorithms.
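The Boolean OR mixing model and the count of $2^m - 1$ can be illustrated with a short sketch. It only shows the forward (mixing) model and one intuition for the bound, namely that each PU may be characterized by the non-empty subset of monitors that can hear it; this reading is an assumption made for illustration and is not the inference algorithm of [12]. All names and values are hypothetical.

```python
from itertools import product

def or_mixture(mixing, sources):
    """Each monitor reports 1 if at least one active source it can hear is on."""
    return [int(any(g and s for g, s in zip(row, sources))) for row in mixing]

def distinguishable_source_patterns(m):
    """All non-zero binary columns of length m, i.e. the non-empty subsets of
    m monitors; there are 2**m - 1 of them."""
    return [col for col in product((0, 1), repeat=m) if any(col)]

# Hypothetical setup: m = 2 monitors (rows) and 3 PUs (columns).
# PU 0 is heard only by monitor 1, PU 1 only by monitor 0, PU 2 by both.
mixing = [[0, 1, 1],
          [1, 0, 1]]

print(len(distinguishable_source_patterns(2)))  # 3
print(or_mixture(mixing, [1, 0, 0]))            # [0, 1]
print(or_mixture(mixing, [0, 0, 1]))            # [1, 1]
```

Two PUs heard by exactly the same set of monitors would produce identical mixing columns, which suggests why no more than $2^m - 1$ sources can be told apart under this model.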
Reinforcement Learning in Wireless Communications
Partially Observable Markov Decision Process: Energy Harvesting
Models: Markov decision processes (MDPs) provide a mathematical framework for modeling decision making in specific situations, where the outcomes are partly random and partly under the control of a decision maker, as illustrated in Fig. 3a. At each time step, the process is in some state s, and the decision maker may opt for any of the legitimate actions a that are available in state s. The process responds at the next time step by randomly moving into a new state s', and giving the decision maker a corresponding reward $U_a(s)$. The probability that the process moves into its new state s' is influenced both by the specific action chosen and by the system's inherent transitions, formally described by the state transition probability $P_a(s'|s, a)$. Given s and a, the state transition probability is conditionally independent of all previous states and actions, that is, the state transitions of an MDP satisfy the fundamental Markov property. By contrast, a partially observable Markov decision process (POMDP) may be viewed as a generalization of an MDP, where the agent is unable to directly observe the underlying state transitions and hence only has partial knowledge, as shown in Fig. 3b. The agent has to keep track of the probability distribution of the legitimate states based on a set of observations, as well as of the observation probabilities and of the underlying MDP [3, p. 517].
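Keeping track of the probability distribution over the states is commonly done with a Bayesian belief update, sketched below under stated assumptions. The function name, the indexing convention of the transition and observation tables, and the numerical values are hypothetical and chosen only for illustration.

```python
def belief_update(belief, action, observation, transition, observation_prob):
    """One POMDP belief update.

    transition[a][s][s2]       = P(s2 | s, a)
    observation_prob[a][s2][o] = P(o | s2, a)
    """
    n = len(belief)
    # Prediction: push the current belief through the state transition model.
    predicted = [sum(belief[s] * transition[action][s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight each predicted state by the likelihood of the observation.
    unnormalized = [observation_prob[action][s2][observation] * predicted[s2]
                    for s2 in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical two-state example with a single action that leaves the state unchanged.
transition = [[[1.0, 0.0],
               [0.0, 1.0]]]
observation_prob = [[[0.8, 0.2],    # observations in state 0
                     [0.3, 0.7]]]   # observations in state 1

print(belief_update([0.5, 0.5], 0, 0, transition, observation_prob))
```

Starting from a uniform belief, seeing observation 0 shifts the belief toward state 0, since that observation is more likely there.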
Applications: The family of MDP/POMDP models constitutes ideal tools for supporting decision making in 5G networks, where the users may be regarded as agents and the network constitutes the environment. There are usually three steps associated with modeling a problem using an MDP. The first step is to specify the system's state space and the decision maker's action space, as well as to verify the Markov property. The second step is that of constructing the state transition probabilities $P_a(s'|s, a)$, formulated as the probability of traversing from state s to s' under action a. The last step is to quantify both the decision maker's immediate reward $U_a(s)$ and its long-term reward using Bellman's equation [13]. Then, a carefully constructed iterative algorithm may be conceived to identify the optimal action in each state.
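One common iterative algorithm of this kind is value iteration, which repeatedly applies Bellman's equation until the state values stop changing. The sketch below is an illustration under stated assumptions and not the algorithm of [13]: the discount factor `gamma`, the tolerance, the table layout, and the two-state toy model are all hypothetical.

```python
def value_iteration(transition, reward, gamma=0.9, tol=1e-9):
    """Value iteration for a finite MDP.

    transition[a][s][s2] = P(s2 | s, a), reward[a][s] = immediate reward U_a(s).
    Returns the state values and the greedy action for each state.
    """
    n_actions, n_states = len(transition), len(transition[0])
    values = [0.0] * n_states
    while True:
        # Bellman backup: immediate reward plus discounted expected future value.
        q = [[reward[a][s] + gamma * sum(transition[a][s][s2] * values[s2]
                                         for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        new_values = [max(row) for row in q]
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            policy = [row.index(max(row)) for row in q]
            return new_values, policy
        values = new_values

# Hypothetical toy model: action 0 = stay, action 1 = switch to the other state.
transition = [
    [[1.0, 0.0], [0.0, 1.0]],  # stay
    [[0.0, 1.0], [1.0, 0.0]],  # switch
]
# Staying in state 1 earns a reward of 1; everything else earns 0.
reward = [
    [0.0, 1.0],  # stay
    [0.0, 0.0],  # switch
]

values, policy = value_iteration(transition, reward)
print([round(v, 3) for v in values], policy)  # [9.0, 10.0] [1, 0]
```

In this toy model the best policy is to move to state 1 and remain there, which yields a discounted long-term reward of 1/(1 - 0.9) = 10 in state 1 and 0.9 × 10 = 9 in state 0.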
Classical applications found in the literature include the network selection/association problems of heterogeneous networks (HetNets), channel sensing and user access in cognitive radio networks, and so on. Furthermore, energy harvesting (EH) has also been extensively modeled using MDP/POMDP, where the limited battery and the time-variant channels are usually regarded as the environment, while the users' channel selection or battery utilization are usually considered as the actions. For instance, in [13] the transmission power control problems of EH systems were investigated using the POMDP model, where the state space was defined by including the battery state, the channel state, and the packet transmission/reception states, while an action by the node corresponded to sending a packet at a certain power level. The feedback messages implicitly provided the EH system with partial channel state information (CSI), which resulted in the corresponding POMDP formulation. Since finding exact solutions to the POMDP tends to be computationally intractable [13], a pair of computationally efficient suboptimal solutions, namely the maximum-likelihood heuristic policy and the voting heuristic policy, were explored.
Q-Learning: Femto/Small Cells
Models: Q-learning may be invoked to find an optimal action policy for any given (finite) Markov decision process, especially when the system model is unknown, as shown in Fig. 3c. It is a model-free reinforcement learning technique and as such it can be used in conjunction with MDP models. In such a case, the Q-learning model also comprises an agent, a set of states S, and a set of actions A per state. By executing an action in a specific state, the agent gleans a reward, and its goal is to maximize its accumulated reward. This accumulated reward is represented by a Q-function, where "Q" is initialized to an (arbitrary) fixed value. Then, "Q" is updated in an iterative manner after the agent carries out an action and observes the resultant reward as well as the associated new state at each time instant [3, p. 517].
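The iterative update of "Q" can be sketched with a tabular, epsilon-greedy learner. This is a generic textbook-style illustration under stated assumptions, not the scheme of any cited work: the learning rate `alpha`, the discount factor `gamma`, the exploration rate `epsilon`, the episode structure, and the toy environment `toy_step` are all hypothetical. The agent never reads the transition probabilities; it only observes the reward and the next state.

```python
import random

def q_learning(step, n_states, n_actions, episodes=500, steps_per_episode=20,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.

    `step(state, action)` must return (next_state, reward).
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]  # arbitrary fixed initial value
    for _ in range(episodes):
        state = rng.randrange(n_states)
        for _ in range(steps_per_episode):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)  # explore
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])  # exploit
            next_state, reward = step(state, action)
            # New Q = old value + learning rate * (learned target - old value).
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def toy_step(state, action):
    """Hypothetical environment: action 0 stays, action 1 switches between two
    states; staying in state 1 earns a reward of 1."""
    next_state = state if action == 0 else 1 - state
    reward = 1.0 if (state == 1 and action == 0) else 0.0
    return next_state, reward

q_table = q_learning(toy_step, n_states=2, n_actions=2)
greedy_policy = [row.index(max(row)) for row in q_table]
print(greedy_policy)
```

After training, the greedy policy read from the table should prefer switching out of state 0 and staying in state 1, mirroring what a model-based solution of the same toy problem would give.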
Applications: Q-learning has also been exten-
sively applied in heterogeneous networks, usual-
ly in conjunction with the aforementioned MDP
models. In [14] the authors presented a hetero-
geneous fully distributed multi-objective strategy
based on a reinforcement learning model con-
Table 2. Unsupervised machine learning algorithms.
Category | Learning techniques | Key characteristics | Application in 5G
Unsupervised learning | K-means clustering | K partition clustering; iterative updating algorithm | Heterogeneous networks [10]
Unsupervised learning | PCA | Orthogonal transformation | Smart grid [11]
Unsupervised learning | ICA | Reveal hidden independent factors | Spectrum learning in cognitive radio [12]
Figure 3. Illustration of reinforcement learning: a) Markov decision process; b) partially observed Markov decision process; c) Q-learning. [Each panel shows an agent exchanging actions and rewards with a system/environment consisting of states S1 to S5. In a), P(s'|s,a) is known and V(s) = max U(s)+P(s'|s,a)U(s'). In b), the true P(s'|s,a) is only partially observed as O(s'|s,a) and V(s) = max U(s)+O(s'|s,a)U(s'). In c), P(s'|s,a) is unknown, the agent observes, learns, and collects rewards, and Q = old value + learned value.]

References
Journal Article

Channel Estimation for Massive MIMO Using Gaussian-Mixture Bayesian Learning

TL;DR: This paper proposes estimation of not only the channel parameters of the desired links in a target cell, but also those of the interference links from adjacent cells, which achieves much better performance in terms of the channel estimation accuracy and achievable rates in the presence of pilot contamination.
Proceedings Article

A Neural Network Based Spectrum Prediction Scheme for Cognitive Radio

TL;DR: This work designs the spectrum predictor using a neural network model, the multilayer perceptron (MLP), which does not require a priori knowledge of the traffic characteristics of the licensed user systems and achieves a low probability of error in predicting the idle channels.
Journal Article

Cognitive Radio Network for the Smart Grid: Experimental System Architecture, Control Algorithms, Security, and Microgrid Testbed

TL;DR: The concept of independent component analysis in combination with the robust principal component analysis technique is employed to recover data from the simultaneous smart meter wireless transmissions in the presence of strong wideband interference.
Proceedings Article

Fuzzy-based Spectrum Handoff in Cognitive Radio Networks

TL;DR: The proposal in this paper is a fuzzy-based approach able to make effective spectrum handoff decisions in a context characterized by uncertain, incomplete and heterogeneous information.
Journal Article

Neural network-based learning schemes for cognitive radio systems

TL;DR: This paper introduces and evaluates learning schemes that are based on artificial neural networks and can be used for predicting the capabilities that can be achieved by a specific radio configuration, and presents and evaluates useful, indicative results from the benchmarking work.
Frequently Asked Questions (16)
Q1. What are the contributions in this paper?

Hence the authors briefly review the rudimentary concepts of machine learning and propose their employment in the compelling applications of 5G networks, including cognitive radios, massive MIMOs, femto/small cells, heterogeneous networks, smart grid, energy harvesting, device-to-device communications, and so on. 

Technologies: massive MIMO, femto/small cells and heterogeneous networks (HetNets), cloud radio access networks, cognitive radio, full duplex, energy harvesting, etc. 

Next-generation wireless networks are expected to support extremely high data rates and radically new applications, which require a new wireless radio technology paradigm. 

The parameters collected included both the path delay and the proportion of successful packet receptions, while the estimated parameter was the link’s successful transmission probability. 

Since finding exact solutions to the POMDP tends to be computationally intractable [13], a pair of computationally efficient suboptimal solutions, i.e. the maximum-likelihood heuristic policy and the voting heuristic policy, were explored. 

Both the PCA and ICA constitute powerful statistical signal processing techniques devised to recover statistically independent source signals from their linear mixtures. 

A neural network consists of a number of neurons and weighted connections among them, where the neurons can be regarded as variables and the weights can be viewed as parameters. 
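To make the "neurons as variables, weights as parameters" view concrete, the following is a minimal sketch of a feed-forward pass. The sigmoid activation, the bias terms, and the names `sigmoid` and `forward` are illustrative assumptions and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice; the text does not name one)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, layers):
    """Propagate `inputs` through a list of (weights, biases) layers.

    weights[j][i] is the parameter on the connection from input i to
    neuron j; the neuron activations are the variables being computed.
    """
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations
```

Calling `forward([1.0, 2.0], [([[0.0, 0.0]], [0.0])])` evaluates a single neuron whose weights are all zero, so its activation is the sigmoid of zero.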

The key idea of the proposed approach is to enable each user to forecast the future actions of its opponents based on public knowledge and to proceed by best responding to the predicted joint action profile using some bandit strategy [3, p. 517]. 

Unsupervised learning: K-means clustering. Key characteristics: K-partition clustering; iterative updating algorithm. Application in 5G: heterogeneous networks [10]. 
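The two traits listed for K-means, partitioning into K clusters and iterative updating, can be sketched as the standard assignment/update loop (Lloyd's algorithm). This is a generic illustration rather than the scheme of [10]; the function name `kmeans`, the caller-supplied initial centroids, and the squared Euclidean distance are assumptions.

```python
def kmeans(points, centroids, max_iters=100):
    """Cluster equal-length tuples; len(centroids) fixes K.

    `centroids` is an initial guess supplied by the caller (assumed here;
    practical systems often pick it at random).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2
                                  for x, c in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(dim) / len(members)
                                     for dim in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids, labels
```

The loop stops as soon as an update leaves every centroid unchanged, which is the usual convergence test for this iteration.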

Some simple examples of generative models that may be learned with the aid of Bayesian techniques include, but are not limited to, the Gaussian mixture model (GM), expectation maximization (EM), and hidden Markov models (HMM) [3, p. 445]. 

The challenge is that of assisting the radio in intelligent adaptive learning and decision making, so that the diverse requirements of next-generation wireless networks can be satisfied. 

Their goal is to assist the readers in refining the motivation, problem formulation, and methodology of powerful machine learning algorithms in the context of future networks in order to tap into hitherto unexplored applications and services. 

Given s and a, the state transition probability is conditionally independent of all previous states and actions, that is, the state transitions of an MDP process satisfy the fundamental Markov property. 
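Written out, with the assumed notation $s_t$ and $a_t$ for the state and action at step $t$ (the excerpt itself only uses $s$ and $a$), the Markov property reads:

```latex
\[
  \Pr\left(s_{t+1} \mid s_t, a_t, s_{t-1}, a_{t-1}, \dots, s_0, a_0\right)
  = \Pr\left(s_{t+1} \mid s_t, a_t\right)
\]
```

In other words, the current state and action carry all the information needed to predict the next state.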

It was demonstrated that the compensation strategy based on the reinforcement learning model attained an exceptional performance improvement. 

Yong Ren [SM’16] received his B.S., M.S., and Ph.D. degrees in electronic engineering from Harbin Institute of Technology, China, in 1984, 1987, and 1994, respectively. 

This distributed channel selection problem was in harmony with the typical MP-MAB settings, and thus it was modeled as an MP-MAB game.
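As a rough illustration of the bandit view of channel selection, the sketch below runs the standard single-player UCB1 rule over a set of channels. It is a much simpler relative of the multi-player (MP-MAB) game described above and is not the scheme from the paper; the names `ucb1_channel_selection` and `try_channel`, and the reward range of [0, 1], are assumptions.

```python
import math


def ucb1_channel_selection(try_channel, n_channels, n_rounds):
    """Single-player UCB1 over channels (illustrative, not MP-MAB).

    `try_channel(ch)` is a hypothetical callback that returns a reward
    in [0, 1], e.g. 1.0 for a successful transmission on channel `ch`.
    """
    counts = [0] * n_channels   # how often each channel was tried
    means = [0.0] * n_channels  # running average reward per channel
    for t in range(1, n_rounds + 1):
        if t <= n_channels:
            ch = t - 1  # try every channel once before comparing
        else:
            # Pick the channel with the highest optimistic estimate.
            ch = max(
                range(n_channels),
                key=lambda k: means[k]
                + math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        reward = try_channel(ch)
        counts[ch] += 1
        means[ch] += (reward - means[ch]) / counts[ch]
    return counts, means
```

The exploration bonus shrinks as a channel is sampled more often, so the rule concentrates on the channel with the best observed success rate while still revisiting the others occasionally. A multi-player variant would additionally have to handle collisions between users choosing the same channel.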