# Showing papers in "IEEE Transactions on Signal Processing in 2017"

••

TL;DR: The material covered includes tensor rank and rank decomposition; basic tensor factorization models and their relationships and properties; broad coverage of algorithms ranging from alternating optimization to stochastic gradient; statistical performance analysis; and applications ranging from source separation to collaborative filtering, mixture and topic modeling, classification, and multilinear subspace learning.

Abstract: Tensors or multiway arrays are functions of three or more indices $(i,j,k,\ldots)$ —similar to matrices (two-way arrays), which are functions of two indices $(r,c)$ for (row, column). Tensors have a rich history, stretching over almost a century, and touching upon numerous disciplines; but they have only recently become ubiquitous in signal and data analytics at the confluence of signal processing, statistics, data mining, and machine learning. This overview article aims to provide a good starting point for researchers and practitioners interested in learning about and working with tensors. As such, it focuses on fundamentals and motivation (using various application examples), aiming to strike an appropriate balance of breadth and depth that will enable someone having taken first graduate courses in matrix algebra and probability to get started doing research and/or developing tensor algorithms and software. Some background in applied optimization is useful but not strictly required. The material covered includes tensor rank and rank decomposition; basic tensor factorization models and their relationships and properties (including fairly good coverage of identifiability); broad coverage of algorithms ranging from alternating optimization to stochastic gradient; statistical performance analysis; and applications ranging from source separation to collaborative filtering, mixture and topic modeling, classification, and multilinear subspace learning.

1,284 citations

••

TL;DR: An overview of the majorization-minimization (MM) algorithmic framework, which can provide guidance in deriving problem-driven algorithms with low computational cost and is elaborated by a wide range of applications in signal processing, communications, and machine learning.

Abstract: This paper gives an overview of the majorization-minimization (MM) algorithmic framework, which can provide guidance in deriving problem-driven algorithms with low computational cost. A general introduction of MM is presented, including a description of the basic principle and its convergence results. The extensions, acceleration schemes, and connection to other algorithmic frameworks are also covered. To bridge the gap between theory and practice, upper bounds for a large number of basic functions, derived based on the Taylor expansion, convexity, and special inequalities, are provided as ingredients for constructing surrogate functions. With the prerequisites established, the way of applying MM to solving specific problems is elaborated by a wide range of applications in signal processing, communications, and machine learning.

1,073 citations
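
To make the MM principle in the abstract above concrete, here is a minimal sketch on a toy problem that is not taken from the paper: minimizing $f(x)=\sum_i |x-a_i|$. Each absolute value is majorized at the current iterate by the quadratic $(x-a_i)^2/(2c_i)+c_i/2$ with $c_i=|x_k-a_i|$, so every MM step reduces to a weighted average. The function name `mm_l1_location`, the sample data, and the safeguard `eps` are illustrative assumptions.

```python
def mm_l1_location(data, iters=100, eps=1e-12):
    """Minimize f(x) = sum_i |x - a_i| by majorization-minimization (toy example)."""
    x = sum(data) / len(data)  # start from the sample mean
    history = []               # objective value at each iterate
    for _ in range(iters):
        history.append(sum(abs(x - a) for a in data))
        # Quadratic majorizer of |x - a_i| at the current iterate:
        # (x - a_i)^2 / (2 c_i) + c_i / 2, with c_i = max(|x_k - a_i|, eps).
        # eps is an assumed safeguard against division by zero.
        weights = [1.0 / max(abs(x - a), eps) for a in data]
        # The surrogate is a weighted least-squares problem,
        # so its minimizer is a weighted average of the data.
        x = sum(w * a for w, a in zip(weights, data)) / sum(weights)
    return x, history


if __name__ == "__main__":
    estimate, objective = mm_l1_location([1.0, 2.0, 3.0, 4.0, 100.0])
    print(estimate, objective[0], objective[-1])
```

For this data the iterates move from the mean toward the sample median, and the recorded objective values are non-increasing up to the small slack introduced by `eps`, which is the monotonicity property that MM is built around.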

••

TL;DR: In this paper, the authors proposed an approach for channel estimation that is applicable for both flat and frequency-selective fading, based on the Bussgang decomposition that reformulates the nonlinear quantizer as a linear function with identical first and second-order statistics.

Abstract: This paper considers channel estimation and system performance for the uplink of a single-cell massive multiple-input multiple-output system. Each receiver antenna of the base station is assumed to be equipped with a pair of one-bit analog-to-digital converters to quantize the real and imaginary part of the received signal. We first propose an approach for channel estimation that is applicable for both flat and frequency-selective fading, based on the Bussgang decomposition that reformulates the nonlinear quantizer as a linear function with identical first- and second-order statistics. The resulting channel estimator outperforms previously proposed approaches across all SNRs. We then derive closed-form expressions for the achievable rate in flat fading channels assuming low SNR and a large number of users for the maximal ratio and zero forcing receivers that take channel estimation error due to both noise and one-bit quantization into account. The closed-form expressions, in turn, allow us to obtain insight into important system design issues such as optimal resource allocation, maximal sum spectral efficiency, overall energy efficiency, and number of antennas. Numerical results are presented to verify our analytical results and demonstrate the benefit of optimizing system performance accordingly.

452 citations

••

Tufts University

TL;DR: In this article, a tensor singular value decomposition (t-SVD) is proposed for 3D arrays with low tubal-rank, which is similar to the SVD for matrices.

Abstract: In this paper, we focus on the problem of completion of multidimensional arrays (also referred to as tensors), in particular three-dimensional (3-D) arrays, from limited sampling. Our approach is based on a recently proposed tensor algebraic framework where 3-D tensors are treated as linear operators over the set of 2-D tensors. In this framework, one can obtain a factorization for 3-D data, referred to as the tensor singular value decomposition (t-SVD), which is similar to the SVD for matrices. t-SVD results in a notion of rank referred to as the tubal-rank. Using this approach we consider the problem of sampling and recovery of 3-D arrays with low tubal-rank. We show that by solving a convex optimization problem, which minimizes a convex surrogate to the tubal-rank, one can guarantee exact recovery with high probability as long as number of samples is of the order $O(rnk \log (nk))$ , given a tensor of size $n\times n\times k$ with tubal-rank $r$ . The conditions under which this result holds are similar to the incoherence conditions for low-rank matrix completion under random sampling. The difference is that we define incoherence under the algebraic setup of t-SVD, which is different from the standard matrix incoherence conditions. We also compare the numerical performance of the proposed algorithm with some state-of-the-art approaches on real-world datasets.

451 citations

••

TL;DR: This paper proposes two novel neural-network architectures that decouple prediction errors across layers in the same way that the approximate message passing (AMP) algorithms decouple them across iterations: through Onsager correction.

Abstract: Deep learning has gained great popularity due to its widespread success on many inference problems. We consider the application of deep learning to the sparse linear inverse problem, where one seeks to recover a sparse signal from a few noisy linear measurements. In this paper, we propose two novel neural-network architectures that decouple prediction errors across layers in the same way that the approximate message passing (AMP) algorithms decouple them across iterations: through Onsager correction. First, we propose a “learned AMP” network that significantly improves upon Gregor and LeCun's “learned ISTA.” Second, inspired by the recently proposed “vector AMP” (VAMP) algorithm, we propose a “learned VAMP” network that offers increased robustness to deviations in the measurement matrix from i.i.d. Gaussian. In both cases, we jointly learn the linear transforms and scalar nonlinearities of the network. Interestingly, with i.i.d. signals, the linear transforms and scalar nonlinearities prescribed by the VAMP algorithm coincide with the values learned through back-propagation, leading to an intuitive interpretation of learned VAMP. Finally, we apply our methods to two problems from 5G wireless communications: compressive random access and massive-MIMO channel estimation.

395 citations

••

TL;DR: An efficient implementation of the generalized labeled multi-Bernoulli (GLMB) filter is proposed by combining the prediction and update into a single step and an efficient algorithm for truncating the GLMB filtering density based on Gibbs sampling is proposed.

Abstract: This paper proposes an efficient implementation of the generalized labeled multi-Bernoulli (GLMB) filter by combining the prediction and update into a single step. In contrast to an earlier implementation that involves separate truncations in the prediction and update steps, the proposed implementation requires only one truncation procedure for each iteration. Furthermore, we propose an efficient algorithm for truncating the GLMB filtering density based on Gibbs sampling. The resulting implementation has a linear complexity in the number of measurements and quadratic in the number of hypothesized objects.

310 citations

••

TL;DR: This work proposes a direct localization approach in which the position of a user is localized by jointly processing the observations obtained at distributed massive MIMO base stations, and leads to improved performance results compared to previous existing methods.

Abstract: Large-scale MIMO systems are well known for their advantages in communications, but they also have the potential for providing very accurate localization, thanks to their high angular resolution. A difficult problem arising indoors and outdoors is localizing users over multipath channels. Localization based on angle of arrival (AOA) generally involves a two-step procedure, where signals are first processed to obtain a user's AOA at different base stations, followed by triangulation to determine the user's position. In the presence of multipath, the performance of these methods is greatly degraded due to the inability to correctly detect and/or estimate the AOA of the line-of-sight (LOS) paths. To counter the limitations of this two-step procedure which is inherently suboptimal, we propose a direct localization approach in which the position of a user is localized by jointly processing the observations obtained at distributed massive MIMO base stations. Our approach is based on a novel compressed sensing framework that exploits channel properties to distinguish LOS from non-LOS signal paths, and leads to improved performance results compared to previous existing methods.

291 citations

••

TL;DR: A generalization of the short-time Fourier-based synchrosqueezing transform using a new local estimate of instantaneous frequency enables not only to achieve a highly concentrated time-frequency representation for a wide variety of amplitude- and frequency-modulated multicomponent signals but also to reconstruct their modes with a high accuracy.

Abstract: This paper puts forward a generalization of the short-time Fourier-based synchrosqueezing transform using a new local estimate of instantaneous frequency. Such a technique enables not only to achieve a highly concentrated time-frequency representation for a wide variety of amplitude- and frequency-modulated multicomponent signals but also to reconstruct their modes with a high accuracy. Numerical investigation on synthetic and gravitational-wave signals shows the efficiency of this new approach.

282 citations

••

TL;DR: A class of nonconvex penalty functions is proposed that maintains the convexity of the least squares cost function to be minimized and avoids the systematic underestimation characteristic of L1 norm regularization.

Abstract: Sparse approximate solutions to linear equations are classically obtained via L1 norm regularized least squares, but this method often underestimates the true solution. As an alternative to the L1 norm, this paper proposes a class of nonconvex penalty functions that maintain the convexity of the least squares cost function to be minimized, and avoids the systematic underestimation characteristic of L1 norm regularization. The proposed penalty function is a multivariate generalization of the minimax-concave penalty. It is defined in terms of a new multivariate generalization of the Huber function, which in turn is defined via infimal convolution. The proposed sparse-regularized least squares cost function can be minimized by proximal algorithms comprising simple computations.

276 citations
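
The underestimation mentioned in the abstract above can be seen in the scalar case. The sketch below compares soft thresholding, the proximal operator of the L1 norm, with firm thresholding, which is commonly associated with the scalar minimax-concave penalty. It is only a one-dimensional illustration, not the multivariate penalty the paper proposes; the function names and the thresholds `lam` and `mu` are assumptions.

```python
import math


def soft_threshold(y, lam):
    """Proximal operator of lam * |x|: shrinks every surviving value by lam."""
    return math.copysign(max(abs(y) - lam, 0.0), y)


def firm_threshold(y, lam, mu):
    """Firm thresholding with thresholds 0 < lam < mu (scalar illustration)."""
    a = abs(y)
    if a <= lam:
        return 0.0                      # small values are set to zero
    if a <= mu:
        # linear transition between the two thresholds
        return math.copysign(mu * (a - lam) / (mu - lam), y)
    return y                            # large values pass through unshrunk


if __name__ == "__main__":
    for y in (0.5, 1.5, 5.0):
        print(y, soft_threshold(y, 1.0), firm_threshold(y, 1.0, 2.0))
```

Soft thresholding biases large inputs toward zero by `lam`, whereas firm thresholding leaves inputs above `mu` untouched.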

••

TL;DR: The notion of a node-variant GF, which allows the simultaneous implementation of multiple (regular) GFs in different nodes of the graph, is introduced, which enables the design of more general operators without undermining the locality in implementation.

Abstract: We study the optimal design of graph filters (GFs) to implement arbitrary linear transformations between graph signals. GFs can be represented by matrix polynomials of the graph-shift operator (GSO). Since this operator captures the local structure of the graph, GFs naturally give rise to distributed linear network operators. In most setups, the GSO is given so that GF design consists fundamentally in choosing the (filter) coefficients of the matrix polynomial to resemble desired linear transformations. We determine spectral conditions under which a specific linear transformation can be implemented perfectly using GFs. For the cases where perfect implementation is infeasible, we address the optimization of the filter coefficients to approximate the desired transformation. Additionally, for settings where the GSO itself can be modified, we study its optimal design as well. After this, we introduce the notion of a node-variant GF, which allows the simultaneous implementation of multiple (regular) GFs in different nodes of the graph. This additional flexibility enables the design of more general operators without undermining the locality in implementation. Perfect and approximate designs are also studied for this new type of GFs. To showcase the relevance of the results in the context of distributed linear network operators, this paper closes with the application of our framework to two particular distributed problems: finite-time consensus and analog network coding.

261 citations
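
As a minimal sketch of the statement above that a graph filter is a matrix polynomial of the graph-shift operator, the code below applies $y=\sum_k h_k S^k x$ using repeated shifts, so each step only combines values from neighboring nodes. The helper names, the three-node path graph, and the coefficients are illustrative assumptions; the filter-design methods of the paper are not reproduced here.

```python
def shift(S, x):
    """Apply the graph-shift operator once: each node combines neighbor values."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]


def graph_filter(S, h, x):
    """Compute y = sum_k h[k] * S^k x with len(h) - 1 successive shifts."""
    y = [0.0] * len(x)
    z = list(x)                     # z holds S^k x, starting with k = 0
    for k, hk in enumerate(h):
        if k > 0:
            z = shift(S, z)         # one more local exchange
        y = [yi + hk * zi for yi, zi in zip(y, z)]
    return y


if __name__ == "__main__":
    # Adjacency matrix of a path graph 0 - 1 - 2, used as the shift operator.
    S = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
    print(graph_filter(S, [1.0, 2.0, 3.0], [1.0, 0.0, 0.0]))
```

A node-variant filter, as introduced in the abstract, would let each node use its own coefficient at every shift instead of the shared `h[k]`.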

••

TL;DR: The analysis leads to the conclusion that a bounded spectral norm of the network's Jacobian matrix in the neighbourhood of the training samples is crucial for a deep neural network of arbitrary depth and width to generalize well.

Abstract: The generalization error of deep neural networks via their classification margin is studied in this paper. Our approach is based on the Jacobian matrix of a deep neural network and can be applied to networks with arbitrary nonlinearities and pooling layers, and to networks with different architectures such as feed forward networks and residual networks. Our analysis leads to the conclusion that a bounded spectral norm of the network's Jacobian matrix in the neighbourhood of the training samples is crucial for a deep neural network of arbitrary depth and width to generalize well. This is a significant improvement over the current bounds in the literature, which imply that the generalization error grows with either the width or the depth of the network. Moreover, it shows that the recently proposed batch normalization and weight normalization reparametrizations enjoy good generalization properties, and leads to a novel network regularizer based on the network's Jacobian matrix. The analysis is supported with experimental results on the MNIST, CIFAR-10, LaRED, and ImageNet datasets.

••

TL;DR: This paper derives a simplified asymptotic mean square error (MSE) expression for the MUSIC algorithm applied to the coarray model, which is applicable even if the source number exceeds the sensor number, and shows that when there are more sources than the number of sensors, the MSE converges to a positive value instead of zero when the signal-to-noise ratio (SNR) goes to infinity.

Abstract: Sparse linear arrays, such as coprime arrays and nested arrays, have the attractive capability of providing enhanced degrees of freedom. By exploiting the coarray structure, an augmented sample covariance matrix can be constructed and MUltiple SIgnal Classification (MUSIC) can be applied to identify more sources than the number of sensors. While such a MUSIC algorithm works quite well, its performance has not been theoretically analyzed. In this paper, we derive a simplified asymptotic mean square error (MSE) expression for the MUSIC algorithm applied to the coarray model, which is applicable even if the source number exceeds the sensor number. We show that the directly augmented sample covariance matrix and the spatial smoothed sample covariance matrix yield the same asymptotic MSE for MUSIC. We also show that when there are more sources than the number of sensors, the MSE converges to a positive value instead of zero when the signal-to-noise ratio (SNR) goes to infinity. This finding explains the “saturation” behavior of the coarray-based MUSIC algorithms in the high-SNR region observed in previous studies. Finally, we derive the Cramer–Rao bound for sparse linear arrays, and conduct a numerical study of the statistical efficiency of the coarray-based estimator. Experimental results verify theoretical derivations and reveal the complex efficiency pattern of coarray-based MUSIC algorithms.
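
The enhanced degrees of freedom mentioned in the abstract above come from the difference coarray, the set of all pairwise differences of sensor positions. The sketch below builds a standard two-level nested array (positions in units of the base spacing) and lists its coarray lags. The function names and the choice of three sensors per level are illustrative assumptions; the MUSIC estimator and its MSE analysis are not implemented here.

```python
def nested_array(n1, n2):
    """Two-level nested array: dense part 1..n1, sparse part (n1+1)*m for m = 1..n2."""
    dense = list(range(1, n1 + 1))
    sparse = [(n1 + 1) * m for m in range(1, n2 + 1)]
    return dense + sparse


def difference_coarray(positions):
    """All distinct pairwise differences p - q of the sensor positions."""
    return sorted({p - q for p in positions for q in positions})


if __name__ == "__main__":
    sensors = nested_array(3, 3)
    lags = difference_coarray(sensors)
    print(sensors)
    print(len(sensors), "sensors give", len(lags), "distinct lags")
```

With three sensors per level, the six physical sensors produce a hole-free set of lags from -11 to 11, which is what allows coarray-based methods to resolve more sources than sensors.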

••

TL;DR: This paper proposes a definition of weak stationarity for random graph signals that takes into account the structure of the graph where the random process takes place, while inheriting many of the meaningful properties of the classical time domain definition.

Abstract: Stationarity is a cornerstone property that facilitates the analysis and processing of random signals in the time domain. Although time-varying signals are abundant in nature, in many practical scenarios, the information of interest resides in more irregular graph domains. This lack of regularity hampers the generalization of the classical notion of stationarity to graph signals. This paper proposes a definition of weak stationarity for random graph signals that takes into account the structure of the graph where the random process takes place, while inheriting many of the meaningful properties of the classical time domain definition. Provided that the topology of the graph can be described by a normal matrix, stationary graph processes can be modeled as the output of a linear graph filter applied to a white input. This is shown equivalent to requiring the correlation matrix to be diagonalized by the graph Fourier transform; a fact that is leveraged to define a notion of power spectral density (PSD). Properties of the graph PSD are analyzed and a number of methods for its estimation are proposed. This includes generalizations of nonparametric approaches such as periodograms, window-based average periodograms, and filter banks, as well as parametric approaches, using moving-average, autoregressive, and ARMA processes. Graph stationarity and graph PSD estimation are investigated numerically for synthetic and real-world graph signals.

••

TL;DR: This paper generalizes the traditional concept of wide sense stationarity to signals defined over the vertices of arbitrary weighted undirected graphs and shows that stationarity is expressed through the graph localization operator reminiscent of translation.

Abstract: Graphs are a central tool in machine learning and information processing as they allow to conveniently capture the structure of complex datasets. In this context, it is of high importance to develop flexible models of signals defined over graphs or networks. In this paper, we generalize the traditional concept of wide sense stationarity to signals defined over the vertices of arbitrary weighted undirected graphs. We show that stationarity is expressed through the graph localization operator reminiscent of translation. We prove that stationary graph signals are characterized by a well-defined power spectral density that can be efficiently estimated even for large graphs. We leverage this new concept to derive Wiener-type estimation procedures of noisy and partially observed signals and illustrate the performance of this new model for denoising and regression.

••

TL;DR: A family of autoregressive moving average (ARMA) recursions is designed, which are able to approximate any desired graph frequency response, and give exact solutions for specific graph signal denoising and interpolation problems.

Abstract: One of the cornerstones of the field of signal processing on graphs is the graph filter, a direct analog of the classical filter but intended for signals defined on graphs. This paper brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which are able to approximate any desired graph frequency response, and give exact solutions for specific graph signal denoising and interpolation problems. The philosophy to design the ARMA coefficients independently from the underlying graph renders the ARMA graph filters suitable in static and, particularly, time-varying settings. The latter occur when the graph signal and/or graph topology are changing over time. We show that in case of a time-varying graph signal, our approach extends naturally to a two-dimensional filter, operating concurrently in the graph and regular time domain. We also derive the graph filter behavior, as well as sufficient conditions for filter stability when the graph and signal are time varying. The analytical and numerical results presented in this paper illustrate that ARMA graph filters are practically appealing for static and time-varying settings, as predicted by theoretical derivations.

••

TL;DR: This paper investigates the application of simultaneous wireless information and power transfer (SWIPT) to cooperative non-orthogonal multiple access (NOMA) and proposes an iterative algorithm based on successive convex approximation (SCA) for complexity reduction, which can at least attain its stationary point efficiently.

Abstract: This paper investigates the application of simultaneous wireless information and power transfer (SWIPT) to cooperative non-orthogonal multiple access (NOMA). A new cooperative multiple-input single-output (MISO) SWIPT NOMA protocol is proposed, where a user with a strong channel condition acts as an energy-harvesting (EH) relay by adopting power splitting (PS) scheme to help a user with a poor channel condition. By jointly optimizing the PS ratio and the beamforming vectors, we aim at maximizing the data rate of the “strong user” while satisfying the QoS requirement of the “weak user”. To resolve the formulated nonconvex problem, the semidefinite relaxation (SDR) technique is applied to reformulate the original problem, by proving the rank-one optimality. And then an iterative algorithm based on successive convex approximation (SCA) is proposed for complexity reduction, which can at least attain its stationary point efficiently. In view of the potential application scenarios, e.g., Internet of Things (IoT), the single-input single-output (SISO) case is also studied. The formulated problem is proved to be strictly unimodal with respect to the PS ratio. Hence, a golden section search (GSS) based algorithm with closed-form solution at each step is proposed to find the unique global optimal solution. It is worth pointing out that the SCA method can also converge to the optimal solution in SISO cases. In the numerical simulation, the proposed algorithm is numerically shown to converge within a few iterations, and the SWIPT-aided NOMA protocol outperforms the existing transmission protocols.
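
The golden section search mentioned in the abstract above can be sketched generically for any strictly unimodal function on an interval. The code below is not the paper's closed-form per-step update; the function name `golden_section_max`, the tolerance, and the toy objective (a stand-in for a rate that is unimodal in the power-splitting ratio) are assumptions made for illustration.

```python
import math


def golden_section_max(f, lo, hi, tol=1e-8):
    """Locate the maximizer of a strictly unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2        # 1 / golden ratio, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)               # left interior point
    d = a + inv_phi * (b - a)               # right interior point
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # The maximum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
        else:
            # The maximum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
    return (a + b) / 2


if __name__ == "__main__":
    # Hypothetical unimodal objective with its peak at 0.3.
    def toy_objective(rho):
        return -(rho - 0.3) ** 2

    print(golden_section_max(toy_objective, 0.0, 1.0))
```

Each iteration reuses one of the two previous function values, so only one new evaluation is needed per step while the bracket shrinks by a constant factor.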

••

TL;DR: The proposed monotonic framework is used to shed light on the ultimate performance of wireless networks in terms of EE and also to benchmark the performance of the lower-complexity framework based on sequential programming.

Abstract: The characterization of the global maximum of energy efficiency (EE) problems in wireless networks is a challenging problem due to their nonconvex nature in interference channels. The aim of this paper is to develop a new and general framework to achieve globally optimal solutions. First, the hidden monotonic structure of the most common EE maximization problems is exploited jointly with fractional programming theory to obtain globally optimal solutions with exponential complexity in the number of network links. To overcome the high complexity, we also propose a framework to compute suboptimal power control strategies with affordable complexity. This is achieved by merging fractional programming and sequential optimization. The proposed monotonic framework is used to shed light on the ultimate performance of wireless networks in terms of EE and also to benchmark the performance of the lower-complexity framework based on sequential programming. Numerical evidence is provided to show that the sequential fractional programming framework achieves global optimality in several practical communication scenarios.

••

TL;DR: The theoretical performance comparison between NOMA and conventional OMA systems is investigated, from an optimization point of view, and a closed-form expression for the optimum sum rate of NOMA systems is derived.

Abstract: Existing work regarding the performance comparison between nonorthogonal multiple access (NOMA) and orthogonal multiple access (OMA) can be generally divided into two categories. The work in the first category aims to develop analytical results for the comparison, often with fixed system parameters. The work in the second category aims to propose efficient algorithms for optimizing these parameters, and compares NOMA with OMA by computer simulations. However, when these parameters are optimized, the theoretical superiority of NOMA over OMA is still not clear. Therefore, in this paper, the theoretical performance comparison between NOMA and conventional OMA systems is investigated, from an optimization point of view. First, sum rate maximizing problems considering user fairness in both NOMA and various OMA systems are formulated. Then, by using the method of power splitting, a closed-form expression for the optimum sum rate of NOMA systems is derived. Moreover, the fact that NOMA can always outperform any conventional OMA systems, when both are equipped with the optimum resource allocation policies, is validated with rigorous mathematical proofs. Finally, computer simulations are conducted to validate the correctness of the analytical results.

••

TL;DR: In this article, a general algorithmic framework for the minimization of a nonconvex smooth function subject to nonconvex smooth constraints is proposed, and the algorithm solves a sequence of (separable) strongly convex problems and maintains feasibility at each iteration.

Abstract: In this two-part paper, we propose a general algorithmic framework for the minimization of a nonconvex smooth function subject to nonconvex smooth constraints, and also consider extensions to some structured, nonsmooth problems. The algorithm solves a sequence of (separable) strongly convex problems and maintains feasibility at each iteration. Convergence to a stationary solution of the original nonconvex optimization is established. Our framework is very general and flexible and unifies several existing successive convex approximation (SCA)-based algorithms. More importantly, and differently from current SCA approaches, it naturally leads to distributed and parallelizable implementations for a large class of nonconvex problems. This Part I is devoted to the description of the framework in its generality. In Part II, we customize our general methods to several (multiagent) optimization problems in communications, networking, and machine learning; the result is a new class of centralized and distributed algorithms that compare favorably to existing ad-hoc (centralized) schemes.

••

TL;DR: An augmented nested array concept is proposed by splitting the dense subarray of the nested array into several parts, which can be rearranged at the two sides of the sparse subarray; the resulting array possesses higher degree-of-freedom capacity and less mutual coupling.

Abstract: Recently, nonuniform linear arrays (e.g., coprime/nested arrays) have attracted great attention from researchers in the array signal processing field due to their ability to generate virtual difference coarrays. In the array design, a critical problem is where to place the sensors for optimal performance, aiming simultaneously for a maximum degree-of-freedom capacity and a minimum mutual coupling ratio. An augmented nested array concept is proposed by splitting the dense subarray of the nested array into several parts, which can be rearranged at the two sides of the sparse subarray of the nested array. Specifically, four closed-form expressions for the physical sensor locations and the virtual sensor locations are derived for any given element number. Compared to the (super) nested array having the same element number, the newly formed augmented nested array possesses higher degree-of-freedom capacity and less mutual coupling. In the end, numerical simulation results validate the effectiveness of the proposed arrays.

••

TL;DR: In this article, a variational nonlinear chirp mode decomposition (VNCMD) is proposed to analyze wide-band NCSs, which can be viewed as a time-frequency filter bank, which concurrently extracts all the signal modes.

Abstract: Variational mode decomposition (VMD), a recently introduced method for adaptive data analysis, has aroused much attention in various fields. However, the VMD is formulated based on the assumption of narrow-band property of the signal model. To analyze wide-band nonlinear chirp signals (NCSs), we present an alternative method called variational nonlinear chirp mode decomposition (VNCMD). The VNCMD is developed from the fact that a wideband NCS can be transformed to a narrow-band signal by using demodulation techniques. Our decomposition problem is, thus, formulated as an optimal demodulation problem, which is efficiently solved by the alternating direction method of multipliers. Our method can be viewed as a time–frequency filter bank, which concurrently extracts all the signal modes. Some simulated and real data examples are provided showing the effectiveness of the VNCMD in analyzing NCSs containing close or even crossed modes.

••

TL;DR: In this article, a modified online saddle-point (MOSP) scheme is developed, and proved to simultaneously yield sublinear dynamic regret and fit, provided that the accumulated variations of per-slot minimizers and constraints are sublinearly growing with time.

Abstract: Existing approaches to online convex optimization make sequential one-slot-ahead decisions, which lead to (possibly adversarial) losses that drive subsequent decision iterates. Their performance is evaluated by the so-called regret that measures the difference of losses between the online solution and the best yet fixed overall solution in hindsight. The present paper deals with online convex optimization involving adversarial loss functions and adversarial constraints, where the constraints are revealed after making decisions, and can be tolerable to instantaneous violations but must be satisfied in the long term. Performance of an online algorithm in this setting is assessed by the difference of its losses relative to the best dynamic solution with one-slot-ahead information of the loss function and the constraint (that is here termed dynamic regret); and the accumulated amount of constraint violations (that is here termed dynamic fit). In this context, a modified online saddle-point (MOSP) scheme is developed, and proved to simultaneously yield sublinear dynamic regret and fit, provided that the accumulated variations of per-slot minimizers and constraints are sublinearly growing with time. MOSP is also applied to the dynamic network resource allocation task, and it is compared with the well-known stochastic dual gradient method. Numerical experiments demonstrate the performance gain of MOSP relative to the state of the art.

••

TL;DR: This paper presents a computationally tractable algorithm for estimating a graph that structures time series data; the resulting graph is directed and weighted, possibly capturing causal relations, not just reciprocal correlations as in many existing approaches in the literature.

Abstract: Many applications collect a large number of time series, for example, the financial data of companies quoted in a stock exchange, the health care data of all patients that visit the emergency room of a hospital, or the temperature sequences continuously measured by weather stations across the US. These data are often referred to as unstructured. The first task in their analysis is to derive a low-dimensional representation, a graph or discrete manifold, that describes well the interrelations among the time series and their intrarelations across time. This paper presents a computationally tractable algorithm for estimating this graph that structures the data. The resulting graph is directed and weighted, possibly capturing causal relations, not just reciprocal correlations as in many existing approaches in the literature. A convergence analysis is carried out. The algorithm is demonstrated on random graph datasets and real network time series datasets, and its performance is compared to that of related methods. The adjacency matrices estimated with the new method are close to the true graph in the simulated data and consistent with prior physical knowledge in the real dataset tested.

••

TL;DR: This paper advocates kernel regression as a framework generalizing popular SPoG modeling and reconstruction and expanding their capabilities, capitalizes on the so-called representer theorem to devise simpler versions of existing Tikhonov regularized estimators, and offers a novel probabilistic interpretation of kernel methods on graphs based on graphical models.

Abstract: A number of applications in engineering, social sciences, physics, and biology involve inference over networks. In this context, graph signals are widely encountered as descriptors of vertex attributes or features in graph-structured data. Estimating such signals in all vertices given noisy observations of their values on a subset of vertices has been extensively analyzed in the literature of signal processing on graphs (SPoG). This paper advocates kernel regression as a framework generalizing popular SPoG modeling and reconstruction and expanding their capabilities. Formulating signal reconstruction as a regression task on reproducing kernel Hilbert spaces of graph signals permeates benefits from statistical learning, offers fresh insights, and allows for estimators that leverage richer forms of prior information than existing alternatives. A number of SPoG notions such as bandlimitedness, graph filters, and the graph Fourier transform are naturally accommodated in the kernel framework. Additionally, this paper capitalizes on the so-called representer theorem to devise simpler versions of existing Tikhonov regularized estimators, and offers a novel probabilistic interpretation of kernel methods on graphs based on graphical models. Motivated by the challenges of selecting the bandwidth parameter in SPoG estimators or the kernel map in kernel-based methods, this paper further proposes two multikernel approaches with complementary strengths. Whereas the first enables estimation of the unknown bandwidth of bandlimited signals, the second allows for efficient graph filter selection. Numerical tests with synthetic as well as real data demonstrate the merits of the proposed methods relative to state-of-the-art alternatives.

••

TL;DR: Simulation results illustrate that the proposed methodologies can outperform some counterparts, providing sequences with good autocorrelation features, especially in the discrete phase/binary case.

Abstract: This paper focuses on the design of phase sequences with good (aperiodic) autocorrelation properties in terms of peak sidelobe level and integrated sidelobe level. The problem is formulated as a biobjective Pareto optimization forcing either a continuous or a discrete phase constraint at the design stage. An iterative procedure based on the coordinate descent method is introduced to deal with the resulting optimization problems, which are nonconvex and NP-hard in general. Each iteration of the devised method requires the solution of a nonconvex min–max problem. It is handled through a novel bisection method for the continuous phase constraint and an FFT-based method for the discrete phase constraint. Additionally, a heuristic approach to initialize the procedures employing the $l_p$-norm minimization technique is proposed. Simulation results illustrate that the proposed methodologies can outperform some counterparts, providing sequences with good autocorrelation features, especially in the discrete phase/binary case.
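
The two figures of merit named in this abstract can be computed directly from a sequence. The sketch below evaluates them for the Barker-13 code, chosen here only as a familiar binary example. It assumes the one-sided convention for the integrated sidelobe level (positive lags only); some references sum over both sides, which doubles the value. The function names are illustrative, and the paper's design algorithm is not reproduced.

```python
def aperiodic_autocorrelation(x):
    """r[k] = sum_n x[n + k] * conj(x[n]) for lags k = 0, ..., N - 1."""
    n = len(x)
    return [sum(x[i + k] * complex(x[i]).conjugate() for i in range(n - k))
            for k in range(n)]

def psl_isl(x):
    """Peak sidelobe level and one-sided integrated sidelobe level."""
    r = aperiodic_autocorrelation(x)
    sidelobes = [abs(v) for v in r[1:]]    # exclude the zero-lag mainlobe
    return max(sidelobes), sum(s * s for s in sidelobes)

barker13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]
psl, isl = psl_isl(barker13)               # PSL = 1, one-sided ISL = 6
```

A design method such as the one described would search over phase sequences to trade these two quantities against each other; this sketch only scores a given sequence.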

••

TL;DR: An algorithm for tracking an unknown number of targets based on measurements provided by multiple sensors that can outperform multisensor versions of the probability hypothesis density (PHD) filter, the cardinalized PHD filter, and the multi-Bernoulli filter.

Abstract: We propose an algorithm for tracking an unknown number of targets based on measurements provided by multiple sensors. Our algorithm achieves low computational complexity and excellent scalability by running belief propagation on a suitably devised factor graph. A redundant formulation of data association uncertainty and the use of “augmented target states” including binary target indicators make it possible to exploit statistical independencies for a drastic reduction of complexity. An increase in the number of targets, sensors, or measurements leads to additional variable nodes in the factor graph but not to higher dimensions of the messages. As a consequence, the complexity of our method scales only quadratically in the number of targets, linearly in the number of sensors, and linearly in the number of measurements per sensor. The performance of the method compares well with that of previously proposed methods, including methods with a less favorable scaling behavior. In particular, our method can outperform multisensor versions of the probability hypothesis density (PHD) filter, the cardinalized PHD filter, and the multi-Bernoulli filter.

••

TL;DR: This paper proposes the network Newton (NN) method as a distributed algorithm that incorporates second-order information via distributed implementation of approximations of a suitably chosen Newton step and proves convergence to a point close to the optimal argument at a rate that is at least linear.

Abstract: We study the problem of minimizing a sum of convex objective functions, where the components of the objective are available at different nodes of a network and nodes are allowed to only communicate with their neighbors. The use of distributed gradient methods is a common approach to solve this problem. Their popularity notwithstanding, these methods exhibit slow convergence and a consequent large number of communications between nodes to approach the optimal argument because they rely on first-order information only. This paper proposes the network Newton (NN) method as a distributed algorithm that incorporates second-order information. This is done via distributed implementation of approximations of a suitably chosen Newton step. The approximations are obtained by truncation of the Newton step's Taylor expansion. This leads to a family of methods defined by the number $K$ of Taylor series terms kept in the approximation. When keeping $K$ terms of the Taylor series, the method is called NN-$K$ and can be implemented through the aggregation of information in $K$-hop neighborhoods. Convergence to a point close to the optimal argument at a rate that is at least linear is proven and the existence of a tradeoff between convergence time and the distance to the optimal argument is shown. The numerical experiments corroborate reductions in the number of iterations and the communication cost that are necessary to achieve convergence relative to first-order alternatives.
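
The truncation idea can be sketched generically. The example assumes the Hessian is split as $H = D - B$, with $D$ diagonal and $B$ nonzero only between neighboring nodes, and approximates the Newton direction $-H^{-1}g$ with a truncated series. The splitting, the toy matrices, the function names, and the way $K$ is counted here are assumptions of this sketch and may differ from the paper's definitions.

```python
def matvec(m, v):
    """Dense matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def truncated_newton_direction(diag, off, grad, num_terms):
    """Approximate -(D - B)^{-1} g with num_terms series terms beyond the zeroth.

    diag holds the diagonal of D; off is B. Each loop pass multiplies by B,
    which needs one more exchange with neighbors, so after K passes the
    direction at a node depends on K-hop information.
    """
    d = [-g / dii for g, dii in zip(grad, diag)]            # zeroth-order term
    for _ in range(num_terms):
        bd = matvec(off, d)
        d = [(b - g) / dii for b, g, dii in zip(bd, grad, diag)]
    return d

# Toy 3-node path graph: H = D - B is tridiagonal and diagonally dominant.
diag = [4.0, 4.0, 4.0]
off = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
grad = [1.0, 2.0, 3.0]

def residual_norm(d):
    """Max-norm of H d + g, which is zero for the exact Newton direction."""
    bd = matvec(off, d)
    return max(abs(dii * di - b + g)
               for dii, di, b, g in zip(diag, d, bd, grad))

errors = [residual_norm(truncated_newton_direction(diag, off, grad, k))
          for k in (0, 1, 2, 30)]
```

Keeping more terms shrinks the residual but costs more communication rounds, which mirrors the tradeoff the abstract describes.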

••

TL;DR: Under the constant modulus constraint, two algorithms are proposed to design the MIMO radar probing waveform directly: a fast-converging alternating direction method of multipliers (ADMM) algorithm for the squared-error criterion, and a double-ADMM algorithm for the absolute-error ($l_1$-norm) criterion.

Abstract: A multiple-input multiple-output radar has great flexibility in designing the transmit beampattern by selecting the probing waveform. The idea of current transmit beampattern design is to approximate the desired transmit beampattern and minimize the cross-correlation sidelobes. In this paper, under the constant modulus constraint, two algorithms are proposed to design the probing waveform directly. In the first algorithm, the optimization criterion is minimizing the squared error between the designed beampattern and the given beampattern. Since the objective function is a nonconvex fourth-order polynomial and the constant modulus constraint can be regarded as many nonconvex quadratic equality constraints, an efficient alternating direction method of multipliers (ADMM) algorithm, whose convergence speed is very fast, is proposed to solve it. In the second algorithm, the criterion is minimizing the absolute error between the designed beampattern and the given beampattern. This nonconvex problem can be formulated as an $l_1$-norm problem, which can be solved through a double-ADMM algorithm. Finally, we assess the performance of the two proposed algorithms via numerical results.

••

TL;DR: This paper investigates joint RF-baseband hybrid precoding for the downlink of multiuser multiantenna mmWave systems with a limited number of RF chains and proposes efficient methods to address the resulting joint codeword selection and precoder design (JWSPD) problems, jointly optimizing the RF and baseband precoders under spectral efficiency and energy efficiency measures.

Abstract: In millimeter-wave (mmWave) systems, antenna architecture limitations make it difficult to apply conventional fully digital precoding techniques but call for low-cost analog radio frequency (RF) and digital baseband hybrid precoding methods. This paper investigates joint RF-baseband hybrid precoding for the downlink of multiuser multiantenna mmWave systems with a limited number of RF chains. Two performance measures, maximizing the spectral efficiency and the energy efficiency of the system, are considered. We propose a codebook-based RF precoding design and obtain the channel state information via a beam sweep procedure. Via the codebook-based design, the original system is transformed into a virtual multiuser downlink system with the RF chain constraint. Consequently, we are able to simplify the complicated hybrid precoding optimization problems to joint codeword selection and precoder design (JWSPD) problems. Then, we propose efficient methods to address the JWSPD problems and jointly optimize the RF and baseband precoders under the two performance measures. Finally, extensive numerical results are provided to validate the effectiveness of the proposed hybrid precoders.

••

TL;DR: This work proposes to design hybrid RF and baseband precoders/combiners for multistream transmission in massive MIMO systems by directly decomposing a predesigned, large-dimensional unconstrained digital precoder/combiner.

Abstract: For practical implementation of massive multiple-input multiple-output (MIMO) systems, the hybrid processing (precoding/combining) structure is promising for reducing the high implementation cost and power consumption incurred by the large number of radio frequency (RF) chains of the traditional processing structure. The hybrid processing is realized through low-dimensional digital baseband processing combined with analog RF processing enabled by phase shifters. We propose to design hybrid RF and baseband precoders/combiners for multistream transmission in massive MIMO systems by directly decomposing the predesigned unconstrained digital precoder/combiner of a large dimension. This approach is fundamental and general in the sense that any conventional full RF chain precoding solution of a MIMO system configuration can be converted to a hybrid processing structure by matrix decomposition. The constant amplitude constraint of analog RF processing renders the matrix decomposition problem nonconvex. Based on an alternate optimization technique, the nonconvex matrix decomposition problem can be decoupled into a series of convex subproblems and effectively solved by restricting the phase increment of each entry in the RF precoder/combiner to a small vicinity of its preceding iterate. A singular value decomposition-based technique is proposed to secure an initial point sufficiently close to the global solution of the original nonconvex problem. Through simulation, the convergence of the alternate optimization for such a matrix decomposition-based hybrid processing (MD-HP) scheme is examined, and the performance of the MD-HP scheme is demonstrated to be near-optimal.