Econometric Analysis

We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

Random graphs

Bootstrap methods have found many applications in engineering fields, including artificial neural networks, biomedical engineering, environmental engineering, image processing, and radar and sonar signal processing. Basic concepts of the bootstrap are summarized in each section as a step-by-step algorithm for ease of implementation. Most of the applications are taken from the signal processing literature. The principles of the bootstrap are introduced in Chapter 2. Both the nonparametric and parametric bootstrap procedures are explained. Babu and Singh (1984) have demonstrated that, in general, these two procedures behave similarly for pivotal (Studentized) statistics. That the bootstrap does not solve every problem has long been known to the statistics community; however, this fact is rarely touched on in manuscripts meant for practitioners. It was first observed by Babu (1984) that the bootstrap does not work in the infinite variance case. Bootstrap Techniques for Signal Processing explains the limitations of the bootstrap method with an example. I especially liked the presentation style. The basic results are stated without proofs; however, the application of each result is presented as a simple step-by-step process, easy for nonstatisticians to follow. Bootstrap procedures such as the moving block bootstrap for dependent data, along with applications to autoregressive models and to estimation of power spectral density, are also presented in Chapter 2. Signal detection in the presence of noise is generally formulated as a hypothesis testing problem. Chapter 3 introduces principles of bootstrap hypothesis testing. The topics are introduced with interesting real-life examples. Flow charts, typical in engineering literature, are used to aid explanations of the bootstrap hypothesis testing procedures. The bootstrap leads to second-order correction due to pivoting; this improvement in the results due to pivoting is also explained.
In the second part of Chapter 3, signal processing is treated as a regression problem. The performance of the bootstrap for matched filters as well as constant false-alarm rate matched filters is also illustrated. Chapters 2 and 3 focus on estimation problems. Chapter 4 introduces bootstrap methods used in model selection. Due to the inherent structure of the subject matter, this chapter may be difficult for nonstatisticians to follow. Chapter 5 is the most impressive chapter in the book, especially from the standpoint of statisticians. It provides real data bootstrap applications to illustrate the theory covered in the earlier chapters. These include applications to optimal sensor placement for knock detection and land-mine detection. The authors also provide a MATLAB toolbox comprising frequently used routines. Overall, this is a very useful handbook for engineers, especially those working in signal processing.
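The step-by-step nonparametric bootstrap the review refers to can be illustrated with a minimal sketch. This is not code from the book or its MATLAB toolbox; the function name `bootstrap_ci`, the sample values, and the choice of a simple percentile interval (rather than a Studentized one) are assumptions made for illustration only.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample).

    Hypothetical helper: a plain percentile interval, not the pivotal
    (Studentized) variant discussed in the review.
    """
    rng = random.Random(seed)
    n = len(sample)
    # Steps 1 and 2: resample with replacement, recompute the statistic each time.
    replicates = sorted(
        stat(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Step 3: read off empirical quantiles of the bootstrap replicates.
    lower = replicates[int((alpha / 2) * n_resamples)]
    upper = replicates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, used only to exercise the function.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
low, high = bootstrap_ci(data)
print(f"95% percentile interval for the mean: [{low:.2f}, {high:.2f}]")
```

As the review notes, this kind of resampling is unreliable when the underlying distribution has infinite variance, so a sketch like this should only be trusted for well-behaved data.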

Probability and Random Processes

Probability and Measure

(1987). Applied Probability and Queues. Journal of the Operational Research Society: Vol. 38, No. 11, pp. 1095-1096.

https://link.springer.com/content/pdf/10.1057%2Fjors.1987.184.pdf

Applied Probability and Queues

We consider the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown. We provide two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), and derive corresponding regret bounds. Specifically, the bounds hold when the expected reward function belongs to the reproducing kernel Hilbert space (RKHS) that naturally corresponds to a Gaussian process kernel used as input by the algorithms. Along the way, we derive a new self-normalized concentration inequality for vector-valued martingales of arbitrary, possibly infinite, dimension. Finally, experimental evaluation and comparisons to existing algorithms on synthetic and real-world environments are carried out that highlight the favorable gains of the proposed strategies in many cases.

/pdf/on-kernelized-multi-armed-bandits-4uzpynqn1l.pdf

On Kernelized Multi-armed Bandits.
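To make the idea of a GP-based upper-confidence rule concrete, here is a rough sketch of a generic GP-UCB-style loop on a finite grid of arms. It is not the IGP-UCB algorithm from the paper: the constant exploration weight `beta`, the RBF kernel and its length scale, the grid, and the test function are all assumptions chosen for illustration, and the linear solver is a plain Gaussian elimination to keep the example dependency-free.

```python
import math
import random


def rbf(x, y, length_scale=0.2):
    """Squared-exponential kernel (an assumed choice of GP kernel)."""
    return math.exp(-((x - y) ** 2) / (2 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def posterior(xs, ys, x_star, noise=0.01):
    """GP posterior mean and variance at x_star given observations (xs, ys)."""
    if not xs:
        return 0.0, rbf(x_star, x_star)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star) for a in xs]
    mean = sum(k * w for k, w in zip(k_star, solve(K, ys)))
    var = rbf(x_star, x_star) - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)


def gp_ucb(f, arms, rounds=15, beta=2.0, noise_sd=0.1, seed=0):
    """Play the arm maximizing mean + beta * std; beta is a fixed assumed constant."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(rounds):
        def ucb(x):
            m, v = posterior(xs, ys, x, noise=noise_sd ** 2)
            return m + beta * math.sqrt(v)
        x_t = max(arms, key=ucb)
        xs.append(x_t)
        ys.append(f(x_t) + rng.gauss(0.0, noise_sd))  # noisy reward observation
    return xs


arms = [i / 10 for i in range(11)]  # hypothetical discretization of [0, 1]
played = gp_ucb(lambda x: -(x - 0.7) ** 2, arms, rounds=15)
print(played)
```

The paper's algorithms use a time-varying confidence width derived from its concentration inequality; replacing the constant `beta` with such a schedule is where a faithful implementation would differ from this sketch.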

We consider stochastic multi-armed bandit problems with complex actions over a set of basic arms, where the decision maker plays a complex action rather than a basic arm in each round. The reward of the complex action is some function of the basic arms' rewards, and the feedback observed may not necessarily be the reward per arm. For instance, when the complex actions are subsets of the arms, we may only observe the maximum reward over the chosen subset. Thus, feedback across complex actions may be coupled due to the nature of the reward function. We prove a frequentist regret bound for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them. The bound holds for discretely-supported priors over the parameter space without additional structural properties such as closed-form posteriors, conjugate prior structure or independence across arms. The regret bound scales logarithmically with time but, more importantly, with an improved constant that non-trivially captures the coupling across complex actions due to the structure of the rewards. As applications, we derive improved regret bounds for classes of complex bandit problems involving selecting subsets of arms, including the first nontrivial regret bounds for nonlinear MAX reward feedback from subsets. Using particle filters to compute posterior distributions that lack an explicit closed form, we present numerical results for the performance of Thompson sampling for subset-selection and job scheduling problems.

/pdf/thompson-sampling-for-complex-online-problems-19fzerti54.pdf

Thompson Sampling for Complex Online Problems
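The setting in the abstract above (a discretely supported prior, subset actions, and MAX-only feedback) can be sketched in a few lines. This is an illustrative toy under assumed details, not the paper's experimental setup: the Bernoulli reward model, the candidate grid `{0.2, 0.5, 0.8}`, the subset size, and the function name `thompson_max_feedback` are all hypothetical.

```python
import itertools
import random


def thompson_max_feedback(true_theta, candidates, subset_size=2, rounds=300, seed=0):
    """Thompson sampling over a finite parameter set with MAX feedback from subsets."""
    rng = random.Random(seed)
    actions = list(itertools.combinations(range(len(true_theta)), subset_size))
    weights = [1.0 / len(candidates)] * len(candidates)  # uniform discrete prior

    def p_max_is_one(theta, action):
        # MAX of independent Bernoulli rewards is 1 unless every arm returns 0.
        p_all_zero = 1.0
        for i in action:
            p_all_zero *= 1.0 - theta[i]
        return 1.0 - p_all_zero

    counts = {a: 0 for a in actions}
    for _ in range(rounds):
        # Sample a parameter vector from the current posterior.
        theta = rng.choices(candidates, weights=weights, k=1)[0]
        # Play the subset that is optimal for the sampled parameter.
        action = max(actions, key=lambda a: p_max_is_one(theta, a))
        counts[action] += 1
        # Observe only the maximum reward over the chosen subset.
        y = 1 if any(rng.random() < true_theta[i] for i in action) else 0
        # Bayes update over the finite support; no conjugacy is needed.
        likelihoods = [
            p_max_is_one(c, action) if y else 1.0 - p_max_is_one(c, action)
            for c in candidates
        ]
        weights = [w * l for w, l in zip(weights, likelihoods)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return counts, weights


# Assumed finite support: every vector with entries in {0.2, 0.5, 0.8}.
candidates = list(itertools.product([0.2, 0.5, 0.8], repeat=3))
counts, weights = thompson_max_feedback((0.2, 0.5, 0.8), candidates)
print(counts)
```

Because the posterior is a weight vector over a finite set, the update is exact here; the particle filters mentioned in the abstract play the analogous role when the parameter space is too large to enumerate.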

We consider reinforcement learning in parameterized Markov Decision Processes (MDPs), where the parameterization may induce correlation across transition probabilities or rewards. Consequently, observing a particular state transition might yield useful information about other, unobserved, parts of the MDP. We present a version of Thompson sampling for parameterized reinforcement learning problems, and derive a frequentist regret bound for priors over general parameter spaces. The result shows that the number of instants where suboptimal actions are chosen scales logarithmically with time, with high probability. It holds for prior distributions that put significant probability near the true model, without any additional, specific closed-form structure such as conjugate or product-form priors. The constant factor in the logarithmic scaling encodes the information complexity of learning the MDP in terms of the Kullback-Leibler geometry of the parameter space.

Thompson Sampling for Learning Parameterized Markov Decision Processes

A time-slotted queueing system for a wireless downlink with multiple flows and a single server is considered, with exogenous arrivals and time-varying channels. It is assumed that only one user can be serviced in a single time slot. Unlike much recent work on this problem, attention is drawn to the case where the server can obtain only partial information about the instantaneous state of the channel. In each time slot, the server is allowed to specify a single subset of flows from a collection of observable subsets, observe the current service rates for that subset, and subsequently pick a user to serve. The stability region for such a system is provided. An online scheduling algorithm is presented that uses information about marginal distributions to pick the subset and the Max-Weight rule to pick a flow within the subset, and which is provably throughput-optimal. In the case where the observable subsets are all disjoint, or where the subsets and channel statistics are symmetric, it is shown that a simple scheduling algorithm, Max-Sum-Queue, that essentially picks subsets having the largest squared-sum of queues, followed by picking a user using Max-Weight within the subset, is throughput-optimal.

/pdf/on-wireless-scheduling-with-partial-channel-state-35b42t04w6.pdf

On Wireless Scheduling With Partial Channel-State Information
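The two-stage decision described in the abstract above (pick an observable subset, then apply Max-Weight inside it) can be sketched as follows. The details are assumptions for illustration and are not taken from the paper: channels are modeled as independent ON/OFF with known ON probabilities, arrivals are Bernoulli, and the subset is chosen by maximizing the expected best queue-times-rate product.

```python
import random


def expected_max_weight(queues, p_on, subset):
    """E[max_i q_i * c_i] over the subset, for independent ON/OFF channels c_i."""
    order = sorted(subset, key=lambda i: queues[i], reverse=True)
    total, p_longer_all_off = 0.0, 1.0
    for i in order:
        # Flow i attains the max only if it is ON and all longer queues are OFF.
        total += queues[i] * p_on[i] * p_longer_all_off
        p_longer_all_off *= 1.0 - p_on[i]
    return total


def schedule_slot(queues, p_on, subsets, rng):
    """One slot: choose a subset from statistics, observe it, then Max-Weight."""
    subset = max(subsets, key=lambda s: expected_max_weight(queues, p_on, s))
    # Only the channels of the chosen subset are observed in this slot.
    channel = {i: 1 if rng.random() < p_on[i] else 0 for i in subset}
    flow = max(subset, key=lambda i: queues[i] * channel[i])
    served = channel[flow] if queues[flow] > 0 else 0
    return subset, flow, served


def simulate(arrival_p, p_on, subsets, slots=2000, seed=0):
    """Run the scheduler with Bernoulli arrivals and return final queue lengths."""
    rng = random.Random(seed)
    queues = [0] * len(arrival_p)
    for _ in range(slots):
        _, flow, served = schedule_slot(queues, p_on, subsets, rng)
        queues[flow] -= served
        for i, a in enumerate(arrival_p):
            queues[i] += 1 if rng.random() < a else 0
    return queues


# Hypothetical four-flow system with two disjoint observable subsets.
final_queues = simulate(
    arrival_p=[0.1, 0.1, 0.1, 0.1],
    p_on=[0.8, 0.6, 0.7, 0.5],
    subsets=[(0, 1), (2, 3)],
)
print(final_queues)
```

Swapping the subset-selection rule for one based on the squared sum of queue lengths would give a rough analogue of the Max-Sum-Queue policy mentioned in the abstract, again only as a sketch.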

/pdf/thompson-sampling-for-learning-parameterized-markov-decision-1brwb9dix6.pdf

Aditya Gopalan

Papers

On Kernelized Multi-armed Bandits.

Thompson Sampling for Complex Online Problems

Thompson Sampling for Learning Parameterized Markov Decision Processes

On Wireless Scheduling With Partial Channel-State Information
