# Jian Zhao

Other affiliations: NTT DoCoMo, ETH Zurich, Institute for Infocomm Research Singapore

Bio: Jian Zhao is an academic researcher from Nanjing University. His research spans computer science and artificial neural networks. He has an h-index of 12 and has co-authored 49 publications receiving 827 citations. Previous affiliations of Jian Zhao include NTT DoCoMo and ETH Zurich.

##### Papers


TL;DR: Inspired by recent results in compressive sensing, two algorithms are proposed for the joint design of transmit beamformers and user data allocation at BSs to minimize the backhaul user data transfer, a problem which is NP-hard.

Abstract: When the joint processing technique is applied in the coordinated multipoint (CoMP) downlink transmission, the user data for each mobile station needs to be shared among multiple base stations (BSs) via backhaul. If the number of users is large, this data exchange can lead to a huge backhaul signaling overhead. In this paper, we consider a multi-cell CoMP network with multi-antenna BSs and single-antenna users. We address the problem of jointly designing transmit beamformers and user data allocation at BSs to minimize the backhaul user data transfer, subject to given quality-of-service and per-BS power constraints. We show that this problem can be cast as an $\ell_0$-norm minimization problem, which is NP-hard. Inspired by recent results in compressive sensing, we propose two algorithms to tackle it. The first algorithm is based on reweighted $\ell_1$-norm minimization, which solves a series of convex weighted $\ell_1$-norm minimization problems. In the second algorithm, we first solve the $\ell_2$-norm relaxation of the joint clustering and beamforming problem and then iteratively remove the links that correspond to the smallest transmit power. The second algorithm enjoys a faster solution speed and can also be implemented in a semi-distributed manner under certain assumptions. Simulations show that both algorithms can significantly reduce the user data transfer in the backhaul.

235 citations
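The reweighted $\ell_1$ heuristic behind the first algorithm can be illustrated on a generic sparse-recovery toy problem. This is a hedged sketch of the general technique (reweighted $\ell_1$ with a proximal-gradient inner solver), not the paper's beamforming formulation; all dimensions and parameters below are illustrative:

```python
import numpy as np

def weighted_ista(A, b, w, lam=0.01, n_iter=500):
    """Minimize 0.5*||Ax - b||^2 + lam*sum(w_i*|x_i|) by proximal gradient (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - b)              # gradient of 0.5*||Ax - b||^2
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam * w / L, 0.0)  # weighted soft-threshold
    return x

def reweighted_l1(A, b, n_rounds=4, eps=1e-3):
    """Reweighted l1: each round solves a convex weighted-l1 problem, then sets
    w_i = 1/(|x_i| + eps) so that small entries are pushed harder toward zero."""
    w = np.ones(A.shape[1])
    for _ in range(n_rounds):
        x = weighted_ista(A, b, w)
        w = 1.0 / (np.abs(x) + eps)
    return x

rng = np.random.default_rng(0)
n, m, k = 40, 100, 5                       # 40 measurements, 100 unknowns, 5 nonzeros
A = rng.standard_normal((n, m)) / np.sqrt(n)
x_true = np.zeros(m)
x_true[rng.choice(m, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true

x_hat = reweighted_l1(A, b)
print(np.sum(np.abs(x_hat) > 1e-2))        # only a handful of entries survive
```

Each round is convex, which is exactly why the reweighting trick makes the NP-hard $\ell_0$ objective tractable in practice.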


17 Jun 2007

TL;DR: This paper investigates and compares two re-encoding schemes at the relay in a MIMO two-way decode-and-forward relaying scheme, one based on superposition coding and the other on the bitwise XOR operation.

Abstract: Conventional half-duplex relaying schemes suffer from the loss in spectral efficiency due to the two channel uses required for the transmission from the source to the destination. Two-way relaying is an efficient means to reduce this loss in spectral efficiency by bidirectional simultaneous transmission of data between the two nodes. In this paper we study the impact of transmit channel state information at the relay in a MIMO two-way decode-and-forward relaying scheme. We investigate and compare two different re-encoding schemes at the relay. The first is based on superposition coding, whereas the second one is based on the bitwise XOR operation.

191 citations


TL;DR: A large-system iterative algorithm can produce the asymptotically optimum solution for the $\ell_1$-relaxed problem while requiring only large-scale channel coefficients, irrespective of the actual channel realization.

Abstract: We consider a heterogeneous cellular network with densely underlaid small cell access points (SAPs). Wireless backhaul provides the data connection from the core network to SAPs. To serve as many SAPs and their corresponding users as possible with guaranteed data rates, admission control of SAPs needs to be performed in the wireless backhaul. Such a problem involves the joint design of transmit beamformers, power control, and selection of SAPs. To tackle this difficult problem, we apply $\ell_1$-relaxation and propose an iterative algorithm for the $\ell_1$-relaxed problem. The selection of SAPs is made based on the outputs of the iterative algorithm. This algorithm is fast and enjoys low complexity for small-to-medium sized systems. However, its solution depends on the actual channel state information, and re-running the algorithm for each new channel realization may be unrealistic for large systems. Therefore, we make use of random matrix theory and also propose an iterative algorithm for large systems. Such a large-system iterative algorithm produces an asymptotically optimum solution for the $\ell_1$-relaxed problem, which requires only large-scale channel coefficients, irrespective of the actual channel realization. Near-optimum results are achieved by our proposed algorithms in simulations.

61 citations
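The relax-then-select step can be illustrated with a deliberately simplified toy: binary admission variables $s_i \in \{0,1\}$ are relaxed to $[0,1]$, and with a single sum-power budget the relaxed LP is solved exactly by a greedy sweep in order of increasing required power. This sketches only the selection idea; the paper's algorithm additionally designs beamformers and per-SAP power control, and the function and parameter names here are hypothetical:

```python
import numpy as np

def admit_saps(p_req, p_budget):
    """Toy admission control. Relaxing s_i in {0,1} to [0,1] under a single
    sum-power budget gives an LP whose optimum admits SAPs greedily by
    increasing required power; the one fractional SAP (if any) is rounded
    down, i.e. deselected."""
    order = np.argsort(p_req)
    admitted, used = [], 0.0
    for i in order:
        if used + p_req[i] <= p_budget:
            admitted.append(int(i))
            used += p_req[i]
        else:
            break                      # fractional entry in the LP -> dropped
    return sorted(admitted), float(used)

p_req = np.array([0.8, 0.3, 1.5, 0.2, 0.9, 0.4])   # power each SAP needs for its rate
sel, used = admit_saps(p_req, p_budget=2.0)
print(sel)                                          # [0, 1, 3, 5]
```

The real problem couples the SAPs through interference and per-antenna constraints, which is why the paper needs an iterative beamforming algorithm rather than a one-pass greedy rule.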


TL;DR: In this paper, the authors considered a heterogeneous cellular network with densely underlaid small cell access points (SAPs), and proposed an iterative algorithm for the $\ell_1$-relaxed problem.

Abstract: We consider a heterogeneous cellular network with densely underlaid small cell access points (SAPs). Wireless backhaul provides the data connection from the core network to SAPs. To serve as many SAPs and their corresponding users as possible with guaranteed data rates, admission control of SAPs needs to be performed in the wireless backhaul. Such a problem involves the joint design of transmit beamformers, power control, and selection of SAPs. To tackle this difficult problem, we apply $\ell_1$-relaxation and propose an iterative algorithm for the $\ell_1$-relaxed problem. The selection of SAPs is made based on the outputs of the iterative algorithm, and we prove that this algorithm converges locally. Furthermore, this algorithm is fast and enjoys low complexity for small-to-medium sized systems. However, its solution depends on the actual channel state information, and re-running the algorithm for each new channel realization may be unrealistic for large systems. Therefore, we make use of random matrix theory and also propose an iterative algorithm for large systems. Such a large-system iterative algorithm can produce the asymptotically optimum solution for the $\ell_1$-relaxed problem, which requires only large-scale channel coefficients, irrespective of the actual channel realization. Near-optimum results are achieved by our proposed algorithms in simulations.

50 citations


TL;DR: In this paper, the authors proposed an Operation-aware Neural Network (ONN) that learns a different representation for each operation in order to capture feature interactions better; ONN outperforms state-of-the-art models in both offline- and online-training environments.

Abstract: User response prediction makes a crucial contribution to the rapid development of online advertising and recommendation systems. Many works have emphasized the importance of learning feature interactions, and many deep models have been proposed to learn high-order feature interactions automatically. Since most features in advertising and recommendation systems are high-dimensional sparse features, deep models usually learn a low-dimensional distributed representation for each feature in the bottom layer. Besides traditional fully-connected architectures, new operations, such as convolutional operations and product operations, have been proposed to learn feature interactions better. In these models, the representation is shared among different operations; however, the best representation for each operation may be different. In this paper, we propose a new neural model named Operation-aware Neural Networks (ONN), which learns different representations for different operations. Our experimental results on two large-scale real-world ad click/conversion datasets demonstrate that ONN consistently outperforms state-of-the-art models in both offline- and online-training environments.

45 citations
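The operation-aware idea, one embedding table per operation rather than a single shared table, can be sketched as follows. The shapes, feature indices, and the two-operation split are illustrative assumptions; the paper's model is finer-grained (e.g., per field pair for products):

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_fields, dim = 1000, 3, 4

# A shared-embedding model keeps one table; an operation-aware layer keeps a
# separate table for each operation the embedding feeds into.
ops = ["copy", "product"]
tables = {op: rng.standard_normal((n_features, dim)) * 0.1 for op in ops}

sample = [17, 402, 991]                       # one sparse feature index per field

# "copy" operation: embeddings passed straight through to the MLP input.
copy_vecs = np.stack([tables["copy"][f] for f in sample])

# "product" operation: pairwise inner products use a *different* representation
# of the very same features.
prod_vecs = np.stack([tables["product"][f] for f in sample])
pair_feats = [float(prod_vecs[i] @ prod_vecs[j])
              for i in range(n_fields) for j in range(i + 1, n_fields)]

mlp_input = np.concatenate([copy_vecs.ravel(), pair_feats])
print(mlp_input.shape)                        # (n_fields*dim + n_pairs,) = (15,)
```

The extra tables cost memory but let each operation pick the representation that suits it, which is the paper's central argument against representation sharing.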

##### Cited by


3,248 citations


TL;DR: This work proposes the Learning without Forgetting method, which uses only new-task data to train the network while preserving its original capabilities, and performs favorably compared to commonly used feature-extraction and fine-tuning adaptation techniques.

Abstract: When building a unified vision system or gradually adding new capabilities to a system, the usual assumption is that training data for all tasks is always available. However, as the number of tasks grows, storing and retraining on such data becomes infeasible. A new problem arises when we add new capabilities to a Convolutional Neural Network (CNN) but the training data for its existing capabilities are unavailable. We propose our Learning without Forgetting method, which uses only new-task data to train the network while preserving the original capabilities. Our method performs favorably compared to commonly used feature-extraction and fine-tuning adaptation techniques and performs similarly to multitask learning that uses the original task data we assume to be unavailable. A more surprising observation is that Learning without Forgetting may be able to replace fine-tuning with similar old and new task datasets for improved new-task performance.

1,037 citations
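The core of the method is a two-term objective: standard cross-entropy on the new task plus a distillation term that keeps the old-task head close to outputs recorded before training begins. A simplified numpy sketch of that objective follows; the temperature, weighting, and function names are illustrative assumptions, and the paper embeds this loss in end-to-end training with a warm-up stage:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)    # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def lwf_loss(new_logits, new_labels, old_logits_now, old_logits_recorded,
             T=2.0, lam=1.0):
    """Learning-without-Forgetting style objective (simplified): new-task
    cross-entropy plus distillation toward the old head's recorded responses."""
    ce = -np.mean(np.log(softmax(new_logits)[np.arange(len(new_labels)), new_labels]))
    p_old = softmax(old_logits_recorded, T)  # soft targets, fixed before training
    p_now = softmax(old_logits_now, T)
    kd = -np.mean(np.sum(p_old * np.log(p_now), axis=-1))
    return ce + lam * kd

rng = np.random.default_rng(0)
new_logits = rng.standard_normal((8, 5))
labels = rng.integers(0, 5, size=8)
old_rec = rng.standard_normal((8, 10))
old_drifted = old_rec + rng.standard_normal(old_rec.shape)

# Drifting away from the recorded old-task outputs strictly increases the loss.
loss_same = lwf_loss(new_logits, labels, old_rec, old_rec)
loss_moved = lwf_loss(new_logits, labels, old_drifted, old_rec)
print(loss_same < loss_moved)
```

The distillation term is minimized exactly when the old head reproduces its recorded outputs, which is how the method preserves old capabilities without old data.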


01 Jan 1985

TL;DR: The first group of results centres around Banach's fixed point theorem, a nice result since it imposes only one simple condition on the map F, is easy to prove, and nevertheless allows a variety of applications.

Abstract: Formally we have arrived at the middle of the book. So you may need a pause for recovering, a pause which we want to fill up by some fixed point theorems supplementing those which you already met or which you will meet in later chapters. The first group of results centres around Banach’s fixed point theorem. The latter is certainly a nice result since it contains only one simple condition on the map F, since it is so easy to prove and since it nevertheless allows a variety of applications. Therefore it is not astonishing that many mathematicians have been attracted by the question to which extent the conditions on F and the space Ω can be changed so that one still gets the existence of a unique or of at least one fixed point. The number of results produced this way is still finite, but of a statistical magnitude, suggesting at a first glance that only a random sample can be covered by a chapter or even a book of the present size. Fortunately (or unfortunately?) most of the modifications have not found applications up to now, so that there is no reason to write a cookery book about conditions but to write at least a short outline of some ideas indicating that this field can be as interesting as other chapters. A systematic account of more recent ideas and examples in fixed point theory should however be written by one of the true experts. Strange as it is, such a book does not seem to exist though so many people are puzzling out so many results.

994 citations
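Banach's theorem is constructive: for a contraction on a complete metric space, the Picard iteration converges to the unique fixed point. A minimal runnable sketch (the classic cosine example on the reals):

```python
import math

def banach_iterate(f, x0, tol=1e-12, max_iter=1000):
    """Picard iteration: for a contraction f, the sequence x_{n+1} = f(x_n)
    converges to the unique fixed point guaranteed by Banach's theorem."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

# cos is a contraction on [cos 1, 1] (there |cos'| = |sin| <= sin(1) < 1),
# so the iteration converges to the unique fixed point, the Dottie number.
x_star = banach_iterate(math.cos, 0.5)
print(round(x_star, 6))   # 0.739085
```

The error contracts geometrically with ratio bounded by the Lipschitz constant, which is why so few iterations suffice.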


TL;DR: CACM is really essential reading for students: it keeps tabs on the latest in computer science and is a valuable asset for those of us who tend to delve deep into a particular area of CS and forget everything that is happening around us.

Abstract: Communications of the ACM (CACM for short, not the best-sounding acronym around) is the ACM's flagship magazine. Started in 1957, CACM is handy for keeping up to date on current research being carried out across all topics of computer science and real-world applications. CACM has had an illustrious past, with many influential pieces of work and debates started within its pages. These include Hoare's presentation of the Quicksort algorithm; Rivest, Shamir and Adleman's description of the first public-key cryptosystem, RSA; and Dijkstra's famous letter against the use of GOTO. In addition to the print edition, which is released monthly, there is a fantastic website (http://cacm.acm.org/) that showcases not only the most recent edition but all previous CACM articles as well, readable online and downloadable as PDFs. In addition, the website lets you browse for articles by subject, a handy feature if you want to focus on a particular topic. CACM is really essential reading. Pretty much guaranteed to contain content that is interesting to anyone, it keeps tabs on the latest in computer science. It is a valuable asset for us students, who tend to delve deep into a particular area of CS and forget everything that is happening around us. — Daniel Gooch

Undergraduate research is like a box of chocolates: you never know what kind of project you will get. That being said, there are still a few things you should know to get the most out of the experience.

856 citations


TL;DR: The central premise of the book is that the combination of the Pareto or Zipf distribution that is characteristic of Web traffic and the direct access to consumers via Web technology has opened up new business opportunities in the "long tail".

Abstract: The Long Tail: How Technology Is Turning Mass Markets into Millions of Niches (p. 15). This passage from The Long Tail pretty much sums it all up. The Long Tail by Chris Anderson is a good and worthwhile read for information scientists, computer scientists, e-commerce researchers, and others interested in all areas of Web research. The central premise of the book is that the combination of (1) the Pareto or Zipf distribution (i.e., power-law probability distribution) that is characteristic of Web traffic and (2) the direct access to consumers via Web technology has opened up new business opportunities in the "long tail". Producers and advertisers no longer have to target "the big hits" at the head of the distribution. Instead, they can target the small, niche communities or even individuals in the tail of the distribution. The long tail has been studied by Web researchers and has been noted in term usage on search engines, access times to servers, and popularity of Web sites. Anderson points out that the long tail also applies to products sold on the Web. He recounts that a sizeable percentage of Amazon sales come from books that sell only a few copies, a large number of songs from Rhapsody are downloaded only once in a month, and a significant number of movies from Netflix are ordered only occasionally. However, since the songs and music are stored in digital form (and Amazon outsources the storage of books), there is little additional inventory cost for these items. This phenomenon across Web companies has led to a broadening of participation by both producers and consumers that would not have happened without the Web. The idea of the long tail is well known, of course. What Anderson has done is present it in an interesting manner and in a Web e-commerce setting. He applies it to Web businesses and then relates the multitude of other ongoing factors that permit the actual implementation of the long-tail effect.
Anderson also expands on prior work on the long tail by introducing an element of time, giving the distribution a three-dimensional effect. All in all, it is a nifty idea. The book comprises 14 chapters plus an Introduction. Chapter 1 presents an overview of what the long tail is. Chapter 2 discusses the "head", which is the top of the tail where the …

827 citations
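The power-law premise is easy to make concrete: under a Zipf-style popularity curve, only a tiny fraction of items at the head accounts for half of all demand, leaving an enormous tail. A toy sketch with illustrative numbers (not data from the book):

```python
import numpy as np

# Zipf-style popularity: demand for the item at rank r falls off as r**(-s).
n_items, s = 100_000, 1.0
ranks = np.arange(1, n_items + 1)
sales = ranks ** (-float(s))
share = np.cumsum(sales) / sales.sum()

# How many top-ranked items are needed to cover half of all sales?
head = int(np.searchsorted(share, 0.5)) + 1
tail = n_items - head
print(head, tail)          # a small head, an enormous tail
```

With near-zero storage cost for digital goods, serving that enormous tail becomes profitable, which is exactly the book's argument.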