scispace - formally typeset
Search or ask a question
Author

Aggelos K. Katsaggelos

Bio: Aggelos K. Katsaggelos is an academic researcher from Northwestern University. The author has contributed to research in topics: Image restoration & Image processing. The author has an hindex of 76, co-authored 946 publications receiving 26196 citations. Previous affiliations of Aggelos K. Katsaggelos include University of Stavanger & Delft University of Technology.


Papers
More filters
Journal ArticleDOI
TL;DR: This paper proposes a CNN that is trained on both the spatial and the temporal dimensions of videos to enhance their spatial resolution and shows that by using images to pretrain the model, a relatively small video database is sufficient for the training of the model to achieve and improve upon the current state-of-the-art.
Abstract: Convolutional neural networks (CNN) are a special type of deep neural networks (DNN). They have so far been successfully applied to image super-resolution (SR) as well as other image restoration tasks. In this paper, we consider the problem of video super-resolution. We propose a CNN that is trained on both the spatial and the temporal dimensions of videos to enhance their spatial resolution. Consecutive frames are motion compensated and used as input to a CNN that provides super-resolved video frames as output. We investigate different options of combining the video frames within one CNN architecture. While large image databases are available to train deep neural networks, it is more challenging to create a large video database of sufficient quality to train neural nets for video restoration. We show that by using images to pretrain our model, a relatively small video database is sufficient for the training of our model to achieve and even improve upon the current state-of-the-art. We compare our proposed approach to current video as well as image SR algorithms.

541 citations

Journal ArticleDOI
TL;DR: The popular neural network architectures used for imaging tasks are reviewed, offering some insight as to how these deep-learning tools can solve the inverse problem.
Abstract: Traditionally, analytical methods have been used to solve imaging problems such as image restoration, inpainting, and superresolution (SR). In recent years, the fields of machine and deep learning have gained a lot of momentum in solving such imaging problems, often surpassing the performance provided by analytical approaches. Unlike analytical methods for which the problem is explicitly defined and domain-knowledge carefully engineered into the solution, deep neural networks (DNNs) do not benefit from such prior knowledge and instead make use of large data sets to learn the unknown solution to the inverse problem. In this article, we review deep-learning techniques for solving such inverse problems in imaging. More specifically, we review the popular neural network architectures used for imaging tasks, offering some insight as to how these deep-learning tools can solve the inverse problem. Furthermore, we address some fundamental questions, such as how deeplearning and analytical methods can be combined to provide better solutions to the inverse problem in addition to providing a discussion on the current limitations and future directions of the use of deep learning for solving inverse problem in imaging.

496 citations

Journal ArticleDOI
TL;DR: The reconstruction of images from incomplete block discrete cosine transform (BDCT) data is examined and two methods are proposed for solving this regularized recovery problem based on the theory of projections onto convex sets (POCS) and the constrained least squares (CLS).
Abstract: The reconstruction of images from incomplete block discrete cosine transform (BDCT) data is examined. The problem is formulated as one of regularized image recovery. According to this formulation, the image in the decoder is reconstructed by using not only the transmitted data but also prior knowledge about the smoothness of the original image, which complements the transmitted data. Two methods are proposed for solving this regularized recovery problem. The first is based on the theory of projections onto convex sets (POCS) while the second is based on the constrained least squares (CLS) approach. For the POCS-based method, a new constraint set is defined that conveys smoothness information not captured by the transmitted BDCT coefficients, and the projection onto it is computed. For the CLS method an objective function is proposed that captures the smoothness properties of the original image. Iterative algorithms are introduced for its minimization. Experimental results are presented. >

428 citations

Journal ArticleDOI
TL;DR: A spatially adaptive image recovery algorithm is proposed based on the theory of projections onto convex sets that captures both the local statistical properties of the image and the human perceptual characteristics.
Abstract: At the present time, block-transform coding is probably the most popular approach for image compression. For this approach, the compressed images are decoded using only the transmitted transform data. We formulate image decoding as an image recovery problem. According to this approach, the decoded image is reconstructed using not only the transmitted data but, in addition, the prior knowledge that images before compression do not display between-block discontinuities. A spatially adaptive image recovery algorithm is proposed based on the theory of projections onto convex sets. Apart from the data constraint set, this algorithm uses another new constraint set that enforces between-block smoothness. The novelty of this set is that it captures both the local statistical properties of the image and the human perceptual characteristics. A simplified spatially adaptive recovery algorithm is also proposed, and the analysis of its computational complexity is presented. Numerical experiments are shown that demonstrate that the proposed algorithms work better than both the JPEG deblocking recommendation and our previous projection-based image decoding approach. >

384 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, a modulator is trained to manipulate the intermediate layers of the segmentation network given limited visual and spatial information of the target object, which achieves similar accuracy as fine-tuning.
Abstract: Video object segmentation targets segmenting a specific object throughout a video sequence when given only an annotated first frame. Recent deep learning based approaches find it effective to fine-tune a general-purpose segmentation model on the annotated frame using hundreds of iterations of gradient descent. Despite the high accuracy that these methods achieve, the fine-tuning process is inefficient and fails to meet the requirements of real world applications. We propose a novel approach that uses a single forward pass to adapt the segmentation model to the appearance of a specific object. Specifically, a second meta neural network named modulator is trained to manipulate the intermediate layers of the segmentation network given limited visual and spatial information of the target object. The experiments show that our approach is 70A— faster than fine-tuning approaches and achieves similar accuracy. Our model and code have been released at https://github.com/linjieyangsc/video_seg.

351 citations


Cited by
More filters
Journal ArticleDOI

[...]

08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a cloud centric vision for worldwide implementation of Internet of Things (IoT) and present a Cloud implementation using Aneka, which is based on interaction of private and public Clouds, and conclude their IoT vision by expanding on the need for convergence of WSN, the Internet and distributed computing directed at technological research community.

9,593 citations

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a deep learning method for single image super-resolution (SR), which directly learns an end-to-end mapping between the low/high-resolution images.
Abstract: We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality.

6,122 citations