scispace - formally typeset
Search or ask a question
Author

Kiyoharu Aizawa

Bio: Kiyoharu Aizawa is an academic researcher from University of Tokyo. The author has contributed to research in topics: Image processing & Pixel. The author has an hindex of 45, co-authored 590 publications receiving 8693 citations. Previous affiliations of Kiyoharu Aizawa include University of Liverpool & University of Illinois at Urbana–Champaign.


Papers
More filters
Journal ArticleDOI
TL;DR: A manga-specific image retrieval system that consists of efficient margin labeling, edge orientation histogram feature description with screen tone removal, and approximate nearest-neighbor search using product quantization is proposed.
Abstract: Manga (Japanese comics) are popular worldwide. However, current e-manga archives offer very limited search support, i.e., keyword-based search by title or author. To make the manga search experience more intuitive, efficient, and enjoyable, we propose a manga-specific image retrieval system. The proposed system consists of efficient margin labeling, edge orientation histogram feature description with screen tone removal, and approximate nearest-neighbor search using product quantization. For querying, the system provides a sketch-based interface. Based on the interface, two interactive reranking schemes are presented: relevance feedback and query retouch. For evaluation, we built a novel dataset of manga images, Manga109, which consists of 109 comic books of 21,142 pages drawn by professional manga artists. To the best of our knowledge, Manga109 is currently the biggest dataset of manga images available for research. Experimental results showed that the proposed framework is efficient and scalable (70 ms from 21,142 pages using a single computer with 204 MB RAM).

625 citations

Proceedings ArticleDOI
01 Jun 2018
TL;DR: This work proposes a joint optimization framework of learning DNN parameters and estimating true labels that can correct labels during training by alternating update of network parameters and labels.
Abstract: Deep neural networks (DNNs) trained on large-scale datasets have exhibited significant performance in image classification. Many large-scale datasets are collected from websites, however they tend to contain inaccurate labels that are termed as noisy labels. Training on such noisy labeled datasets causes performance degradation because DNNs easily overfit to noisy labels. To overcome this problem, we propose a joint optimization framework of learning DNN parameters and estimating true labels. Our framework can correct labels during training by alternating update of network parameters and labels. We conduct experiments on the noisy CIFAR-10 datasets and the Clothing1M dataset. The results indicate that our approach significantly outperforms other state-of-the-art methods.

529 citations

Journal ArticleDOI
TL;DR: In this article, a sketch-based interface is proposed to interact with manga content to make the manga search experience more intuitive, efficient, and enjoyable, and a content-based manga retrieval system is proposed.
Abstract: Manga (Japanese comics) are popular worldwide. However, current e-manga archives offer very limited search support, including keyword-based search by title or author, or tag-based categorization. To make the manga search experience more intuitive, efficient, and enjoyable, we propose a content-based manga retrieval system. First, we propose a manga-specific image-describing framework. It consists of efficient margin labeling, edge orientation histogram feature description, and approximate nearest-neighbor search using product quantization. Second, we propose a sketch-based interface as a natural way to interact with manga content. The interface provides sketch-based querying, relevance feedback, and query retouch. For evaluation, we built a novel dataset of manga images, Manga109, which consists of 109 comic books of 21,142 pages drawn by professional manga artists. To the best of our knowledge, Manga109 is currently the biggest dataset of manga images available for research. We conducted a comparative study, a localization evaluation, and a large-scale qualitative study. From the experiments, we verified that: (1) the retrieval accuracy of the proposed method is higher than those of previous methods; (2) the proposed method can localize an object instance with reasonable runtime and accuracy; and (3) sketch querying is useful for manga search.

469 citations

Proceedings ArticleDOI
14 Dec 2018
TL;DR: In this paper, a cross-domain weakly supervised object detection framework is proposed to detect common objects in a variety of image domains without instance-level annotations, where the classes to be detected in the target domain are all or a subset of those in the source domain.
Abstract: Can we detect common objects in a variety of image domains without instance-level annotations? In this paper, we present a framework for a novel task, cross-domain weakly supervised object detection, which addresses this question. For this paper, we have access to images with instance-level annotations in a source domain (e.g., natural image) and images with image-level annotations in a target domain (e.g., watercolor). In addition, the classes to be detected in the target domain are all or a subset of those in the source domain. Starting from a fully supervised object detector, which is pre-trained on the source domain, we propose a two-step progressive domain adaptation technique by fine-tuning the detector on two types of artificially and automatically generated samples. We test our methods on our newly collected datasets1 containing three image domains, and achieve an improvement of approximately 5 to 20 percentage points in terms of mean average precision (mAP) compared to the best-performing baselines.

425 citations

Journal ArticleDOI
TL;DR: The initial conception of a model-based analysis synthesis image coding (MBASIC) system is described and a construction method for a three-dimensional (3-D) facial model that includes synthesis methods for facial expressions is presented.
Abstract: The initial conception of a model-based analysis synthesis image coding (MBASIC) system is described and a construction method for a three-dimensional (3-D) facial model that includes synthesis methods for facial expressions is presented. The proposed MBASIC system is an image coding method that utilizes a 3-D model of the object which is to be reproduced. An input image is first analyzed and an output image using the 3-D model is then synthesized. A very low bit rate image transmission can be realized because the encoder sends only the required analysis parameters. Output images can be reconstructed without the noise corruption that reduces naturalness because the decoder synthesizes images from a similar 3-D model. In order to construct a 3-D model of a person's face, a method is developed which uses a 3-D wire frame face model. A full-face image is then projected onto this wire frame model. For the synthesis of facial expressions two different methods are proposed; a clip-and-paste method and a facial structure deformation method.

292 citations


Cited by
More filters
01 Jan 2014
TL;DR: These standards of care are intended to provide clinicians, patients, researchers, payors, and other interested individuals with the components of diabetes care, treatment goals, and tools to evaluate the quality of care.
Abstract: XI. STRATEGIES FOR IMPROVING DIABETES CARE D iabetes is a chronic illness that requires continuing medical care and patient self-management education to prevent acute complications and to reduce the risk of long-term complications. Diabetes care is complex and requires that many issues, beyond glycemic control, be addressed. A large body of evidence exists that supports a range of interventions to improve diabetes outcomes. These standards of care are intended to provide clinicians, patients, researchers, payors, and other interested individuals with the components of diabetes care, treatment goals, and tools to evaluate the quality of care. While individual preferences, comorbidities, and other patient factors may require modification of goals, targets that are desirable for most patients with diabetes are provided. These standards are not intended to preclude more extensive evaluation and management of the patient by other specialists as needed. For more detailed information, refer to Bode (Ed.): Medical Management of Type 1 Diabetes (1), Burant (Ed): Medical Management of Type 2 Diabetes (2), and Klingensmith (Ed): Intensive Diabetes Management (3). The recommendations included are diagnostic and therapeutic actions that are known or believed to favorably affect health outcomes of patients with diabetes. A grading system (Table 1), developed by the American Diabetes Association (ADA) and modeled after existing methods, was utilized to clarify and codify the evidence that forms the basis for the recommendations. The level of evidence that supports each recommendation is listed after each recommendation using the letters A, B, C, or E.

9,618 citations

Journal ArticleDOI
TL;DR: The goal of this article is to introduce the concept of SR algorithms to readers who are unfamiliar with this area and to provide a review for experts to present the technical review of various existing SR methodologies which are often employed.
Abstract: A new approach toward increasing spatial resolution is required to overcome the limitations of the sensors and optics manufacturing technology. One promising approach is to use signal processing techniques to obtain an high-resolution (HR) image (or sequence) from observed multiple low-resolution (LR) images. Such a resolution enhancement approach has been one of the most active research areas, and it is called super resolution (SR) (or HR) image reconstruction or simply resolution enhancement. In this article, we use the term "SR image reconstruction" to refer to a signal processing approach toward resolution enhancement because the term "super" in "super resolution" represents very well the characteristics of the technique overcoming the inherent resolution limitation of LR imaging systems. The major advantage of the signal processing approach is that it may cost less and the existing LR imaging systems can be still utilized. The SR image reconstruction is proved to be useful in many practical cases where multiple frames of the same scene can be obtained, including medical imaging, satellite imaging, and video applications. The goal of this article is to introduce the concept of SR algorithms to readers who are unfamiliar with this area and to provide a review for experts. To this purpose, we present the technical review of various existing SR methodologies which are often employed. Before presenting the review of existing SR algorithms, we first model the LR image acquisition process.

3,491 citations

01 Jan 2006

3,012 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper proposes residual dense block (RDB) to extract abundant local features via dense connected convolutional layers and uses global feature fusion in RDB to jointly and adaptively learn global hierarchical features in a holistic way.
Abstract: A very deep convolutional neural network (CNN) has recently achieved great success for image super-resolution (SR) and offered hierarchical features as well. However, most deep CNN based SR models do not make full use of the hierarchical features from the original low-resolution (LR) images, thereby achieving relatively-low performance. In this paper, we propose a novel residual dense network (RDN) to address this problem in image SR. We fully exploit the hierarchical features from all the convolutional layers. Specifically, we propose residual dense block (RDB) to extract abundant local features via dense connected convolutional layers. RDB further allows direct connections from the state of preceding RDB to all the layers of current RDB, leading to a contiguous memory (CM) mechanism. Local feature fusion in RDB is then used to adaptively learn more effective features from preceding and current local features and stabilizes the training of wider network. After fully obtaining dense local features, we use global feature fusion to jointly and adaptively learn global hierarchical features in a holistic way. Experiments on benchmark datasets with different degradation models show that our RDN achieves favorable performance against state-of-the-art methods.

2,860 citations