
Showing papers on "Deep learning" published in 2020


Journal ArticleDOI
TL;DR: In this paper, a taxonomy of recent contributions related to the explainability of different machine learning models is presented, including those aimed at explaining Deep Learning methods, for which a second dedicated taxonomy is built and examined in detail.

2,827 citations


Journal ArticleDOI
TL;DR: A generative adversarial network algorithm designed to solve the generative modeling problem, and its applications in medicine, education, and robotics, are studied.
Abstract: Generative adversarial networks are a kind of artificial intelligence algorithm designed to solve the generative modeling problem. The goal of a generative model is to study a collection of training examples and learn the probability distribution that generated them. Generative Adversarial Networks (GANs) are then able to generate more examples from the estimated probability distribution. Generative models based on deep learning are common, but GANs are among the most successful generative models (especially in terms of their ability to generate realistic high-resolution images). GANs have been successfully applied to a wide variety of tasks (mostly in research settings) but continue to present unique challenges and research opportunities because they are based on game theory while most other approaches to generative modeling are based on optimization.
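As a concrete illustration of the adversarial game the abstract describes, here is a minimal, hedged sketch of one GAN training step in PyTorch; the network sizes, optimizers, and the shape of real_batch are illustrative assumptions, not the specific models evaluated in the paper.

```python
import torch
import torch.nn as nn

# Minimal generator and discriminator (illustrative sizes, not from the paper).
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    """One adversarial update: D learns to tell real from fake, G learns to fool D."""
    b = real_batch.size(0)
    z = torch.randn(b, 64)

    # Discriminator step: push D(real) toward 1 and D(G(z)) toward 0.
    fake = G(z).detach()
    d_loss = bce(D(real_batch), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step (non-saturating loss): push D(G(z)) toward 1.
    g_loss = bce(D(G(z)), torch.ones(b, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```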

2,447 citations


Proceedings Article
30 Apr 2020
TL;DR: This work presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT, and uses a self-supervised loss that focuses on modeling inter-sentence coherence.
Abstract: Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations, longer training times, and unexpected model degradation. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows that our proposed methods lead to models that scale much better compared to the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence coherence, and show it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters compared to BERT-large.

2,367 citations


Journal ArticleDOI
TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

1,897 citations


Journal ArticleDOI
01 Jun 2020
TL;DR: The results suggest that Deep Learning with X-ray imaging may extract significant biomarkers related to the Covid-19 disease, while the best accuracy, sensitivity, and specificity obtained are 96.78%, 98.66%, and 96.46%, respectively.
Abstract: In this study, a dataset of X-ray images from patients with common bacterial pneumonia, confirmed Covid-19 disease, and normal incidents, was utilized for the automatic detection of the Coronavirus disease. The aim of the study is to evaluate the performance of state-of-the-art convolutional neural network architectures proposed over the recent years for medical image classification. Specifically, the procedure called Transfer Learning was adopted. With transfer learning, the detection of various abnormalities in small medical image datasets is an achievable target, often yielding remarkable results. Two datasets were utilized in this experiment. Firstly, a collection of 1427 X-ray images including 224 images with confirmed Covid-19 disease, 700 images with confirmed common bacterial pneumonia, and 504 images of normal conditions. Secondly, a dataset including 224 images with confirmed Covid-19 disease, 714 images with confirmed bacterial and viral pneumonia, and 504 images of normal conditions. The data was collected from the available X-ray images on public medical repositories. The results suggest that Deep Learning with X-ray imaging may extract significant biomarkers related to the Covid-19 disease, while the best accuracy, sensitivity, and specificity obtained are 96.78%, 98.66%, and 96.46%, respectively. Since all diagnostic tests currently show failure rates high enough to raise concerns, the medical community could, based on these findings, assess the possibility of incorporating X-rays into the diagnosis of the disease, while further research may evaluate the X-ray approach from other aspects.
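To make the transfer-learning procedure concrete, here is a hedged PyTorch sketch of the general pattern the study describes: reuse an ImageNet-pretrained CNN, freeze its convolutional features, and retrain a small head for the three classes (COVID-19, bacterial pneumonia, normal). The choice of VGG19, the layer sizes, and the data loading are assumptions for illustration, not the exact configuration of the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 3  # COVID-19, bacterial pneumonia, normal (per the first dataset)

# Load an ImageNet-pretrained backbone (older torchvision versions use
# pretrained=True instead of the weights argument).
backbone = models.vgg19(weights="DEFAULT")

# Freeze the pretrained convolutional features (the "transferred" representation).
for p in backbone.features.parameters():
    p.requires_grad = False

# Replace the final classification layer with a new 3-way head.
backbone.classifier[6] = nn.Linear(backbone.classifier[6].in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(
    (p for p in backbone.parameters() if p.requires_grad), lr=1e-4
)
criterion = nn.CrossEntropyLoss()

def train_epoch(loader):
    """loader is assumed to yield (N, 3, 224, 224) X-ray batches with integer labels."""
    backbone.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(backbone(images), labels)
        loss.backward()
        optimizer.step()
```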

1,670 citations


Journal ArticleDOI
01 Jan 2020
TL;DR: In this paper, the authors propose a general design pipeline for GNN models and discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.
Abstract: Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interface, and classifying diseases demand a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures (like the dependency trees of sentences and the scene graphs of images) is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs. In recent years, variants of GNNs such as graph convolutional network (GCN), graph attention network (GAT), graph recurrent network (GRN) have demonstrated ground-breaking performances on many deep learning tasks. In this survey, we propose a general design pipeline for GNN models and discuss the variants of each component, systematically categorize the applications, and propose four open problems for future research.
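To make the message-passing idea at the core of the surveyed GNN family concrete, below is a hedged sketch of a single GCN-style layer (neighbour aggregation through a normalized adjacency matrix followed by a shared linear map); the tensor shapes and the toy ring graph are illustrative and not taken from the survey.

```python
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """One message-passing layer: each node aggregates its neighbours' features
    through the normalized adjacency matrix, then applies a shared linear map."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, adj_norm, h):
        # adj_norm: (N, N) normalized adjacency, e.g. D^-1/2 (A + I) D^-1/2
        # h:        (N, in_dim) node features
        messages = adj_norm @ h            # aggregate neighbour features
        return torch.relu(self.linear(messages))

# Toy usage: 4 nodes in a ring, 8-dimensional features.
A = torch.tensor([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=torch.float)
A_hat = A + torch.eye(4)                   # add self-loops
D_inv_sqrt = torch.diag(A_hat.sum(1).pow(-0.5))
adj_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

layer = SimpleGraphConv(8, 16)
out = layer(adj_norm, torch.randn(4, 8))   # -> (4, 16) updated node embeddings
```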

1,266 citations


Journal ArticleDOI
TL;DR: This work develops a novel architecture, MultiResUNet, as a potential successor to the U-Net architecture, and tests and compares it with the classical U-Net on a vast repertoire of multimodal medical images.

1,027 citations


Posted Content
TL;DR: A comprehensive review of recent pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings are provided.
Abstract: Image segmentation is a key topic in image processing and computer vision with applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among many others. Various algorithms for image segmentation have been developed in the literature. Recently, due to the success of deep learning models in a wide range of vision applications, there has been a substantial amount of works aimed at developing image segmentation approaches using deep learning models. In this survey, we provide a comprehensive review of the literature at the time of this writing, covering a broad spectrum of pioneering works for semantic and instance-level segmentation, including fully convolutional pixel-labeling networks, encoder-decoder architectures, multi-scale and pyramid based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the similarity, strengths and challenges of these deep learning models, examine the most widely used datasets, report performances, and discuss promising future research directions in this area.

950 citations


Journal ArticleDOI
TL;DR: A set of recommendations for model interpretation and benchmarking is developed, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.
Abstract: Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today’s machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this Perspective we seek to distil how many of deep learning’s failures can be seen as different symptoms of the same underlying problem: shortcut learning. Shortcuts are decision rules that perform well on standard benchmarks but fail to transfer to more challenging testing conditions, such as real-world scenarios. Related issues are known in comparative psychology, education and linguistics, suggesting that shortcut learning may be a common characteristic of learning systems, biological and artificial alike. Based on these observations, we develop a set of recommendations for model interpretation and benchmarking, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications. Deep learning has resulted in impressive achievements, but under what circumstances does it fail, and why? The authors propose that its failures are a consequence of shortcut learning, a common characteristic across biological and artificial systems in which strategies that appear to have solved a problem fail unexpectedly under different circumstances.

924 citations


Journal ArticleDOI
TL;DR: This Review provides an overview of memory devices and the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing.
Abstract: Traditional von Neumann computing systems involve separate processing and memory units. However, data movement is costly in terms of time and energy and this problem is aggravated by the recent explosive growth in highly data-centric applications related to artificial intelligence. This calls for a radical departure from the traditional systems and one such non-von Neumann computational approach is in-memory computing. Hereby certain computational tasks are performed in place in the memory itself by exploiting the physical attributes of the memory devices. Both charge-based and resistance-based memory devices are being explored for in-memory computing. In this Review, we provide a broad overview of the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing. This Review provides an overview of memory devices and the key computational primitives for in-memory computing, and examines the possibilities of applying this computing approach to a wide range of applications.

841 citations


Posted Content
TL;DR: A new taxonomy is proposed that provides a more comprehensive breakdown of the space of meta-learning methods today, and promising applications and successes of meta-learning, such as few-shot learning, reinforcement learning, and architecture search, are surveyed.
Abstract: The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional challenges of deep learning, including data and computation bottlenecks, as well as generalization. This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields, such as transfer learning and hyperparameter optimization. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods today. We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning. Finally, we discuss outstanding challenges and promising areas for future research.

Journal ArticleDOI
TL;DR: A hybrid spectral CNN (HybridSN) for HSI classification is proposed that reduces the complexity of the model compared to the use of 3-D-CNN alone and is compared with the state-of-the-art hand-crafted as well as end-to-end deep learning-based methods.
Abstract: Hyperspectral image (HSI) classification is widely used for the analysis of remotely sensed images. Hyperspectral imagery includes varying bands of images. Convolutional neural network (CNN) is one of the most frequently used deep learning-based methods for visual data processing. The use of CNN for HSI classification is also visible in recent works. These approaches are mostly based on 2-D CNN. On the other hand, the HSI classification performance is highly dependent on both spatial and spectral information. Very few methods have used the 3-D-CNN because of increased computational complexity. This letter proposes a hybrid spectral CNN (HybridSN) for HSI classification. In general, the HybridSN is a spectral–spatial 3-D-CNN followed by spatial 2-D-CNN. The 3-D-CNN facilitates the joint spatial–spectral feature representation from a stack of spectral bands. The 2-D-CNN on top of the 3-D-CNN further learns more abstract-level spatial representation. Moreover, the use of hybrid CNNs reduces the complexity of the model compared to the use of 3-D-CNN alone. To test the performance of this hybrid approach, very rigorous HSI classification experiments are performed over Indian Pines, University of Pavia, and Salinas Scene remote sensing data sets. The results are compared with the state-of-the-art hand-crafted as well as end-to-end deep learning-based methods. A very satisfactory performance is obtained using the proposed HybridSN for HSI classification. The source code can be found at https://github.com/gokriznastic/HybridSN .
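A hedged sketch of the hybrid 3-D-then-2-D pattern the letter describes is given below: 3-D convolutions extract joint spectral-spatial features from a stack of bands, the spectral axis is then folded into the channel dimension, and a 2-D convolution plus a classifier follow. The kernel sizes, band count, and patch size are illustrative assumptions, not the exact HybridSN configuration (see the linked repository for that).

```python
import torch
import torch.nn as nn

class HybridSpectralNetSketch(nn.Module):
    """3-D CNN over (bands, H, W) patches followed by a 2-D CNN, illustrating the
    hybrid spectral-spatial idea; hyperparameters here are placeholders."""

    def __init__(self, bands=30, patch=25, num_classes=16):
        super().__init__()
        self.conv3d = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3)), nn.ReLU(),   # joint spectral-spatial features
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3)), nn.ReLU(),
        )
        reduced_bands = bands - 6 - 4      # spectral size after the two 3-D convs
        reduced_patch = patch - 2 - 2      # spatial size after the two 3-D convs
        self.conv2d = nn.Sequential(
            nn.Conv2d(16 * reduced_bands, 64, kernel_size=3), nn.ReLU(),
        )
        flat = 64 * (reduced_patch - 2) ** 2
        self.head = nn.Linear(flat, num_classes)

    def forward(self, x):                  # x: (N, 1, bands, patch, patch)
        x = self.conv3d(x)
        n, c, b, h, w = x.shape
        x = x.reshape(n, c * b, h, w)      # fold the spectral dim into channels
        x = self.conv2d(x)                 # more abstract 2-D spatial representation
        return self.head(x.flatten(1))

model = HybridSpectralNetSketch()
logits = model(torch.randn(2, 1, 30, 25, 25))   # -> (2, 16) class scores
```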

Posted Content
TL;DR: This study demonstrated the useful application of deep learning models to classify COVID-19 in X-ray images based on the proposed COVIDX-Net framework and indicated that clinical studies are the next milestone of this research work.
Abstract: Background and Purpose: Coronaviruses (CoV) are perilous viruses that may cause Severe Acute Respiratory Syndrome (SARS-CoV) and Middle East Respiratory Syndrome (MERS-CoV). The novel 2019 Coronavirus disease (COVID-19) was discovered as a novel pneumonia disease in the city of Wuhan, China, at the end of 2019. It has now become a worldwide Coronavirus outbreak, and the number of infected people and deaths is increasing rapidly every day according to the updated reports of the World Health Organization (WHO). Therefore, the aim of this article is to introduce a new deep learning framework, namely COVIDX-Net, to assist radiologists to automatically diagnose COVID-19 in X-ray images. Materials and Methods: Due to the lack of public COVID-19 datasets, the study is validated on 50 Chest X-ray images with 25 confirmed positive COVID-19 cases. The COVIDX-Net includes seven different architectures of deep convolutional neural network models, such as the modified Visual Geometry Group Network (VGG19) and the second version of Google MobileNet. Each deep neural network model is able to analyze the normalized intensities of the X-ray image to classify the patient status as either a negative or a positive COVID-19 case. Results: Experiments and evaluation of the COVIDX-Net have been successfully done based on an 80-20% split of the X-ray images for the model training and testing phases, respectively. The VGG19 and Dense Convolutional Network (DenseNet) models showed a good and similar performance of automated COVID-19 classification, with f1-scores of 0.89 and 0.91 for normal and COVID-19, respectively. Conclusions: This study demonstrated the useful application of deep learning models to classify COVID-19 in X-ray images based on the proposed COVIDX-Net framework. Clinical studies are the next milestone of this research work.

Journal ArticleDOI
Xipeng Qiu1, Tianxiang Sun1, Yige Xu1, Yunfan Shao1, Ning Dai1, Xuanjing Huang1 
TL;DR: Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era as mentioned in this paper, and a comprehensive review of PTMs for NLP can be found in this survey.
Abstract: Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy from four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is purposed to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.

Book
08 Oct 2020
TL;DR: This open access book presents the first comprehensive overview of general methods in Automated Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first series of international challenges of AutoML systems.
Abstract: This open access book presents the first comprehensive overview of general methods in Automated Machine Learning (AutoML), collects descriptions of existing systems based on these methods, and discusses the first series of international challenges of AutoML systems. The recent success of commercial ML applications and the rapid growth of the field has created a high demand for off-the-shelf ML methods that can be used easily and without expert knowledge. However, many of the recent machine learning successes crucially rely on human experts, who manually select appropriate ML architectures (deep learning architectures or more traditional ML workflows) and their hyperparameters. To overcome this problem, the field of AutoML targets a progressive automation of machine learning, based on principles from optimization and machine learning itself. This book serves as a point of entry into this quickly-developing field for researchers and advanced students alike, as well as providing a reference for practitioners aiming to use AutoML in their work.

Posted Content
TL;DR: A powerful AGW baseline is designed, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks, and a new evaluation metric (mINP) is introduced, indicating the cost for finding all the correct matches, which provides an additional criterion to evaluate the Re-ID system for real applications.
Abstract: Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. With the advancement of deep neural networks and increasing demand of intelligent video surveillance, it has gained significantly increased interest in the computer vision community. By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings. The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets. We first conduct a comprehensive overview with in-depth analysis for closed-world person Re-ID from three different perspectives, including deep feature representation learning, deep metric learning and ranking optimization. With the performance saturation under closed-world setting, the research focus for person Re-ID has recently shifted to the open-world setting, facing more challenging issues. This setting is closer to practical applications under specific scenarios. We summarize the open-world Re-ID in terms of five different aspects. By analyzing the advantages of existing methods, we design a powerful AGW baseline, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks. Meanwhile, we introduce a new evaluation metric (mINP) for person Re-ID, indicating the cost for finding all the correct matches, which provides an additional criterion to evaluate the Re-ID system for real applications. Finally, some important yet under-investigated open issues are discussed.
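As a rough illustration of the mINP idea (the harder it is to retrieve the last correct match, the lower the score), here is a hedged NumPy sketch; it assumes the common formulation in which, for each query, the inverse negative penalty is the number of true matches divided by the rank of the hardest (last-ranked) true match, averaged over queries.

```python
import numpy as np

def mean_inverse_negative_penalty(match_matrix):
    """match_matrix: (num_queries, gallery_size) boolean array, where row i gives,
    in ranked order, whether each gallery item is a true match for query i.
    INP for a query = (#true matches) / (rank of the hardest, i.e. last, true match);
    mINP is the mean over queries. Retrieving every match early gives INP close to 1."""
    inps = []
    for row in match_matrix:
        positions = np.flatnonzero(row)           # 0-based ranks of true matches
        if positions.size == 0:
            continue                              # queries with no gallery match are skipped here
        hardest_rank = positions[-1] + 1          # 1-based rank of the last true match
        inps.append(positions.size / hardest_rank)
    return float(np.mean(inps))

# Toy example: 2 queries, ranked gallery of 5 (made-up match flags).
matches = np.array([[1, 0, 1, 0, 0],              # 2 matches, hardest at rank 3 -> INP = 2/3
                    [0, 1, 0, 0, 1]], dtype=bool) # 2 matches, hardest at rank 5 -> INP = 2/5
print(mean_inverse_negative_penalty(matches))     # (2/3 + 2/5) / 2 ≈ 0.533
```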

Journal ArticleDOI
TL;DR: DeepAR is proposed, a methodology for producing accurate probabilistic forecasts, based on training an autoregressive recurrent network model on a large number of related time series, with accuracy improvements of around 15% compared to state-of-the-art methods.
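Since the TL;DR summarizes the approach only briefly, here is a hedged sketch of the general autoregressive-RNN idea behind it: an LSTM consumes the previous target value, and at each step emits the parameters of a predictive distribution trained by negative log-likelihood. The Gaussian output, layer sizes, and training snippet are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class AutoregressiveForecaster(nn.Module):
    """LSTM that maps the previous observation to the mean and scale of a Gaussian
    over the next observation (a probabilistic, autoregressive forecasting sketch)."""

    def __init__(self, hidden=40):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.mu = nn.Linear(hidden, 1)
        self.presigma = nn.Linear(hidden, 1)

    def forward(self, prev_values):                # (batch, time, 1): values at t-1
        h, _ = self.rnn(prev_values)
        mu = self.mu(h)
        sigma = nn.functional.softplus(self.presigma(h)) + 1e-6   # positive scale
        return mu, sigma

def nll_loss(mu, sigma, target):
    """Negative log-likelihood of the observed series under the predicted Gaussians."""
    return -torch.distributions.Normal(mu, sigma).log_prob(target).mean()

# Toy usage: predict series[t] from series[t-1] across related series.
series = torch.randn(8, 25, 1)                     # 8 related series, 25 steps each
model = AutoregressiveForecaster()
mu, sigma = model(series[:, :-1, :])
loss = nll_loss(mu, sigma, series[:, 1:, :])
loss.backward()
```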

Journal ArticleDOI
TL;DR: Deep learning has been shown to be successful in a number of domains, ranging from acoustics, images, to natural language processing as discussed by the authors. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs.
Abstract: Deep learning has been shown to be successful in a number of domains, ranging from acoustics, images, to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, substantial research efforts have been devoted to applying deep learning methods to graphs, resulting in beneficial advances in graph analysis techniques. In this survey, we comprehensively review the different types of deep learning methods on graphs. We divide the existing methods into five categories based on their model architectures and training strategies: graph recurrent neural networks, graph convolutional networks, graph autoencoders, graph reinforcement learning, and graph adversarial methods. We then provide a comprehensive overview of these methods in a systematic manner mainly by following their development history. We also analyze the differences and compositions of different methods. Finally, we briefly outline the applications in which they have been used and discuss potential future research directions.

Journal ArticleDOI
Yujin Oh1, Sangjoon Park1, Jong Chul Ye1
TL;DR: Experimental results show that the proposed patch-based convolutional neural network approach achieves state-of-the-art performance and provides clinically interpretable saliency maps, which are useful for COVID-19 diagnosis and patient triage.
Abstract: Under the global pandemic of COVID-19, the use of artificial intelligence to analyze chest X-ray (CXR) image for COVID-19 diagnosis and patient triage is becoming important. Unfortunately, due to the emergent nature of the COVID-19 pandemic, a systematic collection of CXR data set for deep neural network training is difficult. To address this problem, here we propose a patch-based convolutional neural network approach with a relatively small number of trainable parameters for COVID-19 diagnosis. The proposed method is inspired by our statistical analysis of the potential imaging biomarkers of the CXR radiographs. Experimental results show that our method achieves state-of-the-art performance and provides clinically interpretable saliency maps, which are useful for COVID-19 diagnosis and patient triage.

Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper used a 3-dimensional deep learning model to segment COVID-19 pneumonia from healthy cases with pulmonary CT images and calculated the infection type and total confidence score of this CT case with Noisy-or Bayesian function.
Abstract: We found that the real-time reverse transcription-polymerase chain reaction (RT-PCR) detection of viral RNA from sputum or nasopharyngeal swab has a relatively low positive rate in the early stage when determining COVID-19 (named by the World Health Organization). The manifestations of computed tomography (CT) imaging of COVID-19 have their own characteristics, which are different from those of other types of viral pneumonia, such as Influenza-A viral pneumonia. Therefore, clinical doctors call for another early diagnostic criterion for this new type of pneumonia as soon as possible. This study aimed to establish an early screening model to distinguish COVID-19 pneumonia from Influenza-A viral pneumonia and healthy cases with pulmonary CT images using deep learning techniques. The candidate infection regions were first segmented out using a 3-dimensional deep learning model from the pulmonary CT image set. These separated images were then categorized into COVID-19, Influenza-A viral pneumonia, and irrelevant-to-infection groups, together with the corresponding confidence scores, using a location-attention classification model. Finally, the infection type and total confidence score of each CT case were calculated with a Noisy-or Bayesian function. The experimental results on the benchmark dataset showed that the overall accuracy was 86.7% from the perspective of CT cases as a whole. The deep learning models established in this study were effective for the early screening of COVID-19 patients and were demonstrated to be a promising supplementary diagnostic method for frontline clinical doctors.
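The final aggregation step mentioned in the abstract, combining per-region confidence scores with a Noisy-or Bayesian function, follows the standard noisy-or formula: the probability that at least one region indicates a given infection type is one minus the product of the per-region complements. A small hedged sketch, with made-up scores, is given below.

```python
import numpy as np

def noisy_or(region_scores):
    """Noisy-or combination: P(case positive) = 1 - prod_i (1 - p_i),
    where p_i is the confidence that candidate region i shows the infection."""
    region_scores = np.asarray(region_scores, dtype=float)
    return 1.0 - np.prod(1.0 - region_scores)

# Illustrative per-region confidences for one CT case (not from the paper).
covid_scores = [0.30, 0.55, 0.10]
print(noisy_or(covid_scores))   # 1 - 0.7 * 0.45 * 0.9 = 0.7165
```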

Journal ArticleDOI
TL;DR: A comprehensive review on deep facial expression recognition can be found in this article, including datasets and algorithms that provide insights into the problems of overfitting caused by a lack of sufficient training data and expression-unrelated variations.
Abstract: With the transition of facial expression recognition (FER) from laboratory-controlled to in-the-wild conditions and the recent success of deep learning in various fields, deep neural networks have increasingly been leveraged to learn discriminative representations for automatic FER. Recent deep FER systems generally focus on two important issues: overfitting caused by a lack of sufficient training data and expression-unrelated variations, such as illumination, head pose and identity bias. In this survey, we provide a comprehensive review on deep FER, including datasets and algorithms that provide insights into these problems. First, we introduce available datasets that are widely used and provide data selection and evaluation principles. We then describe the standard pipeline of a deep FER system with related background knowledge and suggestions of applicable implementations. For the state of the art in deep FER, we introduce existing deep networks and training strategies that are designed for FER, and discuss their advantages and limitations. Competitive performances and experimental comparisons on widely used benchmarks are also summarized. We then extend our survey to additional related issues and application scenarios. Finally, we review the remaining challenges and opportunities in this field as well as future directions for the design of robust deep FER system.

Journal ArticleDOI
TL;DR: Challenges and possible research directions for each mainstream approach of ensemble learning are presented, and an additional introduction is given to the combination of ensemble learning with other machine learning hot spots such as deep learning and reinforcement learning.
Abstract: Despite significant successes achieved in knowledge discovery, traditional machine learning methods may fail to obtain satisfactory performance when dealing with complex data, such as imbalanced, high-dimensional, or noisy data. The reason is that it is difficult for these methods to capture multiple characteristics and the underlying structure of the data. In this context, how to effectively construct an efficient knowledge discovery and mining model has become an important topic in the data mining field. Ensemble learning, as one research hot spot, aims to integrate data fusion, data modeling, and data mining into a unified framework. Specifically, ensemble learning first extracts a set of features with a variety of transformations. Based on these learned features, multiple learning algorithms are utilized to produce weak predictive results. Finally, ensemble learning fuses the informative knowledge from the results obtained above to achieve knowledge discovery and better predictive performance via voting schemes in an adaptive way. In this paper, we review the research progress of the mainstream approaches of ensemble learning and classify them based on different characteristics. In addition, we present challenges and possible research directions for each mainstream approach of ensemble learning, and we also give an additional introduction to the combination of ensemble learning with other machine learning hot spots such as deep learning and reinforcement learning.
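To ground the voting step the abstract describes, here is a small hedged sketch of hard majority voting over the predictions of several base learners; the toy label matrix is made up purely for illustration.

```python
import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_samples) integer class labels from the base learners.
    Returns, for each sample, the class chosen by the most base learners (hard voting)."""
    predictions = np.asarray(predictions)
    n_classes = predictions.max() + 1
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, predictions
    )                                    # (n_classes, n_samples) vote counts
    return votes.argmax(axis=0)

# Three weak learners disagreeing on four samples (illustrative labels only).
preds = np.array([[0, 1, 2, 1],
                  [0, 1, 1, 1],
                  [1, 2, 2, 1]])
print(majority_vote(preds))              # -> [0 1 2 1]
```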

Posted Content
TL;DR: To maximally excavate the capability of the transformer, the IPT model utilizes the well-known ImageNet benchmark to generate a large number of corrupted image pairs, and contrastive learning is introduced to adapt the model to different image processing tasks.
Abstract: As the computing power of modern hardware is increasing strongly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods. This progress is mainly attributed to the representation ability of the transformer and its variant architectures. In this paper, we study low-level computer vision tasks (e.g., denoising, super-resolution and deraining) and develop a new pre-trained model, namely, the image processing transformer (IPT). To maximally excavate the capability of the transformer, we propose to utilize the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multi-heads and multi-tails. In addition, contrastive learning is introduced to adapt the model to different image processing tasks. The pre-trained model can therefore be efficiently employed on the desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks. Code is available at this https URL and this https URL

Proceedings Article
30 Apr 2020
TL;DR: It is shown that it is possible to outperform carefully designed losses, sampling strategies, even complex modules with memory, by using a straightforward approach that decouples representation and classification.
Abstract: The long-tail distribution of the visual world poses great challenges for deep learning based classification models on how to handle the class imbalance problem. Existing solutions usually involve class-balancing strategies, e.g., by loss re-weighting, data re-sampling, or transfer learning from head- to tail-classes, but all of them adhere to the scheme of jointly learning representations and classifiers. In this work, we decouple the learning procedure into representation learning and classification, and systematically explore how different balancing strategies affect them for long-tailed recognition. The findings are surprising: (1) data imbalance might not be an issue in learning high-quality representations; (2) with representations learned with the simplest instance-balanced (natural) sampling, it is also possible to achieve strong long-tailed recognition ability at little to no cost by adjusting only the classifier. We conduct extensive experiments and set new state-of-the-art performance on common long-tailed benchmarks like ImageNet-LT, Places-LT and iNaturalist, showing that it is possible to outperform carefully designed losses, sampling strategies, even complex modules with memory, by using a straightforward approach that decouples representation and classification.
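A hedged sketch of the two-stage recipe described above (learn representations with plain instance-balanced sampling, then freeze them and re-train only the classifier with a class-balanced sampler) is shown below; the toy data, backbone, samplers, and epoch counts are placeholders rather than the paper's exact setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for a long-tailed dataset (real experiments use ImageNet-LT etc.).
X = torch.randn(512, 3 * 32 * 32)
y = torch.randint(0, 100, (512,))
instance_balanced_loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)
# A class-balanced loader would use a WeightedRandomSampler with 1/class-frequency
# weights; for brevity the same toy loader stands in for it here.
class_balanced_loader = instance_balanced_loader

def train(model, loader, params, epochs=1):
    opt = torch.optim.SGD(params, lr=0.1, momentum=0.9)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            ce(model(xb), yb).backward()
            opt.step()

backbone = nn.Sequential(nn.Linear(3 * 32 * 32, 256), nn.ReLU())
classifier = nn.Linear(256, 100)
model = nn.Sequential(backbone, classifier)

# Stage 1: learn representations and classifier jointly with natural (instance-balanced) sampling.
train(model, instance_balanced_loader, model.parameters(), epochs=2)

# Stage 2 ("decoupling"): freeze the representation, re-initialize the classifier,
# and re-train only the classifier with class-balanced sampling.
for p in backbone.parameters():
    p.requires_grad = False
classifier.reset_parameters()
train(model, class_balanced_loader, classifier.parameters(), epochs=1)
```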

Journal ArticleDOI
TL;DR: In this article, the authors survey the current state-of-the-art on deep learning technologies used in autonomous driving, including convolutional and recurrent neural networks, as well as the deep reinforcement learning paradigm.
Abstract: The last decade witnessed increasingly rapid progress in self-driving vehicle technology, mainly backed up by advances in the area of deep learning and artificial intelligence. The objective of this paper is to survey the current state-of-the-art on deep learning technologies used in autonomous driving. We start by presenting AI-based self-driving architectures, convolutional and recurrent neural networks, as well as the deep reinforcement learning paradigm. These methodologies form a base for the surveyed driving scene perception, path planning, behavior arbitration and motion control algorithms. We investigate both the modular perception-planning-action pipeline, where each module is built using deep learning methods, as well as End2End systems, which directly map sensory information to steering commands. Additionally, we tackle current challenges encountered in designing AI architectures for autonomous driving, such as their safety, training data sources and computational hardware. The comparison presented in this survey helps to gain insight into the strengths and limitations of deep learning and AI approaches for autonomous driving and assists with design choices.

Proceedings Article
12 Jul 2020
TL;DR: GCNII is proposed, an extension of the vanilla GCN model with two simple yet effective techniques, initial residual and identity mapping, that effectively relieve the problem of over-smoothing.
Abstract: Graph convolutional networks (GCNs) are a powerful deep learning approach for graph-structured data. Recently, GCNs and subsequent variants have shown superior performance in various application areas on real-world datasets. Despite their success, most of the current GCN models are shallow, due to the over-smoothing problem. In this paper, we study the problem of designing and analyzing deep graph convolutional networks. We propose GCNII, an extension of the vanilla GCN model with two simple yet effective techniques: initial residual and identity mapping. We provide theoretical and empirical evidence that the two techniques effectively relieve the problem of over-smoothing. Our experiments show that the deep GCNII model outperforms the state-of-the-art methods on various semi- and full-supervised tasks. Code is available at this https URL.
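The two techniques named in the abstract can be written as a single layer update. A hedged sketch: with layer-0 features H0, propagation matrix P (normalized adjacency with self-loops), and strengths alpha and beta, each layer computes H' = sigma(((1 - alpha) * P @ H + alpha * H0) @ ((1 - beta) * I + beta * W)). The code below is an illustrative implementation of that update under these assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class GCNIILayer(nn.Module):
    """One GCNII-style layer: initial residual (mix in the layer-0 features H0)
    plus identity mapping (blend the weight matrix with the identity).
    In the paper beta typically decays with depth; it is fixed here for brevity."""

    def __init__(self, dim, alpha=0.1, beta=0.5):
        super().__init__()
        self.weight = nn.Linear(dim, dim, bias=False)
        self.alpha, self.beta = alpha, beta

    def forward(self, adj_norm, h, h0):
        # Initial residual: blend propagated features with the input-layer features.
        support = (1 - self.alpha) * (adj_norm @ h) + self.alpha * h0
        # Identity mapping: apply (1 - beta) * I + beta * W to the blended features.
        out = (1 - self.beta) * support + self.beta * self.weight(support)
        return torch.relu(out)

# Toy usage: 5 nodes, 16-dim features; layers can now be stacked deeply.
adj_norm = torch.eye(5)                  # stands in for D^-1/2 (A + I) D^-1/2
h0 = torch.randn(5, 16)
layer = GCNIILayer(16)
h = layer(adj_norm, h0, h0)
h = layer(adj_norm, h, h0)               # deeper layers keep a direct path back to h0
```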

Journal ArticleDOI
TL;DR: In this paper, the authors propose sparse ternary compression (STC), a new compression framework that is specifically designed to meet the requirements of the federated learning environment, which extends the existing compression technique of top-k gradient sparsification with a novel mechanism to enable downstream compression as well as ternarization and optimal Golomb encoding of the weight updates.
Abstract: Federated learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server. This form of privacy-preserving collaborative learning, however, comes at the cost of a significant communication overhead during training. To address this problem, several compression methods have been proposed in the distributed training literature that can reduce the amount of required communication by up to three orders of magnitude. These existing methods, however, are only of limited utility in the federated learning setting, as they either only compress the upstream communication from the clients to the server (leaving the downstream communication uncompressed) or only perform well under idealized conditions, such as i.i.d. distribution of the client data, which typically cannot be found in federated learning. In this article, we propose sparse ternary compression (STC), a new compression framework that is specifically designed to meet the requirements of the federated learning environment. STC extends the existing compression technique of top-k gradient sparsification with a novel mechanism to enable downstream compression as well as ternarization and optimal Golomb encoding of the weight updates. Our experiments on four different learning tasks demonstrate that STC distinctively outperforms federated averaging in common federated learning scenarios. These results advocate for a paradigm shift in federated optimization toward high-frequency low-bitwidth communication, in particular in the bandwidth-constrained learning environments.
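A hedged sketch of the compression step at the heart of the approach described above: keep only the top-k largest-magnitude entries of a weight-update tensor and replace them with a single shared magnitude and a sign (ternarization). The Golomb encoding of the sparse positions and the federated training loop are omitted, and the function below is an illustration under these assumptions, not the authors' implementation.

```python
import torch

def sparse_ternary_compress(update, sparsity=0.01):
    """Top-k sparsification + ternarization of a model update.
    Keeps the k = sparsity * numel largest-magnitude entries and replaces them by
    mu * sign(entry), where mu is the mean magnitude of the kept entries; all other
    entries become zero. (Positions would then be Golomb-encoded for transmission.)"""
    flat = update.flatten()
    k = max(1, int(sparsity * flat.numel()))
    _, idx = torch.topk(flat.abs(), k)
    mu = flat[idx].abs().mean()
    compressed = torch.zeros_like(flat)
    compressed[idx] = mu * torch.sign(flat[idx])
    return compressed.reshape(update.shape)

# Toy usage: compress a simulated gradient / weight-delta tensor.
delta = torch.randn(256, 128)
sparse_delta = sparse_ternary_compress(delta, sparsity=0.01)
print((sparse_delta != 0).float().mean())   # ≈ 0.01 of the entries survive
```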

Journal ArticleDOI
TL;DR: A novel deep learning framework, Traffic Graph Convolutional Long Short-Term Memory Neural Network (TGC-LSTM), to learn the interactions between roadways in the traffic network and forecast the network-wide traffic state and shows that the proposed model outperforms baseline methods on two real-world traffic state datasets.
Abstract: Traffic forecasting is a particularly challenging application of spatiotemporal forecasting, due to the time-varying traffic patterns and the complicated spatial dependencies on road networks. To address this challenge, we learn the traffic network as a graph and propose a novel deep learning framework, Traffic Graph Convolutional Long Short-Term Memory Neural Network (TGC-LSTM), to learn the interactions between roadways in the traffic network and forecast the network-wide traffic state. We define the traffic graph convolution based on the physical network topology. The relationship between the proposed traffic graph convolution and the spectral graph convolution is also discussed. An L1-norm on graph convolution weights and an L2-norm on graph convolution features are added to the model’s loss function to enhance the interpretability of the proposed model. Experimental results show that the proposed model outperforms baseline methods on two real-world traffic state datasets. The visualization of the graph convolution weights indicates that the proposed framework can recognize the most influential road segments in real-world traffic networks.

Journal ArticleDOI
TL;DR: A baseline solution to the aforementioned difficulty is provided by developing a general multimodal deep learning (MDL) framework that is not limited to pixel-wise classification tasks but is also applicable to spatial information modeling with convolutional neural networks (CNNs).
Abstract: Classification and identification of the materials lying over or beneath the Earth's surface have long been a fundamental but challenging research topic in geoscience and remote sensing (RS) and have garnered a growing concern owing to the recent advancements of deep learning techniques. Although deep networks have been successfully applied in single-modality-dominated classification tasks, yet their performance inevitably meets the bottleneck in complex scenes that need to be finely classified, due to the limitation of information diversity. In this work, we provide a baseline solution to the aforementioned difficulty by developing a general multimodal deep learning (MDL) framework. In particular, we also investigate a special case of multi-modality learning (MML) -- cross-modality learning (CML) that exists widely in RS image classification applications. By focusing on "what", "where", and "how" to fuse, we show different fusion strategies as well as how to train deep networks and build the network architecture. Specifically, five fusion architectures are introduced and developed, further being unified in our MDL framework. More significantly, our framework is not only limited to pixel-wise classification tasks but also applicable to spatial information modeling with convolutional neural networks (CNNs). To validate the effectiveness and superiority of the MDL framework, extensive experiments related to the settings of MML and CML are conducted on two different multimodal RS datasets. Furthermore, the codes and datasets will be available at this https URL, contributing to the RS community.

Journal ArticleDOI
TL;DR: In this paper, the authors used this library to successfully create a complete deep learning course, which they were able to write more quickly than using previous approaches, and the resulting code was clearer.
Abstract: fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai includes: a new type dispatch system for Python along with a semantic type hierarchy for tensors; a GPU-optimized computer vision library which can be extended in pure Python; an optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4–5 lines of code; a novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training; a new data block API; and much more. We used this library to successfully create a complete deep learning course, which we were able to write more quickly than using previous approaches, and the code was more clear. The library is already in wide use in research, industry, and teaching.
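For readers unfamiliar with the library, a hedged quick-start sketch of the high-level API described above is shown below, following fastai's own documented v2 idiom; the pets dataset and resnet34 backbone are the customary tutorial choices rather than anything prescribed by the paper, and older fastai versions name the learner constructor cnn_learner instead of vision_learner.

```python
# Hedged quick-start sketch of fastai's high-level vision API (fastai v2 idiom).
from fastai.vision.all import *

path = untar_data(URLs.PETS)                          # small sample dataset

def is_cat(x):
    # Oxford-IIIT Pets convention: an uppercase first letter means "cat".
    return x[0].isupper()

dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path / "images"), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224),
)

# The layered API: a Learner bundles data, a pretrained model, an optimizer and
# the callback system; fine_tune handles the freeze/unfreeze schedule.
learn = vision_learner(dls, resnet34, metrics=error_rate)   # cnn_learner in older versions
learn.fine_tune(1)
```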