Author
Danda Pani Paudel
Other affiliations: Centre national de la recherche scientifique, University of Burgundy
Bio: Danda Pani Paudel is an academic researcher from ETH Zurich. The author has contributed to research in topics including Computer science & RANSAC. The author has an h-index of 11 and has co-authored 72 publications receiving 533 citations. Previous affiliations of Danda Pani Paudel include Centre national de la recherche scientifique & University of Burgundy.
Papers
01 Jun 2018
TL;DR: In this paper, a manifold network structure is used for covariance pooling to improve facial expression recognition, achieving a recognition accuracy of 58.14% on the validation set of Static Facial Expressions in the Wild (SFEW 2.0) and 87.0% on the validation set of the Real-World Affective Faces (RAF) database.
Abstract: Classifying facial expressions into different categories requires capturing regional distortions of facial landmarks. We believe that second-order statistics such as covariance is better able to capture such distortions in regional facial features. In this work, we explore the benefits of using a manifold network structure for covariance pooling to improve facial expression recognition. In particular, we first employ such kind of manifold networks in conjunction with traditional convolutional networks for spatial pooling within individual image feature maps in an end-to-end deep learning manner. By doing so, we are able to achieve a recognition accuracy of 58.14% on the validation set of Static Facial Expressions in the Wild (SFEW2.0) and 87.0% on the validation set of Real-World Affective Faces (RAF) Database. Both of these results are the best results we are aware of. Besides, we leverage covariance pooling to capture the temporal evolution of per-frame features for video-based facial expression recognition. Our reported results demonstrate the advantage of pooling image-set features temporally by stacking the designed manifold network of covariance pooling on top of convolutional network layers.
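The core idea above, covariance (second-order) pooling of convolutional feature maps followed by a log-Euclidean mapping, can be illustrated with a minimal NumPy sketch. All names, dimensions and the toy input below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def covariance_pool(feature_map, eps=1e-4):
    """Second-order (covariance) pooling of a CNN feature map.

    feature_map: array of shape (H, W, C) -- spatial grid of C-dim features.
    Returns a (C, C) SPD matrix summarizing regional feature co-variations.
    """
    h, w, c = feature_map.shape
    x = feature_map.reshape(h * w, c)          # treat each location as a sample
    x = x - x.mean(axis=0, keepdims=True)      # center the features
    cov = (x.T @ x) / (h * w - 1)              # sample covariance, (C, C)
    cov += eps * np.eye(c)                     # regularize to keep it SPD
    return cov

def log_eig(spd):
    """Matrix logarithm via eigendecomposition -- a common way to flatten an
    SPD matrix into Euclidean space before a classifier, loosely mirroring
    what SPD 'manifold network' layers do (illustrative, not the paper's layer)."""
    vals, vecs = np.linalg.eigh(spd)
    return vecs @ np.diag(np.log(vals)) @ vecs.T

# toy usage: a random 7x7 feature map with 64 channels
feats = np.random.randn(7, 7, 64).astype(np.float32)
descriptor = log_eig(covariance_pool(feats)).flatten()  # fed to a classifier
print(descriptor.shape)  # (4096,)
```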
157 citations
15 Jun 2019
TL;DR: The sliced Wasserstein distance (SWD) factorizes high-dimensional distributions into their one-dimensional marginal distributions and is thus easier to approximate than the Wasserstein distance. Instead of using a large number of random projections, as is done by conventional SWD approximation methods, this paper proposes to approximate SWDs with a small number of parameterized orthogonal projections in an end-to-end deep learning fashion.
Abstract: In generative modeling, the Wasserstein distance (WD) has emerged as a useful metric to measure the discrepancy between generated and real data distributions. Unfortunately, it is challenging to approximate the WD of high-dimensional distributions. In contrast, the sliced Wasserstein distance (SWD) factorizes high-dimensional distributions into their multiple one-dimensional marginal distributions and is thus easier to approximate. In this paper, we introduce novel approximations of the primal and dual SWD. Instead of using a large number of random projections, as it is done by conventional SWD approximation methods, we propose to approximate SWDs with a small number of parameterized orthogonal projections in an end-to-end deep learning fashion. As concrete applications of our SWD approximations, we design two types of differentiable SWD blocks to equip modern generative frameworks---Auto-Encoders (AE) and Generative Adversarial Networks (GAN). In the experiments, we not only show the superiority of the proposed generative models on standard image synthesis benchmarks, but also demonstrate the state-of-the-art performance on challenging high resolution image and video generation in an unsupervised manner.
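A minimal sketch of the sliced Wasserstein distance with random projections may help make the abstract concrete. Note that this is the conventional Monte-Carlo approximation the paper contrasts against; the paper's contribution is to replace the random directions with a small number of learned, parameterized orthogonal projections. Sample sizes and dimensions below are illustrative:

```python
import numpy as np

def sliced_wasserstein(x, y, num_proj=64, p=2, rng=None):
    """Monte-Carlo approximation of the sliced Wasserstein distance.

    x, y: arrays of shape (n, d) -- samples from two d-dimensional distributions.
    Each random unit direction reduces the problem to a 1-D optimal transport,
    which is solved exactly by sorting the projected samples.
    """
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    dirs = rng.normal(size=(num_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit-norm projections
    xp = np.sort(x @ dirs.T, axis=0)                      # (n, num_proj)
    yp = np.sort(y @ dirs.T, axis=0)
    return np.mean(np.abs(xp - yp) ** p) ** (1.0 / p)

# toy usage: two Gaussians with different means
a = np.random.randn(512, 128)
b = np.random.randn(512, 128) + 0.5
print(sliced_wasserstein(a, b))
```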
95 citations
University of Ljubljana, Czech Technical University in Prague, Huawei, ETH Zurich, Austrian Institute of Technology, University of Birmingham, Linköping University, University of British Columbia, Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Eskişehir Osmangazi University, Dalian University of Technology, Kyushu University, Nanjing University, Indian Institutes of Technology, University of Udine, University of Surrey, Chinese Academy of Sciences, Nanyang Technological University, Jiangnan University, Fuzhou University, Zayed University, Hong Kong Baptist University, Kyungpook National University, The Chinese University of Hong Kong, RWTH Aachen University, University of Isfahan, Microsoft, Anadolu University, Ain Shams University, University of Würzburg, University of Oxford, Tamkang University, Katholieke Universiteit Leuven, University of Science and Technology of China, Huazhong University of Science and Technology, Sun Yat-sen University, Southeast University, Tianjin University of Technology, Nanjing University of Information Science and Technology, ShanghaiTech University
60 citations
04 Oct 2018
TL;DR: This work exploits the idea of progressive growing of Generative Adversarial Networks (GANs) for higher-resolution video generation, and introduces a sliced version of the Wasserstein GAN (SWGAN) loss to improve distribution learning on video data of high dimension and mixed spatiotemporal distribution.
Abstract: The extension of image generation to video generation turns out to be a very difficult task, since the temporal dimension of videos introduces an extra challenge during the generation process. Besides, due to the limitation of memory and training stability, the generation becomes increasingly challenging with the increase of the resolution/duration of videos. In this work, we exploit the idea of progressive growing of Generative Adversarial Networks (GANs) for higher resolution video generation. In particular, we begin to produce video samples of low-resolution and short-duration, and then progressively increase both resolution and duration alone (or jointly) by adding new spatiotemporal convolutional layers to the current networks. Starting from the learning on a very raw-level spatial appearance and temporal movement of the video distribution, the proposed progressive method learns spatiotemporal information incrementally to generate higher resolution videos. Furthermore, we introduce a sliced version of Wasserstein GAN (SWGAN) loss to improve the distribution learning on the video data of high-dimension and mixed-spatiotemporal distribution. SWGAN loss replaces the distance between joint distributions by that of one-dimensional marginal distributions, making the loss easier to compute. We evaluate the proposed model on our collected face video dataset of 10,900 videos to generate photorealistic face videos of 256x256x32 resolution. In addition, our model also reaches a record inception score of 14.57 in unsupervised action recognition dataset UCF-101.
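The progressive-growing fade-in mechanism referenced above can be sketched generically as a blend between an upsampled low-resolution branch and a newly added high-resolution branch. This is a minimal spatial-only illustration in NumPy (the paper grows spatiotemporal video layers and adds an SWGAN loss); the shapes and blending schedule are assumptions:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x spatial upsampling, (H, W, C) -> (2H, 2W, C)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def fade_in(low_res_branch, high_res_branch, alpha):
    """Progressive-growing style fade-in: blend the output of the old
    (upsampled) low-resolution branch with the newly added high-resolution
    branch, ramping alpha from 0 to 1 over the course of training."""
    return (1.0 - alpha) * upsample2x(low_res_branch) + alpha * high_res_branch

# toy usage: an 8x8 output being grown to 16x16
old = np.random.randn(8, 8, 3)
new = np.random.randn(16, 16, 3)
blended = fade_in(old, new, alpha=0.3)
print(blended.shape)  # (16, 16, 3)
```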
56 citations
01 Jan 2021
TL;DR: In this article, a learned association network is introduced to propagate the identities of all target candidates from frame to frame, making it possible to track distractor objects alongside the target.
Abstract: The presence of objects that are confusingly similar to the tracked target poses a fundamental challenge in appearance-based visual tracking. Such distractor objects are easily misclassified as the target itself, leading to eventual tracking failure. While most methods strive to suppress distractors through more powerful appearance models, we take an alternative approach.
We propose to keep track of distractor objects in order to continue tracking the target. To this end, we introduce a learned association network, allowing us to propagate the identities of all target candidates from frame-to-frame. To tackle the problem of lacking ground-truth correspondences between distractor objects in visual tracking, we propose a training strategy that combines partial annotations with self-supervision. We conduct comprehensive experimental validation and analysis of our approach on several challenging datasets. Our tracker sets a new state-of-the-art on six benchmarks, achieving an AUC score of 67.2% on LaSOT and a +6.1% absolute gain on the OxUvA long-term dataset.
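A rough sketch of frame-to-frame identity propagation for target and distractor candidates is given below, using a simple greedy cosine-similarity assignment. The paper instead learns the association network end-to-end with partial annotations and self-supervision; the similarity threshold and embeddings here are purely illustrative:

```python
import numpy as np

def propagate_identities(prev_embs, prev_ids, curr_embs, sim_thresh=0.5):
    """Greedy frame-to-frame identity propagation for target + distractor
    candidates based on cosine similarity of appearance embeddings.
    (Illustrative only -- the paper uses a learned association network.)"""
    prev_n = prev_embs / np.linalg.norm(prev_embs, axis=1, keepdims=True)
    curr_n = curr_embs / np.linalg.norm(curr_embs, axis=1, keepdims=True)
    sim = curr_n @ prev_n.T                      # (num_curr, num_prev)
    curr_ids, next_id, taken = [], max(prev_ids) + 1, set()
    for i in range(sim.shape[0]):
        j = int(np.argmax(sim[i]))
        if sim[i, j] > sim_thresh and j not in taken:
            curr_ids.append(prev_ids[j])         # keep the old identity
            taken.add(j)
        else:
            curr_ids.append(next_id)             # a new candidate appears
            next_id += 1
    return curr_ids

# toy usage: three candidates in the previous frame, reshuffled in the current one
prev = np.random.randn(3, 16)
curr = prev[[1, 0, 2]] + 0.01 * np.random.randn(3, 16)
print(propagate_identities(prev, [0, 1, 2], curr))  # e.g. [1, 0, 2]
```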
47 citations
Cited by
3,940 citations
01 Jan 2016
The Senses Considered as Perceptual Systems
854 citations
TL;DR: This article presents a comprehensive survey on deep facial expression recognition (FER), covering datasets and algorithms that provide insights into the intrinsic problems of deep FER: overfitting caused by a lack of sufficient training data, and expression-unrelated variations such as illumination, head pose and identity bias.
Abstract: With the transition of facial expression recognition (FER) from laboratory-controlled to challenging in-the-wild conditions and the recent success of deep learning techniques in various fields, deep neural networks have increasingly been leveraged to learn discriminative representations for automatic FER. Recent deep FER systems generally focus on two important issues: overfitting caused by a lack of sufficient training data and expression-unrelated variations, such as illumination, head pose and identity bias. In this paper, we provide a comprehensive survey on deep FER, including datasets and algorithms that provide insights into these intrinsic problems. First, we describe the standard pipeline of a deep FER system with the related background knowledge and suggestions of applicable implementations for each stage. We then introduce the available datasets that are widely used in the literature and provide accepted data selection and evaluation principles for these datasets. For the state of the art in deep FER, we review existing novel deep neural networks and related training strategies that are designed for FER based on both static images and dynamic image sequences, and discuss their advantages and limitations. Competitive performances on widely used benchmarks are also summarized in this section. We then extend our survey to additional related issues and application scenarios. Finally, we review the remaining challenges and corresponding opportunities in this field as well as future directions for the design of robust deep FER systems.
712 citations
TL;DR: SeqSLAM is a new approach to visual navigation under changing conditions that removes the need for global matching performance by the vision front-end; instead, the front-end must only pick the best match within any short sequence of images.
Abstract: Learning and then recognizing a route, whether travelled during the day or at night, in clear or inclement weather, and in summer or winter is a challenging task for state of the art algorithms in computer vision and robotics. In this paper, we present a new approach to visual navigation under changing conditions dubbed SeqSLAM. Instead of calculating the single location most likely given a current image, our approach calculates the best candidate matching location within every local navigation sequence. Localization is then achieved by recognizing coherent sequences of these “local best matches”. This approach removes the need for global matching performance by the vision front-end - instead it must only pick the best match within any short sequence of images. The approach is applicable over environment changes that render traditional feature-based techniques ineffective. Using two car-mounted camera datasets we demonstrate the effectiveness of the algorithm and compare it to one of the most successful feature-based SLAM algorithms, FAB-MAP. The perceptual change in the datasets is extreme; repeated traverses through environments during the day and then in the middle of the night, at times separated by months or years and in opposite seasons, and in clear weather and extremely heavy rain. While the feature-based method fails, the sequence-based algorithm is able to match trajectory segments at 100% precision with recall rates of up to 60%.
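The sequence-matching idea at the heart of SeqSLAM can be sketched as follows: given a matrix of image difference scores between query and database frames, sum the scores along constant-velocity trajectories and pick the best-scoring database start index. This is a minimal single-velocity NumPy sketch, not the published implementation (which searches over a range of velocities and applies local contrast normalization):

```python
import numpy as np

def seq_match(diff_matrix, seq_len=10):
    """Minimal sequence-based place matching in the spirit of SeqSLAM.

    diff_matrix: (num_query, num_db) image difference scores (lower = more similar).
    For each candidate database start index, sum the differences along a
    constant-velocity trajectory of length seq_len and return the best start.
    """
    nq, ndb = diff_matrix.shape
    assert nq >= seq_len
    rows = np.arange(seq_len)
    best_start, best_score = -1, np.inf
    for start in range(ndb - seq_len + 1):
        score = diff_matrix[rows, start + rows].sum()   # velocity = 1 db frame per query frame
        if score < best_score:
            best_start, best_score = start, score
    return best_start, best_score

# toy usage: the query sequence matches database positions 20..29
D = np.random.rand(10, 100)
D[np.arange(10), 20 + np.arange(10)] = 0.0
print(seq_match(D, seq_len=10))  # best start index 20, score 0.0
```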
686 citations
01 Jun 2006
TL;DR: An apposite and eminently readable reference for all behavioral science research and development.
649 citations