Home
/
Authors
/
Greg Shakhnarovich

Author

Greg Shakhnarovich

Other affiliations: Toyota, University of Chicago, York University ...read more

Bio: Greg Shakhnarovich is an academic researcher from Toyota Technological Institute. The author has contributed to research in topics: American Sign Language & Fingerspelling. The author has an hindex of 7, co-authored 11 publications receiving 225 citations. Previous affiliations of Greg Shakhnarovich include Toyota & University of Chicago.

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

NTIRE 2019 Challenge on Real Image Denoising: Methods and Results

[...]

Abdelrahman Abdelhamed¹, Radu Timofte¹, Michael S. Brown¹, Songhyun Yu¹, Bumjun Park¹, Jechang Jeong¹, Seung-Won Jung¹, Dong-Wook Kim¹, Jae-Ryun Chung¹, Jiaming Liu¹, Yuzhi Wang¹, Chi-Hao Wu¹, Qin Xu¹, Chuan Wang¹, Shaofan Cai¹, Yifan Ding¹, Haoqiang Fan¹, Jue Wang¹, Kai Zhang¹, Wangmeng Zuo¹, Magauiya Zhussip¹, Dongwon Park¹, Shakarim Soltanayev¹, Se Young Chun¹, Zhiwei Xiong¹, Chang Chen¹, Muhammad Haris¹, Kazutoshi Akita¹, Tomoki Yoshida¹, Greg Shakhnarovich¹, Norimichi Ukita¹, Syed Waqas Zamir¹, Aditya Arora¹, Salman Khan¹, Fahad Shahbaz Khan¹, Ling Shao¹, Sung-Jea Ko¹, Dong-Pan Lim¹, Seung-Wook Kim¹, Seo-Won Ji¹, Sang-Won Lee¹, Wenyi Tang¹, Yuchen Fan¹, Yuqian Zhou¹, Ding Liu¹, Thomas S. Huang¹, Deyu Meng¹, Lei Zhang¹, Hongwei Yong¹, Yiyun Zhao¹, Pengliang Tang¹, Yue Lu¹, Raimondo Schettini¹, Simone Bianco¹, Simone Zini¹, Chi Li¹, Yang Wang¹, Zhiguo Cao¹ - Show less +54 more•Institutions (1)

York University¹

16 Jun 2019

TL;DR: The proposed methods by the 15 teams represent the current state-of-the-art performance in image denoising targeting real noisy images.

...read moreread less

Abstract: This paper reviews the NTIRE 2019 challenge on real image denoising with focus on the proposed methods and their results. The challenge has two tracks for quantitatively evaluating image denoising performance in (1) the Bayer-pattern raw-RGB and (2) the standard RGB (sRGB) color spaces. The tracks had 216 and 220 registered participants, respectively. A total of 15 teams, proposing 17 methods, competed in the final phase of the challenge. The proposed methods by the 15 teams represent the current state-of-the-art performance in image denoising targeting real noisy images.

...read moreread less

99 citations

Proceedings Article•DOI•

American Sign Language Fingerspelling Recognition in the Wild

[...]

Bowen Shi¹, Aurora Martinez Del Rio², Jonathan Keane², Jonathan Michaux¹, Diane Brentari², Greg Shakhnarovich¹, Karen Livescu¹ - Show less +3 more•Institutions (2)

Toyota Technological Institute¹, University of Chicago²

01 Dec 2018

TL;DR: In this article, the authors presented the first attempt to recognize fingerspelling sequences in this challenging setting, using videos collected from websites and trained a special-purpose signing hand detector using a small subset of their data.

...read moreread less

Abstract: We address the problem of American Sign Language fingerspelling recognition “in the wild”, using videos collected from websites. We introduce the largest data set available so far for the problem of fingerspelling recognition, and the first using naturally occurring video data. Using this data set, we present the first attempt to recognize fingerspelling sequences in this challenging setting. Unlike prior work, our video data is extremely challenging due to low frame rates and visual variability. To tackle the visual challenges, we train a special-purpose signing hand detector using a small subset of our data. Given the hand detector output, a sequence model decodes the hypothesized fingerspelled letter sequence. For the sequence model, we explore attention-based recurrent encoder-decoders and CTC-based approaches. As the first attempt at fingerspelling recognition in the wild, this work is intended to serve as a baseline for future work on sign language recognition in realistic conditions. We find that, as expected, letter error rates are much higher than in previous work on more controlled data, and we analyze the sources of error and effects of model variants.

...read moreread less

51 citations

Proceedings Article•DOI•

NTIRE 2019 Challenge on Image Enhancement: Methods and Results

[...]

Andrey Ignatov¹, Radu Timofte¹, Xiaochao Qu¹, Xingguang Zhou¹, Ting Liu¹, Pengfei Wan¹, Syed Waqas Zamir¹, Aditya Arora¹, Salman Khan¹, Fahad Shahbaz Khan¹, Ling Shao¹, Dongwon Park¹, Se Young Chun¹, Pablo Navarrete Michelini¹, Hanwen Liu¹, Dan Zhu¹, Zhiwei Zhong¹, Xianming Liu¹, Junjun Jiang¹, Debin Zhao¹, Muhammad Haris¹, Kazutoshi Akita¹, Tomoki Yoshida¹, Greg Shakhnarovich¹, Norimichi Ukita¹, Jie Liu¹, Cheolkon Jung¹, Raimondo Schettini¹, Simone Bianco¹, Claudio Cusano¹, Flavio Piccoli¹, Pengju Liu¹, Kai Zhang¹, Jingdong Liu¹, Jiye Liu¹, Hongzhi Zhang¹, Wangmeng Zuo¹, Nelson Chong Ngee Bow¹, Lai-Kuan Wong¹, John See¹, Jinghui Qin¹, Lishan Huang¹, Yukai Shi¹, Pengxu Wei¹, Wushao Wen¹, Liang Lin¹, Zheng Hui¹, Xiumei Wang¹, Xinbo Gao¹, Kanti Kumari¹, Vikas Kumar Anand¹, Mahendra Khened¹, Ganapathy Krishnamurthi¹ - Show less +49 more•Institutions (1)

ETH Zurich¹

16 Jun 2019

TL;DR: The first NTIRE challenge on perceptual image enhancement as discussed by the authors focused on proposed solutions and results of real-world photo enhancement problem, where the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with Canon 70D DSLR camera.

...read moreread less

Abstract: This paper reviews the first NTIRE challenge on perceptual image enhancement with the focus on proposed solutions and results. The participating teams were solving a real-world photo enhancement problem, where the goal was to map low-quality photos from the iPhone 3GS device to the same photos captured with Canon 70D DSLR camera. The considered problem embraced a number of computer vision subtasks, such as image denoising, image resolution and sharpness enhancement, image color/contrast/exposure adjustment, etc. The target metric used in this challenge combined PSNR and SSIM scores with solutions' perceptual results measured in the user study. The proposed solutions significantly improved baseline results, defining the state-of-the-art for practical image enhancement.

...read moreread less

45 citations

Proceedings Article•DOI•

Fingerspelling Recognition in the Wild With Iterative Visual Attention

[...]

Bowen Shi¹, Aurora Martinez Del Rio², Jonathan Keane², Diane Brentari², Greg Shakhnarovich, Karen Livescu - Show less +2 more•Institutions (2)

Toyota Technological Institute at Chicago¹, University of Chicago²

28 Aug 2019

TL;DR: This work proposes an end-to-end model based on an iterative attention mechanism, without explicit hand detection or segmentation, that out-performs prior work by a large margin on recognition of fingerspelling sequences in ASL videos collected in the wild.

...read moreread less

Abstract: Sign language recognition is a challenging gesture sequence recognition problem, characterized by quick and highly coarticulated motion. In this paper we focus on recognition of fingerspelling sequences in American Sign Language (ASL) videos collected in the wild, mainly from YouTube and Deaf social media. Most previous work on sign language recognition has focused on controlled settings where the data is recorded in a studio environment and the number of signers is limited. Our work aims to address the challenges of real-life data, reducing the need for detection or segmentation modules commonly used in this domain. We propose an end-to-end model based on an iterative attention mechanism, without explicit hand detection or segmentation. Our approach dynamically focuses on increasingly high-resolution regions of interest. It out-performs prior work by a large margin. We also introduce a newly collected data set of crowdsourced annotations of fingerspelling in the wild, and show that performance can be further improved with this additional data set.

...read moreread less

38 citations

Proceedings Article•DOI•

AIM 2019 Challenge on RAW to RGB Mapping: Methods and Results

[...]

Andrey Ignatov¹, Juncheng Li, Jiajie Zhang, Haoyu Wu, Jie Li, Rui Huang, Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita, Yuzhi Zhao, Lai-Man Po, Radu Timofte, Tiantian Zhang, Zongbang Liao, Xiang Shi, Zhang Yujia, Weifeng Ou, Pengfei Xian, Jingjing Xiong, Chang Zhou, Wing Yin Yu, Yubin Yubin, Sung-Jea Ko, Bingxin Hou, Bum Jun Park, Songhyun Yu, Sangmin Kim, Jechang Jeong, Seung Wook Kim, Kwang-Hyun Uhm, Seo-Won Ji, Sung-Jin Cho, Jun-Pyo Hong, Kangfu Mei - Show less +30 more•Institutions (1)

ETH Zurich¹

01 Oct 2019

TL;DR: This paper reviews the first AIM challenge on mapping camera RAW toRGB images with the focus on proposed solutions and results, defining the state-of-the-art for RAW to RGB image restoration.

...read moreread less

Abstract: This paper reviews the first AIM challenge on mapping camera RAW to RGB images with the focus on proposed solutions and results. The participating teams were solving a real-world photo enhancement problem, where the goal was to map the original low-quality RAW images from the Huawei P20 device to the same photos captured with the Canon 5D DSLR camera. The considered problem embraced a number of computer vision subtasks, such as image demosaicing, denoising, gamma correction, image resolution and sharpness enhancement, etc. The target metric used in this challenge combined fidelity scores (PSNR and SSIM) with solutions' perceptual results measured in a user study. The proposed solutions significantly improved baseline results, defining the state-of-the-art for RAW to RGB image restoration.

...read moreread less

37 citations

Cited by

PDF

Open Access

More filters

Posted Content•

Deep High-Resolution Representation Learning for Visual Recognition

[...]

Jingdong Wang¹, Ke Sun², Tianheng Cheng³, Borui Jiang⁴, Chaorui Deng⁵, Yang Zhao⁶, Dong Liu², Yadong Mu⁴, Mingkui Tan⁵, Xinggang Wang³, Wenyu Liu³, Bin Xiao¹ - Show less +8 more•Institutions (6)

Microsoft¹, University of Science and Technology of China², Huazhong University of Science and Technology³, Peking University⁴, South China University of Technology⁵, Griffith University⁶

20 Aug 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: The superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, is shown, suggesting that the HRNet is a stronger backbone for computer vision problems.

...read moreread less

Abstract: High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions \emph{in series} (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named as High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) Connect the high-to-low resolution convolution streams \emph{in parallel}; (ii) Repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the codes are available at~{\url{this https URL}}.

...read moreread less

1,278 citations

Journal Article•DOI•

Deep High-Resolution Representation Learning for Visual Recognition

[...]

Microsoft¹, University of Science and Technology of China², Huazhong University of Science and Technology³, Peking University⁴, South China University of Technology⁵, Griffith University⁶

01 Oct 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The High-Resolution Network (HRNet) as mentioned in this paper maintains high-resolution representations through the whole process by connecting the high-to-low resolution convolution streams in parallel and repeatedly exchanging the information across resolutions.

...read moreread less

Abstract: High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named as High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) Connect the high-to-low resolution convolution streams in parallel and (ii) repeatedly exchange the information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the codes are available at https://github.com/HRNet .

...read moreread less

1,162 citations

Proceedings Article•DOI•

Multi-Stage Progressive Image Restoration

[...]

Syed Waqas Zamir, Aditya Arora, Salman Khan¹, Munawar Hayat², Fahad Shahbaz Khan¹, Ming-Hsuan Yang³, Ling Shao - Show less +3 more•Institutions (3)

Zayed University¹, Monash University², University of California, Merced³

04 Feb 2021

TL;DR: MPRNet as discussed by the authors proposes a multi-stage architecture that progressively learns restoration functions for the degraded inputs, thereby breaking down the overall recovery process into more manageable steps, and introduces a novel per-pixel adaptive design that leverages in-situ supervised attention to reweight the local features.

...read moreread less

Abstract: Image restoration tasks demand a complex balance between spatial details and high-level contextualized information while recovering images. In this paper, we propose a novel synergistic design that can optimally balance these competing goals. Our main proposal is a multi-stage architecture, that progressively learns restoration functions for the degraded inputs, thereby breaking down the overall recovery process into more manageable steps. Specifically, our model first learns the contextualized features using encoder-decoder architectures and later combines them with a high-resolution branch that retains local information. At each stage, we introduce a novel per-pixel adaptive design that leverages in-situ supervised attention to reweight the local features. A key ingredient in such a multi-stage architecture is the information exchange between different stages. To this end, we propose a two-faceted approach where the information is not only exchanged sequentially from early to late stages, but lateral connections between feature processing blocks also exist to avoid any loss of information. The resulting tightly interlinked multi-stage architecture, named as MPRNet, delivers strong performance gains on ten datasets across a range of tasks including image deraining, deblurring, and denoising. The source code and pre-trained models are available at https://github.com/swz30/MPRNet.

...read moreread less

716 citations

Proceedings Article•DOI•

Sign Language Recognition, Generation, and Translation: An Interdisciplinary Perspective

[...]

Danielle Bragg¹, Oscar Koller¹, Mary Bellard¹, Larwan Berke², Patrick Boudreault³, Annelies Braffort, Naomi Caselli⁴, Matt Huenerfauth², Hernisa Kacorri⁵, Tessa Verhoef⁶, Christian Vogler³, Meredith Ringel Morris¹ - Show less +8 more•Institutions (6)

Microsoft¹, Rochester Institute of Technology², University of Washington³, Boston University⁴, University of Maryland, College Park⁵, Leiden University⁶

24 Oct 2019

TL;DR: The results of an interdisciplinary workshop are presented, providing key background that is often overlooked by computer scientists, a review of the state-of-the-art, a set of pressing challenges, and a call to action for the research community.

...read moreread less

Abstract: Developing successful sign language recognition, generation, and translation systems requires expertise in a wide range of fields, including computer vision, computer graphics, natural language processing, human-computer interaction, linguistics, and Deaf culture. Despite the need for deep interdisciplinary knowledge, existing research occurs in separate disciplinary silos, and tackles separate portions of the sign language processing pipeline. This leads to three key questions: 1) What does an interdisciplinary view of the current landscape reveal? 2) What are the biggest challenges facing the field? and 3) What are the calls to action for people working in the field? To help answer these questions, we brought together a diverse group of experts for a two-day workshop. This paper presents the results of that interdisciplinary workshop, providing key background that is often overlooked by computer scientists, a review of the state-of-the-art, a set of pressing challenges, and a call to action for the research community.

...read moreread less

237 citations

Proceedings Article•DOI•

Image Super-Resolution with Non-Local Sparse Attention

[...]

Yiqun Mei¹, Yuchen Fan¹, Yuqian Zhou¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Jun 2021

TL;DR: Non-local sparse attention (NLSA) as mentioned in this paper is designed to retain long-range modeling capability from non-local operation while enjoying robustness and high-efficiency of sparse representation, which partitions the input space into hash buckets of related features.

...read moreread less

Abstract: Both Non-Local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). In this paper, we investigate their combinations and propose a novel Non-Local Sparse Attention (NLSA) with dynamic sparse attention pattern. NLSA is designed to retain long-range modeling capability from NL operation while enjoying robustness and high-efficiency of sparse representation. Specifically, NLSA rectifies non-local attention with spherical locality sensitive hashing (LSH) that partitions the input space into hash buckets of related features. For every query signal, NLSA assigns a bucket to it and only computes attention within the bucket. The resulting sparse attention prevents the model from attending to locations that are noisy and less-informative, while reducing the computational cost from quadratic to asymptotic linear with respect to the spatial size. Extensive experiments validate the effectiveness and efficiency of NLSA. With a few non-local sparse attention modules, our architecture, called non-local sparse network (NLSN), reaches state-of-the-art performance for SISR quantitatively and qualitatively.

...read moreread less

216 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54

Collapse