scispace - formally typeset
Author

Li Ma

Bio: Li Ma is an academic researcher. The author has contributed to research on topics including frequency-domain analysis and convolutional neural networks, and has co-authored 1 publication.

Papers
Posted Content
TL;DR: In this article, the phase spectrum of the current image is re-combined with the amplitude spectrum of a distracter image to improve robustness to common perturbations and out-of-distribution detection; the generated samples force the CNN to pay more attention to the structured information in the phase components and to stay robust to variations in amplitude.
Abstract: Recently, the generalization behavior of Convolutional Neural Networks (CNNs) has gradually become transparent through explanation techniques based on frequency-component decomposition. However, the importance of the image's phase spectrum for a robust vision system is still ignored. In this paper, we notice that the CNN tends to converge at a local optimum closely related to the high-frequency components of the training images, whereas the amplitude spectrum is easily disturbed by noise or common corruptions. In contrast, empirical studies have found that humans rely more on phase components to achieve robust recognition. This observation leads to further explanations of the CNN's generalization behavior in both robustness to common perturbations and out-of-distribution detection, and motivates a new data augmentation perspective designed by re-combining the phase spectrum of the current image with the amplitude spectrum of a distracter image. That is, the generated samples force the CNN to pay more attention to the structured information in the phase components and to stay robust to variations in amplitude. Experiments on several image datasets indicate that the proposed method achieves state-of-the-art performance on multiple generalization and calibration tasks, including adaptability to common corruptions and surface variations, out-of-distribution detection, and adversarial attack.
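The augmentation the abstract describes — keeping the phase of the current image while swapping in a distracter's amplitude — can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation; the function name, per-channel FFT axes, and the final clipping to [0, 255] are assumptions.

```python
import numpy as np

def amplitude_phase_recombine(image, distracter):
    """Keep the phase spectrum of `image` and the amplitude spectrum
    of `distracter` (2-D FFT applied per channel)."""
    f_img = np.fft.fft2(image, axes=(0, 1))
    f_dis = np.fft.fft2(distracter, axes=(0, 1))
    # Recombine: distracter amplitude with the current image's phase.
    mixed = np.abs(f_dis) * np.exp(1j * np.angle(f_img))
    out = np.fft.ifft2(mixed, axes=(0, 1)).real
    return np.clip(out, 0.0, 255.0)
```

A useful sanity check is that recombining an image with itself is the identity, since amplitude times the unit-phase factor reconstructs the original spectrum exactly.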

9 citations


Cited by
Book ChapterDOI
TL;DR: This paper uses a simple classifier trained on noise patterns to detect images from a wide range of generative models, including GANs and flow-based models.
Abstract: The widespread use of generative models has called into question the authenticity of many things on the web. In this situation, the task of image forensics is urgent. Existing methods examine generated images and claim a forgery by detecting visual artifacts or invisible patterns, resulting in generalization issues. We observed that the noise pattern of real images exhibits similar characteristics in the frequency domain, while that of generated images is far different. Therefore, we can perform image authentication by checking whether an image follows the patterns of authentic images. Experiments show that a simple classifier using noise patterns can easily detect a wide range of generative models, including GANs and flow-based models. Our method achieves state-of-the-art performance on both low- and high-resolution images from a wide range of generative models and shows superior generalization ability to unseen models. The code is available at https://github.com/Tangsenghenshou/Detecting-Generated-Images-by-Real-Images. Keywords: Image forensics, Forgery detection, Image noise, Frequency domain analysis, GAN, Generated images
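The detection idea — extract a noise residual, then characterize it in the frequency domain — can be sketched as follows. This is a hedged illustration: the 3×3 mean-blur denoiser, the radial cutoff, and the log-spectrum feature are stand-in assumptions, not the paper's actual pipeline.

```python
import numpy as np

def noise_residual(img):
    """Crude noise extraction: subtract a 3x3 mean blur
    (an assumed stand-in for the paper's denoiser)."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    blur = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return img - blur

def highfreq_energy(img, cutoff=0.25):
    """Mean log-magnitude of the residual spectrum outside a
    low-frequency disc; a simple classifier would threshold or
    learn on features like this."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(noise_residual(img))))
    h, w = spec.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    return float(np.log1p(spec[r > cutoff]).mean())
```

By construction, a noise-free flat image scores zero on this feature, while a noisy one scores higher, which is the kind of separation the classifier exploits.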

5 citations

Book ChapterDOI
01 Jan 2022
TL;DR: In this article, the authors propose PRIME, a general data augmentation scheme that relies on simple yet rich families of max-entropy image transformations to improve robustness to common corruptions.
Abstract: Despite their impressive performance on image classification tasks, deep networks have a hard time generalizing to unforeseen corruptions of their data. To fix this vulnerability, prior works have built complex data augmentation strategies, combining multiple methods to enrich the training data. However, introducing intricate design choices or heuristics makes it hard to understand which elements of these methods are indeed crucial for improving robustness. In this work, we take a step back and follow a principled approach to achieve robustness to common corruptions. We propose PRIME, a general data augmentation scheme that relies on simple yet rich families of max-entropy image transformations. PRIME outperforms the prior art in terms of corruption robustness, while its simplicity and plug-and-play nature enable combination with other methods to further boost their robustness. We analyze PRIME to shed light on the importance of the mixing strategy on synthesizing corrupted images, and to reveal the robustness-accuracy trade-offs arising in the context of common corruptions. Finally, we show that the computational efficiency of our method allows it to be easily used in both on-line and off-line data augmentation schemes. Our code is available at https://github.com/amodas/PRIME-augmentations .
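The core recipe — sample transformations from simple max-entropy families, then take a convex mixture with the clean image — can be sketched as below. This is only loosely modeled on PRIME: the random Fourier-gain family, the Dirichlet mixing weights, and the `alpha` blend are illustrative assumptions, not the paper's parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spectral_filter(img, strength=0.5):
    """One illustrative max-entropy family: random multiplicative
    filtering in the Fourier domain."""
    f = np.fft.fft2(img)
    gain = 1.0 + strength * rng.standard_normal(f.shape)
    # Take the real part: the random gain breaks conjugate symmetry.
    return np.fft.ifft2(f * gain).real

def prime_style_mix(img, width=3, alpha=0.5):
    """Convex mixture of independently transformed copies, blended
    with the clean image (loosely following PRIME's mixing strategy)."""
    weights = rng.dirichlet(np.ones(width))
    mixed = sum(w * random_spectral_filter(img) for w in weights)
    return alpha * img + (1 - alpha) * mixed
```

The Dirichlet weights keep the mixture convex, which is one way to control how far augmented samples drift from the clean image.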

3 citations

Journal ArticleDOI
TL;DR: In this paper, Wang et al. develop an end-to-end amplitude-phase channel attention network (APCAN) for SIM reconstruction based on a-priori frequency and temporal knowledge.
Abstract: Structured illumination microscopy (SIM) has been a popular method for live-cell super-resolution (SR) imaging due to its excellent photon efficiency. However, SIM is often marred by artifacts in the reconstructed SR images. To address this problem, we develop an end-to-end amplitude-phase channel attention network (APCAN) for SIM reconstruction based on a-priori frequency and temporal knowledge. The APCAN reinforces both the amplitude and phase information of the raw images to guide network reconstruction and attains SR images with fewer artifacts. Moreover, inspired by the continuity knowledge in the traditional method, we design a temporal processing module in APCAN to utilize data from multiple time points. Trained on a video dataset imaged with our setup, APCAN can reconstruct an artifact-minimized SR image, achieving a 38% reduction in reconstruction errors. Finally, we demonstrate that the APCAN-reconstructed images have great resistance to noise and photobleaching, achieving the best fidelity among all methods tested.
Book ChapterDOI
01 Jan 2022
TL;DR: In this paper, the authors revisit the significance of shape-biases for the classification of skin lesion images, showing that deep feature extractors remain inclined towards learning entangled features for skin lesion classification, while individual features can still be decoded from this entangled representation.
Abstract: It is generally believed that the human visual system is biased towards the recognition of shapes rather than textures. This assumption has led to a growing body of work aiming to align deep models’ decision-making processes with the fundamental properties of human vision. The reliance on shape features is primarily expected to improve the robustness of these models under covariate shift. In this paper, we revisit the significance of shape-biases for the classification of skin lesion images. Our analysis shows that different skin lesion datasets exhibit varying biases towards individual image features. Interestingly, despite deep feature extractors being inclined towards learning entangled features for skin lesion classification, individual features can still be decoded from this entangled representation. This indicates that these features are still represented in the learnt embedding spaces of the models, but not used for classification. In addition, the spectral analysis of different datasets shows that in contrast to common visual recognition, dermoscopic skin lesion classification, by nature, is reliant on complex feature combinations beyond shape-bias. As a natural consequence, shifting away from the prevalent desire of shape-biasing models can even improve skin lesion classifiers in some cases.
Journal ArticleDOI
TL;DR: In this article, a novel data augmentation method called random image frequency aggregation dropout (RIFAD) is proposed, which consists of two sub-algorithms: Fourier spectrum analysis (FSA) and frequency aggregation dropout (FAD).