
Showing papers by "Ivor W. Tsang" published in 2023


Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a novel data-dependent hashing method named unfolded self-reconstruction locality-sensitive hashing (USR-LSH), which unfolds the optimization update for instance-wise data reconstruction and preserves data information better than data-independent LSH.
Abstract: Approximate nearest neighbour (ANN) search is an essential component of search engines, recommendation systems, etc. Many recent works focus on learning-based data-distribution-dependent hashing and achieve good retrieval performance. However, due to increasing demand for users' privacy and security, we often need to remove users' data information from Machine Learning (ML) models to satisfy specific privacy and security requirements. This need requires the ANN search algorithm to support fast online data deletion and insertion. Current learning-based hashing methods require retraining the hash function, which is prohibitive due to the vast time cost on large-scale data. To address this problem, we propose a novel data-dependent hashing method named unfolded self-reconstruction locality-sensitive hashing (USR-LSH). Our USR-LSH unfolds the optimization update for instance-wise data reconstruction, which is better for preserving data information than data-independent LSH. Moreover, our USR-LSH supports fast online data deletion and insertion without retraining. To the best of our knowledge, we are the first to address machine unlearning for retrieval problems. Empirically, we demonstrate that USR-LSH outperforms the state-of-the-art data-distribution-independent LSH in ANN tasks in terms of precision and recall. We also show that USR-LSH has significantly faster data deletion and insertion time than learning-based data-dependent hashing.
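To make the online-update requirement concrete, here is a minimal sketch of an LSH index with retraining-free insertion and deletion. The random-hyperplane hash below is the classic data-independent LSH, and the class and parameter names are illustrative; USR-LSH's unfolded self-reconstruction update is not reproduced, so treat this only as a sketch of the interface the paper targets.

```python
# Sketch: a data-independent LSH index supporting online insert/delete.
import numpy as np
from collections import defaultdict

class OnlineLSHIndex:  # hypothetical name, for illustration only
    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))  # random hyperplanes
        self.buckets = defaultdict(dict)                  # hash code -> {id: vector}

    def _hash(self, x):
        return tuple((self.planes @ x > 0).astype(int))   # sign pattern as code

    def insert(self, item_id, x):
        self.buckets[self._hash(x)][item_id] = x          # O(n_bits * dim)

    def delete(self, item_id, x):
        self.buckets[self._hash(x)].pop(item_id, None)    # no retraining needed

    def query(self, q, k=5):
        cand = self.buckets.get(self._hash(q), {})
        return sorted(cand, key=lambda i: np.linalg.norm(cand[i] - q))[:k]

rng = np.random.default_rng(1)
index = OnlineLSHIndex(dim=32)
data = rng.standard_normal((100, 32))
for i, v in enumerate(data):
    index.insert(i, v)
index.delete(0, data[0])          # instant "unlearning" of item 0
print(index.query(data[1]))       # ids of near neighbours in the same bucket
```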

2 citations


DOI
TL;DR: In this paper, a consensus graph is obtained by fusing multiple pure graphs and is structurized to contain exactly c connected components, where each connected component precisely corresponds to an individual cluster.
Abstract: In multi-view subspace clustering, it is significant to find a common latent space in which the multi-view datasets are located. A number of multi-view subspace clustering methods have been proposed to explore the common latent subspace and achieved promising performance. However, previous multi-view subspace clustering algorithms seldom consider multi-view consistency and multi-view diversity, let alone take them into consideration simultaneously. In this paper, we propose a novel multi-view subspace clustering method that jointly measures consistency and diversity, which is able to incorporate these two complementary criteria seamlessly into a holistic design of the clustering algorithm. The proposed model first searches a pure graph for each view by detecting the intrinsic consistent and diverse parts. A consensus graph is then obtained by fusing the multiple pure graphs. Moreover, the consensus graph is structurized to contain exactly $c$ connected components, where $c$ is the number of clusters. In this way, the final clustering result can be obtained directly, since each connected component precisely corresponds to an individual cluster. Extensive experimental studies on various datasets manifest that our model achieves performance comparable to that of other state-of-the-art methods.
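As a toy illustration of the connected-component mechanism: once a fused graph has exactly c connected components, cluster labels fall out with no postprocessing. The averaging fusion and the block-structured toy graphs below are assumptions for the sketch; the paper instead learns the pure graphs and enforces the component structure during optimization.

```python
# Sketch: read clusters directly off a fused graph's connected components.
import numpy as np
from scipy.sparse.csgraph import connected_components

def fuse_views(graphs):
    return np.mean(graphs, axis=0)                 # stand-in for learned fusion

def cluster_from_graph(W, threshold):
    W = np.where(W > threshold, W, 0.0)            # drop negligible edges
    n_comp, labels = connected_components(W, directed=False)
    return n_comp, labels                          # each component = one cluster

# Two toy views over 6 samples forming 2 blocks (the ground-truth clusters).
block = np.ones((3, 3))
view1 = np.block([[block, np.zeros((3, 3))], [np.zeros((3, 3)), block]])
view2 = view1 + 0.01 * np.random.default_rng(0).random((6, 6))
view2 = (view2 + view2.T) / 2                      # keep the graph symmetric

c, labels = cluster_from_graph(fuse_views([view1, view2]), threshold=0.5)
print(c, labels)   # 2 components -> labels assigned without postprocessing
```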

2 citations


Journal ArticleDOI
TL;DR: Li et al. as discussed by the authors proposed a causal intervention module for related work generation (CaM) to effectively capture causalities in the generation process and improve the quality and coherence of the generated related works.
Abstract: Abstractive related work generation has attracted increasing attention in generating coherent related work that better helps readers grasp the background of the current research. However, most existing abstractive models ignore the inherent causality of related work generation, leading to low quality of generated related work and spurious correlations that affect the models' generalizability. In this study, we argue that causal intervention can address these limitations and improve the quality and coherence of the generated related works. To this end, we propose a novel Causal Intervention Module for Related Work Generation (CaM) to effectively capture causalities in the generation process and improve the quality and coherence of the generated related works. Specifically, we first model the relations among sentence order, document relation, and transitional content in related work generation using a causal graph. Then, to implement the causal intervention and mitigate the negative impact of spurious correlations, we use do-calculus to derive ordinary conditional probabilities and identify causal effects through CaM. Finally, we subtly fuse CaM with Transformer to obtain an end-to-end generation model. Extensive experiments on two real-world datasets show that causal interventions in CaM can effectively encourage the model to learn causal relations and produce related work of higher quality and coherence.
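For readers unfamiliar with the do-calculus step mentioned above, the standard backdoor adjustment shows how an interventional quantity reduces to ordinary conditional probabilities. This is a textbook identity offered for orientation, not the paper's exact derivation:

```latex
% Backdoor adjustment: with a confounder Z blocking all backdoor paths
% from X to Y, the interventional distribution reduces to ordinary
% conditional probabilities that can be estimated from data.
P(Y \mid \mathrm{do}(X)) \;=\; \sum_{z} P(Y \mid X, Z = z)\, P(Z = z)
```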

1 citation


Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors formulate shadow removal as an adaptive fusion task that takes advantage of both shadow removal and image inpainting, and develop an adaptive fusion network consisting of two encoders, an adaptive fusion block, and a decoder.
Abstract: Fully-supervised shadow removal methods achieve the best restoration qualities on public datasets but still generate some shadow remnants. One of the reasons is the lack of large-scale shadow&shadow-free image pairs. Unsupervised methods can alleviate the issue, but their restoration qualities are much lower than those of fully-supervised methods. In this work, we find that pretraining shadow removal networks on an image inpainting dataset can reduce the shadow remnants significantly: a naive encoder-decoder network achieves restoration quality competitive with the state-of-the-art methods using only 10% of the shadow&shadow-free image pairs. After analyzing networks with/without inpainting pretraining via the information stored in the weights (IIW), we find that inpainting pretraining improves restoration quality in non-shadow regions and enhances the generalization ability of networks significantly. Additionally, shadow removal fine-tuning enables networks to fill in the details of shadow regions. Inspired by these observations, we formulate shadow removal as an adaptive fusion task that takes advantage of both shadow removal and image inpainting. Specifically, we develop an adaptive fusion network consisting of two encoders, an adaptive fusion block, and a decoder. The two encoders are responsible for extracting features from the shadow image and the shadow-masked image, respectively. The adaptive fusion block is responsible for combining these features in an adaptive manner. Finally, the decoder converts the adaptively fused features to the desired shadow-free result. Extensive experiments show that our method, empowered with inpainting, outperforms all state-of-the-art methods.
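A minimal PyTorch sketch of the two-encoder / adaptive-fusion / decoder layout described above. The layer sizes, single-conv blocks, and sigmoid-gated fusion are illustrative assumptions rather than the paper's exact architecture.

```python
# Sketch: two encoders, an adaptive (gated) fusion block, and a decoder.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class AdaptiveFusionNet(nn.Module):  # hypothetical name
    def __init__(self, ch=32):
        super().__init__()
        self.enc_shadow = conv_block(3, ch)   # encodes the shadow image
        self.enc_masked = conv_block(3, ch)   # encodes the shadow-masked image
        self.gate = nn.Sequential(            # adaptive fusion block:
            nn.Conv2d(2 * ch, ch, 1),         # per-pixel weights between
            nn.Sigmoid())                     # the two feature streams
        self.dec = nn.Conv2d(ch, 3, 3, padding=1)  # decodes shadow-free output

    def forward(self, shadow_img, masked_img):
        fs = self.enc_shadow(shadow_img)
        fm = self.enc_masked(masked_img)
        g = self.gate(torch.cat([fs, fm], dim=1))
        fused = g * fs + (1 - g) * fm         # adaptive convex combination
        return self.dec(fused)

net = AdaptiveFusionNet()
out = net(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```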

1 citation


Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors formulate a novel model to simultaneously learn a robust structured similarity graph and perform multi-view clustering, where the graph is adaptively learned based on a latent representation that is invulnerable to noise and outliers.
Abstract: Multi-view clustering aims to reveal the correlation between different input modalities in an unsupervised way. Similarity between data samples can be described by a similarity graph, which governs the quality of multi-view clustering. However, existing multi-view graph learning methods mainly construct the similarity graph from raw features, which are unreliable, as real-world datasets usually contain noise, outliers, or even redundant information. In this paper, we formulate a novel model to simultaneously learn a robust structured similarity graph and perform multi-view clustering. The similarity graph is adaptively learned based on a latent representation that is invulnerable to noise and outliers. Furthermore, the similarity graph is enforced to contain a clear structure, i.e., the number of connected components of the target graph is exactly equal to the ground-truth class number. Consequently, the label for each data sample can be directly assigned without any postprocessing. As a result, our model accomplishes three subtasks: latent representation extraction, similarity graph learning, and cluster label allocation, in a unified framework. These three subtasks are seamlessly integrated and can mutually boost each other towards the overall optimal solution. An efficient alternating algorithm is proposed to solve the optimization problem. Experimental results on several benchmark datasets illustrate the effectiveness of the proposed model.

1 citation


Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a cross-domain meta-augmentation technique for content-aware recommendation systems (MetaCAR) to construct mutually-exclusive (ME) tasks.
Abstract: Cold-start has become critical for recommendations, especially under sparse user-item interactions. Recent approaches based on meta-learning succeed in alleviating the issue, owing to the fact that these methods have strong generalization, so they can fast adapt to new tasks under cold-start settings. However, meta-learning-based recommendation models learned with single and sparse ratings easily fall into meta-overfitting, since the one and only rating $r_{ui}$ to a specific item $i$ cannot reflect a user's diverse interests under various circumstances (e.g., time, mood, age, etc.), i.e., even if $r_{ui}$ equals 1 in the historical dataset, $r_{ui}$ could be 0 in some circumstances. In meta-learning, tasks with these single ratings are called Non-Mutually-Exclusive (Non-ME) tasks, and tasks with diverse ratings are called Mutually-Exclusive (ME) tasks. Fortunately, a meta-augmentation technique has been proposed to relieve meta-overfitting for meta-learning methods by transforming Non-ME tasks into ME tasks, adding noise to labels without changing inputs. Motivated by the meta-augmentation method, in this paper we propose a cross-domain meta-augmentation technique for content-aware recommendation systems (MetaCAR) to construct ME tasks in the recommendation scenario. Our proposed method consists of two stages: meta-augmentation and meta-learning. In the meta-augmentation stage, we first conduct domain adaptation by a dual conditional variational autoencoder (CVAE) with a multi-view information bottleneck constraint, and then apply the learned CVAE to generate ratings for users in the target domain. In the meta-learning stage, we introduce both the true and generated ratings to construct ME tasks, which enables the meta-learning recommender to avoid meta-overfitting. Experiments on real-world datasets show the significant superiority of MetaCAR in coping with the cold-start user issue over competing baselines, including cross-domain, content-aware, and meta-learning-based recommendations.
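A minimal sketch of the meta-augmentation idea referenced above: make tasks mutually exclusive by perturbing labels while keeping inputs fixed, so a single (user, item) input no longer maps to one deterministic rating. The random flipping below stands in for the paper's CVAE rating generator, which is not reproduced here.

```python
# Sketch: turn a Non-ME task into ME tasks by adding label noise only.
import numpy as np

rng = np.random.default_rng(0)

def augment_task(inputs, ratings, n_copies=3, flip_prob=0.3):
    """Duplicate a task with independently flipped binary ratings."""
    tasks = []
    for _ in range(n_copies):
        flips = rng.random(len(ratings)) < flip_prob
        tasks.append((inputs, np.where(flips, 1 - ratings, ratings)))
    return tasks  # same inputs, diverse labels -> mutually-exclusive tasks

inputs = np.arange(5)                 # placeholder (user, item) pair ids
ratings = np.array([1, 0, 1, 1, 0])
for x, y in augment_task(inputs, ratings):
    print(y)                          # labels differ across copies; inputs do not
```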

1 citation


Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed to handle data restoration for shadow regions and identical mapping for non-shadow regions separately and leverage the identical mapping results to guide the shadow restoration in an iterative manner.
Abstract: Shadow removal is to restore shadow regions to their shadow-free counterparts while leaving non-shadow regions unchanged. State-of-the-art shadow removal methods train deep neural networks on collected shadow&shadow-free image pairs, which are expected to complete two distinct tasks via shared weights, i.e., data restoration for shadow regions and identical mapping for non-shadow regions. We find that these two tasks exhibit poor compatibility, and using shared weights for both could lead to the model being optimized towards only one task instead of both during training. Note that such a key issue is not identified by existing deep learning-based shadow removal methods. To address this problem, we propose to handle these two tasks separately and leverage the identical mapping results to guide the shadow restoration in an iterative manner. Specifically, our method consists of three components: an identical mapping branch (IMB) for non-shadow region processing, an iterative de-shadow branch (IDB) for shadow region restoration based on the identical mapping results, and a smart aggregation block (SAB). The IMB aims to reconstruct an image that is identical to the input one, which benefits the restoration of non-shadow regions without explicitly distinguishing between shadow and non-shadow regions. Utilizing the multi-scale features extracted by the IMB, the IDB can effectively transfer information from non-shadow regions to shadow regions progressively, facilitating shadow removal. The SAB is designed to adaptively integrate features from both the IMB and IDB. Moreover, it generates a finely tuned soft shadow mask that guides the process of removing shadows. Extensive experiments demonstrate that our method outperforms all the state-of-the-art shadow removal approaches on the widely used shadow removal datasets.

Journal ArticleDOI
TL;DR: Gong et al. as discussed by the authors proposed a general model for learning with incomplete data, which can be appropriately adjusted to different missingness patterns, alleviating competition between data, and further introduced a low-rank constraint to promote the generalization ability of the model.
Abstract: Many real-world problems deal with collections of data with missing values, e.g., RNA sequential analytics, image completion, video processing, etc. Usually, such missing data is a serious impediment to good learning performance. Existing methods tend to use a universal model for all incomplete data, resulting in a suboptimal model for each missingness pattern. In this paper, we present a general model for learning with incomplete data. The proposed model can be appropriately adjusted to different missingness patterns, alleviating competition between data. Our model is based on observable features only, so it does not incur errors from data imputation. We further introduce a low-rank constraint to promote the generalization ability of our model. Analysis of the generalization error justifies our idea theoretically. In addition, a subgradient method is proposed to optimize our model with a proven convergence rate. Experiments on different types of data show that our method compares favorably with typical imputation strategies and other state-of-the-art models for incomplete data. More importantly, our method can be seamlessly incorporated into neural networks, with the best results achieved. The source code is released at https://github.com/YS-GONG/missingness-patterns.
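A minimal sketch of the "observable features only" principle: each sample's prediction uses just its non-missing entries, so no imputation error enters the loss. The per-pattern adjustment and the low-rank constraint of the full model are simplified away here, leaving a single shared weight vector and a plain (sub)gradient loop.

```python
# Sketch: train on observed entries only, no imputation.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
mask = rng.random((n, d)) > 0.3           # True where a feature is observed
y = (X * mask) @ w_true                   # toy targets from observed entries

w = np.zeros(d)
lr = 0.05
for _ in range(500):
    pred = (X * mask) @ w                 # missing entries contribute nothing
    grad = (X * mask).T @ (pred - y) / n  # (sub)gradient of squared loss
    w -= lr * grad

print(np.round(w - w_true, 2))            # near zero: recovered without imputing
```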

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of Nonparametric Iterative Machine Teaching (NIMT), where the teacher provides examples to the learner iteratively such that the learner can achieve fast convergence to a target model.
Abstract: In this paper, we consider the problem of Iterative Machine Teaching (IMT), where the teacher provides examples to the learner iteratively such that the learner can achieve fast convergence to a target model. However, existing IMT algorithms are solely based on parameterized families of target models. They mainly focus on convergence in the parameter space, resulting in difficulty when the target models are defined to be functions without dependency on parameters. To address such a limitation, we study a more general task -- Nonparametric Iterative Machine Teaching (NIMT), which aims to teach nonparametric target models to learners in an iterative fashion. Unlike parametric IMT that merely operates in the parameter space, we cast NIMT as a functional optimization problem in the function space. To solve it, we propose both random and greedy functional teaching algorithms. We obtain the iterative teaching dimension (ITD) of the random teaching algorithm under proper assumptions, which serves as a uniform upper bound of ITD in NIMT. Further, the greedy teaching algorithm has a significantly lower ITD, which reaches a tighter upper bound of ITD in NIMT. Finally, we verify the correctness of our theoretical findings with extensive experiments in nonparametric scenarios.
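A minimal sketch of functional teaching for a kernel learner: the learner takes functional-gradient steps f <- f - eta * (f(x) - y) * K(x, .), and the teacher greedily feeds the example where the learner currently deviates most from the target. The greedy selection rule below is an illustrative guess at such a criterion, not necessarily the paper's algorithm.

```python
# Sketch: greedy teaching of a nonparametric (kernel) learner.
import numpy as np

X = np.linspace(-2, 2, 40)
target = np.sin(X)                       # nonparametric target model

def rbf(a, b, bw=0.5):
    return np.exp(-(a - b) ** 2 / (2 * bw ** 2))

K_mat = rbf(X[:, None], X[None, :])      # Gram matrix over the pool
coef = np.zeros_like(X)                  # f(.) = sum_j coef_j * K(X_j, .)
eta = 0.5
for _ in range(300):
    resid = K_mat @ coef - target        # learner's current error at each point
    i = np.argmax(np.abs(resid))         # teacher's greedy example choice
    coef[i] -= eta * resid[i]            # functional gradient step at X_i

print(round(np.abs(K_mat @ coef - target).max(), 3))  # shrinks toward 0
```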

Journal ArticleDOI
TL;DR: DifFormer as discussed by the authors proposes a novel multi-resolutional differencing mechanism, which is able to progressively and adaptively make nuanced yet meaningful changes prominent; meanwhile, periodic or cyclic patterns can be dynamically captured with flexible lagging and dynamic ranging operations.
Abstract: Time series analysis is essential to many far-reaching applications of data science and statistics, including economic and financial forecasting, surveillance, and automated business processing. Though the Transformer has been greatly successful in computer vision and natural language processing, its potential as a general backbone for analyzing ubiquitous time series data has not been fully realized. Prior Transformer variants for time series rely heavily on task-dependent designs and pre-assumed "pattern biases", revealing their insufficiency in representing nuanced seasonal, cyclic, and outlier patterns which are highly prevalent in time series. As a consequence, they cannot generalize well to different time series analysis tasks. To tackle these challenges, we propose DifFormer, an effective and efficient Transformer architecture that can serve as a workhorse for a variety of time series analysis tasks. DifFormer incorporates a novel multi-resolutional differencing mechanism, which is able to progressively and adaptively make nuanced yet meaningful changes prominent, while periodic or cyclic patterns can be dynamically captured with flexible lagging and dynamic ranging operations. Extensive experiments demonstrate that DifFormer significantly outperforms state-of-the-art models on three essential time series analysis tasks: classification, regression, and forecasting. In addition to its superior performance, DifFormer also excels in efficiency, with linear time/memory complexity and empirically lower time consumption.
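To give a feel for differencing at multiple resolutions: subtracting lagged copies of a series at several lags highlights nuanced changes while exposing periodic structure. DifFormer learns and adapts these operations; the fixed lags and the helper name below are assumptions for illustration.

```python
# Sketch: fixed-lag, multi-resolution differencing of a time series.
import torch
import torch.nn.functional as F

def multi_res_diff(x, lags=(1, 2, 4)):
    """x: (batch, time, channels) -> lagged differences, stacked on channels."""
    outs = []
    for lag in lags:
        d = x[:, lag:, :] - x[:, :-lag, :]        # x_t - x_{t-lag}
        d = F.pad(d, (0, 0, lag, 0))              # zero-pad to keep time length
        outs.append(d)
    return torch.cat(outs, dim=-1)                # one channel per resolution

t = torch.arange(0, 16, dtype=torch.float32)
x = torch.sin(t * torch.pi / 4).reshape(1, 16, 1)  # period-8 toy series
feats = multi_res_diff(x)
print(feats.shape)  # torch.Size([1, 16, 3])
```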

Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a structure-informed shadow removal network (StructNet) that leverages image structure information to address the shadow remnant problem; it first reconstructs the structure information of the input image without shadows, and then uses the restored shadow-free structure prior to guide image-level shadow removal.
Abstract: —Shadow removal is a fundamental task in computer vision. Despite the success, existing deep learning-based shadow removal methods still produce images with shadow remnants. These shadow remnants typically exist in homogeneous regions with low intensity values, making them untraceable in the existing image-to-image mapping paradigm. We observe from our experiments that shadows mainly degrade object colors at the image structure level (in which humans perceive object outlines filled with continuous colors). Hence, in this paper, we propose to remove shadows at the image structure level. Based on this idea, we propose a novel structure-informed shadow removal network ( StructNet ) to leverage the image structure information to address the shadow remnant problem. Specifically, StructNet first reconstructs the structure information of the input image without shadows, and then uses the restored shadow-free structure prior to guiding the image-level shadow removal. StructNet contains two main novel modules: (1) a mask-guided shadow-free extraction (MSFE) module to extract image structural features in a non-shadow to shadow directional manner, and (2) a multi-scale feature & residual aggregation (MFRA) module to leverage the shadow-free structure information to regularize feature consistency. In addition, we also propose to extend StructNet to exploit multi-level structure information ( MStructNet ), to further boost the shadow removal performance with minimum computational overheads. Extensive experiments on three shadow removal benchmarks demonstrate that our method outperforms existing shadow removal methods, and our StructNet can be integrated with existing methods to further boost their performances.

Journal ArticleDOI
TL;DR: In this paper, a model provider can access the operational performance of the candidate model multiple times via feedback from a local user (or a group of users); the feedback could be as simple as scalars, such as inference accuracy or usage rate.
Abstract: Many machine learning applications encounter situations where model providers are required to further refine the previously trained model so as to gratify the specific need of local users. This problem reduces to the standard model tuning paradigm if the target data is permissibly fed to the model. However, it is rather difficult in a wide range of practical cases where target data is not shared with model providers but commonly some evaluations of the model are accessible. In this paper, we formally set up a challenge named Earning eXtra PerformancE from restriCTive feEDbacks (EXPECTED) to describe this form of model tuning problems. Concretely, EXPECTED admits a model provider to access the operational performance of the candidate model multiple times via feedback from a local user (or a group of users). The goal of the model provider is to eventually deliver a satisfactory model to the local user(s) by utilizing the feedback. Unlike existing model tuning methods where the target data is always ready for calculating model gradients, the model providers in EXPECTED only see some feedback, which could be as simple as scalars, such as inference accuracy or usage rate. To enable tuning in this restrictive circumstance, we propose to characterize the geometry of the model performance with regard to model parameters through exploring the parameters' distribution. In particular, for deep models whose parameters distribute across multiple layers, a more query-efficient algorithm is further tailor-designed that conducts layerwise tuning with more attention to those layers which pay off better. Our theoretical analyses justify the proposed algorithms from the aspects of both efficacy and efficiency. Extensive experiments on different applications demonstrate that our work forges a sound solution to the EXPECTED problem, which establishes the foundation for future studies towards this direction.
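A minimal sketch of tuning from scalar feedback only: estimate an ascent direction on the model parameters from a handful of performance queries, using a basic evolution-strategies / zeroth-order gradient estimator (a standard technique offered for intuition; the paper's distribution-based characterization and layerwise query allocation are not reproduced). The quadratic feedback function is a hypothetical stand-in for a user's evaluation.

```python
# Sketch: gradient-free parameter tuning from scalar performance feedback.
import numpy as np

rng = np.random.default_rng(0)
theta_star = rng.standard_normal(10)          # hypothetical user optimum

def feedback(theta):
    """Scalar score returned by the local user (higher is better)."""
    return -np.linalg.norm(theta - theta_star) ** 2

theta = np.zeros(10)
sigma, lr, n_queries = 0.1, 0.05, 20
for _ in range(200):
    eps = rng.standard_normal((n_queries, 10))            # random probes
    scores = np.array([feedback(theta + sigma * e) for e in eps])
    grad_est = (scores[:, None] * eps).mean(axis=0) / sigma  # ES estimator
    theta += lr * grad_est                    # ascend the estimated gradient

print(round(feedback(theta), 4))              # approaches 0 (the maximum)
```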

Journal ArticleDOI
TL;DR: Unseen Transition Suss GAN (UTSGAN) as mentioned in this paper constructs a manifold for the transition with a stochastic transition encoder and coherently regularizes and generalizes result consistency and transition consistency on both training and unobserved translations.
Abstract: In the field of Image-to-Image (I2I) translation, ensuring consistency between input images and their translated results is a key requirement for producing high-quality and desirable outputs. Previous I2I methods have relied on result consistency, which enforces consistency between the translated results and the ground truth output, to achieve this goal. However, result consistency is limited in its ability to handle complex and unseen attribute changes in translation tasks. To address this issue, we introduce a transition-aware approach to I2I translation, where the data translation mapping is explicitly parameterized with a transition variable, allowing for the modelling of unobserved translations triggered by unseen transitions. Furthermore, we propose the use of transition consistency, defined on the transition variable, to enable regularization of consistency on unobserved translations, which is omitted in previous works. Based on these insights, we present Unseen Transition Suss GAN (UTSGAN), a generative framework that constructs a manifold for the transition with a stochastic transition encoder and coherently regularizes and generalizes result consistency and transition consistency on both training and unobserved translations with tailor-designed constraints. Extensive experiments on four different I2I tasks performed on five different datasets demonstrate the efficacy of our proposed UTSGAN in performing consistent translations.

Journal ArticleDOI
TL;DR: In this paper, a latent class-conditional noise model (LCCN) is proposed to parameterize the noise transition under a Bayesian framework, where the learning is constrained on a simplex characterized by the complete dataset, instead of some ad hoc parametric space wrapped by the neural layer.
Abstract: Learning with noisy labels has become imperative in the Big Data era, as it saves expensive human labor on accurate annotations. Previous noise-transition-based methods have achieved theoretically-grounded performance under the Class-Conditional Noise model (CCN). However, these approaches build upon an ideal but impractical anchor set available to pre-estimate the noise transition. Even though subsequent works adapt the estimation as a neural layer, the ill-posed stochastic learning of its parameters in back-propagation easily falls into undesired local minimums. We solve this problem by introducing a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework. By projecting the noise transition into the Dirichlet space, the learning is constrained on a simplex characterized by the complete dataset, instead of some ad hoc parametric space wrapped by the neural layer. We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels to train the classifier and to model the noise. Our approach safeguards the stable update of the noise transition, which avoids the arbitrary tuning from a mini-batch of samples in previous works. We further generalize LCCN to different counterparts compatible with open-set noisy labels, semi-supervised learning, as well as cross-model training. A range of experiments demonstrate the advantages of LCCN and its variants over the current state-of-the-art methods. The code is available here.
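To illustrate the Dirichlet-space idea: each row of the noise transition (true class -> noisy class) stays on the simplex by construction when sampled from a Dirichlet posterior over (true, noisy) label counts, rather than being learned as unconstrained neural-layer weights. This sketch shows only that conjugate update; the full Gibbs sampler over latent true labels is not reproduced, and the counts are made-up toy data.

```python
# Sketch: simplex-constrained noise-transition rows via Dirichlet posteriors.
import numpy as np

rng = np.random.default_rng(0)
K = 3
counts = np.array([[80, 15, 5],     # counts[i, j]: true class i seen as noisy j
                   [10, 85, 5],
                   [ 4,  6, 90]])
alpha = np.ones(K)                   # symmetric Dirichlet prior

# Dirichlet is conjugate to the categorical: posterior = Dir(alpha + counts).
T = np.stack([rng.dirichlet(alpha + counts[i]) for i in range(K)])
print(np.round(T, 3))                # each row is a valid distribution
print(T.sum(axis=1))                 # rows sum to 1: always on the simplex
```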

Journal ArticleDOI
TL;DR: Wang et al. as discussed by the authors proposed a new multi-task learning framework based on co-evolving reasoning, which allows the three tasks to evolve together and prompt each other recurrently, and integrates prediction-level interactions to capture explicit dependencies.
Abstract: Emotion-Cause Pair Extraction (ECPE) aims to extract all emotion clauses and their corresponding cause clauses from a document. Existing approaches tackle this task through a multi-task learning (MTL) framework in which the two subtasks provide indicative clues for ECPE. However, the previous MTL framework considers only one round of multi-task reasoning and ignores the reverse feedback from ECPE to the subtasks. Besides, its multi-task reasoning relies only on semantics-level interactions, which cannot capture explicit dependencies, and neither encoder sharing nor multi-task hidden-state concatenation can adequately capture the causalities. To solve these issues, we first put forward a new MTL framework based on Co-evolving Reasoning. It (1) models the bidirectional feedback between ECPE and its subtasks; (2) allows the three tasks to evolve together and prompt each other recurrently; (3) integrates prediction-level interactions to capture explicit dependencies. Then we propose a novel multi-task relational graph (MRG) to sufficiently exploit the causal relations. Finally, we propose a Co-evolving Graph Reasoning Network (CGR-Net) that implements our MTL framework and conducts Co-evolving Reasoning on the MRG. Experimental results show that our model achieves new state-of-the-art performance, and further analysis confirms the advantages of our method.

Journal ArticleDOI
TL;DR: In this paper, a dynamic label learning (DLL) algorithm for noisy label learning is proposed; with it, any surrogate loss function can be used for classification with noisy labels, with a consistency guarantee that label noise does not ultimately hinder the search for the optimal classifier of the noise-free sample.
Abstract: Deep models have achieved state-of-the-art performance on a broad range of visual recognition tasks. Nevertheless, the generalization ability of deep models is seriously affected by noisy labels. Though deep learning packages offer many different losses, it is not transparent to users how to choose losses that remain consistent in the presence of label noise. This paper addresses the problem of how to use the abundant loss functions designed for the traditional classification problem in the presence of label noise. We present a dynamic label learning (DLL) algorithm for noisy label learning and then prove that any surrogate loss function can be used for classification with noisy labels by using our proposed algorithm, with a consistency guarantee that the label noise does not ultimately hinder the search for the optimal classifier of the noise-free sample. In addition, we provide an in-depth theoretical analysis of our algorithm to verify the correctness of our claims and explain its powerful robustness. Finally, experimental results on synthetic and real datasets confirm the efficiency of our algorithm and the correctness of our analysis, and show that our proposed algorithm significantly outperforms or is comparable to current state-of-the-art counterparts.
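For intuition on making a surrogate loss noise-tolerant, here is the standard forward loss correction under class-conditional noise (a well-known baseline technique, shown only for orientation; the paper's DLL algorithm is different and is not reproduced). It assumes a known noise transition matrix T.

```python
# Sketch: forward loss correction with a known noise transition matrix.
import numpy as np

T = np.array([[0.8, 0.2],        # T[i, j] = P(noisy label j | true label i)
              [0.3, 0.7]])

def forward_corrected_nll(probs, noisy_label):
    """Cross-entropy computed on noise-adjusted predictions T^T p."""
    noisy_probs = T.T @ probs    # predicted distribution over *noisy* labels
    return -np.log(noisy_probs[noisy_label])

p = np.array([0.9, 0.1])         # model's clean-class posterior for a sample
print(round(forward_corrected_nll(p, 0), 4))   # loss w.r.t. noisy label 0
print(round(forward_corrected_nll(p, 1), 4))   # loss w.r.t. noisy label 1
```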

Journal ArticleDOI
TL;DR: In this article, a Structured Component-based Neural Network (SCNN) is proposed for multivariate time series (MTS) forecasting, which decouples MTS data into structured and heterogeneous components and then respectively extrapolates the evolution of these components, the dynamics of which are more traceable and predictable than the original MTS.
Abstract: Multivariate time series (MTS) forecasting is a paramount and fundamental problem in many real-world applications. The core issue in MTS forecasting is how to effectively model complex spatial-temporal patterns. In this paper, we develop a modular and interpretable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns. We name this framework SCNN, short for Structured Component-based Neural Network. SCNN works with a pre-defined generative process of MTS, which arithmetically characterizes the latent structure of the spatial-temporal patterns. In line with its reverse process, SCNN decouples MTS data into structured and heterogeneous components and then respectively extrapolates the evolution of these components, the dynamics of which are more traceable and predictable than the original MTS. Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets. Additionally, we examine SCNN with different configurations and perform in-depth analyses of the properties of SCNN.
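A minimal sketch of the decouple-then-extrapolate idea: split a series into structured components (here a moving-average trend, a fixed-period seasonal profile, and a residual) whose dynamics are easier to extrapolate than the raw series. This classical decomposition is only an analogy; SCNN learns its generative structure end to end.

```python
# Sketch: classical trend/seasonal/residual decomposition of a toy series.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)
series = 0.05 * t + np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(200)

def decompose(x, period=24, window=25):
    kernel = np.ones(window) / window
    trend = np.convolve(x, kernel, mode="same")          # long-term component
    detrended = x - trend
    seasonal_profile = np.array(
        [detrended[p::period].mean() for p in range(period)])
    seasonal = seasonal_profile[np.arange(len(x)) % period]
    return trend, seasonal, x - trend - seasonal         # residual

trend, seasonal, resid = decompose(series)
print(round(resid.std(), 3))  # residual is small once structure is removed
```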