scispace - formally typeset
Search or ask a question
Author

Hung Bui

Other affiliations: Artificial Intelligence Center, IBM, Curtin University  ...read more
Bio: Hung Bui is an academic researcher from Google. The author has contributed to research in topics: Hidden Markov model & Inference. The author has an hindex of 30, co-authored 137 publications receiving 4630 citations. Previous affiliations of Hung Bui include Artificial Intelligence Center & IBM.


Papers
More filters
Proceedings ArticleDOI
20 Jun 2005
TL;DR: The switching hidden semi-markov model (S-HSMM) is introduced, a two-layered extension of thehidden semi-Markov model for the modeling task and an effective scheme to detect abnormality without the need for training on abnormal data is proposed.
Abstract: This paper addresses the problem of learning and recognizing human activities of daily living (ADL), which is an important research issue in building a pervasive and smart environment. In dealing with ADL, we argue that it is beneficial to exploit both the inherent hierarchical organization of the activities and their typical duration. To this end, we introduce the switching hidden semi-markov model (S-HSMM), a two-layered extension of the hidden semi-Markov model (HSMM) for the modeling task. Activities are modeled in the S-HSMM in two ways: the bottom layer represents atomic activities and their duration using HSMMs; the top layer represents a sequence of high-level activities where each high-level activity is made of a sequence of atomic activities. We consider two methods for modeling duration: the classic explicit duration model using multinomial distribution, and the novel use of the discrete Coxian distribution. In addition, we propose an effective scheme to detect abnormality without the need for training on abnormal data. Experimental results show that the S-HSMM performs better than existing models including the flat HSMM and the hierarchical hidden Markov model in both classification and abnormality detection tasks, alleviating the need for presegmented training data. Furthermore, our discrete Coxian duration model yields better computation time and generalization error than the classic explicit duration model.

580 citations

Proceedings Article
15 Feb 2018
TL;DR: In this paper, the Virtual Adversarial Domain Adaptation (VADA) and Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T) models are proposed.
Abstract: Domain adaptation refers to the problem of leveraging labeled data in a source domain to learn an accurate model in a target domain where labels are scarce or unavailable. A recent approach for finding a common representation of the two domains is via domain adversarial training (Ganin & Lempitsky, 2015), which attempts to induce a feature extractor that matches the source and target feature distributions in some feature space. However, domain adversarial training faces two critical limitations: 1) if the feature extraction function has high-capacity, then feature distribution matching is a weak constraint, 2) in non-conservative domain adaptation (where no single classifier can perform well in both the source and target domains), training the model to do well on the source domain hurts performance on the target domain. In this paper, we address these issues through the lens of the cluster assumption, i.e., decision boundaries should not cross high-density data regions. We propose two novel and related models: 1) the Virtual Adversarial Domain Adaptation (VADA) model, which combines domain adversarial training with a penalty term that punishes the violation the cluster assumption; 2) the Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T) model, which takes the VADA model as initialization and employs natural gradient steps to further minimize the cluster assumption violation. Extensive empirical results demonstrate that the combination of these two models significantly improve the state-of-the-art performance on the digit, traffic sign, and Wi-Fi recognition domain adaptation benchmarks.

367 citations

Proceedings ArticleDOI
20 Jun 2005
TL;DR: The experimental results in a real-world environment have confirmed the belief that directly modeling shared structures not only reduces computational cost, but also improves recognition accuracy when compared with the tree HHMM and the flat HMM.
Abstract: Directly modeling the inherent hierarchy and shared structures of human behaviors, we present an application of the hierarchical hidden Markov model (HHMM) for the problem of activity recognition. We argue that to robustly model and recognize complex human activities, it is crucial to exploit both the natural hierarchical decomposition and shared semantics embedded in the movement trajectories. To this end, we propose the use of the HHMM, a rich stochastic model that has been recently extended to handle shared structures, for representing and recognizing a set of complex indoor activities. Furthermore, in the need of real-time recognition, we propose a Rao-Blackwellised particle filter (RBPF) that efficiently computes the filtering distribution at a constant time complexity for each new observation arrival. The main contributions of this paper lie in the application of the shared-structure HHMM, the estimation of the model's parameters at all levels simultaneously, and a construction of an RBPF approximate inference scheme. The experimental results in a real-world environment have confirmed our belief that directly modeling shared structures not only reduces computational cost, but also improves recognition accuracy when compared with the tree HHMM and the flat HMM.

357 citations

Journal ArticleDOI
TL;DR: This paper introduces the Abstract Hidden Markov Model (AHMM), a novel type of stochastic processes, provide its dynamic Bayesian network (DBN) structure and analyse the properties of this network, and proposes a novel plan recognition framework based on the AHMM as the plan execution model.
Abstract: In this paper, we present a method for recognising an agent's behaviour in dynamic, noisy, uncertain domains, and across multiple levels of abstraction. We term this problem on-line plan recognition under uncertainty and view it generally as probabilistic inference on the stochastic process representing the execution of the agent's plan. Our contributions in this paper are twofold. In terms of probabilistic inference, we introduce the Abstract Hidden Markov Model (AHMM), a novel type of stochastic processes, provide its dynamic Bayesian network (DBN) structure and analyse the properties of this network. We then describe an application of the Rao-Blackwellised Particle Filter to the AHMM which allows us to construct an efficient, hybrid inference method for this model. In terms of plan recognition, we propose a novel plan recognition framework based on the AHMM as the plan execution model. The Rao-Blackwellised hybrid inference for AHMM can take advantage of the independence properties inherent in a model of plan execution, leading to an algorithm for online probabilistic plan recognition that scales well with the number of levels in the plan hierarchy. This illustrates that while stochastic models for plan execution can be complex, they exhibit special structures which, if exploited, can lead to efficient plan recognition algorithms. We demonstrate the usefulness of the AHMM framework via a behaviour recognition system in a complex spatial environment using distributed video surveillance data.

296 citations

Patent
01 Jun 2009
TL;DR: In this paper, an apparatus for assisting a user in the execution of a task, where the task includes one or more workflows required to accomplish a goal defined by the user, includes a task learner for creating new workflows from user demonstrations, a workflow tracker for identifying and tracking the progress of a current workflow, a task assistance processor coupled with the workflow tracker, and a task executor coupled to the task assist processor, for manipulating an application on the machine used by user to carry out the suggestion.
Abstract: The present invention relates to a method and apparatus for assisting with automated task management In one embodiment, an apparatus for assisting a user in the execution of a task, where the task includes one or more workflows required to accomplish a goal defined by the user, includes a task learner for creating new workflows from user demonstrations, a workflow tracker for identifying and tracking the progress of a current workflow executing on a machine used by the user, a task assistance processor coupled to the workflow tracker, for generating a suggestion based on the progress of the current workflow, and a task executor coupled to the task assistance processor, for manipulating an application on the machine used by the user to carry out the suggestion

238 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Book
24 Aug 2012
TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

8,059 citations

Book
01 Jan 2001
TL;DR: This chapter discusses Decision-Theoretic Foundations, Game Theory, Rationality, and Intelligence, and the Decision-Analytic Approach to Games, which aims to clarify the role of rationality in decision-making.
Abstract: Preface 1. Decision-Theoretic Foundations 1.1 Game Theory, Rationality, and Intelligence 1.2 Basic Concepts of Decision Theory 1.3 Axioms 1.4 The Expected-Utility Maximization Theorem 1.5 Equivalent Representations 1.6 Bayesian Conditional-Probability Systems 1.7 Limitations of the Bayesian Model 1.8 Domination 1.9 Proofs of the Domination Theorems Exercises 2. Basic Models 2.1 Games in Extensive Form 2.2 Strategic Form and the Normal Representation 2.3 Equivalence of Strategic-Form Games 2.4 Reduced Normal Representations 2.5 Elimination of Dominated Strategies 2.6 Multiagent Representations 2.7 Common Knowledge 2.8 Bayesian Games 2.9 Modeling Games with Incomplete Information Exercises 3. Equilibria of Strategic-Form Games 3.1 Domination and Ratonalizability 3.2 Nash Equilibrium 3.3 Computing Nash Equilibria 3.4 Significance of Nash Equilibria 3.5 The Focal-Point Effect 3.6 The Decision-Analytic Approach to Games 3.7 Evolution. Resistance. and Risk Dominance 3.8 Two-Person Zero-Sum Games 3.9 Bayesian Equilibria 3.10 Purification of Randomized Strategies in Equilibria 3.11 Auctions 3.12 Proof of Existence of Equilibrium 3.13 Infinite Strategy Sets Exercises 4. Sequential Equilibria of Extensive-Form Games 4.1 Mixed Strategies and Behavioral Strategies 4.2 Equilibria in Behavioral Strategies 4.3 Sequential Rationality at Information States with Positive Probability 4.4 Consistent Beliefs and Sequential Rationality at All Information States 4.5 Computing Sequential Equilibria 4.6 Subgame-Perfect Equilibria 4.7 Games with Perfect Information 4.8 Adding Chance Events with Small Probability 4.9 Forward Induction 4.10 Voting and Binary Agendas 4.11 Technical Proofs Exercises 5. Refinements of Equilibrium in Strategic Form 5.1 Introduction 5.2 Perfect Equilibria 5.3 Existence of Perfect and Sequential Equilibria 5.4 Proper Equilibria 5.5 Persistent Equilibria 5.6 Stable Sets 01 Equilibria 5.7 Generic Properties 5.8 Conclusions Exercises 6. Games with Communication 6.1 Contracts and Correlated Strategies 6.2 Correlated Equilibria 6.3 Bayesian Games with Communication 6.4 Bayesian Collective-Choice Problems and Bayesian Bargaining Problems 6.5 Trading Problems with Linear Utility 6.6 General Participation Constraints for Bayesian Games with Contracts 6.7 Sender-Receiver Games 6.8 Acceptable and Predominant Correlated Equilibria 6.9 Communication in Extensive-Form and Multistage Games Exercises Bibliographic Note 7. Repeated Games 7.1 The Repeated Prisoners Dilemma 7.2 A General Model of Repeated Garnet 7.3 Stationary Equilibria of Repeated Games with Complete State Information and Discounting 7.4 Repeated Games with Standard Information: Examples 7.5 General Feasibility Theorems for Standard Repeated Games 7.6 Finitely Repeated Games and the Role of Initial Doubt 7.7 Imperfect Observability of Moves 7.8 Repeated Wines in Large Decentralized Groups 7.9 Repeated Games with Incomplete Information 7.10 Continuous Time 7.11 Evolutionary Simulation of Repeated Games Exercises 8. Bargaining and Cooperation in Two-Person Games 8.1 Noncooperative Foundations of Cooperative Game Theory 8.2 Two-Person Bargaining Problems and the Nash Bargaining Solution 8.3 Interpersonal Comparisons of Weighted Utility 8.4 Transferable Utility 8.5 Rational Threats 8.6 Other Bargaining Solutions 8.7 An Alternating-Offer Bargaining Game 8.8 An Alternating-Offer Game with Incomplete Information 8.9 A Discrete Alternating-Offer Game 8.10 Renegotiation Exercises 9. Coalitions in Cooperative Games 9.1 Introduction to Coalitional Analysis 9.2 Characteristic Functions with Transferable Utility 9.3 The Core 9.4 The Shapkey Value 9.5 Values with Cooperation Structures 9.6 Other Solution Concepts 9.7 Colational Games with Nontransferable Utility 9.8 Cores without Transferable Utility 9.9 Values without Transferable Utility Exercises Bibliographic Note 10. Cooperation under Uncertainty 10.1 Introduction 10.2 Concepts of Efficiency 10.3 An Example 10.4 Ex Post Inefficiency and Subsequent Oilers 10.5 Computing Incentive-Efficient Mechanisms 10.6 Inscrutability and Durability 10.7 Mechanism Selection by an Informed Principal 10.8 Neutral Bargaining Solutions 10.9 Dynamic Matching Processes with Incomplete Information Exercises Bibliography Index

3,569 citations

Book ChapterDOI
21 Apr 2004
TL;DR: This is the first work to investigate performance of recognition algorithms with multiple, wire-free accelerometers on 20 activities using datasets annotated by the subjects themselves, and suggests that multiple accelerometers aid in recognition.
Abstract: In this work, algorithms are developed and evaluated to de- tect physical activities from data acquired using five small biaxial ac- celerometers worn simultaneously on different parts of the body. Ac- celeration data was collected from 20 subjects without researcher su- pervision or observation. Subjects were asked to perform a sequence of everyday tasks but not told specifically where or how to do them. Mean, energy, frequency-domain entropy, and correlation of acceleration data was calculated and several classifiers using these features were tested. De- cision tree classifiers showed the best performance recognizing everyday activities with an overall accuracy rate of 84%. The results show that although some activities are recognized well with subject-independent training data, others appear to require subject-specific training data. The results suggest that multiple accelerometers aid in recognition because conjunctions in acceleration feature values can effectively discriminate many activities. With just two biaxial accelerometers - thigh and wrist - the recognition performance dropped only slightly. This is the first work to investigate performance of recognition algorithms with multiple, wire-free accelerometers on 20 activities using datasets annotated by the subjects themselves.

3,223 citations