Author

Masayoshi Tomizuka

Bio: Masayoshi Tomizuka is an academic researcher from the University of California, Berkeley. The author has contributed to research in topics: Control theory & Control system. The author has an h-index of 80 and has co-authored 1111 publications receiving 30069 citations. Previous affiliations of Masayoshi Tomizuka include University of California & Western Digital.


Papers
Journal ArticleDOI
TL;DR: In this article, a systematic way to combine adaptive control and sliding mode control (SMC) for trajectory tracking of robot manipulators in the presence of parametric uncertainties and uncertain nonlinearities is developed.
Abstract: A systematic way to combine adaptive control and sliding mode control (SMC) for trajectory tracking of robot manipulators in the presence of parametric uncertainties and uncertain nonlinearities is developed. Continuous sliding mode controllers without reaching transients and chattering problems are first developed by using a dynamic sliding mode. Transient performance is guaranteed and globally uniformly ultimately bounded (GUUB) stability is obtained. An adaptive scheme is also developed for comparison. With some modifications to the adaptation law, the control law is redesigned by combining the design methodologies of adaptive control and sliding mode control. The suggested controller preserves the advantages of both methods, namely, asymptotic stability of the adaptive system for parametric uncertainties and GUUB stability with guaranteed transient performance of sliding mode control for both parametric uncertainties and uncertain nonlinearities. The control law is continuous and the chattering problem of sliding mode control is avoided. A priori knowledge of bounds on parametric uncertainties and uncertain nonlinearities is assumed. Experimental results conducted on the UCB/NSK SCARA direct drive robot show that the combined method reduces the final tracking error to more than half of the smoothed SMC laws for a payload uncertainty of 6 kg, and validate the advantage of introducing parameter adaptation in the smoothed SMC laws.

263 citations

Journal ArticleDOI
TL;DR: In this paper, the adaptive robust control (ARC) is applied to make the resulting closed-loop system robust to model uncertainties, instead of the disturbance observer (DOB) design previously tested by many researchers.
Abstract: This paper studies the high-performance robust motion control of machine tools. The newly proposed adaptive robust control (ARC) is applied to make the resulting closed-loop system robust to model uncertainties, instead of the disturbance observer (DOB) design previously tested by many researchers. Compared to DOB, the proposed ARC has a better tracking performance and transient in the presence of discontinuous disturbances, such as Coulomb friction, and it is of a lower order. As a result, time-consuming and costly rigorous friction identification and compensation is alleviated, and overall tracking performance is improved. The ARC design can also handle large parameter variations and is flexible in introducing extra nonlinear robust control terms and parameter adaptations to further improve the transient response and tracking performance. An anti-integration windup mechanism is inherently built in the ARC and, thus, the problem of control saturation is alleviated. Extensive comparative experimental tests are performed, and the results show the improved performance of the proposed ARC.

262 citations

Posted Content
TL;DR: This work represents images as a set of semantic visual tokens and applies transformers to densely model the relationships between them; the authors report that this token-based paradigm significantly outperforms its convolutional counterparts on image classification and semantic segmentation.
Abstract: Computer vision has achieved remarkable success by (a) representing images as uniformly-arranged pixel arrays and (b) convolving highly-localized features. However, convolutions treat all image pixels equally regardless of importance; explicitly model all concepts across all images, regardless of content; and struggle to relate spatially-distant concepts. In this work, we challenge this paradigm by (a) representing images as semantic visual tokens and (b) running transformers to densely model token relationships. Critically, our Visual Transformer operates in a semantic token space, judiciously attending to different image parts based on context. This is in sharp contrast to pixel-space transformers that require orders-of-magnitude more compute. Using an advanced training recipe, our VTs significantly outperform their convolutional counterparts, raising ResNet accuracy on ImageNet top-1 by 4.6 to 7 points while using fewer FLOPs and parameters. For semantic segmentation on LIP and COCO-stuff, VT-based feature pyramid networks (FPN) achieve 0.35 points higher mIoU while reducing the FPN module's FLOPs by 6.5x.

260 citations

Proceedings ArticleDOI
01 Jun 2021
TL;DR: This paper presents Sparse R-CNN, a purely sparse method for object detection in images that completely avoids all efforts related to object candidates design and many-to-one label assignment.
Abstract: We present Sparse R-CNN, a purely sparse method for object detection in images. Existing works on object detection heavily rely on dense object candidates, such as k anchor boxes pre-defined on all grids of image feature map of size H × W. In our method, however, a fixed sparse set of learned object proposals, total length of N, are provided to object recognition head to perform classification and location. By eliminating HWk (up to hundreds of thousands) hand-designed object candidates to N (e.g. 100) learnable proposals, Sparse R-CNN completely avoids all efforts related to object candidates design and many-to-one label assignment. More importantly, final predictions are directly output without non-maximum suppression post-procedure. Sparse R-CNN demonstrates accuracy, run-time and training convergence performance on par with the well-established detector baselines on the challenging COCO dataset, e.g., achieving 45.0 AP in standard 3× training schedule and running at 22 fps using ResNet-50 FPN model. We hope our work could inspire re-thinking the convention of dense prior in object detectors. The code is available at: https://github.com/PeizeSun/SparseR-CNN.

256 citations
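
The candidate-count comparison in the Sparse R-CNN abstract above (HWk dense anchors versus N learned proposals) can be made concrete with a little arithmetic. The sketch below is only an illustration: the feature-map size (100 × 152) and k = 9 anchors per cell are assumed values chosen for the example, not figures from the paper; only N = 100 comes from the abstract.

```python
def dense_candidate_count(height: int, width: int, anchors_per_cell: int) -> int:
    """Number of hand-designed candidates when k anchors sit on every grid cell."""
    return height * width * anchors_per_cell


# Assumed (hypothetical) feature-map size and anchor count, for illustration only.
H, W, K = 100, 152, 9
# N = 100 learnable proposals, the example value given in the abstract.
N = 100

dense = dense_candidate_count(H, W, K)
print(f"dense candidates (H*W*k): {dense}")
print(f"sparse proposals (N):     {N}")
print(f"reduction factor:         {dense / N:.0f}x")
```

With these assumed numbers the dense scheme enumerates 136,800 candidates, which is in line with the abstract's "up to hundreds of thousands".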

Posted Content
TL;DR: An INTERnational, Adversarial and Cooperative moTION dataset (INTERACTION dataset) in interactive driving scenarios with semantic maps for highly complex behavior such as negotiations, aggressive/irrational decisions and traffic rule violations is presented.
Abstract: Behavior-related research areas such as motion prediction/planning, representation/imitation learning, behavior modeling/generation, and algorithm testing, require support from high-quality motion datasets containing interactive driving scenarios with different driving cultures. In this paper, we present an INTERnational, Adversarial and Cooperative moTION dataset (INTERACTION dataset) in interactive driving scenarios with semantic maps. Five features of the dataset are highlighted. 1) The interactive driving scenarios are diverse, including urban/highway/ramp merging and lane changes, roundabouts with yield/stop signs, signalized intersections, intersections with one/two/all-way stops, etc. 2) Motion data from different countries and different continents are collected so that driving preferences and styles in different cultures are naturally included. 3) The driving behavior is highly interactive and complex with adversarial and cooperative motions of various traffic participants. Highly complex behavior such as negotiations, aggressive/irrational decisions and traffic rule violations are densely contained in the dataset, while regular behavior can also be found from cautious car-following, stop, left/right/U-turn to rational lane-change and cycling and pedestrian crossing, etc. 4) The levels of criticality span wide, from regular safe operations to dangerous, near-collision maneuvers. Real collision, although relatively slight, is also included. 5) Maps with complete semantic information are provided with physical layers, reference lines, lanelet connections and traffic rules. The data is recorded from drones and traffic cameras. Statistics of the dataset in terms of number of entities and interaction density are also provided, along with some utilization examples in a variety of behavior-related research areas. The dataset can be downloaded via this https URL.

253 citations


Cited by
Journal ArticleDOI


08 Dec 2001-BMJ
TL;DR: A reflection on i, the square root of minus one, which struck the author at school as an odd beast, an intruder hovering on the edge of reality, and whose surreal nature only intensified over the years.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Posted Content
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
Abstract: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.

12,690 citations

Book
31 Jul 1997
TL;DR: This book explores the meta-heuristics approach called tabu search, which is dramatically changing the authors' ability to solve a host of problems that stretch over the realms of resource planning, telecommunications, VLSI design, financial analysis, scheduling, space planning, energy distribution, molecular engineering, logistics, pattern classification, flexible manufacturing, waste management, mineral exploration, biomedical analysis, environmental conservation and scores of other problems.
Abstract: From the Publisher: This book explores the meta-heuristics approach called tabu search, which is dramatically changing our ability to solve a host of problems that stretch over the realms of resource planning, telecommunications, VLSI design, financial analysis, scheduling, space planning, energy distribution, molecular engineering, logistics, pattern classification, flexible manufacturing, waste management, mineral exploration, biomedical analysis, environmental conservation and scores of other problems. The major ideas of tabu search are presented with examples that show their relevance to multiple applications. Numerous illustrations and diagrams are used to clarify principles that deserve emphasis, and that have not always been well understood or applied. The book's goal is to provide "hands-on" knowledge and insight alike, rather than to focus exclusively either on computational recipes or on abstract themes. This book is designed to be useful and accessible to researchers and practitioners in management science, industrial engineering, economics, and computer science. It can appropriately be used as a textbook in a masters course or in a doctoral seminar. Because of its emphasis on presenting ideas through illustrations and diagrams, and on identifying associated practical applications, it can also be used as a supplementary text in upper division undergraduate courses. Finally, there are many more applications of tabu search than can possibly be covered in a single book, and new ones are emerging every day. The book's goal is to provide a grounding in the essential ideas of tabu search that will allow readers to create successful applications of their own. Along with the essential ideas, understanding of advanced issues is provided, enabling researchers to go beyond today's developments and create the methods of tomorrow.

6,373 citations

Journal ArticleDOI
TL;DR: A Nyquist criterion is proved that uses the eigenvalues of the graph Laplacian matrix to determine the effect of the communication topology on formation stability, and a method for decentralized information exchange between vehicles is proposed.
Abstract: We consider the problem of cooperation among a collection of vehicles performing a shared task using intervehicle communication to coordinate their actions. Tools from algebraic graph theory prove useful in modeling the communication network and relating its topology to formation stability. We prove a Nyquist criterion that uses the eigenvalues of the graph Laplacian matrix to determine the effect of the communication topology on formation stability. We also propose a method for decentralized information exchange between vehicles. This approach realizes a dynamical system that supplies each vehicle with a common reference to be used for cooperative motion. We prove a separation principle that decomposes formation stability into two components: stability of the information flow for the given graph and stability of an individual vehicle for the given controller. The information flow can thus be rendered highly robust to changes in the graph, enabling tight formation control despite limitations in intervehicle communication capability.

4,377 citations
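
The formation-control abstract above relies on the eigenvalues of the graph Laplacian of the communication topology. As a rough sketch of what that quantity is, the snippet below builds the plain Laplacian L = D - A for a hypothetical four-vehicle ring and computes its eigenvalues with NumPy. This is an assumption-laden illustration: the ring topology is invented for the example, and the undirected, unnormalized Laplacian used here may differ from the definition used in the paper.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for an undirected graph given its adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))  # diagonal matrix of node degrees
    return degree - adjacency


# Hypothetical topology: four vehicles, each talking to its two ring neighbours.
ring = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
])

# eigvalsh is appropriate because this Laplacian is symmetric.
eigenvalues = np.sort(np.linalg.eigvalsh(graph_laplacian(ring)))
print(eigenvalues)
```

For this ring the eigenvalues are 0, 2, 2 and 4 (up to floating-point error); the zero eigenvalue appears for any graph Laplacian, and the remaining eigenvalues are the part that reflects the topology.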