Other affiliations: Google, Technion – Israel Institute of Technology, University of Illinois at Urbana–Champaign
Bio: Ilan Shimshoni is an academic researcher from the University of Haifa. The author has contributed to research in topics: Epipolar geometry & Real image. The author has an h-index of 27 and has co-authored 138 publications receiving 4467 citations. Previous affiliations of Ilan Shimshoni include Google & Technion – Israel Institute of Technology.
17 Jun 2006
TL;DR: A novel algorithm for tracking an object in a video sequence represented by multiple image fragments or patches, which is able to handle partial occlusions or pose change and overcomes several difficulties which cannot be handled by traditional histogram-based algorithms.
Abstract: We present a novel algorithm (which we call "FragTrack") for tracking an object in a video sequence. The template object is represented by multiple image fragments or patches. The patches are arbitrary and are not based on an object model (in contrast with the traditional use of model-based parts, e.g. limbs and torso in human tracking). Every patch votes on the possible positions and scales of the object in the current frame, by comparing its histogram with the corresponding image patch histogram. We then minimize a robust statistic in order to combine the vote maps of the multiple patches. A key tool enabling the application of our algorithm to tracking is the integral histogram data structure. Its use allows histograms of multiple rectangular regions in the image to be extracted very efficiently. Our algorithm overcomes several difficulties which cannot be handled by traditional histogram-based algorithms [8, 6]. First, by robustly combining multiple patch votes, we are able to handle partial occlusions or pose change. Second, the geometric relations between the template patches allow us to take into account the spatial distribution of the pixel intensities, information which is lost in traditional histogram-based algorithms. Third, as has been noted in prior work, tracking large targets has the same computational cost as tracking small targets. We present extensive experimental results on challenging sequences, which demonstrate the robust tracking achieved by our algorithm (even with the use of only gray-scale (non-color) information).
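The integral histogram idea mentioned in this abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes NumPy, a gray-scale image with values in [0, 256), and a fixed number of bins, and the function names `integral_histogram` and `region_histogram` are hypothetical. After one pass over the image, the histogram of any axis-aligned rectangle is obtained from four table lookups, which is why the cost does not depend on the size of the region.

```python
import numpy as np

def integral_histogram(image, n_bins=16, max_value=256):
    """Build a table H of shape (rows + 1, cols + 1, n_bins) where
    H[r, c] is the histogram of image[:r, :c]."""
    image = np.asarray(image)
    # Quantize each pixel into one of n_bins equally wide bins.
    bins = (image.astype(np.int64) * n_bins) // max_value
    # One-hot encode the bin index of every pixel: (rows, cols, n_bins).
    one_hot = np.eye(n_bins, dtype=np.int64)[bins]
    H = np.zeros((image.shape[0] + 1, image.shape[1] + 1, n_bins), dtype=np.int64)
    # Cumulative sums along both image axes give the integral table.
    H[1:, 1:] = one_hot.cumsum(axis=0).cumsum(axis=1)
    return H

def region_histogram(H, top, left, bottom, right):
    """Histogram of image[top:bottom, left:right] from four lookups."""
    return H[bottom, right] - H[top, right] - H[bottom, left] + H[top, left]
```

A tracker built on this table could evaluate many candidate patch positions and scales per frame, since each candidate histogram costs the same four lookups regardless of patch size.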
TL;DR: A novel algorithm for detecting certain types of unusual events, based on multiple local monitors that collect low-level statistics; because it does not rely on object tracks, it is robust and works well in crowded scenes where tracking-based algorithms are likely to fail.
Abstract: We present a novel algorithm for detection of certain types of unusual events. The algorithm is based on multiple local monitors which collect low-level statistics. Each local monitor produces an alert if its current measurement is unusual and these alerts are integrated to a final decision regarding the existence of an unusual event. Our algorithm satisfies a set of requirements that are critical for successful deployment of any large-scale surveillance system. In particular, it requires a minimal setup (taking only a few minutes) and is fully automatic afterwards. Since it is not based on objects' tracks, it is robust and works well in crowded scenes where tracking-based algorithms are likely to fail. The algorithm is effective as soon as sufficient low-level observations representing the routine activity have been collected, which usually happens after a few minutes. Our algorithm runs in real-time. It was tested on a variety of real-life crowded scenes. A ground-truth was extracted for these scenes, with respect to which detection and false-alarm rates are reported.
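The structure described in this abstract (local monitors that raise alerts, followed by an integration step) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: each monitor is assumed to receive an already quantized measurement, "unusual" is taken to mean a low empirical frequency, and alerts are integrated by simple counting. The names `LocalMonitor`, `unusual_event`, `min_probability`, and `min_alerts` are hypothetical.

```python
from collections import Counter

class LocalMonitor:
    """Collects low-level statistics at one location (assumed quantized)."""

    def __init__(self, min_probability=0.05):
        self.counts = Counter()
        self.total = 0
        self.min_probability = min_probability  # hypothetical threshold

    def observe(self, value):
        """Record a measurement seen during routine activity."""
        self.counts[value] += 1
        self.total += 1

    def is_unusual(self, value):
        """Alert if the value was rarely or never observed before."""
        if self.total == 0:
            return False  # nothing learned yet, so raise no alert
        return self.counts[value] / self.total < self.min_probability

def unusual_event(monitors, measurements, min_alerts=2):
    """Integrate local alerts: report an event if enough monitors agree."""
    alerts = sum(m.is_unusual(v) for m, v in zip(monitors, measurements))
    return alerts >= min_alerts
```

Requiring several simultaneous alerts is one simple way to keep a single noisy monitor from triggering a false alarm; the actual integration rule used by the authors is not stated in the abstract.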
TL;DR: An algorithm for recovering the orientation (attitude) of a satellite-based camera using a geometric voting scheme and a fast tracking algorithm that estimates the attitude for subsequent images after the first algorithm has terminated successfully.
Abstract: We present an algorithm for recovering the orientation (attitude) of a satellite-based camera. The algorithm matches stars in an image taken with the camera to stars in a star catalogue. The algorithm is based on a geometric voting scheme in which a pair of stars in the catalogue votes for a pair of stars in the image if the angular distances between the stars of the two pairs are similar. As angular distance is a symmetric relationship, each of the two catalogue stars votes for each of the image stars. The identity of each star in the image is set to the identity of the catalogue star that cast the most votes. Once the identity of the stars is determined, the attitude of the camera is computed using a quaternion-based method. We further present a fast tracking algorithm that estimates the attitude for subsequent images after the first algorithm has terminated successfully. Our method runs at a speed comparable to state-of-the-art algorithms while being more robust. The system has been implemented and tested on simulated data and on real sky images.
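The voting scheme described in this abstract can be sketched in a few lines. This is a brute-force illustration under assumptions, not the authors' implementation: stars are assumed to be given as unit direction vectors, the tolerance value is arbitrary, and the function names `angular_distance` and `identify_stars` are hypothetical. A practical system would index catalogue pairs by distance rather than scanning all of them.

```python
import itertools
import numpy as np

def angular_distance(u, v):
    """Angle in radians between two unit direction vectors."""
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def identify_stars(image_dirs, catalogue_dirs, tolerance=1e-3):
    """Return, for each image star, the index of the catalogue star
    that cast the most votes for it."""
    n_img, n_cat = len(image_dirs), len(catalogue_dirs)
    votes = np.zeros((n_img, n_cat), dtype=int)
    for i, j in itertools.combinations(range(n_img), 2):
        d_img = angular_distance(image_dirs[i], image_dirs[j])
        for a, b in itertools.combinations(range(n_cat), 2):
            d_cat = angular_distance(catalogue_dirs[a], catalogue_dirs[b])
            if abs(d_img - d_cat) < tolerance:
                # Distance is symmetric, so each catalogue star of the
                # pair votes for each image star of the pair.
                for img_star in (i, j):
                    for cat_star in (a, b):
                        votes[img_star, cat_star] += 1
    return votes.argmax(axis=1)
```

Because a correct catalogue star collects a vote from every matching pair it belongs to, while an incorrect one collects votes only from the pairs it happens to share, the correct identity tends to accumulate the most votes as more stars appear in the image.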
01 Dec 2008
TL;DR: This paper defines a new class of view-independent curves, denoted demarcating curves, which are applied to artifact illustration in archaeology, where they can serve as a worthy alternative to the expensive, time-consuming, and biased manual depiction currently used.
Abstract: Curves on objects can convey the inherent features of the shape. This paper defines a new class of view-independent curves, denoted demarcating curves. In a nutshell, demarcating curves are the loci of the "strongest" inflections on the surface. Due to their appealing capabilities to extract and emphasize 3D textures, they are applied to artifact illustration in archaeology, where they can serve as a worthy alternative to the expensive, time-consuming, and biased manual depiction currently used.
04 Jan 1998
TL;DR: This work introduces a novel method for visual homing based on recovering the epipolar geometry relating the current image taken by the robot and the target image, and presents two homing algorithms for two standard projection models, weak and full perspective.
Abstract: We introduce a novel method for visual homing. Using this method a robot can be sent to desired positions and orientations in 3-D space specified by single images taken from these positions. Our method determines the path of the robot on-line. The starting position of the robot is not constrained, and a 3-D model of the environment is not required. The method is based on recovering the epipolar geometry relating the current image taken by the robot and the target image. Using the epipolar geometry, most of the parameters which specify the differences in position and orientation of the camera between the two images are recovered. However, since not all of the parameters can be recovered from two images, we have developed specific methods to bypass these missing parameters and resolve the ambiguities that exist. We present two homing algorithms for two standard projection models, weak and full perspective. We have performed simulations and real experiments which demonstrate the robustness of the method and that the algorithms always converge to the target pose.
01 Jan 2001
TL;DR: This book is recommended reading because it is an inspiring work that offers readers new experience and ideas.
Abstract: Downloading the book from this website's lists offers several advantages. The site presents its best and most complete book collections, and many books can be found there, not only Multiple View Geometry in Computer Vision. This book is recommended because it is an inspiring work that offers readers new experience and ideas. Simply read the soft copy of the book to obtain it.
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
23 Jun 2013
TL;DR: Large scale experiments are carried out with various evaluation criteria to identify effective approaches for robust tracking and provide potential future research directions in this field.
Abstract: Object tracking is one of the most important components in numerous applications of computer vision. While much progress has been made in recent years with efforts on sharing code and datasets, it is of great importance to develop a library and benchmark to gauge the state of the art. After briefly reviewing recent advances of online object tracking, we carry out large scale experiments with various evaluation criteria to understand how these algorithms perform. The test image sequences are annotated with different attributes for performance evaluation and analysis. By analyzing quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudocode. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
TL;DR: A novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning, and detection, and develops a novel learning method (P-N learning) which estimates the errors by a pair of “experts”: P-expert estimates missed detections, and N-expert estimates false alarms.
Abstract: This paper investigates long-term tracking of unknown objects in a video stream. The object is defined by its location and extent in a single frame. In every frame that follows, the task is to determine the object's location and extent or indicate that the object is not present. We propose a novel tracking framework (TLD) that explicitly decomposes the long-term tracking task into tracking, learning, and detection. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary. The learning estimates the detector's errors and updates it to avoid these errors in the future. We study how to identify the detector's errors and learn from them. We develop a novel learning method (P-N learning) which estimates the errors by a pair of “experts”: (1) P-expert estimates missed detections, and (2) N-expert estimates false alarms. The learning process is modeled as a discrete dynamical system and the conditions under which the learning guarantees improvement are found. We describe our real-time implementation of the TLD framework and the P-N learning. We carry out an extensive quantitative evaluation which shows a significant improvement over state-of-the-art approaches.