Author
B. V. Patel
Bio: B. V. Patel is an academic researcher affiliated with the Association for Computing Machinery. The author has contributed to research on ranking (information retrieval) and search engine indexing, has an h-index of 3, and has co-authored 4 publications receiving 167 citations.
Papers
TL;DR: This survey reviews the interesting features that can be extracted from video data for indexing and retrieval, along with similarity measurement methods, and identifies present research issues in the area of content-based video retrieval systems.
Abstract: With the development of multimedia data types and available bandwidth, there is a huge demand for video retrieval systems, as users shift from text-based to content-based retrieval. The selection of extracted features plays an important role in content-based video retrieval, regardless of which video attributes are under consideration. These features are intended for selecting, indexing, and ranking according to their potential interest to the user. Good feature selection also allows the time and space costs of the retrieval process to be reduced. This survey reviews the interesting features that can be extracted from video data for indexing and retrieval, along with similarity measurement methods. We also identify present research issues in the area of content-based video retrieval systems.
90 citations
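The feature-extraction and similarity-measurement pipeline the survey covers can be illustrated with a minimal sketch, assuming grey-level histogram features and histogram intersection as the similarity measure (one of many options such surveys discuss; all function names here are illustrative, not from the paper):

```python
# Minimal content-based retrieval sketch: frames are stand-ins
# (flat lists of grey levels); names are illustrative only.

def grey_histogram(frame, bins=8, levels=256):
    """Quantise pixel intensities into a normalised histogram feature."""
    hist = [0] * bins
    for px in frame:
        hist[px * bins // levels] += 1
    total = len(frame)
    return [h / total for h in hist]

def histogram_intersection(h1, h2):
    """A common similarity measure: 1.0 means identical distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank_frames(query, database):
    """Rank database frames by similarity to the query frame."""
    q = grey_histogram(query)
    scored = [(histogram_intersection(q, grey_histogram(f)), i)
              for i, f in enumerate(database)]
    return sorted(scored, reverse=True)
```

Here `rank_frames` returns (similarity, index) pairs sorted best-first; a real system would use richer features (color, texture, motion) and an index structure instead of a linear scan.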
TL;DR: In this article, a survey of interesting features that can be extracted from video data for indexing and retrieval, along with similarity measurement methods, is presented, and the authors identify present research issues in the area of content-based video retrieval systems.
71 citations
16 Apr 2010
TL;DR: This paper proposes integrating two features of video key frames, entropy and black-and-white points on edges, to develop the proposed video retrieval system, and shows that the feature integration is effective.
Abstract: Traditional video retrieval methods fail to meet the technical challenges posed by the large and rapid growth of multimedia data, which demands effective retrieval systems. In this paper we propose integrating two features of video key frames, entropy and black-and-white points on edges, to develop the proposed video retrieval system. First, a video feature database is created using the entropy feature extracted from the key frames of the video database; the same feature is extracted from the query video frame. We then extract the edges and the black-and-white points on edges from the database frames and the query frame. Finally, a similarity measure is applied to retrieve the best-matching frames, and the corresponding videos are presented as output. The experimental results show that the feature integration is effective.
14 citations
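The entropy feature the paper builds on can be sketched in a few lines (a minimal illustration only; the paper's key-frame selection and edge-point extraction are not reproduced, and the function names are hypothetical):

```python
import math

def frame_entropy(frame, levels=256):
    """Shannon entropy of a frame's grey-level distribution, in bits.
    Uniform frames score 0; frames with varied content score higher."""
    counts = {}
    for px in frame:
        counts[px] = counts.get(px, 0) + 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def entropy_distance(f1, f2):
    """Simple dissimilarity between two frames' entropy features."""
    return abs(frame_entropy(f1) - frame_entropy(f2))
```

In the paper this feature is combined with edge-based features before the similarity step; entropy alone cannot distinguish frames with different content but identical intensity distributions.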
Proceedings Article
01 Jan 2008
TL;DR: To implement an intrusion detection system (IDS), fuzzy logic is integrated with an extended Apriori association data-mining algorithm to extract more abstract, higher-level patterns that capture deviations from stored patterns of the computer network's normal behaviour.
Abstract: To implement an intrusion detection system (IDS), we have integrated fuzzy logic with an extended Apriori association data-mining algorithm to extract more abstract, higher-level patterns that capture deviations from stored patterns of the computer network's normal behaviour. The packet formats of TCP, UDP, IP, etc. are used to study the normal behaviour of the network. Genetic algorithms are used to tune the fuzzy membership functions, and the tuned data is processed by the modified Apriori algorithm. The association patterns are populated by a genetic algorithm to select the best population of the network traffic. This best-populated data is then classified by the C4.5 algorithm to find intrusions. The IDS is deployed under the control of a secure Linux environment, and the system is tested in a distributed environment.
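The fuzzy-logic side of such a system can be sketched with triangular membership functions over a traffic feature such as packet rate. The categories and breakpoints below are illustrative assumptions, not the paper's actual membership functions; in the paper these functions are tuned by a genetic algorithm:

```python
def triangular_membership(x, a, b, c):
    """Triangular fuzzy membership: 0 at a and c, peak of 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzify_packet_rate(rate):
    """Map a packets-per-second value to fuzzy traffic categories.
    Breakpoints are made up for illustration, not from the paper."""
    return {
        "low":    triangular_membership(rate, -1, 0, 100),
        "normal": triangular_membership(rate, 50, 150, 300),
        "high":   triangular_membership(rate, 200, 500, 10**9),
    }
```

A rate of 75 packets/s, for example, belongs partly to "low" and partly to "normal"; the downstream association mining then works on these graded memberships rather than hard thresholds.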
Cited by
TL;DR: An evaluation of KNN performance using a large number of distance measures, tested on a number of real-world data sets with and without added noise, found that a recently proposed non-convex distance performed best on most data sets compared with the other tested distances.
Abstract: The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the distance or similarity between the tested examples and the training examples. This raises a major question: which distance measure should be used for the KNN classifier among the large number of distance and similarity measures available? This review attempts to answer that question by evaluating the performance (measured by accuracy, precision, and recall) of KNN using a large number of distance measures, tested on a number of real-world datasets, with and without adding different levels of noise. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, with large gaps between the performances of different distances. We found that a recently proposed non-convex distance performed best on most datasets compared with the other tested distances. In addition, the performance of KNN with this top-performing distance degraded by only about 20% even when the noise level reached 90%, and this held for most of the distances used as well. This means that the KNN classifier using any of the top 10 distances tolerates noise to a certain degree. Moreover, the results show that some distances are less affected by the added noise than others.
170 citations
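The review's central point, that the distance measure is a pluggable and consequential choice in KNN, can be sketched as follows (a minimal KNN with an interchangeable distance function; the non-convex distance studied in the paper is not reproduced here):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3, distance=None):
    """Classify `query` by majority vote of its k nearest training examples.
    `train` is a list of (feature_vector, label) pairs; `distance` is any
    pairwise metric -- the review's point is that this choice matters."""
    if distance is None:  # default to Euclidean
        distance = lambda u, v: math.sqrt(sum((a - b) ** 2
                                              for a, b in zip(u, v)))
    neighbours = sorted(train, key=lambda ex: distance(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def manhattan(u, v):
    """An alternative metric, swapped in without touching the classifier."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

Swapping `manhattan` (or any other metric) in for the Euclidean default changes which neighbours are retrieved, which is exactly the effect the review measures across datasets and noise levels.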
Posted Content•
TL;DR: A number of the most commonly-used performance fitness and error metrics for regression and classification algorithms, with emphasis on engineering applications, are examined.
Abstract: Machine learning (ML) is the field of training machines to achieve a high level of cognition and perform human-like analysis. Since ML is a data-driven approach, it fits naturally into our daily lives and operations as well as into complex and interdisciplinary fields. With the rise of commercial, open-source, and user-catered ML tools, a key question often arises whenever ML is applied to explore a phenomenon or a scenario: what constitutes a good ML model? Keeping in mind that a proper answer to this question depends on a variety of factors, this work presumes that a good ML model is one that optimally performs and best describes the phenomenon at hand. From this perspective, identifying proper assessment metrics to evaluate the performance of ML models is not only necessary but also warranted. As such, this paper examines a number of the most commonly used performance fitness and error metrics for regression and classification algorithms, with emphasis on engineering applications.
56 citations
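Two of the metrics such surveys cover, RMSE for regression and precision/recall for classification, can be computed as follows (a minimal sketch; the paper examines many more metrics than these):

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error: a standard regression fitness metric."""
    return math.sqrt(sum((t - p) ** 2
                         for t, p in zip(y_true, y_pred)) / len(y_true))

def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for a binary classifier.
    Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

The zero-denominator guards matter in practice: a classifier that never predicts the positive class has undefined precision, and returning 0.0 (as here) is one common convention.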
TL;DR: A new framework based on dynamic mode decomposition (DMD) for shot boundary detection, which achieves high detection accuracy even when the color changes are not obvious, the illumination changes slowly, or the foreground objects overlap.
Abstract: Shot detection is widely used in video semantic analysis, video scene segmentation, and video retrieval. However, it remains a challenging task due to weak boundaries and sudden changes in brightness or foreground objects. In this paper, we propose a new framework based on dynamic mode decomposition (DMD) for shot boundary detection. Because DMD can extract several temporal foreground modes and one temporal background mode from video data, shot boundaries can be detected when the amplitude changes sharply; here, the amplitude is the DMD coefficient used to reconstruct the video. The main idea behind shot boundary detection is finding the amplitude change of the background mode. We can reduce detection errors when the illumination changes sharply or the foreground object (or camera) moves very quickly. At the same time, our algorithm achieves high detection accuracy even when the color changes are not obvious, the illumination changes slowly, or the foreground objects overlap. A color space for DMD is also selected to reduce false detections. Finally, the effectiveness of our method is demonstrated by detecting shot boundaries in videos of various content types.
55 citations
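A bare-bones version of the DMD step can be sketched with exact DMD on a snapshot matrix whose columns are flattened frames. This is a simplified illustration, not the paper's full pipeline (no sliding windows, color-space selection, or per-boundary amplitude tracking):

```python
import numpy as np

def dmd_eigs(X, r=2):
    """Exact DMD of a snapshot matrix X (one flattened frame per column).
    Returns the eigenvalues of the reduced linear operator; an eigenvalue
    near 1 corresponds to the temporal background mode the paper tracks."""
    X1, X2 = X[:, :-1], X[:, 1:]            # consecutive snapshot pairs
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]   # rank-r truncation
    A_tilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)
    return np.linalg.eigvals(A_tilde)
```

On a synthetic clip with a static background and a flickering foreground, the eigenvalues come out near 1 (background) and -1 (foreground); the paper's detector watches how the background-mode amplitude changes to flag shot boundaries.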
TL;DR: A new algorithm for recognizing surgical tasks in real-time in a video stream, based on adaptive spatiotemporal polynomials, is introduced; it is particularly suited to characterizing deformable moving objects with fuzzy borders, which are typically found in surgical videos.
Abstract: This paper introduces a new algorithm for recognizing surgical tasks in real-time in a video stream. The goal is to communicate information to the surgeon in due time during a video-monitored surgery. The proposed algorithm is applied to cataract surgery, which is the most common eye surgery. To compensate for eye motion and zoom level variations, cataract surgery videos are first normalized. Then, the motion content of short video subsequences is characterized with spatiotemporal polynomials: a multiscale motion characterization based on adaptive spatiotemporal polynomials is presented. The proposed solution is particularly suited to characterize deformable moving objects with fuzzy borders, which are typically found in surgical videos. Given a target surgical task, the system is trained to identify which spatiotemporal polynomials are usually extracted from videos when and only when this task is being performed. These key spatiotemporal polynomials are then searched in new videos to recognize the target surgical task. For improved performance, the system jointly adapts the spatiotemporal polynomial basis and identifies the key spatiotemporal polynomials using the multiple-instance learning paradigm. The proposed system runs in real-time and outperforms the previous solution from our group, both for surgical task recognition ( $A_z = 0.851$ on average, as opposed to $A_z = 0.794$ previously) and for the joint segmentation and recognition of surgical tasks ( $A_z = 0.856$ on average, as opposed to $A_z = 0.832$ previously).
50 citations
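The idea of summarizing motion content with low-order polynomial fits can be illustrated in one dimension: fit a polynomial to a temporal signal by least squares and keep the coefficients as a descriptor. This is only loosely in the spirit of the paper's adaptive spatiotemporal polynomials, which operate jointly over space and time and are adapted during learning:

```python
def poly_descriptor(signal, degree=2):
    """Least-squares polynomial fit of a 1-D temporal signal via the
    normal equations; the coefficient vector acts as a tiny motion
    descriptor. Illustrative only -- not the paper's actual method."""
    n = len(signal)
    ts = [t / (n - 1) for t in range(n)]      # normalise time to [0, 1]
    m = degree + 1
    # Normal equations G a = h for the monomial basis 1, t, ..., t^degree.
    G = [[sum(t ** (i + j) for t in ts) for j in range(m)] for i in range(m)]
    h = [sum(s * t ** i for s, t in zip(signal, ts)) for i in range(m)]
    # Gaussian elimination with partial pivoting (the system is tiny).
    for col in range(m):
        piv = max(range(col, m), key=lambda row: abs(G[row][col]))
        G[col], G[piv] = G[piv], G[col]
        h[col], h[piv] = h[piv], h[col]
        for row in range(col + 1, m):
            f = G[row][col] / G[col][col]
            for c2 in range(col, m):
                G[row][c2] -= f * G[col][c2]
            h[row] -= f * h[col]
    coeffs = [0.0] * m
    for row in range(m - 1, -1, -1):          # back substitution
        coeffs[row] = (h[row] - sum(G[row][c2] * coeffs[c2]
                                    for c2 in range(row + 1, m))) / G[row][row]
    return coeffs
```

Signals generated by a degree-2 polynomial are recovered exactly (up to rounding), so nearby subsequences with similar motion yield nearby coefficient vectors, which is what makes such coefficients usable as features.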
TL;DR: An automatic video analysis system able to recognize surgical tasks in real-time and based on the Content-Based Video Retrieval paradigm, which can be trained to recognize any surgical task using weak annotations only.
Abstract: Nowadays, many surgeries, including eye surgeries, are video-monitored. We present in this paper an automatic video analysis system able to recognize surgical tasks in real-time. The proposed system relies on the Content-Based Video Retrieval (CBVR) paradigm. It characterizes short subsequences in the video stream and searches for video subsequences with similar structures in a video archive. Fixed-length feature vectors are built for each subsequence: the feature vectors are unchanged by variations in duration and temporal structure among the target surgical tasks. Therefore, it is possible to perform fast nearest neighbor searches in the video archive. The retrieved video subsequences are used to recognize the current surgical task by analogy reasoning. The system can be trained to recognize any surgical task using weak annotations only. It was applied to a dataset of 23 epiretinal membrane surgeries and a dataset of 100 cataract surgeries. Three surgical tasks were annotated in the first dataset, and nine surgical tasks were annotated in the second dataset. To assess its generality, the system was also applied to a dataset of 1,707 movie clips in which 12 human actions were annotated. High task recognition scores were measured in all three datasets. Real-time task recognition will be used in future works to communicate with surgeons (trainees in particular) or with surgical devices.
41 citations
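The duration-invariance property the abstract emphasizes, fixed-length feature vectors for variable-length subsequences, can be sketched with simple temporal pooling (an illustrative assumption; the paper's actual descriptors are more elaborate):

```python
def fixed_length_descriptor(subsequence):
    """Pool a variable-length sequence of per-frame feature vectors into
    a fixed-length vector (mean and range per dimension), so subsequences
    of different durations become directly comparable -- the property
    that enables fast nearest-neighbour search in a video archive."""
    dims = len(subsequence[0])
    desc = []
    for d in range(dims):
        col = [frame[d] for frame in subsequence]
        desc.append(sum(col) / len(col))    # mean over time
        desc.append(max(col) - min(col))    # range over time
    return desc
```

A 2-frame clip and a 4-frame clip tracing the same motion pattern pool to identical descriptors here, which is the kind of invariance that lets a retrieval system compare clips of different lengths with a single distance function.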