Other affiliations: University of Pennsylvania
Bio: Uğur Güdükbay is an academic researcher from Bilkent University. The author has contributed to research in topics: Rendering (computer graphics) & Polygon mesh. The author has an hindex of 23, co-authored 120 publications receiving 2352 citations. Previous affiliations of Uğur Güdükbay include University of Pennsylvania.
Papers published on a yearly basis
TL;DR: This paper proposes a novel method to detect fire and/or flames in real-time by processing the video data generated by an ordinary camera monitoring a scene by analyzing the video in the wavelet domain.
Abstract: This paper proposes a novel method to detect fire and/or flames in real-time by processing the video data generated by an ordinary camera monitoring a scene. In addition to ordinary motion and color clues, flame and fire flicker is detected by analyzing the video in the wavelet domain. Quasi-periodic behavior in flame boundaries is detected by performing temporal wavelet transform. Color variations in flame regions are detected by computing the spatial wavelet transform of moving fire-colored regions. Another clue used in the fire detection algorithm is the irregularity of the boundary of the fire-colored region. All of the above clues are combined to reach a final decision. Experimental results show that the proposed method is very successful in detecting fire and/or flames. In addition, it drastically reduces the false alarms issued to ordinary fire-colored moving objects as compared to the methods using only motion and color clues.
TL;DR: This approach extends the HiDAC (High-Density Autonomous Crowds) system by providing each agent with a personality model based on the Ocean (openness, conscientiousness, extroversion, agreeableness, and neuroticism) personality model.
Abstract: This approach extends the HiDAC (High-Density Autonomous Crowds) system by providing each agent with a personality model based on the Ocean (openness, conscientiousness, extroversion, agreeableness, and neuroticism) personality model. Each personality trait has an associated nominal behavior. Specifying an agent's personality leads to an automation of low-level parameter tuning.
TL;DR: It can be argued that 3-D scene and texture representation techniques are mature enough to serve and fulfill the requirements of 3D extraction, transmission and display sides in a 3DTV scenario.
Abstract: 3-D scene representation is utilized during scene extraction, modeling, transmission and display stages of a 3DTV framework. To this end, different representation technologies are proposed to fulfill the requirements of 3DTV paradigm. Dense point-based methods are appropriate for free-view 3DTV applications, since they can generate novel views easily. As surface representations, polygonal meshes are quite popular due to their generality and current hardware support. Unfortunately, there is no inherent smoothness in their description and the resulting renderings may contain unrealistic artifacts. NURBS surfaces have embedded smoothness and efficient tools for editing and animation, but they are more suitable for synthetic content. Smooth subdivision surfaces, which offer a good compromise between polygonal meshes and NURBS surfaces, require sophisticated geometry modeling tools and are usually difficult to obtain. One recent trend in surface representation is point-based modeling which can meet most of the requirements of 3DTV, however the relevant state-of-the-art is not yet mature enough. On the other hand, volumetric representations encapsulate neighborhood information that is useful for the reconstruction of surfaces with their parallel implementations for multiview stereo algorithms. Apart from the representation of 3-D structure by different primitives, texturing of scenes is also essential for a realistic scene rendering. Image-based rendering techniques directly render novel views of a scene from the acquired images, since they do not require any explicit geometry or texture representation. 3-D human face and body modeling facilitate the realistic animation and rendering of human figures that is quite crucial for 3DTV that might demand real-time animation of human bodies. Physically based modeling and animation techniques produce impressive results, thus have potential for use in a 3DTV framework for modeling and animating dynamic scenes. As a concluding remark, it can be argued that 3-D scene and texture representation techniques are mature enough to serve and fulfill the requirements of 3-D extraction, transmission and display sides in a 3DTV scenario.
••13 May 2006
TL;DR: An instance based machine learning algorithm and system for real-time object classification and human action recognition which can help to build intelligent surveillance systems are presented.
Abstract: In this paper we present an instance based machine learning algorithm and system for real-time object classification and human action recognition which can help to build intelligent surveillance systems. The proposed method makes use of object silhouettes to classify objects and actions of humans present in a scene monitored by a stationary camera. An adaptive background subtract-tion model is used for object segmentation. Template matching based supervised learning method is adopted to classify objects into classes like human, human group and vehicle; and human actions into predefined classes like walking, boxing and kicking by making use of object silhouettes.
TL;DR: This study aims to create a system that enables the specification of different crowd types ranging from audiences to mobs and parametrize the common properties of mobs to create collective misbehavior.
Abstract: In the social psychology literature, crowds are classified as audiences and mobs. Audiences are passive crowds, whereas mobs are active crowds with emotional, irrational and seemingly homogeneous behavior. In this study, we aim to create a system that enables the specification of different crowd types ranging from audiences to mobs. In order to achieve this goal we parametrize the common properties of mobs to create collective misbehavior. Because mobs are characterized by emotionality, we describe a framework that associates psychological components with individual agents comprising a crowd and yields emergent behaviors in the crowd as a whole. To explore the effectiveness of our framework we demonstrate two scenarios simulating the behavior of distinct mob types.
01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.
01 Jan 1978
TL;DR: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.), and is a "must-have" reference for every serious programmer's digital library.
Abstract: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.). One of the best-selling programming books published in the last fifty years, "K&R" has been called everything from the "bible" to "a landmark in computer science" and it has influenced generations of programmers. Available now for all leading ebook platforms, this concise and beautifully written text is a "must-have" reference for every serious programmers digital library. As modestly described by the authors in the Preface to the First Edition, this "is not an introductory programming manual; it assumes some familiarity with basic programming concepts like variables, assignment statements, loops, and functions. Nonetheless, a novice programmer should be able to read along and pick up the language, although access to a more knowledgeable colleague will help."
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance of the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!.
TL;DR: A Deep Boltzmann Machine is proposed for learning a generative model of multimodal data and it is shown that the model can be used to create fused representations by combining features across modalities, which are useful for classification and information retrieval.
Abstract: Data often consists of multiple diverse modalities For example, images are tagged with textual information and videos are accompanied by audio Each modality is characterized by having distinct statistical properties We propose a Deep Boltzmann Machine for learning a generative model of such multimodal data We show that the model can be used to create fused representations by combining features across modalities These learned representations are useful for classification and information retrieval By sampling from the conditional distributions over each data modality, it is possible to create these representations even when some data modalities are missing We conduct experiments on bimodal image-text and audio-video data The fused representation achieves good classification results on the MIR-Flickr data set matching or outperforming other deep models as well as SVM based models that use Multiple Kernel Learning We further demonstrate that this multimodal model helps classification and retrieval even when only unimodal data is available at test time
•03 Dec 2012
TL;DR: In this paper, a Deep Boltzmann Machine (DBM) is proposed for learning a generative model of data that consists of multiple and diverse input modalities, which can be used to extract a unified representation that fuses modalities together.
Abstract: A Deep Boltzmann Machine is described for learning a generative model of data that consists of multiple and diverse input modalities. The model can be used to extract a unified representation that fuses modalities together. We find that this representation is useful for classification and information retrieval tasks. The model works by learning a probability density over the space of multimodal inputs. It uses states of latent variables as representations of the input. The model can extract this representation even when some modalities are absent by sampling from the conditional distribution over them and filling them in. Our experimental results on bi-modal data consisting of images and text show that the Multimodal DBM can learn a good generative model of the joint space of image and text inputs that is useful for information retrieval from both unimodal and multimodal queries. We further demonstrate that this model significantly outperforms SVMs and LDA on discriminative tasks. Finally, we compare our model to other deep learning methods, including autoencoders and deep belief networks, and show that it achieves noticeable gains.