Showing papers on "Video camera published in 1996"


Proceedings ArticleDOI
01 Aug 1996
TL;DR: This paper describes a sampled representation for light fields that allows for both efficient creation and display of inward and outward looking views, and describes a compression system that is able to compress the light fields generated by more than a factor of 100:1 with very little loss of fidelity.
Abstract: A number of techniques have been proposed for flying through scenes by redisplaying previously rendered or digitized views. Techniques have also been proposed for interpolating between views by warping input images, using depth information or correspondences between multiple images. In this paper, we describe a simple and robust method for generating new views from arbitrary camera positions without depth information or feature matching, simply by combining and resampling the available images. The key to this technique lies in interpreting the input images as 2D slices of a 4D function, the light field. This function completely characterizes the flow of light through unobstructed space in a static scene with fixed illumination. We describe a sampled representation for light fields that allows for both efficient creation and display of inward and outward looking views. We have created light fields from large arrays of both rendered and digitized images. The latter are acquired using a video camera mounted on a computer-controlled gantry. Once a light field has been created, new views may be constructed in real time by extracting slices in appropriate directions. Since the success of the method depends on having a high sample rate, we describe a compression system that is able to compress the light fields we have generated by more than a factor of 100:1 with very little loss of fidelity. We also address the issues of antialiasing during creation, and resampling during slice extraction. CR Categories: I.3.2 [Computer Graphics]: Picture/Image Generation - Digitizing and scanning, Viewing algorithms; I.4.2 [Computer Graphics]: Compression - Approximate methods. Additional keywords: image-based rendering, light field, holographic stereogram, vector quantization, epipolar analysis

4,426 citations
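
The idea of building a new view by combining and resampling stored images can be illustrated with a deliberately simplified sketch. It assumes the light field is held as a NumPy array of shape (U, V, S, T), a grid of camera positions each holding one image, and it only blends the four stored images nearest to a fractional camera position. The array layout and the function name `extract_view` are assumptions made for illustration. The method in the abstract resamples individual rays across all four dimensions, which this sketch does not attempt.

```python
import numpy as np

def extract_view(light_field, u, v):
    """Bilinearly blend the four stored images nearest to camera position (u, v).

    light_field has shape (U, V, S, T): a (U, V) grid of camera positions,
    each holding an (S, T) image. u and v are fractional grid coordinates.
    """
    U, V = light_field.shape[:2]
    # Clamp the requested position to the sampled camera grid.
    u = float(np.clip(u, 0, U - 1))
    v = float(np.clip(v, 0, V - 1))
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, U - 1), min(v0 + 1, V - 1)
    fu, fv = u - u0, v - v0
    # Standard bilinear weights over the four neighbouring camera positions.
    return ((1 - fu) * (1 - fv) * light_field[u0, v0]
            + fu * (1 - fv) * light_field[u1, v0]
            + (1 - fu) * fv * light_field[u0, v1]
            + fu * fv * light_field[u1, v1])

# Toy light field: every pixel of image (u, v) holds the value u + 10 * v.
uu, vv = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
toy = (uu + 10 * vv)[:, :, None, None] * np.ones((1, 1, 8, 8))
view = extract_view(toy, 0.5, 1.5)
```

Because the toy values are linear in (u, v), the blended view reproduces them exactly, which makes the interpolation easy to check by hand.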


Patent
Jonathan J. Hull1, John F Cullen1
10 May 1996
TL;DR: In this article, a portable image transfer system includes a digital still camera which captures images in digital form and stores the images in a camera memory, a cellular telephone transmitter, and a central processing unit (CPU).
Abstract: A portable image transfer system includes a digital still camera which captures images in digital form and stores the images in a camera memory, a cellular telephone transmitter, and a central processing unit (CPU). The CPU controls the camera memory to cause it to output data representing an image and the CPU controls the cellular telephone transmitter to cause a cellular telephone to transmit the data received from the camera memory. A receiving station is coupled to the cellular telephone transmitter by a cellular network to receive image data and store the images.

357 citations


Patent
18 Dec 1996
TL;DR: In this article, apparatus and methods are described in which a symbol, shown on a map to indicate where an image generator is installed, is dragged and dropped to a specific position, thereby establishing a logical network connection with the video transmission terminal to which the image generator is connected.
Abstract: In order to freely select, locate and display an image from a remote place on a monitor to facilitate observer's use, disclosed are apparatus and methods arranged to perform operation of drag and drop of a symbol to a specific position on a map showing the symbol indicating information of the position where an image generator is set, thereby establishing logical network connection with a video transmission terminal to which the image generator is connected, to display a video in an arbitrary display area, to perform the drag and drop operation of the video displayed in the video display area to another video display area, thereby changing a video display position, and to perform the drag and drop operation thereof to a display stop symbol, thereby disconnecting the logical network connection to stop the video display of the video camera.

353 citations


Proceedings ArticleDOI
14 Oct 1996
TL;DR: A 3D deformable Point Distribution Model of the human hand is constructed, capturing training data semi-automatically from volume images via a physically-based model, and a weighted least-squares pose parameter approximation is improved at little computational cost.
Abstract: In this paper we first describe how we have constructed a 3D deformable Point Distribution Model of the human hand, capturing training data semi-automatically from volume images via a physically-based model. We then show how we have attempted to use this model in tracking an unmarked hand moving with 6 degrees of freedom (plus deformation) in real time using a single video camera. In the course of this we show how to improve on a weighted least-squares pose parameter approximation at little computational cost. We note the successes and shortcomings of our system and discuss how it might be improved.

324 citations


Patent
20 Feb 1996
TL;DR: In this article, a vehicle video system is presented that includes at least three video cameras electrically connected to a video signal relay device, which directs video signals generated by each of the video cameras to either a video recorder or a monitor depending upon switching signals received by the relay device.
Abstract: A vehicle video system including at least three video cameras electrically connected to a video signal relay device which directs video signals generated by each of the video cameras to either a video recorder or a monitor depending upon switching signals received by the video signal relay device. A video camera is automatically triggered to commence recording upon activation of a turn signal of a vehicle on which the system is deployed. A motion detector deployed on one vehicle can trigger a video camera deployed on another vehicle to commence recording. The system draws power from a battery of the vehicle.

313 citations


Journal ArticleDOI
TL;DR: A digital technique for high-speed visualization of vibration, called videokymography, was developed and applied to the vocal folds and is suitable for further processing and quantification of recorded vibration.

285 citations


Patent
18 Sep 1996
TL;DR: In this paper, a graphic image system is presented comprising a video camera producing a first video signal defining a first image including a foreground object and a background, the foreground object preferably including an image of a human subject having a head with a face, an image position estimating system for identifying a position with respect to said foreground object, and a computer, responsive to the position estimating system, for defining a mask region separating the foreground object from said background.
Abstract: A graphic image system comprising a video camera producing a first video signal defining a first image including a foreground object and a background, the foreground object preferably including an image of a human subject having a head with a face; an image position estimating system for identifying a position with respect to said foreground object, e.g., the head, the foreground object having features in constant physical relation to the position; and a computer, responsive to the position estimating system, for defining a mask region separating the foreground object from said background. The computer generates a second video signal including a portion corresponding to the mask region, responsive to said position estimating system, which preferably includes a character having a mask outline. In one embodiment, the mask region of the second video signal is keyed so that the foreground object of the first video signal shows through, with the second video signal having portions which interact with the foreground object. In another embodiment, means responsive to the position estimating system are provided for dynamically defining an estimated boundary of the face and for merging the face, as limited by the estimated boundary, within the mask outline of the character. Video and still imaging devices may be flexibly placed in uncontrolled environments, such as in a kiosk in a retail store, with an actual facial image within the uncontrolled environment placed within a computer generated virtual world replacing the existing background and any non-participants.

253 citations


Proceedings ArticleDOI
16 Nov 1996
TL;DR: An empirical study of people using mobile collaborative systems to support maintenance tasks on a bicycle shows the value of collaboration, but raises questions about the interaction of communication media and conversational coordination on task performance.
Abstract: We report an empirical study of people using mobile collaborative systems to support maintenance tasks on a bicycle. Results show that field workers make repairs more quickly and accurately when they have a remote expert helping them. Some pairs were connected by a shared video system, where the video camera focused on the active workspace and they communicated with full duplex audio. For other pairs, either the video was eliminated or the audio was reduced to half duplex (but not both). Pairs’ success at collaboration did not vary with the communication technology. However, the manner in which they coordinated advice-giving did vary with the communication technology. In particular, help was more proactive and coordination was less explicit when the pairs had video connections. The results show the value of collaboration, but raise questions about the interaction of communication media and conversational coordination on task performance.

214 citations


Patent
23 May 1996
TL;DR: In this article, a video camera and a video display are used to display signing motions provided by translating spoken words of a hearing person into digitized images, and the system may function as a translator by outputting the translated words and phrases as synthetic speech at the deaf person's location for another person at that location, and that person's speech may be picked up, translated and displayed as signing motions on a display in the video apparatus.
Abstract: An electronic communications system for the deaf includes a video apparatus for observing and digitizing the facial, body and hand and finger signing motions of a deaf person, an electronic translator for translating the digitized signing motions into words and phrases, and an electronic output for the words and phrases. The video apparatus desirably includes both a video camera and a video display which will display signing motions provided by translating spoken words of a hearing person into digitized images. The system may function as a translator by outputting the translated words and phrases as synthetic speech at the deaf person's location for another person at that location, and that person's speech may be picked up, translated, and displayed as signing motions on a display in the video apparatus.

201 citations


Patent
23 Dec 1996
TL;DR: In this article, a 3D digitized motion template, a motion training device, a network of devices, and a method are provided for enabling a student to interactively emulate in real time the 3D, actual moving image of an instructor performing a selected motion.
Abstract: The invention provides a three-dimensional, digitized motion template, a motion training device, a network of devices, and a method for enabling a student to interactively emulate in real time the three-dimensional, actual moving image of an instructor performing a selected motion. The device includes a video camera configured to transmit a real time background having a live, moving image of the student dynamically performing the selected motion. A monitor is configured for viewing by the student while performing the selected motion. A motion template has a stored sequence of moving images of an instructor dynamically performing the selected motion. The device also includes a method for superimposing the motion template onto the real time background and simultaneously displaying on the monitor the resulting combination of the motion template and the real time background scene. The device can further be one or many devices connected in a network sharing access to a database containing a library of motion templates of different instructors who are top performers in their field.

186 citations


Patent
12 Jul 1996
TL;DR: In this paper, a portable 3D scanning system collects 2D profile data of objects using a combination of a laser-stripe positioning device and a video camera which detects the images of the laser stripe reflected from the object.
Abstract: A portable 3D scanning system collects 2D-profile data of objects using a combination of a laser-stripe positioning device and a video camera which detects the images of the laser stripe reflected from the object. The scanning system includes a laser-stripe generator, a video camera, a scanning mirror attached to a continuously rotating motor, an encoder or a photodiode operationally coupled to the motor, and associated electronics. As the rotating, scanning mirror reflects the laser stripe and variably positions the laser stripe across the object, the encoder or the photodiode generates signals indicating the angular position of the mirror. The video images of the reflected laser stripes are stored on a storage medium, while data relating to the angular positions of the laser stripes recorded in the video images are simultaneously stored on a storage medium. A computer subsequently synchronizes and processes the recorded laser stripe data with the angular-position data to generate a 3D model of the object by applying triangulation calculation and other post-scanning methods, e.g., multi-resolution analysis and adaptive-mesh generation. The multi-resolution analysis, which applies more points to resolve fine details and fewer points for smooth regions of the objects, leads to significant data compression. The adaptive mesh, which includes connected polygonal elements and which may have multiple resolutions and tolerances, is generated by the adaptive-mesh generating routine.
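
The triangulation step mentioned in this abstract can be sketched in a planar form. The geometry below is an assumption chosen for illustration, not the patent's calibration: the laser sits at the origin, the camera sits a known baseline away along the x axis, and each device reports the angle between the baseline and its ray to the illuminated point. The function name `triangulate` and its parameters are hypothetical.

```python
import math

def triangulate(baseline, laser_angle, camera_angle):
    """Locate a laser-lit point in the plane containing the laser and the camera.

    The laser is at (0, 0) and the camera at (baseline, 0). Both angles are in
    radians, measured from the baseline towards the point. Returns (x, z),
    where z is the perpendicular distance from the baseline.
    """
    # tan(laser_angle) = z / x and tan(camera_angle) = z / (baseline - x),
    # so baseline = z * (cot(laser_angle) + cot(camera_angle)).
    cot_sum = 1.0 / math.tan(laser_angle) + 1.0 / math.tan(camera_angle)
    z = baseline / cot_sum
    x = z / math.tan(laser_angle)
    return x, z

# Symmetric case: both rays at 45 degrees over a baseline of 2 units.
x, z = triangulate(2.0, math.radians(45), math.radians(45))
```

In a scanner of the kind described, the laser angle would come from the mirror's encoder or photodiode signal and the camera angle from the stripe's pixel position. Both mappings depend on calibration that the abstract does not specify.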

Patent
15 Jul 1996
TL;DR: In this article, a portable, hand-held endoscopic camera having all of the necessary components for performing endoscopic procedures comprises power source means, lens means, light source means and video camera means.
Abstract: A portable, hand-held endoscopic camera having all of the necessary components for performing endoscopic procedures comprises power source means, lens means, light source means, and video camera means. The portable endoscopic camera is adaptable to a wide variety of systems and includes a highly efficient means for focusing the illumination of the light source. The lens means includes a fiber bundle and the light source means includes a bulb. The bulb is positioned in an abutting relationship with the fiber bundle, thereby focusing light into the fiber bundle. The camera is selectively operable in a cordless and cord-operated mode.

Patent
31 Oct 1996
TL;DR: In an intelligent video information management (IVIM) system, a first sequence of dynamic video images is generated by a first video camera on a first occasion and is recorded on a hard disk.
Abstract: In an intelligent video information management (IVIM) system, a first sequence of dynamic video images is generated by a first video camera on a first occasion and is recorded on a hard disk. The same camera, or a different camera, generates a second sequence of dynamic video images on a second occasion that is later than the first occasion, and the second sequence is recorded on the hard disk. Both sequences are reproduced from the hard disk and are displayed simultaneously. Alternatively, the first sequence is reproduced and displayed while the second sequence is being recorded.

Patent
18 Nov 1996
TL;DR: In this paper, an apparatus for producing and printing a photograph includes touch-sensitive controls, a visual display, a currency receiving and/or dispensing device, a hand-held remote control device for adjusting location of a photo image area on the visual display and controlling capture of the photo image of the user from a video camera lens and controller board, a first printer and a second printer, all connected to a central processor.
Abstract: An apparatus for producing and printing a photograph includes touch-sensitive controls, a visual display, a currency receiving and/or dispensing device, a hand-held remote control device for adjusting location of a photo image area on the visual display and controlling capture of a photo image of the user from a video camera lens and controller board, a first printer and a second printer, all connected to a central processor. Video impressions viewed by the camera lens are converted into a series of electrical impulses, which are selectively captured and digitized. Thereafter, a digitized photo image of the user is either combined with a stored postage meter image for printing by the second printer to produce a printed photograph on one side of an adhesive-backed postage meter strip, or, alternatively, uploaded by a high-speed modem to a host computer for processing a passport, driver's license, or other government identification-type document requiring a photograph.

Patent
Karlheinz Dorn1, Detlef Becker1
27 Jun 1996
Abstract: A medical system architecture includes at least one modality for acquiring medical images, an apparatus for processing the medical images and for accepting patient-related data, an apparatus for communicating the images and data, and an apparatus for storing the images and patient-related data. Furthermore, an apparatus for the digital acquisition of optical images, such as a photo camera, a video camera and/or a scanner, is connected to the apparatus for communication, it being possible to store the digitized optical images in the apparatus together with the medical images and patient-related data.

Patent
Robert J. Gove1
30 Aug 1996
TL;DR: In this paper, a system for stabilizing a video recording of a scene (20, 22, and 24) made with a video camera (34) is provided. The video recording may include video data (36) and audio (38) data.
Abstract: A system (26) for stabilizing a video recording of a scene (20, 22, and 24) made with a video camera (34) is provided. The video recording may include video data (36) and audio (38) data. The system (26) may include source frame storage (64) for storing source video data (36) as a plurality of sequential frames. The system (26) may also include a processor (50) for detecting camera movement occurring during recording and for modifying the video data (36) to compensate for the camera movement. Additionally, the system (26) may include destination frame storage (70) for storing the modified video data as a plurality of sequential frames.

Patent
28 Mar 1996
TL;DR: In this paper, a video endoscope with interchangeable endoscope heads is presented, which enables the operative hook up or connection of any of a variety of interchangeable endoscope heads including optical or objective elements for receiving and transmitting images of an object or target.
Abstract: A video endoscope with interchangeable endoscope heads enables the operative hook up or connection of any of a variety of interchangeable endoscope heads, including optical or objective elements for receiving and transmitting images of an object or target to be examined, with both a video camera head and a light source in a single, quick and easy step. When operatively connected, the present video endoscope allows the optional simple and easy rotation of the endoscope head and light source relative to the video camera head to achieve a desired endoscope head orientation, and rotation of just the video camera head alone to adjust the orientation of the image viewed through the device on a video monitor or other medium.

Patent
16 Dec 1996
TL;DR: In this article, an aircraft surveillance and recording system adapted to monitor conditions prevailing in the course of a flight and provided for this purpose with video cameras placed at different sites on the plane.
Abstract: An aircraft surveillance and recording system adapted to monitor conditions prevailing in the course of a flight and provided for this purpose with video cameras placed at different sites on the plane. The output signals yielded by the video cameras are fed to an on-board radio-frequency transmitter to modulate a radio-frequency carrier that is radiated from the plane and intercepted by an active communication satellite. The satellite relays the signals to a ground recording station whose stored recording of the real time images from the cameras is available to investigators should an accident or other incident occur in the course of the flight. The system includes at least four video cameras, the first of which has an audio function and is trained on the flight crew in the cockpit of the plane. The second video camera is focused on the instrument panel and controls in the cockpit. The third camera which has an audio function looks into the passenger cabin, while the fourth camera is mounted on top of the aircraft rudder to observe exterior surface control movements, such as those of flaps, ailerons and elevators.

Patent
20 Dec 1996
TL;DR: In this paper, the first and second amplification factors are varied in dependence upon the illumination of the subject in such a manner that the average levels of the first and second video signals are maintained at a fixed level.
Abstract: A reproduced still picture having a comparatively high picture quality is obtained irrespective of the luminance of the subject. In an interval photography mode, photography is performed one time in a plurality of fields at a relatively high shutter speed of 1/250 of a second to obtain a first video signal, and photography is performed at an ordinary shutter speed of 1/60 of a second in other fields to obtain a second video signal. These video signals are amplified at mutually different first and second amplification factors (6 dB and 18.4 dB, respectively) in conformity with the shutter speed, by an AGC. The first and second amplification factors are varied in dependence upon the illumination of the subject in such a manner that the average levels of the first and second video signals are maintained at a fixed level.
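
The figures quoted in this abstract can be checked with a little arithmetic. The sketch below assumes the usual amplitude convention for decibels (20 log10 of a voltage ratio), which the abstract does not state, and the function names are hypothetical. Under that assumption, the gap between the two quoted amplification factors is very close to the exposure ratio between the two shutter speeds. Which signal receives which factor is left as the abstract states it.

```python
import math

def db_to_voltage_gain(db):
    """Convert decibels to a linear amplitude ratio (assumed 20*log10 convention)."""
    return 10 ** (db / 20)

def exposure_ratio_db(fast_shutter_s, slow_shutter_s):
    """Express the ratio of two exposure times in decibels (amplitude convention)."""
    return 20 * math.log10(slow_shutter_s / fast_shutter_s)

# Values quoted in the abstract.
gain_gap_db = 18.4 - 6.0                              # gap between the two AGC factors
shutter_gap_db = exposure_ratio_db(1 / 250, 1 / 60)   # 1/250 s versus 1/60 s
```

Both quantities come out near 12.4 dB, which is consistent with the abstract's remark that the factors are chosen in conformity with the shutter speed.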

Patent
11 Dec 1996
TL;DR: In this article, the position of the detection measurement frame having a feature pattern with the largest similarity to the standard feature pattern obtained from the standard measurement frame is determined, and an imaging condition of a television camera is controlled on the basis of the position information of the detection measurement frame so that the video camera system can suitably track the object's motion.
Abstract: A video camera system can suitably track a moving object without being influenced by other objects outside the desired image. Detection feature patterns are formed after brightness and hue frequency feature data are obtained on the basis of image information of the detection measurement frame. The position of the detection measurement frame having a feature pattern with the largest similarity to the standard feature pattern obtained from the standard measurement frame is determined. An imaging condition of a television camera is controlled on the basis of the position information of the detection measurement frame so that the video camera system can suitably track the object's motion. Further, a video camera system can obtain a face image of constant size with a simple construction. An area of the face image on the display plane is detected as the detected face area, and by comparing this with a standard face area, zoom processing is performed such that the difference becomes 0. Thus a distance sensor or similar device is unnecessary, and a video camera system with a simple construction can be obtained.
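
The face-size control described at the end of this abstract, zooming until the detected face area matches a standard area, can be sketched as a simple feedback loop. The proportional update rule, the clamping, and the assumption that image area scales with the square of the zoom factor are illustrative choices and do not come from the patent. The function names are hypothetical.

```python
def zoom_step(zoom, detected_area, standard_area, gain=0.5):
    """Return an updated zoom factor that reduces the face-area difference."""
    # Relative area error, clamped so a single step never flips or collapses the zoom.
    error = (standard_area - detected_area) / standard_area
    error = max(-1.0, min(1.0, error))
    return zoom * (1.0 + gain * error)

def simulate(base_area, standard_area, zoom=1.0, steps=50):
    """Iterate the control loop, assuming area grows with the square of zoom."""
    for _ in range(steps):
        detected = base_area * zoom ** 2
        zoom = zoom_step(zoom, detected, standard_area)
    return zoom, base_area * zoom ** 2

# A face that fills 100 units of area at zoom 1, with a target of 400 units.
final_zoom, final_area = simulate(base_area=100.0, standard_area=400.0)
```

With these toy numbers the loop settles at a zoom of about 2, where the detected area equals the standard area and the difference is 0.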

Patent
09 Dec 1996
TL;DR: In this article, an apparatus is presented that enhances the visual contrast between bright and dark features of an object by projecting an image of the object onto the object, such that an observer of the enhanced object senses that the bright features are brighter and the dark features remain the same.
Abstract: An apparatus which enhances the visual contrast between bright and dark features of an object by projecting an image of the object onto the object. The features of the projected object image overlay the features of the object such that an observer of the enhanced object senses that the bright features are brighter and the dark features remain the same. A light source illuminates the object, a video imaging means, such as a video camera, creates a video signal representative of the object image, and a video projector receives the video signal from the video camera and projects a visual image of the object. A filter prevents video projector light from reaching the video imaging means, thereby eliminating positive feedback. A preferred apparatus includes a beam separator which causes the image projected from the video projector to illuminate the object from the same perspective that the video imaging means views the object. Preferably, the light source is an infrared light source, the filter transmits infrared light but not visible light, and the beam separator reflects visible light and transmits infrared light. Alternately, the invention visually enhances the edges of features of an object with an unsharp masking edge enhancement technique, thereby making the features of the object easier to distinguish.

Patent
Shingo Tatsumi1
05 Apr 1996
TL;DR: In this article, a video camera capable of auto-zoom control is presented, in which a zoom controller controls a zoom driver such that the interval between the subject's eyes, detected from bright point information, is a predetermined value.
Abstract: A video camera capable of auto-zoom control, in which a light projector 14 irradiates a subject at fixed periods, and a correlation calculator 16 inputs an image signal pre-stored in a memory 6 and an image signal from an image sensor 2 into an adder 7, and outputs information on bright points, corresponding to the subject's eyes during the light emission period of the light projector 14, to a zoom controller 18. The zoom controller 18 controls a zoom driver 12 such that the interval between the subject's eyes, detected based on the input bright point information, is a predetermined value.

Patent
13 May 1996
TL;DR: In this paper, a personalized video system for acquiring video of an individual consumer as shot at an amusement park or the like and combining those images with standard, preshot video of rides or attractions is presented.
Abstract: A personalized video system for acquiring video of an individual consumer as shot at an amusement park or the like and combining those images with standard, preshot video of rides or attractions. The system includes cameras for generating digital video signals of the consumer at several locations on the attraction which are marked by an identification processor with information identifying the particular attractions with the particular consumer and stored in memory. The identifying information and information as to which of several attractions have been frequented by an individual are input by means of a card to a controller. Upon receipt of signals indicating the consumer wishes to receive a completed video, the controller generates command signals for a video assembler to create the final video product inserting the personalized video signals as appropriate in a standard, preshot film.

Proceedings ArticleDOI
18 Jun 1996
TL;DR: It is shown that panning a camera about a point f (focal length) in front of the camera eliminates redundancy and methods to optimize the image acquisition strategy in order to reduce redundancy are presented.
Abstract: This paper is concerned with acquiring panoramic focused images using a small field of view video camera. When scene points are distributed over a range of distances from the sensor, obtaining a focused composite image involves focus computations and mechanically changing some sensor parameters (translation of sensor plane, panning of camera etc.) which can be time intensive. In this paper we present methods to optimize the image acquisition strategy in order to reduce redundancy. We show that panning a camera about a point f (focal length) in front of the camera eliminates redundancy. The non-frontal imaging camera (NICAM) with tilted sensor plane has been previously introduced as a sensor that can acquire focused panoramic images. In this paper we also describe strategies for optimal selection of panning angle increments and sensor plane tilt for NICAM. Experimental results are presented for panoramic image acquisition using a regular camera as well as using NICAM.

Patent
31 Jan 1996
TL;DR: In this article, a video camera (120) is attached to three accelerometers (435, 440, 445), two gyroscopes (400, 410), and a rangefinder (480).
Abstract: An image system which captures, along with the images, information defining both the position and the orientation of the camera along with the distance to the subject. A video camera (120) is attached to three accelerometers (435, 440, 445), two gyroscopes (400, 410), and a rangefinder (480). Data gathered from these devices and defining the pitch, yaw, and roll of the camera, the camera's acceleration, and the distance to the subject is captured and recorded along with video images. The video images are later stored within a computer's database (185) along with data defining the position and orientation of the camera and the distance to the subject for each image, this latter data being computed from the captured data. The images may then be presented to the user in a three-dimensional display in which the user can navigate through the images using a joystick device or mouse, with the images located in positions corresponding to the positions in space of the objects that were imaged. Overlays on images displayed in the form of boxes and arrows pointing left and right may be clicked on to facilitate forward movement and rotational movement through the assorted images, with automatic image selection.

Patent
23 Jan 1996
TL;DR: In this paper, a video camera system is described which stores operating parameter information for reading by the system to provide optimum operating conditions, which may collect information reflecting system uses for later reading by a system, and which is relatively small and operates with a reduced number of electrical lines to transmit electrical information.
Abstract: A video camera system is described which may store operating parameter information for reading by the system to provide optimum operating conditions, which may collect information reflecting system uses for later reading by the system to provide a performance history, and which is relatively small and operates with a reduced number of electrical lines to transmit electrical information.

Book ChapterDOI
15 Apr 1996
TL;DR: The range of shape and motion models is extended in two significant ways: the first models jointly the random variations in shape arising within an object-class and those occurring during object motion, and the second addresses the tracking of coupled objects such as head and lips.
Abstract: The performance of Active Contours in tracking is highly dependent on the availability of an appropriate model of shape and motion, to use as a predictor. Models can be hand-built, but it is far more effective and less time-consuming to learn them from a training set. Techniques to do this exist both for shape, and for shape and motion jointly. This paper extends the range of shape and motion models in two significant ways. The first is to model jointly the random variations in shape arising within an object-class and those occurring during object motion. The resulting algorithm is applied to tracking of plants captured by a video camera mounted on an agricultural robot. The second addresses the tracking of coupled objects such as head and lips. In both cases, new algorithms are shown to make important contributions to tracking performance.

Patent
18 Sep 1996
TL;DR: In this article, a video access apparatus (110, 150, 750, 850) provides for audio and video teleconferencing and telephony via a first communication channel coupled to a primary station having communication with a network, such as the public switched telephone network or an ISDN network.
Abstract: A video access apparatus (110, 150, 750, 850) provides for audio and video teleconferencing and telephony via a first communication channel (103) coupled to a primary station (105) having communication with a network (140), such as the public switched telephone network or an ISDN network. The video access apparatus (110) includes a video network interface (210); a radio frequency modulator/demodulator (205); a user interface (215); and a processor arrangement (190). A videophone apparatus (700, 800) is coupleable to a video access apparatus via a second communication channel for video reception and transmission, and via a third communication channel for audio reception and transmission. The videophone apparatus includes a video monitor (715), a camera interface (235), a video camera (720), and a telephony module (710). Multiple videophone apparatuses (700, 800) may be used simultaneously, and multiple video signals from the videophone apparatuses (700, 800) may be multiplexed and combined into one composite video signal for transmission to the network (140).

Patent
04 Dec 1996
TL;DR: In this article, a sensor for an optical wheel alignment machine utilizes one or more light sources, such as lasers, to project a laser line or other shaped light onto various locations about the sidewall of a tire undergoing measurement.
Abstract: A sensor method and apparatus for an optical wheel alignment machine utilizes one or more light sources, such as lasers, to project a laser line or other shaped light onto various locations about the sidewall of a tire undergoing measurement. The sensor includes a video camera or other light responsive receiver and an optical system that combines the reflected laser lines into a single image that is received by the camera. The optical system also rotates one or more of the reflected laser lines so that all of the reflected portions have the same general orientation upon entering the camera. The camera outputs a video data stream that is indicative of the image. The sensor has an electronic circuit that analyzes this video data stream in real time to determine the location in the image of a preselected feature of each of the laser lines. The circuit then outputs coordinate data indicative of the location of this feature.

Patent
17 Dec 1996
TL;DR: In this article, a camera tracking system that continuously tracks sound emitting objects is provided, where a video camera is coupled to the microphones via an interface for processing information transmitted from the microphones for directing the camera.
Abstract: A camera tracking system that continuously tracks sound emitting objects is provided. A sound activation feature of the system enables a video camera to track speakers in a manner similar to the natural transition that occurs when people turn their eyes toward different sounds. The invented system is well suited for video-phone applications. The invented tracking system comprises a video camera for transmitting an image from its remote location, a screen for receiving images, and microphones for directing the camera. The camera may be coupled to the microphones via an interface for processing information transmitted from the microphones for directing the camera. The system may utilize the translucent properties of LCD screens by disposing a video camera behind such a screen and enabling persons at each remote location to look directly into the screen and at the camera. The interface enables intelligent framing of a speaker without mechanically repositioning the camera. The microphones are positioned using triangulation techniques. Characteristics of audio signals are processed by the interface for determining movement of the speaker for directing the camera. As the characteristics sensed by the microphones change, the interface directs the camera toward the speaker. The interface continuously directs the camera, until the change in the characteristics stabilizes, thus precisely directing the camera toward the speaker.